Description
Hello guzba,
in avx.nim, lines 226-230:
226 func mm_permutevar_pd*(a: M128d, b: M128i): M256d {.importc: "_mm_permutevar_pd".}
...
230 func mm_permutevar_pd*(a: M128d, b: M128i): M256d {.importc: "_mm_permutevar_pd".}
AFAIK these should return the same vector type they receive, i.e. 128-bit double-/single-precision vectors rather than 256-bit ones:
func mm_permutevar_pd*(a: M128d, b: M128i): M128d {.importc: "_mm_permutevar_pd".}
...
func mm_permutevar_ps*(a: M128, b: M128i): M128 {.importc: "_mm_permutevar_ps".}
And if you don't mind - sure you will :) - one could fix the terrible Intel naming just a bit by adding the missing integer variants:
func mm_permutevar_epi32*(a, mask: M128i): M128i =
  # reinterpret the integers as floats, permute, reinterpret back
  mm_castps_si128(
    mm_permutevar_ps(mm_castsi128_ps(a), mask)
  )
func mm_permutevar_epi64*(a, mask: M128i): M128i =
  # same idea with doubles for the 64-bit elements
  mm_castpd_si128(
    mm_permutevar_pd(mm_castsi128_pd(a), mask)
  )
Since everybody has to add them anyway, after getting trapped once... The same could/should be done for the 256-bit vectors.
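A rough sketch of what the 256-bit variants could look like. To keep it self-contained it binds the intrinsics directly from immintrin.h instead of importing nimsimd, so the local type declarations are only an assumption about how nimsimd spells things. The names mm256_permutevar_epi32 and mm256_permutevar_epi64 are hypothetical (there are no Intel intrinsics with these names), and demoEpi32 / demoEpi64 exist only for illustration. It assumes an AVX-capable CPU and gcc or clang as the C backend.

```nim
# Sketch only: local bindings, not nimsimd's actual declarations.
when defined(gcc) or defined(clang):
  {.passC: "-mavx".}

type
  M256 {.importc: "__m256", header: "immintrin.h".} = object
  M256d {.importc: "__m256d", header: "immintrin.h".} = object
  M256i {.importc: "__m256i", header: "immintrin.h".} = object

func mm256_castsi256_ps(a: M256i): M256
  {.importc: "_mm256_castsi256_ps", header: "immintrin.h".}
func mm256_castps_si256(a: M256): M256i
  {.importc: "_mm256_castps_si256", header: "immintrin.h".}
func mm256_castsi256_pd(a: M256i): M256d
  {.importc: "_mm256_castsi256_pd", header: "immintrin.h".}
func mm256_castpd_si256(a: M256d): M256i
  {.importc: "_mm256_castpd_si256", header: "immintrin.h".}
func mm256_permutevar_ps(a: M256, b: M256i): M256
  {.importc: "_mm256_permutevar_ps", header: "immintrin.h".}
func mm256_permutevar_pd(a: M256d, b: M256i): M256d
  {.importc: "_mm256_permutevar_pd", header: "immintrin.h".}
proc mm256_loadu_si256(p: pointer): M256i
  {.importc: "_mm256_loadu_si256", header: "immintrin.h".}
proc mm256_storeu_si256(p: pointer, a: M256i)
  {.importc: "_mm256_storeu_si256", header: "immintrin.h".}

# Hypothetical wrappers: cast to float/double, permute, cast back.
# The casts only reinterpret the bits, they generate no instructions.
func mm256_permutevar_epi32*(a, mask: M256i): M256i =
  mm256_castps_si256(mm256_permutevar_ps(mm256_castsi256_ps(a), mask))

func mm256_permutevar_epi64*(a, mask: M256i): M256i =
  mm256_castpd_si256(mm256_permutevar_pd(mm256_castsi256_pd(a), mask))

proc demoEpi32(): array[8, int32] =
  # Each 32-bit element picks from its own 128-bit lane,
  # using the low 2 bits of the matching mask element.
  var src = [10'i32, 11, 12, 13, 20, 21, 22, 23]
  var idx = [3'i32, 2, 1, 0, 0, 0, 1, 1]
  let r = mm256_permutevar_epi32(mm256_loadu_si256(addr src),
                                 mm256_loadu_si256(addr idx))
  mm256_storeu_si256(addr result, r)

proc demoEpi64(): array[4, int64] =
  # For 64-bit elements the selector is bit 1 (value 2) of each mask
  # element: 0 picks the lower, 2 picks the upper element of the lane.
  var src = [1'i64, 2, 3, 4]
  var idx = [2'i64, 0, 0, 2]
  let r = mm256_permutevar_epi64(mm256_loadu_si256(addr src),
                                 mm256_loadu_si256(addr idx))
  mm256_storeu_si256(addr result, r)
```

Note that these stay within each 128-bit lane, just like the float versions. That is different from AVX2's _mm256_permutevar8x32_epi32, which permutes across lanes.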
And don't get me wrong here - I just suggest adding what Intel has left out but everybody expects to find, while staying with the Intel wording. Actually a permutevar_<type> is a permute operation - well, many operations permute a vector. In this case it is a shuffle operation.
Maybe one could add a common_avx.nim that adds those missing functions to make the intrinsics a bit more consistent?
just my 20ct, greets Andreas
we are gettin' closer to nimsimd v2 :)