s390x: extracting an element at a non-const index from a SIMD vector generates bad code
#137372
Labels
needs-triage
I'm trying to add an implementation of `vec_extract` for the `s390x-unknown-linux-gnu` target in `stdarch`: https://godbolt.org/z/e65Mvf5vM
Turns into this LLVM
And generates the following assembly
In particular, `vlgvf` (`f` for fullword; there are variants for other widths) is the relevant instruction here. It extracts the value at the given index.

Contrary to most other targets, the index argument to `vec_extract` does not need to be `const`. The `std::intrinsics::simd::simd_extract` function does need its index argument to be `const`, and therefore can't straightforwardly be used to implement `vec_extract`.

Attempt 1
I tried simple field extraction: https://godbolt.org/z/sbhYj316x
(`extern "C"` is used so that the vector is passed by value, but we see the same assembly when the vector is created within the function.)

Indexing into the underlying array in this way may soon be banned, though there is an alternative approach that does the same thing. Unfortunately, this version does not optimize well.
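A rough analogue of this field-extraction approach, as a sketch on stable Rust. It is not the snippet behind the Godbolt link. `I32x4` and `extract_by_index` are hypothetical names, a plain array wrapper stands in for a `repr(simd)` type, and reducing the index modulo the lane count is an assumption made here only to keep the access in bounds.

```rust
/// Hypothetical stand-in for a 4-lane `repr(simd)` vector. A plain array
/// wrapper is used so that the sketch builds on stable Rust.
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(C, align(16))]
pub struct I32x4(pub [i32; 4]);

/// Extract a lane by indexing the backing array with a runtime value.
/// The index is reduced modulo 4 (an assumption of this sketch) so the
/// array access can never go out of bounds.
pub fn extract_by_index(v: I32x4, idx: u32) -> i32 {
    v.0[(idx % 4) as usize]
}
```

Because the lane is selected by address arithmetic on the array, the natural lowering is a pointer computation followed by a load, which matches the pointer-load pattern described in this section.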
The portable-simd implementation of `Index` appears to be doing the same thing, and generates the same code: https://godbolt.org/z/eecM6qdbW. That totally makes sense for most targets, because a pointer load is the best you can do.

Attempt 2
I did find that this version does optimize well
but that is kind of unwieldy, and while I can make it work for `stdarch`, it won't work for `portable_simd`.

Solutions
I think there should be a way of indexing into a vector that emits an `extractelement` rather than a `getelementptr`. Semantically that seems more accurate (and might optimize better in some cases?), even though on most targets the generated assembly will be the same.

Some things I'm not sure about
- [...] `vlgvf` is?)
- [...] `repr(simd)` types compiler-team#838 wants to fix
- [...] `const` value is a terrible idea on some/most targets and unimplemented by design
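To make the const-index constraint on `simd_extract` concrete, here is a minimal sketch of one way a runtime index can be bridged to an operation that only accepts compile-time indices: dispatch with `match`, so that every arm uses a constant. This is an illustration under stated assumptions, not code from this issue. `I32x4`, `extract_const`, and `extract_dynamic` are hypothetical names, a plain array wrapper stands in for a `repr(simd)` type so the sketch builds on stable Rust, and wrapping the index modulo the lane count is an assumption.

```rust
/// Hypothetical stand-in for a 4-lane `repr(simd)` vector.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct I32x4(pub [i32; 4]);

/// Hypothetical const-index extraction. It models an operation whose lane
/// index must be known at compile time by taking it as a const generic.
pub fn extract_const<const IDX: usize>(v: I32x4) -> i32 {
    v.0[IDX]
}

/// Bridge a runtime index to the const-index operation by enumerating
/// every lane. The index is wrapped modulo 4 (an assumption of this
/// sketch), so the catch-all arm only ever handles lane 3.
pub fn extract_dynamic(v: I32x4, idx: u32) -> i32 {
    match idx % 4 {
        0 => extract_const::<0>(v),
        1 => extract_const::<1>(v),
        2 => extract_const::<2>(v),
        _ => extract_const::<3>(v),
    }
}
```

The dispatch needs one arm per lane, so it has to be written out for each concrete lane count. That makes it awkward to express for a type that is generic over the number of lanes.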