Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failure to vectorize @llvm.lrint.i64.f32 #55208

Closed
RKSimon opened this issue May 1, 2022 · 0 comments · Fixed by #71416
Closed

Failure to vectorize @llvm.lrint.i64.f32 #55208

RKSimon opened this issue May 1, 2022 · 0 comments · Fixed by #71416

Comments

@RKSimon
Copy link
Collaborator

RKSimon commented May 1, 2022

Pulled out of Issue #55202

https://godbolt.org/z/xqcjnTPhj

#include <cmath>
void testrint( const float * __restrict arg, float *out) {
    *out++ = std::rint( *arg++ );
    *out++ = std::rint( *arg++ );
    *out++ = std::rint( *arg++ );
    *out++ = std::rint( *arg++ );
}
void testlrint( const float * __restrict arg, long *out) {
    *out++ = std::lrint( *arg++ );
    *out++ = std::lrint( *arg++ );
    *out++ = std::lrint( *arg++ );
    *out++ = std::lrint( *arg++ );
}

while rint gets vectorized, lrint doesn't:

testrint(float const*, float*):                       # @testrint(float const*, float*)
        vroundps        $4, (%rdi), %xmm0
        vmovups %xmm0, (%rsi)
        retq
testlrint(float const*, long*):                      # @testlrint(float const*, long*)
        vcvtss2si       (%rdi), %rax
        movq    %rax, (%rsi)
        vcvtss2si       4(%rdi), %rax
        movq    %rax, 8(%rsi)
        vcvtss2si       8(%rdi), %rax
        movq    %rax, 16(%rsi)
        vcvtss2si       12(%rdi), %rax
        movq    %rax, 24(%rsi)
        retq
artagnon added a commit to artagnon/llvm-project that referenced this issue Sep 7, 2023
The issue llvm#55208 describes a current deficiency of the SLPVectorizer,
namely that it doesn't vectorize code written with lrint, while similar
code written with rint is vectorized. Add a test corresponding to this
issue for the RISC-V target.
artagnon added a commit that referenced this issue Sep 8, 2023
The issue #55208 describes a current deficiency of the SLPVectorizer,
namely that it doesn't vectorize code written with lrint, while similar
code written with rint is vectorized. Add a test corresponding to this
issue for the RISC-V target.
artagnon added a commit to artagnon/llvm-project that referenced this issue Oct 18, 2023
The issue llvm#55208 noticed that std::rint is vectorized by the
SLPVectorizer, but a very similar function, std::lrint, is not.
std::lrint corresponds to ISD::LRINT in the SelectionDAG, and
std::llrint is a familiar cousin corresponding to ISD::LLRINT. Now,
neither ISD::LRINT nor ISD::LLRINT have a corresponding vector variant,
and the LangRef makes this clear in the documentation of llvm.lrint.*
and llvm.llrint.*.

This patch extends the LangRef to include vector variants of
llvm.lrint.* and llvm.llrint.*, and lays the necessary ground-work of
scalarizing it for all targets. However, this patch would be devoid of
motivation unless we show the utility of these new vector variants.
Hence, the RISCV target has been chosen to implement a custom lowering
to the vfcvt.x.f.v instruction. The patch also includes a CostModel for
RISCV, and a trivial follow-up can potentially enable the SLPVectorizer
to vectorize std::lrint and std::llrint, fixing llvm#55208.

The patch includes tests, obviously for the RISCV target, but also for
the X86, AArch64, and PowerPC targets to justify the addition of the
vector variants to the LangRef.
artagnon added a commit that referenced this issue Oct 19, 2023
…#66924)

The issue #55208 noticed that std::rint is vectorized by the
SLPVectorizer, but a very similar function, std::lrint, is not.
std::lrint corresponds to ISD::LRINT in the SelectionDAG, and
std::llrint is a familiar cousin corresponding to ISD::LLRINT. Now,
neither ISD::LRINT nor ISD::LLRINT have a corresponding vector variant,
and the LangRef makes this clear in the documentation of llvm.lrint.*
and llvm.llrint.*.

This patch extends the LangRef to include vector variants of
llvm.lrint.* and llvm.llrint.*, and lays the necessary ground-work of
scalarizing it for all targets. However, this patch would be devoid of
motivation unless we show the utility of these new vector variants.
Hence, the RISCV target has been chosen to implement a custom lowering
to the vfcvt.x.f.v instruction. The patch also includes a CostModel for
RISCV, and a trivial follow-up can potentially enable the SLPVectorizer
to vectorize std::lrint and std::llrint, fixing #55208.

The patch includes tests, obviously for the RISCV target, but also for
the X86, AArch64, and PowerPC targets to justify the addition of the
vector variants to the LangRef.
artagnon added a commit to artagnon/llvm-project that referenced this issue Nov 6, 2023
With the recent change 98c90a1 (ISel: introduce vector ISD::LRINT,
ISD::LLRINT; custom RISCV lowering), it is now possible for
SLPVectorizer, LoopVectorize, and Scalarizer to operate on llvm.lrint
and llvm.llrint, with vector codegen for the RISC-V target. Make a
trivial change to VectorUtils, and update the corresponding tests.

A couple of important fixes have been landed since the original patch
was landed and reverted, and it is now safe to re-land the patch:
5e1d81a (LegalizeIntegerTypes: implement PromoteIntRes for xrint) and
fd887a3 (LegalizeVectorTypes: fix bug in widening of vec result in
xrint). See also llvm#71399 which proves that lrint and llrint will indeed
produce vector codegen on RISC-V.

Fixes llvm#55208.
artagnon added a commit that referenced this issue Nov 6, 2023
With the recent change 98c90a1 (ISel: introduce vector ISD::LRINT,
ISD::LLRINT; custom RISCV lowering), it is now possible for
SLPVectorizer, LoopVectorize, and Scalarizer to operate on llvm.lrint
and llvm.llrint, with vector codegen for the RISC-V target. Make a
trivial change to VectorUtils, and update the corresponding tests.

A couple of important fixes have been landed since the original patch
was landed and reverted, and it is now safe to re-land the patch:
5e1d81a (LegalizeIntegerTypes: implement PromoteIntRes for xrint) and
fd887a3 (LegalizeVectorTypes: fix bug in widening of vec result in
xrint). See also #71399, which proves that lrint and llrint will indeed
produce vector codegen on RISC-V.

Fixes #55208.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants