-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bad-quality code passing union as argument on x86_64 #3971
Comments
llvm-gcc compiles this into a nice and simple: define i32 @a(i32 %a.0) nounwind readnone { :) |
Ugh. clang could certainly try harder to match to better types when coercing, but I'm not sure how far we should really go down this road. My feeling is that in this case we should treat this as a missed IR optimization. The cases I think clang is clearly at fault is when the mid level IR passes do not have enough information to know that there is a nicer way to handle the argument (for example, choosing <4 x float> over <2 x double> in some cases). |
instcombine can help with this. |
With the attached patch, I get this:ddunbar@ozzy:CodeGen$ xcc -emit-llvm -S -o - t4.c | llvm-as | opt -std-compile-opts | llvm-dis define i32 @a(i64) nounwind readnone {
|
Simplified version of the patch committed here: We're now down to: define i32 @a(i64 %tmp) nounwind readnone { |
Ah, with -std-compile-opts and that patch, this is completely simplified. Instcombine needs CSE to see the equivalence of the truncates. |
mentioned in issue llvm/llvm-bugzilla-archive#5167 |
Make the initialization of SwiftASTContextForExpressions lazy (NFC)
Extended Description
union UGeckoInstruction
{
unsigned hex;
};
unsigned a(union UGeckoInstruction a) {return a.hex;}
Run through clang -emit-llvm -O2 -arch=x86_64, gives the following:
; ModuleID = '-'
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-f80:128:128"
target triple = "x86_64-pc-linux-gnu"
define i32 @a(i64) nounwind readnone {
entry:
%tmp5 = trunc i64 %0 to i8 ; [#uses=1]
%tmp7 = lshr i64 %0, 8 ; [#uses=1]
%tmp8 = trunc i64 %tmp7 to i8 ; [#uses=1]
%tmp10 = lshr i64 %0, 16 ; [#uses=1]
%tmp11 = trunc i64 %tmp10 to i8 ; [#uses=1]
%tmp13 = lshr i64 %0, 24 ; [#uses=1]
%tmp14 = trunc i64 %tmp13 to i8 ; [#uses=1]
%1 = zext i8 %tmp5 to i32 ; [#uses=1]
%2 = zext i8 %tmp8 to i32 ; [#uses=1]
%3 = shl i32 %2, 8 ; [#uses=1]
%4 = zext i8 %tmp11 to i32 ; [#uses=1]
%5 = shl i32 %4, 16 ; [#uses=1]
%6 = zext i8 %tmp14 to i32 ; [#uses=1]
%7 = shl i32 %6, 24 ; [#uses=1]
%8 = or i32 %7, %1 ; [#uses=1]
%9 = or i32 %8, %5 ; [#uses=1]
%10 = or i32 %9, %3 ; [#uses=1]
ret i32 %10
}
This is rather silly. I'm not sure whether to blame clang or the LLVM optimizers for this result, though.
The text was updated successfully, but these errors were encountered: