Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bad-quality code passing union as argument on x86_64 #3971

Closed
efriedma-quic opened this issue Feb 17, 2009 · 8 comments
Closed

Bad-quality code passing union as argument on x86_64 #3971

efriedma-quic opened this issue Feb 17, 2009 · 8 comments
Labels
bugzilla Issues migrated from bugzilla clang:codegen

Comments

@efriedma-quic
Copy link
Collaborator

Bugzilla Link 3599
Resolution FIXED
Resolved on Feb 17, 2009 14:48
Version unspecified
OS Linux

Extended Description

union UGeckoInstruction
{
unsigned hex;
};
unsigned a(union UGeckoInstruction a) {return a.hex;}

Run through clang -emit-llvm -O2 -arch=x86_64, gives the following:
; ModuleID = '-'
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-f80:128:128"
target triple = "x86_64-pc-linux-gnu"

define i32 @​a(i64) nounwind readnone {
entry:
%tmp5 = trunc i64 %0 to i8 ; [#uses=1]
%tmp7 = lshr i64 %0, 8 ; [#uses=1]
%tmp8 = trunc i64 %tmp7 to i8 ; [#uses=1]
%tmp10 = lshr i64 %0, 16 ; [#uses=1]
%tmp11 = trunc i64 %tmp10 to i8 ; [#uses=1]
%tmp13 = lshr i64 %0, 24 ; [#uses=1]
%tmp14 = trunc i64 %tmp13 to i8 ; [#uses=1]
%1 = zext i8 %tmp5 to i32 ; [#uses=1]
%2 = zext i8 %tmp8 to i32 ; [#uses=1]
%3 = shl i32 %2, 8 ; [#uses=1]
%4 = zext i8 %tmp11 to i32 ; [#uses=1]
%5 = shl i32 %4, 16 ; [#uses=1]
%6 = zext i8 %tmp14 to i32 ; [#uses=1]
%7 = shl i32 %6, 24 ; [#uses=1]
%8 = or i32 %7, %1 ; [#uses=1]
%9 = or i32 %8, %5 ; [#uses=1]
%10 = or i32 %9, %3 ; [#uses=1]
ret i32 %10
}

This is rather silly. I'm not sure whether to blame clang or the LLVM optimizers for this result, though.

@lattner
Copy link
Collaborator

lattner commented Feb 17, 2009

llvm-gcc compiles this into a nice and simple:
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128"
target triple = "x86_64-apple-darwin7"

define i32 @​a(i32 %a.0) nounwind readnone {
entry:
ret i32 %a.0
}

:)

@llvmbot
Copy link
Member

llvmbot commented Feb 17, 2009

Ugh.

clang could certainly try harder to match to better types when coercing, but I'm not sure how far we should really go down this road.

My feeling is that in this case we should treat this as a missed IR optimization. The cases I think clang is clearly at fault is when the mid level IR passes do not have enough information to know that there is a nicer way to handle the argument (for example, choosing <4 x float> over <2 x double> in some cases).

@lattner
Copy link
Collaborator

lattner commented Feb 17, 2009

instcombine can help with this.

@llvmbot
Copy link
Member

llvmbot commented Feb 17, 2009

With the attached patch, I get this:

ddunbar@ozzy:CodeGen$ xcc -emit-llvm -S -o - t4.c | llvm-as | opt -std-compile-opts | llvm-dis
; ModuleID = ''
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-f80:128:128"
target triple = "x86_64-apple-darwin10.0"

define i32 @​a(i64) nounwind readnone {
entry:
%1 = trunc i64 %0 to i32 ; [#uses=1]
ret i32 %1
}

@llvmbot
Copy link
Member

llvmbot commented Feb 17, 2009

instcombine patch

@lattner
Copy link
Collaborator

lattner commented Feb 17, 2009

Simplified version of the patch committed here:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090216/073864.html

We're now down to:

define i32 @​a(i64 %tmp) nounwind readnone {
entry:
%tmp28 = trunc i64 %tmp to i32 ; [#uses=1]
%tmp1 = and i32 %tmp28, 255 ; [#uses=1]
%tmp2 = trunc i64 %tmp to i32 ; [#uses=1]
%tmp3 = and i32 %tmp2, 65280 ; [#uses=1]
%tmp4 = trunc i64 %tmp to i32 ; [#uses=1]
%tmp6 = and i32 %tmp4, 16711680 ; [#uses=1]
%tmp21 = trunc i64 %tmp to i32 ; [#uses=1]
%tmp12 = and i32 %tmp21, -16777216 ; [#uses=1]
%tmp15 = or i32 %tmp12, %tmp1 ; [#uses=1]
%tmp16 = or i32 %tmp15, %tmp6 ; [#uses=1]
%tmp17 = or i32 %tmp16, %tmp3 ; [#uses=1]
ret i32 %tmp17
}

@lattner
Copy link
Collaborator

lattner commented Feb 17, 2009

Ah, with -std-compile-opts and that patch, this is completely simplified. Instcombine needs CSE to see the equivalence of the truncates.

@llvmbot
Copy link
Member

llvmbot commented Nov 27, 2021

mentioned in issue llvm/llvm-bugzilla-archive#5167

This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla clang:codegen
Projects
None yet
Development

No branches or pull requests

3 participants