-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JIT: Rewrite initial parameter frame layout in terms of new ABI info #101340
JIT: Rewrite initial parameter frame layout in terms of new ABI info #101340
Conversation
Rewrite `lvaAssignVirtualFrameOffsetsToArgs` to make use of the ABI information that was already computed as part of ABI classification in the frontend.
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
Considering the stack alignment to be part of the passed stack size does not really make sense since the callee does not own the extra bytes added to ensure alignment.
cc @dotnet/jit-contrib PTAL @BruceForstall @kunalspathak No diffs. Some minor TP improvements. Also FYI @shushanhf @dotnet/samsung since this affects LA64/RISCV64 as well. |
// Signature (int a0, int a1, int a2, struct {long} a3, ...) | ||
// - On Windows, the Arm64 varargs ABI can split a 16 byte struct across x7 and stack | ||
// - Arm32 generally allows structs to be split | ||
// - LA64/RISCV64 both allow splitting of 16-byte structs across 1 register and stack |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the splitting, is only the register saved on the stack within the prolog?
Should the second part of the struct which passed on caller's stack be copyed to on the prolog stack?
If don't copy the second part, how to process this part as the two parts are not stored on a continue stack space.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, for arm32/arm64 we only need to store the register part. The stack segments are always at the beginning and will be contiguous with the register segments. I thought LA64/RV64 was similar here.
@tomeksowi is enabling code in #101288 to handle it more generally. LA64 can do the same.
Out of curiosity, can you show the ABI information JITDUMP for the case with the split segments that doesn't have the stack segment first? For win-arm64 I think the split args were deliberately designed this way to make varargs possible to handle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the old-style ABI, LA64 will copy the second part to the prolog stack where the splitting struct's size is the whole size and had allocated the whole stack space, that is, we can copy the second part directly.
I'm finding the test case, but I think/doubt the new-style can not process this case as we expected liking the old-style.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had found this case.
Late I will add this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
; V00 arg0 [V00 ] ( 1, 1 ) int -> [fp+0x54] do-not-enreg[]
; V01 arg1 [V01 ] ( 1, 1 ) struct (16) [fp+0x40] do-not-enreg[XS] addr-exposed ld-addr-op <Point>
; V02 arg2 [V02 ] ( 1, 1 ) struct (16) [fp+0x30] do-not-enreg[XS] addr-exposed ld-addr-op <Point>
; V03 arg3 [V03 ] ( 1, 1 ) struct (16) [fp+0x20] do-not-enreg[XS] addr-exposed ld-addr-op <Point>
; V04 arg4 [V04 ] ( 1, 1 ) struct (16) [fp+0x10] do-not-enreg[XS] addr-exposed ld-addr-op <Point>
;# V05 OutArgs [V05 ] ( 1, 1 ) struct ( 0) [sp+0x00] do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
;
; Lcl frame size = 80
DEBUG: LOONGARCH64, frameType:1
G_M40687_IG01: ;; offset=0x0000
0xff7575a760 02FE8063 addi.d sp, sp, -96
0xff7575a764 29C00061 st.d ra, sp, 0
0xff7575a768 29C02076 st.d fp, sp, 8
0xff7575a76c 02C02076 addi.d fp, sp, 8
0xff7575a770 298152C4 st.w a0, fp, 84
0xff7575a774 29C102C5 st.d a1, fp, 64
0xff7575a778 29C122C6 st.d a2, fp, 72
0xff7575a77c 29C0C2C7 st.d a3, fp, 48
0xff7575a780 29C0E2C8 st.d a4, fp, 56
0xff7575a784 29C082C9 st.d a5, fp, 32
0xff7575a788 29C0A2CA st.d a6, fp, 40
0xff7575a78c 29C042CB st.d a7, fp, 16
0xff7575a790 28C1806C ld.d t0, sp, 96 // the old-style ABI will copy, but now the new stype not.
0xff7575a794 29C062CC st.d t0, fp, 24 // the old-style ABI will copy, but now the new stype not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK.
Maybe it's better to split to some PRs before 101288 if there are much modification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit apprehensive about merging a PR which breaks everything with split struct args. Maybe I'll handle split args in #101288 and then LA can enable the #ifdefs
as well? I don't think there will be that much modification but I don't have a solution fully fleshed out yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean this PR? I don't think this PR will affect any behavior for either LA64/RISCV64. The only break right now came from #101224.
The original code computes the split stack parameters at similar offsets as this PR does for both LA64/RISCV64:
runtime/src/coreclr/jit/lclvars.cpp
Lines 6352 to 6362 in bc9fc5a
#elif defined(TARGET_LOONGARCH64) || defined(TARGET_RISCV64) | |
if (varDsc->lvIsSplit) | |
{ | |
assert((varDsc->lvType == TYP_STRUCT) && (varDsc->GetOtherArgReg() == REG_STK)); | |
// This is a split struct. It will account for an extra (8 bytes) for the whole struct. | |
varDsc->SetStackOffset(varDsc->GetStackOffset() + TARGET_POINTER_SIZE); | |
argOffs += TARGET_POINTER_SIZE; | |
} | |
#else // TARGET* |
(In fact, this was one of the primary reasons I thought LA64/RV64 already worked this way for split parameters -- but the stack offset set there is never used.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's possible I misunderstood what that code there is doing. Regardless, I don't think there is an issue with this PR since the offset this PR might be assigning to split parameters are getting overwritten regardless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's possible I misunderstood what that code there is doing.
I think I didn't express clearly to make you understand.
Regardless, I don't think there is an issue with this PR since the offset this PR might be assigning to split parameters are getting overwritten regardless.
Yes, what I feedback is not about this PR and I don't think LA64/RV64's splitting error is about this PR.
We just discussed here with you.
You can proceed this PR with your plan.
Ping @kunalspathak -- can you please take a look? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…otnet#101340) Rewrite `lvaAssignVirtualFrameOffsetsToArgs` to make use of the ABI information that was already computed as part of ABI classification in the frontend.
…otnet#101340) Rewrite `lvaAssignVirtualFrameOffsetsToArgs` to make use of the ABI information that was already computed as part of ABI classification in the frontend.
Rewrite
lvaAssignVirtualFrameOffsetsToArgs
to make use of the ABI information that was already computed as part of ABI classification in the frontend.No diffs are expected.