-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Fix block store local address containment on ARM #27338
Conversation
When the source/destination address was local, genCodeForCpBlkUnroll was folding the local offset into the address mode of the generated load/store instructions as if the local address was contained. But lowering didn't actually contain the address so useless ADD instructions were still being generated.
FX diff summary:
arm32:
|
The couple of ARM32 regressions are due to different register choices that result in longer instruction encoding (e.g. r12 vs. r4). --- D:\Projects\jit\diffs\dasmset_2\base\System.Reflection.Metadata.dasm 2019-10-21 20:12:54.000000000 +-0300
+++ D:\Projects\jit\diffs\dasmset_2\diff\System.Reflection.Metadata.dasm 2019-10-21 20:12:54.000000000 +-0300
@@ -143152,10 +143120,9 @@
add r2, lr
- addw r12, sp, 24 // [V05 tmp1]
G_M23755_IG04: ;; bbWeight=2
- ldr r4, [r2]
- str r4, [sp+0x18] // [V05 tmp1]
- ldr r4, [r2+4]
- str r4, [sp+0x1c] // [V05 tmp1+0x04]
- ldr r4, [r2+8]
- str r4, [sp+0x20] // [V05 tmp1+0x08]
+ ldr r12, [r2]
+ str r12, [sp+0x18]
+ ldr r12, [r2+4]
+ str r12, [sp+0x1c]
+ ldr r12, [r2+8]
+ str r12, [sp+0x20] The diff is enormous so I cannot check all of it but all the diffs I saw look like above - a removed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you.
The CQ improvement is impressive.
@@ -326,6 +326,20 @@ void Lowering::LowerBlockStore(GenTreeBlk* blkNode) | |||
if (!blkNode->OperIs(GT_STORE_DYN_BLK) && (size <= CPBLK_UNROLL_LIMIT)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while you are here could you please change the condition to if (!blkNode->OperIs(GT_STORE_BLK) && (size <= CPBLK_UNROLL_LIMIT))
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure! But it can probably wait until the next PR (immediately after this one). Let's not kill the CI more than it's needed. 😁
Looks like it can be even better
if (obj) {
}
else if (blk && unroll) {
}
else {
}
The less nesting the better.
Extracted from #21711
When the source/destination address was local, genCodeForCpBlkUnroll was folding the local offset into the address mode of the generated load/store instructions as if the local address was contained. But lowering didn't actually contain the address so useless ADD instructions were still being generated.