Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Fix block store local address containment on ARM #27338

Merged
merged 1 commit into from
Oct 25, 2019

Conversation

mikedn
Copy link

@mikedn mikedn commented Oct 21, 2019

Extracted from #21711

When the source/destination address was local, genCodeForCpBlkUnroll was folding the local offset into the address mode of the generated load/store instructions as if the local address was contained. But lowering didn't actually contain the address so useless ADD instructions were still being generated.

When the source/destination address was local, genCodeForCpBlkUnroll was folding the local offset into the address mode of the generated load/store instructions as if the local address was contained. But lowering didn't actually contain the address so useless ADD instructions were still being generated.
@mikedn
Copy link
Author

mikedn commented Oct 21, 2019

FX diff summary:
arm64

Total bytes of diff: -316012 (-0.30% of base)
    diff is an improvement.
Top file improvements by size (bytes):
     -162824 : Microsoft.Diagnostics.Tracing.TraceEvent.dasm (-1.45% of base)
      -40920 : Microsoft.CodeAnalysis.VisualBasic.dasm (-0.28% of base)
      -37056 : Microsoft.CodeAnalysis.CSharp.dasm (-0.27% of base)
      -12672 : System.Private.Xml.dasm (-0.12% of base)
      -11520 : System.Private.CoreLib.dasm (-0.11% of base)
87 total files with size differences (87 improved, 0 regressed), 42 unchanged.
Top method improvements by size (bytes):
      -13600 (-5.98% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - ApplicationServerTraceEventParser:EnumerateTemplates(ref,ref):this (2 methods)
      -12760 (-8.71% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - CtfTraceEventSource:InitEventMap():ref (2 methods)
       -6840 (-5.04% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - KernelTraceEventParser:EnumerateTemplates(ref,ref):this (2 methods)
       -4584 (-4.42% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - ClrPrivateTraceEventParser:EnumerateTemplates(ref,ref):this (2 methods)
       -3944 (-4.03% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - ClrTraceEventParser:EnumerateTemplates(ref,ref):this (2 methods)
Top method improvements by size (percentage):
         -24 (-18.75% of base) : System.Private.CoreLib.dasm - Variant:get_AsDecimal():struct:this (2 methods)
         -24 (-15.00% of base) : System.Private.CoreLib.dasm - Decimal:System.IConvertible.ToDouble(ref):double:this (2 methods)
         -16 (-14.29% of base) : System.Private.CoreLib.dasm - Utf8Span:get_Chars():struct:this (2 methods)
         -16 (-14.29% of base) : System.Private.CoreLib.dasm - Utf8Span:get_Runes():struct:this (2 methods)
         -24 (-14.29% of base) : System.Private.CoreLib.dasm - Decimal:System.IConvertible.ToSingle(ref):float:this (2 methods)
10038 total methods with size differences (10038 improved, 0 regressed), 138264 unchanged.

arm32:

Total bytes of diff: -157608 (-0.21% of base)
    diff is an improvement.
Top file improvements by size (bytes):
      -98452 : Microsoft.Diagnostics.Tracing.TraceEvent.dasm (-1.14% of base)
      -11320 : Microsoft.CodeAnalysis.CSharp.dasm (-0.12% of base)
      -11044 : Microsoft.CodeAnalysis.VisualBasic.dasm (-0.10% of base)
      -11040 : Microsoft.CodeAnalysis.dasm (-0.33% of base)
       -6944 : System.Private.CoreLib.dasm (-0.09% of base)
76 total files with size differences (76 improved, 0 regressed), 53 unchanged.
Top method regressions by size (bytes):
          80 ( 9.80% of base) : System.Reflection.Metadata.dasm - PEBuilder:SumRawDataSizes(struct,int):int (4 methods)
          28 ( 7.14% of base) : System.Data.Common.dasm - SqlStringStorage:Compare(int,int):int:this (2 methods)
Top method improvements by size (bytes):
      -13072 (-4.36% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - ApplicationServerTraceEventParser:EnumerateTemplates(ref,ref):this (2 methods)
       -8416 (-4.20% of base) : Microsoft.CodeAnalysis.dasm - DesktopAssemblyIdentityComparer:.cctor() (4 methods)
       -7752 (-5.21% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - KernelTraceEventParser:EnumerateTemplates(ref,ref):this (2 methods)
       -5780 (-3.75% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - CtfTraceEventSource:InitEventMap():ref (2 methods)
       -5760 (-5.21% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - ClrPrivateTraceEventParser:EnumerateTemplates(ref,ref):this (2 methods)
Top method regressions by size (percentage):
          80 ( 9.80% of base) : System.Reflection.Metadata.dasm - PEBuilder:SumRawDataSizes(struct,int):int (4 methods)
          28 ( 7.14% of base) : System.Data.Common.dasm - SqlStringStorage:Compare(int,int):int:this (2 methods)
Top method improvements by size (percentage):
         -36 (-19.57% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - StartStopActivityComputer:<.ctor>b__0_5(ref):this (2 methods)
       -1980 (-18.53% of base) : System.ComponentModel.TypeConverter.dasm - StandardCommands:.cctor() (2 methods)
         -36 (-18.00% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - StartStopActivityComputer:SetThreadToStartStopActivity(ref):this (2 methods)
         -36 (-17.65% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - StartStopActivityComputer:<.ctor>b__0_6(ref):this (2 methods)
         -36 (-17.65% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - StartStopActivityComputer:<.ctor>b__0_7(ref):this (2 methods)
8475 total methods with size differences (8473 improved, 2 regressed), 139867 unchanged.

@mikedn
Copy link
Author

mikedn commented Oct 21, 2019

The couple of ARM32 regressions are due to different register choices that result in longer instruction encoding (e.g. r12 vs. r4).

--- D:\Projects\jit\diffs\dasmset_2\base\System.Reflection.Metadata.dasm	2019-10-21 20:12:54.000000000 +-0300
+++ D:\Projects\jit\diffs\dasmset_2\diff\System.Reflection.Metadata.dasm	2019-10-21 20:12:54.000000000 +-0300
@@ -143152,10 +143120,9 @@
             add     r2, lr
-            addw    r12, sp, 24	// [V05 tmp1]
 
 G_M23755_IG04:		;; bbWeight=2   
-            ldr     r4, [r2]
-            str     r4, [sp+0x18]	// [V05 tmp1]
-            ldr     r4, [r2+4]
-            str     r4, [sp+0x1c]	// [V05 tmp1+0x04]
-            ldr     r4, [r2+8]
-            str     r4, [sp+0x20]	// [V05 tmp1+0x08]
+            ldr     r12, [r2]
+            str     r12, [sp+0x18]
+            ldr     r12, [r2+4]
+            str     r12, [sp+0x1c]
+            ldr     r12, [r2+8]
+            str     r12, [sp+0x20]

The diff is enormous so I cannot check all of it but all the diffs I saw look like above - a removed add instruction.

Copy link

@sandreenko sandreenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you.
The CQ improvement is impressive.

@@ -326,6 +326,20 @@ void Lowering::LowerBlockStore(GenTreeBlk* blkNode)
if (!blkNode->OperIs(GT_STORE_DYN_BLK) && (size <= CPBLK_UNROLL_LIMIT))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while you are here could you please change the condition to if (!blkNode->OperIs(GT_STORE_BLK) && (size <= CPBLK_UNROLL_LIMIT))?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! But it can probably wait until the next PR (immediately after this one). Let's not kill the CI more than it's needed. 😁

Looks like it can be even better

if (obj) {
}
else if (blk && unroll) {
}
else {
}

The less nesting the better.

@sandreenko sandreenko requested a review from a team October 25, 2019 06:15
@sandreenko sandreenko merged commit 4d6fae3 into dotnet:master Oct 25, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants