JIT: Hoist in newly recognized loops #96753

jakobbotsch · 2024-01-10T11:51:30Z

Some stats from win-x64:

benchmarks.run_pgo

-Considered 25592 loops.  Of these, we hoisted expressions out of 4524 ( 17.68%).
-  A total of 6110 expressions were hoisted, an average of  1.35 per loop-with-hoisted-expr.
+Considered 41324 loops.  Of these, we hoisted expressions out of 5604 ( 13.56%).
+  A total of 7644 expressions were hoisted, an average of  1.36 per loop-with-hoisted-expr.

25.1% more hoisted expressions

realworld

-Considered 8819 loops.  Of these, we hoisted expressions out of 1421 ( 16.11%).
-  A total of 1894 expressions were hoisted, an average of  1.33 per loop-with-hoisted-expr.
+Considered 10109 loops.  Of these, we hoisted expressions out of 1507 ( 14.91%).
+  A total of 2002 expressions were hoisted, an average of  1.33 per loop-with-hoisted-expr.

5.7% more hoisted expressions

libraries_tests.run

-Considered 74103 loops.  Of these, we hoisted expressions out of 13577 ( 18.32%).
-  A total of 15872 expressions were hoisted, an average of  1.17 per loop-with-hoisted-expr.
+Considered 124601 loops.  Of these, we hoisted expressions out of 18888 ( 15.16%).
+  A total of 21560 expressions were hoisted, an average of  1.14 per loop-with-hoisted-expr.

35.8% more hoisted expressions
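
The "% more hoisted expressions" figures follow from the before/after "A total of N expressions were hoisted" lines in each diff. A minimal C++ sketch of that arithmetic; the helper names `percentMore` and `roundTo1` are illustrative only and are not part of the JIT or its tooling:

```cpp
#include <cmath>

// Hypothetical helper: relative increase from `before` to `after`, in percent.
double percentMore(int before, int after)
{
    return 100.0 * (after - before) / before;
}

// Round to one decimal place, the precision used for the figures above.
double roundTo1(double value)
{
    return std::round(value * 10.0) / 10.0;
}
```

With the totals above, `roundTo1(percentMore(6110, 7644))` gives 25.1, `roundTo1(percentMore(1894, 2002))` gives 5.7, and `roundTo1(percentMore(15872, 21560))` gives 35.8.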

Create preheaders for all loops, not just loops recognized by old loop finding.

ghost · 2024-01-10T11:51:43Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Based on #96751

Author:	jakobbotsch
Assignees:	jakobbotsch
Labels:	`area-CodeGen-coreclr`
Milestone:	-

jakobbotsch · 2024-01-10T13:32:33Z

src/coreclr/jit/optimizer.cpp

+        // Note that there is a mismatch between the dominator tree dominance
+        // and loop header dominance; the dominator tree dominance relation
+        // guarantees that a block A that dominates B was exited before B is
+        // entered, meaning it could not possibly have thrown an exception. On
+        // the other hand loop finding guarantees only that the header was
+        // entered before other blocks in the loop. If the header is a
+        // try-begin then blocks inside the catch may not necessarily be fully
+        // dominated by the header, but may still be part of the loop.


This is something we could consider canonicalizing, though I'm not sure it is really necessary (I would expect most reasoning about the header to already need to take into account that only the "beginning" of it is guaranteed to be executed).

An example that hits the assert in the base looks like:

private static int Foo(int[] arr, int n)
{
    int sum = 0;
    for (int i = 0; i < 100; i++)
    {
        try
        {
            sum += arr[i];
        }
        catch (IndexOutOfRangeException)
        {
        }
    }

    return sum;
}

I think we should be able to recognize and optimize loops like this. The flow graph looks like this:

-----------------------------------------------------------------------------------------------------------------------------------------
BBnum BBid ref try hnd preds           weight    lp [IL range]     [jump]      [EH region]         [flags]
-----------------------------------------------------------------------------------------------------------------------------------------
BB01 [0000]  1                             1       [000..006)-> BB05 ( cond )                     i
BB02 [0008]  1       BB01                  1       [006..???)-> BB03 (always)                     internal LoopPH q
BB03 [0002]  2  0    BB02,BB04             4       [006..00F)-> BB04 (always) T0      try { }     i keep Loop idxlen bwd q
BB04 [0004]  2       BB03,BB06             8       [012..01B)-> BB03 ( cond )                     i bwd
BB05 [0006]  2       BB01,BB04             1       [01B..01D)        (return)                     i
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ funclets follow
BB06 [0003]  1     0                       0       [00F..012)-> BB04 ( cret )    H0 F catch { }   i rare keep xentry flet bwd
-----------------------------------------------------------------------------------------------------------------------------------------
...
L00 header: BB03
  Members (3): [BB03..BB04];BB06
  Entry: BB02 -> BB03
  Exit: BB04 -> BB05
  Back: BB04 -> BB03

BB06 is the catch block; it is part of the loop but its immediate dominator is BB02, since BB03 (the header) is not guaranteed to be exited before BB06 is entered.
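
The dominance claim can be checked on a toy model of this flow graph. The sketch below is not JIT code: it runs the textbook iterative dominator computation over the regular edges from the dump, plus exceptional edges into BB06 that are an assumption of this sketch (modeled from the try-begin BB03 and from the blocks that jump to it, BB02 and BB04, since the dump lists no preds for BB06). All function names are hypothetical.

```cpp
#include <array>
#include <cstdint>
#include <vector>

constexpr int kNumBlocks = 6; // index 0..5 stands for BB01..BB06

using PredLists = std::array<std::vector<int>, kNumBlocks>;
using DomSets   = std::array<uint32_t, kNumBlocks>; // bit i set => block i dominates

// Build predecessor lists for the toy graph (block numbers are 1-based).
PredLists buildPreds()
{
    PredLists preds;
    auto addEdge = [&preds](int fromBB, int toBB) { preds[toBB - 1].push_back(fromBB - 1); };

    // Regular flow edges, taken from the [jump] and preds columns of the dump.
    addEdge(1, 2);
    addEdge(1, 5);
    addEdge(2, 3);
    addEdge(3, 4);
    addEdge(4, 3);
    addEdge(4, 5);
    addEdge(6, 4);

    // ASSUMPTION of this sketch: exceptional edges into the catch block BB06,
    // from the try-begin BB03 and from the blocks that jump to it.
    addEdge(2, 6);
    addEdge(3, 6);
    addEdge(4, 6);
    return preds;
}

// Iterative data flow: dom(n) = {n} united with the intersection of dom(p)
// over all predecessors p, repeated until nothing changes.
DomSets computeDominators(const PredLists& preds)
{
    const uint32_t all = (1u << kNumBlocks) - 1;
    DomSets dom;
    dom.fill(all);
    dom[0] = 1u; // the entry block BB01 is dominated only by itself

    bool changed = true;
    while (changed)
    {
        changed = false;
        for (int n = 1; n < kNumBlocks; n++)
        {
            uint32_t meet = all;
            for (int p : preds[n])
            {
                meet &= dom[p];
            }
            const uint32_t updated = meet | (1u << n);
            if (updated != dom[n])
            {
                dom[n]  = updated;
                changed = true;
            }
        }
    }
    return dom;
}

// Does block aBB dominate block bBB? (1-based block numbers)
bool dominates(const DomSets& dom, int aBB, int bBB)
{
    return ((dom[bBB - 1] >> (aBB - 1)) & 1u) != 0;
}

// The immediate dominator is the strict dominator that every other strict
// dominator also dominates. Returns 0 for the entry block.
int immediateDominator(const DomSets& dom, int bBB)
{
    const uint32_t strict = dom[bBB - 1] & ~(1u << (bBB - 1));
    for (int c = 1; c <= kNumBlocks; c++)
    {
        const bool isStrictDom = ((strict >> (c - 1)) & 1u) != 0;
        if (isStrictDom && (dom[c - 1] & strict) == strict)
        {
            return c;
        }
    }
    return 0;
}
```

Under those assumed edges, `dominates(dom, 3, 6)` is false and `immediateDominator(dom, 6)` returns 2 for `dom = computeDominators(buildPreds())`, which matches the statement that BB06's immediate dominator is BB02 rather than the header BB03.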

I think loops like these are definitely ones we want to be able to recognize and handle. If we run into more odd special casing then I think we can canonicalize these cases by introducing a block before the "try" begin, so that the try begin does not become the header. (Note that a loop-inside-try case could also have the try-begin as the header, but would not have the catch blocks considered as part of the loop, so we would want to differentiate this case in the canonicalization.)

BruceForstall · 2024-01-11

> I think we can canonicalize these cases by introducing a block before the "try" begin, so that the try begin does not become the header

I like that idea. Removing cases where we pessimize due to difficult EH flow graph structures is a good thing.

jakobbotsch · 2024-01-10T23:52:04Z

/azp run runtime-coreclr jitstress, runtime-coreclr libraries-jitstress

azure-pipelines · 2024-01-10T23:52:23Z

Azure Pipelines successfully started running 2 pipeline(s).

jakobbotsch · 2024-01-11T12:22:49Z

cc @dotnet/jit-contrib PTAL @BruceForstall

Diffs. Some stats from win-x64:

benchmarks.run_pgo

-Considered 25592 loops.  Of these, we hoisted expressions out of 4524 ( 17.68%).
-  A total of 6110 expressions were hoisted, an average of  1.35 per loop-with-hoisted-expr.
+Considered 41324 loops.  Of these, we hoisted expressions out of 5604 ( 13.56%).
+  A total of 7644 expressions were hoisted, an average of  1.36 per loop-with-hoisted-expr.

25.1% more hoisted expressions

realworld

-Considered 8819 loops.  Of these, we hoisted expressions out of 1421 ( 16.11%).
-  A total of 1894 expressions were hoisted, an average of  1.33 per loop-with-hoisted-expr.
+Considered 10109 loops.  Of these, we hoisted expressions out of 1507 ( 14.91%).
+  A total of 2002 expressions were hoisted, an average of  1.33 per loop-with-hoisted-expr.

5.7% more hoisted expressions

libraries_tests.run

-Considered 74103 loops.  Of these, we hoisted expressions out of 13577 ( 18.32%).
-  A total of 15872 expressions were hoisted, an average of  1.17 per loop-with-hoisted-expr.
+Considered 124601 loops.  Of these, we hoisted expressions out of 18888 ( 15.16%).
+  A total of 21560 expressions were hoisted, an average of  1.14 per loop-with-hoisted-expr.

35.8% more hoisted expressions

cincuranet · 2024-01-16T17:28:31Z

Regressions:

[Perf] Windows/x64: 11 Regressions on 1/11/2024 10:37:40 PM perf-autofiling-issues#27331
[Perf] Windows/x64: 27 Regressions on 1/11/2024 10:37:40 PM #97093
[Perf] Windows/x64: 3 Regressions on 1/11/2024 5:32:36 PM perf-autofiling-issues#27453
[Perf] Windows/arm64: 2 Regressions on 1/12/2024 1:47:53 AM perf-autofiling-issues#27476

Improvements:

[Perf] Windows/x64: 22 Improvements on 1/11/2024 10:37:40 PM perf-autofiling-issues#27340
[Perf] Windows/x64: 20 Improvements on 1/11/2024 10:37:40 PM perf-autofiling-issues#27471


jakobbotsch added 2 commits January 10, 2024 12:22

JIT: Canonicalize newly recognized loops

8071259

Create preheaders for all loops, not just loops recognized by old loop finding.

JIT: Hoist in newly recognized loops

5646422

dotnet-issue-labeler bot added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) Jan 10, 2024

ghost assigned jakobbotsch Jan 10, 2024

Fix exceptional flow case

2733f5c

jakobbotsch commented Jan 10, 2024


jakobbotsch mentioned this pull request Jan 10, 2024

Improve JIT loop optimizations (.NET 9) #93144

Closed

21 tasks

Merge branch 'main' of github.com:dotnet/runtime into hoist-new-loops

74d15cb

build-analysis bot mentioned this pull request Jan 11, 2024

Test failure - System.NullReferenceException in System.Threading.Lock.TryInitializeStatics #94728

Closed

jakobbotsch marked this pull request as ready for review January 11, 2024 12:22

jakobbotsch requested a review from BruceForstall January 11, 2024 12:22

BruceForstall approved these changes Jan 11, 2024


jakobbotsch merged commit 23a93aa into dotnet:main Jan 11, 2024
168 of 171 checks passed

jakobbotsch deleted the hoist-new-loops branch January 11, 2024 19:07

jakobbotsch mentioned this pull request Jan 16, 2024

[Perf] Windows/x64: 27 Regressions on 1/11/2024 10:37:40 PM #97093

Closed

DrewScoggins mentioned this pull request Jan 16, 2024

[Perf] Linux/x64: 5 Regressions on 1/11/2024 10:37:40 PM #97042

Open

github-actions bot locked and limited conversation to collaborators Feb 18, 2024
