-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JIT: Hoist in newly recognized loops #96753
Conversation
Create preheaders for all loops, not just loops recognized by old loop finding.
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch Issue DetailsBased on #96751
|
// Note that there is a mismatch between the dominator tree dominance | ||
// and loop header dominance; the dominator tree dominance relation | ||
// guarantees that a block A that dominates B was exited before B is | ||
// entered, meaning it could not possibly have thrown an exception. On | ||
// the other hand loop finding guarantees only that the header was | ||
// entered before other blocks in the loop. If the header is a | ||
// try-begin then blocks inside the catch may not necessarily be fully | ||
// dominated by the header, but may still be part of the loop. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is something we could consider canonicalizing, though I'm not sure it is really necessary (I would expect most reasoning about the header to already need to take into account that only the "beginning" of it is guaranteed to be executed).
An example that hits the assert in the base looks like:
private static int Foo(int[] arr, int n)
{
int sum = 0;
for (int i = 0; i < 100; i++)
{
try
{
sum += arr[i];
}
catch (IndexOutOfRangeException)
{
}
}
return sum;
}
I think we should be able to recognize and optimize loops like this. The flow graph looks like this:
-----------------------------------------------------------------------------------------------------------------------------------------
BBnum BBid ref try hnd preds weight lp [IL range] [jump] [EH region] [flags]
-----------------------------------------------------------------------------------------------------------------------------------------
BB01 [0000] 1 1 [000..006)-> BB05 ( cond ) i
BB02 [0008] 1 BB01 1 [006..???)-> BB03 (always) internal LoopPH q
BB03 [0002] 2 0 BB02,BB04 4 [006..00F)-> BB04 (always) T0 try { } i keep Loop idxlen bwd q
BB04 [0004] 2 BB03,BB06 8 [012..01B)-> BB03 ( cond ) i bwd
BB05 [0006] 2 BB01,BB04 1 [01B..01D) (return) i
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ funclets follow
BB06 [0003] 1 0 0 [00F..012)-> BB04 ( cret ) H0 F catch { } i rare keep xentry flet bwd
-----------------------------------------------------------------------------------------------------------------------------------------
...
L00 header: BB03
Members (3): [BB03..BB04];BB06
Entry: BB02 -> BB03
Exit: BB04 -> BB05
Back: BB04 -> BB03
BB06
is the catch block; it is part of the loop but its immediate dominator is BB02
, since BB03
(the header) is not guaranteed to be exited before BB06
is entered.
I think loops like these are definitely ones we want to be able to recognize and handle. If we run into more odd special casing then I think we can canonicalize these cases by introducing a block before the "try" begin, so that the try begin does not become the header. (Note that a loop-inside-try case could also have the try-begin as the header, but would not have the catch blocks considered as part of the loop, so we would want to differentiate this case in the canonicalization.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can canonicalize these cases by introducing a block before the "try" begin, so that the try begin does not become the header
I like that idea. Removing cases where we pessimize due to difficult EH flow graph structures is a good thing.
/azp run runtime-coreclr jitstress, runtime-coreclr libraries-jitstress |
Azure Pipelines successfully started running 2 pipeline(s). |
cc @dotnet/jit-contrib PTAL @BruceForstall Diffs. Some stats from win-x64: benchmarks.run_pgo -Considered 25592 loops. Of these, we hoisted expressions out of 4524 ( 17.68%).
- A total of 6110 expressions were hoisted, an average of 1.35 per loop-with-hoisted-expr.
+Considered 41324 loops. Of these, we hoisted expressions out of 5604 ( 13.56%).
+ A total of 7644 expressions were hoisted, an average of 1.36 per loop-with-hoisted-expr. 25.1% more hoisted expressions realworld -Considered 8819 loops. Of these, we hoisted expressions out of 1421 ( 16.11%).
- A total of 1894 expressions were hoisted, an average of 1.33 per loop-with-hoisted-expr.
+Considered 10109 loops. Of these, we hoisted expressions out of 1507 ( 14.91%).
+ A total of 2002 expressions were hoisted, an average of 1.33 per loop-with-hoisted-expr. 5.7% more hoisted expressions libraries_tests.run -Considered 74103 loops. Of these, we hoisted expressions out of 13577 ( 18.32%).
- A total of 15872 expressions were hoisted, an average of 1.17 per loop-with-hoisted-expr.
+Considered 124601 loops. Of these, we hoisted expressions out of 18888 ( 15.16%).
+ A total of 21560 expressions were hoisted, an average of 1.14 per loop-with-hoisted-expr. 35.8% more hoisted expressions |
Some stats from win-x64: benchmarks.run_pgo ```diff -Considered 25592 loops. Of these, we hoisted expressions out of 4524 ( 17.68%). - A total of 6110 expressions were hoisted, an average of 1.35 per loop-with-hoisted-expr. +Considered 41324 loops. Of these, we hoisted expressions out of 5604 ( 13.56%). + A total of 7644 expressions were hoisted, an average of 1.36 per loop-with-hoisted-expr. ``` 25.1% more hoisted expressions realworld ```diff -Considered 8819 loops. Of these, we hoisted expressions out of 1421 ( 16.11%). - A total of 1894 expressions were hoisted, an average of 1.33 per loop-with-hoisted-expr. +Considered 10109 loops. Of these, we hoisted expressions out of 1507 ( 14.91%). + A total of 2002 expressions were hoisted, an average of 1.33 per loop-with-hoisted-expr. ``` 5.7% more hoisted expressions libraries_tests.run ```diff -Considered 74103 loops. Of these, we hoisted expressions out of 13577 ( 18.32%). - A total of 15872 expressions were hoisted, an average of 1.17 per loop-with-hoisted-expr. +Considered 124601 loops. Of these, we hoisted expressions out of 18888 ( 15.16%). + A total of 21560 expressions were hoisted, an average of 1.14 per loop-with-hoisted-expr. ``` 35.8% more hoisted expressions
Some stats from win-x64:
benchmarks.run_pgo
25.1% more hoisted expressions
realworld
5.7% more hoisted expressions
libraries_tests.run
35.8% more hoisted expressions