Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Add some convenience classes for node count JIT metrics, and automatic dumping from compShutdown #92112

Merged
merged 4 commits into from
Sep 22, 2023

Conversation

jakobbotsch
Copy link
Member

  • Add a NodeCounts class that can be used to easily record the count of node types seen
  • Add a DumpOnShutdown class that can be used to easily register either a Histogram class or NodeCounts class to be dumped as part of Compiler::compShutdown

For example, I was investigating #91862 and wanted to know what types of nodes we usually see in the index/length positions of BOUNDS_CHECK inside optRemoveRangeCheck. I made the following change:

--- a/src/coreclr/jit/optimizer.cpp
+++ b/src/coreclr/jit/optimizer.cpp
@@ -8849,6 +8849,11 @@ void Compiler::AddModifiedElemTypeAllContainingLoops(unsigned lnum, CORINFO_CLAS
     }
 }
 
+NodeCounts     s_rangeCheckLengthNodes;
+NodeCounts     s_rangeCheckIndexNodes;
+DumpOnShutdown d1("BOUNDS_CHECK length nodes", &s_rangeCheckLengthNodes);
+DumpOnShutdown d2("BOUNDS_CHECK index nodes", &s_rangeCheckIndexNodes);
+
 //------------------------------------------------------------------------------
 // optRemoveRangeCheck : Given an indexing node, mark it as not needing a range check.
 //
@@ -8878,6 +8883,9 @@ GenTree* Compiler::optRemoveRangeCheck(GenTreeBoundsChk* check, GenTree* comma,
                  (check != nullptr && check->OperIs(GT_BOUNDS_CHECK) && comma == nullptr));
     noway_assert(check->OperIs(GT_BOUNDS_CHECK));
 
+    s_rangeCheckIndexNodes.record(check->GetIndex()->OperGet());
+    s_rangeCheckLengthNodes.record(check->GetArrayLength()->OperGet());
+
     GenTree* tree = comma != nullptr ? comma : check;
 
 #ifdef DEBUG

After that, a superpmi.exe replay over libraries_tests.run outputs the following when it finishes:

Loaded 674679  Jitted 674679  FailedCompile 0 Excluded 0 Missing 0
Total time: 565472.612700ms
BOUNDS_CHECK length nodes
ARR_LENGTH           :   51990
LCL_VAR              :   47159
CNS_INT              :    6751
COMMA                :       9
LCL_FLD              :       5
CAST                 :       4

BOUNDS_CHECK index nodes
LCL_VAR              :   62221
CNS_INT              :   42139
COMMA                :     623
CAST                 :     460
ADD                  :     397
RSH                  :      72
NEG                  :       5
UDIV                 :       1

- Add a NodeCounts class that can be used to easily record the count of
  node types seen
- Add a DumpOnShutdown class that can be used to easily register either
  a Histogram class or NodeCounts class to be dumped as part of
  Compiler::compShutdown
@ghost ghost assigned jakobbotsch Sep 15, 2023
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Sep 15, 2023
@ghost
Copy link

ghost commented Sep 15, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details
  • Add a NodeCounts class that can be used to easily record the count of node types seen
  • Add a DumpOnShutdown class that can be used to easily register either a Histogram class or NodeCounts class to be dumped as part of Compiler::compShutdown

For example, I was investigating #91862 and wanted to know what types of nodes we usually see in the index/length positions of BOUNDS_CHECK inside optRemoveRangeCheck. I made the following change:

--- a/src/coreclr/jit/optimizer.cpp
+++ b/src/coreclr/jit/optimizer.cpp
@@ -8849,6 +8849,11 @@ void Compiler::AddModifiedElemTypeAllContainingLoops(unsigned lnum, CORINFO_CLAS
     }
 }
 
+NodeCounts     s_rangeCheckLengthNodes;
+NodeCounts     s_rangeCheckIndexNodes;
+DumpOnShutdown d1("BOUNDS_CHECK length nodes", &s_rangeCheckLengthNodes);
+DumpOnShutdown d2("BOUNDS_CHECK index nodes", &s_rangeCheckIndexNodes);
+
 //------------------------------------------------------------------------------
 // optRemoveRangeCheck : Given an indexing node, mark it as not needing a range check.
 //
@@ -8878,6 +8883,9 @@ GenTree* Compiler::optRemoveRangeCheck(GenTreeBoundsChk* check, GenTree* comma,
                  (check != nullptr && check->OperIs(GT_BOUNDS_CHECK) && comma == nullptr));
     noway_assert(check->OperIs(GT_BOUNDS_CHECK));
 
+    s_rangeCheckIndexNodes.record(check->GetIndex()->OperGet());
+    s_rangeCheckLengthNodes.record(check->GetArrayLength()->OperGet());
+
     GenTree* tree = comma != nullptr ? comma : check;
 
 #ifdef DEBUG

After that, a superpmi.exe replay over libraries_tests.run outputs the following when it finishes:

Loaded 674679  Jitted 674679  FailedCompile 0 Excluded 0 Missing 0
Total time: 565472.612700ms
BOUNDS_CHECK length nodes
ARR_LENGTH           :   51990
LCL_VAR              :   47159
CNS_INT              :    6751
COMMA                :       9
LCL_FLD              :       5
CAST                 :       4

BOUNDS_CHECK index nodes
LCL_VAR              :   62221
CNS_INT              :   42139
COMMA                :     623
CAST                 :     460
ADD                  :     397
RSH                  :      72
NEG                  :       5
UDIV                 :       1
Author: jakobbotsch
Assignees: jakobbotsch
Labels:

area-CodeGen-coreclr

Milestone: -

@jakobbotsch jakobbotsch marked this pull request as ready for review September 19, 2023 19:22
@jakobbotsch
Copy link
Member Author

cc @dotnet/jit-contrib PTAL @AndyAyersMS

Copy link
Member

@BruceForstall BruceForstall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks nice. Would be useful to have some comments for the classes showing how to use them.

entry.Dumpable = dumpable;
break;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add an assert that the DumpOnShutdownEntry table isn't all full?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

void NodeCounts::record(genTreeOps oper)
{
assert(oper < GT_COUNT);
m_counts[oper]++;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this do an interlocked add?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe that would depend on whether this class is instantiated as global, or as a member of Compiler?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, for good measure it should be interlocked add. NodeCounts is meant to be used as a static field (like Histogram) so you can collect stats for multiple compilations, so it can be racing with other threads, although I expect we mainly use it through SPMI where it won't due to the way parallelization works there.

Looks like Histogram should also be changed to use an interlocked add.

@jakobbotsch jakobbotsch merged commit e51acb5 into dotnet:main Sep 22, 2023
@jakobbotsch jakobbotsch deleted the jit-adhoc-metrics branch September 22, 2023 19:10
@ghost ghost locked as resolved and limited conversation to collaborators Oct 23, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants