-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Port yield normalization from CoreCLR to Native AOT #103675
Merged
+390
−653
Merged
Changes from all commits
Commits
Show all changes
23 commits
Select commit
Hold shift + click to select a range
392c652
Initial commit
9237b9f
Use PalGetTickCount64
830d8d0
Add limits.h
d0a884c
Declare g_pFinalizerThread for Windows only
165bbb9
PR comments
b089bac
Fix build/x86
f4ed8e8
Remove finalizer thread from native aot
b646568
Remove unnecessary code
4127158
PR comments + Fix InterlockedExchange
a862782
Add TODOs
73d3d71
Use max/min and RhEnableFinalization
0d226da
Remove TODO
4c315c8
Fix Interlocked calls
17dac7e
Fix static_assert
e09fa54
Add PerformMeasurement call
73f3fef
Add YieldProcessorMeasurement
e07035e
Nit
9a76a18
Fix PerformMeasurement
6519b0b
Move PerformMeasurement
9606eb9
Fix PalInterlockedExchange64
234d61b
PR comments
51c4573
Fix build
e8e1290
Fix PalInterlocked
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -113,3 +113,4 @@ ThreadPoolWorkingThreadCount | |
ThreadRunning | ||
WaitHandleWaitStart | ||
WaitHandleWaitStop | ||
YieldProcessorMeasurement |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is the measurement going to be triggered when this is deleted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still trying to figure this out, I'm not very familiar with Native AOT in general so I'd appreciate any suggestions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we would need to call
YieldProcessorNormalization::PerformMeasurement()
from here or add aEnsureYieldProcessorNormalizedInitialized()
entry point to the new code that simply callsYieldProcessorNormalization::PerformMeasurement()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you happen to know if this function is called every ~4 seconds or faster than that? Currently we let
YieldProcessorNormalization::PerformMeasurement()
run every ~4 s so if that's the case, I believe we may add here the same call as in CoreCLRThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FinalizerStart
function is called once per process. It is equivalent ofFinalizerThreadStart
function in regular CoreCLR.I think you want to follow the same structure as in regular CoreCLR: Trigger the measurement from ScheduleMeasurementIfNecessary by calling
RhEnableFinalization
(it is equivalent ofFinalizerThread::EnableFinalization
in regular CoreCLR) and then add the measurement to loop inProcessFinalizers()
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure. The whole deal with measuring duration of something that is proportional to CPU cycle is not very precise, since the CPU cycle can change drastically and many times per second and will be different for every core. Unless machine is configured into HighPerformance power plan, every measurement is a bit of a coin toss and will produce the same result with the same error margins.
The main purpose of calibration is to continue using historically hard-coded spin counts in numerous places where we spinwait while allowing that to work on systems with vastly different pause durations (i.e. on post-skylake intel CPUs pause takes ~140 cycles, pre-skylake is about ~10 cycles). For such purpose the callibration is precise enough.
I am not sure about the value of redoing the measurement over and over.
Perhaps to support scenarios where a VM is migrated between pre/post skylake machines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we can add a periodic call PerformMeasurement in NativeAOT and see what happens.
My guess - nothing will change, just a bit more time spent in PerformMeasurement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is value in having the same behavior though.
If the re-measuring (or the whole calibration deal) could be somehow avoided or improved, it would make sense to do for both runtimes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC there's a good reason to keep re-doing measurements, so probably keeping this behaviour in Native AOT would be better, I believe @kouvel or @mangod9 may elaborate better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The measurements done are very short and can be perturbed by CPU activity, the rolling min helps to stabilize it over time.