[9.0] Make CPU utilization checks in the thread pool configurable #112791

Open
wants to merge 1 commit into
base: release/9.0-staging

Member

@kouvel kouvel commented Feb 21, 2025

  • Port of Make CPU utilization checks in the thread pool configurable #112789 to 9.0
  • On Windows, checking CPU utilization seems to involve a small amount of overhead, which can become noticeable or even significant in some scenarios. This change makes the interval over which CPU utilization is computed configurable. Increasing the interval makes CPU utilization update less often. The same config variable can also be used to disable CPU utilization checks, in which case features that use CPU utilization behave as though it is low.
  • CPU utilization is used by the starvation heuristic and hill climbing. When CPU utilization is very high, the starvation heuristic reduces the rate of thread injection in starved cases. When CPU utilization is high, hill climbing avoids settling on higher thread count control values.
  • CPU utilization is currently updated when the gate thread performs periodic activities, which happens typically every 500 ms when a worker thread is active. There is one gate thread per .NET process.
  • In scenarios where many .NET processes are running, and where many of them use the thread pool frequently but lightly, overall CPU usage may be relatively low, but the overhead from CPU utilization checks can add up to a noticeable portion of it. In a scenario involving hundreds of .NET processes, CPU utilization checks were seen to amount to 0.5-1% of overall CPU usage on the machine, which was considered significant.
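
To illustrate the idea in the bullets above, here is a minimal sketch of an interval-gated CPU utilization check. It is not the code from this PR: the class, its members, and the `readCpuPercent` delegate (standing in for the OS query that carries the overhead) are hypothetical names chosen for illustration, and treating an interval of 0 as "disabled" is an assumption based on the description above.

```csharp
using System;

// Hypothetical sketch; names are illustrative and not taken from the runtime source.
public sealed class CpuUtilizationSampler
{
    private readonly int _intervalMs;            // assumed: 0 or less disables the checks
    private readonly Func<int> _readCpuPercent;  // stand-in for the costly OS query
    private long _lastSampleMs;
    private bool _hasSample;

    // Stays at 0 when checks are disabled, so consumers see "low" utilization.
    public int CpuUtilization { get; private set; }

    // Number of times the costly query was actually made.
    public int SampleCount { get; private set; }

    public CpuUtilizationSampler(int intervalMs, Func<int> readCpuPercent)
    {
        _intervalMs = intervalMs;
        _readCpuPercent = readCpuPercent;
    }

    // Intended to be called on every periodic gate thread tick.
    public void OnGateThreadTick(long nowMs)
    {
        if (_intervalMs <= 0)
        {
            return; // checks disabled: never query the OS
        }

        if (_hasSample && nowMs - _lastSampleMs < _intervalMs)
        {
            return; // interval has not elapsed yet: keep the previous value
        }

        CpuUtilization = _readCpuPercent();
        _lastSampleMs = nowMs;
        _hasSample = true;
        SampleCount++;
    }
}

public static class Demo
{
    public static void Main()
    {
        // Simulate 10 seconds of gate thread ticks, one every 500 ms.
        var sampler = new CpuUtilizationSampler(2000, () => 42);
        for (long now = 0; now < 10_000; now += 500)
        {
            sampler.OnGateThreadTick(now);
        }

        Console.WriteLine(sampler.SampleCount); // prints 5 (samples at 0, 2000, 4000, 6000, 8000 ms)
    }
}
```

With a 500 ms interval the same loop would query on all 20 ticks, and with an interval of 0 it would never query, leaving `CpuUtilization` at 0.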

Customer Impact

A first-party (1p) customer running a large number of .NET services on a system with a large number of processors is seeing about 0.5-1% of total CPU time on the system spent on CPU utilization checks in the .NET thread pool. Due to other controls on usage, the positive impact of the CPU utilization checks is likely negligible, and the customer would like to reduce the CPU usage from these checks, or even eliminate the checks.

Regression?

No

Testing

Validated on a small test case that CPU usage from CPU utilization checks is reduced when the interval is increased, and eliminated when the checks are disabled.

Risk

Low. The change is under an opt-in config setting.

@kouvel kouvel added this to the 9.0.x milestone Feb 21, 2025
@kouvel kouvel self-assigned this Feb 21, 2025
@Copilot Copilot bot review requested due to automatic review settings February 21, 2025 17:49

Copilot reviewed 1 out of 1 changed files in this pull request and generated no comments.

Comments suppressed due to low confidence (1)

src/libraries/System.Private.CoreLib/src/System/Threading/PortableThreadPool.GateThread.cs:124

  • Ensure that the new behavior of conditionally updating CPU utilization based on the interval is covered by tests.
if (cpuUtilizationIntervalMs > 0 &&
Member

@jeffschwMSFT jeffschwMSFT left a comment

lgtm. we will take for consideration in 9.0.x

@jeffschwMSFT jeffschwMSFT added the Servicing-consider Issue for next servicing release review label Feb 25, 2025
@rbhanda rbhanda modified the milestones: 9.0.x, 9.0.4 Feb 27, 2025
@rbhanda rbhanda added Servicing-approved Approved for servicing release and removed Servicing-consider Issue for next servicing release review labels Feb 27, 2025
@kouvel kouvel closed this Feb 28, 2025
@kouvel kouvel reopened this Feb 28, 2025
Member Author

kouvel commented Mar 1, 2025

It seems the build analysis is failing in recent 9.0-staging PRs even though it doesn't look like there are any unknown failures. #112991 is another example. Not sure how to file an auto-tracking issue for that since there's not a particular failure to track. Would it be ok to bypass the build-analysis?

Member Author

kouvel commented Mar 1, 2025

It may be because the dotnet-linker-tests job was canceled due to a disconnect.

Labels
area-System.Threading Servicing-approved Approved for servicing release
5 participants