Skip to content

Commit

Permalink
Move local thread pool work items to global by default when blocking …
Browse files Browse the repository at this point in the history
…on a task (#112796)

* Move local thread pool work items to global by default when blocking on a task

- Enables the change in #109841 by default
- A worker thread blocking on a task while having pending local work items can lead to priority inversion and deadlock-like issues, as local work items are stolen with the lowest priority by other threads. In a situation where all worker threads are blocked on tasks and new ones continue to block on tasks similarly, the global queues may not deplete for a long time and any local work items of blocked threads may not run meanwhile. There have been some reports of issues like this occurring while coinciding with other external issues that aggravate the problem.
- When a worker thread blocks on a task, the change moves any local work items to the high-priority global queue. Using the normal-priority queues can lead to other priority inversion issues in worst-case scenarios like above, where the normal-priority queues may consist of a number of new requests, and local work items typically represent already in-flight work.
  • Loading branch information
kouvel authored Feb 26, 2025
1 parent bb19ea4 commit f5c7344
Show file tree
Hide file tree
Showing 2 changed files with 46 additions and 25 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -3060,26 +3060,21 @@ private bool SpinThenBlockingWait(int millisecondsTimeout, CancellationToken can
bool returnValue = SpinWait(millisecondsTimeout);
if (!returnValue)
{
#if CORECLR
if (ThreadPoolWorkQueue.s_prioritizationExperiment)
{
// We're about to block waiting for the task to complete, which is expensive, and if
// the task being waited on depends on some other work to run, this thread could end up
// waiting for some other thread to do work. If the two threads are part of the same scheduler,
// such as the thread pool, that could lead to a (temporary) deadlock. This is made worse by
// it also leading to a possible priority inversion on previously queued work. Each thread in
// the thread pool has a local queue. A key motivator for this local queue is it allows this
// thread to create work items that it will then prioritize above all other work in the
// pool. However, while this thread makes its own local queue the top priority, that queue is
// every other thread's lowest priority. If this thread blocks, all of its created work that's
// supposed to be high priority becomes low priority, and work that's typically part of a
// currently in-flight operation gets deprioritized relative to new requests coming into the
// pool, which can lead to the whole system slowing down or even deadlocking. To address that,
// just before we block, we move all local work into a global queue, so that it's at least
// prioritized by other threads more fairly with respect to other work.
ThreadPoolWorkQueue.TransferAllLocalWorkItemsToHighPriorityGlobalQueue();
}
#endif
// We're about to block waiting for the task to complete, which is expensive, and if
// the task being waited on depends on some other work to run, this thread could end up
// waiting for some other thread to do work. If the two threads are part of the same scheduler,
// such as the thread pool, that could lead to a (temporary) deadlock. This is made worse by
// it also leading to a possible priority inversion on previously queued work. Each thread in
// the thread pool has a local queue. A key motivator for this local queue is it allows this
// thread to create work items that it will then prioritize above all other work in the
// pool. However, while this thread makes its own local queue the top priority, that queue is
// every other thread's lowest priority. If this thread blocks, all of its created work that's
// supposed to be high priority becomes low priority, and work that's typically part of a
// currently in-flight operation gets deprioritized relative to new requests coming into the
// pool, which can lead to the whole system slowing down or even deadlocking. To address that,
// just before we block, we move all local work into a global queue, so that it's at least
// prioritized by other threads more fairly with respect to other work.
ThreadPoolWorkQueue.TransferAllLocalWorkItemsToHighPriorityGlobalQueue();

var mres = new SetOnInvokeMres();
try
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -621,8 +621,8 @@ internal void EnsureThreadRequested()
// Only request a thread if the stage is NotScheduled.
// Otherwise let the current requested thread handle parallelization.
if (Interlocked.Exchange(
ref _separated.queueProcessingStage,
QueueProcessingStage.Scheduled) == QueueProcessingStage.NotScheduled)
ref _separated.queueProcessingStage,
QueueProcessingStage.Scheduled) == QueueProcessingStage.NotScheduled)
{
ThreadPool.RequestWorkerThread();
}
Expand Down Expand Up @@ -723,14 +723,40 @@ internal static void TransferAllLocalWorkItemsToHighPriorityGlobalQueue()
// Pop each work item off the local queue and push it onto the global. This is a
// bounded loop as no other thread is allowed to push into this thread's queue.
ThreadPoolWorkQueue queue = ThreadPool.s_workQueue;
bool addedHighPriorityWorkItem = false;
bool ensureThreadRequest = false;
while (tl.workStealingQueue.LocalPop() is object workItem)
{
queue.highPriorityWorkItems.Enqueue(workItem);
// If there's an unexpected exception here that happens to get handled, the lost work item, or missing thread
// request, etc., may lead to other issues. A fail-fast or try-finally here could reduce the effect of such
// uncommon issues to various degrees, but it's also uncommon to check for unexpected exceptions.
try
{
queue.highPriorityWorkItems.Enqueue(workItem);
addedHighPriorityWorkItem = true;
}
catch (OutOfMemoryException)
{
// This is not expected to throw under normal circumstances
tl.workStealingQueue.LocalPush(workItem);

// A work item had been removed temporarily and other threads may have missed stealing it, so ensure that
// there will be a thread request
ensureThreadRequest = true;
break;
}
}

Volatile.Write(ref queue._mayHaveHighPriorityWorkItems, true);
if (addedHighPriorityWorkItem)
{
Volatile.Write(ref queue._mayHaveHighPriorityWorkItems, true);
ensureThreadRequest = true;
}

queue.EnsureThreadRequested();
if (ensureThreadRequest)
{
queue.EnsureThreadRequested();
}
}

internal static bool LocalFindAndPop(object callback)
Expand Down

0 comments on commit f5c7344

Please sign in to comment.