
idle cpu consumption >40% #996

Closed
avikivity opened this issue Jan 9, 2022 · 2 comments

@avikivity (Member)

Bisected to 837cadb

Reproducer: build/dev/apps/httpd/httpd --smp 2

Observe that shard 0 eats ~40% CPU instead of 2-3%.

@xemul (Contributor) commented Jan 10, 2022

Samples: 226K of event 'cycles', Event count (approx.): 138621767385, Thread: reactor-12
Overhead  Command     Shared Object              Symbol
  14,73%  reactor-12  httpd                      [.] seastar::smp_message_queue::flush_request_batch
   6,46%  reactor-12  httpd                      [.] seastar::smp_message_queue::process_completions
   4,80%  reactor-12  httpd                      [.] seastar::smp_message_queue::process_queue<4ul, seastar::smp_message_queu…
   3,29%  reactor-12  httpd                      [.] seastar::smp::poll_queues
   1,60%  reactor-12  httpd                      [.] seastar::smp_message_queue::flush_response_batch
   1,11%  reactor-12  httpd                      [.] seastar::smp_message_queue::process_incoming
   0,92%  reactor-12  [vdso]                     [.] 0x00000000000006c8
   0,83%  reactor-12  httpd                      [.] seastar::aio_storage_context::submit_work
   0,83%  reactor-12  httpd                      [.] seastar::internal::try_reap_events
   0,82%  reactor-12  httpd                      [.] seastar::reactor::poll_once

(#986)

@xemul (Contributor) commented Jan 10, 2022

It's the fair_group replenishment timer: it wakes the reactor up twice per millisecond. Maybe the on-demand replenishment was not such a bad idea after all.
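
For context, the pre-fix scheme can be pictured as a group-wide token bucket that a steady-clock timer tops up every 500 usec on a single shard. The sketch below is illustrative only: the 500 usec period and the fair_group name come from the commit message that follows, while everything else (fields, signatures) is assumed, not Seastar's actual fair_group code.

#include <algorithm>
#include <atomic>
#include <chrono>
#include <cstdint>

// Illustrative sketch of the timer-driven replenisher described above;
// not Seastar's actual fair_group implementation.
struct fair_group_sketch {
    std::atomic<uint64_t> tokens{0};
    uint64_t rate_per_sec;   // token generation rate
    uint64_t limit;          // bucket capacity

    // Called from a steady-clock timer every 500 usec, on one shard only.
    // The timer fires whether or not any I/O is in flight, so an idle
    // reactor is woken ~2000 times per second -- the user time visible
    // in the perf profile above.
    void replenish(std::chrono::microseconds period) {
        uint64_t add = rate_per_sec * period.count() / 1'000'000;
        uint64_t cur = tokens.load(std::memory_order_relaxed);
        while (!tokens.compare_exchange_weak(cur, std::min(cur + add, limit))) {
            // cur was reloaded by the failed CAS; retry with the fresh value
        }
    }
};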

xemul added a commit to xemul/seastar that referenced this issue Jan 11, 2022
Right now the token replenisher runs from a group-wide steady-clock
timer. This causes several problems.

First, the replenishing period is hard-coded to be 500 usec and has no
justification other than "seems to work decently enough with the
default latency goal". Next, when the reactor is idling it's woken up
by this timer frequently enough to generate noticeable user time. And
finally, the timer sits on the group and is thus run by a single shard,
making the whole group dependent on that one shard's stalls.

The proposed fix is to make each shard replenish the capacity when it
really needs it. The benefits of this approach are:

- no magic 500 usec constant for the replenish rate
- no dependency on a single shard's rate (the replenisher is shard-safe)
- no user time generated when idling

fixes: scylladb#996

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
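
A minimal sketch of the on-demand idea (assumed shapes and names, not the actual patch): each grab first converts the time elapsed since the last replenishment into tokens, so an idle shard does no periodic work at all.

#include <algorithm>
#include <chrono>
#include <cstdint>

// Illustrative on-demand replenisher; not the actual Seastar change.
// A real multi-shard version would need atomics, elided here for brevity.
class on_demand_bucket_sketch {
    using clock = std::chrono::steady_clock;
    uint64_t _tokens = 0;
    uint64_t _rate_per_sec;   // token generation rate
    uint64_t _limit;          // bucket capacity
    uint64_t _threshold;      // minimal amount worth replenishing, tuned to
                              // the minimal capacity claimable from the group
    clock::time_point _last = clock::now();
public:
    on_demand_bucket_sketch(uint64_t rate, uint64_t limit, uint64_t threshold)
        : _rate_per_sec(rate), _limit(limit), _threshold(threshold) {}

    // Replenish lazily, only when a shard actually tries to grab capacity:
    // no timer, no magic 500 usec constant, no single-shard dependency.
    bool try_grab(uint64_t want) {
        auto now = clock::now();
        auto ns = std::chrono::duration_cast<std::chrono::nanoseconds>(now - _last);
        uint64_t fresh = _rate_per_sec * ns.count() / 1'000'000'000;
        if (fresh >= _threshold) {   // don't replenish in dribbles
            _tokens = std::min(_tokens + fresh, _limit);
            _last = now;
        }
        if (_tokens < want) {
            return false;            // caller retries later; an idle shard
        }                            // never reaches this code at all
        _tokens -= want;
        return true;
    }
};
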
avikivity added a commit that referenced this issue Jan 11, 2022
"
The rate-limiter-based IO scheduler uses two token buckets to rate-limit
requests. Tokens are put into the second bucket (from where they are
then grabbed for dispatch) by a procedure called the "replenisher",
which is run by a steady timer.

This timer causes several problems: its rate is selected by magic, it
runs on a single shard, and it generates noticeable user time when the
reactor is idling.

To fix that, the proposal is to make the io-queue poller replenish the
tokens from all shards, whenever they need them. Before this change, it's
worth tuning the replenishment threshold to be no less than the minimal
capacity that can be claimed from the group.

Verified on an i3en instance with the rl-iosched.

tests: unit(dev), manual.rl-iosched(dev)

This set adds one more item to the TODO list.

If the disk slows down for some reason, the replenisher may start
generating more tokens for the 2nd bucket than appear in the 1st. When
that happens, the replenishment code drops some of the re-generated
tokens until some future time, thus slowing down its rate. This behavior
is deliberate and was aimed at making the token buckets adapt to the
real disk speed.

However, this logic may lead to false drops. Tokens appear in the 1st
bucket in batches, with the "trendline" following the expected rate, but
the replenisher most likely runs between those batches, constantly
generating more tokens just because those batches are not "linear
enough".

This is what surfaced during verification -- when the replenisher was
switched to the on-demand manner it became more "aggressive", thus
losing more tokens. This was partially addressed by the threshold
increase, but some more care is still needed.
"

Fixes #996

* 'br-fair-group-replenish-relax' of https://github.com/xemul/seastar:
  fair_queue: Replenish tokens on grab, not by timer
  fair_queue, io_queue: Configure replenish threshold from minimal IO request
  fair_group: Generalize duration -> capacity conversion
  fair_queue: Tune-up clock_type
  fair_queue: Remove unused _base
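
The false-drop TODO from the merge message can be pictured with the two buckets made explicit. The sketch below is hypothetical (the head/tail counters are an assumption, not Seastar's representation); it shows how a replenisher that samples between completion batches keeps finding the 1st bucket short and clamps away tokens even though the long-run rates match.

#include <algorithm>
#include <cstdint>

// Hypothetical sketch of the two-bucket clamping described in the merge
// message above; not the actual Seastar code.
struct two_bucket_sketch {
    uint64_t head = 0;  // total tokens ever put into the 1st bucket
                        // (returned by completions, typically in batches)
    uint64_t tail = 0;  // total tokens ever moved to the 2nd (dispatch) bucket

    void on_complete(uint64_t t) { head += t; }

    // Move up to `want` time-generated tokens into the dispatch bucket,
    // but never ahead of what the 1st bucket has actually received.
    // The difference want - granted is dropped: deliberate when the disk
    // is genuinely slow, a false drop when the call merely lands between
    // two completion batches.
    uint64_t replenish(uint64_t want) {
        uint64_t granted = std::min(want, head - tail);
        tail += granted;
        return granted;
    }
};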