
idle cpu consumption >40% #996

Closed
avikivity opened this issue Jan 9, 2022 · 2 comments

@avikivity (Member)

Bisected to 837cadb

Reproducer: build/dev/apps/httpd/httpd --smp 2

Observe that shard 0 eats ~40% CPU instead of 2-3%.

@xemul (Contributor) commented Jan 10, 2022

Samples: 226K of event 'cycles', Event count (approx.): 138621767385, Thread: reactor-12
Overhead  Command     Shared Object              Symbol
  14,73%  reactor-12  httpd                      [.] seastar::smp_message_queue::flush_request_batch
   6,46%  reactor-12  httpd                      [.] seastar::smp_message_queue::process_completions
   4,80%  reactor-12  httpd                      [.] seastar::smp_message_queue::process_queue<4ul, seastar::smp_message_queu…
   3,29%  reactor-12  httpd                      [.] seastar::smp::poll_queues
   1,60%  reactor-12  httpd                      [.] seastar::smp_message_queue::flush_response_batch
   1,11%  reactor-12  httpd                      [.] seastar::smp_message_queue::process_incoming
   0,92%  reactor-12  [vdso]                     [.] 0x00000000000006c8
   0,83%  reactor-12  httpd                      [.] seastar::aio_storage_context::submit_work
   0,83%  reactor-12  httpd                      [.] seastar::internal::try_reap_events
   0,82%  reactor-12  httpd                      [.] seastar::reactor::poll_once

(#986)

@xemul (Contributor) commented Jan 10, 2022

It's the fair_group replenishment timer: it wakes the reactor up twice per millisecond. Maybe the on-demand replenishment was not such a bad idea after all.
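
For context, the pre-fix scheme can be pictured as a group-wide token bucket that a steady-clock timer tops up every 500 usec on a single shard. The sketch below is illustrative only: the 500 usec period and the fair_group name come from the commit message that follows, while everything else (fields, signatures) is assumed, not Seastar's actual fair_group code.

#include <algorithm>
#include <atomic>
#include <chrono>
#include <cstdint>

// Illustrative sketch of the timer-driven replenisher described above;
// not Seastar's actual fair_group implementation.
struct fair_group_sketch {
    std::atomic<uint64_t> tokens{0};
    uint64_t rate_per_sec;   // token generation rate
    uint64_t limit;          // bucket capacity

    // Called from a steady-clock timer every 500 usec, on one shard only.
    // The timer fires whether or not any I/O is in flight, so an idle
    // reactor is woken ~2000 times per second -- the user time visible
    // in the perf profile above.
    void replenish(std::chrono::microseconds period) {
        uint64_t add = rate_per_sec * period.count() / 1'000'000;
        uint64_t cur = tokens.load(std::memory_order_relaxed);
        while (!tokens.compare_exchange_weak(cur, std::min(cur + add, limit))) {
            // cur was reloaded by the failed CAS; retry with the fresh value
        }
    }
};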

xemul added a commit to xemul/seastar that referenced this issue Jan 11, 2022
Right now the token replenisher runs from a group-wide steady-clock
timer. This causes several problems.

First, the replenishing period is hard-coded to be 500 usec and has no
justification other than "seems to work decently enough with the
default latency goal". Next, when the reactor is idling it's woken up
by this timer frequently enough to generate noticeable user time. And
finally, the timer sits on the group and is thus run by a single shard,
making the whole group dependent on that one shard's stalls.

The proposed fix is to make each shard replenish the capacity when it
really needs it. The benefits of this approach are:

- no magic 500 usec constant for the replenish rate
- no dependency on a single shard's rate (the replenisher is shard-safe)
- no user time generated when idling

fixes: scylladb#996

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
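
A minimal sketch of the on-demand idea (assumed shapes and names, not the actual patch): each grab first converts the time elapsed since the last replenishment into tokens, so an idle shard does no periodic work at all.

#include <algorithm>
#include <chrono>
#include <cstdint>

// Illustrative on-demand replenisher; not the actual Seastar change.
// A real multi-shard version would need atomics, elided here for brevity.
class on_demand_bucket_sketch {
    using clock = std::chrono::steady_clock;
    uint64_t _tokens = 0;
    uint64_t _rate_per_sec;   // token generation rate
    uint64_t _limit;          // bucket capacity
    uint64_t _threshold;      // minimal amount worth replenishing, tuned to
                              // the minimal capacity claimable from the group
    clock::time_point _last = clock::now();
public:
    on_demand_bucket_sketch(uint64_t rate, uint64_t limit, uint64_t threshold)
        : _rate_per_sec(rate), _limit(limit), _threshold(threshold) {}

    // Replenish lazily, only when a shard actually tries to grab capacity:
    // no timer, no magic 500 usec constant, no single-shard dependency.
    bool try_grab(uint64_t want) {
        auto now = clock::now();
        auto ns = std::chrono::duration_cast<std::chrono::nanoseconds>(now - _last);
        uint64_t fresh = _rate_per_sec * ns.count() / 1'000'000'000;
        if (fresh >= _threshold) {   // don't replenish in dribbles
            _tokens = std::min(_tokens + fresh, _limit);
            _last = now;
        }
        if (_tokens < want) {
            return false;            // caller retries later; an idle shard
        }                            // never reaches this code at all
        _tokens -= want;
        return true;
    }
};
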
avikivity added a commit that referenced this issue Jan 11, 2022
"
The rate-limiter-based IO scheduler uses two token buckets to rate-limit
requests. Tokens are put into the second bucket (from where they are
then grabbed for dispatch) by a procedure called the "replenisher",
which is run by a steady timer.

This timer causes several problems: its rate is selected by magic, it
runs on a single shard, and it generates noticeable user time when the
reactor is idling.

To fix that, the proposal is to make the io-queue poller replenish the
tokens from all shards, whenever they need them. Before this change, it's
worth tuning the replenishment threshold to be no less than the minimal
capacity that can be claimed from the group.

Verified on an i3en instance with the rl-iosched.

tests: unit(dev), manual.rl-iosched(dev)

This set adds one more item to the TODO list.

If the disk slows down for some reason, the replenisher may start
generating more tokens for the 2nd bucket than appear in the 1st. When
that happens, the replenishment code drops some of the re-generated
tokens until some future time, thus slowing down its rate. This behavior
is deliberate and was aimed at making the token buckets adapt to the
real disk speed.

However, this logic may lead to false drops. Tokens appear in the 1st
bucket in batches, with the "trendline" following the expected rate, but
the replenisher most likely runs between those batches, constantly
generating more tokens just because those batches are not "linear
enough".

This is what surfaced during verification -- when the replenisher was
switched to the on-demand manner it became more "aggressive", thus
losing more tokens. This was partially addressed by the threshold
increase, but some more care is still needed.
"

Fixes #996

* 'br-fair-group-replenish-relax' of https://github.com/xemul/seastar:
  fair_queue: Replenish tokens on grab, not by timer
  fair_queue, io_queue: Configure replenish threshold from minimal IO request
  fair_group: Generalize duration -> capacity conversion
  fair_queue: Tune-up clock_type
  fair_queue: Remove unused _base
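
The false-drop TODO from the merge message can be pictured with the two buckets made explicit. The sketch below is hypothetical (the head/tail counters are an assumption, not Seastar's representation); it shows how a replenisher that samples between completion batches keeps finding the 1st bucket short and clamps away tokens even though the long-run rates match.

#include <algorithm>
#include <cstdint>

// Hypothetical sketch of the two-bucket clamping described in the merge
// message above; not the actual Seastar code.
struct two_bucket_sketch {
    uint64_t head = 0;  // total tokens ever put into the 1st bucket
                        // (returned by completions, typically in batches)
    uint64_t tail = 0;  // total tokens ever moved to the 2nd (dispatch) bucket

    void on_complete(uint64_t t) { head += t; }

    // Move up to `want` time-generated tokens into the dispatch bucket,
    // but never ahead of what the 1st bucket has actually received.
    // The difference want - granted is dropped: deliberate when the disk
    // is genuinely slow, a false drop when the call merely lands between
    // two completion batches.
    uint64_t replenish(uint64_t want) {
        uint64_t granted = std::min(want, head - tail);
        tail += granted;
        return granted;
    }
};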