Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use atomics instead of synchronized for Gauge #482

Merged
merged 5 commits into from Sep 5, 2019
Merged

Use atomics instead of synchronized for Gauge #482

merged 5 commits into from Sep 5, 2019

Conversation

ghost
Copy link

@ghost ghost commented Jun 5, 2019

Idea:

We could implement Gauge#set with initializing a new DoubleAdder to avoid exposing invalid state, but that might have significant allocation (GC) overhead.

Pros:

  • avoids thread contention for highly concurrent workloads (get, set)
  • reduces constant overhead (get, set)

Cons:

  • introduces allocations (set and inc / dec)
  • increases GC overhead for high throughput workloads (set and inc / dec)

Relates to #480

@ghost
Copy link
Author

ghost commented Jun 5, 2019

I ran multiple benchmarks with 24 threads and picked the result with the smallest error, but of course I can’t know how it behaves under different workloads.

Benchmark:

# JMH version: 1.21
# VM version: JDK 11.0.2, OpenJDK 64-Bit Server VM, 11.0.2+9
# VM invoker: /Library/Java/JavaVirtualMachines/openjdk-11.0.2.jdk/Contents/Home/bin/java
# VM options: -javaagent:/Applications/IntelliJ IDEA CE.app/Contents/lib/idea_rt.jar=49590:/Applications/IntelliJ IDEA CE.app/Contents/bin -Dfile.encoding=UTF-8
# Warmup: 5 iterations, 10 s each
# Measurement: 4 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 24 threads, will synchronize iterations

Before:

Benchmark                                                 Mode  Cnt     Score     Error  Units
GaugeBenchmark.prometheusSimpleGaugeChildDecBenchmark     avgt    4    31.720 ±   1.850  ns/op
GaugeBenchmark.prometheusSimpleGaugeChildIncBenchmark     avgt    4    33.041 ±   0.608  ns/op
GaugeBenchmark.prometheusSimpleGaugeChildSetBenchmark     avgt    4  1441.987 ±  29.700  ns/op
GaugeBenchmark.prometheusSimpleGaugeDecBenchmark          avgt    4   154.518 ±  17.714  ns/op
GaugeBenchmark.prometheusSimpleGaugeIncBenchmark          avgt    4   155.453 ±   7.384  ns/op
GaugeBenchmark.prometheusSimpleGaugeSetBenchmark          avgt    4  3047.817 ± 168.989  ns/op

After:

Benchmark                                                 Mode  Cnt    Score    Error  Units
GaugeBenchmark.prometheusSimpleGaugeChildDecBenchmark     avgt    4   30.618 ±  0.867  ns/op
GaugeBenchmark.prometheusSimpleGaugeChildIncBenchmark     avgt    4   31.543 ±  5.164  ns/op
GaugeBenchmark.prometheusSimpleGaugeChildSetBenchmark     avgt    4  314.646 ±  4.892  ns/op
GaugeBenchmark.prometheusSimpleGaugeDecBenchmark          avgt    4  150.270 ±  9.382  ns/op
GaugeBenchmark.prometheusSimpleGaugeIncBenchmark          avgt    4  156.412 ±  9.367  ns/op
GaugeBenchmark.prometheusSimpleGaugeSetBenchmark          avgt    4  537.230 ± 67.680  ns/op

@brian-brazil
Copy link
Contributor

What I was thinking about back when adding this is that we could adjust DoubleAdder to handle this more efficiently if it ever became an issue.

@ghost
Copy link
Author

ghost commented Jun 5, 2019

I’m just thinking out loudly:
We could introduce DoubleAdder#set as an atomic alternative to reset and add.
I'm not sure how to update base and cells atomically without locking using busy.

@ghost
Copy link
Author

ghost commented Jun 7, 2019

81283a20:

Benchmark                                                 Mode  Cnt    Score    Error  Units
GaugeBenchmark.codahaleCounterDecBenchmark                avgt    4   35.166 ± 16.052  ns/op
GaugeBenchmark.codahaleCounterIncBenchmark                avgt    4   31.423 ± 10.979  ns/op
GaugeBenchmark.prometheusSimpleGaugeChildDecBenchmark     avgt    4   38.966 ±  3.249  ns/op
GaugeBenchmark.prometheusSimpleGaugeChildIncBenchmark     avgt    4   37.798 ±  9.753  ns/op
GaugeBenchmark.prometheusSimpleGaugeChildSetBenchmark     avgt    4  429.808 ± 92.764  ns/op
GaugeBenchmark.prometheusSimpleGaugeDecBenchmark          avgt    4  182.829 ± 38.544  ns/op
GaugeBenchmark.prometheusSimpleGaugeIncBenchmark          avgt    4  184.254 ± 43.754  ns/op
GaugeBenchmark.prometheusSimpleGaugeSetBenchmark          avgt    4  582.121 ± 86.330  ns/op
GaugeBenchmark.prometheusSimpleGaugeNoLabelsDecBenchmark  avgt    4   37.136 ±  6.599  ns/op
GaugeBenchmark.prometheusSimpleGaugeNoLabelsIncBenchmark  avgt    4   39.167 ±  3.439  ns/op
GaugeBenchmark.prometheusSimpleGaugeNoLabelsSetBenchmark  avgt    4  456.339 ± 61.951  ns/op

Copy link
Contributor

@brian-brazil brian-brazil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this, I don't think it's correct when there's the (rare) mix of inc and set.

Copy link
Contributor

@brian-brazil brian-brazil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about it, it'd be simpler to just set the values on the Cells - but we'd still want to keep the cas on busy around for the right semantics in sum.

if (busy == 0 && casBusy()) {
try {
if (cells == as) { // recheck under lock
// allocate new cells instead of updating existing cells
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about it, it'd be simpler to just set the values on the Cells - but we'd still want to keep the cas on busy around for the right semantics in sum.

I don't think we could ensure correctness in sum with updating existing cells in set, as it's not possible to take a snapshot of cells atomically, so there would be no way to detect the intermediate state of cells caused by concurrent updates of individual cells.

@ghost
Copy link
Author

ghost commented Jun 13, 2019

2874c5ee:

Benchmark                                                 Mode  Cnt    Score    Error  Units
GaugeBenchmark.codahaleCounterDecBenchmark                avgt    4   35.740 ± 20.478  ns/op
GaugeBenchmark.codahaleCounterIncBenchmark                avgt    4   32.936 ±  1.891  ns/op
GaugeBenchmark.prometheusSimpleGaugeChildDecBenchmark     avgt    4   36.128 ±  3.627  ns/op
GaugeBenchmark.prometheusSimpleGaugeChildIncBenchmark     avgt    4   38.462 ±  1.829  ns/op
GaugeBenchmark.prometheusSimpleGaugeChildSetBenchmark     avgt    4  444.700 ± 97.494  ns/op
GaugeBenchmark.prometheusSimpleGaugeDecBenchmark          avgt    4  163.684 ±  4.272  ns/op
GaugeBenchmark.prometheusSimpleGaugeIncBenchmark          avgt    4  161.102 ± 12.008  ns/op
GaugeBenchmark.prometheusSimpleGaugeSetBenchmark          avgt    4  492.110 ± 50.022  ns/op
GaugeBenchmark.prometheusSimpleGaugeNoLabelsDecBenchmark  avgt    4   37.190 ± 12.319  ns/op
GaugeBenchmark.prometheusSimpleGaugeNoLabelsIncBenchmark  avgt    4   35.756 ±  5.226  ns/op
GaugeBenchmark.prometheusSimpleGaugeNoLabelsSetBenchmark  avgt    4  471.402 ± 92.968  ns/op

Rudolf Rakos added 5 commits July 31, 2019 17:41
Signed-off-by: Rudolf Rakos <rrakos@evolutiongaming.com>
Signed-off-by: Rudolf Rakos <rrakos@evolutiongaming.com>
Signed-off-by: Rudolf Rakos <rrakos@evolutiongaming.com>
Signed-off-by: Rudolf Rakos <rrakos@evolutiongaming.com>
Signed-off-by: Rudolf Rakos <rrakos@evolutiongaming.com>
@ghost
Copy link
Author

ghost commented Jul 31, 2019

I rebased the changes on master.
I'll share load testing and profiling results soon™️.

@ghost
Copy link
Author

ghost commented Aug 15, 2019

I couldn't find Gauge in YourKit snapshots and screenshots from July. 🙁
I'll need to profile the changes again, so I'll try to prove that it eliminated blocking.

@ghost
Copy link
Author

ghost commented Sep 5, 2019

I can confirm that there's no more blocking. I've been trying to get profiling results to have something to show that proves that, but because it's so fast now it doesn't show up during sampling using a reasonable sampling rate, that does not affect performance by itself.

Do you have any further concerns about merging this?

@brian-brazil brian-brazil merged commit d7825cf into prometheus:master Sep 5, 2019
@brian-brazil
Copy link
Contributor

Thanks!

@ghost
Copy link
Author

ghost commented Sep 13, 2019

Thank you. 🙂

@ghost ghost deleted the gauge-atomic branch October 16, 2019 13:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant