-
Notifications
You must be signed in to change notification settings - Fork 814
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HdrSummary based on Summary and HdrHistogram #484
Conversation
Please note that this is work in progress and requires careful review and testing. |
I'll be interested to see the benchmarks. I'd not have two Summary variants though, replacing what's there is what'd be considered. |
I ran multiple benchmarks with It looks like this improves Benchmark:
|
simpleclient_hdrhistogram/src/main/java/io/prometheus/client/HdrTimeWindowQuantiles.java
Outdated
Show resolved
Hide resolved
I created a separate |
I'm wondering if there's any kind of integration / system test suite which I could use to ensure that |
The unittest for Summary should cover that. We can always copy in the Hdr code if needed. |
0dbe690a:
|
f21fbf0c:
|
I'll check if we could use https://maven.apache.org/plugins/maven-shade-plugin/ . |
Benchmarks with unrealistically frequent rotations by hardcoding The results were consistent across many runs over the last 2 days. Before - 4e0e752 -
After - b2489083 -
Before - 4e0e752 -
After - b2489083 -
|
580d689e:
580d689e -
580d689e -
|
|
I rebased the changes on |
benchmark/pom.xml
Outdated
<dependency> | ||
<groupId>io.prometheus</groupId> | ||
<artifactId>simpleclient_hdrhistogram</artifactId> | ||
<version>${project.version}</version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[ERROR] Failed to execute goal on project benchmarks: Could not resolve dependencies for project io.prometheus:benchmarks:jar:0.6.1-SNAPSHOT: Could not find artifact io.prometheus:simpleclient_hdrhistogram:jar:0.6.1-SNAPSHOT in sonatype-nexus-snapshots (https://oss.sonatype.org/content/repositories/snapshots)
I ran into the same issue with IntelliJ IDEA locally, it was trying to resolve nonexistent artifacts.
I fixed it by using ${project.version}
instead of 0.6.1-SNAPSHOT
for dependencies between modules.
It looks like the same fix doesn't work for CicleCI, but maybe there's a better way to handle this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CI changed recently, so it's possible it doesn't handle it well. Let's worry about it at merge time.
|
||
private final List<Double> quantiles = new ArrayList<Double>(); | ||
private long highestToLowestValueRatio = 1000; | ||
private int numberOfSignificantValueDigits = 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll follow up regarding defaults in HdrHistogram/HdrHistogram#156.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I set the default initial dynamic range to 1000
and the default number of significant value digits to 2
based on previous suggestions.
We used the following binary compatible in-place upgrade version for testing: |
I found |
YourKit screenshots about blocked theads. Before - 4e0e752: |
YourKit screenshots about CPU time. Before - 4e0e752: |
I'm not familiar with this tool, I think it's saying things are a fair bit better? |
It's YourKit Java Profiler, similar to JProfiler, slightly smarter than VisualVM or JConsole. I collected data for a couple of minutes from both versions under similar workload (hopefully). The I didn't make proper notes, and I'd still like to test our most intensive code path, but I think it improved enough that it doesn't affect our latency and throughput anymore. |
I'd like to get your view on how to proceed with this to get it merged, as the performance looks good now. We probably should wait for this fix to be merged in HdrHistogram HdrHistogram/HdrHistogram#156, so we have a reliable version to depend on. We should decide if it should be done as a drop in replacement or in a separate module. What do you think? |
It looks like at least an order of magnitude improvement, so I think we should go with replacement. |
I can confirm that the performance overhead essentially disappeared allowing us to gather metrics in our hot code path. I'll rebase the changes from #484 (comment) on |
Hello @brian-brazil, |
I was asked to double check if it's acceptable that Edit: I was told that it's a breaking API change that |
I added the |
I'll update this Pull Request to the in-place upgrade version when |
I'm planning to finish this around the end of the year. I'm still waiting for the |
@rrakos-evo it seems HDRHistorgram has released a version with the required fix. Any chance the above great work can be complete? |
I'm sorry for the delay, I'm planning to finish this around the end of next week, unless something urgent pops up. I can confirm that this has been working fine in production without any issues in the last ~3 months. 🙂🐎🚀 |
I might be able to pick this up in 2 weeks... 😞 |
Signed-off-by: Rudolf Rakos <rrakos@evolutiongaming.com>
Signed-off-by: Rudolf Rakos <rrakos@evolutiongaming.com>
Signed-off-by: Rudolf Rakos <rrakos@evolutiongaming.com>
Signed-off-by: Rudolf Rakos <rrakos@evolutiongaming.com>
…of ConcurrentLinkedQueue<ConcurrentDoubleHistogram> Signed-off-by: Rudolf Rakos <rrakos@evolutiongaming.com>
Signed-off-by: Rudolf Rakos <rrakos@evolutiongaming.com>
Signed-off-by: Rudolf Rakos <rrakos@evolutiongaming.com>
Signed-off-by: Rudolf Rakos <rrakos@evolutiongaming.com>
Signed-off-by: Rudolf Rakos <rrakos@evolutiongaming.com>
Signed-off-by: Rudolf Rakos <rrakos@evolutiongaming.com>
Signed-off-by: Rudolf Rakos <rrakos@evolutiongaming.com>
Signed-off-by: Rudolf Rakos <rrakos@evolutiongaming.com>
Signed-off-by: Rudolf Rakos <rrakos@evolutiongaming.com>
I rebased the changes on 0.8.1 / a814b67. I'll update this Pull Request to the in-place upgrade version after a couple of days of testing. |
@brian-brazil Hello, I hope you're doing well :) Please note that this PR brings kinda breaking change in summary behaviour, before it was fine to pass negative measurements, and now it would result in runtime exception, also Histogram collector is not throwing exception on negative measurements, such behaviour might result in pretty unpleasant surprise for users, as most probably they're not handling exceptions from Summary collector (as it's currently is not throwing them). |
@brian-brazil @ghost sorry to be the Shrek Donkey, but are we there yet? |
The PR needs work still. If you want histograms with Prometheus then the Histogram type is recommended, as client-side quantiles are not aggregatable. |
Thanks you Brian, going through the documentation just now. |
I'll soon be making various breaking changes to this client library to switch to the OpenMetrics data model, so if we're going to try to get this in now is a good time so that all the potential breakages could be in one release. |
I'm happy to help with this MR. Do you have something off the top of your head about what has to be fixed? |
Reading the above discussion, this would need to be made a drop-in replacement with no new dependencies. |
Superseded by #605 |
Idea:
Pros:
get
,insert
)get
,insert
)Cons:
rotate
)maxAgeBuckets
,numberOfSignificantValueDigits
)Neutral:
error
is configured globally (numberOfSignificantValueDigits
)See http://hdrhistogram.org and https://github.com/HdrHistogram/HdrHistogram for more info on
HdrHistogram
.Related to #481 , #482 , #483
Resolves #480
cc @t3hnar