-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
enhancement: Use BTreeMap
instead of HashMap
for logs and metrics
#1838
Conversation
Signed-off-by: Alexander Rodin <rodin.alexander@gmail.com>
Signed-off-by: Alexander Rodin <rodin.alexander@gmail.com>
Would it make sense to add tests here that assert the output is sorted in some way? For example, the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall the code looks good.
I'm not entirely clear why adding ValueMap
and ValueArray
are necessary as part of this change, but I feel like I'm missing some context around that. (Edit: I see now why in: #1428)
Having a test for sorted-ness would be useful as one of us may try to get clever one day and optimize things, breaking the sorting.
Signed-off-by: Alexander Rodin <rodin.alexander@gmail.com>
I've added the unit tests which check serialization order for
The first commit of this PR 1cc021b is supposed to be merged as #1836, and then this PR will be rebased on top of master. So it contains 1cc021b just to not be blocked by waiting until #1836 is merged. |
Out of curiosity, has anybody benchmarked how this might affect performance? |
Excellent question! This is a perfect PR for trying out #1416. |
I'm also curious if we should be comparing the |
I'll run some benchmarks but it looks like some of our benches currently don't run so I will figure that out and check against this branch. I should have a decent machine for benching now... |
Overall benches look very positive, my system is somewhat noisy but these are macro benchmarks anyways:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, looks fantastic! Benches look good, just a few questions, otherwise LGTM!
They do look positive, surprisingly so (at least to me). I'm wondering if these benches have an small set of keys that would end up fitting in a single B-Tree node, and so not actually stress the harder cases much. I see |
@bruceg yeah, |
The |
I'm also uncertain why this is marked as a breaking change. What user facing behavior is being changed by this that would break existing uses? I agree this does change our internal API, but I think we have reserved the breaking change indicator for things that cause changes to user visible behavior that they have to compensate for (ie config fields change, etc) |
@a-rodin did you have a chance to look at IndexMap? https://docs.rs/indexmap/1.3.2/indexmap/ It looks also suitable for this. |
While this is a breaking change (we're changing the order output of fields) JSON by it's nature is unordered, and the internal ordering is left up to the implementer. It may be worth noting to users that ordering may change in the future. We can also consider that users may be using text, or in the future protobufs or some other data type. |
There were no ordering guarantees before, so adding them now shouldn't be a breaking change? |
My reasoning was that it is observable in principle, but as we didn't have any guarantees about this it doesn't break anything. |
BTreeMap
instead of HashMap
for logs and metricsBTreeMap
instead of HashMap
for logs and metrics
We discussed it, but the main issue is that |
I ran the benchmarks (or at least I tried, see below) and discovered that either we don't have good benchmarks, or they are very susceptible to external noise. I set up an EC2 instance (m5.2xlarge 8 thread CPU, 32GB RAM) with nothing else running. I noticed that one of the benchmarks regressed by almost 20% (lua_add_fields/lua), so I reran it. Here are some of the results of running it in a loop:
The estimate of our run time varied from 264.29ms to 307.70 ms, which swamps most estimates of improvement or regression. Increasing the measurement time to 180s (more than 10x the default) doesn't help much:
So, what do we know? This change likely doesn't have a massive impact on performance. I don't think we should hold up this needed change to wait for us to figure out what the actual impact is. We need a better benchmarking system to be able to prove anything. Benchmarking is hard. :( (Note that this is the when running 6dc3823 cherry picked on top of the last commit that would run benchmarks. I might have made mistakes handling the merge conflicts wrong, but it shouldn't affect the reliability of the performance numbers. See #1872) |
Just a thought: aren't EC2 instances intrinsically noisy because the underlying hardware is shared between multiple VMs? I think to make really precise benchmarks it is better to run them on an operating system running on bare metal. |
Signed-off-by: Alexander Rodin <rodin.alexander@gmail.com>
Signed-off-by: Alexander Rodin <rodin.alexander@gmail.com>
Signed-off-by: Alexander Rodin <rodin.alexander@gmail.com>
Signed-off-by: Alexander Rodin <rodin.alexander@gmail.com>
Because #1838 (comment) and #1838 (comment) suggest there there were no blocking regressions in performance and also both #1838 (comment) and #1838 (comment) are resolved now, I'm merging this. |
@bruceg I would say some of our benchmarks in my experience can show varaiance, but in general I try to rely on the pipe based ones and the http ones. Those for me have always be super consistent over multiple runs. Also one thing to note |
Closes #1428, closes #1439. Expected to be merged after #1836.
This PR replaces the map type for logs and metrics from
HashMap
toBTreeMap
. It does it for both logs and metrics at once because this way the changes in Protobuf deserialization are the simplest (the default map type just changed fromHashMap
toBTreeMap
for all Protobuf structures inbuild.rs
).I wanted to introduce special type aliases for
BTreeMap<Atom, String>
in log event andBTreeMap<String, String>
in metrics as was proposed in #1662 (comment), but it would still keep multiple mentions ofBTreeMap
over the codebase. Because of that I think such a replacement is better to be done as a separate issue, probably a part of the Tech-debt payment #1 milestone, but not in this PR to keep things scoped.See #1662 (comment) for details.