
enhancement: Use BTreeMap instead of HashMap for logs and metrics #1838

Merged · 7 commits into master from btree-map-for-logs-and-metrics · Feb 21, 2020

Conversation

@ghost commented Feb 18, 2020

Closes #1428, closes #1439. Expected to be merged after #1836.

This PR changes the map type used for logs and metrics from HashMap to BTreeMap. It does so for both logs and metrics at once because this keeps the changes to Protobuf deserialization as simple as possible (the default map type is switched from HashMap to BTreeMap for all Protobuf structures in build.rs).
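
For reference, the build.rs change described above is roughly of this shape (a minimal sketch assuming prost-build's Config::btree_map option; the proto paths below are placeholders, not the actual files in the repository):

```rust
// build.rs (sketch)
fn main() {
    let mut config = prost_build::Config::new();
    // Generate BTreeMap instead of the default HashMap for every
    // `map<...>` field; the path "." matches all message types.
    config.btree_map(&["."]);
    config
        .compile_protos(&["proto/event.proto"], &["proto/"])
        .expect("failed to compile protobuf definitions");
}
```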

I wanted to introduce dedicated type aliases for BTreeMap<Atom, String> in log events and BTreeMap<String, String> in metrics, as proposed in #1662 (comment), but that would still leave multiple mentions of BTreeMap across the codebase. Because of that, I think such a replacement is better done as a separate issue, probably as part of the Tech-debt payment #1 milestone, rather than in this PR, to keep things scoped.

See #1662 (comment) for details.
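
For illustration, the aliases mentioned above could look roughly like this (a hypothetical sketch with made-up names; the Atom import is assumed to match the interned string type already used for log keys):

```rust
use std::collections::BTreeMap;

use string_cache::DefaultAtom as Atom; // assumption: the existing Atom key type

// Hypothetical aliases so that BTreeMap is spelled out in one place only.
pub type LogFieldMap = BTreeMap<Atom, String>;
pub type MetricTagMap = BTreeMap<String, String>;
```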

@binarylogic (Contributor)

Would it make sense to add tests here that assert the output is sorted in some way? For example, the console sink should be sorting keys now, correct? Maybe this should be added as a behavior test?

@Hoverbear (Contributor) left a review comment

Overall the code looks good.

I'm not entirely clear on why adding ValueMap and ValueArray is necessary as part of this change, but I feel like I'm missing some context around that. (Edit: I see now why, in #1428.)

Having a test for sorted-ness would be useful as one of us may try to get clever one day and optimize things, breaking the sorting.

@ghost (Author) commented Feb 18, 2020

@binarylogic

Would it make sense to add tests here that assert the output is sorted in some way? For example, the console sink should be sorting keys now, correct? Maybe this should be added as a behavior test?

I've added unit tests which check the serialization order for the console sink and the iteration order for fields in log events. The behavior tests are currently not able to check the field order.
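
For reference, the ordering checks are conceptually along these lines (a standalone sketch using serde_json, not the exact tests added in this PR):

```rust
#[cfg(test)]
mod ordering_tests {
    use std::collections::BTreeMap;

    // Keys inserted out of order should come back sorted, both when
    // iterating over the map and when serializing it to JSON.
    #[test]
    fn keys_are_sorted() {
        let mut fields = BTreeMap::new();
        fields.insert("zebra", "3");
        fields.insert("apple", "1");
        fields.insert("mango", "2");

        let keys: Vec<_> = fields.keys().copied().collect();
        assert_eq!(keys, vec!["apple", "mango", "zebra"]);

        let json = serde_json::to_string(&fields).unwrap();
        assert_eq!(json, r#"{"apple":"1","mango":"2","zebra":"3"}"#);
    }
}
```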

@Hoverbear

I'm not entirely clear on why adding ValueMap and ValueArray is necessary as part of this change, but I feel like I'm missing some context around that. (Edit: I see now why, in #1428.)

The first commit of this PR, 1cc021b, is supposed to be merged as #1836, and then this PR will be rebased on top of master. It contains 1cc021b only so as not to be blocked waiting for #1836 to be merged.

@bruceg (Member) commented Feb 19, 2020

Out of curiosity, has anybody benchmarked how this might affect performance?

@binarylogic (Contributor)

Excellent question! This is a perfect PR for trying out #1416.

@binarylogic (Contributor)

I'm also curious if we should be comparing the /benches results?

@LucioFranco (Contributor) commented Feb 19, 2020

I'll run some benchmarks, but it looks like some of our benches currently don't run, so I will figure that out and then check against this branch. I should have a decent machine for benching now...

@LucioFranco (Contributor) commented Feb 19, 2020

Overall the benches look very positive. My system is somewhat noisy, but these are macro benchmarks anyway:

pipe                    time:   [147.74 ms 152.41 ms 157.02 ms]
                        thrpt:  [60.734 MiB/s 62.571 MiB/s 64.549 MiB/s]
                 change:
                        time:   [-25.235% -21.497% -17.903%] (p = 0.00 < 0.05)
                        thrpt:  [+21.807% +27.384% +33.752%]
                        Performance has improved.

Benchmarking pipe_with_tiny_lines: Warming up for 3.0000 s
pipe_with_tiny_lines    time:   [109.46 ms 114.05 ms 118.78 ms]
                        thrpt:  [822.14 KiB/s 856.27 KiB/s 892.17 KiB/s]
                 change:
                        time:   [-24.639% -20.734% -16.496%] (p = 0.00 < 0.05)
                        thrpt:  [+19.755% +26.158% +32.694%]
                        Performance has improved.

Benchmarking pipe_with_huge_lines: Warming up for 3.0000 s
pipe_with_huge_lines    time:   [717.95 ms 723.49 ms 726.37 ms]
                        thrpt:  [262.59 MiB/s 263.63 MiB/s 265.67 MiB/s]
                 change:
                        time:   [+1.2460% +2.3198% +3.3286%] (p = 0.00 < 0.05)
                        thrpt:  [-3.2214% -2.2672% -1.2306%]
                        Change within noise threshold.

Benchmarking pipe_with_many_writers: Warming up for 3.0000 s
pipe_with_many_writers  time:   [299.94 ms 302.09 ms 304.37 ms]
                        thrpt:  [31.333 MiB/s 31.569 MiB/s 31.795 MiB/s]
                 change:
                        time:   [+1.6253% +4.0722% +7.1013%] (p = 0.00 < 0.05)
                        thrpt:  [-6.6305% -3.9129% -1.5993%]
                        Change within noise threshold.
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild

Benchmarking interconnected: Warming up for 3.0000 s
interconnected          time:   [237.65 ms 240.24 ms 244.07 ms]
                        thrpt:  [78.146 MiB/s 79.395 MiB/s 80.257 MiB/s]
                 change:
                        time:   [+6.3640% +8.4617% +10.522%] (p = 0.00 < 0.05)
                        thrpt:  [-9.5202% -7.8016% -5.9832%]
                        Performance has regressed.

Benchmarking transforms: Warming up for 3.0000 s
transforms              time:   [310.74 ms 316.92 ms 323.36 ms]
                        thrpt:  [32.442 MiB/s 33.101 MiB/s 33.759 MiB/s]
                 change:
                        time:   [-21.314% -19.847% -18.331%] (p = 0.00 < 0.05)
                        thrpt:  [+22.446% +24.761% +27.087%]
                        Performance has improved.

@LucioFranco (Contributor) left a review comment

Overall, looks fantastic! Benches look good, just a few questions, otherwise LGTM!

@bruceg (Member) commented Feb 20, 2020

Overall the benches look very positive. My system is somewhat noisy, but these are macro benchmarks anyway:

They do look positive, surprisingly so (at least to me). I'm wondering if these benches have a small set of keys that would end up fitting in a single B-tree node and so don't actually stress the harder cases much. I see the transforms bench at least adds a couple more fields. The regex benchmark parses a lot of fields out of the message, but that's just as likely to stress the regex crate as our code. Do you have numbers for that one too? If not, I'll check myself.

@LucioFranco (Contributor)

@bruceg Yeah, B for Rust's BTreeMap is 6, so I think pushing events to contain more fields than that, enough to force a node split, could be good.
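
A rough sketch of the kind of event map that would exceed a single node (assuming the standard library's B = 6, i.e. at most 2·B − 1 = 11 entries per node, which is an internal detail and may change):

```rust
use std::collections::BTreeMap;

// With at most 11 entries per node, 16 fields guarantee at least one
// node split, which the current benches may not be exercising.
fn many_field_event() -> BTreeMap<String, String> {
    let mut fields = BTreeMap::new();
    for i in 0..16 {
        fields.insert(format!("field_{:02}", i), format!("value_{}", i));
    }
    fields
}

fn main() {
    let event = many_field_event();
    assert_eq!(event.len(), 16);
    // Iteration is in lexicographic key order regardless of insertion order.
    assert_eq!(event.keys().next().map(String::as_str), Some("field_00"));
}
```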

@bruceg (Member) commented Feb 20, 2020

The regex benchmark adds 10 fields, so that's a start. I guess many simple uses of Vector will involve just a few fields. I was just thinking of sources like journald, which can have 25-30 fields per record.

@bruceg (Member) commented Feb 20, 2020

I'm also uncertain why this is marked as a breaking change. What user-facing behavior is being changed here that would break existing uses? I agree this changes our internal API, but I think we have reserved the breaking-change indicator for things that change user-visible behavior in ways users have to compensate for (i.e. config fields changing, etc.).

@Hoverbear (Contributor)

@a-rodin did you have a chance to look at IndexMap? https://docs.rs/indexmap/1.3.2/indexmap/ It also looks suitable for this.

@Hoverbear (Contributor) commented Feb 20, 2020

While this is a breaking change (we're changing the output order of fields), JSON is by its nature unordered, and the internal ordering is left up to the implementer. It may be worth noting to users that the ordering may change in the future.

We can also consider that users may be using text or, in the future, protobufs or some other data type.

@LucioFranco (Contributor)

There were no ordering guarantees before, so adding them now shouldn't be a breaking change?

@ghost (Author) commented Feb 20, 2020

I'm also uncertain why this is marked as a breaking change. What user-facing behavior is being changed here that would break existing uses? I agree this changes our internal API, but I think we have reserved the breaking-change indicator for things that change user-visible behavior in ways users have to compensate for (i.e. config fields changing, etc.).

There were no ordering guarantees before, so adding them now shouldn't be a breaking change?

My reasoning was that it is observable in principle, but since we didn't have any guarantees about it, it doesn't break anything.

@ghost changed the title from "enhancement!: Use BTreeMap instead of HashMap for logs and metrics" to "enhancement: Use BTreeMap instead of HashMap for logs and metrics" on Feb 20, 2020.
@ghost (Author) commented Feb 20, 2020

@Hoverbear

@a-rodin did you have a chance to look at IndexMap? https://docs.rs/indexmap/1.3.2/indexmap/ It also looks suitable for this.

We discussed it, but the main issue is that prost doesn't support IndexMap. See #1428 (comment).

@bruceg (Member) commented Feb 20, 2020

I ran the benchmarks (or at least I tried, see below) and discovered that either we don't have good benchmarks or they are very susceptible to external noise. I set up an EC2 instance (m5.2xlarge, 8-thread CPU, 32 GB RAM) with nothing else running. I noticed that one of the benchmarks regressed by almost 20% (lua_add_fields/lua), so I reran it. Here are some of the results of running it in a loop:

lua_add_fields/lua      time:   [275.77 ms 283.80 ms 292.01 ms]
lua_add_fields/lua      time:   [266.07 ms 274.29 ms 282.18 ms]
lua_add_fields/lua      time:   [275.07 ms 285.47 ms 292.59 ms]
lua_add_fields/lua      time:   [300.52 ms 307.70 ms 311.71 ms]
lua_add_fields/lua      time:   [258.50 ms 264.29 ms 269.05 ms]
lua_add_fields/lua      time:   [290.66 ms 296.30 ms 304.42 ms]
lua_add_fields/lua      time:   [260.00 ms 268.36 ms 279.30 ms]

The estimate of our run time varied from 264.29 ms to 307.70 ms, which swamps most estimates of improvement or regression. Increasing the measurement time to 180 s (more than 10x the default) doesn't help much:

lua_add_fields/lua      time:   [273.25 ms 274.81 ms 276.60 ms]
lua_add_fields/lua      time:   [272.05 ms 274.54 ms 276.68 ms]
lua_add_fields/lua      time:   [283.73 ms 285.96 ms 289.96 ms]
lua_add_fields/lua      time:   [281.14 ms 284.54 ms 287.78 ms]
lua_add_fields/lua      time:   [286.95 ms 289.36 ms 291.80 ms]

So, what do we know? This change likely doesn't have a massive impact on performance. I don't think we should hold up this needed change to wait for us to figure out what the actual impact is. We need a better benchmarking system to be able to prove anything. Benchmarking is hard. :(

(Note that this is when running 6dc3823 cherry-picked on top of the last commit that would run benchmarks. I might have handled the merge conflicts wrong, but that shouldn't affect the reliability of the performance numbers. See #1872.)
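
For anyone reproducing this, the longer measurement window can be set through Criterion's group configuration; here is a minimal sketch (the benchmark name and body are placeholders, not the actual lua_add_fields benchmark):

```rust
use std::time::Duration;

use criterion::{criterion_group, criterion_main, Criterion};

// Placeholder benchmark body standing in for lua_add_fields/lua.
fn bench_lua_add_fields(c: &mut Criterion) {
    c.bench_function("lua_add_fields/lua", |b| {
        b.iter(|| {
            // ... run the transform under test ...
        })
    });
}

criterion_group! {
    name = benches;
    // Raise the measurement window from the default 5 s to 180 s.
    config = Criterion::default().measurement_time(Duration::from_secs(180));
    targets = bench_lua_add_fields
}
criterion_main!(benches);
```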

@ghost mentioned this pull request on Feb 20, 2020.
@ghost (Author) commented Feb 21, 2020

I set up an EC2 instance (m5.2xlarge, 8-thread CPU, 32 GB RAM) with nothing else running.

Just a thought: aren't EC2 instances intrinsically noisy because the underlying hardware is shared between multiple VMs? I think that to make really precise benchmarks it is better to run them on bare metal.

@ghost (Author) commented Feb 21, 2020

Because #1838 (comment) and #1838 (comment) suggest that there were no blocking performance regressions, and both #1838 (comment) and #1838 (comment) are now resolved, I'm merging this.

@ghost merged commit 17340e2 into master on Feb 21, 2020.
@LucioFranco (Contributor)

@bruceg I would say that, in my experience, some of our benchmarks can show variance, but in general I try to rely on the pipe-based ones and the HTTP ones. Those have always been super consistent for me over multiple runs.

Also, one thing to note: m5s, from what I understand, are not the right machines, since their CPUs will be somewhat noisy. c5s do a much better job of not being noisy, and that's generally what we've used in the past to run benchmarks.

@bruceg deleted the btree-map-for-logs-and-metrics branch on February 21, 2020 at 18:08.
Successfully merging this pull request may close these issues:
- Consider ordered data structure for metric tags
- Sorted json keys