
Python benchmark for transform logging over time #7039

Merged
3 commits merged from andreas/transform3d-benchmark into main on Aug 5, 2024

Conversation

Wumpf
Member

@Wumpf Wumpf commented Aug 2, 2024

What

Adds a new log benchmark for transforms, Python-only for the moment (will likely add C++ and Rust as part of #6810).

Also, fixes the Python benchmark setup to use release builds (oops!).


Benchmark results on my Windows machine:

------------------------------------------------------------------------------------------------- benchmark: 3 tests -------------------------------------------------------------------------------------------------
Name (time in ms)                                    Min                 Max                Mean             StdDev              Median                IQR            Outliers       OPS            Rounds  Iterations
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_bench_transforms_over_time[10000-1000]       1.0250 (1.0)        9.1189 (1.0)        1.1567 (1.0)       0.3008 (1.0)        1.1212 (1.0)       0.0974 (1.0)         14;36  864.5532 (1.0)         776           1
test_bench_transforms_over_time[10000-100]        7.9997 (7.80)      17.3248 (1.90)       8.8829 (7.68)      1.0412 (3.46)       8.6579 (7.72)      0.4209 (4.32)          3;5  112.5764 (0.13)        101           1
test_bench_transforms_over_time[10000-1]        830.6972 (810.44)   914.6887 (100.31)   860.9382 (744.33)   38.8470 (129.13)   835.1398 (744.83)   62.9516 (646.32)        1;0    1.1615 (0.00)          5           1
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

In short: logging 10k transforms with rotation/scale/translation on an incrementing integer timeline takes:

  • 10 batches of 1000: 1.1ms
  • 100 batches of 100: 8.7ms
  • 10,000 individual calls: 835ms

Runtime scales pretty much linearly with batch count, which implies that each log call has roughly the same overhead, independent of the number of transforms logged per call.

(Note that this benchmark logs to a memory recording.)
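The linearity claim can be sanity-checked with back-of-envelope arithmetic. This is just a sketch using the median timings from the table above, with the per-call overhead estimated from the individual-log row:

```python
# If per-call overhead dominates, total time should be roughly
# (number of log calls) * (per-call overhead), independent of batch size.
per_call_ms = 835.1 / 10_000  # estimated overhead per log call, in ms

# (n_calls, measured median in ms) pairs from the benchmark table above
for n_calls, measured_ms in [(10, 1.12), (100, 8.66), (10_000, 835.1)]:
    predicted_ms = n_calls * per_call_ms
    print(f"{n_calls:>6} calls: predicted {predicted_ms:8.2f} ms, measured {measured_ms:8.2f} ms")
```

The prediction lands within a few tenths of a millisecond of the measured medians for the batched cases, consistent with per-call overhead dominating.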


To confirm this, I took Jeremy's quick benchmark from the ticket and extended it a bit to include batches:

import time

import numpy as np
import rerun as rr

rr.init("rerun_example_transforms", spawn=True)
rr.set_time_sequence("frame", 0)

arrow = rr.Arrows3D(origins=[0, 0, 0], vectors=[0, 1, 0])

rr.log("base/arrow", arrow)

rand_trans = np.random.rand(1000, 3)
rand_quats = np.random.rand(1000, 4)

###########################################################
# Individual transform logging (one log call per frame)
###########################################################

start = time.time()
for i in range(1000):
    rr.log(
        "base/arrow",
        rr.Transform3D(
            translation=rand_trans[i],
            rotation=rr.Quaternion(xyzw=rand_quats[i]),
        ),
    )

elapsed = time.time() - start
print(
    f"Time to log 1000 transforms: {elapsed:.3f}s. ({1000 / elapsed:.3f} transforms / sec)"
)


###########################################################
# Temporal batch transform logging (fails on Rerun 0.17.0)
###########################################################

times = np.arange(1000)

start = time.time()
rr.log_temporal_batch(
    "base/arrow",
    times=[rr.TimeSequenceBatch("frame", times)],
    components=[
        rr.Transform3D.indicator(),
        rr.components.Translation3DBatch(rand_trans),
        rr.components.RotationQuatBatch(rand_quats),
    ],
)
elapsed = time.time() - start

print(
    f"Time to log 1000 transforms with time in a single call: {elapsed:.3f}s. ({1000 / elapsed:.3f} transforms / sec)"
)

Results for this on my Windows machine (Python 3.11.9, AMD 7950X3D), each value the median of 3 runs with the respective viewer version already open:

  • 0.17: Time to log 1000 transforms: 0.296s. (3377.211 transforms / sec)
  • main: Time to log 1000 transforms: 0.077s. (13070.399 transforms / sec)
  • main: Time to log 1000 transforms with time in a single call: 0.003s. (333383.992 transforms / sec)

So according to this simple benchmark, main handles about 3.9 times more transforms per second than 0.17 for individual log calls, and (not entirely comparable, and probably bound by measurement accuracy) a factor of roughly 99 for the temporal batch call.
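As a cross-check, the quoted factors follow directly from the throughput numbers above:

```python
# Throughput figures (transforms / sec) as reported above.
v017_individual = 3377.211
main_individual = 13070.399
main_batch = 333383.992

# Speedup of main over 0.17 for individual log calls, and for the
# single temporal-batch call (the latter is measurement-accuracy bound).
print(f"individual log speedup vs 0.17: {main_individual / v017_individual:.1f}x")
print(f"temporal batch speedup vs 0.17: {main_batch / v017_individual:.1f}x")
```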

I did not check this benchmark in, since it's a bit too one-off; in particular, the batch logging part I added is too unscientific.


Verdict overall:

  • we got a lot better here, but over 800ms for 10k individual transforms still feels embarrassing
  • we're clearly bound by the number of individual Python operations and almost independent of the amount of data processed by each call (that's not an excuse; it means we have to reduce the in-Python overhead in the long run), which means that temporal batch logging at least gives us an escape hatch with more reasonable throughput

Checklist

  • I have read and agree to Contributor Guide and the Code of Conduct
  • I've included a screenshot or gif (if applicable)
  • I have tested the web demo (if applicable):
  • The PR title and labels are set such as to maximize their usefulness for the next release's CHANGELOG
  • If applicable, add a new check to the release checklist!
  • I have noted any breaking changes to the log API in CHANGELOG.md and the migration guide

To run all checks from main, comment on the PR with @rerun-bot full-check.

@Wumpf Wumpf added 🔨 testing testing and benchmarks exclude from changelog PRs with this won't show up in CHANGELOG.md labels Aug 2, 2024

github-actions bot commented Aug 2, 2024

Deployed docs

Commit Link
813c1ad https://landing-g3zwnwo9l-rerun.vercel.app/docs

@@ -282,11 +289,14 @@ py-build-notebook = { cmd = "pip install -e rerun_notebook", depends_on = [
# Dedicated alias for building the python bindings for the `py` environment.
py-build = "pixi run -e py py-build-common"

# Dedicated alias for building the python bindings in release mode for the `py` environment.
py-build-release = "pixi run -e py py-build-common-release"
Member
Not sure if there's anything we can do about it, but as an observation, this is very very similar to:

pixi run py-build --release

which almost works, but still depends on rerun-build instead of rerun-build-release.

Member Author

Yeah, that's the reason I didn't go that way: there doesn't seem to be a way to pass arguments through to dependent commands.

Base automatically changed from andreas/allow-quaternion-arrays to main August 5, 2024 08:55
@Wumpf Wumpf force-pushed the andreas/transform3d-benchmark branch from 39fac06 to 148168f Compare August 5, 2024 08:57
@Wumpf Wumpf merged commit a93faab into main Aug 5, 2024
15 of 19 checks passed
@Wumpf Wumpf deleted the andreas/transform3d-benchmark branch August 5, 2024 08:58
Linked issue (may be closed by this PR): Logging of rr.Transform3D very slow