
Allow logging batches of quaternions from numpy arrays #7038

Merged: 2 commits into main from andreas/allow-quaternion-arrays, Aug 5, 2024

Conversation


@Wumpf Wumpf commented Aug 2, 2024

What

We generally want to encourage the explicit quaternion constructor, Quaternion(xyzw=[1,2,3,4]), since the ordering of quaternion elements isn't widely standardized.
However, in many situations a user may simply have a raw array of quaternions that is already in the correct order and wants to log it directly.

Prior to this PR rerun.components.RotationQuatBatch(my_np_array) would have failed.

Therefore, this PR relaxes this restriction, making the serialization code identical to vec4, while keeping the explicit constructor and the more restrictive type checking for single elements.

(This came up during testing/benchmarking of batch logging of transforms)
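To illustrate why the explicit constructor is encouraged: quaternion memory layout differs between libraries (scalar-first wxyz vs scalar-last xyzw). A minimal numpy sketch of the kind of reordering a user would need before logging a batch; the `wxyz_to_xyzw` helper is hypothetical, not part of the rerun API:

```python
import numpy as np

def wxyz_to_xyzw(quats: np.ndarray) -> np.ndarray:
    """Reorder an (N, 4) quaternion array from wxyz to xyzw convention."""
    # Fancy indexing on the last axis moves the scalar part to the end.
    return np.asarray(quats)[..., [1, 2, 3, 0]]

# Two identity quaternions stored scalar-first (wxyz):
wxyz = np.array([[1.0, 0.0, 0.0, 0.0],
                 [1.0, 0.0, 0.0, 0.0]])

# Scalar part is now last, the order this PR's relaxed batch constructor expects.
xyzw = wxyz_to_xyzw(wxyz)
```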

Checklist

  • I have read and agree to the Contributor Guide and the Code of Conduct
  • I've included a screenshot or gif (if applicable)
  • I have tested the web demo (if applicable):
  • The PR title and labels are set such as to maximize their usefulness for the next release's CHANGELOG
  • If applicable, add a new check to the release checklist!
  • I have noted any breaking changes to the log API in CHANGELOG.md and the migration guide

To run all checks from main, comment on the PR with @rerun-bot full-check.

@Wumpf Wumpf merged commit 4f2dd28 into main Aug 5, 2024
7 of 13 checks passed
@Wumpf Wumpf deleted the andreas/allow-quaternion-arrays branch August 5, 2024 08:55
Wumpf added a commit that referenced this pull request Aug 5, 2024
### What

* based on #7038

Adds a new log benchmark for transforms, python only for the moment
(will likely add C++ and Rust as part of
#6810).

Also, fixes the python benchmark setup to use release builds (oops!).

---

Benchmark results on my windows machine:
```
------------------------------------------------------------------------------------------------- benchmark: 3 tests -------------------------------------------------------------------------------------------------
Name (time in ms)                                    Min                 Max                Mean             StdDev              Median                IQR            Outliers       OPS            Rounds  Iterations
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_bench_transforms_over_time[10000-1000]       1.0250 (1.0)        9.1189 (1.0)        1.1567 (1.0)       0.3008 (1.0)        1.1212 (1.0)       0.0974 (1.0)         14;36  864.5532 (1.0)         776           1
test_bench_transforms_over_time[10000-100]        7.9997 (7.80)      17.3248 (1.90)       8.8829 (7.68)      1.0412 (3.46)       8.6579 (7.72)      0.4209 (4.32)          3;5  112.5764 (0.13)        101           1
test_bench_transforms_over_time[10000-1]        830.6972 (810.44)   914.6887 (100.31)   860.9382 (744.33)   38.8470 (129.13)   835.1398 (744.83)   62.9516 (646.32)        1;0    1.1615 (0.00)          5           1
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
```
In short: logging 10k transforms with rotation/scale/transform on an
incrementing integer timescale takes:
* 100x batches of 100:  **8.7ms**
* 10x batches of 1000: **1.1ms**
* individually: **835ms**

Total time scales pretty much linearly with the number of log calls, which implies that each log call has roughly the same overhead, independent of the number of transforms logged.

(note that this benchmark logs to a memory recording)
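The linear-scaling claim can be sanity-checked with back-of-envelope arithmetic on the median timings above (numbers copied from the summary; all three configurations log 10k transforms in total):

```python
# Median total times for logging 10k transforms, from the benchmark above.
timings_ms = {
    1000: 1.1,   # 10 log calls, batches of 1000
    100: 8.7,    # 100 log calls, batches of 100
    1: 835.0,    # 10000 individual log calls
}

total_transforms = 10_000
per_call_ms = {
    batch_size: total_ms / (total_transforms // batch_size)
    for batch_size, total_ms in timings_ms.items()
}
# Every configuration lands at roughly 0.08-0.11 ms per log call,
# i.e. per-call overhead dominates, independent of batch size.
```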

----

* Fixes #4227
   * or rather, confirms that it is now fixed
* cc: @nubertj please do reopen that issue if you still have issues with
transform logging performance on 0.18 (once out) or current [development
builds](https://github.com/rerun-io/rerun/releases/tag/prerelease) (in
case you get to it)

To confirm this I took [Jeremy's quick
benchmark](#4227 (comment))
from the ticket and extended it a bit to include batches:
```python
import time

import numpy as np
import rerun as rr

rr.init("rerun_example_transforms", spawn=True)
rr.set_time_sequence("frame", 0)

arrow = rr.Arrows3D(origins=[0, 0, 0], vectors=[0, 1, 0])

rr.log("base/arrow", arrow)

rand_trans = np.random.rand(1000, 3)
rand_quats = np.random.rand(1000, 4)

start = time.time()
for i in range(1000):
    rr.log(
        "base/arrow",
        rr.Transform3D(
            translation=rand_trans[i],
            rotation=rr.Quaternion(xyzw=rand_quats[i]),
        ),
    )

elapsed = time.time() - start
print(
    f"Time to log 1000 transforms: {elapsed:.3f}s ({1000 / elapsed:.3f} transforms / sec)"
)


###########################################################
# Temporal batch transform logging (fails on Rerun 0.17.0)
###########################################################

times = np.arange(1000)

start = time.time()
rr.log_temporal_batch(
    "base/arrow",
    times=[rr.TimeSequenceBatch("frame", times)],
    components=[
        rr.Transform3D.indicator(),
        rr.components.Translation3DBatch(rand_trans),
        rr.components.RotationQuatBatch(rand_quats),
    ],
)
elapsed = time.time() - start

print(
    f"Time to log 1000 transforms with time in a single call: {elapsed:.3f}s ({1000 / elapsed:.3f} transforms / sec)"
)
```

Results for this on my windows machine (Python 3.11.9, AMD 7950X3D),
each number the median of 3 runs with the respective viewer version already open:

* 0.17: `Time to log 1000 transforms: 0.296s (3377.211 transforms / sec)`
* (very similar to [previously reported
numbers](#4227 (comment))
on Jeremy's linux machine)
* `main`: `Time to log 1000 transforms: 0.077s (13070.399 transforms / sec)`
* `main`: `Time to log 1000 transforms with time in a single call:
0.003s (333383.992 transforms / sec)`

So according to this simple benchmark, individual logging achieves about
3.9 times more transforms per second than 0.17, and the single
temporal-batch call is roughly a factor of 99 faster (not entirely
comparable, and probably bound by measurement accuracy).
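For reference, the speedup factors quoted above follow directly from the measured times:

```python
# Measured times from the runs above, in seconds.
t_017_individual = 0.296   # Rerun 0.17, 1000 individual log calls
t_main_individual = 0.077  # main, 1000 individual log calls
t_main_batch = 0.003       # main, one temporal-batch call

speedup_individual = t_017_individual / t_main_individual  # ~3.8x
speedup_batch = t_017_individual / t_main_batch            # ~99x
```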

I did not check this benchmark in since it's a bit too one-off; in
particular, the batch logging part I added is too unscientific.

---

Verdict overall:
* we got a lot better here, but over 800ms for 10k individual transforms
still feels embarrassing
* we're clearly bound by the number of individual Python operations and
almost independent of the amount of data processed by each call (that's
not an excuse; it means we have to reduce the in-Python overhead in the
long run), which means that temporal batch logging at least gives us an
escape hatch with more reasonable throughput

### Checklist
* [x] I have read and agree to the [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested the web demo (if applicable):
* Using examples from latest `main` build:
[rerun.io/viewer](https://rerun.io/viewer/pr/{{pr.number}}?manifest_url=https://app.rerun.io/version/main/examples_manifest.json)
* Using full set of examples from `nightly` build:
[rerun.io/viewer](https://rerun.io/viewer/pr/{{pr.number}}?manifest_url=https://app.rerun.io/version/nightly/examples_manifest.json)
* [x] The PR title and labels are set such as to maximize their
usefulness for the next release's CHANGELOG
* [x] If applicable, add a new check to the [release
checklist](https://github.com/rerun-io/rerun/blob/main/tests/python/release_checklist)!
* [x] I have noted any breaking changes to the log API in
`CHANGELOG.md` and the migration guide

- [PR Build Summary](https://build.rerun.io/pr/{{pr.number}})
- [Recent benchmark results](https://build.rerun.io/graphs/crates.html)
- [Wasm size tracking](https://build.rerun.io/graphs/sizes.html)

To run all checks from `main`, comment on the PR with `@rerun-bot
full-check`.

---------

Co-authored-by: Jeremy Leibs <jeremy@rerun.io>