
Support decoding multiplexed RRD streams #7091

Merged: 5 commits into main on Aug 8, 2024
Conversation

@teh-cmc (Member) commented on Aug 7, 2024

TL;DR: the following is now possible:

cat docs/snippets/all/archetypes/*_rust.rrd | rerun -

This will of course become more interesting as you build more and more complex CLI pipelines with rerun rrd.

(Also fixed some missing buffered I/O while I was at it.)

Checklist

  • I have read and agree to the Contributor Guide and the Code of Conduct
  • I've included a screenshot or gif (if applicable)
  • I have tested the web demo (if applicable):
  • The PR title and labels are set such as to maximize their usefulness for the next release's CHANGELOG
  • If applicable, add a new check to the release checklist!
  • I have noted any breaking changes to the log API in CHANGELOG.md and the migration guide

To run all checks from main, comment on the PR with @rerun-bot full-check.

@teh-cmc added labels enhancement (New feature or request), include in changelog, and CLI (Related to the Rerun CLI) on Aug 7, 2024
```diff
 pub struct Decoder<R: std::io::Read> {
     version: CrateVersion,
     compression: Compression,
-    read: R,
+    read: Reader<R>,
```
Member:
Do we ever want to support unbuffered reads? Aren't those just strictly slower in almost all cases?

Member Author:

Not all cases -- if you're reading from an array of bytes, as we do in a bunch of places, adding extra buffering is a pure waste of time and space.

```rust
/// This is particularly useful when working with stdio streams.
///
/// If you're not familiar with multiplexed RRD streams, then you probably want to use
/// [`Decoder::new`] instead.
```
Member:

Why do we need both constructors though? They look identical, except one requires a buffered reader.

Isn't it just better if we have one constructor that always handles concatenated streams?

Member Author:

Three reasons:

  1. As mentioned above, buffering is a net loss in some real-world cases that we depend on today.
  2. There's non-negligible overhead in constantly checking for unexpected FileHeaders, which there's really no reason to pay for in most cases.
  3. This is a very specific constructor that relies specifically on std::io::BufReader, as opposed to any type that implements std::io::BufRead.

@teh-cmc force-pushed the cmc/multiplexed_decodeer branch from 9533541 to 951ce6b on August 8, 2024 at 07:49
@teh-cmc teh-cmc merged commit 84f63a0 into main Aug 8, 2024
10 of 19 checks passed
@teh-cmc teh-cmc deleted the cmc/multiplexed_decodeer branch August 8, 2024 07:50
teh-cmc added a commit that referenced this pull request Aug 8, 2024
You can now do this:
```
cat docs/snippets/all/archetypes/*_rust.rrd | rerun rrd print
```

and this:
```
cat docs/snippets/all/archetypes/*_rust.rrd | rerun rrd merge -o /tmp/all_merged.rrd
```

and this:
```
cat docs/snippets/all/archetypes/*_rust.rrd | rerun rrd compact --max-rows 99999999 --max-bytes 999999999 -o /tmp/all_compacted_max.rrd
```

- Part of #7048 
- DNM: requires #7091