Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding an gp_lttb to handle lttb on gappy data. #517

Merged
merged 1 commit into from
Sep 16, 2022
Merged

Conversation

WireBaron
Copy link
Contributor

@WireBaron WireBaron commented Sep 3, 2022

This change adds a new gp_lttb aggregate to the toolkit_experimental
schema. This aggregate will identify any large gaps in the data and
will perform a piecemeal lttb on the data between the gaps.

Copy link
Contributor

@epgts epgts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

irreg_lttb seems like plenty good enough a name for experimental.

Some(state) => state,
};
state.series.sort_by_key(|point| point.ts);
let series = Cow::from(&state.series);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why copy-on-write? It looks like we never write, so we never copy, so we could reference state.series directly. I looked at lttb_final_inner and it looks like that doesn't need it either.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, fixed in both places.

num_points: downsampled.len() as u32,
flags: time_vector::FLAG_IS_SORTED,
internal_padding: [0; 3],
points: (&*downsampled).into(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just downsampled.into() ought to work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does, but it does move downsampled now. Luckily I just have to move the following line before this to keep this from being an issue.

@WireBaron WireBaron marked this pull request as ready for review September 15, 2022 04:14
let state = unsafe { state.to_inner() };
let needs_interval = state.is_none();

// Don't love this code, but need to compute gap_val if needed before time is moved
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wow. That's just a silly omission on TimestampTz, which is a simple wrapper around Datum, which is just a usize. We should derive Copy on it. Not right now, just FYI :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see why they were reluctant to define copy on datum, since a datum may actually be a pointer pretending to be a usize. However it might be nice to implement a utility to give us a shallow copy, that we can use for things like timestamptz (which is legitimately a usize).

This change adds a new gp_lttb aggregate to the toolkit_experimental
schema.  This aggregate will identify any large gaps in the data and
will perform a piecemeal lttb on the data between the gaps.
@WireBaron WireBaron changed the title Adding an irreg_lttb to handle lttb on gappy data. Adding an gp_lttb to handle lttb on gappy data. Sep 16, 2022
@WireBaron
Copy link
Contributor Author

bors r+

@bors
Copy link
Contributor

bors bot commented Sep 16, 2022

@bors bors bot merged commit 7fc3749 into main Sep 16, 2022
@bors bors bot deleted the br/irregular_lttb branch September 16, 2022 23:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants