snowflake: fix timestamps at extreme ranges #3220

Merged
merged 4 commits into from
Feb 28, 2025

Conversation

rockwotj
Collaborator

@rockwotj rockwotj commented Feb 28, 2025

Fix timestamps at the extreme ends of the range being encoded incorrectly due to a bug in int128.Div. Also change Time types in Avro to something more usable.

@@ -145,7 +145,7 @@ func fls128(n Num) int {
 	if n.hi != 0 {
 		return 127 - bits.LeadingZeros64(uint64(n.hi))
 	}
-	return 64 - bits.LeadingZeros64(n.lo)
+	return 63 - bits.LeadingZeros64(n.lo)
Collaborator Author

This was the bug fix, the rest is mostly tests or refactoring/clean up :)

Collaborator

Heh, that reminded me of https://pkg.go.dev/math/rand#Int63 :)
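
For readers following the one-line change, here is a minimal sketch of the off-by-one. The `Num` struct and the names `flsOld` and `flsFixed` are stand-ins invented for illustration (the real type lives in the int128 package and may differ); only the two return expressions are taken from the diff above.

```go
package main

import (
	"fmt"
	"math/bits"
)

// Num is a hypothetical stand-in for the package's 128-bit integer:
// hi holds the upper 64 bits (signed), lo holds the lower 64 bits.
type Num struct {
	hi int64
	lo uint64
}

// flsOld mirrors the pre-fix logic: the lo branch is off by one.
func flsOld(n Num) int {
	if n.hi != 0 {
		return 127 - bits.LeadingZeros64(uint64(n.hi))
	}
	return 64 - bits.LeadingZeros64(n.lo)
}

// flsFixed mirrors the post-fix logic: both branches return the
// zero-based index of the most significant set bit.
func flsFixed(n Num) int {
	if n.hi != 0 {
		return 127 - bits.LeadingZeros64(uint64(n.hi))
	}
	return 63 - bits.LeadingZeros64(n.lo)
}

func main() {
	one := Num{lo: 1}       // only bit 0 set
	top := Num{lo: 1 << 63} // only bit 63 set
	carry := Num{hi: 1}     // only bit 64 set

	fmt.Println(flsOld(one), flsFixed(one))     // 1 0
	fmt.Println(flsOld(top), flsFixed(top))     // 64 63
	fmt.Println(flsOld(carry), flsFixed(carry)) // 64 64
}
```

With the old expression, a value whose top bit is bit 63 and a value whose top bit is bit 64 both report 64, so the two branches disagree about what the index means. With the fix, both branches use the same zero-based convention.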

@rockwotj rockwotj changed the title snowflake: fix timestamps are extreme ranges snowflake: fix timestamps at extreme ranges Feb 28, 2025
Collaborator

@mihaitodor mihaitodor left a comment

LGTM, just a small nit. Feel free to 🐑 🚀

Comment on lines 61 to 79
// Compare returns -1 if a < b, 0 if a == b, and 1 if a > b.
func Compare(a, b Num) int {
	if a.hi < b.hi || (a.hi == b.hi && a.lo < b.lo) {
		return -1
	}
	if a.hi == b.hi && a.lo == b.lo {
		return 0
	}
	return 1
}

// CompareUnsigned returns -1 if |a| < |b|, 0 if a == b, and 1 if |a| > |b|.
func CompareUnsigned(a, b Num) int {
	if uint64(a.hi) < uint64(b.hi) || (a.hi == b.hi && a.lo < b.lo) {
		return -1
	}
	if a.hi == b.hi && a.lo == b.lo {
		return 0
	}
	return 1
}
Collaborator

I wonder if we can leverage https://pkg.go.dev/cmp#Compare for these. Not sure if that's possible

Collaborator Author

Yes we can make this much more readable :) Thanks for this!
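
One way the suggestion could look, as a sketch rather than the code that was actually merged: compare the high words with `cmp.Compare` (available since Go 1.21) and fall back to the low words on a tie. The `Num` struct here is a hypothetical stand-in with the same `hi`/`lo` fields as the snippet under review.

```go
package main

import (
	"cmp"
	"fmt"
)

// Num is a hypothetical stand-in for the package's 128-bit integer.
type Num struct {
	hi int64
	lo uint64
}

// Compare orders a and b as signed 128-bit values:
// the signed high word decides first, then the unsigned low word.
func Compare(a, b Num) int {
	if c := cmp.Compare(a.hi, b.hi); c != 0 {
		return c
	}
	return cmp.Compare(a.lo, b.lo)
}

// CompareUnsigned orders a and b as unsigned 128-bit values by
// reinterpreting the high word as uint64 before comparing.
func CompareUnsigned(a, b Num) int {
	if c := cmp.Compare(uint64(a.hi), uint64(b.hi)); c != 0 {
		return c
	}
	return cmp.Compare(a.lo, b.lo)
}

func main() {
	minusOne := Num{hi: -1, lo: ^uint64(0)} // two's complement -1
	one := Num{lo: 1}

	fmt.Println(Compare(minusOne, one))         // -1
	fmt.Println(CompareUnsigned(minusOne, one)) // 1
	fmt.Println(Compare(one, one))              // 0
}
```

The results match the hand-written versions above for every input, since both compare the high word first and the low word only on a tie.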

This all boiled down to an amazing off-by-one error in calculating the
MSB when the value fits in the low 64 bits of the int128.

Added a bunch of tests around these values at every layer of the stack;
each was really an artifact of my debugging :)

When preserving logical types, preserve time values as timestamps
instead of duration strings; timestamps are much more natural to work
with in bloblang.
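
To illustrate the difference between the two representations, here is a rough sketch for an Avro `time-millis` value (milliseconds after midnight). The helper `timeMillisToTimestamp` is hypothetical, and anchoring the time of day to 1970-01-01 UTC is an assumption made for the example; the connector may pick a different date or zone.

```go
package main

import (
	"fmt"
	"time"
)

// timeMillisToTimestamp is a hypothetical helper: it places a time of day,
// given as milliseconds after midnight, on 1970-01-01 in UTC (an assumption).
func timeMillisToTimestamp(ms int32) time.Time {
	return time.Unix(0, 0).UTC().Add(time.Duration(ms) * time.Millisecond)
}

func main() {
	const ms int32 = 48_600_000 // 13:30:00 as milliseconds after midnight

	// Duration-string form: readable, but awkward to compare or format.
	d := time.Duration(ms) * time.Millisecond
	fmt.Println(d.String()) // 13h30m0s

	// Timestamp form: works with ordinary timestamp formatting and parsing.
	fmt.Println(timeMillisToTimestamp(ms).Format(time.RFC3339)) // 1970-01-01T13:30:00Z
}
```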
@rockwotj rockwotj merged commit 86f3b6d into redpanda-data:main Feb 28, 2025
4 checks passed
@rockwotj rockwotj deleted the snowflake-tz branch February 28, 2025 15:40
2 participants