-
Notifications
You must be signed in to change notification settings - Fork 380
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
remove vec allocation in CRDS deserialize #5143
base: master
Are you sure you want to change the base?
remove vec allocation in CRDS deserialize #5143
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks mostly good. just wondering if you've checked into the tradeoffs here. thank you!
gossip/src/crds_value.rs
Outdated
let mut buffer = [0u8; PACKET_DATA_SIZE]; | ||
let position = { | ||
let mut cursor = std::io::Cursor::new(buffer.as_mut()); | ||
bincode::serialize_into(&mut cursor, &data).map_err(serde::de::Error::custom)?; | ||
cursor.position() as usize | ||
}; | ||
let hash = solana_sha256_hasher::hashv(&[signature.as_ref(), &buffer[0..position]]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you have any numbers on the trade off between increased stack usage vs. reduced heap allocations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stack allocation of 1KB is essentially free and does not rely on any global locking. Heap allocation is VERY expensive in comparison, see numbers below (keep in mind that is microbenchmark data on 1 thread).
heap 1000 time: [529.70 ns 530.29 ns 530.92 ns]
stack 1000 time: [293.75 ns 295.05 ns 296.31 ns]
heap 100 time: [164.27 ns 164.97 ns 165.82 ns]
stack 100 time: [25.971 ns 26.017 ns 26.073 ns]
let mut buffer: MaybeUninit<[u8; PACKET_DATA_SIZE]> = MaybeUninit::uninit(); | ||
// SAFETY: we are only using this buffer to store serialize results | ||
let buf_ref = unsafe { buffer.assume_init_mut() }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should do this as we haven't actually initialized the buffer. We currently do something like this elsewhere in the codebase, but we shouldn't do it there either and PR's that have addressed that behavior have gotten hung up in review.
I think there are two approaches we can consider here:
- Just do
[0u8; PACKET_DATA_SIZE]
like you previously had OR - Do something like I did in Introduce API to safely initialize Packets #3533 (which never merged 😆 )
- This linked PR also has some conversation about why we shouldn't do
assume_init()
in one of the other places that we do
- This linked PR also has some conversation about why we shouldn't do
Not having to zero the buffer should seemingly have some gains. But, I think removing the alloc is the more significant gain so maybe it makes sense to keep this PR simpler for now
CC @alessandrod
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@steviez I do not see a reason to zero-out this buffer. We are never actually reading any values from it for anything other than hashing. I agree the actual overhead is probably very small, but we are doing this up to 20K times per second. @alessandrod was the one who suggested that initializing it is not useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't want to speak for Alessandro, but here is a comment from him on a very similar item on that same PR I linked: #3533 (comment)
If I understand correctly, I think his suggestion is something like:
- Create
MaybeUninit<[u8; PACKET_DATA_SIZE]>
like you have it now - Serialize data into the buffer
- Given that a
MaybeUninit
is written to with pointers, you can't directly callbincode::serialize_into()
(unless I missed something) which is why I wrote that thin wrapper in my code
- Given that a
- Fill the remainder of the buffer with 0's
- Finally, do
.assume_init()
to mark that we have now fully initialized the buffer
Problem
Summary of Changes
Fixes #
partially addresses #5034