fix: sponge pad panics on input #46
Conversation
Thanks @vlopes11! I left a couple of nits and a question inline
src/hash/rpo/tests.rs (Outdated)

```rust
#[test]
fn sponge_bytes_with_remainder_length() {
    Rpo256::hash(&vec![0_u8; 111]);
}
```
Is this also just testing that it won't panic? Can you add a comment that explains what this test is for?
Indeed, the number 111 is quite "magic" here. Comment added; I also changed the number to a prime so it is guaranteed not to be divisible by the chunk size.
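The divisibility point can be checked with plain integer arithmetic. A minimal standalone sketch (the chunk-size constant is assumed here, not taken from the crate):

```rust
fn main() {
    // hypothetical chunk size used when packing bytes into field elements
    const BINARY_CHUNK_SIZE: usize = 7;

    // a prime input length is never a multiple of any chunk size > 1,
    // so hashing it always exercises the "remainder chunk" code path
    let len = 113_usize;
    assert_ne!(len % BINARY_CHUNK_SIZE, 0);

    // ceil-division gives the number of field elements produced
    let num_elements = (len + BINARY_CHUNK_SIZE - 1) / BINARY_CHUNK_SIZE;
    assert_eq!(num_elements, 17);
}
```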
src/hash/rpo/mod.rs (Outdated)

```diff
@@ -145,6 +138,8 @@ impl Hasher for Rpo256 {
     // we don't need to apply any extra padding because we injected total number of elements
     // in the input list into the capacity portion of the state during initialization.
     if i > 0 {
+        state[RATE_RANGE.start + i..RATE_RANGE.start + RATE_WIDTH - 1].fill(ZERO);
```
Why can't we use `RATE_RANGE.start + i..RATE_RANGE.end` here?
True, fixed!
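The off-by-one being discussed is easy to see on a toy state with the same layout (capacity followed by rate; all constants here are assumed for illustration): `..RATE_RANGE.end` clears every remaining rate slot, while `..RATE_RANGE.start + RATE_WIDTH - 1` would skip the last one.

```rust
fn main() {
    // toy state layout mirroring the discussion: capacity slots, then rate slots
    const CAPACITY_WIDTH: usize = 4;
    const RATE_WIDTH: usize = 8;
    const RATE_START: usize = CAPACITY_WIDTH;
    const RATE_END: usize = RATE_START + RATE_WIDTH; // plays the role of RATE_RANGE.end

    let mut state = [7_u64; CAPACITY_WIDTH + RATE_WIDTH];
    let i = 3; // pretend 3 elements were absorbed into the rate in this round

    // zero out every rate slot that was not overwritten in this round;
    // ending the range at RATE_START + RATE_WIDTH - 1 would leave the
    // final rate slot holding stale data
    state[RATE_START + i..RATE_END].fill(0);

    assert!(state[RATE_START + i..RATE_END].iter().all(|&x| x == 0));
    assert_eq!(state[RATE_END - 1], 0); // the slot the off-by-one would miss
}
```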
src/hash/rpo/mod.rs (Outdated)

```diff
-let num_elements = if bytes.len() % BINARY_CHUNK_SIZE == 0 {
-    bytes.len() / BINARY_CHUNK_SIZE
-} else {
-    bytes.len() / BINARY_CHUNK_SIZE + 1
-};
+let num_elements = bytes.len() / BINARY_CHUNK_SIZE;
```
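For concreteness, the two formulas in the diff above only disagree when the byte length leaves a remainder chunk; a standalone sketch (chunk size assumed):

```rust
fn main() {
    const BINARY_CHUNK_SIZE: usize = 7;

    // the branching ceil-division from the code being replaced
    let ceil = |len: usize| {
        if len % BINARY_CHUNK_SIZE == 0 {
            len / BINARY_CHUNK_SIZE
        } else {
            len / BINARY_CHUNK_SIZE + 1
        }
    };

    // the two formulas agree on exact multiples of the chunk size...
    assert_eq!(ceil(14), 14 / BINARY_CHUNK_SIZE);
    // ...but differ by one whenever there is a remainder chunk
    assert_eq!(ceil(15), 3);
    assert_eq!(15 / BINARY_CHUNK_SIZE, 2);
}
```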
I'm confused by this and the below comment. I thought the first element of the capacity range is supposed to be either 1 or 0, not the number of elements to be hashed anymore. @Al-Kindi-0?
We have indeed a couple of unanswered questions there. I was thinking of moving the PR to draft until we have these answered.
We indeed set the first element as the count
Lines 109 to 110 in 6de7730:

```rust
let mut state = [ZERO; STATE_WIDTH];
state[CAPACITY_RANGE.start] = Felt::new(num_elements as u64);
```
But during the first iteration, we override it with the input, as `i` is set to 0:
Lines 116 to 136 in 6de7730:

```rust
let mut i = 0;
let mut buf = [0_u8; 8];
for chunk in bytes.chunks(BINARY_CHUNK_SIZE) {
    if i < num_elements - 1 {
        buf[..BINARY_CHUNK_SIZE].copy_from_slice(chunk);
    } else {
        // if we are dealing with the last chunk, it may be smaller than BINARY_CHUNK_SIZE
        // bytes long, so we need to handle it slightly differently. We also append a byte
        // with value 1 to the end of the string; this pads the string in such a way that
        // adding trailing zeros results in different hash
        let chunk_len = chunk.len();
        buf = [0_u8; 8];
        buf[..chunk_len].copy_from_slice(chunk);
        buf[chunk_len] = 1;
    }

    // convert the bytes into a field element and absorb it into the rate portion of the
    // state; if the rate is filled up, apply the Rescue permutation and start absorbing
    // again from zero index.
    state[RATE_RANGE.start + i] = Felt::new(u64::from_le_bytes(buf));
    i += 1;
```
There is a correct statement below that we don't need to add an extra element of `[1, 0, 0, 0, 0, 0, 0]`, because the number of elements in the initial input will indeed differentiate cases with and without remainder chunks.
Lines 143 to 149 in 6de7730:

```rust
// if we absorbed some elements but didn't apply a permutation to them (would happen when
// the number of elements is not a multiple of RATE_WIDTH), apply the RPO permutation.
// we don't need to apply any extra padding because we injected total number of elements
// in the input list into the capacity portion of the state during initialization.
if i > 0 {
    Self::apply_permutation(&mut state);
}
```
But, if we change it to be either `0` or `1`, then we have to apply an extra padding element for such cases.
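The last-chunk handling quoted above can be reproduced with plain `u64` packing to see the padding byte in action. A standalone sketch with no field arithmetic (chunk size and input assumed for illustration):

```rust
fn main() {
    const BINARY_CHUNK_SIZE: usize = 7;
    let bytes = [0xAA_u8; 10]; // 10 = 7 + 3, so the last chunk is short

    let num_chunks = (bytes.len() + BINARY_CHUNK_SIZE - 1) / BINARY_CHUNK_SIZE;
    let mut packed = Vec::new();
    for (i, chunk) in bytes.chunks(BINARY_CHUNK_SIZE).enumerate() {
        let mut buf = [0_u8; 8];
        buf[..chunk.len()].copy_from_slice(chunk);
        if i == num_chunks - 1 {
            // append a 1 byte after the last chunk so an input with trailing
            // zeros cannot collide with a shorter input
            buf[chunk.len()] = 1;
        }
        packed.push(u64::from_le_bytes(buf));
    }

    assert_eq!(packed.len(), 2);
    // last element: 3 data bytes, then the 0x01 padding byte
    assert_eq!(
        packed[1],
        u64::from_le_bytes([0xAA, 0xAA, 0xAA, 1, 0, 0, 0, 0])
    );
}
```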
Another question:
If we apply a Montgomery reduction here (as opposed to `Felt::from_mont`, which will just take the raw number)

Lines 132 to 135 in 6de7730:

```rust
// convert the bytes into a field element and absorb it into the rate portion of the
// state; if the rate is filled up, apply the Rescue permutation and start absorbing
// again from zero index.
state[RATE_RANGE.start + i] = Felt::new(u64::from_le_bytes(buf));
```

then why do we limit the chunk size to 7 bytes, increasing the permutation count?

Lines 40 to 41 in 6de7730:

```rust
/// The number of byte chunks defining a field element when hashing a sequence of bytes
const BINARY_CHUNK_SIZE: usize = 7;
```
I think we should either apply the mont reduce + take 8 bytes, or take 7 and use raw elements, saving a bit of overhead.
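The overhead difference is easy to quantify with a standalone element count (rate width and input length assumed; this counts permutations, it is not a benchmark): 8-byte chunks produce fewer field elements and hence fewer rate fills.

```rust
fn main() {
    const RATE_WIDTH: usize = 8; // elements absorbed per permutation
    let ceil_div = |a: usize, b: usize| (a + b - 1) / b;

    let input_len = 1_000_usize; // bytes
    let elems_7 = ceil_div(input_len, 7); // 7-byte chunks, raw elements
    let elems_8 = ceil_div(input_len, 8); // 8-byte chunks + Montgomery reduction

    let perms_7 = ceil_div(elems_7, RATE_WIDTH);
    let perms_8 = ceil_div(elems_8, RATE_WIDTH);

    assert_eq!(elems_7, 143);
    assert_eq!(elems_8, 125);
    assert!(perms_8 <= perms_7); // 16 vs 18 permutations for this input
}
```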
Thank you @vlopes11 and @grjte for raising this issue.
Indeed, for `hash(&[u8])` the padding used is not the same as in the RPO paper but was left as-is from the previous implementation. I think this is a good time to revisit it.

@vlopes11: Regarding your first point, I think that during the first iteration it is `state[RATE_RANGE.start + 0]` that is written to, as opposed to `state[CAPACITY_RANGE.start]`. Regarding your second point, I think you are right: we should be able to apply `mont_reduce(u64)` to 8-byte chunks or `from_mont(u64)` to 7-byte chunks.
@Al-Kindi-0 thanks for the clarification. As for the second item, I think it's best to go for 8 bytes with Montgomery reduction, as the reduction itself should be far cheaper than a permutation - and safer. wdyt?

Regarding the padding, can you link the latest paper we should use as reference? I'll reproduce in the code the definition we have there.
> @Al-Kindi-0 thanks for the clarification. As for the second item, I think it's best to go for 8 bytes with Montgomery reduction, as the reduction itself should be far cheaper than a permutation - and safer. wdyt?

I am also leaning towards this option, but maybe we should benchmark the two approaches to get a better idea.

> Regarding the padding, can you link the latest paper we should use as reference? I'll reproduce in the code the definition we have there.
@Al-Kindi-0 it's clear what needs to be done. As pointed out by @grjte, the capacity element will be either one or zero.

The last question I have is about one of the rules in the paper: "2.5 The zero-length input is not allowed". Why is that? It's not immediately obvious, because if we just permute zeroes we will also have a valid output (I think). And, considering sponge hashes are usually infallible, I guess we have to come to an output for this case.
I think this was mainly done in order to simplify (thinking about) the new padding rule. Hashing only zeros is still allowed, but the length then is not zero.
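A small sketch of why length injection keeps all-zero inputs distinct: with the input length written into the capacity portion (a toy state; widths and slot assignment assumed for illustration), an empty input and any run of zero bytes start from different states before the first permutation.

```rust
fn main() {
    const STATE_WIDTH: usize = 12;
    const CAPACITY_START: usize = 0;

    // toy initialization: inject the input length into the first capacity slot
    let init = |input: &[u8]| {
        let mut state = [0_u64; STATE_WIDTH];
        state[CAPACITY_START] = input.len() as u64;
        state
    };

    // the injected lengths already separate the cases, so an empty input,
    // a single zero byte, and longer zero runs all diverge immediately
    assert_ne!(init(&[]), init(&[0]));
    assert_ne!(init(&[0; 8]), init(&[0; 16]));
}
```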
Force-pushed from bfc1c18 to 82e2d5d
Force-pushed from 82e2d5d to 7dba1c2
A couple of comments on this:
Force-pushed from 7dba1c2 to 91ba526
Force-pushed from 91ba526 to a74d091
Force-pushed from a74d091 to d912fec
Looks good! Thank you! I left a couple small comments inline.
Force-pushed from d912fec to fd08b2a
Force-pushed from fd08b2a to 2eb7dba
Force-pushed from 2eb7dba to 1e14270
Force-pushed from 1e14270 to b70f8d5
Force-pushed from b70f8d5 to ed36ebc
All looks good! Thank you!
Looks good, thank you!
closes #44