fix: sponge pad panics on input #46

Merged 1 commit into next from vlopes11-44-fix-rpo256-sponge-pad on Feb 9, 2023

Conversation

vlopes11 (Contributor)

closes #44

@grjte grjte requested a review from Al-Kindi-0 January 26, 2023 10:26
@grjte left a comment (Contributor)

Thanks @vlopes11! I left a couple of nits and a question inline

Comment on lines 205 to 226
#[test]
fn sponge_bytes_with_remainder_length() {
Rpo256::hash(&vec![0_u8; 111]);
}
Contributor

Is this also just testing that it won't panic? Can you add a comment that explains what this test is for?

Contributor Author

Indeed, the number 111 is quite "magic" here. Comment added; I also changed the number to a prime so we are guaranteed it won't be divisible by the chunk size.
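
For illustration, a minimal sketch of what the commented test could look like; the specific prime (113) and the comment wording are assumptions for this sketch, not necessarily what the PR ends up using:

#[test]
fn sponge_bytes_with_remainder_length() {
    // 113 is prime, so the input length can never be a multiple of the 7-byte
    // chunk size; this exercises the remainder path that previously panicked.
    Rpo256::hash(&[0_u8; 113]);
}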

@@ -145,6 +138,8 @@ impl Hasher for Rpo256 {
// we don't need to apply any extra padding because we injected total number of elements
// in the input list into the capacity portion of the state during initialization.
if i > 0 {
state[RATE_RANGE.start + i..RATE_RANGE.start + RATE_WIDTH - 1].fill(ZERO);
Contributor

Why can't we use RATE_RANGE.start + i..RATE_RANGE.end here?

Contributor Author

True, fixed!
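
For context, a small standalone sketch of the off-by-one the question points at; the concrete values RATE_RANGE = 4..12 and RATE_WIDTH = 8 are assumptions about the crate's constants:

fn main() {
    // assumed values: a 12-element state whose last 8 elements form the rate
    let rate_range = 4usize..12;
    let rate_width = 8usize;
    let i = 3usize; // some partially filled rate position

    // original upper bound: stops one element short of the end of the rate
    let original = rate_range.start + i..rate_range.start + rate_width - 1;
    // suggested upper bound: covers the remaining rate elements to the end
    let suggested = rate_range.start + i..rate_range.end;

    assert_eq!(original, 7..11); // the last rate element (index 11) is skipped
    assert_eq!(suggested, 7..12); // the full tail of the rate gets cleared
}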

let num_elements = if bytes.len() % BINARY_CHUNK_SIZE == 0 {
bytes.len() / BINARY_CHUNK_SIZE
} else {
bytes.len() / BINARY_CHUNK_SIZE + 1
};
let num_elements = bytes.len() / BINARY_CHUNK_SIZE;
Contributor

I'm confused by this and the below comment. I thought the first element of the capacity range is supposed to be either 1 or 0, not the number of elements to be hashed anymore. @Al-Kindi-0?

Contributor Author

We do indeed have a couple of unanswered questions here. I was thinking of moving the PR to draft until we have them answered.

We do indeed set the first element of the capacity to the element count:

crypto/src/hash/rpo/mod.rs

Lines 109 to 110 in 6de7730

let mut state = [ZERO; STATE_WIDTH];
state[CAPACITY_RANGE.start] = Felt::new(num_elements as u64);

But during the first iteration we override it with the input, since i is set to 0:

crypto/src/hash/rpo/mod.rs

Lines 116 to 136 in 6de7730

let mut i = 0;
let mut buf = [0_u8; 8];
for chunk in bytes.chunks(BINARY_CHUNK_SIZE) {
if i < num_elements - 1 {
buf[..BINARY_CHUNK_SIZE].copy_from_slice(chunk);
} else {
// if we are dealing with the last chunk, it may be smaller than BINARY_CHUNK_SIZE
// bytes long, so we need to handle it slightly differently. We also append a byte
// with value 1 to the end of the string; this pads the string in such a way that
// adding trailing zeros results in different hash
let chunk_len = chunk.len();
buf = [0_u8; 8];
buf[..chunk_len].copy_from_slice(chunk);
buf[chunk_len] = 1;
}
// convert the bytes into a field element and absorb it into the rate portion of the
// state; if the rate is filled up, apply the Rescue permutation and start absorbing
// again from zero index.
state[RATE_RANGE.start + i] = Felt::new(u64::from_le_bytes(buf));
i += 1;

The comment below correctly states that we don't need to add an extra [1, 0, 0, 0, 0, 0, 0] padding element, because the total number of elements injected into the capacity during initialization already differentiates inputs with and without remainder chunks.

crypto/src/hash/rpo/mod.rs

Lines 143 to 149 in 6de7730

// if we absorbed some elements but didn't apply a permutation to them (would happen when
// the number of elements is not a multiple of RATE_WIDTH), apply the RPO permutation.
// we don't need to apply any extra padding because we injected total number of elements
// in the input list into the capacity portion of the state during initialization.
if i > 0 {
Self::apply_permutation(&mut state);
}

But if we change it to be either 0 or 1, then we have to apply an extra padding element in such cases.
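
For concreteness, a rough sketch of the alternative being discussed (a 0/1 flag in the first capacity element plus an explicit padding element); RATE_WIDTH = 8 is assumed, and plain u64 values stand in for Felt to keep the sketch self-contained:

// assumption for the sketch: an 8-element rate
const RATE_WIDTH: usize = 8;

/// Returns the capacity flag and the (possibly padded) element list under the
/// "0 or 1 in the capacity" rule sketched above.
fn pad_elements(mut elements: Vec<u64>) -> (u64, Vec<u64>) {
    if elements.len() % RATE_WIDTH == 0 {
        // rate-aligned input: capacity flag stays 0, no extra element
        (0, elements)
    } else {
        // unaligned input: set the flag to 1 and append a single 1 element so
        // that inputs differing only by trailing zeros hash differently
        elements.push(1);
        (1, elements)
    }
}

fn main() {
    let (flag, padded) = pad_elements(vec![7; 9]);
    assert_eq!(flag, 1);
    assert_eq!(padded.len(), 10); // the extra 1 element was appended
    assert_eq!(pad_elements(vec![7; 8]).0, 0); // rate-aligned: no flag
}

This is essentially the rule that bobbinth spells out further down in the thread.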

Contributor Author (@vlopes11, Jan 26, 2023)

Another question:

If we apply a Montgomery reduction here (as opposed to Felt::from_mont, which just takes the raw number)

crypto/src/hash/rpo/mod.rs

Lines 132 to 135 in 6de7730

// convert the bytes into a field element and absorb it into the rate portion of the
// state; if the rate is filled up, apply the Rescue permutation and start absorbing
// again from zero index.
state[RATE_RANGE.start + i] = Felt::new(u64::from_le_bytes(buf));

Why do we limit the chunk size to 7 bytes, increasing the permutation count?

/// The number of byte chunks defining a field element when hashing a sequence of bytes
const BINARY_CHUNK_SIZE: usize = 7;

I think we should either apply the Montgomery reduction and take 8 bytes, or take 7 and use raw elements, saving a bit of overhead.
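
As a back-of-the-envelope comparison of the two chunk sizes, a small helper (purely illustrative; an 8-element rate is assumed) shows how the permutation count differs:

/// Rough permutation count for absorbing `num_bytes` bytes when the input is
/// split into `chunk_size`-byte chunks and the rate holds `rate_width`
/// elements (ceiling division at both levels; padding details are ignored).
fn permutations(num_bytes: usize, chunk_size: usize, rate_width: usize) -> usize {
    let elements = (num_bytes + chunk_size - 1) / chunk_size;
    (elements + rate_width - 1) / rate_width
}

fn main() {
    // for a 1 KiB input: 7-byte chunks -> 147 elements -> 19 permutations,
    // while 8-byte chunks -> 128 elements -> 16 permutations
    assert_eq!(permutations(1024, 7, 8), 19);
    assert_eq!(permutations(1024, 8, 8), 16);
}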

Collaborator

Thank you @vlopes11 and @grjte for raising this issue.
Indeed, for hash(&[u8]) the padding used is not the same as in the RPO paper but was left as is from the previous implementation. I think this is a good time to revisit it.
@vlopes11: Regarding your first point, I think that during the first iteration, it is state[RATE_RANGE.start + 0] that is written to as opposed to state[CAPACITY_RANGE.start]. Regarding your second point, I think you are right, we should be able to apply mont_reduce(u64) to 8 byte-chunks or from_mont(u64) to 7 byte-chunks.

Contributor Author (@vlopes11, Jan 26, 2023)

@Al-Kindi-0 thanks for the clarification. As for the second item, I think it's best to go for 8 bytes with Montgomery reduction, as the reduction itself should be far cheaper than a permutation, and safer. wdyt?

Regarding the padding, can you link the latest paper we should use as reference? I'll reproduce in the code the definition we have there.

Collaborator

> @Al-Kindi-0 thanks for the clarification. As for the second item, I think it's best to go for 8 bytes with Montgomery reduction, as the reduction itself should be far cheaper than a permutation, and safer. wdyt?

I am also leaning towards this option, but maybe we should benchmark the two approaches to get a better idea.

> Regarding the padding, can you link the latest paper we should use as reference? I'll reproduce in the code the definition we have there.

https://eprint.iacr.org/2022/1577.pdf

Contributor Author

@Al-Kindi-0 it's clear what needs to be done. As pointed out by @grjte, the capacity will be either one or zero.

The last question I have is about one of the rules in the paper (2.5: the zero-length input is not allowed). Why is that? It's not immediately obvious, because if we just permute zeros we still get a valid output (I think). And, considering sponge hashes are usually infallible, I guess we have to produce an output for this case anyway.

Collaborator

I think this was mainly done in order to simplify (thinking about) the new padding rule. Hashing only zeros is still allowed, but the length is then not zero.

@vlopes11 vlopes11 force-pushed the vlopes11-44-fix-rpo256-sponge-pad branch from bfc1c18 to 82e2d5d Compare January 26, 2023 12:30
@vlopes11 vlopes11 requested a review from itzmeanjan January 26, 2023 12:41
@vlopes11 vlopes11 force-pushed the vlopes11-44-fix-rpo256-sponge-pad branch from 82e2d5d to 7dba1c2 Compare January 26, 2023 13:50
@bobbinth (Contributor)

A couple of comments on this:

  1. We should use the same padding rule as for hash_elements() here. Specifically, if the number of elements to be hashed is a multiple of 8, we set capacity registers to ZERO's. If it is not a multiple of 8, then we set the first capacity register to ONE and the rest to ZERO's, and also append ONE to the input element string.
  2. For binary strings, we should keep splitting the input into 7-byte chunks. Splitting it into 8-byte chunks would cause collisions, as 8-byte chunks don't map one-to-one to field elements (see the sketch after this list).
  3. We may need to apply additional padding to the last chunk. Specifically, we need to append 1 to the string before converting the last chunk into a field element. I think this is already being done, but I'd probably add a test to make sure something like [0; 9] and [0; 10] hash to different values.
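
A minimal sketch of the collision point in item 2, assuming the field is the 64-bit Goldilocks field with modulus p = 2^64 - 2^32 + 1 (the modulus value is an assumption about Felt here); a 7-byte chunk is at most 2^56 - 1 < p, so 7-byte chunks map injectively into the field:

// assumed modulus: p = 2^64 - 2^32 + 1
const P: u64 = 0xFFFF_FFFF_0000_0001;

/// Illustrative reduction of a raw u64 into the field; the real Felt type uses
/// Montgomery form, but the collision argument is the same.
fn to_field(x: u64) -> u64 {
    x % P
}

fn main() {
    // two distinct 8-byte chunks that reduce to the same field element
    let a: u64 = 5;
    let b: u64 = a + P; // no overflow since a < 2^32 - 1
    assert_ne!(a, b);
    assert_eq!(to_field(a), to_field(b));

    // 7-byte chunks are at most 2^56 - 1, which is below p, so distinct
    // 7-byte chunks always produce distinct field elements
    assert!((1u64 << 56) - 1 < P);
}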

@vlopes11 vlopes11 force-pushed the vlopes11-44-fix-rpo256-sponge-pad branch from 7dba1c2 to 91ba526 Compare February 2, 2023 18:59
@vlopes11 vlopes11 requested review from bobbinth and grjte February 2, 2023 19:04
@vlopes11 vlopes11 force-pushed the vlopes11-44-fix-rpo256-sponge-pad branch from 91ba526 to a74d091 Compare February 2, 2023 20:01
@vlopes11 vlopes11 requested a review from bobbinth February 2, 2023 20:02
@vlopes11 vlopes11 force-pushed the vlopes11-44-fix-rpo256-sponge-pad branch from a74d091 to d912fec Compare February 2, 2023 20:20
@bobbinth left a comment (Contributor)

Looks good! Thank you! I left a couple of small comments inline.

@vlopes11 vlopes11 force-pushed the vlopes11-44-fix-rpo256-sponge-pad branch from d912fec to fd08b2a Compare February 2, 2023 22:35
@vlopes11 vlopes11 force-pushed the vlopes11-44-fix-rpo256-sponge-pad branch from fd08b2a to 2eb7dba Compare February 3, 2023 12:05
@vlopes11 vlopes11 requested a review from bobbinth February 3, 2023 13:16
@vlopes11 vlopes11 force-pushed the vlopes11-44-fix-rpo256-sponge-pad branch from 2eb7dba to 1e14270 Compare February 6, 2023 14:23
@vlopes11 vlopes11 force-pushed the vlopes11-44-fix-rpo256-sponge-pad branch from 1e14270 to b70f8d5 Compare February 8, 2023 17:38
@vlopes11 vlopes11 requested a review from bobbinth February 8, 2023 19:28
@vlopes11 vlopes11 force-pushed the vlopes11-44-fix-rpo256-sponge-pad branch from b70f8d5 to ed36ebc Compare February 9, 2023 12:09
@grjte grjte requested review from grjte and removed request for grjte February 9, 2023 14:56
@bobbinth left a comment (Contributor)

All looks good! Thank you!

@grjte left a comment (Contributor)

Looks good, thank you!

@bobbinth bobbinth merged commit 66da469 into next Feb 9, 2023
@bobbinth bobbinth deleted the vlopes11-44-fix-rpo256-sponge-pad branch February 9, 2023 19:43