feature: minimal human-readable serialization of uints #243

Merged
merged 8 commits into from
May 30, 2023
1 change: 1 addition & 0 deletions Changelog.md
@@ -11,6 +11,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ### Changed

+- Use Ethereum `Quantity` encoding for serde serialization when human-readable
 - Updated `ark` to `0.4`, `fastrlp` to `0.3` and `pyo3` to `0.18`.

 ## [1.8.0] — 2023-04-19
2 changes: 1 addition & 1 deletion Readme.md
@@ -120,7 +120,7 @@ named feature flag.
 * [`arbitrary`](https://docs.rs/arbitrary): Implements the [`Arbitrary`](https://docs.rs/arbitrary/latest/arbitrary/trait.Arbitrary.html) trait, allowing [`Uint`]s to be generated for fuzz testing.
 * [`quickcheck`](https://docs.rs/quickcheck): Implements the [`Arbitrary`](https://docs.rs/quickcheck/latest/quickcheck/trait.Arbitrary.html) trait, allowing [`Uint`]s to be generated for property based testing.
 * [`proptest`](https://docs.rs/proptest): Implements the [`Arbitrary`](https://docs.rs/proptest/latest/proptest/arbitrary/trait.Arbitrary.html) trait, allowing [`Uint`]s to be generated for property based testing. Proptest is used for the `uint`s own test suite.
-* [`serde`](https://docs.rs/serde): Implements the [`Serialize`](https://docs.rs/serde/latest/serde/trait.Serialize.html) and [`Deserialize`](https://docs.rs/serde/latest/serde/trait.Deserialize.html) traits for [`Uint`] using big-endian hex in human readable formats and big-endian byte strings in machine readable formats.
+* [`serde`](https://docs.rs/serde): Implements the [`Serialize`](https://docs.rs/serde/latest/serde/trait.Serialize.html) and [`Deserialize`](https://docs.rs/serde/latest/serde/trait.Deserialize.html) traits for [`Uint`] and [`Bits`]. Serialization uses big-endian hex in human readable formats and big-endian byte strings in machine readable formats. [`Uint`] uses the Ethereum `Quantity` format (a `0x`-prefixed hex string with no leading zeros) when serializing in a human readable format.
 * [`rlp`](https://docs.rs/rlp): Implements the [`Encodable`](https://docs.rs/rlp/latest/rlp/trait.Encodable.html) and [`Decodable`](https://docs.rs/rlp/latest/rlp/trait.Decodable.html) traits for [`Uint`] to allow serialization to/from RLP.
 * [`fastrlp`](https://docs.rs/fastrlp): Implements the [`Encodable`](https://docs.rs/fastrlp/latest/fastrlp/trait.Encodable.html) and [`Decodable`](https://docs.rs/fastrlp/latest/fastrlp/trait.Decodable.html) traits for [`Uint`] to allow serialization to/from RLP.
 * [`primitive-types`](https://docs.rs/primitive-types): Implements the [`From<_>`] conversions between corresponding types.
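
To make the distinction in the `serde` entry concrete, here is a minimal standalone sketch of the two human-readable forms, written against plain big-endian byte slices rather than the crate's own types. The helper names `hex_full` and `hex_minimal` are hypothetical and not part of the crate; they only mirror the behaviour the entry describes (full-width hex for `Bits`, `Quantity`-style minimal hex for `Uint`).

```rust
use std::fmt::Write;

/// Hypothetical helper: full-width form. Every byte is written as two hex
/// digits, so leading zeros are preserved. An empty input maps to "0x0".
fn hex_full(be_bytes: &[u8]) -> String {
    if be_bytes.is_empty() {
        return "0x0".to_string();
    }
    let mut out = String::with_capacity(2 + be_bytes.len() * 2);
    out.push_str("0x");
    for byte in be_bytes {
        write!(out, "{byte:02x}").unwrap();
    }
    out
}

/// Hypothetical helper: minimal (`Quantity`-style) form. Leading zero bytes
/// are skipped, the first non-zero byte is written without padding, and the
/// remaining bytes are written as two digits each. Zero maps to "0x0".
fn hex_minimal(be_bytes: &[u8]) -> String {
    let mut bytes = be_bytes.iter().skip_while(|b| **b == 0);
    let mut out = match bytes.next() {
        Some(first) => format!("0x{first:x}"),
        None => return "0x0".to_string(),
    };
    for byte in bytes {
        write!(out, "{byte:02x}").unwrap();
    }
    out
}
```

For the bytes `[0x00, 0x00, 0x01, 0x0a]`, `hex_minimal` yields `0x10a` while `hex_full` yields `0x0000010a`.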
101 changes: 75 additions & 26 deletions src/support/serde.rs
@@ -10,28 +10,69 @@ use serde::{
 };
 use std::{fmt::Write, str};

+/// Canonical serialization for all human-readable instances of `Uint<0, 0>`,
+/// and minimal human-readable `Uint<BITS, LIMBS>::ZERO` for any bit size.
+const ZERO_STR: &str = "0x0";
+
+impl<const BITS: usize, const LIMBS: usize> Uint<BITS, LIMBS> {
+    fn serialize_human_full<S: Serializer>(&self, s: S) -> Result<S::Ok, S::Error> {
+        if BITS == 0 {
+            return s.serialize_str(ZERO_STR);
+        }
+
+        let mut result = String::with_capacity(2 + nbytes(BITS) * 2);
+        result.push_str("0x");
+
+        self.as_le_bytes()
+            .iter()
+            .rev()
+            .try_for_each(|byte| write!(result, "{byte:02x}"))
+            .unwrap();
+
+        s.serialize_str(&result)
+    }
+
+    fn serialize_human_minimal<S: Serializer>(&self, s: S) -> Result<S::Ok, S::Error> {
+        if BITS == 0 {
+            return s.serialize_str(ZERO_STR);
+        }
+
+        let le_bytes = self.as_le_bytes();
+        let mut bytes = le_bytes.iter().rev().skip_while(|b| **b == 0);
+
+        // We avoid String allocation if there is no non-0 byte
+        // If there is a first byte, we allocate a string, and write the prefix
+        // and first byte to it
+        let mut result = match bytes.next() {
+            Some(b) => {
+                let mut result = String::with_capacity(2 + nbytes(BITS) * 2);
+                write!(result, "0x{b:x}").unwrap();
+                result
+            }
+            None => return s.serialize_str(ZERO_STR),
+        };
+        bytes
+            .try_for_each(|byte| write!(result, "{byte:02x}"))
+            .unwrap();
+
+        s.serialize_str(&result)
+    }
+
+    fn serialize_binary<S: Serializer>(&self, s: S) -> Result<S::Ok, S::Error> {
+        s.serialize_bytes(&self.to_be_bytes_vec())
+    }
+}
+
 /// Serialize a [`Uint`] value.
 ///
 /// For human readable formats a `0x` prefixed lower case hex string is used.
 /// For binary formats a byte array is used. Leading zeros are included.
 impl<const BITS: usize, const LIMBS: usize> Serialize for Uint<BITS, LIMBS> {
     fn serialize<S: Serializer>(&self, serializer: S) -> Result<S::Ok, S::Error> {
-        let bytes = self.to_be_bytes_vec();
         if serializer.is_human_readable() {
-            // Special case for zero, which encodes as `0x0`.
-            if BITS == 0 {
-                return serializer.serialize_str("0x0");
-            }
-            // OPT: Allocation free method.
-            let mut result = String::with_capacity(2 * Self::BYTES + 2);
-            result.push_str("0x");
-            for byte in bytes {
-                write!(result, "{byte:02x}").unwrap();
-            }
-            serializer.serialize_str(&result)
+            self.serialize_human_minimal(serializer)
         } else {
-            // Write as bytes directly
-            serializer.serialize_bytes(&bytes[..])
+            self.serialize_binary(serializer)
         }
     }
 }
@@ -51,7 +92,11 @@ impl<'de, const BITS: usize, const LIMBS: usize> Deserialize<'de> for Uint<BITS,

 impl<const BITS: usize, const LIMBS: usize> Serialize for Bits<BITS, LIMBS> {
     fn serialize<S: Serializer>(&self, serializer: S) -> Result<S::Ok, S::Error> {
-        self.as_uint().serialize(serializer)
+        if serializer.is_human_readable() {
+            self.as_uint().serialize_human_full(serializer)
+        } else {
+            self.as_uint().serialize_binary(serializer)
+        }
     }
 }

@@ -75,18 +120,17 @@ impl<'de, const BITS: usize, const LIMBS: usize> Visitor<'de> for StrVisitor<BIT
     where
         E: Error,
     {
-        let value = trim_hex_prefix(value);
-
-        // ensure the string is the correct length (two characters in the string per
-        // byte) exception: zero, for a Uint where BITS == 0
-        if BITS == 0 {
-            // special case, this class of ints has one member, zero.
-            // zero is represented as "0x0" only
-            if value != "0" {
-                return Err(Error::invalid_value(Unexpected::Str(value), &self));
-            }
+        // Shortcut for common case
+        if value == ZERO_STR {
             return Ok(Uint::<BITS, LIMBS>::ZERO);
         }
+        // `ZERO_STR` is the only valid serialization of `Uint<0, 0>`, so if we
+        // have not shortcut, we are in an error case
+        if BITS == 0 {
+            return Err(Error::invalid_value(Unexpected::Str(value), &self));
+        }
+
+        let value = trim_hex_prefix(value);
         if nbytes(BITS) * 2 < value.len() {
             return Err(Error::invalid_length(value.len(), &self));
         }
@@ -105,7 +149,7 @@ impl<'de, const BITS: usize, const LIMBS: usize> Visitor<'de> for StrVisitor<BIT
             }
             limbs[i] = limb;
         }
-        if BITS > 0 && limbs[LIMBS - 1] > Self::Value::MASK {
+        if limbs[LIMBS - 1] > Self::Value::MASK {
             return Err(Error::invalid_value(Unexpected::Str(value), &self));
         }
         Ok(Uint::from_limbs(limbs))
@@ -163,6 +207,11 @@ mod tests {
                 let deserialized = serde_json::from_str(&serialized).unwrap();
                 assert_eq!(value, deserialized);
             });
+            proptest!(|(value: Bits<BITS, LIMBS>)| {
+                let serialized = serde_json::to_string(&value).unwrap();
+                let deserialized = serde_json::from_str(&serialized[..]).unwrap();
+                assert_eq!(value, deserialized);
+            });
         });
     }

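
The round-trip tests above pass for both `Uint` (minimal output) and `Bits` (full-width output) because the string visitor only rejects input that is too long, not input that is shorter than the full width. A rough sketch of that acceptance logic, specialised to `u64` so it runs without the crate, might look as follows. The function name `parse_hex_u64` is hypothetical, and treating the `0x` prefix as optional is an assumption about what `trim_hex_prefix` does, since its body is not shown in this diff.

```rust
/// Hypothetical stand-in for the visitor's string handling, specialised to
/// `u64` (8 bytes, so at most 16 hex digits).
fn parse_hex_u64(value: &str) -> Result<u64, String> {
    // Shortcut for the canonical zero string, as in the diff.
    if value == "0x0" {
        return Ok(0);
    }

    // Assumption: the prefix is optional and simply stripped when present.
    let digits = value.strip_prefix("0x").unwrap_or(value);

    // Reject input with more than two hex digits per byte. Shorter input
    // (the minimal form) is accepted.
    if digits.len() > 8 * 2 {
        return Err(format!("invalid length {}", digits.len()));
    }

    // `from_str_radix` would accept a leading `+`, so check the digits first.
    if !digits.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Err(format!("invalid value {value:?}"));
    }

    // An empty digit string is reported as an error by `from_str_radix`.
    u64::from_str_radix(digits, 16).map_err(|e| e.to_string())
}
```

Under these assumptions the minimal string `0x10a` and the zero-padded string `0x000000000000010a` parse to the same value, while a 17-digit string is rejected as too long.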