Optimizing EIP-4844 transaction validation for mempool (using KZG proofs) (#5088)

* Fix missing variables/funcs in validate_blob_transaction_wrapper()

There is no `tx.message.blob_commitments` anymore, or `kzg_to_commitment()`

* Introduce KZGProof as its own type instead of using KZGCommitment

* Introduce high-level logic of new efficient transaction validation

To validate a 4844 transaction in the mempool, the verifier checks that each provided KZG commitment matches the
polynomial represented by the corresponding blob data.

     | d_1 | d_2 | d_3 | ... | d_4096 |    -> commitment

Before this patch, to do this validation, we reconstructed the commitment from the blob data (d_i above), and checked
it against the provided commitment. This was expensive because computing a commitment from blob data (even using
Lagrange basis) involves N scalar multiplications, where N is the number of field elements per blob.

Initial benchmarking showed that this took about 40ms for N=4096, which was deemed too expensive. For more details see:
             https://hackmd.io/@protolambda/eip-4844-implementer-notes#Optimizations
             protolambda/go-ethereum#4

In this patch, we speed this up by providing a KZG proof for each commitment. The verifier can check that proof to
ensure that the KZG commitment matches the polynomial represented by the corresponding blob data.

     | d_1 | d_2 | d_3 | ... | d_4096 |    -> commitment, proof

To do so, we evaluate the blob data polynomial at a random point `x` to get a value `y`. We then use the KZG proof to
ensure that the committed polynomial (i.e. the commitment) also evaluates to `y` at `x`. If the check passes, it means
that the KZG commitment matches the polynomial represented by the blob data.
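
As a toy illustration of why a single random evaluation point is enough, the sketch below counts how often two distinct polynomials agree. It is not part of this patch: the modulus `P = 97` and the helpers `eval_poly` and `count_agreements` are made up for the example, and no KZG commitments or pairings are involved. Two distinct polynomials of degree below N can agree on at most N-1 points of the field:

```python
P = 97  # toy prime modulus, assumed for illustration (the spec uses BLS_MODULUS)


def eval_poly(coeffs, x, p=P):
    """Evaluate a coefficient-form polynomial at x modulo p (Horner's rule)."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc


def count_agreements(f, g, p=P):
    """Count the field points where f and g evaluate to the same value."""
    return sum(1 for x in range(p) if eval_poly(f, x, p) == eval_poly(g, x, p))


f = [3, 1, 4, 1]  # 3 + x + 4x^2 + x^3
g = [3, 1, 4, 2]  # differs from f only in the top coefficient

# f - g = -x^3, so the two polynomials agree only at x = 0
agreements = count_agreements(f, g)
print(agreements, "agreement(s) out of", P, "points")
```

With the real `BLS_MODULUS` (roughly 2^255) and N=4096, a uniformly random point would land on such an agreement point with probability at most about 4095 / 2^255.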

This is significantly faster since evaluating the blob data polynomial at a random point using the Barycentric formula
can be done efficiently with only field operations (see https://hackmd.io/@vbuterin/barycentric_evaluation). Then,
verifying a KZG proof takes two pairing operations (which take about 0.6ms each). This brings the total verification
cost to about 2 ms per blob.
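
The barycentric evaluation can be sanity-checked on a small field. This is a minimal sketch under assumptions: the modulus 17, the order-4 domain, and the function names `eval_coeffs` and `barycentric_eval` are chosen for illustration and do not appear in the spec. It compares the barycentric result against direct evaluation of the same polynomial in coefficient form:

```python
P = 17                    # toy prime modulus (assumption; the spec uses BLS_MODULUS)
DOMAIN = [1, 4, 16, 13]   # powers of 4, a primitive 4th root of unity mod 17


def eval_coeffs(coeffs, x):
    """Direct (Horner) evaluation of a coefficient-form polynomial mod P."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc


def barycentric_eval(evals, x):
    """
    Evaluate a polynomial given by its values on DOMAIN at a point x outside DOMAIN:
    f(x) = (x^N - 1) / N * sum_i evals[i] * DOMAIN[i] / (x - DOMAIN[i])
    """
    n = len(evals)
    assert x % P not in DOMAIN, "x must lie outside the evaluation domain"
    total = 0
    for f_i, w_i in zip(evals, DOMAIN):
        total += f_i * w_i * pow((x - w_i) % P, -1, P)
    return total * (pow(x, n, P) - 1) * pow(n, -1, P) % P


coeffs = [5, 0, 2, 7]                           # 5 + 2x^2 + 7x^3
evals = [eval_coeffs(coeffs, w) for w in DOMAIN]  # the "blob" in evaluation form

print(barycentric_eval(evals, 2), eval_coeffs(coeffs, 2))
```

Only field additions, multiplications and inversions are needed, which is why this step is cheap compared to the N scalar multiplications of recomputing the commitment. The modular inverse via `pow(a, -1, P)` requires Python 3.8 or later.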

With some additional optimizations (using linear combination tricks like the ones linked above) we can batch all the
blobs together into a single efficient verification, and hence verify the entire transaction in 2.5 ms. The same
techniques can be used to efficiently verify blocks on the consensus side.
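
The batching relies on linearity: evaluating a random linear combination of polynomials gives the same linear combination of their individual evaluations. A rough sketch follows, with several assumptions: a toy modulus, coefficient form instead of evaluation form for brevity, fixed values of `r` and `x` standing in for the Fiat-Shamir challenges, and plain-integer versions of `compute_powers` and `vector_lincomb`. `eval_poly` is a hypothetical helper. The commitment side (`lincomb` over curve points) and the pairing check are omitted:

```python
P = 2**31 - 1  # toy prime modulus (assumption; the spec uses BLS_MODULUS)


def compute_powers(r, n):
    """Return [1, r, r^2, ..., r^(n-1)] modulo P."""
    powers, current = [], 1
    for _ in range(n):
        powers.append(current)
        current = current * r % P
    return powers


def vector_lincomb(vectors, scalars):
    """Column-wise linear combination of equal-length vectors modulo P."""
    out = [0] * len(vectors[0])
    for vec, a in zip(vectors, scalars):
        for i, v in enumerate(vec):
            out[i] = (out[i] + a * v) % P
    return out


def eval_poly(coeffs, x):
    """Horner evaluation of a coefficient-form polynomial modulo P."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc


polys = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # three toy "blobs"
r, x = 12345, 67890                        # stand-ins for hashed challenges

r_powers = compute_powers(r, len(polys))
aggregated = vector_lincomb(polys, r_powers)

# One evaluation of the aggregate equals the combined per-blob evaluations
y_aggregated = eval_poly(aggregated, x)
y_combined = sum(rp * eval_poly(p, x) for rp, p in zip(r_powers, polys)) % P
print(y_aggregated == y_combined)
```

Because the same combination can be applied to the commitments, a single proof check on the aggregate covers every blob in the transaction.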

* Introduce polynomial helper functions for transaction validation

* Implement high-level logic of aggregated proof verification

* Add helper functions for aggregated proof verification

Also abstract `lincomb()` out of the `blob_to_kzg()` function to be used in the verification.

* Fixes after review on the consensus PR
asn-d6 authored Jun 29, 2022
1 parent 62f2847 commit 0cf9afe
Showing 1 changed file with 100 additions and 15 deletions.
115 changes: 100 additions & 15 deletions EIPS/eip-4844.md
@@ -46,6 +46,7 @@ Compared to full data sharding, this EIP has a reduced cap on the number of thes
 | `BLS_MODULUS` | `52435875175126190479447740508185965837690552500527637822603658699938581184513` |
 | `KZG_SETUP_G2` | `Vector[G2Point, FIELD_ELEMENTS_PER_BLOB]`, contents TBD |
 | `KZG_SETUP_LAGRANGE` | `Vector[KZGCommitment, FIELD_ELEMENTS_PER_BLOB]`, contents TBD |
+| `ROOTS_OF_UNITY` | `Vector[BLSFieldElement, FIELD_ELEMENTS_PER_BLOB]` |
 | `BLOB_COMMITMENT_VERSION_KZG` | `Bytes1(0x01)` |
 | `POINT_EVALUATION_PRECOMPILE_ADDRESS` | `Bytes20(0x14)` |
 | `POINT_EVALUATION_PRECOMPILE_GAS` | `50000` |
@@ -71,21 +72,24 @@ Compared to full data sharding, this EIP has a reduced cap on the number of thes
 | `Blob` | `Vector[BLSFieldElement, FIELD_ELEMENTS_PER_BLOB]` | |
 | `VersionedHash` | `Bytes32` | |
 | `KZGCommitment` | `Bytes48` | Same as BLS standard "is valid pubkey" check but also allows `0x00..00` for point-at-infinity |
+| `KZGProof` | `Bytes48` | Same as for `KZGCommitment` |

 ### Helpers

 Converts a blob to its corresponding KZG point:

 ```python
+def lincomb(points: List[KZGCommitment], scalars: List[BLSFieldElement]) -> KZGCommitment:
+    """
+    BLS multiscalar multiplication. This function can be optimized using Pippenger's algorithm and variants.
+    """
+    r = bls.Z1
+    for x, a in zip(points, scalars):
+        r = bls.add(r, bls.multiply(x, a))
+    return r
+
 def blob_to_kzg(blob: Blob) -> KZGCommitment:
-    computed_kzg = bls.Z1
-    for value, point_kzg in zip(blob, KZG_SETUP_LAGRANGE):
-        assert value < BLS_MODULUS
-        computed_kzg = bls.add(
-            computed_kzg,
-            bls.multiply(point_kzg, value)
-        )
-    return computed_kzg
+    return lincomb(KZG_SETUP_LAGRANGE, blob)
 ```

 Converts a KZG point into a versioned hash:
@@ -101,7 +105,7 @@ Verifies a KZG evaluation proof:
 def verify_kzg_proof(polynomial_kzg: KZGCommitment,
                      x: BLSFieldElement,
                      y: BLSFieldElement,
-                     quotient_kzg: KZGCommitment):
+                     quotient_kzg: KZGProof) -> bool:
     # Verify: P - y = Q * (X - x)
     X_minus_x = bls.add(KZG_SETUP_G2[1], bls.multiply(bls.G2, BLS_MODULUS - x))
     P_minus_y = bls.add(polynomial_kzg, bls.multiply(bls.G1, BLS_MODULUS - y))
@@ -111,6 +115,39 @@ def verify_kzg_proof(polynomial_kzg: KZGCommitment,
     ])
 ```

+Efficiently evaluates a polynomial in evaluation form using the barycentric formula:
+
+```python
+def bls_modular_inverse(x: BLSFieldElement) -> BLSFieldElement:
+    """
+    Compute the modular inverse of x
+    i.e. return y such that x * y % BLS_MODULUS == 1 and return 0 for x == 0
+    """
+    return pow(x, -1, BLS_MODULUS) if x != 0 else 0
+
+
+def div(x, y):
+    """Divide two field elements: `x` by `y`"""
+    return x * bls_modular_inverse(y) % BLS_MODULUS
+
+
+def evaluate_polynomial_in_evaluation_form(poly: List[BLSFieldElement], x: BLSFieldElement) -> BLSFieldElement:
+    """
+    Evaluate a polynomial (in evaluation form) at an arbitrary point `x`
+    Uses the barycentric formula:
+       f(x) = (x**WIDTH - 1) / WIDTH * sum_(i=0)^(WIDTH-1) (f(DOMAIN[i]) * DOMAIN[i]) / (x - DOMAIN[i])
+    """
+    width = len(poly)
+    assert width == FIELD_ELEMENTS_PER_BLOB
+    inverse_width = bls_modular_inverse(width)
+    r = 0  # accumulator for the barycentric sum
+    for i in range(width):
+        r += div(poly[i] * ROOTS_OF_UNITY[i], (x - ROOTS_OF_UNITY[i]))
+    r = r * (pow(x, width, BLS_MODULUS) - 1) * inverse_width % BLS_MODULUS
+
+    return r
+```
+
 Approximates `2 ** (numerator / denominator)`, with the simplest possible approximation that is continuous and has a continuous derivative:

 ```python
@@ -321,20 +358,68 @@ class BlobTransactionNetworkWrapper(Container):
     blob_kzgs: List[KZGCommitment, MAX_TX_WRAP_KZG_COMMITMENTS]
     # BLSFieldElement = uint256
     blobs: List[Vector[BLSFieldElement, FIELD_ELEMENTS_PER_BLOB], LIMIT_BLOBS_PER_TX]
+    # KZGProof = Bytes48
+    kzg_aggregated_proof: KZGProof
 ```

 We do network-level validation of `BlobTransactionNetworkWrapper` objects as follows:

 ```python
+def hash_to_bls_field(x: Container) -> BLSFieldElement:
+    """
+    This function is used to generate Fiat-Shamir challenges. The output is not uniform over the BLS field.
+    """
+    return int.from_bytes(hash_tree_root(x), "little") % BLS_MODULUS
+
+
+def compute_powers(x: BLSFieldElement, n: uint64) -> List[BLSFieldElement]:
+    current_power = 1
+    powers = []
+    for _ in range(n):
+        powers.append(BLSFieldElement(current_power))
+        current_power = current_power * int(x) % BLS_MODULUS
+    return powers
+
+def vector_lincomb(vectors: List[List[BLSFieldElement]], scalars: List[BLSFieldElement]) -> List[BLSFieldElement]:
+    """
+    Given a list of vectors, compute the linear combination of each column with `scalars`, and return the resulting
+    vector.
+    """
+    r = [0]*len(vectors[0])
+    for v, a in zip(vectors, scalars):
+        for i, x in enumerate(v):
+            r[i] = (r[i] + a * x) % BLS_MODULUS
+    return [BLSFieldElement(x) for x in r]
+
 def validate_blob_transaction_wrapper(wrapper: BlobTransactionNetworkWrapper):
     versioned_hashes = wrapper.tx.message.blob_versioned_hashes
-    kzgs = wrapper.blob_kzgs
+    commitments = wrapper.blob_kzgs
     blobs = wrapper.blobs
-    assert len(versioned_hashes) == len(kzgs) == len(blobs)
-    for versioned_hash, kzg, blob in zip(versioned_hashes, kzgs, blobs):
-        # note: assert blob is not malformatted
-        assert kzg == blob_to_kzg(blob)
-        assert versioned_hash == kzg_to_versioned_hash(kzg)
+    # note: assert blobs are not malformatted
+
+    assert len(versioned_hashes) == len(commitments) == len(blobs)
+    number_of_blobs = len(blobs)
+
+    # Generate random linear combination challenges
+    r = hash_to_bls_field([blobs, commitments])
+    r_powers = compute_powers(r, number_of_blobs)
+
+    # Compute commitment to aggregated polynomial
+    aggregated_poly_commitment = lincomb(commitments, r_powers)
+
+    # Create aggregated polynomial in evaluation form
+    aggregated_poly = vector_lincomb(blobs, r_powers)
+
+    # Generate challenge `x` and evaluate the aggregated polynomial at `x`
+    x = hash_to_bls_field([aggregated_poly, aggregated_poly_commitment])
+    y = evaluate_polynomial_in_evaluation_form(aggregated_poly, x)
+
+    # Verify aggregated proof
+    assert verify_kzg_proof(aggregated_poly_commitment, x, y, wrapper.kzg_aggregated_proof)
+
+    # Now that all commitments have been verified, check that versioned_hashes matches the commitments
+    for versioned_hash, commitment in zip(versioned_hashes, commitments):
+        assert versioned_hash == kzg_to_versioned_hash(commitment)
 ```

 ## Rationale