Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updates to ADR-020 (protobuf signing) #6359

Merged
merged 7 commits into from
Jun 10, 2020
Merged

Conversation

webmaster128
Copy link
Member

Description

This PR makes two independent changes (one commit each) to ADR-020:

  1. Explain the source of the signers list, which is hidden implicitely in the list of messages. This implies that a verifier needs to be able to understand each message in order to calculate each message's required signers. Explain how to obtain public keys when left out. Also explain how to connect the signature information spread between messages, signer_infos and signatures.
  2. Use the same SignDoc for signing and verification. Before the default way was to assume serialize_signer(a: SignDoc) == serialize_verifier(b: SignDocRaw). This not only assumes that two different protobuf implementations produce the same bytes but also that the two document types have the same binary representation, which (even if true in some cases) makes many assumptions on the binary representation of protobuf and is very hard to understand and review. With this change, we just need serialize_signer(a: SignDoc) == serialize_verifier(b: SignDoc), which we agreed is managable to get right in many different implementations. Along the way the outdated types TxRaw and SignDocRaw got obsolete.

Before we can merge this PR, please make sure that all the following items have been
checked off. If any of the checklist items are not applicable, please leave them but
write a little note why.

  • Targeted PR against correct branch (see CONTRIBUTING.md)
  • Linked to Github issue with discussion and accepted design OR link to spec that describes this work.
  • Code follows the module structure standards.
  • Wrote unit and integration tests
  • Updated relevant documentation (docs/) or specification (x/<module>/spec/)
  • Added relevant godoc comments.
  • Added a relevant changelog entry to the Unreleased section in CHANGELOG.md
  • Re-reviewed Files changed in the Github PR explorer

@alexanderbez alexanderbez added T: ADR An issue or PR relating to an architectural decision record R4R labels Jun 7, 2020
@aaronc
Copy link
Member

aaronc commented Jun 8, 2020

I'm on board with 1) and having additional documentation for clarity.

I don't think the reduction of Tx to have bytes fields is better than having two types Tx and TxRaw. Having just one SignDoc type withbytes fields makes sense.

@webmaster128
Copy link
Member Author

I don't think the reduction of Tx to have bytes fields is better than having two types Tx and TxRaw. Having just one SignDoc type withbytes fields makes sense.

Okay, so if you are a transaction verifier, you get a Tx: how do you know which binary represenation of e.g. TxBody was used by the signer?

@aaronc
Copy link
Member

aaronc commented Jun 8, 2020

I don't think the reduction of Tx to have bytes fields is better than having two types Tx and TxRaw. Having just one SignDoc type withbytes fields makes sense.

Okay, so if you are a transaction verifier, you get a Tx: how do you know which binary represenation of e.g. TxBody was used by the signer?

As a transaction verifier you decode TxRaw to do signature verification. That's what's happening in #6216.

@webmaster128
Copy link
Member Author

As a transaction verifier you decode TxRaw to do signature verification.

Aha – that's interestion. Does it mean that TxRaw is what is propagated in p2p and stored in Tendermint? Then we're on the same page and just use different naming for things.

@aaronc
Copy link
Member

aaronc commented Jun 8, 2020

As a transaction verifier you decode TxRaw to do signature verification.

Aha – that's interestion. Does it mean that TxRaw is what is propagated in p2p and stored in Tendermint? Then we're on the same page and just use different naming for things.

Tx and TxRaw are semantically equivalent but TxRaw is the safe thing for clients to broadcast IMHO. But I agreed to say that Tx could be broadcasted because in most cases clients will re-encode TxBody equivalently and some people commented that just having one Tx type is simpler.

@webmaster128
Copy link
Member Author

webmaster128 commented Jun 8, 2020

Tx and TxRaw are semantically equivalent but TxRaw is the safe thing for clients to broadcast IMHO.

I don't think this is the case. They store the same content but TxRaw pins a serialization of TxBody and AuthInfo, which Tx does not do. TxBody is a complex structure with optionals and list and we need to know how to serialize it.

in most cases clients will re-encode TxBody equivalently

This might be the case but I think we need to be able to specify how this encoding works in order to not depend on the implementation specifics of one library. The two clean solutions are:

  1. specifying a canonicalization
  2. pinning one binary represenation as in TxRaw

In a call we decided to go for the later. It is okay to switch back to 1. but please let's not create a specification by reference implementation.

and some people commented that just having one Tx type is simpler

This was my motivation as well, which is why I proposed the structure of TxRaw as the only tx type here.

@aaronc
Copy link
Member

aaronc commented Jun 8, 2020

in most cases clients will re-encode TxBody equivalently

This might be the case but I think we need to be able to specify how this encoding works in order to not depend on the implementation specifics of one library.

I agree we should be explicit and using bytes in SignDoc and TxRaw is a safer way.

  1. specifying a canonicalization
  2. pinning one binary represenation as in TxRaw

In a call we decided to go for the later. It is okay to switch back to 1. but please let's not create a specification by reference implementation.

I'd prefer to stick with 2 as canonicalization has other problems, notably the inability to have non-critical extension fields, malleability when proto2 semantics are used, and implementation complexity.

and some people commented that just having one Tx type is simpler

This was my motivation as well, which is why I proposed the structure of TxRaw as the only tx type here.

I honestly think the best compromise is having two tx types - Tx for general-purpose decoding and TxRaw for signing, verification and broadcasting. I don't think it's that big a deal to have two. I certainly think having two types is simpler than the implementation complexity and tradeoffs of canonicalization.

@webmaster128
Copy link
Member Author

I honestly think the best compromise is having two tx types - Tx for general-purpose decoding and TxRaw for signing, verification and broadcasting.

Sure, fine for me. I'll update this PR accordingly.

@webmaster128
Copy link
Member Author

I added a commit that splits the type into the more convenient Tx and the more hardened TxRaw used for signing, breadcasting, storage, txhash calculation and verification.

@alexanderbez I marked your comment as resolved since it is the same that is discussed here in the main discussion. I guess the 3rd commit addresses it. If not, please feel free to follow up in the latest diff.

@alexanderbez
Copy link
Contributor

I'll delegate approval/merge/close to @aaronc here 👍

Copy link
Member

@aaronc aaronc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@webmaster128 this looks pretty good. I just made a few grammar suggestions and a couple suggestions that hopefully improve clarity. Other than that, I'm fine to merge.

Thanks for your contribution by the way!

Co-authored-by: Aaron Craelius <aaron@regen.network>
Copy link
Member Author

@webmaster128 webmaster128 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Co-authored-by: Aaron Craelius <aaron@regen.network>
@webmaster128
Copy link
Member Author

All suggestions committed

@aaronc
Copy link
Member

aaronc commented Jun 9, 2020

Thanks @webmaster128! Can you merge master in please?

@aaronc aaronc added the A:automerge Automatically merge PR once all prerequisites pass. label Jun 9, 2020
@webmaster128
Copy link
Member Author

I updated this but the bot refused to merge 🤷‍♂️. I now set Allow edits by maintainers to true – have fun

@aaronc
Copy link
Member

aaronc commented Jun 10, 2020

Hmm... I don't see a way to update the base branch myself. Maybe @Mergifyio update.

@alexanderbez taking your last comment as an implicit ACK, so when this is up to date, I'm planning to merge.

@mergify
Copy link
Contributor

mergify bot commented Jun 10, 2020

Command update .: success

Branch has been successfully updated

@fedekunze
Copy link
Collaborator

This branch is out-of-date with the base branch

@webmaster128
Copy link
Member Author

Merged master again. @Mergifyio update? Is this the magic word?

@mergify
Copy link
Contributor

mergify bot commented Jun 10, 2020

@webmaster128 is not allowed to run commands

@webmaster128
Copy link
Member Author

@webmaster128 is not allowed to run commands

😝

@aaronc aaronc merged commit b9f39c9 into cosmos:master Jun 10, 2020
@webmaster128 webmaster128 deleted the adr20-updates branch June 10, 2020 14:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A:automerge Automatically merge PR once all prerequisites pass. T: ADR An issue or PR relating to an architectural decision record
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants