Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cardano shelley update 2/3 #1112

Merged
merged 35 commits into from
Jul 23, 2020

Conversation

gabrielKerekes
Copy link
Contributor

Cardano Shelley update issue: #948

As promised (#1048) we are splitting the Cardano Shelley update into (at least) 3 PRs. This is the second of them.

This PR contains derivation of all the addresses supported in Shelley - Base Address, Pointer Address, Enterprise Address, Reward Address (Byron Address is also supported, but this has been added in the first PR (#1096)) as well as their use in transactions.

What's changed:

  • CardanoAddressParametersType has been added to protobuf, since we can now derive different addreses. Since this is used both in CardanoGetAddress and CardanoSignTx it was simpler to create a type around the parameters.
  • network_id has been added to CardanoGetAddress and also CardanoSignTx messages, since this is what is used to distinguish between mainnet/testnet in Shelley addresses. Protocol Magic and Network ID are unfortunately not the same thing.
  • Transaction output address now has to be received as bytes. The other alternative is receiving base58 or bech32 encoded strings, which could get messy while decoding (which decoder do we use?).
  • The Keychain class in seed.py had to be updated in order to support both Byron (m/44'/1815') and Shelley (m/1852'/1815') namespaces. This needs thorough review.
  • The old address.py file has been moved to byron_address.py and is still used when deriving Byron Addresses.
  • The new address.py file contains derivation of all the Shelley addresses (+ it also wraps Byron Address derivation).
  • Various test cases had to be added in order to test all the different address combinations during address derivation and transaction signing.
  • Helper files moved to a helpers folder so they'd be separated.
  • Bech32 helper file added in order to easily encode bech32 strings longer than segwit.

@matejcik
Copy link
Contributor

  • Transaction output address now has to be received as bytes. The other alternative is receiving base58 or bech32 encoded strings, which could get messy while decoding (which decoder do we use?).

can't the addresses be distinguished by their prefixes?
how do you choose which encoding to use when displaying the address?

@gabrielKerekes
Copy link
Contributor Author

gabrielKerekes commented Jul 17, 2020

can't the addresses be distinguished by their prefixes?
how do you choose which encoding to use when displaying the address?

In encoded form, not really. Bech32 is a subset of Base58 so it might happen that they're indistinguishable. When displaying the address from bytes, we can decide depending on the header - first 4 bits of the address.

@matejcik
Copy link
Contributor

can't the addresses be distinguished by their prefixes?
how do you choose which encoding to use when displaying the address?

In encoded form, not really. Bech32 is a subset of Base58 so it might happen that they're indistinguishable. When displaying the address from bytes, we can decide depending on the header - first 4 bits of the address.

bech32 has a distinct prefix string, isn't Cardano using that to, well, distinguish address types, which is the purpose of that?

I mean, personally I don't care if the host will do the decoding or Trezor does it, it's just strange. It's more of a question of your interchange formats. Currently the transaction format supported in trezorctl isn't used anywhere else, is it? Does Cardano have an interchange format which should be supported (or supportable) in trezorctl? Like, i dunno, output of the daemons/jsonrpc/whatnot, i don't know about the cardano ecosystem.

My point is, let's not make it deliberately more difficult for 3rd parties to use the Cardano API, if possible. If there's tooling on Cardano side to decode raw address bytes, that's that.

@refi93
Copy link
Contributor

refi93 commented Jul 17, 2020

Hi @matejcik

We've verified with IOHK that the way they distinguish Base58/Bech32 addresses is trying to decode as both and go down whatever path succeeds which does not seem safe for a hw wallet as it cannot be ruled out that there are addresses that are both a valid bech32 and base58 encoding and depending on the decoding you use you would get a different, still valid address - seems unlikely but better be safe than sorry.

If bech32/base58 were just two different encodings for the same type of address, we could enforce just one type, like base58 being required for BTC, but that's not the case for Cardano.
Cardano developers also mentioned that we could "reasonably expect to strip off the base58 or bech32 encoding before passing it down to the hardware device".

Regarding the prefix - we were told by cardano devs that they are using either "addr" or "addr_test" prefix (which we were told just today and will need to fix it in this PR) in their bech32 addresses, so there's ambiguity even on that and it cannot be ruled out that new prefixes will appear in the foreseeable future, which would require updating trezor code if we were to parse based on them.

There is also one more (minor) issue - Bech32 decoding on Trezor is currently made for segwit addresses and is limited to 90 bytes. Base addresses in Cardano have a length of 103. So this would also need to change.

Given the points above, we would suggest to stick to the hex format for output addresses for now to avoid potential bugs/exploits emerging from the base58/bech32 ambiguity and once the situation around shelley clears up, we can allow alternative formats. Meanwhile, programmatic users can use libraries to decode bech32/base58 to hex with minimum overhead and ctl users have access to online tools to translate the addresses which is not optimal but better than having the tx signing api bricked by some unexpected change on Cardano devs side given that it has not been launched yet.

@matejcik
Copy link
Contributor

thanks for the explanation.

it cannot be ruled out that there are addresses that are both a valid bech32 and base58 encoding and depending on the decoding you use you would get a different, still valid address - seems unlikely but better be safe than sorry.

that seems extremely unlikely:

  • the checksum (crc32 in Byron, polymod in Shelley presumably) would need to be valid in both variants of the decoding
  • the decoded address format would need to match Byron in the base58 case and Shelley in the bech32 case

Regarding the prefix - we were told by cardano devs that they are using either "addr" or "addr_test" prefix (which we were told just today) in their bech32 addresses, so there's ambiguity even on that and it cannot be ruled out that new prefixes will appear in the foreseeable future, which would require updating trezor code if we were to parse based on them.

This is rather problematic.

Issue one: how do we show Shelley addresses for display if we don't know the correct prefix to use?

Issue two: address checksums depend on the prefix. If we choose the wrong prefix for display, the address displayed on Trezor will not match the address the user was expecting, even after stripping the prefix.

(and one point of order: at no point will we use "hex-encoded" addresses. The host will need to send raw address bytes. It seems that this is what the code already expects (i haven't read the PR in detail yet), but at least some comments mention hex encoding)


Again, i'm not against sending raw address bytes, esp. if that is what IOHK themselves recommend.
But I'm seriously concerned about a design where that is the practical choice.

Another small drawback of not sending address strings is that Trezor cannot do checksum validation. A poorly written host software could accept a wrongly typed address, and the user would be asked to confirm a different address on the screen -- which leads them to think they're under attack while in practice the original address was just mistyped.

@refi93
Copy link
Contributor

refi93 commented Jul 17, 2020

Yes, the issue with showing bech32 addresses correctly given the uncertainty of prefixes is concerning - new prefixes appearing in the future is rather our concern based on the quick changes made to the protocol in the last few weeks, and we will get back to IOHK to get a final confirmation that only addr and addr_test will be used when Shelley is launched.

Regarding the hex encoding I mentioned - sorry, that was just my mistake at the moment of writing it as I was thinking of trezor connect/ledgerjs that use hextstrings to represent bytes in their api.

Regarding the ambiguity of bech32 vs base58 decoding, it's true that even if there was a collision in the sense that the same string could be validly decoded as bech32 and base58 as two different valid Cardano addresses, Trezor would still show to the user the actual address being serialized into the output, so the user would have a chance to easily validate that before signing the transaction.

So we agree to change the transaction outputs to have the address as a string, even on the protobuf layer - that way it would be consistent with the way we return addresses on protobuf layer, we can verify the bech32 prefixes and cautiously abort if a new prefix appears by any chance and have a more pleasant api. The small drawback here being the potential ambiguity of bech32/base58 decoding, but it's apparently outweighted by the benefits and it's not a security threat per se as the user would still have enough visibility to intervene if it happened by any chance. And therefore we will need to extend bech32 decoding to support addresses longer than 90 characters by an optional parameter to supply the max length. Does that seem right to you?

@mmahut
Copy link
Contributor

mmahut commented Jul 17, 2020

(...) the uncertainty of prefixes is concerning

I think the guideline is being defined in cardano-foundation/CIPs#6.

@SebastienGllmt
Copy link

the uncertainty of prefixes is concerning

As Marek linked, the prefixes are decided by CIP6 which is still in draft stage so feel free to propose changes (or things you think shouldn't change!) if it makes it easier for hardware wallets.

We've verified with IOHK that the way they distinguish Base58/Bech32 addresses is trying to decode as both and go down whatever path succeed

Byron addresses will always start with 0b1000 as a prefix. This is a different prefix than all the other address types and so there should be no ambiguity. You just need to encode everything as hex internally and then when displaying something on screen, you choose the display format depending on the prefix. You can see we don't need any kind of back-tracking in our Rust implementation https://github.com/Emurgo/cardano-serialization-lib/blob/master/rust/src/address.rs#L265

@prusnak
Copy link
Member

prusnak commented Jul 18, 2020

the prefixes are decided by CIP6 which is still in draft stage so feel free to propose changes

@SebastienGllmt Can you please shed a light on what date is the hardfork expected? Surely you can't do a hardfork in next two weeks when address format is not set in stone, right?

@SebastienGllmt
Copy link

@prusnak I work for EMURGO and not IOHK so I have no input on the date. Trust me, we're also running into the same issues as both you and VaccumLabs and doing our best to adapt to whatever timeline IOHK sets.

@mmahut
Copy link
Contributor

mmahut commented Jul 18, 2020

Also, please note it is not really a hard-fork (as you known it from Bitcoin), they are using a hardfork combinator to run both protocols in parallel during the switch era.

@prusnak
Copy link
Member

prusnak commented Jul 18, 2020

Also, please note it is not really a hard-fork (as you known it from Bitcoin), they are using a hardfork combinator to run both protocols in parallel during the switch era.

Does that mean that all Shelley firmware changes are optional and not mandatory to stay compatible with the network after the Shelley hardfork is launched?

@mmahut
Copy link
Contributor

mmahut commented Jul 18, 2020

Does that mean that all Shelley firmware changes are optional and not mandatory to stay compatible with the network after the Shelley hardfork is launched?

No, both are needed, as both protocols will be in use (for some time, until Byron is retied).

@prusnak
Copy link
Member

prusnak commented Jul 18, 2020

No, both are needed, as both protocols will be in use (for some time, until Byron is retied).

But one can still use the old code to transact on Byron and this is still supported via the combinator on the mainnet, no?

@mmahut
Copy link
Contributor

mmahut commented Jul 18, 2020

But one can still use the old code to transact on Byron and this is still supported via the combinator on the mainnet, no?

I'm not sure how granular the tx construction is in the current firmware, but once the hardfork combinator is activated via an update proposal to the network, you will be able to use Byron witnesses (that will be accepted by the Byron nodes), but the network will use Shelley format for transactions, fee estimations etc. VacuumLabs, please correct me if I'm wrong.

@refi93
Copy link
Contributor

refi93 commented Jul 18, 2020

we clarified offline with @mmahut that "byron" witnesses in Shelley transactions are indeed different from the witnesses used in Byron transactions even if they are used to "testify" for the same kind of addresses - they are a signature of a different message (in Byron txs, the message signed was a concatenation of protocol magic and tx hash whereas in Shelley the hash alone is being signed) and there are also differences in their format (see https://hydra.iohk.io/build/3509804/download/1/ledger-spec.pdf "Bootstrap Witnesses" section) so there is incompatibility even on that layer when it comes to Shelley transactions. As far as we know, after the hard-fork supposed to happen on July 29, Byron transaction format will no longer be accepted by the network so wallets need to switch over to the new format, Trezor included

@SebastienGllmt
Copy link

SebastienGllmt commented Jul 18, 2020

Yes, and keep in mind that the Byron witness generation changes on Thursday the 16th so be careful about referencing old documents since they're now out of date. You can check out our Rust code for bootstrap witness generation here, but there are no test vectors for bootstrap witnesses generation so I haven't had a chance to test our implementation yet.

Also use better names for functions
@prusnak prusnak added this to the 2020-08 milestone Jul 22, 2020
Copy link
Contributor

@tsusanka tsusanka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Few comments/questions but in general: untested ACK.

@tsusanka
Copy link
Contributor

Btw the UI tests are failing. Maybe you have recorded without --ui-check-missing? Try to run pipenv run make -C core test_emu_ui_record again and let's see if that fixes it.

@gabrielKerekes
Copy link
Contributor Author

Oh, yes. I forgot to mention that I haven't regenerated the UI tests yet, so that I wouldn't have to generate them multiple times. I will do that after I fix staking_use_cases.

Require either a `staking_path` or a `staking_key_hash`.
Copy link
Contributor

@matejcik matejcik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is it from me

i'm not sure if we'll want to keep the whole commit history, what do you think @tsusanka ?

@tsusanka
Copy link
Contributor

i'm not sure if we'll want to keep the whole commit history, what do you think @tsusanka?

It depends if the commits are mostly editing some previous commits. In that case I would squash into one. If not we can keep them but let's "Create a merge commit" instead of "Rebase and merge".

@gabrielKerekes some input on this? Do you feel it is worth to keep the history?

@gabrielKerekes
Copy link
Contributor Author

I'd prefer squashing the commits, I don't think it's worh it to keep the history. So shall I do that?

@matejcik matejcik merged commit a928b09 into trezor:cardano-shelley Jul 23, 2020
@matejcik
Copy link
Contributor

i did that via squash&merge accept.

Thanks for your work!

Any ETA on the 3rd PR? It's not going into this release anyway, I'm just curious.

@gabrielKerekes
Copy link
Contributor Author

Thank you for your reviews and accepting the PRs!

The 3rd PR should be ready on Monday/Tuesday. It's really a shame it won't make it into the release. Is there any chance, that if it is of reasonable size, it could get in to the release? If not, when is the next chance for it to go out?

@tsusanka
Copy link
Contributor

The 3rd PR should be ready on Monday/Tuesday. It's really a shame it won't make it into the release. Is there any chance, that if it is of reasonable size, it could get in to the release? If not, when is the next chance for it to go out?

We have to be strict about the freezing date otherwise it's a mess :(. Next release should be on September 2nd.


Btw in your next PR please also add all these changes to core/CHANGELOG.md.

matejcik pushed a commit that referenced this pull request Jul 27, 2020
@gabrielKerekes gabrielKerekes deleted the cardano-shelley-update-pr2 branch July 2, 2021 11:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants