-
Notifications
You must be signed in to change notification settings - Fork 686
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Redesign receipt/tx trie generation on block imports #565
Redesign receipt/tx trie generation on block imports #565
Conversation
a43ff3e
to
186d898
Compare
@@ -132,6 +133,7 @@ def execute_bytecode(self, | |||
# Mining | |||
# | |||
def import_block(self, block): | |||
self.receipts = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This stood out to me. What if instead of tracking this state we instead returned the receipt from apply_transaction
and accumulated the receipts in the loop here
It looks like this change goes well with the changes within the VMState
class. It looks like we could remove the need for that class to be passed the receipts since I think this changes things so that the only thing it needs to know is the gas_used
value from the latest receipt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, but should we then pass gas_used
as an argument to __init__()
and keep updating it in add_transaction()
? Or did you have something else in mind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly not sure. Nothing specific in mind for exact implementation but starting with gas_used
being passed into __init__
seems like a good start. It'd be nice if it didn't need to be mutated but i'm more ok with mutating that value than the receipt storage on the VM
instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, we need gas_used
in BaseState
so I'm passing it via __init__()
and updating it on BaseState.add_transaction()
, but in BaseVM
we also need that value in the .state
@Property, which is currently looking that up from BaseVM.receipts
but we could instead have a BaseVM.gas_used
and update that on BaseVM.apply_transaction()
. Would you prefer that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds viable. We can continue to iterate on this to I'd recommend just picking something that seems good enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We actually need to keep the receipts somewhere in order to generate the receipt trie
@@ -266,6 +275,8 @@ def validate_block(self, block): | |||
) | |||
) | |||
|
|||
# FIXME: make_trie_root_and_nodes() is rather expensive, and we already run that once in | |||
# import_block(), so should refactor some of this code to avoid re-generating it here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like we could fix this with some caching on the make_trie_root_and_nodes
function (or maybe a wrapper around that function which implements a cache).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I guess #573 might still be useful!
evm/vm/base.py
Outdated
@@ -459,7 +459,7 @@ def get_state_class(cls): | |||
return cls._state_class | |||
|
|||
@classmethod | |||
def get_state(cls, chaindb, block, prev_hashes, receipts=None): | |||
def get_state(cls, chaindb, block, prev_hashes, gas_used=0): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like gas_used
shouldn't have a default value.
@@ -77,14 +80,12 @@ def apply_transaction(self, transaction): | |||
""" | |||
Apply the transaction to the vm in the current block. | |||
""" | |||
computation, block, trie_data_dict = self.state.apply_transaction( | |||
computation, block, receipt = self.state.apply_transaction( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reminder: the reason why apply_transaction
returned trie_data_dict
is for the stateless client to generate witness, which was discussed here: #247 (comment)
But since the sharding roadmap was (largely) changed, now the stateless client implementation won't happen in couple months, so I don't have a strong opinion about the change. Hope that we will find a solution to solve in both stateless/non-stateless clients when we reimplement stateless client. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could have the VM incrementally generate the trie data, I suppose? The problem was that it was re-generating it from scratch for every transaction applied
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Except it wouldn't make the VM class stateless, so I guess that wouldn't really buy us anything
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hwwhww we'll continue to move towards stateless clients but for now going to prioritze full sync and then we'll figure out how to make it work with stateless clients also.
6d1a925
to
8da75f9
Compare
The second thoughts of this PR: [Previously]
[Now]
I took a look at I think now the The fastest way to solve it may be calculating the roots in p.s. I think that's the reason why |
#573 has an optimization to |
Oooooh, correct my own words: for the witness, we don't need If #573 doesn't work well, maybe we could redesign the APIs to be "for importing a block (only generating the root once per block)" and "for creating a block (generating the root per transaction?)". Maybe divide |
bceb324
to
283c768
Compare
evm/vm/base.py
Outdated
@@ -156,6 +159,13 @@ def mine_block(self, *args, **kwargs): | |||
Mine the current block. Proxies to self.pack_block method. | |||
""" | |||
block = self.block | |||
tx_root_hash, tx_kv_nodes = make_trie_root_and_nodes(block.transactions) | |||
receipt_root_hash, receipt_kv_nodes = make_trie_root_and_nodes(self.receipts) | |||
trie_data = merge(tx_kv_nodes, receipt_kv_nodes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably unimportant: for performance optimization, this line could be removed, and just add them separately:
self.chaindb.persist_trie_data_dict(tx_kv_nodes)
self.chaindb.persist_trie_data_dict(receipt_kv_nodes)
We used to generate the tries for all block's transactions/receipts after applying every transaction, but that is way too expensive. Now the trie data is only generated once, after we've applied all transactions
It was no longer used anywhere and it doesn't work because our VM is no longer stateless
283c768
to
acfd862
Compare
We used to generate the tries for all block's transactions/receipts
after applying every transaction, but that is way too expensive. Now the
trie data is only generated once, after we've applied all transactions