This repository has been archived by the owner on Jul 14, 2023. It is now read-only.

Improvements to the networking storage read requests #10

Closed
wants to merge 3 commits into from

Conversation

tomaka
Contributor

@tomaka tomaka commented Mar 30, 2023

Reopening #5

cc @cheme

Compared to #5, I've added a new onlyKeysAfterIgnoreLastNibble field.
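For intuition, here is a hedged sketch (not code from this PR) of how a client might use the proposed onlyKeysAfter field to resume a capped prefix scan. Storage is modelled as a byte-keyed sorted map and the bound is treated as exclusive; real keys are nibble-oriented (hence onlyKeysAfterIgnoreLastNibble) and requests go over the network, so this only illustrates the resumption pattern.

```rust
use std::collections::BTreeMap;
use std::ops::Bound::{Excluded, Included, Unbounded};

// Illustrative model only: returns up to `limit` keys under `prefix`,
// resuming strictly after `only_keys_after` when it is provided.
fn read_page(
    storage: &BTreeMap<Vec<u8>, Vec<u8>>,
    prefix: &[u8],
    only_keys_after: Option<&[u8]>,
    limit: usize,
) -> Vec<Vec<u8>> {
    let lower = match only_keys_after {
        Some(k) => Excluded(k.to_vec()),
        None => Included(prefix.to_vec()),
    };
    storage
        .range((lower, Unbounded))
        .map(|(key, _value)| key.clone())
        // Keys sharing a prefix are contiguous in a sorted map.
        .take_while(|key| key.starts_with(prefix))
        .take(limit)
        .collect()
}

fn main() {
    let mut storage = BTreeMap::new();
    for key in [b"aa".as_slice(), b"ab", b"ac", b"b"] {
        storage.insert(key.to_vec(), vec![]);
    }
    // First reply hits the limit after two keys...
    let page1 = read_page(&storage, b"a", None, 2);
    assert_eq!(page1, vec![b"aa".to_vec(), b"ab".to_vec()]);
    // ...so the next request resumes after the last key received.
    let page2 = read_page(&storage, b"a", Some(b"ab"), 2);
    assert_eq!(page2, vec![b"ac".to_vec()]);
}
```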

@tomaka
Contributor Author

tomaka commented Mar 30, 2023

One thing I'm not sure about is:
What happens if you query :child_storage:default:foo with includeDescendants equal to true? Should the response contain the child trie entries? I would say no, and the response should only contain a node with the root hash of the child trie.

@cheme

cheme commented Mar 30, 2023

Totally agree about not including child trie descendants under the prefix; the query stays at a single trie level this way.

Actually this makes me think again that the query for child trie could be :

message ChildTrieInfo {
       enum ChildTrieNamespace {
               DEFAULT = 1;
       }

       required bytes root = 1;
       required bytes hash = 2;
       required ChildTrieNamespace namespace = 3;
}

This lets us directly get the proof from the root and not include the top trie proof content in the reply: the request and reply are strongly bound to the child trie state only.
As a consequence, the content of the proof will only be from a single trie, which makes things way simpler.
(I kept the hash field for technical reasons.)

@Noc2
Contributor

Noc2 commented Mar 31, 2023

Feel free to ping me once I should merge it ;-)

@cheme

cheme commented Mar 31, 2023

I am currently implementing it and will probably have a few other questions.

One thing I just noticed: should we allow stopping between a node and a value node? In that case, restarting at a node key/prefix will not work, since the node and the value share the same location.
The direction I plan to take is to simply not allow it (check the size on nodes only, not on value nodes; a value node's size is part of its parent node anyway).
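A minimal sketch of that rule, with hypothetical names (not the actual trie crate API): the size check happens only when a node is recorded, and a value is always accounted together with its parent node, so no stop point can fall between the two.

```rust
/// Sketch of a proof recorder enforcing the rule above. Illustrative types;
/// the real implementation lives in the trie crate.
struct Recorder {
    recorded_size: usize,
    size_limit: usize,
}

impl Recorder {
    /// Record one node, together with its value if it has one, and return
    /// `true` if the reply should stop after this node. The value's size is
    /// folded into its parent node, so iteration can never pause between a
    /// node and its value.
    fn record_node(&mut self, node_size: usize, value_size: Option<usize>) -> bool {
        self.recorded_size += node_size + value_size.unwrap_or(0);
        self.recorded_size >= self.size_limit
    }
}

fn main() {
    let mut recorder = Recorder { recorded_size: 0, size_limit: 100 };
    // A node with an inline value is recorded atomically.
    assert!(!recorder.record_node(40, Some(30))); // 70 < 100: keep going
    assert!(recorder.record_node(40, None));      // 110 >= 100: stop here
}
```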

@tomaka
Contributor Author

tomaka commented Mar 31, 2023

Yeah I would also not allow it.

In an ideal world the values would be Merkle-ized, and you could resume within the value as well, but we're clearly not there.


@cheme cheme left a comment


I did an MVP of this in the trie crate (trie level only).

Basically, what I did is follow the query at the node level (either following a value or iterating).
I use a simple limit (either the number of nodes or the total size of the encoded nodes), and whenever the limit is reached I exit.

More precisely, the exit only happens at the next node hash I try to access from a branch (if there is the hash of a value, the value is fetched and added to the proof).
The information needed to restart is then the node key of the content pointed to by this hash (the node key of the last recorded node ++ the last recorded node's partial key ++ the child index of the next existing and non-recorded child).
This means that when we restart, we are always in a state where the next operation is to descend into a child.

@tomaka writing this, I understand better why you mention that the lower bound limit is actually inclusive, as it is the node key of the next non-value node queried from a branch.
So I propose to remove the equal in the PPP.
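The restart information described above could be sketched as follows (nibble-level keys; names are illustrative, not the trie crate's API):

```rust
/// Sketch of the restart-key construction described above: the node key of
/// the last recorded node, followed by that node's partial key, followed by
/// the child index of the next existing, not-yet-recorded child. Keys are
/// sequences of nibbles (values 0..=15). Illustrative only.
fn restart_key(last_node_key: &[u8], last_partial_key: &[u8], next_child_index: u8) -> Vec<u8> {
    let mut key = Vec::with_capacity(last_node_key.len() + last_partial_key.len() + 1);
    key.extend_from_slice(last_node_key);
    key.extend_from_slice(last_partial_key);
    // On restart, the next operation is always to descend into this child.
    key.push(next_child_index);
    key
}

fn main() {
    // Paused while about to descend into child 0x3 of a branch reached at
    // nibble path [0x1, 0x2] with partial key [0xa].
    assert_eq!(restart_key(&[0x1, 0x2], &[0xa], 0x3), vec![0x1, 0x2, 0xa, 0x3]);
}
```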

+ optional ChildTrieInfo child_trie_info = 2; // Read from the main trie if missing.
+ repeated Key keys = 3;
+ optional bytes onlyKeysAfter = 4;
+ optional bool onlyKeysAfterIgnoreLastNibble = 5;

An idea here would be to allow:
optional bytes usingRoot = 5;
This would be the state root, and could allow the server to skip all block hash/number checks.

Contributor Author


That seems like an implementation detail to me.
An implementation needs to find the trie/storage that is requested in its database. If it uses the trie root hash as its database index, then it's indeed faster to provide this trie root hash, but that's really not something the networking should be aware of.

This would also raise the question: what happens if the request is invalid?
If an implementation uses the block hash as its database index, then what should it do if the provided root hash doesn't match?

It means that some implementations would accept the request and others would reject it, unless you make it mandatory for all implementations to check that the root hash matches the block hash, which would defeat the point of providing this root hash.


If an implementation uses the block hash as its database index, then what should it do if the provided root hash doesn't match?

If the root hash is present, the block hash is not checked; same for the header.
In fact, it should be made mandatory that the block hash/number are not checked when this hash is provided and the client supports using it.
I got this idea as a way to resume processing (since it is a network protocol we can be a bit technical).
The issue is that implementers should not be forced to implement it, and could fall back to the block number/hash if it is unsupported.
Probably a cleaner design would be to pass it in place of the block hash/number (like an enum) and return unsupported if the client does not implement it or does not allow direct state access from the root. (I initially put it next to OnlyKeys as the intention is also to resume a query.)
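The "pass it in place of the block hash/number (like an enum)" alternative could look like the following hedged protobuf sketch; field names and numbers are illustrative, not part of the actual proposal:

```proto
message StateId {
        oneof id {
                bytes block_hash = 1; // resolved and checked against the chain
                bytes state_root = 2; // direct state access; a server that does
                                      // not support it returns "unsupported"
        }
}
```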

Contributor Author


I'm very much in favor of not doing this, in the name of simplicity/purity. This really feels like introducing a wart in the design just to save a few hundred microseconds.


It depends on the implementation, and for child tries it skips a full top state query, but yeah, caching can solve this in a pretty non-deterministic way on the client side.


An alternative would be to just use a state root instead of the block hash/number (since we are at network protocol level).


An alternative would be to just use a state root instead of the block hash/number (since we are at network protocol level).

Well, actually I cannot implement it due to internals of RocksDB; maybe as a V3 :(

@cheme

cheme commented Apr 14, 2023

For info, I managed to implement this with the Compact Proof encoding (the one we use in PoV).
It was way more involved than I expected. The drawback is that you can only check the state at the end of each reply. Similarly, running with a low size limit means that we need to include more hashes (so less space is saved).
Also, when producing the proof, the nodes cannot be streamed (the ordering does not allow it, and the whole proof must stay in memory), but with the current protobuf schema it does not matter.
Considering the implementation effort, I am not sure it is worth it. Probably better to start with the full encoded node design and maybe later switch to something like w3f/PSPs#56 (which also has the drawback that we can only invalidate a reply after reading all the nodes, but should not need to put everything in memory when recording).

@tomaka
Contributor Author

tomaka commented Apr 17, 2023

I'm not familiar enough with your PSP to comment on this, but I agree that compact proofs don't really interact well with this proposal, and IMO it makes more sense to stick to non-compact proofs.

@Noc2
Contributor

Noc2 commented Jul 14, 2023

We will archive this repository in favor of the new RFCs repository by the Fellowship. So, I'm closing the issue here. I recommend opening a PR at the new repository.
