-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add support for multiple trees per file #51
Conversation
This looks very promising! Thanks for such an effort. Reading the code made me realize that we might be able to do something simpler: What if we parallelized the inner phases? That is, we currently roughly have the following compiler pipeline:
The above works for one (1) tree. Would you be willing to try feeding each separate tree to the pipeline? You would only touch two points in the pipeline:
Which ends up looking like:
Where the split is a simple: let trees = tree.split("\n\n").collect::<Vec<_>>(); And the Combine phase looks similar to what you have. If this makes sense to you I think it would simplify a lot. Wdyt? On the other hand, I think the combine step is a good place for verifying that all the trees have the same contract name 👌🏻 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few small comments answering @follow-up
.
If you have any questions, please do reach out. This is a great feature, and I'm happy to answer whatever!
Thanks @alexfertel, the review was very helpful - I have just pushed a new commit with the simplified pipeline. I have also identified the cause of most of the existing test failures due to some slight modifications I made to the use of identifier mode in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heyo! This is indeed a step forward, thanks for taking the time.
I gave it a closer look now since I think that we can go forward with this design. Lmk if you have any questions!
src/hir/combiner.rs
Outdated
return Err(format!( | ||
"Contract name mismatch: expected '{}', found '{}'", | ||
first_contract_name, root.contract_name | ||
)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good first step, but we probably will want to report more information with this error, like the exact line where it happened, so we'll need to add custom Combiner errors, like what Parser et al have.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alexfertel, I believe I have made some progress on this front, although I'm not quite sure what additional information we want to report and how. For example, I am having some difficulty working out how we can get the span of the original text from the HIR (more context on L125 of combiner.rs
). Some input here would be great!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am having some difficulty working out how we can get the span of the original text from the HIR
Ah, yes, this is a limitation of the current design, since a HIR doesn't really map to a tree, this is something to figure out. No worries, we don't have to tackle this here since it is unrelated and uphill.
src/syntax/tokenizer.rs
Outdated
@@ -140,7 +140,8 @@ impl Tokenizer { | |||
Self { | |||
pos: Cell::new(Position::new(0, 1, 1)), | |||
// Starts as `true` because the first token must always be a contract name. | |||
identifier_mode: Cell::new(true), | |||
// identifier_mode: Cell::new(true), // @follow-up - this is correct, but it breaks the tests because of invalid . character in root |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The .
s in a root should be valid now, since I am calling sanitize
on contract names. We have multiple tests for this, but the main one that covers this functionality you can find here:
Line 12 in c38e779
fn scaffolds_trees() { |
Which, scaffolds this file:
Which contains a contract name with a .
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alexfertel, based on the latest commit, all existing tests pass. The issue above occurs when changing the reset
function to set identifier_mode
to true. The rationale here that, given identifier_mode
is set to true by default since:
The first token must always be a contract name, which has to be a valid Solidity identifier.
it should also be true here as we don't want this state to be changed to false when reset
is called within tokenize
; however, since the contract name is not sanitized before the tokenization step then the tests fail with IdentifierCharInvalid('.')
.
Do you agree with this assessment? If so, how should we handle this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you were right; it's fine not to treat contract names as identifiers since we sanitize them later. Lmk if you find any bugs with this behavior.
src/hir/combiner.rs
Outdated
let identifier = get_contract_name_from_identifier(&text); | ||
let accumulated_identifier = contract_definition.identifier.clone(); | ||
if identifier != accumulated_identifier { | ||
let span = Span::default(); // @follow-up - how can we get the span from the HIR? Is it even necessary? This would be easier to do with verification of the AST. One option is to use the index of the HIR in the vector of HIRs since we know the identifier is the start of a given tree. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this is a good idea, but I'm hesitant to commit until I think more about it. We can report with the empty span for now.
I don't have much time right now to review, but I will do so soon after New Year's! |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files
|
Heyo @giovannidisiena ! Sorry about the delay, I've been very busy. Finally had some time to look at this. Very good job! This is basically what I had in mind. What's left is testing, which I'll leave in your hands. I'd expect unit tests + integration tests in the I pushed a couple of commits: one with a few structural changes that are basically nits from me and another one fixing what you brought up -- you were right that One more thing we need to add to the TODO list is that we need to handle trees that are more than Thank you very much, excited about merging this PR! |
Hey @alexfertel, that's not a problem at all. I'm also very busy for the foreseeable so will have to attempt to chip away at the testing as and when possible. Brilliant, thank you for the assist there, and yes agreed regarding the additional whitespace.
Likewise! |
Hey @giovannidisiena, I just wanted to check how we're feeling about this one. I'd really like this not to get stale, so if you're not up for it I'm happy to take over! |
Absolutely @alexfertel, thanks for checking in. I may get some time toward the end of this week, so will keep you updated; otherwise, it may be best for you to take over since the next opportunity I have will likely be toward the end of March/April, which is obviously not ideal for keeping this from getting stale. I'll let you know how it goes! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just looking for some clarification on this. Additionally, both check
and scaffold
are currently erroring with "a Corner
must be the last child" – I'd thought this wouldn't be an issue since we are splitting the trees before forwarding on to the parser, but for some reason it is, and I haven't yet been able to track down the cause. If you have any ideas that would be great:)
@alexfertel I think that's almost everything you asked for now, except |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking awesome! I want to do some checks myself locally, which I'll do as soon as I get some time, hopefully today. Thanks for all this work ❤️
Great! And thanks for all the help ❤️ |
Okay, local checks done!
Nothing comes to mind right now, but any work there can be done in a different PR, it doesn't need to be a part of this one. If you think of anything feel free to open an issue. I think what's missing are the |
The only thing I can think of at the moment is perhaps fixing the contract name mismatch by applying the first identifier to all subsequent roots, but I'm happy to open it as a separate issue if you think it's not worth tackling right now. |
Ah, you're right. Yeah, I think we can tackle it separately. If you feel like working on it, don't let me hold you back, though. I'm happy either way. |
Okay, no problem; I'll see if I have time left over after the |
@alexfertel, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! 🚀
That's okay, that part of the code is not polished and needs lots of love. I'm merging this in! When you're happy with this, feel free to change it to ready for review. |
.tree
file
@alexfertel just realised I haven't updated the README, doing that now. |
I forgot about that as well, thank you! |
@alexfertel While writing up an example for the README, I noticed that the resulting Solidity test function names did not include the root identifier function name, so I have also added that and refactored the |
Ah, I see, yeah that makes sense. I'll take a better look tomorrow morn, and if everything looks good, I'll merge 👍🏻 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! 🚀
WIP attempt at handling multiple trees per
.tree
file. Currently, it appears I have broken some of the existing tests but I'm not yet sure how/why.Steps taken:
TokenKind::Root
.hir::translate
whencheck::context::Context
is created since check mostly works with parse trees derived from HIRs.@alexfertel it would be great to get some initial feedback here, when you get some time please, to make sure I'm not completely barking up the wrong tree or making too much of a mess. I have left a few
follow-up
comments for both my and your benefit - I'm sure there's lots here that could be improved or otherwise written in a more idiomatic fashion so please do let me know where that is the case as I'm definitely still getting to grips with Rust!Partially closes #50.