feat: add support for multiple trees per file #51

giovannidisiena · 2023-12-15T16:34:41Z

WIP attempt at handling multiple trees per .tree file. Currently, it appears I have broken some of the existing tests but I'm not yet sure how/why.

Steps taken:

Find all roots in a .tree by adding a new TokenKind::Root.
Parse & translate each AST individually.
Combiner: Check that all roots belong to the same contract, i.e. the Contract part of each Contract.function root identifier is the same.
Combiner: Merge all the HIRs by collecting the function nodes and appending them to the root contract node.
Combine and report any errors.
Add tests for all of the above.
Add roots check to bulloak check - this is done by hir::translate when check::context::Context is created since check mostly works with parse trees derived from HIRs.
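The same-contract check in the steps above can be sketched roughly as follows. This is a minimal illustration, not the PR's Combiner: `SEPARATOR`, `contract_part`, and `common_contract` are hypothetical names, and the `.` separator is an assumption taken from the `Contract.function` wording above.

```rust
/// Assumed separator between the contract and function parts of a root
/// identifier (hypothetical; taken from the `Contract.function` wording).
const SEPARATOR: char = '.';

/// Return the contract part of a root identifier: everything before the
/// first separator, or the whole identifier if there is no separator.
fn contract_part(root: &str) -> &str {
    root.split(SEPARATOR).next().unwrap_or(root).trim()
}

/// Check that every root names the same contract and return that name.
fn common_contract<'a>(roots: &[&'a str]) -> Result<&'a str, String> {
    let mut names = roots.iter().copied().map(contract_part);
    let first = names.next().ok_or_else(|| "no roots found".to_string())?;
    for name in names {
        if name != first {
            return Err(format!(
                "Contract name mismatch: expected '{}', found '{}'",
                first, name
            ));
        }
    }
    Ok(first)
}
```

For example, `common_contract(&["Foo.a", "Foo.b"])` yields `Ok("Foo")`, while mixing `Foo.a` with `Bar.b` yields an error.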

@alexfertel it would be great to get some initial feedback here, when you get some time please, to make sure I'm not completely barking up the wrong tree or making too much of a mess. I have left a few follow-up comments for both my and your benefit - I'm sure there's lots here that could be improved or otherwise written in a more idiomatic fashion so please do let me know where that is the case as I'm definitely still getting to grips with Rust!

Partially closes #50.

alexfertel · 2023-12-15T17:16:28Z

This looks very promising! Thanks for such an effort.

Reading the code made me realize that we might be able to do something simpler: What if we parallelized the inner phases? That is, we currently roughly have the following compiler pipeline:

.tree -> &str -> scaffold entrypoint -> tokenize & parse -> semantic analysis -> HIR -> PT -> emit

The above works for one (1) tree. Would you be willing to try feeding each separate tree to the pipeline? You would only touch two points in the pipeline:

.tree -> &str -> split input -> scaffold entrypoint -> ... -> HIR -> Combine -> PT -> emit
                 ^^^^^^^^^^^                                         ^^^^^^^

Which ends up looking like:

                                scaffold entrypoint -> ... -> Hir 
                              /                                   \
.tree -> &str -> split input -> scaffold entrypoint -> ... -> Hir -> Combine -> PT -> emit
                              \                                   /
                                scaffold entrypoint -> ... -> Hir

Where the split is a simple:

let trees = tree.split("\n\n").collect::<Vec<_>>();

And the Combine phase looks similar to what you have.

If this makes sense to you I think it would simplify a lot. Wdyt?

On the other hand, I think the combine step is a good place for verifying that all the trees have the same contract name 👌🏻
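A rough sketch of the split, per-tree, combine shape proposed above. `run_pipeline` and its closure parameters are hypothetical stand-ins for the real phases (tokenize, parse, semantic analysis, HIR translation), which are not reproduced here.

```rust
/// Run `per_tree` on every tree in `input`, then hand all per-tree results
/// to `combine`. Stops at the first tree that fails.
///
/// `per_tree` stands in for "scaffold entrypoint -> ... -> HIR" and
/// `combine` for the Combine phase; both are placeholders.
fn run_pipeline<H, O, E>(
    input: &str,
    per_tree: impl Fn(&str) -> Result<H, E>,
    combine: impl FnOnce(Vec<H>) -> Result<O, E>,
) -> Result<O, E> {
    let hirs = input
        .split("\n\n") // the simple split from the comment above
        .map(|tree| per_tree(tree))
        .collect::<Result<Vec<_>, _>>()?;
    combine(hirs)
}
```

Keeping the two new phases as plain functions around the existing single-tree pipeline means the inner phases never need to know that a file can hold several trees.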

alexfertel

A few small comments answering @follow-up.

If you have any questions, please do reach out. This is a great feature, and I'm happy to answer whatever!

src/hir/combiner.rs

src/hir/mod.rs

giovannidisiena · 2023-12-22T10:11:57Z

Thanks @alexfertel, the review was very helpful - I have just pushed a new commit with the simplified pipeline. I have also identified the cause of most of the existing test failures: some slight modifications I made to the use of identifier mode in syntax::tokenizer, which I believe to be correct but which have downstream effects. If you could take a look at the remaining updated follow-up comments and share your thoughts, that would be awesome.

alexfertel

Heyo! This is indeed a step forward, thanks for taking the time.

I gave it a closer look now since I think that we can go forward with this design. Lmk if you have any questions!

src/hir/combiner.rs

alexfertel · 2023-12-22T12:10:22Z

src/hir/combiner.rs

+                        return Err(format!(
+                            "Contract name mismatch: expected '{}', found '{}'",
+                            first_contract_name, root.contract_name
+                        ));


This is a good first step, but we probably will want to report more information with this error, like the exact line where it happened, so we'll need to add custom Combiner errors, like what Parser et al have.

giovannidisiena · 2023-12-28

@alexfertel, I believe I have made some progress on this front, although I'm not quite sure what additional information we want to report and how. For example, I am having some difficulty working out how we can get the span of the original text from the HIR (more context on L125 of combiner.rs). Some input here would be great!

alexfertel · 2023-12-30

I am having some difficulty working out how we can get the span of the original text from the HIR

Ah, yes, this is a limitation of the current design: since a HIR doesn't really map to a tree, this is something we still have to figure out. No worries, we don't have to tackle this here since it is unrelated and uphill.
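For illustration only, a custom Combiner error carrying a span might look something like the sketch below. `Span`, `CombinerErrorKind`, and `CombinerError` are hypothetical types written for this example; they are not bulloak's actual definitions, and the line is only printed when a non-default span is available.

```rust
use std::fmt;

/// Hypothetical source span expressed as 1-based line numbers.
/// The default (all zeros) means "location unknown".
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
struct Span {
    start_line: usize,
    end_line: usize,
}

/// The kinds of error the combine step can report (illustrative).
#[derive(Debug, Clone, PartialEq, Eq)]
enum CombinerErrorKind {
    ContractNameMismatch { expected: String, found: String },
}

/// An error kind plus the place in the original text it refers to.
#[derive(Debug, Clone, PartialEq, Eq)]
struct CombinerError {
    kind: CombinerErrorKind,
    span: Span,
}

impl fmt::Display for CombinerError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match &self.kind {
            CombinerErrorKind::ContractNameMismatch { expected, found } => {
                write!(
                    f,
                    "contract name mismatch: expected '{}', found '{}'",
                    expected, found
                )?;
            }
        }
        // Only mention a line when the span carries real location data.
        if self.span != Span::default() {
            write!(f, " (line {})", self.span.start_line)?;
        }
        Ok(())
    }
}

impl std::error::Error for CombinerError {}
```

With an empty span the message simply omits the location, so the error type can be introduced before the span question is settled.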

src/scaffold/mod.rs

src/syntax/tokenizer.rs

alexfertel · 2023-12-22T16:54:06Z

src/syntax/tokenizer.rs

@@ -140,7 +140,8 @@ impl Tokenizer {
        Self {
            pos: Cell::new(Position::new(0, 1, 1)),
            // Starts as `true` because the first token must always be a contract name.
-            identifier_mode: Cell::new(true),
+            // identifier_mode: Cell::new(true), // @follow-up - this is correct, but it breaks the tests because of invalid . character in root


The . characters in a root should be valid now, since I am calling sanitize on contract names. We have multiple tests for this, but you can find the main one that covers this functionality here:

bulloak/tests/scaffold.rs

Line 12 in c38e779

fn scaffolds_trees() {

Which scaffolds this file:

https://github.com/alexfertel/bulloak/blob/c38e7795c6a018019ed2e2e0712868b5db0202be/tests/scaffold/basic.tree

Which contains a contract name with a . character.

giovannidisiena · 2023-12-28

@alexfertel, based on the latest commit, all existing tests pass. The issue above occurs when changing the reset function to set identifier_mode to true. The rationale here is that, given identifier_mode is set to true by default since:

The first token must always be a contract name, which has to be a valid Solidity identifier.

it should also be true here, as we don't want this state to be changed to false when reset is called within tokenize. However, since the contract name is not sanitized before the tokenization step, the tests fail with IdentifierCharInvalid('.').

Do you agree with this assessment? If so, how should we handle this?

alexfertel · 2024-01-14

Yes, you were right; it's fine not to treat contract names as identifiers since we sanitize them later. Lmk if you find any bugs with this behavior.

tests/check.rs

tests/scaffold.rs

alexfertel · 2023-12-30T15:36:28Z

src/hir/combiner.rs

+                                    let identifier = get_contract_name_from_identifier(&text);
+                                    let accumulated_identifier = contract_definition.identifier.clone();
+                                    if identifier != accumulated_identifier {
+                                        let span = Span::default(); // @follow-up - how can we get the span from the HIR? Is it even necessary? This would be easier to do with verification of the AST. One option is to use the index of the HIR in the vector of HIRs since we know the identifier is the start of a given tree.


Yeah, this is a good idea, but I'm hesitant to commit until I think more about it. We can report with the empty span for now.

alexfertel · 2023-12-30T15:40:26Z

I don't have much time right now to review, but I will do so soon after New Year's!

codecov · 2024-01-14T14:15:39Z

Codecov Report

Attention: Patch coverage is 98.38275% with 6 lines in your changes missing coverage. Please review.

Project coverage is 92.5%. Comparing base (02c171e) to head (8f80e68).

Additional details and impacted files

| Files | Coverage Δ | |
|---|---|---|
| src/check/rules/structural_match.rs | `96.1% <ø> (ø)` | |
| src/check/violation.rs | `88.4% <100.0%> (+0.2%)` | ⬆️ |
| src/error.rs | `91.3% <100.0%> (+0.6%)` | ⬆️ |
| src/hir/mod.rs | `100.0% <100.0%> (ø)` | |
| src/hir/translator.rs | `98.3% <100.0%> (ø)` | |
| src/scaffold/emitter.rs | `97.9% <100.0%> (-0.1%)` | ⬇️ |
| src/scaffold/mod.rs | `87.9% <100.0%> (+2.6%)` | ⬆️ |
| src/scaffold/modifiers.rs | `90.8% <ø> (ø)` | |
| src/syntax/tokenizer.rs | `94.6% <100.0%> (-0.1%)` | ⬇️ |
| src/utils.rs | `99.2% <100.0%> (+4.6%)` | ⬆️ |

... and 1 more

alexfertel · 2024-01-14T14:18:23Z

Heyo @giovannidisiena! Sorry about the delay, I've been very busy. Finally had some time to look at this.

Very good job! This is basically what I had in mind. What's left is testing, which I'll leave in your hands. I'd expect unit tests + integration tests in the tests folder.

I pushed a couple of commits: one with a few structural changes that are basically nits from me and another one fixing what you brought up -- you were right that identifier_mode should start as false. This was a bug given that we sanitize contract names now.

One more thing we need to add to the TODO list is that we need to handle trees that are more than \n\n apart. I think it's reasonable to just ignore any whitespace around and between the \n\n.
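One way to tolerate extra whitespace between trees is to split on runs of blank lines instead of on a literal `\n\n`. The helper below is a hedged sketch under that assumption; the name mirrors the "make split_trees util" commit, but the body is written for this example and is not the PR's implementation.

```rust
/// Split `.tree` text into individual trees, treating any run of blank
/// (empty or whitespace-only) lines as a single separator.
/// Illustrative only; assumes a tree never contains a blank line itself.
fn split_trees(text: &str) -> Vec<String> {
    let mut trees = Vec::new();
    let mut current: Vec<&str> = Vec::new();
    for line in text.lines() {
        if line.trim().is_empty() {
            // A blank line ends the tree collected so far, if any.
            if !current.is_empty() {
                trees.push(current.join("\n"));
                current.clear();
            }
        } else {
            current.push(line);
        }
    }
    // Flush the last tree when the text does not end with a blank line.
    if !current.is_empty() {
        trees.push(current.join("\n"));
    }
    trees
}
```

Leading and trailing blank lines produce no empty trees, so the downstream phases only ever see non-empty input.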

Thank you very much, excited about merging this PR!

giovannidisiena · 2024-01-19T21:17:39Z

Hey @alexfertel, that's not a problem at all. I'm also very busy for the foreseeable so will have to attempt to chip away at the testing as and when possible.

Brilliant, thank you for the assist there, and yes agreed regarding the additional whitespace.

Thank you very much, excited about merging this PR!

Likewise!

alexfertel · 2024-02-18T13:07:54Z

Hey @giovannidisiena, I just wanted to check how we're feeling about this one. I'd really like this not to get stale, so if you're not up for it I'm happy to take over!

giovannidisiena · 2024-02-19T18:44:20Z

Absolutely @alexfertel, thanks for checking in. I may get some time toward the end of this week, so will keep you updated; otherwise, it may be best for you to take over since the next opportunity I have will likely be toward the end of March/April, which is obviously not ideal for keeping this from getting stale. I'll let you know how it goes!

giovannidisiena

Just looking for some clarification on this. Additionally, both check and scaffold are currently erroring with "a Corner must be the last child" – I'd thought this wouldn't be an issue since we are splitting the trees before forwarding on to the parser, but for some reason it is, and I haven't yet been able to track down the cause. If you have any ideas, that would be great :)

src/utils.rs

giovannidisiena · 2024-02-22T14:56:59Z

@alexfertel I think that's almost everything you asked for now, except Combiner unit tests that I will add. The only other thing I am unsure about is whether we need to handle any violation cases to be fixed when running bulloak check --fix?

alexfertel

This is looking awesome! I want to do some checks myself locally, which I'll do as soon as I get some time, hopefully today. Thanks for all this work ❤️

tests/check/missing_contract_identifier.tree

giovannidisiena · 2024-02-22T15:04:54Z

This is looking awesome! I want to do some checks myself locally, which I'll do as soon as I get some time, hopefully today. Thanks for all this work ❤️

Great! And thanks for all the help ❤️

alexfertel · 2024-02-22T15:39:10Z

Okay, local checks done!

any violation cases to be fixed when running bulloak check --fix

Nothing comes to mind right now, but any work there can be done in a different PR, it doesn't need to be a part of this one. If you think of anything feel free to open an issue.

I think what's missing are the Combiner unit tests, and then I think we need to check the README and update it to reflect the new functionality. Does that make sense to you?

giovannidisiena · 2024-02-22T15:42:28Z

Nothing comes to mind right now, but any work there can be done in a different PR, it doesn't need to be a part of this one. If you think of anything feel free to open an issue.

The only thing I can think of at the moment is perhaps fixing the contract name mismatch by applying the first identifier to all subsequent roots, but I'm happy to open it as a separate issue if you think it's not worth tackling right now.

alexfertel · 2024-02-22T15:45:07Z

The only thing I can think of at the moment is perhaps fixing the contract name mismatch by applying the first identifier to all subsequent roots, but I'm happy to open it as a separate issue if you think it's not worth tackling right now.

Ah, you're right. Yeah, I think we can tackle it separately. If you feel like working on it, don't let me hold you back, though. I'm happy either way.
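The fix floated above, applying the first identifier to all subsequent roots, could be sketched as below. This is purely illustrative: `unify_contract_names` is a hypothetical helper, the separator is passed in as an assumption, and nothing here reflects what bulloak check --fix actually does.

```rust
/// Rewrite every root so that its contract part matches the first root's.
/// Roots without a separator are replaced by the first contract name.
/// Hypothetical helper; `separator` is assumed to be non-empty.
fn unify_contract_names(roots: &[&str], separator: &str) -> Vec<String> {
    let first_contract = match roots.first() {
        Some(&root) => root.split(separator).next().unwrap_or(root),
        None => return Vec::new(),
    };
    roots
        .iter()
        .map(|root| match root.split_once(separator) {
            // Keep the function part, swap in the first contract name.
            Some((_, function)) => {
                format!("{}{}{}", first_contract, separator, function)
            }
            None => first_contract.to_string(),
        })
        .collect()
}
```

For instance, the roots `Foo.a`, `Bar.b`, `Foo.c` with separator `.` become `Foo.a`, `Foo.b`, `Foo.c`.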

giovannidisiena · 2024-02-22T15:55:27Z

Ah, you're right. Yeah, I think we can tackle it separately. If you feel like working on it, don't let me hold you back, though. I'm happy either way.

Okay, no problem; I'll see if I have time left over after the Combiner tests and README are done 👍

giovannidisiena · 2024-02-22T21:55:28Z

@alexfertel, Combiner tests have been added, and the corresponding logic has been updated accordingly. I'm not sure I'll get around to refactoring the structural match violation logic, although if you are willing to provide some pointers then that could speed things up and I may be able to take a crack at it.

alexfertel

LGTM! 🚀

src/hir/combiner.rs

alexfertel · 2024-02-23T10:05:20Z

I'm not sure I'll get around to refactoring the structural match violation logic

That's okay, that part of the code is not polished and needs lots of love. I'm merging this in!

When you're happy with this, feel free to change it to ready for review.

giovannidisiena · 2024-02-23T10:07:03Z

@alexfertel just realised I haven't updated the README, doing that now.

alexfertel · 2024-02-23T10:07:47Z

I forgot about that as well, thank you!

giovannidisiena · 2024-02-23T16:22:53Z

@alexfertel While writing up an example for the README, I noticed that the resulting Solidity test function names did not include the root identifier function name, so I have also added that and refactored the Combiner. May be worth another review to make sure you're still happy with it.

alexfertel · 2024-02-23T16:46:44Z

Ah, I see, yeah that makes sense. I'll take a better look tomorrow morn, and if everything looks good, I'll merge 👍🏻

alexfertel

LGTM! 🚀

wip

3f54c86

giovannidisiena mentioned this pull request Dec 15, 2023

feature request: add support for different targeting behaviours #50

Closed

alexfertel reviewed Dec 15, 2023

src/hir/combiner.rs (outdated)

src/hir/mod.rs (outdated)

src/hir/mod.rs (outdated)

src/hir/mod.rs (outdated)

simplify pipeline

b051a44

giovannidisiena requested a review from alexfertel December 22, 2023 10:12

alexfertel requested changes Dec 22, 2023

giovannidisiena added 10 commits December 23, 2023 07:41

extract contract part separator constant

a44a9af

make split_trees util

0776ca1

fix combine logic & verify in one step

2c8e2e4

remove from before

ca1f1c4

custom combiner errors

fb4ff0b

entrypoint functions

9922077

fix combiner logic

dd1a3fd

revert test changes

1b0f55a

move translate & combine fn to hir

67d3273

// wip combiner errors

cef22f5

giovannidisiena requested a review from alexfertel December 28, 2023 18:01

alexfertel reviewed Dec 30, 2023

alexfertel added 2 commits January 14, 2024 15:09

ref: smol cleanup

52b0283

fix(tokenizer): not treat contract names as identifiers

5bd8d45

giovannidisiena commented Feb 20, 2024

src/utils.rs (outdated)

fix(combiner): handle error/panic

edc95f6

giovannidisiena added 3 commits February 22, 2024 14:21

scaffold tests

4ed89f7

check tests

d262cf7

handle trees more than \n\n apart

35db211

giovannidisiena requested a review from alexfertel February 22, 2024 14:55

alexfertel reviewed Feb 22, 2024

tests/check/missing_contract_identifier.tree (outdated)

Merge branch 'main' into multiple-trees

6fd957e

combiner tests

d6d1d63

giovannidisiena requested a review from alexfertel February 22, 2024 21:55

alexfertel approved these changes Feb 23, 2024

src/hir/combiner.rs

giovannidisiena marked this pull request as ready for review February 23, 2024 10:05

alexfertel changed the title from "WIP: Handle multiple trees per .tree file" to "feat: add support for multiple trees per file" Feb 23, 2024

giovannidisiena added 2 commits February 23, 2024 16:20

add function name to test name & refactor combiner

80b000b

update README

6f6de77

giovannidisiena requested a review from alexfertel February 23, 2024 16:23

fmt combiner

8f80e68

alexfertel approved these changes Feb 24, 2024

alexfertel merged commit 72bcfcd into alexfertel:main Feb 24, 2024
7 checks passed
