
chore: Add strategy for caching the parsing layer #3849

Closed
wants to merge 4 commits

Conversation

kevaundray
Contributor

Description

This PR demonstrates how parsing would roughly look if we did it entirely as a separate layer.

Problem*

Resolves

Summary*

Additional Context

Documentation*

Check one:

  • No documentation needed.
  • Documentation included in this PR.
  • [Exceptional Case] Documentation to be submitted in a separate PR.

PR Checklist*

  • I have tested the changes locally.
  • I have formatted the changes with Prettier and/or cargo fmt on default settings.

Comment on lines 59 to 72
fn populate_cache_with_file_manager(&mut self) {
    for file_id in self.file_manager.as_file_map().all_file_ids() {
        let file_path = self.file_manager.path(*file_id);
        let file_extension =
            file_path.extension().expect("expected all file paths to have an extension");
        // TODO: Another reason we may not want to have this method here
        // TODO: is the fact that we want to not have the compiler worry
        // TODO: about the nr file extension
        if file_extension != "nr" {
            continue;
        }
        self.file_to_ast_cache.insert(*file_id, parse_file(&self.file_manager, *file_id));
    }
}
kevaundray (Contributor, Author)

This is the main change. Notice that with this we can now parallelize parsing, and since the FileIds are deterministic, we can also avoid re-parsing unchanged files that we have already parsed.

On the other hand, it may be easier to always parse everything. A future problem is that incremental compilation may not work effectively if the FileIds are not deterministic -- in that case we can simply hash the file_path to derive the file_id.
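For illustration, a deterministic file_id could be derived by hashing the file path as the comment above suggests. The FileId newtype and the hashing scheme below are assumptions made for this sketch, not the file manager's actual implementation:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::path::Path;

// Hypothetical stand-in for the file manager's FileId type.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct FileId(u64);

// Derive a FileId purely from the file path, so the same path always maps to
// the same id regardless of the order in which files were added. A persistent
// cache would want a hash that is stable across compiler versions;
// DefaultHasher is only used here to keep the sketch self-contained.
fn file_id_from_path(path: &Path) -> FileId {
    let mut hasher = DefaultHasher::new();
    path.hash(&mut hasher);
    FileId(hasher.finish())
}
```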

@kevaundray (Contributor, Author)

This is likely failing because we parse all of the files in the FileManager when the Context is created; however, the stdlib files are not in the Context's file manager at that point in time, so the cache is never populated with the stdlib files.

prepare_crate populates the file manager with the stdlib content, but by then the Context has already been created.

@kevaundray (Contributor, Author)

#3844 adds file_manager_with_stdlib, which mitigates this issue, if that is indeed the cause.

Comment on lines +74 to +88
// TODO: Do not merge with this method!
pub fn repopulate_cache(&mut self) {
    for file_id in self.file_manager.as_file_map().all_file_ids() {
        let file_path = self.file_manager.path(*file_id);
        let file_extension =
            file_path.extension().expect("expected all file paths to have an extension");
        // TODO: Another reason we may not want to have this method here
        // TODO: is the fact that we want to not have the compiler worry
        // TODO: about the nr file extension
        if file_extension != "nr" {
            continue;
        }
        self.file_to_ast_cache.insert(*file_id, parse_file(&self.file_manager, *file_id));
    }
}
kevaundray (Contributor, Author)

Here is the hack I added to repopulate the cache once I know the stdlib has been added to the file manager.
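As a self-contained illustration of the ordering problem this hack works around, here is a simplified model; every type and name below is a stand-in, not the compiler's real Context or FileManager:

```rust
use std::collections::HashMap;

// Simplified model: the cache is filled when the context is created, but the
// stdlib is only added to the file manager afterwards, so the cache has to be
// refreshed once more after that point.
struct MiniContext {
    files: Vec<(u64, String)>,       // stand-in for the FileManager
    ast_cache: HashMap<u64, String>, // stand-in for file_to_ast_cache
}

impl MiniContext {
    fn new(files: Vec<(u64, String)>) -> Self {
        let mut ctx = MiniContext { files, ast_cache: HashMap::new() };
        ctx.repopulate_cache();
        ctx
    }

    // Mirrors the repopulate_cache hack: "parse" everything currently known to
    // the file manager (parsing is just a placeholder string here).
    fn repopulate_cache(&mut self) {
        for (id, source) in &self.files {
            self.ast_cache.insert(*id, format!("parsed({source})"));
        }
    }

    fn add_stdlib(&mut self) {
        self.files.push((0, "std/lib.nr".to_string()));
    }
}

fn main() {
    let mut ctx = MiniContext::new(vec![(1, "main.nr".to_string())]);
    ctx.add_stdlib(); // the stdlib arrives after the context already exists...
    assert!(!ctx.ast_cache.contains_key(&0));
    ctx.repopulate_cache(); // ...so the cache must be refreshed afterwards.
    assert!(ctx.ast_cache.contains_key(&0));
}
```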

kevaundray (Contributor, Author)

I just copied the method above and made this one public.

@TomAFrench closed this Jan 18, 2024
github-merge-queue bot pushed a commit that referenced this pull request Jan 18, 2024
# Description
Rework of #3849 after file_manager_with_stdlib was created. Also, as @kevaundray suggested, parsing was extracted above the context.

Resolves #3838
## Problem\*
Parsing is currently done while collecting, which allows parsing only the files that are actually used, but makes parallel parsing and cached parsing very difficult.

## Summary\*

This PR extracts parsing into its own pass and makes the Context take the parsed files. The creator of the Context is in charge of parsing the files in the file manager, so Nargo, the LSP, and wasm need to handle the parsed files. This PR uses rayon to parse in parallel in Nargo and the LSP, which reduces the time taken to process the on-save notification in the LSP from ~700ms to ~250ms on the protocol circuits.
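A minimal sketch of what such a rayon-based parallel parse pass can look like; the parse function, the source map, and the file-id type below are placeholders, not the compiler's actual parse_file or FileManager API:

```rust
use rayon::prelude::*;
use std::collections::HashMap;

// Placeholder for the real parser: maps a file's source text to "its AST".
fn parse_source(source: &str) -> String {
    format!("ast covering {} bytes", source.len())
}

// Parse every file in parallel and collect the results into a map keyed by
// file id, mirroring the "parse everything up front, then hand the parsed
// files to the Context" structure described above.
fn parse_all(files: &HashMap<u64, String>) -> HashMap<u64, String> {
    files
        .par_iter()
        .map(|(file_id, source)| (*file_id, parse_source(source)))
        .collect()
}
```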

Now that parsing is its own pass, the door is open for the LSP to cache parsing. The LSP has access to the file manager and the parsed files, so it can detect which files have changed on disk and only re-parse those when necessary.
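One way such a cache could detect changes is to key each cached AST by a hash of the source it was parsed from. This is a hypothetical sketch of that idea, not the LSP's actual implementation:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Hypothetical cache keyed by file id, remembering the hash of the source
// that each cached "AST" was produced from.
struct ParseCache {
    entries: HashMap<u64, (u64, String)>, // file id -> (source hash, cached AST)
}

impl ParseCache {
    fn new() -> Self {
        ParseCache { entries: HashMap::new() }
    }

    // Return the cached result if the source is unchanged; otherwise re-parse
    // (a placeholder string stands in for the real AST) and refresh the entry.
    fn get_or_parse(&mut self, file_id: u64, source: &str) -> &String {
        let mut hasher = DefaultHasher::new();
        source.hash(&mut hasher);
        let source_hash = hasher.finish();

        let entry = self
            .entries
            .entry(file_id)
            .or_insert_with(|| (source_hash, format!("parsed {} bytes", source.len())));
        // Only re-parse when the source has changed since the entry was built.
        if entry.0 != source_hash {
            *entry = (source_hash, format!("parsed {} bytes", source.len()));
        }
        &entry.1
    }
}
```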

## Additional Context



## Documentation\*

Check one:
- [x] No documentation needed.
- [ ] Documentation included in this PR.
- [ ] **[Exceptional Case]** Documentation to be submitted in a separate
PR.

# PR Checklist\*

- [x] I have tested the changes locally.
- [x] I have formatted the changes with [Prettier](https://prettier.io/)
and/or `cargo fmt` on default settings.

---------

Co-authored-by: Koby Hall <102518238+kobyhallx@users.noreply.github.com>