-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Document state storage #50298
Document state storage #50298
Conversation
@jasonmalinowski PTAL |
17ddaa7
to
ab1d621
Compare
@jasonmalinowski any concern about:
We dont' have anything happening ambiently that would cause that order to change, right? i.e. when typing, it's not lke we remove projects/re-add to the end? |
I don't think i like that. Primarily because wouldn't that really be confusing when reviewing code? In some places, |
@CyrusNajmabadi |
src/Workspaces/Core/Portable/Workspace/Solution/TextDocumentStates.cs
Outdated
Show resolved
Hide resolved
src/Workspaces/Core/Portable/Workspace/Solution/TextDocumentStates.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall I like it. A few more detail thoughts;
- I like that the enumerator of states is always going to give you the ordered result, in that it prevents some bugs that otherwise might happen by accident. That said, I'm not sure about the perf impacts.
- I do worry about the use of ImmutableArray<> for storing the IDs in order; I believe the use of ImmutableList was because the array can get large (easily multiple hundreds of items) and we might see overhead during some loads. I see @CyrusNajmabadi experimented with changing this in Experiment with targetted perf improvements around Project.Documents. #48682, so I'll defer to him and @sharwell as far as what they're seeing for the tradeoff and go with that. So treat this signoff as contingent on them also signing off.
- I would like some of the methods on TextDocumentStates renamed; generally rename "value" to "state" since that's what it's returning.
@@ -821,7 +821,7 @@ public async Task TestAnalyzerConfigFile_Properties() | |||
|
|||
Assert.Equal(1, project.Documents.Count()); | |||
Assert.Equal(1, project.AnalyzerConfigDocuments.Count()); | |||
Assert.Equal(1, project.State.AnalyzerConfigDocumentIds.Count()); | |||
Assert.Equal(1, project.State.AnalyzerConfigDocumentStates.Count); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not even sure why we have this assert.
src/Workspaces/Core/Portable/Workspace/Solution/ProjectChanges.cs
Outdated
Show resolved
Hide resolved
src/Workspaces/Core/Portable/Workspace/Solution/ProjectState.cs
Outdated
Show resolved
Hide resolved
src/Workspaces/Core/Portable/Workspace/Solution/TextDocumentStates.cs
Outdated
Show resolved
Hide resolved
src/Workspaces/Core/Portable/Workspace/Solution/TextDocumentStates.cs
Outdated
Show resolved
Hide resolved
=> Map.TryGetValue(documentId, out var state) ? state : null; | ||
|
||
public TState GetRequiredValue(DocumentId documentId) | ||
=> Map.TryGetValue(documentId, out var state) ? state : throw ExceptionUtilities.Unreachable; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I'd throw unreachable here; that's usually used if we have code we think we can't hit but compiler flow analysis doesn't know better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would you throw? It's the same InvalidOperationException as Contract.ThrowIfNull and friends. ExceptionUtilities.Unreachable
means we don't think it's gonna be reached, but it's not something that's supposed to leak out of public API as an expected outcome of that API.
src/Workspaces/Core/Portable/Workspace/Solution/TextDocumentStates.cs
Outdated
Show resolved
Hide resolved
src/Workspaces/Core/Portable/Workspace/Solution/TextDocumentStates.cs
Outdated
Show resolved
Hide resolved
@jasonmalinowski Re perf: Planning to update the PR to use segmented dictionary/array Sam is working on. |
@CyrusNajmabadi @sharwell @jasonmalinowski PTAL - implemented optimizations on top of previous changes. |
Fix comment Fix
4196855
to
62a18c6
Compare
ProjectState
to a dedicated data structure:TextDocumentStates<TState>
.TextDocumentStates<TState>
for source generated documents as well.Project.Documents
andProject.DocumentIds
: up to 3.3x and 3.2x faster, respectively. Fixes Experiment with targetted perf improvements around Project.Documents. #48682.Before
After
The cost of updating a document however is now very high and might need mitigation (*).
Before
After (with SetItem re-hashing)
After (with SetItem copying)
Measured on
Implementation
Uses
ImmutableSegmentedDictionary
as a backing storage for a map of document id to the corresponding document state.This data structure allows us to implement very fast lookup and ordered enumeration. The cost is shifted to mutation operations (an addition, removal and update of documents). In practice, this cost of additional/removal shouldn't be high since additions and removals of documents is batched.
(*) Additional potential memory optimizations of updating operations:
Relies on
ImmutableSegmentedDictionary
storing entries in the order they were added until an item is removed. Documents this behavior as a guaranteed property of the data structure. The ordering can be restored after removal of an item ifCompact
is called.Adds
ImmutableSegmentedDictionary.GetAddedEntry(int index)
method that allows to index into the underlying entries array. This method throws if an item has been removed from the dictionary (withoutCompact
called). This API allows us to implementProject.GetXxxDocumentIds
methods via a direct index into the underlying entries array of theImmutableSegmentedDictionary
and completely avoid building and maintaining a separate list of DocumentIds.