You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
oxc-project/oxc#4367 added an AST traversal at start of building Semantic, to count how many nodes, scopes, symbols, and references there are in the AST.
It uses these counts to pre-allocate sufficient capacity in AstNodes, ScopeTree and SymbolTable before the main traversal which populates these structures with data. This avoids these structures needing to grow during the main traversal, with all the memory-copying involved in that. The result was a big boost for performance.
Yes, maybe that would be good - could size the bindings hash map for each scope correctly up front. But where do we store those "bindings per scope" counts? Would have to be in a Vec which we don't know the right size for until Counter pass has finished, so it'd resize all the time - i.e. that creates the same kind of problem again that this PR solves!
I can now see a way around this problem.
ScopeId is just a wrapper around u32, and scope_id fields of AST nodes are unused at this point. So we could store the binding counts in scope_id fields. In SemanticBuilder's main AST pass, it'd retrieve the binding count from scope_id field, use that count to initialize Bindings for that scope with appropriate capacity, and then write actual ScopeId into scope_id field.
A bit hacky to misuse scope_id fields in this way, but should work.
What we don't know is whether the gain of pre-allocating Bindings hashmaps to required capacity will outweigh the cost of calculating the counts. Maybe it won't because most scopes contain less than 4 bindings anyway (which is probably default capacity of HashMap), and these hashmaps are usually fairly small so growth is not so costly. But probably worth trying and measuring it.
NB: If we do try this, we may need to take into account the load factor of HashMap. To be able to store 8 items in a hashmap requires the hashmap to have space for more than 8 elements. Docs for HashMap::with_capacity suggest it takes load factor into account, but we should make sure or we may misread benchmark results as it won't be doing quite what we think it is.
The text was updated successfully, but these errors were encountered:
ScopeId is just a wrapper around u32, and scope_id fields of AST nodes are unused at this point. So we could store the binding counts in scope_id fields. In SemanticBuilder's main AST pass, it'd retrieve the binding count from scope_id field, use that count to initialize Bindings for that scope with appropriate capacity, and then write actual ScopeId into scope_id field.
A bit hacky to misuse scope_id fields in this way, but should work.
What we don't know is whether the gain of pre-allocating Bindings hashmaps to required capacity will outweigh the cost of calculating the counts. Maybe it won't because most scopes contain less than 4 bindings anyway (which is probably default capacity of HashMap), and these hashmaps are usually fairly small so growth is not so costly. But probably worth trying and measuring it.
NB: If we do try this, we may need to take into account the load factor of HashMap. To be able to store 8 items in a hashmap requires the hashmap to have space for more than 8 elements. Docs for HashMap::with_capacity suggest it takes load factor into account, but we should make sure or we may misread benchmark results as it won't be doing quite what we think it is.
Clever! I'm looking forward to seeing you try this!
oxc-project/oxc#4367 added an AST traversal at start of building
Semantic
, to count how many nodes, scopes, symbols, and references there are in the AST.It uses these counts to pre-allocate sufficient capacity in
AstNodes
,ScopeTree
andSymbolTable
before the main traversal which populates these structures with data. This avoids these structures needing to grow during the main traversal, with all the memory-copying involved in that. The result was a big boost for performance.@Dunqing suggested:
I replied:
I can now see a way around this problem.
ScopeId
is just a wrapper aroundu32
, andscope_id
fields of AST nodes are unused at this point. So we could store the binding counts inscope_id
fields. InSemanticBuilder
's main AST pass, it'd retrieve the binding count fromscope_id
field, use that count to initializeBindings
for that scope with appropriate capacity, and then write actualScopeId
intoscope_id
field.A bit hacky to misuse
scope_id
fields in this way, but should work.What we don't know is whether the gain of pre-allocating
Bindings
hashmaps to required capacity will outweigh the cost of calculating the counts. Maybe it won't because most scopes contain less than 4 bindings anyway (which is probably default capacity ofHashMap
), and these hashmaps are usually fairly small so growth is not so costly. But probably worth trying and measuring it.NB: If we do try this, we may need to take into account the load factor of
HashMap
. To be able to store 8 items in a hashmap requires the hashmap to have space for more than 8 elements. Docs forHashMap::with_capacity
suggest it takes load factor into account, but we should make sure or we may misread benchmark results as it won't be doing quite what we think it is.The text was updated successfully, but these errors were encountered: