Pre-allocate sufficient capacity for transformer to create new scopes, symbols, references #75

overlookmotel · 2024-07-23T12:13:24Z

oxc-project/oxc#4367 added an AST traversal at start of building Semantic, to count how many nodes, scopes, symbols, and references there are in the AST.

It uses these counts to pre-allocate sufficient capacity in AstNodes, ScopeTree and SymbolTable before the main traversal which populates these structures with data. This avoids these structures needing to grow during the main traversal, with all the memory-copying involved in that. The result was a big boost for performance.

However, this is not completely ideal for transformer. ScopeTree and SymbolTable are sized exactly for the number of scopes/symbols in the AST. But transformer creates new scopes, symbols and references, and when it does so, it will require these structures to grow, which incurs a slow-down.

Probably it's not a massive slow-down, because at present the transformer does not create AstNodes. AstNodes is the biggest of the structures, and most of the perf gain of oxc-project/oxc#4367 probably came from preventing it needing to grow. Growing ScopeTree or SymbolTable is less expensive.

But (a) we should still try to avoid this growth and (b) transformer may in future create AstNodes too.

Question is, how do we determine how many scopes and symbols transformer will need to create?

It depends on what transforms are enabled.
It depends on how many of certain AST node types are in the AST (which dictates how many transformations will occur).

We could either:

Attempt to calculate exactly how many additional symbols/scopes/references will be required when compiling counts.
Use a heuristic e.g. allocate 20% more capacity than required to leave "headroom" for transformer.

If going the heuristic route, I have no idea if 20% is a good number. It may be that there's so much diversity among source files Oxc is expected to handle that there is no good number.

Another approach

Just a thought: Is there a way to sidestep this issue entirely?

Presently we need the structures in ScopeTree etc to be contiguous blocks of memory, so they can be indexed into. Can we change that by e.g. using pointers instead of indexes?

Or use an alternative indexing scheme where the underlying storage is a series of blocks, each double the size of the previous block? Such a storage structure could grow without any memory copies, but at the cost of making indexing into it require more operations to convert ID to block number & index into that block. Something like:

fn id_to_block(id: NonZeroU32) -> (u32, u32) {
    let block_number = 32 - id.leading_zeros();
    // index = id with highest bit unset
    let index_within_block = u32::from(id) & ((1 << (block_number - 1)) - 1);
    (block_number, index_within_block)
}

The text was updated successfully, but these errors were encountered:

Dunqing · 2024-07-23T14:38:00Z

Use a heuristic e.g. allocate 20% more capacity than required to leave "headroom" for transformer.

I think we can give a try for this. Because doing this is so easy. Then let the benchmark result guide us on what to do next.

overlookmotel · 2024-07-23T14:46:38Z

Yes, we could try that. I guess it needs an option to pass in to SemanticBuilder to enable this. Only the transformer needs the extra capacity reserved. Processes which don't alter the AST (e.g. linter) don't need it.

Dunqing · 2024-07-23T14:51:34Z

Can we do reserve in the transformer?

overlookmotel · 2024-08-31T08:36:09Z

reserve would grow the Vecs, so cause the memory copying that we're trying to avoid. To prevent memory copies, you have to reserve sufficient capacity upfront before you push any data into the Vec. i.e. it need to happen earlier.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Pre-allocate sufficient capacity for transformer to create new scopes, symbols, references #75

Pre-allocate sufficient capacity for transformer to create new scopes, symbols, references #75

overlookmotel commented Jul 23, 2024

Dunqing commented Jul 23, 2024

overlookmotel commented Jul 23, 2024

Dunqing commented Jul 23, 2024

overlookmotel commented Aug 31, 2024

Pre-allocate sufficient capacity for transformer to create new scopes, symbols, references #75

Pre-allocate sufficient capacity for transformer to create new scopes, symbols, references #75

Comments

overlookmotel commented Jul 23, 2024

Another approach

Dunqing commented Jul 23, 2024

overlookmotel commented Jul 23, 2024

Dunqing commented Jul 23, 2024

overlookmotel commented Aug 31, 2024