Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(linter): add Set #4697

Closed
wants to merge 1 commit into from
Closed

perf(linter): add Set #4697

wants to merge 1 commit into from

Conversation

DonIsaac
Copy link
Contributor

@DonIsaac DonIsaac commented Aug 6, 2024

Many rules store a Vec<String> in their configs for things like ignore lists or JSX tags that should be checked. They are created once (in from_configuration) and mostly used for Vec::contains. There are a few problems with this:

  1. Strings are always heap-allocated. We should inline when possible, using CompactStr.
  2. Vec::contains is O(n), and is usually used in a hot path.
  3. Vec::contains requires an &T, meaning there are many cases when rules have an &str and need to allocate a new String just to pass it to contains. This is highly wasteful.
  4. Many rules have duplicate logic for parsing a Vec<String> from a Value.

This PR adds Set<T>, which seeks to make this use case more performant. It's very simple ordered set of unique items backed by a Vec. Read the doc comments over the struct definition for a more in-depth rationale.

Improvements

  1. Set<T> supports O(log(n)) containment checks at the cost of O(nlog(n)) initialization. This is fine since we create it once and read from it many times.
  2. Set<T> where T: AsRef<str>> supports a contains_str method that is identical to contains but does not force consumers to allocate new strings
  3. Set<CompactStr> implements TryFrom<Value>, which DRYs up rule config parsing. This trait is intentionally not implemented for Set<String> to nudge rules towards using CompactStr.

Following PRs will modify existing rules to use Set<CompactStr> over Vec<String>. Since most of these cases are within Jest and VITest rules, I don't expect improvements to appear on benchmarks (rules are skipped when non-applicable, and we have no benchmarks for Jest specs).

@github-actions github-actions bot added the A-linter Area - Linter label Aug 6, 2024
Copy link
Contributor Author

DonIsaac commented Aug 6, 2024

This stack of pull requests is managed by Graphite. Learn more about stacking.

Join @DonIsaac and the rest of your teammates on Graphite Graphite

Copy link

codspeed-hq bot commented Aug 6, 2024

CodSpeed Performance Report

Merging #4697 will not alter performance

Comparing don/08-05-perf_linter_add_set (21086ad) with main (06efae6)

Summary

✅ 29 untouched benchmarks

Copy link

graphite-app bot commented Aug 6, 2024

Your org has enabled the Graphite merge queue for merging into main

Add the label “merge” to the PR and Graphite will automatically add it to the merge queue when it’s ready to merge. Or use the label “hotfix” to add to the merge queue as a hot fix.

You must have a Graphite account and log in to Graphite in order to use the merge queue. Sign up using this link.

@DonIsaac DonIsaac force-pushed the don/08-05-perf_linter_add_set branch 5 times, most recently from 05c4a88 to af1db2e Compare August 6, 2024 18:39
@DonIsaac DonIsaac added the C-performance Category - Solution not expected to change functional behavior, only performance label Aug 6, 2024
@DonIsaac DonIsaac force-pushed the don/08-05-perf_linter_add_set branch 2 times, most recently from 5d3fa08 to dd0f1ad Compare August 6, 2024 20:57
@DonIsaac DonIsaac force-pushed the don/08-05-perf_linter_add_set branch from dd0f1ad to 21086ad Compare August 6, 2024 20:58
@DonIsaac DonIsaac marked this pull request as ready for review August 6, 2024 21:05
Copy link
Member

@Boshen Boshen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should avoid our own data structures for the sake of maintainability and discoverability.

This will make it difficult for contributors to learn - remember most contributors come from JavaScript, we should lower the barrier.

As for optimization, perhaps replacing String for CompactStr is more than enough. Hot spots can also use smallvec. But reading from the heap where the vec is small won't be a performance hot spot because they'll likely live in a nearby cache.

@DonIsaac DonIsaac closed this Aug 7, 2024
@Boshen Boshen deleted the don/08-05-perf_linter_add_set branch November 21, 2024 13:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-linter Area - Linter C-performance Category - Solution not expected to change functional behavior, only performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants