Reduce memory usage by deduplicating type information #649

Open · ChayimFriedman2 wants to merge 1 commit into master from ChayimFriedman2:common-typeid-real

Conversation

ChayimFriedman2 (Contributor)

We were storing the type information, 3 words wide, for each memo in each slot, while it is always constant wrt. the ingredient (different slots of the same ingredients will always have the same memos in the same order). This introduces some more unsafety, and the result wasn't as fast so I also had to use some lock-free structures, but the result is worth it: this shaves off 230mb from rust-analyzer with new Salsa.
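For illustration, a minimal sketch of the layout change being described. The field and type names here are assumptions for the sketch (except MemoTableWithTypes, which appears in the diff later in this thread), not necessarily Salsa's actual definitions:

```rust
use std::any::TypeId;

// Roughly three words of per-memo type metadata: a `TypeId` (two words)
// plus a function pointer used to recover the concrete memo type from a
// type-erased pointer. Field names are assumptions for this sketch.
#[derive(Clone, Copy)]
struct MemoEntryType {
    type_id: TypeId,
    to_dyn_fn: fn(*const ()) -> *const (),
}

// Before: every memo entry in every slot carried its own copy.
struct MemoEntryBefore {
    memo: *const (),           // type-erased memo
    entry_type: MemoEntryType, // duplicated across every slot
}

// After: slots store only the type-erased memo...
struct MemoEntryAfter {
    memo: *const (),
}

// ...and the metadata lives once per ingredient, indexed by memo slot,
// which works because every slot of an ingredient has the same memos in
// the same order.
struct MemoTableTypes {
    types: Vec<MemoEntryType>,
}

// Accessors pair a slot's memos with the shared metadata on demand.
struct MemoTableWithTypes<'a> {
    memos: &'a [MemoEntryAfter],
    types: &'a MemoTableTypes,
}
```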


netlify bot commented Jan 5, 2025

Deploy Preview for salsa-rs canceled.

🔨 Latest commit: fd9cfb8
🔍 Latest deploy log: https://app.netlify.com/sites/salsa-rs/deploys/67bdeafb190eda0008c5a7a0


codspeed-hq bot commented Jan 5, 2025

CodSpeed Performance Report

Merging #649 will not alter performance

Comparing ChayimFriedman2:common-typeid-real (fd9cfb8) with master (7d417ac)

Summary

✅ 11 untouched benchmarks

ChayimFriedman2 force-pushed the common-typeid-real branch 2 times, most recently from 2ecdea8 to 13ecc36 on January 7, 2025 06:41
@nikomatsakis (Member) left a comment


I love the idea, but I'm finding myself a bit confused. We have some tech debt here in that the mdbook is not complete or up to date; I'm realizing that I wish it were, because I'd love to read an mdbook update explaining the data structure(s) here. It feels like we could be more efficient still, but I've probably just not fully brought everything back into cache.

@ChayimFriedman2 (Contributor, Author)

@nikomatsakis I addressed your comments.

@MichaReiser (Contributor)

MichaReiser commented Jan 26, 2025

The benchmark regressions are a bit concerning (as high as 8%). Do we understand where they come from?

@Veykril (Member) left a comment


I wonder, could we maybe store the type info in the corresponding Page instead? Then we would only have duplication of type info among pages, which is negligible, while not having to go through the ingredient on lookups. It would also keep more info (for drop and the like) localized.
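For illustration, a rough sketch of that suggestion under the same assumed names as the sketch above; the real Page layout may differ:

```rust
use std::any::TypeId;

// Per-memo type metadata, as in the earlier sketch.
#[derive(Clone, Copy)]
struct MemoEntryType {
    type_id: TypeId,
    to_dyn_fn: fn(*const ()) -> *const (),
}

// Hypothetical page layout: the memo type metadata is stored once per page,
// next to the slots it describes, so a memo lookup never has to reach back
// into the ingredient. Duplication across pages stays small because a page
// holds many slots.
struct Page<T> {
    slots: Vec<T>,
    memo_types: Vec<MemoEntryType>,
}
```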

@ChayimFriedman2 (Contributor, Author)

Hmm... If we share it in an Arc there should be very little duplication.

@nikomatsakis (Member)

I don't quite follow here; I'll try to make time tomorrow to review this PR. I'm still finding myself a bit confused about exactly what type information we are deduplicating, and where.

@ChayimFriedman2 (Contributor, Author)

@Veykril I implemented your suggestion to store the types on the page.

ChayimFriedman2 force-pushed the common-typeid-real branch 3 times, most recently from 1abc1d8 to 793c14b on February 3, 2025 21:41
@ChayimFriedman2 (Contributor, Author)

@MichaReiser I've investigated the perf regressions and I think they should be fixed now.

@MichaReiser (Contributor)

Thanks. The regression now seems to be specific to benchmarks where we primarily create elements. Can we try to optimize that code path as well? An 18% regression is rather significant.

@ChayimFriedman2 (Contributor, Author)

@MichaReiser I was referring to that regression, I think I fixed it.

@MichaReiser (Contributor)

@ChayimFriedman2 Hmm, in that case I'm unsure whether the underlying problem was addressed, because the codspeed benchmarks still show an 18% regression.

@ChayimFriedman2 (Contributor, Author)

@MichaReiser It hadn't been rerun back then; now it shows an 11% regression. I'll see if I can shrink it even more.

@ChayimFriedman2 (Contributor, Author)

I think the benchmarks are flawed. They re-create the ingredients on every iteration, which is very much unlike real-world usage, where the ingredient is created once. I'll put up a PR to fix this.

@ChayimFriedman2 (Contributor, Author)

OK, after #667 it shows there is no perf regression.

@nikomatsakis (Member)

@ChayimFriedman2 needs rebase

ChayimFriedman2 force-pushed the common-typeid-real branch 2 times, most recently from 6e5346f to 681547d on February 21, 2025 06:04
@ChayimFriedman2 (Contributor, Author)

@nikomatsakis Rebased.

Comment on lines +89 to +98
/// Access the [`MemoTable`] for this slot.
///
/// # Safety
///
/// You must have mutable access to the slot.
unsafe fn memos_no_revision(&self) -> &MemoTable;
Member

Then why does this not just take &mut self?

Contributor Author

Problems with the borrow checker: calling Page::get() borrows the entire page, and then we cannot access Page::memo_types.
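For illustration, a simplified reproduction of the borrow-checker clash being described, with hypothetical types; the comments show the formulation that borrowck rejects:

```rust
struct MemoTable;
struct MemoTableTypes;

struct Slot {
    memos: MemoTable,
}

// Simplified page: slots plus the shared memo type metadata.
struct Page {
    slots: Vec<Slot>,
    memo_types: MemoTableTypes,
}

struct MemoTableWithTypesMut<'a> {
    memos: &'a mut MemoTable,
    types: &'a MemoTableTypes,
}

impl Page {
    fn get_mut(&mut self, slot: usize) -> &mut Slot {
        &mut self.slots[slot]
    }

    fn memos_mut(&mut self, slot: usize) -> MemoTableWithTypesMut<'_> {
        // Going through `get_mut` does not compile: it borrows all of `*self`
        // mutably for as long as `memos` lives, so the later read of
        // `self.memo_types` is rejected:
        //
        //     let memos = &mut self.get_mut(slot).memos;
        //     MemoTableWithTypesMut { memos, types: &self.memo_types } // E0502
        //
        // Borrowing the fields directly is accepted, because the compiler can
        // see the two borrows touch disjoint fields. The real `Page` cannot do
        // that if slot access has to go through an accessor like `Page::get`,
        // which is the clash described above.
        let memos = &mut self.slots[slot].memos;
        MemoTableWithTypesMut { memos, types: &self.memo_types }
    }
}
```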

Comment on lines -303 to +368
-    fn memos_mut(&mut self, slot: SlotIndex) -> &mut MemoTable {
-        self.get_mut(slot).memos_mut()
+    fn memos_mut(&mut self, slot: SlotIndex) -> MemoTableWithTypes<'_> {
+        // SAFETY: We have a `&mut` reference.
+        let memos = unsafe { self.get(slot).memos_no_revision() };
Member

This change seems unnecessary to me; we can just use get_mut and keep the memos_mut accessor as before.

Contributor Author

But we need to attach the types to it anyway. I can create a new type MemoTableWithTypesMut, but that just seems unnecessary. And given that it no longer returns a mutable reference, memos_mut() seems like an inappropriate name now.

Member

I don't mind the name change; I was mainly raising this because of the unnecessary unsafe block. There is no reason to decay the mutable reference in this call chain to an immutable one; doing so results in this extra unsafe that isn't really necessary otherwise.

Member

Ah, given your other comment on my review, I guess that would require duplicating the function for mutability.

Contributor Author

What do you mean by "duplicating the function for mutability"?

Member

> Problems with the borrow checker: calling Page::get() borrows the entire page, and then we cannot access Page::memo_types.

I interpreted this as you needing to keep the current version for the borrowck issue either way.

Contributor Author

No, what I meant is that if this function takes &mut self, then Page.get_mut().memos_mut() clashes with Page.memo_types, leading to a borrowck error.

ChayimFriedman2 force-pushed the common-typeid-real branch 3 times, most recently from 246d80e to 4eea15f on February 24, 2025 22:15
@MichaReiser (Contributor)

> OK, after #667 it shows there is no perf regression.

It seems the perf regression is back :(

@ChayimFriedman2 (Contributor, Author)

@MichaReiser I rebased and now they're at 5% max. It's hard to investigate because codspeed doesn't identify the same methods, so I don't know what the cause is, but I'm pretty sure the 8% was measurement error or something like it (e.g. it showed a regression for shallow_verify_memo(), where literally nothing it calls changed).

@MichaReiser (Contributor)

The benchmarks have been fairly stable after @Veykril made some changes. That's why I'm not convinced that they're outright wrong. Have you tried recording a profile locally and e.g. comparing them in firefox-profiler?

@Veykril (Member)

Veykril commented Feb 25, 2025

There does seem to be some noise still left, as my rebased #710 version reports fairly different perf changes now (though I don't think the PR here is entirely noise).

@MichaReiser (Contributor)

> There does seem to be some noise still left, as my rebased #710 version reports fairly different perf changes now (though I don't think the PR here is entirely noise).

Rebasing does change the base against which codspeed reports the performance, so that makes sense to some extent. What's interesting is that the perf results are very stable for e.g. https://codspeed.io/salsa-rs/salsa/branches/Veykril%3Aveykril%2Fpush-orlktsrpwzrv

@ChayimFriedman2 (Contributor, Author)

I'm trying to investigate the regressions locally now.

@ChayimFriedman2 (Contributor, Author)

The benchmark regressions seem unavoidable (I also suspect part of them is noise), and they are not large. The main reason is that accessing a memo now has to take two different RwLocks (read locks only), one for the types and one for the memos, instead of just one. I can't see an easy way to get rid of this, so I think the memory decrease is worth the perf penalty.

Some ideas I had for mitigating the lock on the types:

- Synchronize on the memos lock instead. But that means we have to go through the entire database and lock each and every memo when we want to register an ingredient.
- Drop the lock and instead use an atomic pointer that is replaced with the new types when registering an ingredient, either with arc-swap or without reference counting at all. arc-swap is doable, but the overhead is already pretty tiny and I don't know how much it would save; a custom-made implementation would mean we need some kind of GC to collect the unreachable types on a new revision. We already have one for memos, but expanding it to types won't be trivial.
- Statically register the memo ingredients at compile time with something like linkme. While I like the idea, it would require thorough design work.
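For illustration, a minimal sketch of the arc-swap idea mentioned above, assuming the arc-swap crate; the names are hypothetical and this is not what the PR implements:

```rust
use std::any::TypeId;
use std::sync::Arc;

use arc_swap::ArcSwap;

// Per-memo type metadata (illustrative).
#[derive(Clone, Copy)]
struct MemoEntryType {
    type_id: TypeId,
}

// Hypothetical lock-free holder for the per-ingredient memo types.
struct MemoTypes {
    types: ArcSwap<Vec<MemoEntryType>>,
}

impl MemoTypes {
    fn new() -> Self {
        Self { types: ArcSwap::from_pointee(Vec::new()) }
    }

    // Readers take no lock: just an atomic load of the current snapshot.
    fn load(&self) -> Arc<Vec<MemoEntryType>> {
        self.types.load_full()
    }

    // Writers (registering a new memo ingredient) publish an extended copy.
    // Old snapshots stay alive until their last reader drops them, so this
    // reference-counted variant needs no separate GC.
    fn push(&self, entry: MemoEntryType) {
        self.types.rcu(|current| {
            let mut next = Vec::clone(current);
            next.push(entry);
            next
        });
    }
}
```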
