-
Notifications
You must be signed in to change notification settings - Fork 116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(storage): enable parallel writes by using per-repo and per-digest locking #2968
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2968 +/- ##
==========================================
- Coverage 91.00% 90.94% -0.06%
==========================================
Files 176 177 +1
Lines 32900 32972 +72
==========================================
+ Hits 29940 29987 +47
- Misses 2234 2254 +20
- Partials 726 731 +5 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
21c56e6
to
a83648b
Compare
b1540db
to
d5b8180
Compare
Looks reasonable. Are there actual benchmark numbers that show improvement under contention? |
6f48c6d
to
cd30d2d
Compare
I have updated the tests using the benchmark HG action to post the comparison in the job summary: And: The results for minio are almost unchanged. |
I ran zb locally with config (note GC and dedupe are enabled):
And zb command: This is the result with the code in #2996 (just some ci improvements) zb_diff.md |
linking issue with tests and comments |
We really need to add DELETEs in the zb tests, since that path will really stress test gc and dedupe. |
How does dedupe locking and imgstore locking work wrt each other? Thread1: Create path imgstore lock, unlock, dedupe lock, unlock They can cross each other in unexpected ways! Also, I am expected dedupe "lock/unlock" actually done in boltdb right? |
Can you clarify what you mean? The image store has locks per repo, and per digest.
I do not know what you are talking about. boltdb has its own internal transaction/locking mechanism, which we don't meddle with. |
If you have to take two locks, make sure ordering is ensured, maybe by repo names. |
Like I said 1st lock is on the repo, 2nd lock is on the digest. And on release digest, then repo. We don't lock multiple repos at the same time in the same goroutine, as that results in deadlocks. Also we don't lock multiple digests at the same time in the same goroutine, that's not necessary. I had also mentioned this in #2968 (comment). Ideally we would have locked multiple repos while dedupe ran on them, tried it, and it was 100% guarantee of a deadlock. |
I am still at the code carefully. |
We need to reproduce / fix the docker buildx issue, but I don't think it is related to this change. It was reported in 2024 as well, and it is not clear if they tested the workaround back then. |
I would give another shot to locking multiple repos instead of using the digest locking as it is implemented right now. The issue is we can't order the repo locks. When the user pushes a blob, that specific repo is locked right away, and only later, when checking for duplicates, we can lock the rest of the affected repos. |
@andaaron is there any updates? I can try new build with fix, because errors are really painful ( |
Hi @shcherbak, not from my side. I will try to look into this next weekend. |
i'm on collecting surrounding information about several concrete events, will send in an hour. |
HEAD 500 and surrounding logs (grep -C3 -A3)
|
PATCH 500 and surrounding logs (grep -C3 -A3)
|
…t locking - lock per repo on pushes/pulls/retention, in short index operations - lock per digest when using multiple operations affecting the cachedb and storage (blob writes/deletes/moves/links in storage which need to be in accordance with cachedb content) Do not lock multiple repos at the same time in the same goroutine! It will cause deadlocks. Same applies to digests. Signed-off-by: Andrei Aaron <aaaron@luxoft.com>
…ilures Signed-off-by: Andrei Aaron <aaaron@luxoft.com>
…aining image cache Signed-off-by: Andrei Aaron <aaaron@luxoft.com>
Should fix issues such as #2964
(blob writes/deletes/moves/links in storage which need to be in accordance with cachedb content)
Do not lock multiple repos at the same time in the same goroutine! It will cause deadlocks.
Same applies to digests.
In separate commits:
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.