fix: #462 parallel uploads to the same blob storage #485
May I ask someone for a review? Maybe @TomAugspurger @martindurant since you reviewed the last PRs? Thank you in advance.
Is it possible to test this change, something that would fail without it?
Hi @martindurant , thank you for the reply. In the original issue #462, we discovered that
I think it is not easy to write a pytest to show this issue explicitly.
I'm afraid I don't know anything about the function of these IDs... The problem was with assigning a constant ID to multiple uploads? I'm not sure I understand "who wins" in the case that two processes are writing at the same time.
Please add
Someone should also update the CI to a newer version of Python, I would say at least 3.10.
31cc6ab to d83487a
Hi @martindurant ,
The problem happens even when two different blocks are being written to the blob storage at the same time, because the original implementation only takes the number of blocks into account. This is the case we want to solve here.
d83487a should have fixed the pipeline for py38. Please run the pipeline again, thanks.
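To make the collision described above concrete, here is a minimal sketch. It is not the library's code: `counter_block_id`, `stage`, and the `staged` dict are hypothetical stand-ins, and the ID format is an assumption chosen only to show what happens when an ID depends on nothing but the block's position.

```python
def counter_block_id(index: int) -> str:
    # Hypothetical scheme: the ID depends only on the block's position.
    return f"{index:07d}"


# Stand-in for the set of staged (uncommitted) blocks of one blob, keyed by block ID.
staged: dict[str, bytes] = {}


def stage(block_id: str, data: bytes) -> None:
    # Staging a block under an existing ID replaces the earlier data.
    staged[block_id] = data


# Two independent writers each stage their first block for the same blob.
stage(counter_block_id(0), b"from writer A")
stage(counter_block_id(0), b"from writer B")

# Both writers produced the same ID, so only one block survives.
collided = len(staged) == 1
```

In this toy model the second writer silently replaces the first writer's block, which is the kind of "who wins" situation asked about earlier in the thread.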
Rather than hashing the data (which might be expensive), would any random value do?
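A minimal sketch of the random-value idea, assuming a UUID is an acceptable source of randomness. `random_block_id` is a hypothetical helper, not the function this PR adds. It relies on the Azure Blob Storage rules that block IDs are Base64-encoded and that all block IDs within one blob have the same length.

```python
import base64
import uuid


def random_block_id() -> str:
    # uuid4().hex is always 32 characters, so every encoded ID has the same length.
    raw = uuid.uuid4().hex.encode("ascii")
    return base64.b64encode(raw).decode("ascii")


# Two writers generating IDs independently should not collide in practice.
id_a = random_block_id()
id_b = random_block_id()
```

Unlike hashing the block contents, the cost here does not grow with the size of the data, and two writers uploading identical bytes still get distinct IDs.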
d83487a to da89523
Closes #462.