Address #2220 (slow download perf against PyPi mirror) #2319
Summary
Addressing the extremely slow performance detailed in #2220. There are two changes to increase download performance: sending accept-encoding: identity, in the spirit of pypa/pip#1688 (Accept encoding identity), and increasing the download buffer.
1. accept-encoding: identity
I think this related pip PR has a good explanation of what's going on: pypa/pip#1688
The files.pythonhosted.org server is the 1st kind. A debug log I added in uv when installing against PyPI shows that there is no content-encoding header in the response, the whl hasn't been compressed, and there is a content-length header.
Our internal mirror is the third case. It does seem sensible that our mirror should be modified to act like the 1st kind, but uv should handle all three cases like pip does.
2. buffer increase
In #2220 I observed that pip's downloading was causing up to 128KiB flushes in our mirror.
After fix 1, uv was still only causing up to 8KiB flushes, and was slower to download than pip. Increasing this buffer from the default 8KiB led to a download performance improvement against our mirror, and the expected 128KiB flushes were observed.
Test Plan
Ran benchmarking as instructed by @charliermarsh
No performance improvement or regression.
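For readers who want to see the two changes from the summary side by side, here is a minimal sketch in Python (not uv's actual Rust code, which is not shown in this description). It sends accept-encoding: identity and copies the response body in chunks of up to 128KiB. The names `build_request`, `copy_in_chunks`, `download`, and `CHUNK_SIZE` are hypothetical, and the 128KiB value is an assumption taken from the flush size observed with pip.

```python
import io
import urllib.request

# Assumption: 128 KiB, chosen to mirror the flush size observed with pip.
CHUNK_SIZE = 128 * 1024


def build_request(url: str) -> urllib.request.Request:
    """Build a GET request that asks the server not to compress the body."""
    return urllib.request.Request(url, headers={"Accept-Encoding": "identity"})


def copy_in_chunks(src, dst, chunk_size: int = CHUNK_SIZE) -> int:
    """Copy src to dst, reading at most chunk_size bytes per call.

    Returns the total number of bytes copied.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # empty bytes means end of stream
            break
        dst.write(chunk)
        total += len(chunk)
    return total


def download(url: str, dst) -> int:
    """Download url into the binary file-like object dst."""
    with urllib.request.urlopen(build_request(url)) as response:
        return copy_in_chunks(response, dst)
```

The copy loop can be exercised without a network by passing an in-memory stream, for example `copy_in_chunks(io.BytesIO(data), io.BytesIO())`. Larger reads mean fewer round trips through the copy loop per file, which is the same idea as raising the buffer above the 8KiB default.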