-
Notifications
You must be signed in to change notification settings - Fork 146
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add async support for temporary file handling. Closes #344 #867
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good in general, but I left several change requests.
- Remove redundant decorators - Adjust type annotations by removing unnecessary parentheses - Import from its correct location - Remove unnecessary async versions of - Add blank lines for better readability after control blocks - Remove redundant type casts and isinstance checks - Group related tests under a test class for better organization - Fix path variable naming consistency - Ensure proper handling of blocking I/O within async functions
You also need to fix the pyright errors. |
Given that this is a new feature, I think it warrants a new documentation chapter. The code examples don't need to be comprehensive, but should make readers aware of this feature. |
…nore ASYNC230 - Insert a blank line after the end of every control block to separate it from the following code (per review instructions). - Explicitly specify return types and adjust type annotations to resolve Pyright errors. - Maintain blocking I/O inside to_thread.run_sync() as per ASYNC230 compliance for runtime code. - Suppress ASYNC230 warnings in the test suite by adding per-file ignores in pyproject.toml.
I will add the document work and commit it as soon as possible. Thank you for your thorough review! |
src/anyio/_core/_tempfile.py
Outdated
self._async_file = None | ||
|
||
|
||
class SpooledTemporaryFile(Generic[AnyStr]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally for SpooledTemporaryFile is that writes/reads to/from the BytesIO buffer will not go to a thread
SpooledTemporaryFile needs to not go to a thread for these 'small' file things because the usecase is starlette multipart form file uploads which needs to remain fast for tiny uploads
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Kludex would you like to participate in the review of this feature too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@11kkw I think @graingert is referring to something like this: https://github.com/Tinche/aiofiles/blob/main/src/aiofiles/tempfile/temptypes.py#L25-L55
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made the necessary changes based on your feedback, but I'm not entirely sure if I've implemented them correctly due to my limited experience. I'd really appreciate it if you could review my changes and provide any further suggestions. Thank you for your guidance!"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class needs an aclose()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, it needs the aclose()
.
- Optimized to avoid unnecessary thread switching for small files - Fixed blank lines after control blocks for better readability - Resolved Pyright type errors - Updated async tempfile documentation based on review feedback
Tests on 3.14 are failing because a new alpha added yet more attributes to |
Oh, before I forget: the API reference needs to be updated to include the new classes and functions. |
Co-authored-by: Alex Grönholm <alex.gronholm@nextday.fi>
Co-authored-by: Alex Grönholm <alex.gronholm@nextday.fi>
…te API docs - Improved type annotations for mkstemp and mkdtemp functions using overloads per Typeshed. - Fixed test failures on Python 3.14 caused by additional attributes in pathlib.Path. - Updated API reference to include new classes and functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some minor clean-ups as suggestions. The only real piece missing is the additions to the API reference documentation, and then @graingert needs to re-check the work you did on SpooledTemporaryFile
. We're almost done now!
Co-authored-by: Alex Grönholm <alex.gronholm@nextday.fi>
Co-authored-by: Alex Grönholm <alex.gronholm@nextday.fi>
Co-authored-by: Alex Grönholm <alex.gronholm@nextday.fi>
Co-authored-by: Alex Grönholm <alex.gronholm@nextday.fi>
Co-authored-by: Alex Grönholm <alex.gronholm@nextday.fi>
Co-authored-by: Alex Grönholm <alex.gronholm@nextday.fi>
Co-authored-by: Alex Grönholm <alex.gronholm@nextday.fi>
Co-authored-by: Alex Grönholm <alex.gronholm@nextday.fi>
Co-authored-by: Alex Grönholm <alex.gronholm@nextday.fi>
Co-authored-by: Alex Grönholm <alex.gronholm@nextday.fi>
Co-authored-by: Alex Grönholm <alex.gronholm@nextday.fi>
src/anyio/_core/_tempfile.py
Outdated
if not getattr(self._fp, "_rolled", True): | ||
result = self._fp.write(s) | ||
if self._fp._max_size and self._fp.tell() > self._fp._max_size: | ||
self._fp.rollover() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this do blocking IO?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
write will do the rollover for us in this situation, which will result in blocking disk IO on the event loop there, so we need to make sure it won't.
You don't have to use SpooledTemporyFile here you can just copy the source code here with attribution, and modify it to use an async file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@11kkw Yeah, so instead of delegating to tempfile.SpooledTemporaryFile
, @graingert is suggesting that you roll an independent implementation in order to get more control over async checkpoints and such.
src/anyio/_core/_tempfile.py
Outdated
) -> SpooledTemporaryFile[AnyStr]: | ||
fp = await to_thread.run_sync( | ||
partial( | ||
tempfile.SpooledTemporaryFile, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Creating the SpooledTemporyFile doesn't perform blocking disk IO so this shouldn't need a thread
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, you already mentioned that ^ here.
src/anyio/_core/_tempfile.py
Outdated
self.dir: AnyStr | None = dir | ||
self.errors = errors | ||
self._async_file: AsyncFile | None = None | ||
self._fp: Any = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think should be initialized here, how does the upstream SpooledTemporyFile behave?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't do any file I/O if that's what you mean, so yes, self._fp should be initialized to tempfile.SpooledTemporaryFile
here.
raise RuntimeError("Underlying file is not initialized.") | ||
|
||
if not getattr(self._fp, "_rolled", True): | ||
result = self._fp.write(s) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
result = self._fp.write(s) | |
return self._fp.write(s) |
@@ -90,6 +90,9 @@ extend-select = [ | |||
[tool.ruff.lint.isort] | |||
"required-imports" = ["from __future__ import annotations"] | |||
|
|||
[tool.ruff.lint.per-file-ignores] | |||
"tests/test_tempfile.py" = ["ASYNC230"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which open call is blocking?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The one in test_mkstemp()
.
async def test_exception_handling(self) -> None: | ||
async with NamedTemporaryFile[bytes]() as af: | ||
filename = af.name | ||
assert os.path.exists(filename) # type: ignore[arg-type] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this a mypy error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the name
attribute is retrieved using __getattr__()
which is declared to return object
. I think this has been used elsewhere in anyio too – perhaps we should declare it to return Any
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course, that won't be necessary if this is just reimplemented by copying stdlib code.
raise RuntimeError("Underlying file is not initialized.") | ||
|
||
if not getattr(self._fp, "_rolled", True): | ||
result = self._fp.write(s) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to have a bit of a think here. According to the rules of trio every async function should always checkpoint, however for SpooledTemporyFile to be useful it should be as close to a BytesIO in terms of performance at small sizes. The documentation should have a big red note here or write should include a checkpoint if we're writing to the memory buffer
|
||
|
||
@pytest.mark.parametrize("prefix", [b"mkdtemp_", "mkdtemp_"]) | ||
async def test_mkdtemp(prefix: AnyStr) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The typevar is unwarranted here.
async def test_mkdtemp(prefix: AnyStr) -> None: | |
async def test_mkdtemp(prefix: str | bytes) -> None: |
@11kkw Let us know if you need help. I understand all this back and forth may feel pretty exhausting! |
src/anyio/_core/_tempfile.py
Outdated
self._async_file = None | ||
|
||
|
||
class SpooledTemporaryFile(Generic[AnyStr]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this a generic? The tempfile.SpooledTemporaryFile
is not a generic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty generic to me? https://github.com/python/typeshed/blob/ac8f2632ec37bb4a82ade0906e6ce9bdb33883d3/stdlib/tempfile.pyi#L277
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious, mypy reports the type as SpooledTemporaryFile[builtins.str]
while pyright reports it as SpooledTemporaryFile[Unknown]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from tempfile import SpooledTemporaryFile
from anyio import SpooledTemporaryFile as AnyIOSpooledTemporaryFile
async def main() -> None:
with SpooledTemporaryFile() as f1:
reveal_type(f1)
async with AnyIOSpooledTemporaryFile() as f2:
reveal_type(f2)
Gives me:
/home/alex/workspace/anyio/client.py:7:21 - information: Type of "f1" is "SpooledTemporaryFile[bytes]"
/home/alex/workspace/anyio/client.py:10:21 - information: Type of "f2" is "SpooledTemporaryFile[Unknown]"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh...I didn't notice before that those TemporaryFile
etc. in typeshed were actually functions! That explains a few things. I'll work on fixing the SpooledTemporaryFile
implementation tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I refactored SpooledTemporaryFile
in a way that should satisfy @graingert. However, this caused a lot fo mypy errors that I can't understand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry for bringing this up so late. However, since the official documentation states, The dir, prefix and suffix parameters have the same meaning and defaults as with mkstemp()
I assumed that the parameters for mkstemp are of type AnyStr
, and I proceeded with that approach.
https://docs.python.org/3/library/tempfile.html#tempfile.mkstemp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whatever the reason, using AnyStr
in both the class type argument AND the initializer arguments makes no sense because the type parameter in the class is supposed to determine whether the input and output type is str
or bytes
. But the reason for the initializer arguments to be parametrized is because they in part determine file name and you want them to be either str
or bytes
. Two different, unrelated uses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whatever the reason, using
AnyStr
in both the class type argument AND the initializer arguments makes no sense because the type parameter in the class is supposed to determine whether the input and output type isstr
orbytes
. But the reason for the initializer arguments to be parametrized is because they in part determine file name and you want them to be eitherstr
orbytes
. Two different, unrelated uses.
I agree with this change
Changes
feat: Add async support for temporary file handling (#344)
to_thread
for handling async operations with tempfiles.Checklist
tests/
) which would fail without your patch.docs/
, in case of behavior changes or new features). (Pending)docs/versionhistory.rst
). (Pending)Updating the changelog
(Changelog not updated)
Additional Notes
I have implemented async support for temporary file handling.
Currently, I haven't updated the documentation or changelog yet.
I'm unsure if this implementation follows the best approach, so I would appreciate feedback on the current implementation first.
Once I receive input from the review, I will update the documentation and changelog accordingly if needed.