-
-
Notifications
You must be signed in to change notification settings - Fork 31.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-127647: Add typing.Reader and Writer protocols #127648
Conversation
A few design considerations:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit sad that we can't use covariance and contravariance in the protocols since a subclass could use it with an invariant type.
Small improvements to the docstrings and signature
Doc/library/typing.rst
Outdated
|
||
Protocol for reading from a file or other input stream. | ||
|
||
.. method:: read(size=...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.. method:: read(size=...) | |
.. method:: read(size=..., /) |
(Same for other methods)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was unsure of how much of the signature to document. For example, would it also make sense to add argument and/or return types? I've added the slashes for now.
|
That feels terribly hackish to me, even if there is precedence. And in my experience, hackish stuff often falls on one's feet one way or the other. But I can move it to |
This has been the way the cpython/Lib/collections/__init__.py Lines 32 to 33 in 023b7d2
|
Well, considering that |
I'm open to something less hacky if you can find a generalized solution that doesn't make |
I don't think the typing spec needs to be cognisant of the hacks we use to make things work at runtime in the stdlib! This knowledge is irrelevant for third-party users of protocols, to users of type checkers and to implementers of type checkers. It's definitely not encouraged for any external users to make use of the |
CI seems flaky, failures are unrelated. |
@AlexWaygood I've moved the protocols to
I'm unsure why that is. Originally I thought it was because I added a |
It seems that by using the new generic syntax ( |
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM now but it might be good to have one or two simple tests covering the new code. It probably doesn't need to be much more than this (added to test_io.py
)
cpython/Lib/test/test_typing.py
Lines 4260 to 4262 in a083633
def test_supports_index(self): | |
self.assertIsSubclass(int, typing.SupportsIndex) | |
self.assertNotIsSubclass(str, typing.SupportsIndex) |
and maybe an addition to the test_typing
tests similar to this one:
cpython/Lib/test/test_typing.py
Lines 4274 to 4303 in a083633
def test_collections_protocols_allowed(self): | |
@runtime_checkable | |
class Custom(collections.abc.Iterable, Protocol): | |
def close(self): ... | |
class A: pass | |
class B: | |
def __iter__(self): | |
return [] | |
def close(self): | |
return 0 | |
self.assertIsSubclass(B, Custom) | |
self.assertNotIsSubclass(A, Custom) | |
@runtime_checkable | |
class ReleasableBuffer(collections.abc.Buffer, Protocol): | |
def __release_buffer__(self, mv: memoryview) -> None: ... | |
class C: pass | |
class D: | |
def __buffer__(self, flags: int) -> memoryview: | |
return memoryview(b'') | |
def __release_buffer__(self, mv: memoryview) -> None: | |
pass | |
self.assertIsSubclass(D, ReleasableBuffer) | |
self.assertIsInstance(D(), ReleasableBuffer) | |
self.assertNotIsSubclass(C, ReleasableBuffer) | |
self.assertNotIsInstance(C(), ReleasableBuffer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great, thank you! Just a few docs nits remaining
I'll leave it a little bit in case any of the other reviewers here have any remaining concerns before merging |
__slots__ = () | ||
|
||
@abc.abstractmethod | ||
def read(self, size=..., /): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry if this has been discussed before, but I'm unsure on the runtime use of size=...
(I didn't notice this earlier in my documentation review, sorry).
Almost every other read(size)
method I can find has a default of either None
or -1
. I also can't find another method in the stdlib with a default of ...
(outside of the recently-added protocols in wsgiref.types
).
Would it be better to have size=-1
, to indicate that the method takes an int
? I'm not sure how much we want typeshed-like practices to leak into the standard library.
def read(self, size=..., /): | |
def read(self, size=-1, /): |
A
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mandating defaults is not really something you can do in a protocol. I also wouldn't want to mandate that implementors have to use a default of -1
, because – as you said – some implementations use None
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, but this isn't a protocol -- it's an ABC, which do have defaults -- see e.g. collections.abc.Generator
. All of the read()
methods in io
are documented as having read(size=-1, /)
, and given these ABCs are going into io
, I think we should be consistent with that interface, or have good reason to diverge from it (& document why).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's supposed to be a protocol, not an ABC. (Notwithstanding the fact that all protocols are ABCs.) It's just a protocol in the implementation for performance reasons. And using -1
as a default would give users the very wrong impression that they can use read(-1)
when that may or may not actually be supported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using
-1
as a default would give users the very wrong impression that they can useread(-1)
when that may or may not actually be supported.
From the documentation of io.RawIOBase.read()
:
Read up to size bytes from the object and return them. As a convenience, if size is unspecified or -1, all bytes until EOF are returned.
I would expect the io.Reader.read()
ABC/protocol to have this same guarantee, for a 'properly' implemented read()
method (according to the io
expecations). In the proposed documentation, we say:
Read data from the input stream and return it. If size is specified, it should be an integer, and at most size items (bytes/characters) will be read.
This forbids None
(good!), but is silent on what happens should size be omitted. I still think we should use -1
instead of ...
, but at the very least we should include in the documentation the contract for what happens when read()
is called with no arguments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I disagree. -1
is the default for some implementations, but the protocol should not make any default mandatory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still not sure I follow, though -- currently the ABC/protocol has the mandatory default of size=...
, which is always invalid and not the correct type. -1
is valid for all interfaces specified by the io
documentation, which is where this new type is being added.
I ran the following with mypy --strict
and it passed, so I don't think type checkers care about default values (and as discussed, the type forbids using non-integer types):
import abc
class Reader(metaclass=abc.ABCMeta):
__slots__ = ()
@abc.abstractmethod
def read(self, size: int = -1, /) -> bytes: pass
class CustomReader(Reader):
def read(self, size: int = -1, /) -> bytes:
return b''
class CustomReaderZero(Reader):
def read(self, size: int = 0, /) -> bytes:
return b''
assert issubclass(CustomReader, Reader)
assert issubclass(CustomReaderZero, Reader)
assert isinstance(CustomReader(), Reader)
assert isinstance(CustomReaderZero(), Reader)
def reader_func(r: Reader) -> None:
r.read()
reader_func(CustomReader())
reader_func(CustomReaderZero())
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with Sebastian here; we should use ...
because the protocol need not mandate any particular default.
-1 is valid for all interfaces specified by the io documentation
The protocol should also match other file-like classes defined elsewhere in the stdlib or even in third-party libraries. When defining a protocol it's often useful to be permissive, so that all objects that are intended to match the protocol actually match it.
Could this be merged before the next alpha next week? |
Reader
andWriter
protocols #127647📚 Documentation preview 📚: https://cpython-previews--127648.org.readthedocs.build/