gh-127647: Add typing.Reader and Writer protocols #127648

Merged
merged 37 commits into from
Mar 6, 2025
Commits
b45fec0
Add typing.Reader and Writer protocols
srittau Dec 5, 2024
1525e05
Add a note about Iterable
srittau Dec 5, 2024
7867ec1
Fix docs formatting
srittau Dec 5, 2024
6a22a02
Small wording improvements
srittau Dec 5, 2024
5d632a3
Simplify the docs/improve formatting
srittau Dec 5, 2024
4d50c2e
Explicitly document the methods
srittau Dec 5, 2024
1e1ea41
Mark protocol members as abstract
srittau Dec 6, 2024
56a38a0
Add .. versionadded
srittau Dec 6, 2024
f2c331b
Added slashes to documented signatures
srittau Dec 6, 2024
6764b6a
Fix overindentation
srittau Dec 6, 2024
022acaa
Fix documentation of Reader.__iter__()
srittau Dec 6, 2024
b86073d
Remove the @runtime_checkable flags
srittau Dec 6, 2024
65eb040
Merge branch 'main' into typing-readable-writable
srittau Jan 6, 2025
2b9159d
Merge branch 'main' into typing-readable-writable
srittau Feb 25, 2025
1f42b21
Remove Reader.__iter__() and readline()
srittau Feb 25, 2025
0325f5a
Move protocols to io
srittau Feb 25, 2025
632511a
Update whatsnew
srittau Feb 25, 2025
3b384f9
Update NEWS file
srittau Feb 25, 2025
5bdb4cc
Fix abstractmethod import
srittau Feb 25, 2025
35dcaf4
Fix runtime_checkable link in docs
srittau Feb 25, 2025
5584a57
Add Reader and Writer to proto allowlist
srittau Feb 25, 2025
af81301
Import Reader and Writer into _pyio
srittau Feb 25, 2025
5a8b915
Import _collections_abc dynamically
srittau Feb 25, 2025
b1593fa
Merge branch 'main' into typing-readable-writable
srittau Feb 25, 2025
577b893
Use metaclass instead of deriving from `ABC`
srittau Feb 25, 2025
cedfa42
Use __class_getitem__ instead of making the class generic
srittau Feb 25, 2025
a0b9e47
Remove type annotations
srittau Feb 25, 2025
53a2250
Move import back to top level
srittau Feb 25, 2025
03aa3a2
Merge branch 'main' into typing-readable-writable
srittau Feb 27, 2025
ca72c19
Fix doc reference to decorator
srittau Feb 27, 2025
3b5975e
Fix references in docs
srittau Feb 27, 2025
96080fe
Split signature
srittau Feb 27, 2025
3723370
Document that Reader and Writer are generic
srittau Feb 27, 2025
76003a8
Add tests
srittau Feb 27, 2025
43e23f0
Add missing import
srittau Feb 27, 2025
c644770
Doc fixes
srittau Feb 27, 2025
bfab2fd
Merge branch 'main' into typing-readable-writable
srittau Mar 6, 2025
49 changes: 49 additions & 0 deletions Doc/library/io.rst
@@ -1147,6 +1147,55 @@ Text I/O
It inherits from :class:`codecs.IncrementalDecoder`.


Static Typing
-------------

The following protocols can be used for annotating function and method
arguments for simple stream reading or writing operations. They are decorated
with :deco:`typing.runtime_checkable`.

.. class:: Reader[T]

   Generic protocol for reading from a file or other input stream. ``T`` will
   usually be :class:`str` or :class:`bytes`, but can be any type that is
   read from the stream.

   .. versionadded:: next

   .. method:: read()
               read(size, /)

      Read data from the input stream and return it. If *size* is
      specified, it should be an integer, and at most *size* items
      (bytes/characters) will be read.

      For example::

        def read_it(reader: Reader[str]):
            data = reader.read(11)
            assert isinstance(data, str)

.. class:: Writer[T]

   Generic protocol for writing to a file or other output stream. ``T`` will
   usually be :class:`str` or :class:`bytes`, but can be any type that can be
   written to the stream.

   .. versionadded:: next

   .. method:: write(data, /)

      Write *data* to the output stream and return the number of items
      (bytes/characters) written.

      For example::

        def write_binary(writer: Writer[bytes]):
            writer.write(b"Hello world!\n")

See :ref:`typing-io` for other I/O related protocols and classes that can be
used for static type checking.
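
A minimal sketch of how the two protocols documented above are meant to be used in annotations. Since `io.Reader` and `io.Writer` only exist in the Python version that ships this change, the sketch defines stand-in protocols with `typing.Protocol`. The names `Reader`, `Writer`, and `copy_upper` below are illustrative assumptions and not the stdlib implementation.

```python
import io
from typing import Protocol, TypeVar

T_co = TypeVar("T_co", covariant=True)
T_contra = TypeVar("T_contra", contravariant=True)


class Reader(Protocol[T_co]):
    """Stand-in for io.Reader: anything with a read() method."""

    def read(self, size: int = ..., /) -> T_co: ...


class Writer(Protocol[T_contra]):
    """Stand-in for io.Writer: anything with a write() method."""

    def write(self, data: T_contra, /) -> int: ...


def copy_upper(src: Reader[str], dst: Writer[str]) -> int:
    """Read everything from src, upper-case it, and write it to dst."""
    return dst.write(src.read().upper())


# io.StringIO is never declared as a Reader or Writer; it matches structurally.
source = io.StringIO("hello")
sink = io.StringIO()
written = copy_upper(source, sink)
```

The function only depends on `read()` and `write()`, so any object providing those methods can be passed, which is the narrower contract the documentation describes.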

Performance
-----------

32 changes: 25 additions & 7 deletions Doc/library/typing.rst
@@ -2834,17 +2834,35 @@ with :func:`@runtime_checkable <runtime_checkable>`.
An ABC with one abstract method ``__round__``
that is covariant in its return type.

-ABCs for working with IO
-------------------------
+.. _typing-io:
+
+ABCs and Protocols for working with I/O
+---------------------------------------

-.. class:: IO
-           TextIO
-           BinaryIO
+.. class:: IO[AnyStr]
+           TextIO[AnyStr]
+           BinaryIO[AnyStr]

-   Generic type ``IO[AnyStr]`` and its subclasses ``TextIO(IO[str])``
+   Generic class ``IO[AnyStr]`` and its subclasses ``TextIO(IO[str])``
    and ``BinaryIO(IO[bytes])``
    represent the types of I/O streams such as returned by
-   :func:`open`.
+   :func:`open`. Please note that these classes are not protocols, and
+   their interface is fairly broad.
+
+The protocols :class:`io.Reader` and :class:`io.Writer` offer a simpler
+alternative for argument types, when only the ``read()`` or ``write()``
+methods are accessed, respectively::
+
+   def read_and_write(reader: Reader[str], writer: Writer[bytes]):
+       data = reader.read()
+       writer.write(data.encode())
+
+Also consider using :class:`collections.abc.Iterable` for iterating over
+the lines of an input stream::
+
+   def read_config(stream: Iterable[str]):
+       for line in stream:
+           ...
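
The `Iterable[str]` suggestion above can be sketched as follows. The `key = value` line format and the body of `read_config` are hypothetical and only serve to show that a text stream and a plain list of strings are both accepted.

```python
import io
from collections.abc import Iterable


def read_config(stream: Iterable[str]) -> dict[str, str]:
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    config = {}
    for line in stream:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip()
    return config


text = "# settings\nname = demo\n\nlevel = 3\n"
from_file = read_config(io.StringIO(text))  # iterating a stream yields lines
from_list = read_config(["name = demo", "level = 3"])  # no file object needed
```

Annotating the parameter as `Iterable[str]` keeps the function usable in tests without constructing any file-like object.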

Functions and decorators
------------------------
5 changes: 5 additions & 0 deletions Doc/whatsnew/3.14.rst
@@ -619,6 +619,11 @@ io
:exc:`BlockingIOError` if the operation cannot immediately return bytes.
(Contributed by Giovanni Siragusa in :gh:`109523`.)

* Add protocols :class:`io.Reader` and :class:`io.Writer` as simpler
  alternatives to the pseudo-protocols :class:`typing.IO`,
  :class:`typing.TextIO`, and :class:`typing.BinaryIO`.
  (Contributed by Sebastian Rittau in :gh:`127648`.)


json
----
2 changes: 1 addition & 1 deletion Lib/_pyio.py
@@ -16,7 +16,7 @@
_setmode = None

import io
-from io import (__all__, SEEK_SET, SEEK_CUR, SEEK_END) # noqa: F401
+from io import (__all__, SEEK_SET, SEEK_CUR, SEEK_END, Reader, Writer) # noqa: F401

valid_seek_flags = {0, 1, 2} # Hardwired values
if hasattr(os, 'SEEK_HOLE') :
56 changes: 55 additions & 1 deletion Lib/io.py
@@ -46,12 +46,14 @@
"BufferedReader", "BufferedWriter", "BufferedRWPair",
"BufferedRandom", "TextIOBase", "TextIOWrapper",
"UnsupportedOperation", "SEEK_SET", "SEEK_CUR", "SEEK_END",
-"DEFAULT_BUFFER_SIZE", "text_encoding", "IncrementalNewlineDecoder"]
+"DEFAULT_BUFFER_SIZE", "text_encoding", "IncrementalNewlineDecoder",
+"Reader", "Writer"]


import _io
import abc

+from _collections_abc import _check_methods
from _io import (DEFAULT_BUFFER_SIZE, BlockingIOError, UnsupportedOperation,
open, open_code, FileIO, BytesIO, StringIO, BufferedReader,
BufferedWriter, BufferedRWPair, BufferedRandom,
@@ -97,3 +99,55 @@ class TextIOBase(_io._TextIOBase, IOBase):
pass
else:
RawIOBase.register(_WindowsConsoleIO)

#
# Static Typing Support
#

GenericAlias = type(list[int])


class Reader(metaclass=abc.ABCMeta):
    """Protocol for simple I/O reader instances.
    This protocol only supports blocking I/O.
    """

    __slots__ = ()

    @abc.abstractmethod
    def read(self, size=..., /):
Member

Sorry if this has been discussed before, but I'm unsure on the runtime use of size=... (I didn't notice this earlier in my documentation review, sorry).

Almost every other read(size) method I can find has a default of either None or -1. I also can't find another method in the stdlib with a default of ... (outside of the recently-added protocols in wsgiref.types).

Would it be better to have size=-1, to indicate that the method takes an int? I'm not sure how much we want typeshed-like practices to leak into the standard library.

Suggested change
-    def read(self, size=..., /):
+    def read(self, size=-1, /):

Contributor Author
srittau Feb 27, 2025

Mandating defaults is not really something you can do in a protocol. I also wouldn't want to mandate that implementors have to use a default of -1, because – as you said – some implementations use None.

Member

Right, but this isn't a protocol -- it's an ABC, which do have defaults -- see e.g. collections.abc.Generator. All of the read() methods in io are documented as having read(size=-1, /), and given these ABCs are going into io, I think we should be consistent with that interface, or have good reason to diverge from it (& document why).

Contributor Author

It's supposed to be a protocol, not an ABC. (Notwithstanding the fact that all protocols are ABCs.) It's just an ABC in the implementation for performance reasons. And using -1 as a default would give users the very wrong impression that they can use read(-1) when that may or may not actually be supported.

Member

> using -1 as a default would give users the very wrong impression that they can use read(-1) when that may or may not actually be supported.

From the documentation of io.RawIOBase.read():

> Read up to size bytes from the object and return them. As a convenience, if size is unspecified or -1, all bytes until EOF are returned.

I would expect the io.Reader.read() ABC/protocol to have this same guarantee, for a 'properly' implemented read() method (according to the io expectations). In the proposed documentation, we say:

> Read data from the input stream and return it. If size is specified, it should be an integer, and at most size items (bytes/characters) will be read.

This forbids None (good!), but is silent on what happens should size be omitted. I still think we should use -1 instead of ..., but at the very least we should include in the documentation the contract for what happens when read() is called with no arguments.

Contributor Author

I disagree. -1 is the default for some implementations, but the protocol should not make any default mandatory.

Member

I'm still not sure I follow, though -- currently the ABC/protocol has the mandatory default of size=..., which is always invalid and not the correct type. -1 is valid for all interfaces specified by the io documentation, which is where this new type is being added.

I ran the following with mypy --strict and it passed, so I don't think type checkers care about default values (and as discussed, the type forbids using non-integer types):

import abc

class Reader(metaclass=abc.ABCMeta):
    __slots__ = ()

    @abc.abstractmethod
    def read(self, size: int = -1, /) -> bytes: pass

class CustomReader(Reader):
    def read(self, size: int = -1, /) -> bytes:
        return b''

class CustomReaderZero(Reader):
    def read(self, size: int = 0, /) -> bytes:
        return b''

assert issubclass(CustomReader, Reader)
assert issubclass(CustomReaderZero, Reader)
assert isinstance(CustomReader(), Reader)
assert isinstance(CustomReaderZero(), Reader)

def reader_func(r: Reader) -> None:
    r.read()

reader_func(CustomReader())
reader_func(CustomReaderZero())

Member

I agree with Sebastian here; we should use ... because the protocol need not mandate any particular default.

> -1 is valid for all interfaces specified by the io documentation

The protocol should also match other file-like classes defined elsewhere in the stdlib or even in third-party libraries. When defining a protocol it's often useful to be permissive, so that all objects that are intended to match the protocol actually match it.
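
The point under discussion, that the protocol should accept implementations regardless of which default they choose for `size`, can be illustrated with a small sketch. It uses a stand-in `typing.Protocol` class rather than `io.Reader` itself; `SketchReader`, `MinusOneDefault`, `NoneDefault`, and `first_three` are hypothetical names introduced only for this illustration.

```python
from typing import Protocol, TypeVar, runtime_checkable

T_co = TypeVar("T_co", covariant=True)


@runtime_checkable
class SketchReader(Protocol[T_co]):
    # The "..." default only states that size is optional; it picks no value.
    def read(self, size: int = ..., /) -> T_co: ...


class MinusOneDefault:
    """Uses -1 for 'read everything', like the io classes document."""

    def read(self, size: int = -1, /) -> bytes:
        data = b"abcdef"
        return data if size < 0 else data[:size]


class NoneDefault:
    """Uses None for 'read everything'; gives -1 no special meaning."""

    def read(self, size=None, /) -> bytes:
        data = b"abcdef"
        return data if size is None else data[:size]


def first_three(reader: SketchReader[bytes]) -> bytes:
    # An explicit non-negative size is the case the documented contract covers.
    return reader.read(3)


results = [first_three(MinusOneDefault()), first_three(NoneDefault())]
matches = [isinstance(obj, SketchReader)
           for obj in (MinusOneDefault(), NoneDefault())]
minus_one_on_none_default = NoneDefault().read(-1)
```

Both classes match the structural check and agree for `read(3)`. In this sketch `NoneDefault().read(-1)` silently returns `b"abcde"` rather than the whole buffer, which is the kind of mismatch a protocol-level default of `-1` could hide.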

        """Read data from the input stream and return it.
        If *size* is specified, at most *size* items (bytes/characters) will be
        read.
        """

    @classmethod
    def __subclasshook__(cls, C):
        if cls is Reader:
            return _check_methods(C, "read")
        return NotImplemented

    __class_getitem__ = classmethod(GenericAlias)


class Writer(metaclass=abc.ABCMeta):
    """Protocol for simple I/O writer instances.
    This protocol only supports blocking I/O.
    """

    __slots__ = ()

    @abc.abstractmethod
    def write(self, data, /):
        """Write *data* to the output stream and return the number of items written."""

    @classmethod
    def __subclasshook__(cls, C):
        if cls is Writer:
            return _check_methods(C, "write")
        return NotImplemented

    __class_getitem__ = classmethod(GenericAlias)
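
The mechanism used by the two classes above (an `ABCMeta` class with a `__subclasshook__` for structural checks, plus `__class_getitem__` for subscription) can be tried out in isolation. The sketch below is an approximation under stated assumptions: `SketchReader` is a hypothetical stand-in for `io.Reader`, and `has_methods` is a simplified re-implementation of the private `_collections_abc._check_methods` helper so that the example does not import a private name.

```python
import abc

GenericAlias = type(list[int])


def has_methods(C, *methods):
    """Simplified stand-in for the private _collections_abc._check_methods."""
    for method in methods:
        for B in C.__mro__:
            if method in B.__dict__:
                if B.__dict__[method] is None:
                    return NotImplemented
                break
        else:
            return NotImplemented
    return True


class SketchReader(metaclass=abc.ABCMeta):
    __slots__ = ()

    @abc.abstractmethod
    def read(self, size=..., /):
        """Read and return data."""

    @classmethod
    def __subclasshook__(cls, C):
        # Only answer structurally for SketchReader itself, not subclasses.
        if cls is SketchReader:
            return has_methods(C, "read")
        return NotImplemented

    # Makes SketchReader[bytes] work without inheriting from typing.Generic.
    __class_getitem__ = classmethod(GenericAlias)


class MyReader:
    def read(self, sz=-1):
        return b""


alias = SketchReader[bytes]
is_sub = issubclass(MyReader, SketchReader)
str_is_sub = issubclass(str, SketchReader)
```

Subscripting produces a `types.GenericAlias` whose `__origin__` is the class. Passing such a parameterized alias as the second argument of `issubclass()` raises `TypeError`, so runtime checks in this sketch are made against the unparameterized class.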
18 changes: 18 additions & 0 deletions Lib/test/test_io.py
@@ -4916,6 +4916,24 @@ class PySignalsTest(SignalsTest):
test_reentrant_write_text = None


class ProtocolsTest(unittest.TestCase):
    class MyReader:
        def read(self, sz=-1):
            return b""

    class MyWriter:
        def write(self, b: bytes):
            pass

    def test_reader_subclass(self):
        self.assertIsSubclass(MyReader, io.Reader[bytes])
        self.assertNotIsSubclass(str, io.Reader[bytes])

    def test_writer_subclass(self):
        self.assertIsSubclass(MyWriter, io.Writer[bytes])
        self.assertNotIsSubclass(str, io.Writer[bytes])


def load_tests(loader, tests, pattern):
tests = (CIOTest, PyIOTest, APIMismatchTest,
CBufferedReaderTest, PyBufferedReaderTest,
35 changes: 35 additions & 0 deletions Lib/test/test_typing.py
@@ -6,6 +6,7 @@
from functools import lru_cache, wraps, reduce
import gc
import inspect
import io
import itertools
import operator
import os
@@ -4294,6 +4295,40 @@ def __release_buffer__(self, mv: memoryview) -> None:
self.assertNotIsSubclass(C, ReleasableBuffer)
self.assertNotIsInstance(C(), ReleasableBuffer)

    def test_io_reader_protocol_allowed(self):
        @runtime_checkable
        class CustomReader(io.Reader[bytes], Protocol):
            def close(self): ...

        class A: pass
        class B:
            def read(self, sz=-1):
                return b""
            def close(self):
                pass

        self.assertIsSubclass(B, CustomReader)
        self.assertIsInstance(B(), CustomReader)
        self.assertNotIsSubclass(A, CustomReader)
        self.assertNotIsInstance(A(), CustomReader)

    def test_io_writer_protocol_allowed(self):
        @runtime_checkable
        class CustomWriter(io.Writer[bytes], Protocol):
            def close(self): ...

        class A: pass
        class B:
            def write(self, b):
                pass
            def close(self):
                pass

        self.assertIsSubclass(B, CustomWriter)
        self.assertIsInstance(B(), CustomWriter)
        self.assertNotIsSubclass(A, CustomWriter)
        self.assertNotIsInstance(A(), CustomWriter)

def test_builtin_protocol_allowlist(self):
with self.assertRaises(TypeError):
class CustomProtocol(TestCase, Protocol):
1 change: 1 addition & 0 deletions Lib/typing.py
@@ -1876,6 +1876,7 @@ def _allow_reckless_class_checks(depth=2):
'Reversible', 'Buffer',
],
'contextlib': ['AbstractContextManager', 'AbstractAsyncContextManager'],
'io': ['Reader', 'Writer'],
'os': ['PathLike'],
}
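
The effect of this allowlist can be sketched without `io.Reader` by using `collections.abc.Iterable`, which is already among the listed `collections.abc` classes: a `Protocol` class may inherit from an allowlisted non-protocol ABC, while inheriting from an arbitrary class is rejected at class creation. `ClosableIterable`, `Lines`, `NotAllowlisted`, and `Broken` are hypothetical names used only for this illustration.

```python
from collections.abc import Iterable
from typing import Protocol, runtime_checkable


@runtime_checkable
class ClosableIterable(Iterable, Protocol):
    """Accepted: collections.abc.Iterable is on the allowlist."""

    def close(self): ...


class Lines:
    def __iter__(self):
        return iter(["a", "b"])

    def close(self):
        pass


class NotAllowlisted:
    """An ordinary class: neither a protocol nor on the allowlist."""


try:
    class Broken(NotAllowlisted, Protocol):
        pass
except TypeError as exc:
    error = str(exc)  # protocols can only inherit from other protocols
else:
    error = None

ok = isinstance(Lines(), ClosableIterable)
```

Adding `'io': ['Reader', 'Writer']` gives the two new classes the same treatment, which is what the `test_io_reader_protocol_allowed` and `test_io_writer_protocol_allowed` tests exercise.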

@@ -0,0 +1,3 @@
Add protocols :class:`io.Reader` and :class:`io.Writer` as
alternatives to :class:`typing.IO`, :class:`typing.TextIO`, and
:class:`typing.BinaryIO`.