Should we do anything about timeouts for subparts of an operation? #686
As usual, a possible answer came to me while I was writing down the question in detail (not sure it is the answer, though): be honest about what's happening and separate getting a connection from the pool from making the actual request, so that a timeout can be imposed on either one separately or on both at once. I'm not sure how to prevent user code from grabbing a connection and monopolizing it, but then I'm not sure this needs to be prevented. This requires departing from a requests-like API, but... is that necessarily a problem when there is no de facto standard library for asynchronous HTTP (the way requests is for synchronous HTTP)?

P.S.: That reminds me: this, of course, is equally relevant to the design of requests3, so maybe @kennethreitz has already thought of something.
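A minimal sketch of what such a split could look like, assuming a hypothetical `acquire_connection()`/`release_connection()` pair (asks has no such calls today); the point is only that each step can then sit under its own trio cancel scope, or both under one outer scope:

```python
import trio

async def fetch(session, url):
    # Hypothetical two-step API: acquire_connection() and conn.get() are
    # stand-ins for "get a connection from the pool" and "make the request".
    with trio.fail_after(30):      # time limit on waiting for a free connection
        conn = await session.acquire_connection(url)
    try:
        with trio.fail_after(5):   # time limit on the server's response only
            return await conn.get(url)
    finally:
        await session.release_connection(conn)
```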
To further your dismay, there are actually three: one is applied to each iteration of a streamed response body. We could do something nuts like expose a series of steps around which the user of asks can implement their own timeouts with their async lib of choice, but to what gain? Is it worth it? We could break this out into an optional second API that treats creating connections, getting them from the pool, sending the request, reading the headers and status line, reading the body, etc. as individual steps. Treating those steps individually, we can get this "trio timeout" ethos into asks, or any other lib. Fundamentally we write libs to simplify. I don't think it's a great idea to remove the current style of API, but I'm also not against exposing a second, more granular API for greater user control.
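For the streamed-body case specifically, a per-chunk deadline is already expressible today with nothing but trio's cancel scopes, assuming the body is exposed as an async iterator of byte chunks (a sketch of the pattern, not asks' actual streaming API):

```python
import trio

async def read_with_per_chunk_timeout(body, chunk_timeout=10):
    # `body` is assumed to be an async iterator yielding bytes chunks.
    # Each iteration gets its own deadline, which is roughly what a
    # per-read timeout argument would buy, expressed with cancel scopes.
    chunks = []
    it = body.__aiter__()
    while True:
        with trio.move_on_after(chunk_timeout) as scope:
            try:
                chunk = await it.__anext__()
            except StopAsyncIteration:
                break
        if scope.cancelled_caught:
            raise TimeoutError(f"no data for {chunk_timeout} seconds")
        chunks.append(chunk)
    return b"".join(chunks)
```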
Hi,

> There is, at any given moment in my application, a pool of 10 to 100 HTTP connections managed by asks, and thousands of tasks trying to make an HTTP request. If I enclose the `request`/`get`/... call in trio's `move_on_after` (https://trio.readthedocs.io/en/latest/reference-core.html#a-simple-timeout-example), the thing that would be time-limited is not the server's response time, but that time plus the time to acquire a connection from the pool. Which might be what's needed in a particular situation, but not in mine.
Splitting the get-a-connection-from-the-pool step from the rest looks like a requirement in this case. Consider a server with a 5-minute session cookie timeout: you do not want your interaction with this server blocked in the middle because you're out of connections and everybody else happens to be ssslllooowww.
-- Matthias Urlichs
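To make the scenario concrete, here is roughly what explicit reservation could look like. This is a sketch only: `reserve_connection()` and `release_connection()` are hypothetical names, not part of asks; the point is just that the wait for a free connection happens once, up front, under its own deadline, rather than silently in the middle of the session-cookie window.

```python
import trio

async def session_work(session, base_url):
    # Hypothetical API: reserve_connection()/release_connection() do not
    # exist in asks.  They stand in for "take a connection out of the pool
    # and keep it for the whole interaction".
    with trio.fail_after(60):          # bound only the wait for a free connection
        conn = await session.reserve_connection(base_url)
    try:
        with trio.fail_after(240):     # whole interaction, comfortably under 5 minutes
            await conn.post(base_url + "/login", data={"user": "me"})
            return await conn.get(base_url + "/export")
    finally:
        await session.release_connection(conn)
```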
@theelous3 Well, if the question is why I don't like having a timeout argument in asks, the answer is explained at length in my first link above (including the specific case of the original requests library), but the short version is that the composability sucks, i.e. it becomes extremely tedious to say "I want such-and-such done in five seconds" when such-and-such involves async operations other than a single asks call, the common result being that nobody cares to do so (and ultimately software that appears to hang).

Your example of a one-second network operation taking more like one and a half seconds, in particular, sounds like a bug to be fixed in the underlying network API rather than harsh reality to be dealt with. (Let's assume we're not CPU-bound and keep the OS's preemptive scheduling out of the picture.)

If you're finding it curious that I'm speaking about bunching things together while suggesting that the asks request call be split, well, so am I; but if we (as trio users) settled on a design principle (here: don't roll your own timeouts), we'd better apply it across the board, and my point is that we can't, really, given the current asks API.
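A short illustration of the composability point, using only real trio primitives; `fetch_a` and `fetch_b` are placeholder coroutines standing in for any mix of async work:

```python
import trio

async def gather_report(fetch_a, fetch_b):
    # "I want such-and-such done in five seconds": a single cancel scope
    # covers both calls and anything in between, so no per-call timeout
    # budgeting is needed.
    with trio.fail_after(5):     # raises trio.TooSlowError at the deadline
        a = await fetch_a()
        b = await fetch_b()
    return a, b
```

The catch, as described above, is that when `fetch_a` is an asks call against a busy pool, those five seconds also silently include however long the task waits for a pooled connection.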
@smurfix Well, yeah, but this appears to be a separate point: reserving a connection explicitly, as opposed to separating the time to reserve from the time to make a request; I was specifically thinking of a way to do the latter without doing the former. The rabbit hole of a general API for reserving connections is just too deep: I'll then want to reserve five connections out of the common pool and put them into a smaller pool to use on my own, or pool connections to separate domains separately (as web browsers commonly do), &c. I'd rather not try to solve that without an actual use case.
I totally get why asks has the connection limit (and aiohttp does something similar), but over time I'm coming to believe that this is just the wrong level to do connection limiting. If you set the asks connection pool to unlimited, and then do your own limiting around the code that issues requests, the waiting happens at a level you control and can put a deadline on.

I'm a bit less dogmatic about the "read timeout" than I used to be... it's absurd that there's no good way for requests to provide an overall timeout, but even once we solve that, the "read timeout" is still somewhat useful – there are lots of cases where it will successfully catch things, and you don't have to think about the size of the file being downloaded when you set it.
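A sketch of that arrangement with real trio primitives; `get_unpooled` is a placeholder for a request made against a session whose connection pool is effectively unbounded:

```python
import trio

# Concurrency is limited here, at the application level, instead of
# inside the HTTP library's connection pool.
request_limiter = trio.CapacityLimiter(50)

async def timed_fetch(get_unpooled, url):
    async with request_limiter:        # waiting for a slot happens here, visibly...
        with trio.fail_after(5):       # ...so this deadline covers only the request
            return await get_unpooled(url)
```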
Renamed this to reflect the open question. I think the answer is "no" -- individual APIs of outside libraries can take a timeout argument if there's some subpart they want to have a separate timeout on; I can't think of a way to make subpart timeouts "better" analogously to how Trio's cancel scopes make entire-operation timeouts "better". But leaving open for now in case anyone has a different opinion.
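For concreteness, here is one way a library could expose such a subpart timeout as an ordinary argument, implemented internally with a trio cancel scope; `conn.read_headers()` and `conn.read_body()` are made-up internal steps, not asks API:

```python
import trio

async def get_response(conn, header_timeout=30):
    # A library-internal subpart timeout: an ordinary keyword argument on
    # the outside, a cancel scope on the inside.
    with trio.fail_after(header_timeout):   # bounds only the header phase
        headers = await conn.read_headers()
    body = await conn.read_body()           # left to the caller's outer deadline
    return headers, body
```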
Closing because we haven't seen discussion of this - but if you've found this issue and have an idea, please do post it below!
@njsmith has written rather a lot on timeouts and why they should be provided generically by the asynchronous processing library, instead of being passed as an additional parameter to the individual calls.
Imagine my dismay when I saw not one, but two separate timeout arguments in @theelous3’s asks HTTP client library (the only one currently available with trio support). And the accompanying discussion, frankly speaking, felt extremely underwhelming from the point of view of @njsmith’s argument (“what thing is to be accomplished” as opposed to “what sequence of things is going to be performed”).
... And then I realized there was, in my particular context, no obvious way to do better. There is, at any given moment in my application, a pool of 10 to 100 HTTP connections managed by asks, and thousands of tasks trying to make an HTTP request. If I enclose the `request`/`get`/... call in trio's `move_on_after`, the thing that would be time-limited is not the server's response time, but that time plus the time to acquire a connection from the pool. Which might be what's needed in a particular situation, but not in mine.

So, I guess my question is both to @njsmith and to @theelous3: how can trio's timeout API be usefully applied (or extended to apply) in the situation asks faces?
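The shape of the problem, roughly as the code looks today (session construction details vary by asks version; `connections=20` is just an illustrative bound):

```python
import asks
import trio

session = asks.Session(connections=20)   # bounded connection pool

async def fetch(url):
    with trio.move_on_after(5) as scope:
        # This five-second budget covers BOTH waiting for one of the 20
        # pooled connections to free up AND the server's response.
        resp = await session.get(url)
    if scope.cancelled_caught:
        return None                      # timed out, but we can't tell in which phase
    return resp
```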