-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HTTP2: HttpClient is still sending DATA after RST #1510
Comments
Is this duplex stream scenario? As far as I can tell the dotnet/corefx#40180 was intended to avoid locking issue, not sequencing described here. As far as I can tell the chain of WriteAsync() from Http2WriteStream(), SendDataAsync() and SendStreamDataAsync() does not have any state check so it would not know if RST was already sent. |
@wfurt those methods get called while sending the request body in SendRequestBodyAsync. We should never send the RST_STREAM until the exception propagates up to the catch in SendRequestBodyAsync, and after than we shouldn't ever be trying to send any more request body. |
@jkotalik How easy is it to reproduce this? Do you see client errors too? (I assume so, since it looks like Kestrel is aborting the connection, which should cause requests to fail.) What client error do you see? It would be interesting to try to narrow down the stress cases here. I assume this is cancellation-related (since we're sending RST_STREAM). |
@eiriktsarpalis will give it more try tomorrow @geoffkizer. It seems like it takes few hours to hit this but we should be able to isolate this to specific test. |
I was able to see it with a higher frequency than every hour. I was seeing every 15 minutes. @eiriktsarpalis has already started investigating. |
I uploaded interesting portion of original capture @geoffkizer. (wireshark will complain about missing ACK but it should still decode HTTP2) The sequence for stream 44339 is: packet 6: GET /duplex It shows data length of 0. However it also shows frame 271: 2198 on Wire, 1106 bytes captures. So it can be incomplete capture. It would be nice to add some X-Test-Info: header @eiriktsarpalis so we can can correlate back to test and test parameters from packet capture. Since the cancelation comes almost at the same time as request is made, I assume we are canceling explicitly, not because of timeout. |
Does packet 271 also show EndStream set? We don't ever send a 0-length packet except when we send EndStream. |
I haven't been able to reproduce the error. Using stress suite from master and the latest 3.0.1xx nightlies with the following arguments:
Mostly produce the following errors:
Which however are expected, they emanate from the |
If that is race condition it can depend on machine configuration. BTW the packet capture seems incomplete and missing data @jkotalik. I don't know if that is weirdness of loopback. Did we ever try this with two machines @eiriktsarpalis ? |
Not really, still trying to get it to repro locally. |
@jkotalik could you provide a hardcoded configuration of the stress runner that repro's the issue? I haven't been able to hit it at all. |
This issue hasn't been updated in a while. Should I close or maybe change the milestone? |
Let's change the milestone for now. I'm still busy debugging https://github.com/dotnet/corefx/issues/40194, which is higher priority than this. I'll try to repro this again once I get the chance. |
Moved to 5.0 as we do not have anything actionable now. |
I'm going to work on reproing now that dotnet/corefx#40194 is resolved. Assigning myself for now. |
I'm able to reproduce using the latest nightlies. |
@eiriktsarpalis it is assigned to you - are you working on it actively? |
No, this has been dropped on the floor since my trip. I can unassign since I'm focusing on load testing. |
Moving to 6.0 and unassigning. |
It seems to be caused by the same RST_STREAM vs SendEndStream race condition that is explained in #42200 |
@alnikola should we close it as duplicate then? |
Yes, let's close it as duplicate. |
This log analysis proves that there is the same root cause as in #42200. |
Duplicate of #42200 |
I found this while running the Http client stress test. From what I can tell, this is after dotnet/corefx#40180 was merged, which should have fixed this issue.
What I see is that the client is sending a RST frame and then sending a data frame afterward.
Here is the pcap file: https://1drv.ms/u/s!At4nHcZpt7bWwRu60uYm66LT0Yce?e=apU9jY. Look for stream-id 44339, you will see a rst and then a data frame from the client to the server.
On the server, we are seeing errors like:
The text was updated successfully, but these errors were encountered: