Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP2: HttpClient is still sending DATA after RST #1510

Closed
jkotalik opened this issue Aug 27, 2019 · 25 comments
Closed

HTTP2: HttpClient is still sending DATA after RST #1510

jkotalik opened this issue Aug 27, 2019 · 25 comments
Assignees
Labels
area-System.Net.Http bug tenet-reliability Reliability/stability related issue (stress, load problems, etc.)
Milestone

Comments

@jkotalik
Copy link
Contributor

I found this while running the Http client stress test. From what I can tell, this is after dotnet/corefx#40180 was merged, which should have fixed this issue.

What I see is that the client is sending a RST frame and then sending a data frame afterward.

       .NET Core: 3.0.0-preview9-19415-13
    ASP.NET Core: 3.0.0-preview9.19416.11
 System.Net.Http: C:\Program Files\dotnet\shared\Microsoft.NETCore.App\3.0.0-preview9-19415-13\System.Net.Http.dll, last modified 8/16/2019 4:12:38 AM
          Server: Kestrel
      Server URL: http://127.0.0.1:5001/
         Tracing: False
     ASP.NET Log: True
     Concurrency: 12
  Content Length: 1000
   HTTP2 Version: 2.0
        Lifetime: (infinite)
      Operations:  0: GET,  1: GET Partial,  2: GET Headers,  3: GET Parameters,  4: GET Aborted,  5: POST,  6: POST Multipart Data,  7: POST Duplex,  8: POST Duplex Slow,  9: POST Duplex String, 10: POST Duplex Dispose, 11: POST ExpectContinue, 12: HEAD, 13: PUT, 14: PUT Slow, 15: GET Slow
     Random Seed: 515848072
    Cancellation: 0%
Max Content Size: 1000
Query Parameters: 1

Here is the pcap file: https://1drv.ms/u/s!At4nHcZpt7bWwRu60uYm66LT0Yce?e=apU9jY. Look for stream-id 44339, you will see a rst and then a data frame from the client to the server.

On the server, we are seeing errors like:

fail: Microsoft.AspNetCore.Server.Kestrel[13]
      Connection id "0HLPAGU4RFFT0", Request id "0HLPAGU4RFFT0:0000AD3F": An unhandled exception was thrown by the application.
System.IO.IOException: The request stream was aborted.
 ---> Microsoft.AspNetCore.Connections.ConnectionAbortedException: The HTTP/2 connection faulted.
 ---> Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http2.Http2ConnectionErrorException: HTTP/2 connection error (STREAM_CLOSED): A frame of type DATA was received after stream 44339 was reset or aborted.
   at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http2.Http2Connection.ProcessDataFrameAsync(ReadOnlySequence`1& payload) in src\Servers\Kestrel\Core\src\Internal\Http2\Http2Connection.cs:line 476
   at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http2.Http2Connection.ProcessFrameAsync[TContext](IHttpApplication`1 application, ReadOnlySequence`1& payload) in src\Servers\Kestrel\Core\src\Internal\Http2\Http2Connection.cs:line 423
   at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http2.Http2Connection.ProcessRequestsAsync[TContext](IHttpApplication`1 application) in src\Servers\Kestrel\Core\src\Internal\Http2\Http2Connection.cs:line 222
   --- End of inner exception stack trace ---
   --- End of inner exception stack trace ---
   at System.IO.Pipelines.PipeCompletion.ThrowLatchedException()
   at System.IO.Pipelines.Pipe.GetReadResult(ReadResult& result)
   at System.IO.Pipelines.Pipe.GetReadAsyncResult()
   at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http2.Http2MessageBody.ReadAsync(CancellationToken cancellationToken) in src\Servers\Kestrel\Core\src\Internal\Http2\Http2MessageBody.cs:line 102
   at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.HttpRequestStream.CopyToAsyncInternal(Stream destination, CancellationToken cancellationToken) in src\Servers\Kestrel\Core\src\Internal\Http\HttpRequestStream.cs:line 203
   at HttpStress.StressServer.<>c.<<MapRoutes>b__7_6>d.MoveNext() in C:\Users\jukotali\code\Secondary\corefx\src\System.Net.Http\tests\StressTests\HttpStress\StressServer.cs:line 207
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.AspNetCore.Routing.EndpointMiddleware.<Invoke>g__AwaitRequestTask|6_0(Endpoint endpoint, Task requestTask, ILogger logger)
   at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.HttpProtocol.ProcessRequests[TContext](IHttpApplication`1 application) in src\Servers\Kestrel\Core\src\Internal\Http\HttpProtocol.cs:line 618
@davidsh
Copy link
Contributor

davidsh commented Aug 27, 2019

@wfurt
Copy link
Member

wfurt commented Aug 27, 2019

Is this duplex stream scenario? As far as I can tell the dotnet/corefx#40180 was intended to avoid locking issue, not sequencing described here.

As far as I can tell the chain of WriteAsync() from Http2WriteStream(), SendDataAsync() and SendStreamDataAsync() does not have any state check so it would not know if RST was already sent.

@geoffkizer
Copy link
Contributor

@wfurt those methods get called while sending the request body in SendRequestBodyAsync. We should never send the RST_STREAM until the exception propagates up to the catch in SendRequestBodyAsync, and after than we shouldn't ever be trying to send any more request body.

@geoffkizer
Copy link
Contributor

@jkotalik How easy is it to reproduce this? Do you see client errors too? (I assume so, since it looks like Kestrel is aborting the connection, which should cause requests to fail.) What client error do you see?

It would be interesting to try to narrow down the stress cases here. I assume this is cancellation-related (since we're sending RST_STREAM).

@wfurt
Copy link
Member

wfurt commented Aug 27, 2019

@eiriktsarpalis will give it more try tomorrow @geoffkizer. It seems like it takes few hours to hit this but we should be able to isolate this to specific test.

@jkotalik
Copy link
Contributor Author

I was able to see it with a higher frequency than every hour. I was seeing every 15 minutes. @eiriktsarpalis has already started investigating.

@wfurt
Copy link
Member

wfurt commented Aug 27, 2019

I uploaded interesting portion of original capture @geoffkizer. (wireshark will complain about missing ACK but it should still decode HTTP2)

http2.small.pcapng.zip

The sequence for stream 44339 is:

packet 6: GET /duplex
packet 211: 200 from server
packet 265: client sends RST(cancel)
packet 271: client sends weird data packet

It shows data length of 0.

However it also shows frame 271: 2198 on Wire, 1106 bytes captures. So it can be incomplete capture.

It would be nice to add some X-Test-Info: header @eiriktsarpalis so we can can correlate back to test and test parameters from packet capture.

Since the cancelation comes almost at the same time as request is made, I assume we are canceling explicitly, not because of timeout.

@geoffkizer
Copy link
Contributor

Does packet 271 also show EndStream set? We don't ever send a 0-length packet except when we send EndStream.

@eiriktsarpalis
Copy link
Member

I haven't been able to reproduce the error. Using stress suite from master and the latest 3.0.1xx nightlies with the following arguments:

dotnet run -c Release -- -cancelRate 0 -serverUri http://localhost:5001 -clientTimeout 100 -aspnetlog

Mostly produce the following errors:

fail: Microsoft.AspNetCore.Server.Kestrel[13]
      Connection id "0HLPBRSILHQ0H", Request id "0HLPBRSILHQ0H:00005D01": An unhandled exception was thrown by the application.
System.IO.IOException: The client reset the request stream.
   at System.IO.Pipelines.PipeCompletion.ThrowLatchedException()
   at System.IO.Pipelines.Pipe.GetReadResult(ReadResult& result)
   at System.IO.Pipelines.Pipe.GetReadAsyncResult()
   at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http2.Http2MessageBody.ReadAsync(CancellationToken cancellationToken)
   at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.HttpRequestStream.CopyToAsyncInternal(Stream destination, CancellationToken cancellationToken)
   at HttpStress.StressServer.<>c.<<MapRoutes>b__7_10>d.MoveNext() in C:\Users\eitsarpa\devel\dotnet\corefx\src\System.Net.Http\tests\StressTests\HttpStress\StressServer.cs:line 243
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.AspNetCore.Routing.EndpointMiddleware.<Invoke>g__AwaitRequestTask|6_0(Endpoint endpoint, Task requestTask, ILogger logger)
   at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.HttpProtocol.ProcessRequests[TContext](IHttpApplication`1 application)

Which however are expected, they emanate from the POST Duplex Dispose operation which dispose of the response before reading the entire content. Once I disable that operation (-xops 9) these errors go away.

@wfurt
Copy link
Member

wfurt commented Aug 28, 2019

If that is race condition it can depend on machine configuration.

BTW the packet capture seems incomplete and missing data @jkotalik. I don't know if that is weirdness of loopback. Did we ever try this with two machines @eiriktsarpalis ?

@eiriktsarpalis
Copy link
Member

Did we ever try this with two machines?

Not really, still trying to get it to repro locally.

@eiriktsarpalis
Copy link
Member

@jkotalik could you provide a hardcoded configuration of the stress runner that repro's the issue? I haven't been able to hit it at all.

@eiriktsarpalis
Copy link
Member

This issue hasn't been updated in a while. Should I close or maybe change the milestone?

@jkotalik
Copy link
Contributor Author

jkotalik commented Sep 3, 2019

Let's change the milestone for now. I'm still busy debugging https://github.com/dotnet/corefx/issues/40194, which is higher priority than this. I'll try to repro this again once I get the chance.

@karelz karelz unassigned wfurt Sep 9, 2019
@karelz
Copy link
Member

karelz commented Sep 9, 2019

Moved to 5.0 as we do not have anything actionable now.

@jkotalik
Copy link
Contributor Author

I'm going to work on reproing now that dotnet/corefx#40194 is resolved. Assigning myself for now.

@jkotalik jkotalik self-assigned this Sep 10, 2019
@eiriktsarpalis
Copy link
Member

I'm able to reproduce using the latest nightlies.

@karelz
Copy link
Member

karelz commented Oct 4, 2019

@eiriktsarpalis it is assigned to you - are you working on it actively?

@eiriktsarpalis
Copy link
Member

No, this has been dropped on the floor since my trip. I can unassign since I'm focusing on load testing.

@eiriktsarpalis eiriktsarpalis removed their assignment Oct 7, 2019
@karelz karelz transferred this issue from dotnet/corefx Jan 9, 2020
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added area-System.Net.Http untriaged New issue has not been triaged by the area owner labels Jan 9, 2020
@karelz karelz added bug and removed untriaged New issue has not been triaged by the area owner labels Jan 9, 2020
@karelz karelz added the tenet-reliability Reliability/stability related issue (stress, load problems, etc.) label Jan 9, 2020
@karelz karelz added this to the 5.0 milestone Jan 9, 2020
@karelz karelz modified the milestones: 5.0, Future Feb 24, 2020
@karelz karelz modified the milestones: Future, 5.0 May 8, 2020
@karelz karelz changed the title HttpClient is still sending DATA after RST HTTP2: HttpClient is still sending DATA after RST May 29, 2020
@karelz karelz modified the milestones: 5.0.0, 6.0.0 Aug 13, 2020
@karelz
Copy link
Member

karelz commented Aug 13, 2020

Moving to 6.0 and unassigning.
Next step: Figure out if the issue is still reproducible.

@alnikola
Copy link
Contributor

It seems to be caused by the same RST_STREAM vs SendEndStream race condition that is explained in #42200

@karelz
Copy link
Member

karelz commented Jun 21, 2021

@alnikola should we close it as duplicate then?

@alnikola
Copy link
Contributor

Yes, let's close it as duplicate.

@alnikola
Copy link
Contributor

This log analysis proves that there is the same root cause as in #42200.
Duplicate of #42200

@karelz
Copy link
Member

karelz commented Jun 21, 2021

Duplicate of #42200

@karelz karelz marked this as a duplicate of #42200 Jun 21, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Jul 21, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Net.Http bug tenet-reliability Reliability/stability related issue (stress, load problems, etc.)
Projects
None yet
Development

No branches or pull requests

9 participants