Envoy Server drops the connection on Windows when the client certificate is invalid #13191

Closed
davinci26 opened this issue Sep 19, 2020 · 6 comments · Fixed by #13264

Comments

@davinci26
Member

davinci26 commented Sep 19, 2020

Title: Envoy Server drops the connection on Windows when the client certificate is invalid

Description:

When the client certificate is invalid, Envoy behaves differently on Windows and Linux. On Linux (the correct behavior) the server notifies the client of the failure; on Windows it just drops the connection.

  • UNIX: SSL_get_error is SSL_ERROR_SSL and the client has the error TLS error: 268436501:SSL routines:OPENSSL_internal:SSLV3_ALERT_CERTIFICATE_EXPIRED in its error queue
  • Windows: SSL_get_error is SSL_ERROR_SYSCALL and a WSAGetLastError of 10054
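The numeric code in the Linux error string can be unpacked to confirm that it encodes the peer's alert. The sketch below assumes BoringSSL's packed error layout (library in the top 8 bits, reason in the low 12 bits), ERR_LIB_SSL equal to 16, and alert-derived reasons offset by 1000. The helper names are illustrative, not BoringSSL functions.

```cpp
#include <cstdint>

// Assumption: BoringSSL's packed error layout. OpenSSL packs errors differently.
constexpr uint32_t errLib(uint32_t packed) { return (packed >> 24) & 0xff; }
constexpr uint32_t errReason(uint32_t packed) { return packed & 0xfff; }

// Assumption: BoringSSL values. Reasons that come from a peer's alert are
// 1000 plus the TLS alert description.
constexpr uint32_t kErrLibSsl = 16;
constexpr uint32_t kAlertReasonOffset = 1000;
constexpr uint32_t kAlertCertificateExpired = 45;  // TLS certificate_expired

// Value quoted in the report above (0x10000415).
constexpr uint32_t kReportedError = 268436501;

static_assert(errLib(kReportedError) == kErrLibSsl,
              "the error belongs to the SSL library");
static_assert(errReason(kReportedError) ==
                  kAlertReasonOffset + kAlertCertificateExpired,
              "the reason is the certificate_expired alert");
```

Under these assumptions, 268436501 is library 16 with reason 1045, which is the certificate_expired alert (45) and matches the SSLV3_ALERT_CERTIFICATE_EXPIRED text in the error string.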

Information reported by the info callback installed with SSL_CTX_set_info_callback(ctx.ssl_ctx_.get(), apps_ssl_info_callback):

On Linux the alert that is raised is:
SSL error[undefined][TLS client read_session_ticket]: ret: 557 alert type fatal alert desc certificate expired

On Windows the alert that is raised is:
SSL error[SSL_connect][TLS client read_session_ticket]: ret: -1 alert type unknown alert desc unknown
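The two ret values can be read as packed alerts. For alert events, the info callback's ret carries the alert level in the high byte and the alert description in the low byte. A minimal sketch of that decoding follows; the struct and function names are illustrative, not library APIs.

```cpp
// Illustrative decoder for the `ret` argument of an SSL info callback when it
// reports an alert: level in the high byte, description in the low byte.
struct Alert {
  unsigned level;
  unsigned description;
};

constexpr Alert decodeAlert(int ret) {
  return Alert{(static_cast<unsigned>(ret) >> 8) & 0xff,
               static_cast<unsigned>(ret) & 0xff};
}

constexpr unsigned kAlertLevelWarning = 1;
constexpr unsigned kAlertLevelFatal = 2;
constexpr unsigned kCertificateExpired = 45;  // TLS certificate_expired

// Linux: 557 == (2 << 8) | 45, a fatal certificate_expired alert.
static_assert(decodeAlert(557).level == kAlertLevelFatal, "fatal alert");
static_assert(decodeAlert(557).description == kCertificateExpired,
              "certificate_expired");

// Windows: -1 decodes to a level that is neither warning nor fatal,
// which is why it prints as "unknown".
static_assert(decodeAlert(-1).level != kAlertLevelWarning &&
                  decodeAlert(-1).level != kAlertLevelFatal,
              "no recognizable alert");
```

Read this way, the Linux client received a fatal certificate_expired alert, while the -1 on Windows does not correspond to any alert, which is consistent with the client never having read one.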

Repro steps:

Run the test case FailedClientCertificateExpirationVerification in //test/extensions/transport_sockets/tls:ssl_socket_test

cc: @envoyproxy/windows-dev

@davinci26 added the bug, triage, and area/windows labels Sep 19, 2020
@mattklein123 removed the triage label Sep 21, 2020
@yanavlasov
Contributor

@PiotrSikora may be able to suggest further steps.

@PiotrSikora
Contributor

10054 is WSAECONNRESET (Connection reset), which means that the connection with unread data was closed, so it appears that the ordering of network events is different in the underlying TCP stack, and on Windows read() reports connection closed before the SSL/TLS alert can be read from the wire.

However, TLS implements graceful shutdown via close_notify alerts, so this shouldn't be happening...

Does this happen with production code or only in this particular test?

@davinci26
Member Author

Here is the wireshark packet dump from the handshake on Windows:
[image: Wireshark capture of the handshake on Windows]

> Does this happen with production code or only in this particular test?

I haven't tried it in production, but I can confirm that the same behavior occurs in the gRPC tests that use SSL when the client certificate is invalid.

@sunjayBhatia
Member

I can get the tests involving invalid certs to pass by just placing a shutdown() before the socket close in the connection_impl code. The theory about the alert not being delivered to the client is correct, and we have run into this issue in other places before.

It seems that when there is an invalid cert, SSL_shutdown does not occur, as the connection state is not marked as "in progress" or "complete".

The connection is torn down without a proper shutdown, so the client gets a connection reset, as the alert data remains unread.
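The shutdown-before-close ordering can be sketched with plain POSIX sockets. This is only an illustration of the call order. It is not Envoy's connection_impl code and not the change made in #13264. The helper name shutdownThenClose is hypothetical, and on Windows the analogous calls would be shutdown() with SD_SEND followed by closesocket().

```cpp
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper (POSIX only): half-close the write side first so that
// already-written bytes, such as a TLS alert, are followed by an orderly
// end-of-stream, then release the descriptor.
inline bool shutdownThenClose(int fd) {
  const bool shut = ::shutdown(fd, SHUT_WR) == 0;
  const bool closed = ::close(fd) == 0;  // always close, even if shutdown failed
  return shut && closed;
}
```

With this ordering the peer can read whatever was written before the close and then sees a normal end-of-stream. Whether it avoids the reset in every case depends on the platform's TCP stack, so treat it as a starting point rather than a guarantee.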

@sunjayBhatia
Member

Working on a PR. There are a couple of other small issues in the test setup etc., but they shouldn't be too hard to fix.

@sunjayBhatia
Member

See #13264


5 participants