Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http: new style WebSockets, where headers and data are processed by the filter chain. #3776

Merged
merged 8 commits into from
Jul 12, 2018

Conversation

alyssawilk
Copy link
Contributor

@alyssawilk alyssawilk commented Jul 2, 2018

This is the complete HTTP/1.1 implementation of #3301, new style websockets.

I believe it preserves existing behavior for "old style" websockets except for handling transfer-encoding requests (we all agree shouldn't happen) and responses (actually could happen and have been requested) better.

Risk Level: High (should be self contained but still lots of core code changes)
Testing: Thorough integration tests. unit tests for http1 codec
Docs Changes: added websocket FAQ
Release Notes: added

Fixes #3301 (modulo timeouts not working, which will be addressed by #3654 or #1778)

I'd like some subset of @ggreenway @rshriram @PiotrSikora and @mattklein123 to take a look at how this handles bodies. I think it comes out really cleanly.

@rshriram has requested we handle response bodies. I can't find anything in the spec which says upgrades aren't allowed on the request path as well. I would love Envoy to reject all GETs-with-bodies but I consider that to be orthogonal to this PR (though anyone can open an issue and file it my way and I'll tackle it as a part of this refactor). I'd be happy to lock websocket upgrades to GET since that's required by the spec, but for generic upgrades I believe POST upgrades are allowed, and as it turns out it's easier to just through-proxy everything than special case just the request path to disallow bodies. That said there was enough discussion on the issue I don't want to write thorough unit tests until there's consensus on what to do here. So comment away and we'll try to get consensus and then I'll write the tests.

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks great. I didn't review the tests yet. Would be great for @rshriram and @ggreenway to also review. I have a few comments/questions, will review tests after.

@@ -447,7 +447,7 @@ const Network::Connection* ConnectionManagerImpl::ActiveStream::connection() {

void ConnectionManagerImpl::ActiveStream::decodeHeaders(HeaderMapPtr&& headers, bool end_stream) {
request_headers_ = std::move(headers);
createFilterChain();
bool upgrade_rejected = createFilterChain() == false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: const

@@ -569,7 +569,7 @@ void ConnectionManagerImpl::ActiveStream::decodeHeaders(HeaderMapPtr&& headers,
connection_manager_.stats_.named_.downstream_cx_http1_active_.dec();
connection_manager_.stats_.named_.downstream_cx_websocket_total_.inc();
return;
} else if (websocket_requested) {
} else if (websocket_requested && upgrade_rejected) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be || ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure what the logic should be here, but it seems like there should be an error case anytime upgrade_rejected is true, whether or not websocket_requested is true. I think that can only be hit for other upgrade types?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be &&.. the top if block checks for upgrade_requested && upgrade_allowed. the bottom if block then returns a 404 if client requests upgrade but we have not configured the route to upgrade for websockets.. The filter chain stuff eliminates the need to explicitly specify upgrades in the routes. So, if client requests upgrades and the route doesn't specify an upgrade, we fail the request if and only if the auto upgrade attempt (aka upgrade_rejected = createFilterChain() ) fails.

@@ -942,7 +942,9 @@ void ConnectionManagerImpl::ActiveStream::encodeHeaders(ActiveStreamEncoderFilte

if (connection_manager_.drain_state_ == DrainState::Closing &&
connection_manager_.codec_->protocol() != Protocol::Http2) {
headers.insertConnection().value().setReference(Headers::get().ConnectionValues.Close);
if (headers.Connection() == nullptr || headers.Connection()->value() != "Upgrade") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment here for why we don't add the header if this is an upgrade?

@@ -94,7 +94,7 @@ void StreamEncoderImpl::encodeHeaders(const HeaderMap& headers, bool end_stream)
Headers::get().TransferEncoding.get().size(),
Headers::get().TransferEncodingValues.Chunked.c_str(),
Headers::get().TransferEncodingValues.Chunked.size());
chunk_encoding_ = true;
chunk_encoding_ = headers.Upgrade() == nullptr;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is going to conflict with your HEAD change so maybe merge that first?

@@ -324,6 +347,8 @@ void ConnectionImpl::dispatch(Buffer::Instance& data) {

ENVOY_CONN_LOG(trace, "parsed {} bytes", connection_, total_parsed);
data.drain(total_parsed);

maybeDirectDispatch(data);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the idea here that we may have paused during the initial loop and now have to do direct dispatch? Can you add a comment?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same question here. Comments would help.

Copy link
Contributor

@ggreenway ggreenway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the approach looks reasonable. I don't have strong opinions about the body-handling code; it's not too intrusive.

@@ -275,6 +274,10 @@ message HttpConnectionManager {
// The current implementation of upgrade headers does not handle
// multi-valued upgrade headers. Support for multi-valued headers may be
// added in the future if needed.
//
// .. warning::
// The current iplementation of upgrade headers does not work with HTTP/2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spelling: implementation

@@ -569,7 +569,7 @@ void ConnectionManagerImpl::ActiveStream::decodeHeaders(HeaderMapPtr&& headers,
connection_manager_.stats_.named_.downstream_cx_http1_active_.dec();
connection_manager_.stats_.named_.downstream_cx_websocket_total_.inc();
return;
} else if (websocket_requested) {
} else if (websocket_requested && upgrade_rejected) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure what the logic should be here, but it seems like there should be an error case anytime upgrade_rejected is true, whether or not websocket_requested is true. I think that can only be hit for other upgrade types?

connection_manager_.config_.filterFactory().createFilterChain(*this);
bool ConnectionManagerImpl::ActiveStream::createFilterChain() {
auto upgrade = request_headers_->Upgrade();
if (upgrade != nullptr) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about the case where a user just wants to pass all upgrades to their upstream http server (because they want to deny upgrades, but not in a direct response from envoy)? I think we should always fallback to the non-upgrade filterChain if there is not a configured filter chain for this upgrade type.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

websocket is always point to point isn't it? even the current implementation is point to point to the extent that we upgrade and hope that the upstream will properly terminate the connection if it doesn't want to upgrade.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine doing this for now, but note that for most early responses we no longer pass through the filter chain. There's an open issue for responses to only pass through the filters which saw the request so I believe this behavior will not persist in the long run.

and upstreams. It detects requests with "Upgrade: websocket" headers and upgrades from HTTP processing
mode to TCP proxy mode, establishing a raw TCP connection upstream, forwarding the raw headers, any
HTTP body, and WebSocket payload unmodified by the HTTP filter chain.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a note: if nobody other than Istio and cloud foundry are using the websocket upgrade stuff, I would be happy to drop this explicit "allow_websocket_upgrade: true" immediately. Its easy for us to configure the filter chains for websockets (and CF folks would do the same as they use Pilot code)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As tempted as I am to clean up code early, I don't think there's a way to poll all websocket users, so I plan on the normal deprecation timeline :-)

Copy link
Contributor Author

@alyssawilk alyssawilk Jul 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh and worth noting, you may need timeouts to work properly before switching over. As part of making the tests old-and-new-style compatible I discovered Envoy doesn't have proper request timeouts and due to the bidi nature of websockets and the way current Envoy timers work, they'll linger forever :-/

I believe @htuch was planning on picking up #3654

If that blocks you from moving over I'll make sure he or I pick it up in the next 1-2 weeks since I'd really like the new code field tested in prod before I deprecate the old code. I'm not going to deprecate the old way of doing things until the timeouts are fixed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alyssawilk I'm working on #1778 rather than #3654.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah noted. Thankfully that'll work just as well :-)

Copy link
Member

@rshriram rshriram left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The approach looks good to me.
Wonder if H2 tunneling is equally simple or involves more work.

Copy link
Member

@rshriram rshriram left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more comments inline.

will we be able to use tracing code with this or with h2 ? Given that we enabled full filter chain, having tracing would be very helpful. If not a simple filter that does createChildSpan() Call will suffice I “think”.

Also are we providing any other retry or timeout semantics for the upgrade part of the connection ?

auto upgrade = request_headers_->Upgrade();
if (upgrade != nullptr) {
if (!connection_manager_.config_.filterFactory().createUpgradeFilterChain(
upgrade->value().c_str(), *this)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this already implemented? If not, a suggestion here: we need a way to yell filters that they are in upgrade filter chain and not http filter chain.

Second, what happens if this returns false? Will connection fail? If so, won’t it break existing websocket behavior where users don’t have to define a filter chain?

Unlike http we don’t need a mandatory route filter here do we? Or we could have one and auto instantiate if not present.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we fail over to normal sendLocalReply behavior where we reject the request. I'll add a comment here for clarity

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh and createUpgradeFilterChain was added in #3685
I'm inclined to say if you configure a filter for upgrades and they need to know about upgrades they should check the header.

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
@alyssawilk
Copy link
Contributor Author

FWIW I think H2 will come out pretty cleanly once we can implement it but it's currently blocked on
nghttp2/nghttp2#1181 because nghttp2 (per the spec) currently rejects a bunch of the headers we need for the CONNECT style upgrades. I plan on patching in their work in progress and getting a PR ready but apparently it's going to be 6-8 weeks until the RFC lands plus time for nghttp2 work plus time for Envoy review so at likely 3 months out :-/

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
@alyssawilk alyssawilk force-pushed the websocket_delay branch 2 times, most recently from 22e526d to bef5ef6 Compare July 9, 2018 21:15
@alyssawilk alyssawilk changed the title WIP: http: new style WebSockets, where headers and data are processed by the filter chain. http: new style WebSockets, where headers and data are processed by the filter chain. Jul 9, 2018
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
@alyssawilk
Copy link
Contributor Author

@mattklein123 I think this is ready for another look whenever you get a chance

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just some questions. Nice work!

} else {
upgrade_rejected = true;
// Fall through to the default filter chain. The function calling this
// will send a local reply indicating that the upgrade failed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means that this local reply will go through the encode filter chain for the default, right? SGTM if so.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct.

@@ -101,9 +101,12 @@ void StreamEncoderImpl::encodeHeaders(const HeaderMap& headers, bool end_stream)
Headers::get().TransferEncoding.get().size(),
Headers::get().TransferEncodingValues.Chunked.c_str(),
Headers::get().TransferEncodingValues.Chunked.size());
// We do not aply chunk encoding for HTTP upgrades. Any incoming chunks
// will be through-proxied by maybeDirectDispatch untouched.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't not chunk encoding the stream orthogonal from how decoding happens on the other end? I think maybe the comment just needs clarification that in the upgrade case we don't chunk encode because we are basically dealing with "raw" bytes on the decode side? (Rephrase however you want just trying to make sure I and future readers understand.)

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
// We do not aply chunk encoding for HTTP upgrades.
// If there is a body in a WebSocket Upgrade response, the chunks will be
// passed through via maybeDirectDispatch so we need to avoid appending
// extra chunk boundaries endEncode.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the "endEncode" is errant here. Or do you mean "... boundaries in encodeData and endEncode." ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

turns out we don't call encodeData (because we direct dispatch) but we do call endEncode (because reasons), but I think documenting that is more likely to cause confusion so I'll just remove it :-)

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
@alyssawilk alyssawilk merged commit 95c3e13 into envoyproxy:master Jul 12, 2018
snowp pushed a commit to snowp/envoy that referenced this pull request Jul 12, 2018
* origin/master:
  config: making v2-config-only a boolean flag (envoyproxy#3847)
  lc trie: add exclusive flag. (envoyproxy#3825)
  upstream: introduce PriorityStateManager, refactor EDS (envoyproxy#3783)
  test: deflaking header_integration_test (envoyproxy#3849)
  http: new style WebSockets, where headers and data are processed by the filter chain. (envoyproxy#3776)
  common: minor doc updates (envoyproxy#3845)
  fix master build (envoyproxy#3844)
  logging: Requiring details for RELEASE_ASSERT (envoyproxy#3842)
  test: add test for consistency of RawStatData internal memory representation (envoyproxy#3843)
  common: jittered backoff implementation (envoyproxy#3791)
  format: run buildifier on .bzl files. (envoyproxy#3824)
  Support mutable metadata for endpoints (envoyproxy#3814)
  test: deflaking a test, improving debugability (envoyproxy#3829)
  Update ApiConfigSource docs with grpc_services only for GRPC configs (envoyproxy#3834)
  Add hard-coded /hot_restart_version test (envoyproxy#3832)
  healthchecks: Add interval_jitter_percent healthcheck option (envoyproxy#3816)

Signed-off-by: Snow Pettersen <snowp@squareup.com>
@alyssawilk alyssawilk deleted the websocket_delay branch November 28, 2018 16:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants