http: use a request queue in the http2 conn pool #4917
Conversation
Modifies the http2 connection pool to use a request queue similar to how the http/1.1 connection pool does. This is done in order to defer the connection pool callbacks until the connection has been established instead of simply invoking the callback immediately. Signed-off-by: Snow Pettersen <snowp@squareup.com>
NOTE: not done, test coverage is lacking and not all integration tests are passing, but putting this up to get some feedback on the high level approach
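For context, here is a minimal, self-contained sketch of the idea being proposed. The names newStream, onPoolReady, onUpstreamReady, and the pending-request queue are borrowed from this PR; everything else is a simplified stand-in, not Envoy's actual classes. The point is just that the pool parks each request until the connection reports it is connected, and only then fires the ready callback.

```cpp
// Illustrative sketch, not Envoy source: defer the pool callback until the
// upstream connection is actually established.
#include <functional>
#include <iostream>
#include <list>

// Stand-in for ConnectionPool::Callbacks.
struct PoolCallbacks {
  std::function<void()> on_pool_ready;
  std::function<void()> on_pool_failure;
};

class Http2ConnPoolSketch {
public:
  // Before: the pool invoked onPoolReady() inline here. With the queueing
  // approach, the request is parked if the single multiplexed connection
  // is not up yet.
  void newStream(PoolCallbacks callbacks) {
    if (connected_) {
      callbacks.on_pool_ready();
      return;
    }
    pending_requests_.push_back(std::move(callbacks));
  }

  // Connection established: drain the queue, creating a codec stream (here,
  // just firing the callback) for every pending request.
  void onUpstreamReady() {
    connected_ = true;
    while (!pending_requests_.empty()) {
      pending_requests_.back().on_pool_ready();
      pending_requests_.pop_back();
    }
  }

  // Raw connect failure: every queued request is failed instead.
  void onConnectFailure() {
    while (!pending_requests_.empty()) {
      pending_requests_.back().on_pool_failure();
      pending_requests_.pop_back();
    }
  }

private:
  bool connected_{false};
  std::list<PoolCallbacks> pending_requests_;
};

int main() {
  Http2ConnPoolSketch pool;
  pool.newStream({[] { std::cout << "request 1 ready\n"; }, [] {}});
  pool.newStream({[] { std::cout << "request 2 ready\n"; }, [] {}});
  pool.onUpstreamReady(); // callbacks fire only once the connection is up
}
```

The real change also has to handle draining, circuit breaking, and cancellation, which the review threads below discuss.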
@@ -308,15 +308,15 @@ TEST_P(Http2UpstreamIntegrationTest, UpstreamConnectionCloseWithManyStreams) {
responses.push_back(std::move(encoder_decoder.second));
// Reset a few streams to test how reset and watermark interact.
if (i % 15 == 0) {
codec_client_->sendReset(*encoders[i]);
// codec_client_->sendReset(*encoders[i]);
Resetting a stream causes waitForEndStream to hang; not sure why this is happening yet.
@@ -132,6 +146,21 @@ void ConnPoolImpl::onConnectionEvent(ActiveClient& client, Network::ConnectionEv
if (client.connect_timer_) {
host_->cluster().stats().upstream_cx_connect_fail_.inc();
host_->stats().cx_connect_fail_.inc();

// Raw connect failures should never happen under normal circumstances. If we have an upstream
This bit is copied verbatim from the http1 conn pool. Should probably share that code somehow
Yes, optimally we would share code here. I would at least add a clear TODO on what could be shared.
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Signed-off-by: Snow Pettersen <snowp@squareup.com>
At a high level this makes sense to me. LMK when you are ready for a more formal review. Thanks for working on this!
host_->cluster().resourceManager(priority_).requests().inc();
callbacks.onPoolReady(primary_client_->client_->newStream(response_decoder),
primary_client_->real_host_description_);
// If the primary client is not connected yet, queue up the request.
We should use the pending request circuit breaker here. This is a nice cleanup that also brings parity between H2 and H1 IMO.
Ping on this. Please make sure this has test coverage also.
Added, with test coverage
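To make the pending-request circuit-breaker point concrete, here is a hedged sketch of the enqueue-side check. A plain counter and limit stand in for Envoy's cluster resource manager; the names and structure are illustrative, not the PR's actual diff.

```cpp
// Illustrative only: refuse to queue a request once the pending-request
// limit is hit, failing it back to the caller as an overflow instead.
#include <cstddef>
#include <functional>
#include <iostream>
#include <list>

struct PoolCallbacks {
  std::function<void()> on_pool_ready;
  std::function<void()> on_pool_overflow;
};

class QueueWithPendingLimit {
public:
  explicit QueueWithPendingLimit(std::size_t max_pending) : max_pending_(max_pending) {}

  void newStream(PoolCallbacks callbacks) {
    // Circuit-break before queueing: mirrors what the HTTP/1 pool does with
    // its pending-request resource, bringing H2 to parity.
    if (pending_requests_.size() >= max_pending_) {
      callbacks.on_pool_overflow();
      return;
    }
    pending_requests_.push_back(std::move(callbacks));
  }

  void onUpstreamReady() {
    while (!pending_requests_.empty()) {
      pending_requests_.front().on_pool_ready();
      pending_requests_.pop_front();
    }
  }

private:
  const std::size_t max_pending_;
  std::list<PoolCallbacks> pending_requests_;
};

int main() {
  QueueWithPendingLimit pool(1);
  pool.newStream({[] { std::cout << "queued, then ready\n"; }, [] {}});
  pool.newStream({[] {}, [] { std::cout << "pending overflow\n"; }});
  pool.onUpstreamReady();
}
```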
source/common/http/http2/conn_pool.h
Outdated
@@ -66,12 +66,29 @@ class ConnPoolImpl : Logger::Loggable<Logger::Id::pool>, public ConnectionPool::
Upstream::HostDescriptionConstSharedPtr real_host_description_;
uint64_t total_streams_{};
Event::TimerPtr connect_timer_;
Event::TimerPtr upstream_ready_timer_;
Refresh my memory of why this is needed? I remember this was added to HTTP/1 to fix some issue which I can't recall now. I think @lizan did this. Can you track down why this was done and add comments to both places?
I somewhat cargo-culted this from the http1 implementation, but it seems like it was done in order to avoid requests being assigned to a connection that's already been closed: #2715
I think this would apply to http/2 as well? I'll spend some time thinking about it.
Yes that sounds right. I would just look at the convo in the related PR. If you could add some comments for future readers that would be great.
After looking into the issue, I don't think we need this for h/2 unless we're already at risk without this PR. The semantics of how we handle a connection being closed/reset aren't changing; we're just changing when streams get created on the codec. The issue also talks about this being due to an h/1.1 issue, though I'm not sure exactly what property of h/1.1 causes this. Maybe the different close semantics vs h/2?
I'll try without using a timer. Hopefully we'll be able to add test coverage that can add confidence around this.
I can look at this more tomorrow. For now I would go ahead and remove the timer and we can then think about whether it is necessary.
No, you don't need this timer when the connection is established; it isn't used in that case in the H1 conn pool either.
It is needed in the H1 conn pool to schedule another request onto an existing idle connection, since the connection might already have been closed by upstream, so we wait one event cycle. In H2, streams are multiplexed onto one connection, so the problem doesn't exist. You can call onUpstreamReady directly.
Ah, thanks for confirming. There's no similar concern about the primary client's connection being reset immediately after newStream is called?
There's no similar concern about the primary client's connection being reset immediately after newStream is called?
Can you elaborate a bit more? Connection reset after newStream can happen but the timer cannot save that case either.
I think I misunderstood slightly and thought that the timer was meant to help with connections being dropped mid-request. Your explanation makes sense to me now.
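For future readers, here is a toy illustration of the distinction @lizan draws above. This is not Envoy's Event::Timer API; a deque of deferred callbacks stands in for a zero-delay timer firing on the next event-loop pass. HTTP/1 defers binding a request to an existing idle connection by one loop iteration so an already-received remote close can be processed first, while HTTP/2 can dispatch onUpstreamReady directly once its single connection connects.

```cpp
// Toy event loop: "post" approximates what an enabled zero-delay timer does.
#include <deque>
#include <functional>
#include <iostream>

class ToyDispatcher {
public:
  void post(std::function<void()> cb) { deferred_.push_back(std::move(cb)); }
  void runOnePass() {
    auto tasks = std::move(deferred_);
    for (auto& task : tasks) {
      task();
    }
  }

private:
  std::deque<std::function<void()>> deferred_;
};

int main() {
  ToyDispatcher dispatcher;
  bool idle_connection_open = true;

  // HTTP/1-style: defer assigning the next request so that a close the
  // upstream already sent can be observed before the request is bound.
  dispatcher.post([&] {
    if (idle_connection_open) {
      std::cout << "h1: bind request to the idle connection\n";
    } else {
      std::cout << "h1: connection already closed, pick another\n";
    }
  });
  idle_connection_open = false; // the queued close event is processed first
  dispatcher.runOnePass();

  // HTTP/2-style: the connection just finished connecting and all streams
  // are multiplexed onto it, so onUpstreamReady() can run inline, no timer.
  std::cout << "h2: onUpstreamReady() called directly after connect\n";
}
```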
So looking into my conn reset issue, it seems like the problem is that with this change, when a stream resets before the connection has been established, no stream is ever created upstream. Tests like the grpc json transcoder seem to use this to get access to the request that will be sent on that stream, but the stream is reset in certain cases due to the filter's usage of sendLocalReply. So the way I see it there's two options. @mattklein123 thoughts?
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Sorry can you provide more detail on the flow that is failing/different? I'm not completely clear on what is happening now.
Signed-off-by: Snow Pettersen <snowp@squareup.com>
The grpc-json transcoder integration tests use waitForNewStream to get access to the upstream request. With the change in this PR, it seems like that stream is never created, since the filter resets the upstream request (via sendLocalReply) before the connection has been established.
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Signed-off-by: Snow Pettersen <snowp@squareup.com>
The transcoder integration test LGTM, thanks for fixing that.
This should be ready for review now (just some clang-tidy stuff to... tidy up). I extracted out the PendingRequest and queue management stuff from the h1 connection pool so that I could reuse it in the h2 one, which significantly reduces the amount of duplicated code compared to earlier. Besides that there are some minor tweaks to some integration tests to handle the new behavior.
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Actually, looks like tests are flaky. Looking into it
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Ok, tests are green (except mac tests that are taking forever). ptal
Thanks, this is well done. @alyssawilk can you also PTAL, especially at the integration test changes? @snowp I would upgrade the regression risk of this change to high IMO. This is all pretty scary.
source/common/http/conn_pool_base.h
Outdated
virtual void checkForDrained() PURE;

Upstream::HostConstSharedPtr host_;
Upstream::ResourcePriority priority_;
nit: const
source/common/http/http2/conn_pool.h
Outdated
@@ -35,6 +36,9 @@ class ConnPoolImpl : Logger::Loggable<Logger::Id::pool>, public ConnectionPool::
ConnectionPool::Cancellable* newStream(Http::StreamDecoder& response_decoder,
ConnectionPool::Callbacks& callbacks) override;

// Http::ConnPoolImplBase
nit: you can make this override private or protected.
void ConnPoolImpl::onUpstreamReady() {
// Establishes new codec streams for each pending request.
while (!pending_requests_.empty()) {
newClientStream(pending_requests_.back()->decoder_, pending_requests_.back()->callbacks_);
Do you have a test case that covers when there is a circuit breaker overflow for max requests once we start creating new streams here?
Added
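A hedged sketch of the overflow case being asked about here (illustrative stand-ins again, not the PR's code): when the connection becomes ready and the pool walks the pending queue, each stream it creates must still be admitted by the max-requests limit, and any request that cannot be admitted is failed as an overflow rather than silently dropped.

```cpp
// Illustrative only: drain the pending queue while honoring a max-requests
// limit; requests beyond the limit get an overflow failure.
#include <cstddef>
#include <functional>
#include <iostream>
#include <vector>

struct PendingRequest {
  std::function<void()> on_pool_ready;    // would create the codec stream
  std::function<void()> on_pool_overflow; // circuit-breaker overflow
};

void onUpstreamReady(std::vector<PendingRequest>& pending, std::size_t& active_requests,
                     std::size_t max_requests) {
  for (auto& request : pending) {
    if (active_requests >= max_requests) {
      request.on_pool_overflow(); // max-requests circuit breaker tripped
      continue;
    }
    ++active_requests;
    request.on_pool_ready();
  }
  pending.clear();
}

int main() {
  std::size_t active = 0;
  std::vector<PendingRequest> pending;
  for (int i = 1; i <= 3; ++i) {
    pending.push_back({[i] { std::cout << "stream created for request " << i << "\n"; },
                       [i] { std::cout << "request " << i << " overflowed\n"; }});
  }
  onUpstreamReady(pending, active, /*max_requests=*/2); // third request overflows
}
```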
void expectClientCreate() {
// Creates a new test client, expecting a new connection to be created and associated
// with the new client.
void expectClientCreate(absl::optional<int> buffer_limits = {}) { |
s/int/uint64_t or whatever the type is?
if (!grpc_request_messages.empty()) {
ASSERT_TRUE(fake_upstream_connection_->waitForNewStream(*dispatcher_, upstream_request_));
Can you explain the change here? Is this basically because in certain cases the old code was making an upstream connection and creating a stream, but then immediately resetting it or something?
Yeah, from what I could gather this code was waiting for an upstream stream, but since the filter immediately calls sendLocalReply, the upstream connection is reset before the connection is established, so upstream never sees a stream.
But why did it see a stream before? I'm confused. @lizan?
I think what happens is that:
1. transcoder filter responds with Continue in decodeHeaders
2. router creates an UpstreamRequest, opening a connection
3. transcoder filter calls sendLocalReply in decodeData, resetting the connection opened in 2), canceling the pending stream before it's ever created.
Before this change, newStream is called on the codec inline in ConnPool::newStream, which initializes the stream by calling nghttp2_submit_request, which triggers nghttp2_session_callbacks_new -> ServerConnectionCallbacks.newStream. This is all before the connection has been established, and is how FakeConnection picks up new streams.
After this PR, all of this stream setup happens after the upstream connection has been established. This means that there is now a window between a connection being opened and when it's established in which a connection reset will prevent a stream from being created. In this flow, a reset before the connection has been established will call cancel on the PendingRequest, so there's never a call to newStream on the http2 codec.
OK this makes sense, but I don't think the explanation is quite right.
This is all before the connection has been established, and is how FakeConnection picks up new streams.
FakeConnection is a real network connection, so I think this was always relying on a specific sequence? Meaning, we made the connection and immediately wrote data to the stream, so the connection would pick it up. Now I think the sequence is more explicit in that we reset locally before we ever actually get back to create the stream on the newly connected connection. Right? In any case, this change now makes sense to me assuming we agree.
Yeah, that fits my understanding.
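A small sketch of the "canceled while still pending" path described in this thread (simplified types; handles via list iterators are an illustrative choice, not Envoy's Cancellable interface): a reset that arrives before the connection is established simply removes the entry from the queue, so newStream is never called on the codec for it.

```cpp
// Illustrative only: cancelling a pending request removes it from the queue,
// so no codec stream is ever created for it.
#include <functional>
#include <iostream>
#include <iterator>
#include <list>

struct PendingRequest {
  std::function<void()> create_stream; // would call codec newStream() + onPoolReady()
};
using PendingHandle = std::list<PendingRequest>::iterator;

class PendingQueue {
public:
  // Returns a handle the caller (e.g. the router) can use to cancel.
  PendingHandle enqueue(std::function<void()> create_stream) {
    pending_.push_back({std::move(create_stream)});
    return std::prev(pending_.end());
  }
  void cancel(PendingHandle handle) { pending_.erase(handle); }
  void onUpstreamReady() {
    for (auto& request : pending_) {
      request.create_stream();
    }
    pending_.clear();
  }

private:
  std::list<PendingRequest> pending_;
};

int main() {
  PendingQueue pool;
  auto first = pool.enqueue([] { std::cout << "stream created for request 1\n"; });
  pool.enqueue([] { std::cout << "stream created for request 2\n"; });
  // e.g. the filter sent a local reply, resetting upstream before connect:
  pool.cancel(first);
  pool.onUpstreamReady(); // only request 2 ever reaches the codec
}
```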
@@ -115,8 +115,6 @@ class GrpcJsonTranscoderIntegrationTest
upstream_request_->encodeTrailers(response_trailers);
}
EXPECT_TRUE(upstream_request_->complete());
} else {
ASSERT_TRUE(upstream_request_->waitForReset());
Same reason? No stream created w/ the new code? Trying to understand the flow.
Since we don't call waitForNewStream, we don't have an upstream_request_ to call waitForReset on.
@@ -306,17 +306,24 @@ TEST_P(Http2UpstreamIntegrationTest, UpstreamConnectionCloseWithManyStreams) {
{":authority", "host"}});
encoders.push_back(&encoder_decoder.first);
responses.push_back(std::move(encoder_decoder.second));

// Ensure we've established a connection before issuing the resets.
// TODO(snowp): Why can't we handle resets here?
Yeah I'm a little confused here. What was the sequence causing the hang here? I don't think we should need to wait for upstream before sending client resets? What was actually hanging?
Without this change, the code was hanging in waitForNewStream, which we never hit since we reset the stream while it's in the pending queue.
If we only call waitForNewStream for streams that won't get reset, this ends up hanging in waitForEndStream about 50% of the time. The test sends between 1 and 51 requests, resetting every 15th, and every time it fails it's when a reset happens in the middle of all the requests (so number of requests > 15), but it doesn't always fail in that case.
It seems like it's getting stuck waiting for the (15 + 1)th stream to end, so I was thinking it might be a race condition either in the conn pool or in how we account for stream/connection events in the test. It seems like somehow the reset on stream i=15 ends up affecting stream i=16.
I suspected that the pending request queue might be concurrently accessed, but guarding it with a mutex didn't seem to help.
I agree that this should work, but at this point I'm not even sure if the connection pool is buggy or if the tests just need to be reworked, so open to any ideas :)
Thinking about it some more, I think it might be due to a race between when we call reset and when the connection is established: any reset that comes after connect will be preceded by a new stream (because we've already called onPoolReady), while any reset before connect will not trigger a new stream (we'll cancel the pending request).
I'm not comfortable merging this change until we understand this. I can think about this more tomorrow morning. Do you have full trace logs for the hang? That would be useful to look at.
Here's a gist with a good and bad test run:
failed run: https://gist.githubusercontent.com/snowp/ecb62fd9cf74dc885f68ddfe2aa0ac8e/raw/8e54586118f15cbcf9b0eb3eb8f1030447eda8e9/failure.out
successful run: https://gist.githubusercontent.com/snowp/ecb62fd9cf74dc885f68ddfe2aa0ac8e/raw/8e54586118f15cbcf9b0eb3eb8f1030447eda8e9/success.out
This is when running with the following modifications to the code in the PR: snowp@4b59099
Namely:
- move waiting for the connection back to the original location
- hard code the number of requests instead of picking a random number
- skip waiting for requests that should be reset (without this we fail on the first waitForNewStream)
The test sends 17 requests, and resets every 15th of them, namely it will reset the 0th and 15th request.
This bit from the output logs makes me think my previous theory about this being a race between when the connection is established and when the request is reset is right:
~/Development/envoy(h2-queue):0 grep cancel failure
[2018-11-02 04:01:20.395][2075503][debug][router] [source/common/router/router.cc:938] [C1][S2898179549770959211] cancelling pool request
[2018-11-02 04:01:20.395][2075503][debug][pool] [source/common/http/conn_pool_base.cc:41] cancelling pending request
~/Development/envoy(h2-queue):0 grep cancel success
[2018-11-02 04:00:15.278][2069333][debug][router] [source/common/router/router.cc:938] [C1][S4194523254301487947] cancelling pool request
[2018-11-02 04:00:15.278][2069333][debug][pool] [source/common/http/conn_pool_base.cc:41] cancelling pending request
[2018-11-02 04:00:15.308][2069333][debug][router] [source/common/router/router.cc:938] [C1][S82244168407272366] cancelling pool request
[2018-11-02 04:00:15.308][2069333][debug][pool] [source/common/http/conn_pool_base.cc:41] cancelling pending request
Specifically, in the successful run both resets happened before the connection completed, as indicated by the log line stating that the pool request was canceled.
I suspect that in the run that failed, one of the reset requests created a stream (because it wasn't canceled), so it messed up the stream accounting on our end: we didn't expect the 15th request to generate a stream, so when it did we attributed it to the 16th request. This test is now failing because it's waiting for endStream on what we think is the 16th stream, but that's actually the 15th request's stream, which has been reset and won't reach endStream.
If I'm right, then in order to have this test send resets before the connection has been established, we need a way to tell whether we reset the request before or after the connect, because depending on that we may or may not see a stream on the FakeConnection. One option would be to wait for N streams, compare to the total number of requests R, and from that infer that the first (R-N) requests must have been reset before the connection was established.
Hopefully this makes sense. Would definitely want someone else to look this over to make sure it seems like a reasonable explanation of what we're seeing.
OK thanks for the really detailed investigation here. I agree your theory makes sense.
@alyssawilk probably disagrees with me here, but I would be inclined to make this test more deterministic (remove the random number of requests). I think the randomness makes this more confusing than it probably needs to be.
Can we figure out a way to make this test a bit easier to understand and more deterministic? @alyssawilk can likely help with this on Monday.
I'll accept removing randomness here - I care more about it when it comes to payload sizes. Failing the nth request when n < [default max stream sizes] is less likely than triggering backup due to larger/smaller payloads
Updated the test to always send 20 requests, so we get a reset on the 1st and 15th request.
upstream_requests.emplace_back();
FakeStreamPtr stream;
This variable move isn't explicitly needed, right? Just trying to understand.
no it's not, just a remnant of me playing around. in fact, it's not even used so i'll probably just remove it
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Happy to take a look - I'll do a pass later today after I get through my morning meeting block :-)
One high level question - do we currently have developer docs on the connection pool design? I'm pretty solid on the downstream data plane but less so with upstream and it strikes me as a thing worth having architectural docs on if we don't already. And not just because it'd make this review easier ;-)
And that day did not clear up the way I thought it was going to :-(
blocked off time monday to do a proper review. Apologies for the delay.
source/common/http/conn_pool_base.h
Outdated
ConnectionPool::Callbacks& callbacks);
~PendingRequest();

// Cancellable
super nitty nit, while you're moving this can you change it to ConnectionPool::Cancellable?
I took this as a really silly comment at first pass ;-)
Signed-off-by: Snow Pettersen <snowp@squareup.com>
There are not. I will take an action item to write some up.
Marking as waiting. /wait
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Ping on review on this. I think all the current comments have been addressed
@snowp will review today
Thanks, looks great. 2 small things.
/wait
dispatcher_.clearDeferredDeleteList();
}

// Verifies that reuqests are queued up in the conn pool and respect max request circuit breaking
typo reuqests
@@ -306,16 +305,24 @@ TEST_P(Http2UpstreamIntegrationTest, UpstreamConnectionCloseWithManyStreams) {
{":authority", "host"}});
encoders.push_back(&encoder_decoder.first);
responses.push_back(std::move(encoder_decoder.second));

// Ensure we've established a connection before issuing the resets. This is necessary
Couldn't this be lifted up above the for loop since this is HTTP/2? Basically, make an HTTP connection and then wait for the upstream connection, and then do the for loop? That might be simpler to understand.
I think that won't work because we're waiting for an upstream connection, which won't be established until we send at least one request through to the router. Moving the check up causes the test to time out
OK makes sense. Then can we do it just on the first iteration since we should just make a single connection and need to wait for it?
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Modifies the http2 connection pool to use a request queue similar to how the http/1.1 connection pool does. This is done in order to defer the connection pool callbacks until the connection has been established instead of simply invoking the callback immediately. Signed-off-by: Snow Pettersen <snowp@squareup.com> Signed-off-by: Fred Douglas <fredlas@google.com>
Modifies the http2 connection pool to use a request queue similar to
how the http/1.1 connection pool does. This is done in order to defer
the connection pool callbacks until the connection has been established
instead of simply invoking the callback immediately.
Signed-off-by: Snow Pettersen snowp@squareup.com
Risk Level: High
Testing: UTs for new conn pool behavior
Docs Changes: n/a
Release Notes: n/a
Part of #4903