-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
core: LoadBalancer2 and DelayedClientTransport.reprocess(). #2443
Conversation
This is the second step of a major refactor for the LoadBalancer-related part of Channel impl (grpc#1600)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hate to bring up naming again, but I'd like to suggest renaming Picker
to Selector
. Maybe it's just me, but Picker
always sounded weird for me in this context whereas Selector
make it's more intuitive and gives me a better idea of what it'll be responsible for. Proposed new naming: SubchannelSelector
, SelectedSubchanel
, #selectSubchannel
, #updateSubchannelSelector
).
* <p>The LoadBalancer is responsible for closing unused Subchannels, and closing all | ||
* Subchannels within {@link #shutdown}. | ||
*/ | ||
public abstract Subchannel createSubChannel(EquivalentAddressGroup addrs, Attributes attrs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: createSubChannel
=> createSubchannel
to make it consistent with other class names
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
* Schedule a task to be run in the Channel Executor, which serializes the task with the | ||
* callback methods on the {@link LoadBalancer2} interface. | ||
*/ | ||
public abstract void runSeralized(Runnable task); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: runSerialized
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
/** | ||
* No decision could be made. The RPC will stay buffered. | ||
*/ | ||
public static PickResult withNothingYet() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe withNoSubchannel
and also updated comment that the returned result is successful (Status.OK
) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LoadBalancer implementations care more about what constructor method to call, than what getStatus()
returns. I think getStatus()
is sufficiently documented, and it's unnecessary to talk about the resulting status.
I was struggling in the naming of this method. The message that I wanted to convey is that "LoadBalancer cannot make a decision yet, thus nothing will happen to this RPC.". I don't think "withNoSubchannel" does the job better. Some alternatives are:
withNoResult()
stillPending()
pending()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense. Maybe withNoResult()
would work better? I like pending()
the most, but it's missing with
and I'd like to to be consistent with other methods. Heh.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed to withNoResult()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice 👍
I don't find Picker to be less intuitive than Selector, maybe because that we use "Picker" in our internal LB interface :). Given that, I do prefer Picker because it's shorter. |
Sounds good. It's good enough so I won't be fighting over it ;)
On Fri, Nov 25, 2016 at 12:48 PM Kun Zhang ***@***.***> wrote:
I don't find Picker to be less intuitive than Selector, maybe because that
we use "Picker" in our internal LB interface :). Given that, I do prefer
Picker because it's shorter.
—
You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub
<#2443 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AAAtKnVUKuqtkOI-WGceir8ulvbI2fkOks5rB0mxgaJpZM4K7OL5>
.
--
Łukasz (on mobile)
|
public abstract EquivalentAddressGroup getAddresses(); | ||
|
||
/** | ||
* The same attributes passed to {@link #createSubChannel}. LoadBalancer can use it to attach |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Helper#createSubchannel
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
* called again and a new picker replaces the old one. If {@link #updatePicker} has never been | ||
* called, the channel will buffer all RPCs until a picker is provided. | ||
*/ | ||
public abstract void updatePicker(SubchannelPicker picker); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe rename to updateSubchannelPicker
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's necessary. updatePicker
can't be mistaken for anything else in this context, given that the argument type already refers to "Subchannel".
* additional information here, e.g., the shard this Subchannel belongs to. | ||
*/ | ||
public abstract Attributes getAttributes(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Attributes
are immutable which means that if we want to be able to update attributes on existing channel, we'll need setAttributes(Attributes)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not necessarily. You could use mutable values, e.g., AtomicReference
in Attributes
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah. That's fair.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
* | ||
* <h3>Channel Executor</h3> | ||
* | ||
* <p>Channel Executor is an internal executor of the channel, which is used to run serialize all |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/run serialize/serialize/ or similar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
* <p>Channel Executor is an internal executor of the channel, which is used to run serialize all | ||
* the callback methods on the {@link LoadBalancer2} interface, thus the balancer implementation | ||
* doesn't need to worry about synchronization among them. However, the actual thread that Channel | ||
* Executor is typically the network thread, thus following rules must be followed to prevent |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/is/is is/ or s/actual thread that/actual thread of the/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
* synchronization primitives, blocking I/O, blocking RPCs, etc.</li> | ||
* | ||
* <li><strong>Avoid calling into other components with lock held</strong>. Channel Executor may | ||
* run callbacks under a lock, e.g., the transport lock of OkHttp2. If your LoadBalancer has a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/OkHttp2/OkHttp/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
|
||
/** | ||
* The main balancing logic. It <strong>must be thread-safe</strong>. Typically it should only | ||
* synchronize on its own state, and void synchronzing with the LoadBalancer's state. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/void/avoid/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
|
||
/** | ||
* A decision to proceed the RPC on a Subchannel. The state of the Subchannel is supposed to be | ||
* {@link ConnectivityState#READY}. However, a non-READY Subchannel will not fail the RPC, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was unclear why this "However" was stated. Maybe something like, "However, since such decisions are racy, a non-READY Subchannel..."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly. Rephrased as suggested.
* <p>When a new picker is provided via {@link Helper#updatePicker}, the channel will apply the | ||
* picker on all buffered RPCs, by calling {@link SubchannelPicker#pickSubchannel}. | ||
* | ||
* <p>The channel will hold the a picker and use it for all RPCs, until {@link #updatePicker} is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/the a/the/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
* <p>It maintains at most one physical connection (aka transport) for sending new RPCs, while | ||
* also keeps track of previous transports that has been shut down but not terminated yet. | ||
* | ||
* <p>If there isn't an active transport yet, and the Subchannel is assigned to an RPC, it will |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/the Subchannel is assigned to an RPC/an RPC is assigned to the Subchannel/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
// To be implemented by ManagedChannelImpl2 and unit test | ||
@Nullable | ||
ClientTransport obtainActiveTransport(Subchannel subchannel) { | ||
throw new UnsupportedOperationException("Not implemented"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems strange, but I'm willing to wait and see how it all fits together.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough. I introduced a SubchannelImpl
that adds obtainActiveTransport()
to Subchannel
.
} | ||
pendingStreams.removeAll(toRemove); | ||
if (pendingStreams.isEmpty()) { | ||
pendingStreams = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think pendingStreams
should only be made null if the delayed transport is shut down and setTransport is called. Just leave the collection in-place, even if its empty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since setTransport() will go away, and delayed transport will be kept open, it seems necessary to take opportunity to discard the hashmap, as it only grows in size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the code doesn't ever create a new hashmap. The next stream that needs to be pending will cause a NPE. If you want to re-initialize, you'll need to create the new hashmap here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm... do you also need to handle transport terminated here, like in PendingStream.cancel?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! Actually I missed out the shutdown process for the LBv2 design. Would the following shutdown sequence work?:
ManagedChannelImpl.shutdown()
: stop accepting new calls, shutdown delayed transport. All other functionalities continue working.- Once the delayed transport is terminated, shutdown NameResolver and LoadBalancer.
- Once all subchannels and all OOB channels have terminated, ManagedChannelImpl will terminate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added termination handling, and fixed the NPE. Both covered by tests. PTAL.
} | ||
|
||
synchronized (lock) { | ||
if (pendingStreams == null || pendingStreams.isEmpty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit worrying. If streams may be removed from pendingStreams
, then that seems like we could "start" a stream more than once (maybe as failures).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added documentation to say that reprocess()
cannot be called concurrently with itself or setTransport
. However, between the two synchronized blocks, streams may be cancelled, and shutdown()
may also be called, which will make pendingStreams
empty and null respectively. Looks I have to keep this check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ejona86 PTAL
* | ||
* <h3>Channel Executor</h3> | ||
* | ||
* <p>Channel Executor is an internal executor of the channel, which is used to run serialize all |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
* <p>Channel Executor is an internal executor of the channel, which is used to run serialize all | ||
* the callback methods on the {@link LoadBalancer2} interface, thus the balancer implementation | ||
* doesn't need to worry about synchronization among them. However, the actual thread that Channel | ||
* Executor is typically the network thread, thus following rules must be followed to prevent |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
* synchronization primitives, blocking I/O, blocking RPCs, etc.</li> | ||
* | ||
* <li><strong>Avoid calling into other components with lock held</strong>. Channel Executor may | ||
* run callbacks under a lock, e.g., the transport lock of OkHttp2. If your LoadBalancer has a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
|
||
/** | ||
* The main balancing logic. It <strong>must be thread-safe</strong>. Typically it should only | ||
* synchronize on its own state, and void synchronzing with the LoadBalancer's state. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
|
||
/** | ||
* A decision to proceed the RPC on a Subchannel. The state of the Subchannel is supposed to be | ||
* {@link ConnectivityState#READY}. However, a non-READY Subchannel will not fail the RPC, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly. Rephrased as suggested.
* <p>When a new picker is provided via {@link Helper#updatePicker}, the channel will apply the | ||
* picker on all buffered RPCs, by calling {@link SubchannelPicker#pickSubchannel}. | ||
* | ||
* <p>The channel will hold the a picker and use it for all RPCs, until {@link #updatePicker} is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
* <p>It maintains at most one physical connection (aka transport) for sending new RPCs, while | ||
* also keeps track of previous transports that has been shut down but not terminated yet. | ||
* | ||
* <p>If there isn't an active transport yet, and the Subchannel is assigned to an RPC, it will |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
} | ||
pendingStreams.removeAll(toRemove); | ||
if (pendingStreams.isEmpty()) { | ||
pendingStreams = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since setTransport() will go away, and delayed transport will be kept open, it seems necessary to take opportunity to discard the hashmap, as it only grows in size.
// To be implemented by ManagedChannelImpl2 and unit test | ||
@Nullable | ||
ClientTransport obtainActiveTransport(Subchannel subchannel) { | ||
throw new UnsupportedOperationException("Not implemented"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough. I introduced a SubchannelImpl
that adds obtainActiveTransport()
to Subchannel
.
} | ||
|
||
synchronized (lock) { | ||
if (pendingStreams == null || pendingStreams.isEmpty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added documentation to say that reprocess()
cannot be called concurrently with itself or setTransport
. However, between the two synchronized blocks, streams may be cancelled, and shutdown()
may also be called, which will make pendingStreams
empty and null respectively. Looks I have to keep this check.
This is the second step of a major refactor for the LoadBalancer-related
part of Channel impl (#1600).
Please review the PR as an entirety.