-
Notifications
You must be signed in to change notification settings - Fork 71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add JMX/RMI socket factory with timeout #298
Conversation
@@ -68,7 +61,7 @@ protected void createConnection() throws IOException { | |||
this.env.put("attribute.remote.x.request.waiting.timeout", CONNECTION_TIMEOUT); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Short story : I think this is a no-op property.
Long story: I tried to find out where this property might be used. On the "official" oracle side I found something similar only in the the jmx spec with a minor variation (jmx.remore.x.request.waiting.timeout
listed as a non-standard parameter). We were using this value a while ago (change), initially added with this change. So far the only implementation I found that seems to honour this parameter is weblogic, I did not find any clue about this property inside the JDK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, there might be a small change in behavior at connect-time but this looks much better. Great stuff.
public Socket createSocket(String host, int port) throws IOException { | ||
Socket socket = factory.createSocket(host, port); | ||
socket.setSoTimeout(timeoutMs); | ||
socket.setSoLinger(false, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🍰
@Override | ||
public Socket createSocket(String host, int port) throws IOException { | ||
Socket socket = factory.createSocket(host, port); | ||
socket.setSoTimeout(timeoutMs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the past we had two different timeout values:
sun.rmi.transport.proxy.connectTimeout
(connect TO)sun.rmi.transport.tcp.responseTimeout
(read TO)
This socket-level, and more explicit SO_TIMEOUT
is probably fine for all cases, but we may be potentially changing behavior on new connections by dropping the rmi_connection_timeout
. It does feel like a bit of a redundant option, but I'm wondering if we didn't have to set that for some weird edge-case.
Also, I may be missing something!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the usage of sun.rmi.transport.tcp.responseTimeout
because it sets SO_TIMEOUT
later in the whole thing lifecyle (as per 1 and 2) and some earlier operation would occur without having the SO_TIMEOUT
set.
AFAIK the latter option sun.rmi.transport.proxy.connectTimeout
should not have impact on our base case as far as my test went, we exit the socket factory here, before using the property value. I had a quick look in the code base history I did not find where we were using this property...
I think I will still try to rework the factory a little bit to have the timeout enforced during initial connection time. Because removing the mailbox made the code relying on the OS timeout behavior for the initial connection as the current factory sets the timeout after the initial connection (then I will retained the rmi_connection_timeout
and come up with the exact same behaviour as before) WDYT ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@truthbk I did something similar to the mailbox code (:trollface:) thus we are getting back the rmi_connection_timeout
, however it's fully located in the socket factory so any subsequent connection not going through RemoteConnection
will still be subject to the timeout and I think all case are covered (i.e. SSL/non-SSL, initial connection, subsequent connection directly by the RMI subsystem, timeout during socket lifetime).
@sirianni it should bring an effective & generic solution to the issue #279.
c18678b
to
4f3c530
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! 👍
Proper RMI/JMX socket factory with timeout for both SSL and non-SSL.
Should be a drop in replacement for actual code. I also suggest to remove
Connection.connectWithTimeout
because timeouts are enforced within the socket factory. I also tested the attach API and it seems that infinite freeze can not happen in that case.rmi_connection_timeout
&rmi_client_timeout
have been retained to configure timeouts in the socket factory.See code comment for a question about the mysterious
attribute.remote.x.request.waiting.timeout
property...A piece of the code path that may block because of network timeout is located here it plays with different configurable timeout during the TCP connection sequence, it even overrides for certain steps the
SO_TIMEOUT
from the the socket factory in at least one place (Can be set with through the following propertysun.rmi.transport.tcp.handshakeTimeout
). But having theSO_TIMEOUT
always set at the socket level ensure always having a configurable timeout on socket operation. The the initial connection timeout is managed differently to keep the ability to get a socket from any factory (used for SSL support at the moment). It shall be noticed that the connection timeout can be shortened but there is an upper bound for enforceable value. The upper bound depends on the OS settings (~1 min on average).Timeout will then be logged as shown below :
For information the call stack while connecting, providing information about involved classes :