-
Notifications
You must be signed in to change notification settings - Fork 172
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make daemon “reset” itself when the IP address changes #284
Make daemon “reset” itself when the IP address changes #284
Conversation
@codebot can you please ask someone on the SOSS project to try this out? Maybe Grey? |
Thanks for this! Indeed, it does seem to be correctly deciding when the interface address has changed (I inserted a few However, I find that although the initial run of
Test scenario:Machine A is continually connected to a single WiFi network, letting this command run forever:
Machine B is running ROS Dashing is being used on both machines. Machine B is running the |
@codebot I will try to reproduce a similar scenario here. Edit: I will also like to know platform details. Including rmw implementation, OS. |
Thanks!
To keep it "simple," machine A stays at the same address throughout.
An easy way to replicate the scenario is to bounce machine B between a
"normal" wireless LAN connection and a cellphone hotspot connection.
Cheers
…On Thu, Jun 27, 2019, 8:36 PM ivanpauno ***@***.***> wrote:
@codebot <https://github.com/codebot> I will try to reproduce a similar
scenario here.
Does the interface address of Machine A change too?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#284?email_source=notifications&email_token=ABIFDWUQS6MAF4BLSKCWAXTP4SX6HA5CNFSM4HYKV7DKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODYW7FUI#issuecomment-506327761>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABIFDWXQXVA32JETUWAZEQLP4SX6HANCNFSM4HYKV7DA>
.
|
@codebot I'm not being able to reproduce your problem. Could you provide me some other details?:
And anything else you think is useful for reproducing the issue correctly. |
Thanks for trying to replicate this.
The network in question is "unusual" in some ways. For reasons I still don't understand, it requires TTL >= 2 in order for UDP multicast from "peers" on the WiFi to reach each other. I'm doing that via the I'll try this experiment again on a "normal" WiFi network tomorrow. |
@codebot Just checking if you came up with any further information on this topic. |
@codebot Friendly ping. |
OS: machine A (linux desktop), machine B (linux VM on laptop), ubuntu 18.04 |
ea15772
to
b8c9a75
Compare
@claireyywang It would be good to have some additional information:
My setup: I have two machines in the same LAN, using static IPs. I run a node in machine A (e.g.
I think there are two possible reasons why doing
I don't believe it's the first problem, as I have checked and tested the logic a bunch of times.
And if some network interface changed, you should see:
@claireyywang Could you test this and check if you see the If you're not seeing that, please copy all your logs here (the ones when you were able to discover topics in the other machine, and the ones after you stopped being able). |
Thanks for the suggestions @ivanpauno! Here are some updates.
After switching to a different wifi network, I ran
|
Signed-off-by: ivanpauno <ivanpauno@ekumenlabs.com>
Signed-off-by: Ivan Santiago Paunovic <ivanpauno@ekumenlabs.com>
Signed-off-by: Ivan Santiago Paunovic <ivanpauno@ekumenlabs.com>
Signed-off-by: Ivan Santiago Paunovic <ivanpauno@ekumenlabs.com>
fa1ef33
to
d7fd738
Compare
Thanks for checking @claireyywang !
Though participant removal is detected fast (e.g.: when you
I was finally able to reproduce this.
I corrected some bugs in 9767d0f and forced to reset the daemon node when a new network interface is detected in d7fd738. I wonder if this only happens with |
Signed-off-by: Ivan Santiago Paunovic <ivanpauno@ekumenlabs.com>
I confirm it is working. I switched machine between two different networks and the delay is significantly shorter; topics are correctly discovered without me doing anything to the daemon. Seems to me that both problems are solved indeed. |
wrapper.__name__ = attr.__name__ | ||
|
||
if inspect.ismethod(attr): | ||
return wrapper |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ivanpauno consider:
if inspect.ismethod(attr):
@functools.wraps(attr)
def wrapper(*args, **kwargs):
self.reset_if_addresses_changed()
return getattr(self.node, name)(*args, **kwargs)
wrapper.__signature__ = inspect.signature(attr)
instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed?
wrapper.__signature__ = inspect.signature(attr)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK, functools.wraps
does that automatically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is needed if we want to keep the decorator as transparent as possible, and no, functools.wraps
does not force any function signature, only naming.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See ros2/launch#279 (comment), though it's arguably a detail that has no impact on this particular code (at least for the time being).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a proof, you run do the following:
import functools
import inspect
def asd(bsd, csd):
print('asd')
@functools.wraps(asd)
def wrapper(*args, **kwargs):
asd(*args, **kwargs)
print(inspect.signature(wrapper) == inspect.signature(asd))
The result is True
.
Signed-off-by: Ivan Santiago Paunovic <ivanpauno@ekumenlabs.com>
Signed-off-by: Ivan Santiago Paunovic <ivanpauno@ekumenlabs.com>
if not hasattr(args, 'debug') or not args.debug: | ||
kwargs['stdout'] = subprocess.DEVNULL | ||
kwargs['stderr'] = subprocess.DEVNULL | ||
subprocess.Popen(cmd, **kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ivanpauno meta: I had not noticed, and it predates this PR, but this is a peculiar way to demonize a process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know what is the correct way, but in any case, I would make that modification in a follow-up PR.
ros2cli/ros2cli/node/daemon.py
Outdated
'--ros-domain-id', str(ros_domain_id)], | ||
stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, **kwargs) | ||
'--ros-domain-id', str(ros_domain_id)]) | ||
if not hasattr(args, 'debug') or not args.debug: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ivanpauno hmm all CLIs that use NodeStrategy
and happen to have a debug
option may unintentionally hook up the daemon to their standard streams. Consider adding another argument to the spawn_daemon
function instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See e339123.
Signed-off-by: Ivan Santiago Paunovic <ivanpauno@ekumenlabs.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
CI in ros2/ci#346 (comment). |
Signed-off-by: Michel Hidalgo <michel@ekumenlabs.com>
The problem is that when the IP address changes (e.g.: by DHCP), discovery doesn't work all the time.
This is a first proposal for solving the problem.
I'm doing the following:
The methods registered in the
XMLRPC
server now check if one IP has changed, and in that case theDirectNode
used by the Daemon is reset.When the
DirectNode
is reset, the requested method is immediately call. Maybe anspin_time
should be added (for ensuring enough discovery time).With "checking that one IP has changed", I'm referring to the following -> If the IP address of any interface that has a gateway has changed, the
DirectNode
is reset.As far as I understand, it's not possible to know what interface is the
rmw
implementation using.Python doesn't have in the standard library a package for listing network interfaces, etc (in a platform independent way).
I'm using netifaces for that. I don't know if we want to add a new dependency or not.
In case that we don't want to add it, I can maybe add this as an optional plugin in another package.
For trying this, you have to install netifaces:
python3 -m pip install -U netifaces