Robotics StackExchange | Archived questions

ROS2 and network usage

Hi,

I'm using ROS2 Bouncy and Crystal and I noticed a very strange behavior.

I'm running several nodes on the same machine (my laptop). The nodes are standard talkers and listeners. The DDS implementation is FastRTPS. My laptop is connected to the internet through an ethernet connection and wi-fi.

When I start several nodes (one immediately after the other), it's common to see the ethernet connection disconnecting. This happens more frequently when the number of nodes that I run is high (10 to 15 nodes).

EDIT: linked to this github issue https://github.com/ros2/rmw_fastrtps/issues/255

EDIT:

When the ethernet gets disconnected I observe the following logs:

dmesg

e1000e: enp0s31f6 NIC Link is Down

syslog

Jan 24 11:26:15 asoragna kernel: [6536.084261] e1000e: enp0s31f6 NIC Link is Down
Jan 24 11:26:17 asoragna ntpd[3727]:Deleting interface #25 enp0s31f6,10.102.1.49#123, interface stats: received=0, sent=5, dropped=0,active_time=72 secs
Jan 24 11:26:17 asoragna ntpd[3727]: Deleting interface #26 enp0s31f6, fe80::a49b:ede8:e675:83c7%2#123, interface stats: received=0, sent=0, dropped=0, active_time=72 secs
Jan 24 11:26:19 asoragna NetworkManager[785]: <info>  [1548329179.3922] device (enp0s31f6): link disconnected (calling deferred action)
Jan 24 11:26:19 asoragna NetworkManager[785]: <info>  [1548329179.3928] device (enp0s31f6): state change: activated -> unavailable (reason 'carrier-changed') [100 20 40]
Jan 24 11:26:19 asoragna NetworkManager[785]: <info>  [1548329179.4094] dhcp4 (enp0s31f6): canceled DHCP transaction, DHCP client pid 21312
Jan 24 11:26:19 asoragna NetworkManager[785]: <info>  [1548329179.4095] dhcp4 (enp0s31f6): state changed bound -> done
Jan 24 11:26:19 asoragna avahi-daemon[772]: Withdrawing address record for 10.102.1.49 on enp0s31f6.
Jan 24 11:26:19 asoragna avahi-daemon[772]: Leaving mDNS multicast group on interface enp0s31f6.IPv4 with address 10.102.1.49.
Jan 24 11:26:19 asoragna avahi-daemon[772]: Interface enp0s31f6.IPv4 no longer relevant for mDNS.
Jan 24 11:26:19 asoragna NetworkManager[785]: <info>  [1548329179.4164] manager: NetworkManager state is now CONNECTED_LOCAL
Jan 24 11:26:19 asoragna avahi-daemon[772]: Withdrawing address record for fe80::a49b:ede8:e675:83c7 on enp0s31f6.
Jan 24 11:26:19 asoragna avahi-daemon[772]: Leaving mDNS multicast group on interface enp0s31f6.IPv6 with address fe80::a49b:ede8:e675:83c7.
Jan 24 11:26:19 asoragna avahi-daemon[772]: Interface enp0s31f6.IPv6 no longer relevant for mDNS.

Do you have any hint about what could be causing this?

Asked by alsora on 2019-01-24 05:24:55 UTC

Comments

it's common to see the ethernet connection disconnecting.

Is it really disconnecting, or is it just that no other traffic gets through the connection? Those are different things.

Asked by gvdhoorn on 2019-01-24 06:12:41 UTC

It's really disconnecting, I get the popup message from Ubuntu showing it.

Asked by alsora on 2019-01-24 06:15:13 UTC

Does dmesg show something related at that time? syslog lines?
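
For anyone checking their own logs for the same thing, here is a minimal sketch that filters kernel log lines for link-state changes on one interface. The regular expression is an assumption based on the e1000e message format shown in the question; other drivers may word the message differently, and the helper name `link_events` is made up for illustration.

```python
import re

# Matches kernel link-state messages such as
# "e1000e: enp0s31f6 NIC Link is Down" (format taken from the question's logs).
LINK_RE = re.compile(r"(?P<driver>\w+): (?P<iface>\w+) NIC Link is (?P<state>Up|Down)")


def link_events(lines, iface):
    """Return (state, line) pairs for link-state messages about `iface`."""
    events = []
    for line in lines:
        match = LINK_RE.search(line)
        if match and match.group("iface") == iface:
            events.append((match.group("state"), line.strip()))
    return events


if __name__ == "__main__":
    # Two lines copied from the syslog excerpt in the question.
    sample = [
        "Jan 24 11:26:15 asoragna kernel: [6536.084261] e1000e: enp0s31f6 NIC Link is Down",
        "Jan 24 11:26:19 asoragna avahi-daemon[772]: Interface enp0s31f6.IPv4 no longer relevant for mDNS.",
    ]
    for state, line in link_events(sample, "enp0s31f6"):
        print(state, "|", line)
```

Feeding it the lines of a saved `dmesg` or syslog dump would list every up/down transition with its timestamp, which makes it easier to line the drops up with the moment the nodes were started.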

Asked by gvdhoorn on 2019-01-24 06:16:08 UTC

Updated with dmesg output and partial syslog

Asked by alsora on 2019-01-24 06:37:18 UTC

What sort of switch / router do you have this connected to?

Simple(r) consumer routers can crash when they are bombarded with too much traffic.

Not saying this is the cause, but something to look at.

Asked by gvdhoorn on 2019-01-24 07:41:36 UTC

My laptop is connected to the internet through an ethernet connection and wi-fi

This is a bad idea and causes a lot of problems. Use only one link and your system will run stably. Desktops are not set up to act as routers. Two default routes via two interfaces will cause a lot of pain.

Asked by ChriMo on 2019-01-24 12:08:06 UTC

BTW: how does DDS handle dual-homed hosts?

Asked by ChriMo on 2019-01-24 12:11:22 UTC

As long as one interface does not have a default route things should be fine. And even then, if the metric is different for the two routes, one will almost never be used.

Such a setup doesn't make much sense to me, that I agree with.
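
The default-route and metric point can be checked with a small sketch like the one below. It parses text in the format printed by `ip route show default` and reports whether two default routes share the lowest metric. The sample text is illustrative only: the gateway addresses, the `wlp2s0` interface name and the metric values are made up, and treating a route without an explicit metric as metric 0 is an assumption of this sketch.

```python
def parse_default_routes(text):
    """Parse `ip route show default` style output into (device, metric) pairs."""
    routes = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0] != "default":
            continue
        dev = fields[fields.index("dev") + 1] if "dev" in fields else None
        # Assumption: a route printed without a metric is treated as metric 0.
        metric = int(fields[fields.index("metric") + 1]) if "metric" in fields else 0
        routes.append((dev, metric))
    # Lowest metric first: that is the route that gets preferred.
    return sorted(routes, key=lambda route: route[1])


def has_ambiguous_default(routes):
    """True when two or more default routes share the lowest metric."""
    return len(routes) > 1 and routes[0][1] == routes[1][1]


if __name__ == "__main__":
    # Illustrative output only; addresses, wlp2s0 and metrics are invented.
    sample = (
        "default via 10.0.0.1 dev enp0s31f6 proto dhcp metric 100\n"
        "default via 192.168.0.1 dev wlp2s0 proto dhcp metric 600\n"
    )
    routes = parse_default_routes(sample)
    print(routes)
    print("ambiguous default route:", has_ambiguous_default(routes))
```

With different metrics, as in the sample, the lower-metric route wins and the second interface is almost never used for outbound traffic.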

Asked by gvdhoorn on 2019-01-24 12:44:43 UTC

What I mean is: wi-fi is enabled while I'm connected to the ethernet. When the ethernet disconnects, the laptop switches to the wi-fi connection without waiting to connect to it; it's the default setup of every laptop I guess. The wi-fi keeps working when this happens. However, with the wi-fi disabled the outcome is the same.

Asked by alsora on 2019-01-24 13:25:23 UTC

The main problem is when my developers connect both interfaces to networks with DHCP :-(

Don't allow this !!!

Asked by ChriMo on 2019-01-24 14:01:13 UTC

it's the default setup of every laptop I guess

no, not really.

You still haven't told us what the rest of your network setup is.

The reason for the disconnect seems to be "carrier loss", either due to the switch/router, or because of some renegotiation taking place.

Asked by gvdhoorn on 2019-01-24 14:25:28 UTC

I don't know the router, however: 1) it's a company router, so it should be more performant than a consumer one; 2) I experienced the problem after moving to a new office; 3) If it can be of interest the disconnection happens even if no messages are published. I will try to get info about the router.

Asked by alsora on 2019-01-31 12:48:50 UTC

3) If it can be of interest the disconnection happens even if no messages are published

yes, that is certainly important information.

Would you be able to try with a different RMW implementation (i.e. a different DDS vendor)?
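
A rough sketch of how such a comparison could be scripted, assuming the `RMW_IMPLEMENTATION` environment variable is used to pick the middleware and that the `demo_nodes_cpp` talker and listener are the nodes being run. Which of the listed RMW packages are actually installed is an assumption, and the helper names `node_commands` and `env_for` are made up for illustration. The script only prints what it would run; the commented line shows where the nodes would actually be started.

```python
import os

# RMW implementation package names; which ones are installed is an assumption.
RMW_CANDIDATES = ["rmw_fastrtps_cpp", "rmw_opensplice_cpp", "rmw_connext_cpp"]


def node_commands(count):
    """Alternate talker and listener commands from the demo_nodes_cpp package."""
    names = ["talker", "listener"]
    return [["ros2", "run", "demo_nodes_cpp", names[i % 2]] for i in range(count)]


def env_for(rmw, base=None):
    """Copy the environment and select the RMW implementation to use."""
    env = dict(os.environ if base is None else base)
    env["RMW_IMPLEMENTATION"] = rmw
    return env


if __name__ == "__main__":
    for rmw in RMW_CANDIDATES:
        print("RMW_IMPLEMENTATION=" + rmw)
        for cmd in node_commands(4):
            print("  " + " ".join(cmd))
            # To actually start the node: subprocess.Popen(cmd, env=env_for(rmw))
```

Starting the same number of nodes under each implementation and watching the link state would show whether the drop is tied to one DDS vendor.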

Asked by gvdhoorn on 2019-01-31 12:57:59 UTC

No issues with Opensplice. I have a different problem with Connext now, but I will try it too as soon as possible.

However this may be already suggesting that it's a fastrtps problem..

Asked by alsora on 2019-01-31 13:30:00 UTC

No issues with Opensplice. [..] However this may be already suggesting that it's a fastrtps problem..

yes, that would seem so.

I would suggest posting an issue on the fastrtps tracker.

Please post a link to that issue here, so we keep things connected.

Asked by gvdhoorn on 2019-01-31 13:32:22 UTC

Answers