ROS2 foxy: Can't discover nodes from other machines
Hi,
I'm currently looking into transitioning a ROS1 setup to ROS2. However, I can't get two machines to reliably talk to each other.
Setup
My test setup consists of a laptop (192.168.2.105) running Ubuntu 20.04 + ROS2 foxy and a Raspberry Pi 4b 4GB (192.168.2.185), also running Ubuntu (Server 64bit) 20.04 + foxy. For both installs I followed this tutorial. As far as I can tell, everything went smoothly. The suggested example (talker + listener) runs just fine locally. The systems are connected via wifi.
Problem
However, when I try to run the talker on the RPi and the listener on my laptop, I don't receive any output from the listener.
I run this to start the nodes:
RPi:
.bashrc:
source /opt/ros/foxy/setup.bash
source ~/dev_ws/install/setup.bash
export ROS_DOMAIN_ID=1
source /usr/share/colcon_cd/function/colcon_cd.sh
export _colcon_cd_root=~/ros2_install
ros2 run py_pubsub talker
Output:
[INFO] [1614045051.113840842] [minimal_publisher]: Publishing: "Hello World: 0"
[INFO] [1614045051.546117451] [minimal_publisher]: Publishing: "Hello World: 1"
[INFO] [1614045052.046084469] [minimal_publisher]: Publishing: "Hello World: 2"
Laptop:
.bashrc:
source /opt/ros/foxy/setup.bash
source ~/dev_ws/install/setup.bash
export ROS_DOMAIN_ID=1
source /usr/share/colcon_cd/function/colcon_cd.sh
export _colcon_cd_root=~/ros2_install
ros2 run py_pubsub listener
Output:
No output, stuck in waiting for message...
Running ros2 node list shows only the locally available node. Running ros2 doctor gives the following output, which I would consider OK (aside from the issue of this topic):
/opt/ros/foxy/lib/python3.8/site-packages/ros2doctor/api/package.py: 112: UserWarning: rqt_reconfigure has been updated to a new version. local: 1.0.6 < required: 1.0.7
/opt/ros/foxy/lib/python3.8/site-packages/ros2doctor/api/package.py: 124: UserWarning: Cannot find required versions of packages: py_pubsub my_package
/opt/ros/foxy/lib/python3.8/site-packages/ros2doctor/api/topic.py: 56: UserWarning: Subscriber without publisher detected on /chatter.
All 4 checks passed
Possible solutions
This is apparently a common issue, as there are many related questions. However, none of the suggested solutions solved my issue. Below I list every suggestion I could find and test myself, as a possible help to others:
Check multicast communication:
Machine 1:
ros2 multicast receive
Machine 2:
ros2 multicast send
When sending once, the receiver prints nothing. When sending a second time, the receiver reliably gets the message. Both directions (RPi <-> laptop) behave the same. I have tried waiting for a while after sending the first message, so this does not seem to be an issue of high latency.
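To see exactly which datagrams get lost, the multicast check can be reproduced with plain sockets and numbered messages. This is only a rough sketch, not the actual ros2 multicast implementation: the group 225.0.0.1 and port 49150 are what I believe the tool uses by default, and the function names are made up for illustration.

```python
import socket

# Assumed defaults: believed to match the ros2 multicast tool, please verify locally.
GROUP = "225.0.0.1"
PORT = 49150


def build_mreq(group, interface_ip="0.0.0.0"):
    """Pack an ip_mreq structure: group address followed by local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(interface_ip)


def receive_once(group=GROUP, port=PORT, timeout=10.0, interface_ip="0.0.0.0"):
    """Join the group and return (data, sender) for the first datagram, or None on timeout."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", port))
        sock.setsockopt(
            socket.IPPROTO_IP,
            socket.IP_ADD_MEMBERSHIP,
            build_mreq(group, interface_ip),
        )
        sock.settimeout(timeout)
        try:
            return sock.recvfrom(4096)
        except socket.timeout:
            return None
    finally:
        sock.close()


def send_numbered(count=3, group=GROUP, port=PORT, ttl=1):
    """Send `count` numbered datagrams so the receiver can tell which ones were dropped."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP) as sock:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
        for i in range(count):
            sock.sendto(f"Hello World: {i}".encode(), (group, port))
```

Run receive_once() on one machine and send_numbered() on the other. If the receiver returns "Hello World: 1" rather than "Hello World: 0", the first datagram is being dropped, which would match the behaviour described above. Passing the wifi adapter's own address as interface_ip pins the group membership to that adapter.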
Configure firewall
sudo ufw allow in proto udp to 224.0.0.0/4
sudo ufw allow in proto udp from 224.0.0.0/4
ifconfig
Multicast is listed in both wifi adapters.
Check the network by pinging IP:
ping 192.168.2.185
PING 192.168.2.185 (192.168.2.185) 56(84) bytes of data.
64 bytes from 192.168.2.185: icmp_seq=1 ttl=64 time=29.7 ms
64 bytes from 192.168.2.185: icmp_seq=2 ttl=64 time=3.88 ms
64 bytes from 192.168.2.185: icmp_seq=3 ttl=64 time=3.81 ms
ping 192.168.2.105
PING 192.168.2.105 (192.168.2.105) 56(84) bytes of data.
64 bytes from 192.168.2.105: icmp_seq=1 ttl=64 time=0.046 ms
64 bytes from 192.168.2.105: icmp_seq=2 ttl=64 time=0.045 ms
64 bytes from 192.168.2.105: icmp_seq=3 ttl=64 time=0.046 ms
Check ROS_DOMAIN_ID:
echo $ROS_DOMAIN_ID
Laptop: 1
Raspi: 1
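The domain ID also decides which UDP ports DDS uses, which matters when writing firewall rules. Below is a small sketch of the well-known port calculation from the DDS-RTPS specification, assuming the default port parameters (vendors allow overriding them, so treat the numbers as a starting point). The function name is hypothetical.

```python
# Default RTPS port parameters (assumed unchanged from the specification defaults).
PB, DG, PG = 7400, 250, 2          # port base, domain gain, participant gain
D0, D1, D2, D3 = 0, 10, 1, 11      # offsets for the four traffic classes


def rtps_ports(domain_id, participant_id=0):
    """Return the default UDP ports for one participant in the given domain."""
    base = PB + DG * domain_id
    return {
        "discovery_multicast": base + D0,
        "user_multicast": base + D2,
        "discovery_unicast": base + D1 + PG * participant_id,
        "user_unicast": base + D3 + PG * participant_id,
    }
```

With ROS_DOMAIN_ID=1, rtps_ports(1) gives 7650 for multicast discovery and 7660/7661 for the unicast ports of the first participant on a host. Each additional participant on the same host moves the unicast ports up by 2.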
Check RMW_IMPLEMENTATION:
echo $RMW_IMPLEMENTATION
Laptop:
Raspi:
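An empty result here means the variable is unset, in which case ROS 2 falls back to its default middleware. A tiny sketch of that fallback, assuming a stock Foxy binary install where the default is rmw_fastrtps_cpp (the helper name is made up):

```python
import os

# Assumption: default RMW of a stock Foxy binary install.
FOXY_DEFAULT_RMW = "rmw_fastrtps_cpp"


def effective_rmw(env=None):
    """Return the RMW implementation that would be used for the given environment."""
    env = os.environ if env is None else env
    # An unset or empty variable both fall back to the default.
    return env.get("RMW_IMPLEMENTATION") or FOXY_DEFAULT_RMW
```

Under that assumption both machines end up on the same implementation, so a vendor mismatch is unlikely to be the cause.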
Deactivating unused network adapters
sudo ifconfig eth0 down
ifconfig no longer lists eth0 on both devices. wlan0/wlp2s0 and lo remain. Turning off lo terminates the ssh connection between the machines, so I assume it is needed.
Check UDP connection
Machine1:
nc -lu -p 5555
Machine2:
nc -vzu 192.168.1.xx 5555
Works in both directions on first try: Connection to ... [udp/rplay] succeeded!
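One caveat: as far as I know, nc -vzu reports success whenever no ICMP error comes back, so it does not prove the datagram actually arrived. A stricter check is to have the listening side confirm the payload it received. Here is a minimal sketch with plain sockets; the function names and the port 5555 are just carried over or made up for illustration.

```python
import socket


def open_listener(port, host="0.0.0.0"):
    """Bind a UDP socket; the caller is responsible for closing it."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def wait_for_probe(sock, timeout=5.0):
    """Return (payload, sender_address), or None if nothing arrives in time."""
    sock.settimeout(timeout)
    try:
        return sock.recvfrom(1024)
    except socket.timeout:
        return None


def send_probe(host, port, payload=b"probe"):
    """Send a single UDP datagram to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

On machine 1, call wait_for_probe(open_listener(5555)); on machine 2, call send_probe with machine 1's address and port 5555. A None result on the listener means the datagram never arrived, whatever the sender reported.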
Restart ros2 daemon:
ros2 daemon stop
ros2 daemon start
Check for same subnet
ifconfig shows netmask 255.255.255.0 for both machines.
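The same-subnet check can also be done explicitly instead of by eye. A short sketch using Python's ipaddress module with the two addresses from the setup above; the helper name is hypothetical, and the 192.168.1.50 address is only there as a made-up counterexample.

```python
import ipaddress


def same_subnet(ip_a, ip_b, netmask):
    """Return True if both addresses fall into the same network for the given netmask."""
    net_a = ipaddress.ip_interface(f"{ip_a}/{netmask}").network
    net_b = ipaddress.ip_interface(f"{ip_b}/{netmask}").network
    return net_a == net_b


# Laptop and RPi addresses from the setup section.
print(same_subnet("192.168.2.105", "192.168.2.185", "255.255.255.0"))
# Hypothetical address on a different /24, for comparison.
print(same_subnet("192.168.2.105", "192.168.1.50", "255.255.255.0"))
```

The first call prints True and the second prints False, since 192.168.1.x and 192.168.2.x are different networks under a 255.255.255.0 mask.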
Set the rm_filter parameter
I could not find any information on this and thus have not tried it. Hints are very welcome!
Further investigations
I will gladly provide more info on my setup; however, I feel this post is already quite lengthy.
I would appreciate any hints on how to solve this issue. While researching this I came across many people (not only those 5 posts linked above) that struggled with apparently the same problem.
Asked by Finn2708 on 2021-02-22 21:08:24 UTC
Comments
I have the exact same problem. Just to rule out possible problems due to wifi, I have both physical machines on the same network, connected via Ethernet.
Asked by goksankobe on 2021-04-09 04:44:32 UTC
Were you able to fix this problem? I have been stuck on the same issue for the last two days and none of the solutions worked for me. I don't even get multicast running: no messages are being sent from or to any machine, and I have the exact same setup as yours.
Asked by Dp on 2023-07-21 03:36:43 UTC