ROS2 subs/pubs stop working after node restart
Hi,
I have two nodes, simulation and control, which are designed to work side by side. The control node is designed to be started and stopped multiple times, while simulation should run continuously.
My problem is that after restarting the control node a few times, the communication between the nodes stops.
The simulation node publishes to the /raw_robot_data topic (custom message), and control publishes to /joint_command (another custom message). Both nodes are written in C++.
When I start both nodes for the first time, it always works fine, but after restarting control a few times (1 to 3 times) the communication stops. I've tried to debug it, and it seems that sometimes simulation stops publishing to its topic, and sometimes control does not start publishing to its topic. I restart control by first stopping the node with Ctrl+C and then starting it again via a launch file.
I've reviewed how the node is closed, making sure all the proper destructors are in place. Is this the proper way to shut down nodes? I haven't found a clear explanation of how to do it properly. I've also tried to manually shut down all subs/pubs using the *.reset() methods of the shared pointers. I've also experimented with different QoS settings, using a dozen different combinations (both the ROS2 premade profiles and manually composed ones). The only difference I observed was that with the best_effort strategy, the subscribers were not always unsubscribed after the node was stopped (verified using ros2 topic info).
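A minimal sketch of the shutdown order described above (Ctrl+C stops the loop, then the shared-pointer handles are released with reset()). It deliberately does not use rclcpp: Endpoint, handle_sigint, and run_until_sigint are hypothetical stand-ins introduced only for illustration, and the real publishers and subscriptions may behave differently.

```cpp
#include <cassert>
#include <csignal>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Flag set by the SIGINT handler; sig_atomic_t is safe to write from a handler.
volatile std::sig_atomic_t g_stop = 0;

extern "C" void handle_sigint(int) { g_stop = 1; }

// Hypothetical stand-in for a publisher/subscription handle (not an rclcpp type).
// It records when it is opened and closed so the release order can be inspected.
struct Endpoint {
  std::string topic;
  std::vector<std::string>* log;

  Endpoint(std::string t, std::vector<std::string>* l)
      : topic(std::move(t)), log(l) {
    log->push_back("open " + topic);
  }
  ~Endpoint() { log->push_back("close " + topic); }
};

// Runs a loop until SIGINT arrives, then releases the handles explicitly.
// raise_after simulates the user pressing Ctrl+C on that loop iteration.
std::vector<std::string> run_until_sigint(int raise_after) {
  std::vector<std::string> log;
  g_stop = 0;
  std::signal(SIGINT, handle_sigint);

  auto sub = std::make_shared<Endpoint>("/raw_robot_data", &log);
  auto pub = std::make_shared<Endpoint>("/joint_command", &log);

  int ticks = 0;
  while (!g_stop) {
    ++ticks;
    if (ticks == raise_after) {
      std::raise(SIGINT);  // simulated Ctrl+C; the handler runs before raise returns
    }
  }

  // Explicit release, publisher first, then subscriber.
  pub.reset();
  sub.reset();

  std::signal(SIGINT, SIG_DFL);  // restore the default handler
  return log;
}
```

Calling run_until_sigint(3) returns a log with two "open" entries followed by two "close" entries, which makes it easy to check that every handle is actually released before the process exits.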
I am using the default ROS2 settings wherever I can.
PLATFORM INFORMATION
system : Linux
platform info : Linux-5.16.11-76051611-generic-x86_64-with-glibc2.29
release : 5.16.11-76051611-generic
processor : x86_64
RMW MIDDLEWARE
middleware name : rmw_fastrtps_cpp
ROS 2 INFORMATION
distribution name : foxy
distribution type : ros2
distribution status : active
release platforms : {'ubuntu': ['focal']}
Any ideas or clues on how to track the issue would be great.