ROS2 subs/pubs stop working after node restart

Hi,

I have two nodes, simulation and control, which are designed to work side by side. The control node is designed to be started and stopped multiple times, while simulation should run continuously.

My problem is that after restarting the control node a few times, communication between the nodes stops. The simulation node publishes to the /raw_robot_data topic (custom message), and control publishes to /joint_command (another custom message). Both nodes are written in C++.

When I start both nodes for the first time it always works fine, but after restarting control a few times (1 to 3 times) the communication stops. I've tried to debug it: sometimes simulation stops publishing its topic, and sometimes control does not start publishing to its topic. I restart control by stopping the node with Ctrl+C and then starting it again via its launch file.

I've reviewed how the node is closed, making sure all the proper destructors are in place. Is this the proper way to shut down nodes? I haven't found a clear explanation of how to do it properly. I've also tried manually shutting down all subs/pubs by calling reset() on their shared pointers. I've also experimented with different QoS settings, trying a dozen or so combinations (both the ROS 2 predefined profiles and manually composed ones). The only difference I observed was with the best_effort reliability policy: the subscribers were not always unsubscribed after the node was stopped (verified using ros2 topic info).
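
For concreteness, the shutdown order being asked about can be sketched without any ROS dependency, so it compiles standalone. This is only an illustration under stated assumptions, not the actual node code: Endpoint, ControlNode and run_until_interrupted are hypothetical stand-ins. In a real rclcpp program, rclcpp::init installs the SIGINT handler by default, rclcpp::spin returns once it fires, and rclcpp::shutdown takes the place of the final step.

```cpp
#include <csignal>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Flag set from the signal handler; sig_atomic_t is safe to write there.
volatile std::sig_atomic_t g_stop_requested = 0;

extern "C" void on_sigint(int) { g_stop_requested = 1; }

// Hypothetical stand-in for a publisher or subscription handle.
class Endpoint {
public:
  Endpoint(std::string name, std::vector<std::string> & log)
  : name_(std::move(name)), log_(log) {}
  ~Endpoint() { log_.push_back("destroy " + name_); }

private:
  std::string name_;
  std::vector<std::string> & log_;
};

// Hypothetical stand-in for the control node; owns its endpoints.
class ControlNode {
public:
  explicit ControlNode(std::vector<std::string> & log)
  : log_(log),
    command_pub_(std::make_shared<Endpoint>("pub /joint_command", log)),
    data_sub_(std::make_shared<Endpoint>("sub /raw_robot_data", log)) {}

  ~ControlNode() {
    // Release endpoints explicitly, before the node itself goes away.
    data_sub_.reset();
    command_pub_.reset();
    log_.push_back("destroy node");
  }

private:
  std::vector<std::string> & log_;
  std::shared_ptr<Endpoint> command_pub_;
  std::shared_ptr<Endpoint> data_sub_;
};

// Calls step() repeatedly until SIGINT arrives, then tears everything
// down in order and returns the recorded teardown sequence.
std::vector<std::string> run_until_interrupted(const std::function<void()> & step) {
  std::vector<std::string> log;
  g_stop_requested = 0;
  std::signal(SIGINT, on_sigint);
  {
    auto node = std::make_shared<ControlNode>(log);
    while (!g_stop_requested) {
      step();  // one unit of work; spinning would happen here
    }
  }  // last reference to the node dropped here, so its destructor runs
  log.push_back("context shutdown");  // where rclcpp::shutdown() would go
  std::signal(SIGINT, SIG_DFL);
  return log;
}
```

The point of the sketch is that Ctrl+C only sets a flag: the loop exits normally, the node and its endpoints are destroyed while the process is still healthy, and only then is the context shut down. If the process is killed before this sequence finishes (for example by a launch-file timeout escalating to SIGTERM or SIGKILL), the endpoints never get to announce that they are leaving.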

I am using default ROS 2 settings wherever I can.

PLATFORM INFORMATION
system           : Linux
platform info    : Linux-5.16.11-76051611-generic-x86_64-with-glibc2.29
release          : 5.16.11-76051611-generic
processor        : x86_64

RMW MIDDLEWARE
middleware name    : rmw_fastrtps_cpp

ROS 2 INFORMATION
distribution name      : foxy
distribution type      : ros2
distribution status    : active
release platforms      : {'ubuntu': ['focal']}


Any ideas or clues on how to track the issue would be great.
