
Best practices or code examples of how a "complete system" manages nodes

asked 2023-05-18 10:49:32 -0500

sameh4

Most good examples of ROS systems I've seen on GitHub make strong use of launch files and keep a clean separation between nodes.

What I have not seen is how a complete robot system handles node faults. For example, if a node goes down while the robot is running, there are several possible courses of action:

  1. try to relaunch the node
  2. if after 3 attempts the node does not come back, log or report this to some notification system

While that seems self-explanatory, does anyone have an example of how this is implemented?
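To make the relaunch-then-report policy listed above concrete, here is a minimal sketch using only the Python standard library. It is not a ROS API: `supervise`, `max_restarts`, `restart_delay`, and `notify` are hypothetical names chosen for illustration, and a real system would more likely express this through its launch tooling or a process manager.

```python
import subprocess
import sys
import time
from typing import Callable, List


def supervise(cmd: List[str], max_restarts: int = 3, restart_delay: float = 0.0,
              notify: Callable[[str], None] = print) -> bool:
    """Run cmd and relaunch it after a non-zero exit, up to max_restarts times.

    Returns True if the process eventually exits cleanly (exit code 0).
    Returns False after calling notify() once the restart budget is used up.
    All names here are illustrative assumptions, not part of ROS.
    """
    restarts = 0
    while True:
        # Blocks until the child process exits, then yields its exit code.
        returncode = subprocess.call(cmd)
        if returncode == 0:
            return True  # treated as an intentional, clean shutdown
        if restarts >= max_restarts:
            notify(f"{cmd[0]} still failing after {restarts} restarts "
                   f"(last exit code {returncode})")
            return False
        restarts += 1
        time.sleep(restart_delay)
```

Calling `supervise([sys.executable, "my_node.py"], notify=send_alert)` (with `my_node.py` and `send_alert` standing in for your own node and notification hook) would relaunch the process up to three times before reporting the failure.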

My primary issue with ROS 2 has always been the lack of open-source "professional projects" from which one can learn what a production-grade ROS 2 code base looks like. There are quite a few using ROS Noetic out in the wild, however.


1 Answer


answered 2023-05-18 13:34:29 -0500

updated 2023-05-18 13:35:22 -0500

Look at Nav2 and its Lifecycle Manager. It is a professionally maintained, commercially deployed system that checks for deadlocked or crashed servers so it can respawn them, and it uses lifecycle management to bring the system down into a safe state until the fault is handled, provided that fault is internal to Nav2.

For higher-level failures, however, it is still up to the application developer to put the system into a safe state. We make that as easy as possible by providing the tooling and infrastructure to set Nav2 into a safe state once your application detects a problem that requires it.
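As a rough illustration of the pattern described in this answer (detect a server that has crashed or stopped responding, then move the whole system into a safe state), here is a small standard-library Python sketch. It is not Nav2's code: Nav2's Lifecycle Manager is written in C++ and relies on bond connections, while `HeartbeatWatchdog`, its methods, and the `"ACTIVE"`/`"SAFE"` labels below are hypothetical names used only to show the idea.

```python
import time
from typing import Callable, Dict, List


class HeartbeatWatchdog:
    """Tracks the last heartbeat of each server and flags silent ones.

    Illustrative only: a simplified stand-in for a heartbeat-based
    supervisor, not the Nav2 Lifecycle Manager implementation.
    """

    def __init__(self, server_names: List[str], timeout: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._timeout = timeout
        self._clock = clock
        now = clock()
        # Every server is assumed alive at construction time.
        self._last_seen: Dict[str, float] = {name: now for name in server_names}
        self.state = "ACTIVE"

    def heartbeat(self, name: str) -> None:
        """Record that a server has just reported in."""
        self._last_seen[name] = self._clock()

    def stale_servers(self) -> List[str]:
        """Return servers whose last heartbeat is older than the timeout."""
        now = self._clock()
        return [n for n, t in self._last_seen.items() if now - t > self._timeout]

    def check(self, enter_safe_state: Callable[[List[str]], None]) -> str:
        """Switch to SAFE once, on the first check that finds a stale server."""
        stale = self.stale_servers()
        if stale and self.state == "ACTIVE":
            self.state = "SAFE"
            enter_safe_state(stale)  # e.g. stop motion, deactivate other servers
        return self.state
```

An application-level monitor could call `check()` periodically and pass its own safe-state routine as the callback; the injectable `clock` is there only so the timeout logic can be exercised without waiting in real time.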

