Nodelets and bond timeouts

asked 2011-04-11 12:46:13 -0500

Daniel Stonier

updated 2011-09-05 01:36:07 -0500

joq

I've been having a problem since the Diamondback upgrade: quite often my nodelets on the robot suddenly exit cleanly with no warning or message. After some debugging, I've tracked this down to the bond heartbeat checks.

I observed that it only happens when monitoring the robot over the wireless link, which in our congested building often has rather lengthy delays. This seems to trigger the bond OnHeartBeatTimeout, which prompts everything to shut up shop (it would be good to print a message when doing so).

I can disable these checks via the /bond_disable_heartbeat_timeout parameter, or increase the timeouts in bond/msg, and either change keeps the system running.
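To make the failure mode concrete, here is a minimal sketch of a heartbeat watchdog in plain Python. It is not the bond library's code: the function name `first_timeout`, the 1-second heartbeat spacing, and the 4-second and 10-second timeouts are all assumptions chosen for illustration. It only shows how a burst of network delay can look like a dead peer.

```python
def first_timeout(arrival_times, timeout):
    """Return the time at which a heartbeat watchdog would fire, or None.

    arrival_times: sorted times (seconds) at which heartbeats were received.
    timeout: maximum allowed silence between consecutive heartbeats.
    Hypothetical helper for illustration; not part of the bond API.
    """
    last_seen = 0.0  # watchdog is armed at t = 0
    for t in arrival_times:
        if t - last_seen > timeout:
            # The watchdog fired before this heartbeat arrived.
            return last_seen + timeout
        last_seen = t
    return None


# Heartbeats are sent once per second, but a burst of wifi congestion
# holds back the ones sent at t=3..7 so they all arrive together at t=8.
arrivals = [1.0, 2.0, 8.0, 8.0, 8.0, 8.0, 8.0, 9.0]

print(first_timeout(arrivals, timeout=4.0))   # short timeout: fires during the stall
print(first_timeout(arrivals, timeout=10.0))  # longer timeout: rides out the stall
```

In this model the peer never stopped sending heartbeats, yet the shorter timeout still fires, which is consistent with raising the timeout keeping the system alive.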

Some questions.

  1. Why does the robot nodelet exit if a remote service client connection has a problem, or is it the source of the problem itself?
  2. What should I be wary of when disabling these globally via the param?
  3. Would it be possible to move configuration of the timeouts to $ROS_ROOT/config?

1 Answer


answered 2011-04-27 10:25:17 -0500

tfoote

updated 2019-04-03 16:06:30 -0500

130s

I would suggest that you simply launch the nodelet loaders and the manager on the same side of the wifi link, and this whole problem will be avoided.

To your specific questions:

  1. The bond functionality was added to nodelets to allow easier error management and recovery. In particular, if an individual nodelet segfaults it will take down the entire manager. With the bond implementation and the respawn argument in roslaunch, the entire system will restart successfully. Without the bond connection it is also much harder to unload nodelets.
  2. When the timeouts are disabled, crashes of the manager will not be recovered from as usual. This flag is designed to enable debugging in gdb (otherwise, when the process is interrupted by gdb, the bond will time out and unload the nodelet).
  3. There are a few other places where this timeout could be set. However, it would need to be consistent across all computers on the network, which $ROS_ROOT/config does not guarantee.
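The trade-off in point 2 can be sketched with a toy model. This is an illustration only: `crash_detected_at` and its parameters are hypothetical names, the 4-second timeout is an assumed value, and none of this is the bond API. It models a watchdog that either notices a crashed peer after the timeout or, with the timeout disabled, never notices it.

```python
def crash_detected_at(last_heartbeat, timeout, timeout_disabled=False):
    """Time at which a watchdog notices a peer that stopped sending heartbeats.

    last_heartbeat: time (seconds) of the final heartbeat before the crash.
    timeout: allowed silence before the peer is declared dead.
    timeout_disabled: models a global "disable heartbeat timeout" switch.
    Returns None when the crash is never detected.
    Hypothetical helper for illustration; not part of the bond API.
    """
    if timeout_disabled:
        return None  # nothing ever fires, so nothing triggers cleanup or respawn
    return last_heartbeat + timeout


# The peer crashes right after its heartbeat at t = 12 s.
print(crash_detected_at(12.0, timeout=4.0))
print(crash_detected_at(12.0, timeout=4.0, timeout_disabled=True))
```

With the switch off, the crash is noticed a bounded time later and recovery can start; with it on, the silence is never treated as a failure, which suits a process paused in gdb but also hides real crashes.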

Comments

The loaded nodelets and their manager are all on the same side of the wifi link. The remote client on the other side of the wifi is a separate standalone roscpp node used for monitoring; it communicates with the nodelets over ROS services (.srv).
Daniel Stonier (2011-04-27 11:52:07 -0500)

Any new ideas regarding this? We've been running into this on some machines, and there appears to be no real solution apart from reducing load and hoping for the best.

Stefan Kohlbrecher (2015-03-06 19:58:35 -0500)

@Stefan Kohlbrecher I suggest you open a new question; this one is pretty old.

tfoote (2015-03-06 20:31:33 -0500)
