4

What are 'bond' topics for?

asked 2011-03-20 08:02:07 -0500

updated 2011-03-20 08:34:49 -0500

... and why are there so many connections involving them? For example:

bouffard@lipschitz:~/ros$ rosnode info /pelican2/camera1394_nodelet
--------------------------------------------------------------------------------
Node [/pelican2/camera1394_nodelet]
...    
contacting node http://pelican2:59141/ ...
Pid: 7324
Connections:
...
 * topic: /pelican2/camera_nodelet_manager/bond
    * to: /pelican2/camera1394_nodelet
    * direction: outbound
    * transport: INTRAPROCESS
 * topic: /pelican2/camera_nodelet_manager/bond
    * to: /pelican2/camera_nodelet_manager
    * direction: outbound
    * transport: TCPROS
 * topic: /pelican2/camera_nodelet_manager/bond
    * to: /pelican2/jplvision_node
    * direction: outbound
    * transport: TCPROS
 * topic: /pelican2/camera_nodelet_manager/bond
    * to: http://pelican2:59141/
    * direction: inbound
    * transport: INTRAPROCESS
 * topic: /pelican2/camera_nodelet_manager/bond
    * to: http://pelican2:35545/
    * direction: inbound
    * transport: TCPROS
 * topic: /pelican2/camera_nodelet_manager/bond
    * to: http://pelican2:52260/
    * direction: inbound
    * transport: TCPROS

and

bouffard@lipschitz:~/ros$ rosnode info /pelican2/camera_nodelet_manager
--------------------------------------------------------------------------------
Node [/pelican2/camera_nodelet_manager]
...
contacting node http://pelican2:35545/ ...
Pid: 7317
Connections:
...
 * topic: /pelican2/camera_nodelet_manager/bond
    * to: /pelican2/camera_nodelet_manager
    * direction: outbound
    * transport: INTRAPROCESS
 * topic: /pelican2/camera_nodelet_manager/bond
    * to: /pelican2/camera1394_nodelet
    * direction: outbound
    * transport: TCPROS
 * topic: /pelican2/camera_nodelet_manager/bond
    * to: /pelican2/jplvision_node
    * direction: outbound
    * transport: TCPROS
...
 * topic: /pelican2/camera_nodelet_manager/bond
    * to: http://pelican2:35545/
    * direction: inbound
    * transport: INTRAPROCESS
 * topic: /pelican2/camera_nodelet_manager/bond
    * to: http://pelican2:59141/
    * direction: inbound
    * transport: TCPROS
 * topic: /pelican2/camera_nodelet_manager/bond
    * to: http://pelican2:52260/
    * direction: inbound
    * transport: TCPROS
...

In particular, I'm interested in why there are multiple connections on the same bond topic, and why some of them use the TCPROS transport despite the use of nodelets.

Edit with a followup question:

Doesn't having a 'spawner' process that hangs around for the lifetime of my nodelet, and moreover, is constantly communicating over TCPROS, defeat some of the main purposes of having nodelets, namely to reduce the number of processes and unnecessary network-level communication? In all my use cases at least, nodelets are a nice way to couple together what would otherwise be nodes but I don't need any of the respawning behaviour that you describe--either the entire manager plus its loaded nodelets are up, or they are all down. Could the 'spawner' nodes optionally exit after their nodelet has been loaded successfully?


2 Answers

8

answered 2011-03-20 08:22:55 -0500

tfoote

updated 2011-03-20 09:56:06 -0500

Bond is a tool which connects two units of code together with a heartbeat such that if one side goes down the other also goes down. See bond on the wiki for more information.

Nodelets use bond to connect the nodelet spawner to the dynamically loaded instance in the manager. That way, if the manager dies, all the spawner instances come down and, if desired, can be relaunched. Conversely, if a spawner process is killed (even with a kill -9 or by a segfault), the nodelet is unloaded when the bond breaks.

The nodelet spawner processes are separate processes which is why the connections are TCPROS.
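To make the heartbeat idea concrete, here is a minimal sketch in plain Python. It is not the real bond API: the class `HeartbeatBond`, the function `simulate`, and the 4-second timeout are all hypothetical names and values chosen for illustration. It only models one side of a bond that declares the link broken once heartbeats from the other side stop arriving.

```python
class HeartbeatBond:
    """Toy model of one side of a bond (hypothetical, not the bond package API)."""

    def __init__(self, heartbeat_timeout, on_broken=None):
        self.heartbeat_timeout = heartbeat_timeout  # seconds of silence tolerated
        self.on_broken = on_broken                  # callback run once when the bond breaks
        self.last_heard = None
        self.broken = False

    def start(self, now):
        # Treat the moment the bond is formed as the first heartbeat.
        self.last_heard = now

    def receive_heartbeat(self, now):
        # A broken bond stays broken; late heartbeats are ignored.
        if not self.broken:
            self.last_heard = now

    def check(self, now):
        # Break the bond if the peer has been silent for longer than the timeout.
        if not self.broken and now - self.last_heard > self.heartbeat_timeout:
            self.broken = True
            if self.on_broken is not None:
                self.on_broken()
        return self.broken


def simulate(kill_time, end_time, period=1.0, timeout=4.0):
    """Manager-side view: the spawner sends a heartbeat every `period` seconds
    until it is killed at `kill_time`. The timeout value is an assumption."""
    events = []
    bond = HeartbeatBond(timeout, on_broken=lambda: events.append("unload nodelet"))
    bond.start(0.0)
    broken_at = None
    t = 0.0
    while t <= end_time:
        if t < kill_time:
            bond.receive_heartbeat(t)  # spawner is still alive
        if bond.check(t) and broken_at is None:
            broken_at = t              # first tick at which the break is noticed
        t += period
    return broken_at, events
```

Calling `simulate(kill_time=5.0, end_time=12.0)` models a spawner that is killed abruptly: the manager side notices only after the timeout has elapsed and then runs its cleanup callback once. The same logic running on the spawner side is what would bring the spawner down when the manager dies.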

Re: Followup on the overhead of bond.

Nodelets were designed to allow zero-copy transport of high-volume data (fast, big, or both). Passing images or PointClouds through the network interface with 1-2 memcopies is prohibitively expensive even on fast hardware. Bond, by contrast, only sends a heartbeat, at a default rate of 1 Hz. Keeping a couple of sockets open with a 1 Hz update is low enough overhead that I doubt the difference is measurable. The first implementation of nodelets did not have bond; it was added to provide more control, the ability to introspect, and error recovery in case of failures.
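A rough back-of-the-envelope comparison illustrates the scale of that argument. Every number below is an assumption picked for illustration, not a measurement: the 200-byte heartbeat size, the two bond connections, and the 640x480 RGB camera at 30 Hz. Only the 1 Hz heartbeat rate comes from the answer above.

```python
def bytes_per_second(message_bytes, rate_hz, connections=1):
    """Payload throughput for a periodic message, ignoring protocol overhead."""
    return message_bytes * rate_hz * connections


# Assumed heartbeat traffic: small status messages over two TCPROS connections.
HEARTBEAT_BYTES = 200      # assumption: rough size of one heartbeat message
HEARTBEAT_HZ = 1.0         # default heartbeat rate stated in the answer
BOND_CONNECTIONS = 2       # assumption: "a couple of sockets"

# Assumed image traffic: uncompressed 640x480 RGB frames at 30 Hz.
IMAGE_BYTES = 640 * 480 * 3
IMAGE_HZ = 30.0

heartbeat_load = bytes_per_second(HEARTBEAT_BYTES, HEARTBEAT_HZ, BOND_CONNECTIONS)
image_load = bytes_per_second(IMAGE_BYTES, IMAGE_HZ)
ratio = image_load / heartbeat_load

print(f"heartbeat: {heartbeat_load:.0f} B/s")
print(f"images:    {image_load:.0f} B/s")
print(f"images carry about {ratio:.0f}x more data than the heartbeats")
```

Under these assumptions the heartbeats come to a few hundred bytes per second while the image stream is in the tens of megabytes per second, so the data that nodelets keep off the network dwarfs what bond puts on it. Different message sizes or rates change the exact ratio but not the order-of-magnitude gap.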


Comments

Thanks, but see my followup question above.
Patrick Bouffard ( 2011-03-20 08:51:51 -0500 )
1
I've updated my answer. You could make that feature request, but I'd want to see a measurable improvement for some use case before increasing the complexity of the API.
tfoote ( 2011-03-20 09:59:14 -0500 )
Thanks for the update -- could you expand on specifically what kinds of control, introspection, and error recovery are enabled by the bond?
Patrick Bouffard ( 2011-03-20 10:24:44 -0500 )
1
Control: you can easily stop the nodelet by killing the spawner. Introspection: if the spawner is up, you know that the nodelet is still running. Error recovery: if the nodelet manager crashes, the spawner will crash too, and whatever requested the nodelet has the option to respond (often it's roslaunch with respawn = true).
tfoote ( 2011-03-20 15:53:49 -0500 )
While I can't comment on how much sense this makes for other use cases, in my own if the manager or its nodelets are crashing, then it makes sense to fix the underlying issue, not to just respawn. I think I'll look into what it would take to make 'bondage' optional.
Patrick Bouffard ( 2011-03-20 16:33:56 -0500 )
I just created https://code.ros.org/trac/ros-pkg/ticket/4878 with some thoughts on implementation.
Patrick Bouffard ( 2011-03-20 16:34:36 -0500 )

Is there an updated url for that ticket?

lucasw ( 2019-09-24 17:22:54 -0500 )
0

answered 2011-03-20 08:14:56 -0500

Eric Perko

The bond wiki page has more details on what Bond is used for. I think the following from that wiki page addresses their use case for nodelets:

When spawning a nodelet (or anything else), two processes, the spawner and the container, communicate to bring the nodelet up, however, the current system does not cleanly deal with all termination possibilities (#4221). Creating a bond between the spawner and the container allows each to know when the other crashes and to implement appropriate recovery behaviors.

As to why they are using TCPROS as opposed to INTRAPROCESS, not sure. Perhaps bond isn't publishing shared_ptrs (which IIRC is the requirement for the intraprocess zero-copy roscpp magic). I could imagine using zero-copy to be a bad thing for this particular case - if one side dies after publishing a shared_ptr, is that shared_ptr still going to be valid? I can see using the standard copy-publish helping to avoid that type of question.

