Diagnostic aggregator not reading messages from new publishers on /diagnostics topic [closed]

asked 2011-09-01 20:26:22 -0500

Victor Lopez

updated 2014-01-28 17:10:20 -0500

ngrennan

Here's the situation:

  1. Start roscore
  2. Start diagnostic aggregator
  3. Start node A
  4. Start node B
  5. Start node C
  6. Start node D

Continue in the same way until up to 20 nodes are running.

All the nodes publish to the /diagnostics topic. The aggregator reads that topic and republishes to /diagnostics_agg. The problem is that the aggregator only receives messages from nodes A and B, and marks all the other nodes as stale.

If I kill the aggregator and restart it, the issue goes away, so the problem is not in the aggregator's YAML file or in how the nodes publish to the topic.

Furthermore, if I look at the /diagnostics topic, I see messages from all the nodes being published there, and rostopic info /diagnostics shows 20 publishers and 1 subscriber (the aggregator).
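The rostopic info check can also be scripted against the master, which makes it easier to repeat while nodes are being started. This is a minimal sketch, assuming the standard ROS Master XML-RPC call getSystemState and a master at the default http://localhost:11311. The caller id /diag_check and both function names are hypothetical, and the import is the Python 3 module (on Python 2 it would be xmlrpclib).

```python
import xmlrpc.client


def nodes_on_topic(system_state, topic):
    """Return (publishers, subscribers) for `topic`.

    `system_state` is the payload of getSystemState:
    [publishers, subscribers, services], where the first two entries are
    lists of [topic_name, [node_name, ...]] pairs.
    """
    publishers, subscribers = system_state[0], system_state[1]
    pubs = next((nodes for name, nodes in publishers if name == topic), [])
    subs = next((nodes for name, nodes in subscribers if name == topic), [])
    return sorted(pubs), sorted(subs)


def query_master(topic, master_uri="http://localhost:11311",
                 caller_id="/diag_check"):
    """Ask the master which nodes are registered on `topic`."""
    master = xmlrpc.client.ServerProxy(master_uri)
    code, message, state = master.getSystemState(caller_id)
    if code != 1:  # 1 means success in the ROS master API
        raise RuntimeError(message)
    return nodes_on_topic(state, topic)


# Example (needs a running roscore):
#   pubs, subs = query_master("/diagnostics")
#   print(len(pubs), "publishers;", len(subs), "subscribers")
```

Note that this only shows what is registered with the master. It does not show whether the aggregator actually holds a connection to each publisher, which is the part that appears to be failing here.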

Here's the log of the diagnostic_aggregator:

[roscpp_internal] [2011-09-02 11:31:43,575] [thread 0xb5ac3930]: [DEBUG] UDPROS server listening on port [59791]
[roscpp_internal] [2011-09-02 11:31:43,580] [thread 0xb5ac3930]: [DEBUG] Started node [/my_diagnostic_aggregator], pid [2125], bound on [robot], xmlrpc port [54121], tcpros port [33420], logging to [/tmp/5d1153a4-d546-11e0-8b8e-00e0f41fb340/my_diagnostic_aggregator-1.log], using [real] time
[roscpp_internal] [2011-09-02 11:31:43,817] [thread 0xb5ab6b70]: [DEBUG] Accepted connection on socket [7], new socket [12]
[roscpp_internal] [2011-09-02 11:31:43,817] [thread 0xb5ab6b70]: [DEBUG] TCPROS received a connection from [127.0.1.1:43084]
[roscpp_internal] [2011-09-02 11:31:43,817] [thread 0xb5ab6b70]: [DEBUG] Connection: Creating TransportSubscriberLink for topic [/rosout] connected to [callerid=[/rosout] address=[TCPROS connection to [127.0.1.1:43084 on socket 12]]]
[roscpp_internal] [2011-09-02 11:31:44,020] [thread 0xb5ac3930]: [DEBUG] XML-RPC call [getParam] returned an error (-1): [Parameter [/my_diagnostic_aggregator/analyzers/Common/analyzers/guiServer/remove_prefix] is not set]
[roscpp_internal] [2011-09-02 11:31:44,021] [thread 0xb5ac3930]: [DEBUG] XML-RPC call [getParam] returned an error (-1): [Parameter [/my_diagnostic_aggregator/analyzers/Common/analyzers/guiServer/startswith] is not set]
[roscpp_internal] [2011-09-02 11:31:44,022] [thread 0xb5ac3930]: [DEBUG] XML-RPC call [getParam] returned an error (-1): [Parameter [/my_diagnostic_aggregator/analyzers/Common/analyzers/guiServer/name] is not set]
[roscpp_internal] [2011-09-02 11:31:44,023] [thread 0xb5ac3930]: [DEBUG] XML-RPC call [getParam] returned an error (-1): [Parameter [/my_diagnostic_aggregator/analyzers/Common/analyzers/guiServer/contains] is not set]
[roscpp_internal] [2011-09-02 11:31:44,024] [thread 0xb5ac3930]: [DEBUG] XML-RPC call [getParam] returned an error (-1): [Parameter [/my_diagnostic_aggregator/analyzers/Common/analyzers/guiServer/expected] is not set]
[roscpp_internal] [2011-09-02 11:31:45,277] [thread 0xb5ac3930]: [DEBUG] Publisher update for [/diagnostics]: http://robot:44143/, http://robot:59731/,  already have these connections: 
[roscpp_internal] [2011-09-02 11:31:45,277] [thread 0xb5ac3930]: [DEBUG] Began asynchronous xmlrpc connection to [robot:44143]
[roscpp_internal] [2011-09-02 11:31:45,277] [thread 0xb5ac3930]: [DEBUG] Began asynchronous xmlrpc connection to [robot:59731]
[roscpp_internal] [2011-09-02 11:31:45,423] [thread 0xb52b5b70]: [DEBUG] Connecting via tcpros to topic [/diagnostics] at host [robot:56835]
[roscpp_internal] [2011-09-02 11:31:45,423] [thread 0xb52b5b70]: [DEBUG] Resolved publisher host [robot] to [127.0.1.1] for socket [14]
[roscpp_internal] [2011-09-02 11:31:45,423] [thread 0xb52b5b70]: [DEBUG] Async connect() in progress to [robot:56835] on socket [14]
[roscpp_internal] [2011-09-02 ...
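
One way to dig through a log like this is to compare the publisher URIs the aggregator was told about with the TCPROS connections it actually attempted. This is a rough sketch that relies only on the two message formats visible in the log above ("Publisher update for [...]" and "Connecting via tcpros to topic [...]"). The function name and the regular expressions are my own, and other roscpp versions may word these messages differently.

```python
import re

# Patterns taken from the DEBUG lines in the log above.
UPDATE_RE = re.compile(
    r"Publisher update for \[(?P<topic>[^\]]+)\]: "
    r"(?P<uris>.*?)already have these connections"
)
CONNECT_RE = re.compile(
    r"Connecting via tcpros to topic \[(?P<topic>[^\]]+)\] "
    r"at host \[(?P<host>[^\]]+)\]"
)


def summarize(log_lines, topic="/diagnostics"):
    """Return (announced XML-RPC URIs, attempted TCPROS hosts) for `topic`."""
    announced = set()
    connected = set()
    for line in log_lines:
        match = UPDATE_RE.search(line)
        if match and match.group("topic") == topic:
            for uri in match.group("uris").split(","):
                if uri.strip():
                    announced.add(uri.strip())
            continue
        match = CONNECT_RE.search(line)
        if match and match.group("topic") == topic:
            connected.add(match.group("host"))
    return announced, connected


# Example:
#   with open("my_diagnostic_aggregator-1.log") as f:
#       announced, connected = summarize(f)
#   print(len(announced), "announced;", len(connected), "connection attempts")
```

The XML-RPC ports in the publisher updates differ from the TCPROS ports in the connection lines, so the two sets cannot be matched entry by entry. Only the counts are comparable: if far more publishers were announced than connections attempted, the subscriber side never followed up on some publisher updates.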

Closed by tfoote for the following reason: question is not relevant or outdated
Close date: 2013-02-03 07:33:59

Comments

How fast are you publishing your diagnostics? It could be that the incoming buffer on the aggregator is overflowing.
tfoote (2011-09-02 06:25:27 -0500)
Each node publishes to /diagnostics once a second. I don't think this is the cause, because restarting the aggregator solves the problem. Running roswtf reports "The following nodes should be connected but aren't:" followed by a list of nodes that aren't connected to the aggregator or to other nodes.
Victor Lopez (2011-09-04 19:07:53 -0500)

If this is still an issue, please open a ticket at https://github.com/ros/ros_comm

tfoote (2013-02-03 07:33:38 -0500)