Is `diagnostics` the recommended system for monitoring sensor data rate?

asked 2020-03-30 09:24:58 -0500

Fyhn

https://github.com/mortenfyhn

updated 2020-03-31 07:10:50 -0500

Hi.

My goals are:

- to find an easy way to monitor whether a sensor driver node is publishing data at the expected rate, and
- to set up a general monitoring system for my ROS nodes, possibly tied to warning/error logging.

I've been looking into the diagnostics package and its sub-packages and tutorials. First of all, is diagnostics the recommended way to do what I'm trying to do? I was able to write a test node that publishes something and adds diagnostics for its own publishing rate, and it works, but it feels somewhat verbose for a "simple" task. Also, for third-party drivers, it would be convenient to be able to add topic rate monitoring without modifying the publisher (to avoid maintaining a fork of the driver).

For more general monitoring, is it possible to monitor nodes based on whether they have ever printed warnings/errors/fatal errors to the log? The idea is that nodes that only log info and debug messages are "green", but as soon as a node prints a warning it becomes "yellow", at least for a while, and so on.

Thanks.

Edit: A followup:

What about nodes that crash or otherwise fail to run their internal diagnostics? If a driver crashes, it won't, for instance, update its own topic frequency diagnostics, so I suppose the error won't show up on /diagnostics. Is that right, and if so, what's the recommended way to solve it?

answered 2020-04-01 01:20:11 -0500

mgruhler

https://github.com/mgruhler

I wanted to post this as a comment, as I don't feel it fully answers your question, but due to the size limit I'm posting it as an answer.

I'll only briefly touch on some of the questions, as they are pretty generic and would require quite a lengthy discussion (at least, some of them).

is diagnostics the recommended way to do what I'm trying to do

That one's easy, at least: Yes (in the ROS ecosystem, anyway) :-)

it feels somewhat verbose for a "simple" task

True, but then again, you should not be using the raw diagnostics messages. Use diagnostic_updater instead, as it greatly simplifies the required code. It is not clear whether you have done this. There are also helper classes/functions that specifically deal with publishers (check the example, l. 191-211). Note that you also need to call the tick() function on the respective topic diagnostic for each publish. All in all, this should be about 5 lines of code, which I don't feel is overly verbose.
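For illustration, here is a minimal sketch of that pattern in Python, assuming ROS 1 with rospy and the diagnostic_updater package installed. The node name, topic name, hardware ID, and 10 Hz target are made up for the example, and `rate_bounds` is a hypothetical helper, not part of any ROS package. The ROS imports are kept inside `main()` so the helper can be read on its own.

```python
def rate_bounds(expected_hz):
    """Hypothetical helper: build the {'min', 'max'} dict that
    diagnostic_updater.FrequencyStatusParam takes as its frequency bounds."""
    if expected_hz <= 0:
        raise ValueError("expected_hz must be positive")
    return {"min": expected_hz, "max": expected_hz}


def main():
    # ROS imports live here so the helper above does not depend on ROS.
    import rospy
    import diagnostic_updater
    from std_msgs.msg import String

    rospy.init_node("rate_diagnostics_example")  # assumed node name
    pub = rospy.Publisher("sensor_data", String, queue_size=10)  # assumed topic

    updater = diagnostic_updater.Updater()
    updater.setHardwareID("none")

    # Frequency check: 10 Hz nominal, 10 % tolerance, averaged over 10 updates.
    topic_diag = diagnostic_updater.HeaderlessTopicDiagnostic(
        "sensor_data",
        updater,
        diagnostic_updater.FrequencyStatusParam(rate_bounds(10.0), 0.1, 10),
    )

    rate = rospy.Rate(10)
    while not rospy.is_shutdown():
        pub.publish(String(data="reading"))
        topic_diag.tick()   # record one publish event
        updater.update()    # publishes on /diagnostics when its period is due
        rate.sleep()
```

Calling `main()` from the node's entry point starts the loop. The diagnostics-specific part is the updater setup plus the `tick()` and `update()` calls.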

for third party drivers, it would be convenient to be able to add topic rate monitoring without modifying the publisher

True (again). But there is no tool (that I know of) that can actually do that. You could achieve it by writing something like a generic "ExternalTopicRateMonitor" node, which subscribes to the topic. But this adds another node and one extra pub/sub cycle, so it defeats the purpose of the diagnostics (at least, to some extent). It could still help you achieve your goal.
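A rough sketch of what such an "ExternalTopicRateMonitor" could look like in Python, assuming ROS 1 with rospy. The class name `RateWindow`, the node name, the default topic, and the thresholds are all hypothetical. The rate logic is plain Python; the ROS wiring in `run()` subscribes with `rospy.AnyMsg` so the monitor does not need to know the message type, and publishes a `DiagnosticArray` on /diagnostics.

```python
from collections import deque

# Levels mirror diagnostic_msgs/DiagnosticStatus (OK=0, WARN=1, ERROR=2).
OK, WARN, ERROR = 0, 1, 2


class RateWindow:
    """Sliding-window message rate estimate (hypothetical helper)."""

    def __init__(self, min_hz, window_s=5.0):
        self.min_hz = min_hz
        self.window_s = window_s
        self.stamps = deque()

    def tick(self, now):
        """Record one received message at time `now` (seconds)."""
        self.stamps.append(now)

    def rate(self, now):
        """Messages per second over the last `window_s` seconds."""
        while self.stamps and now - self.stamps[0] > self.window_s:
            self.stamps.popleft()
        return len(self.stamps) / self.window_s

    def level(self, now):
        """ERROR if nothing arrived, WARN if too slow, OK otherwise."""
        hz = self.rate(now)
        if hz == 0:
            return ERROR
        return OK if hz >= self.min_hz else WARN


def run(topic="/sensor/data", min_hz=10.0):
    # ROS imports live here so RateWindow stays usable without ROS.
    import rospy
    from diagnostic_msgs.msg import DiagnosticArray, DiagnosticStatus, KeyValue

    rospy.init_node("external_topic_rate_monitor")  # assumed node name
    window = RateWindow(min_hz)
    rospy.Subscriber(topic, rospy.AnyMsg, lambda _msg: window.tick(rospy.get_time()))
    pub = rospy.Publisher("/diagnostics", DiagnosticArray, queue_size=10)
    ros_level = {OK: DiagnosticStatus.OK, WARN: DiagnosticStatus.WARN,
                 ERROR: DiagnosticStatus.ERROR}
    text = {OK: "rate OK", WARN: "rate too low", ERROR: "no messages received"}

    timer = rospy.Rate(1)
    while not rospy.is_shutdown():
        now = rospy.get_time()
        level = window.level(now)
        status = DiagnosticStatus(
            level=ros_level[level],
            name="rate monitor: " + topic,
            message=text[level],
            hardware_id="none",
            values=[KeyValue(key="rate_hz", value=str(window.rate(now)))],
        )
        array = DiagnosticArray(status=[status])
        array.header.stamp = rospy.Time.now()
        pub.publish(array)
        timer.sleep()
```

Note that the estimate reads low until the first full window has elapsed, so a real node would probably want a startup grace period. Because the monitor is a separate process, it keeps reporting (as ERROR) even if the driver itself has crashed.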

is it possible to monitor nodes based on whether they have ever printed warnings/errors/fatal errors to the log?

I'm not quite sure what you expect here. If you mean an off-the-shelf tool that you can just start up, then AFAIR, no. But then again, there is rqt_console, which does something pretty similar, so you could check how it is implemented there. Or is that actually already what you want?
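If you end up writing this yourself, one possible approach is a small node that listens to the aggregated log topic and remembers the worst recent severity per node. The sketch below assumes ROS 1, where log messages are `rosgraph_msgs/Log` on /rosout_agg with severity constants DEBUG=1, INFO=2, WARN=4, ERROR=8, FATAL=16. The class `LogSeverityTracker`, the node name, the 30 s hold time, and the colour names are hypothetical choices for illustration.

```python
import threading

# Severity constants as defined in rosgraph_msgs/Log.
LOG_WARN, LOG_ERROR, LOG_FATAL = 4, 8, 16


class LogSeverityTracker:
    """Remember when each node last logged at WARN or above (hypothetical helper)."""

    def __init__(self, hold_s=30.0):
        self.hold_s = hold_s
        self._last_seen = {}  # node name -> {severity: time last seen}

    def record(self, node, severity, now):
        """Note one log message; INFO and DEBUG only register the node."""
        seen = self._last_seen.setdefault(node, {})
        if severity >= LOG_WARN:
            seen[severity] = now

    def state(self, node, now):
        """'green', 'yellow' (warning) or 'red' (error/fatal) within hold_s."""
        seen = self._last_seen.get(node, {})
        active = [sev for sev, t in seen.items() if now - t <= self.hold_s]
        if not active:
            return "green"
        return "yellow" if max(active) == LOG_WARN else "red"

    def states(self, now):
        """State of every node that has logged anything so far."""
        return {node: self.state(node, now) for node in self._last_seen}


def run(hold_s=30.0):
    # ROS imports live here so the tracker stays usable without ROS.
    import rospy
    from rosgraph_msgs.msg import Log

    rospy.init_node("log_severity_monitor")  # assumed node name
    tracker = LogSeverityTracker(hold_s)
    lock = threading.Lock()  # subscriber and timer callbacks run in separate threads

    def on_log(msg):
        with lock:
            tracker.record(msg.name, msg.level, rospy.get_time())

    def report(_event):
        with lock:
            states = tracker.states(rospy.get_time())
        # Logged at INFO so the monitor does not flag itself.
        rospy.loginfo("node states: %s", states)

    rospy.Subscriber("/rosout_agg", Log, on_log)
    rospy.Timer(rospy.Duration(1.0), report)
    rospy.spin()
```

Keeping a separate timestamp per severity means a node drops from "red" to "yellow" rather than straight to "green" when an old error expires but a more recent warning is still within the hold time.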

What about nodes that crash

Sure, a crashed node cannot update its diagnostics. You could use a bond for that, but that would again force you to modify the code. Or go the "ExternalTopicRateMonitor" way again (hoping that this one doesn't crash) ;-)

I hope this answers some of your questions, but I'd also be interested in other people's opinions...


Comments

Thank you very much! diagnostic_updater works well.

Regarding diagnostics for node "aliveness", I've noticed rosmon does that without any modification to the nodes, maybe through the same interface that rosnode ping/info uses. I guess the catch is that rosmon can do this because it knows which nodes are expected to run: it is both the launcher and the monitor. A standalone system wouldn't automatically know which nodes are expected to be alive.

Fyhn ( 2020-04-06 07:44:21 -0500 )

Happy to help. Right, rosmon... Forgot about that one already. Still have it on my check-when-I-have-time list ;-)

mgruhler ( 2020-04-06 07:58:29 -0500 )
