Node not running at constant frequency

asked 2016-11-15 16:44:58 -0500

maxb

Hi there :)

I have a small problem running ROS on two machines, and I don't know how to track down the cause.

Setup: I'm running nodes on two machines at different but constant frequencies (e.g. IMU at 50 Hz, controller at 10 Hz, ...). Most of the time this works perfectly well, but occasionally all nodes seem to stop running for short instants (see the plot of IMU measurements here: https://postimg.org/image/g3p3n4xtz/ ). The plot shows that everything runs well until about 10.5 seconds, then slows down, and then there is a fairly long gap during which no data is published/received. I suspect two possible causes:

  1. Faulty time synchronization between the machines (I'm using chrony, but I can't guarantee that everything is set up correctly)
  2. CPU load so high that not all nodes get executed (which would be odd, since everything runs fine at other times)

Have you ever observed a similar problem or do you know what could cause it? And if not - which tools or methods would you suggest to find the problem?

Any help will be much appreciated, thank you very much in advance!

Best, Max


Comments

Do you have multiple callbacks (subscriber or service callbacks) in a single-threaded node? Also, check for any sleep in your callbacks: it might block the main thread and prevent other callbacks from being triggered.

DavidN (2016-11-16 00:21:39 -0500)
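To illustrate the blocking effect described in this comment, here is a minimal sketch in plain Python (no ROS involved). It simulates a dispatcher that runs callbacks one at a time in arrival order. The workload is hypothetical: IMU messages every 20 ms handled in 1 ms each, plus one assumed slow callback taking 100 ms. The function name and all numbers are made up for illustration.

```python
def simulate_single_thread(events):
    """Simulate callbacks processed one at a time, in arrival order.

    events: list of (arrival_time, name, duration) tuples, in seconds.
    Returns a list of (name, arrival, start, delay) tuples.
    """
    results = []
    free_at = 0.0  # time at which the single worker thread becomes free
    for arrival, name, duration in sorted(events):
        start = max(arrival, free_at)  # wait if the thread is still busy
        results.append((name, arrival, start, start - arrival))
        free_at = start + duration
    return results


# Hypothetical workload: 50 Hz IMU callbacks (1 ms each) and one slow
# callback arriving at t = 0.05 s that blocks the thread for 100 ms.
events = [(i * 0.02, "imu", 0.001) for i in range(10)]
events.append((0.05, "slow", 0.100))

for name, arrival, start, delay in simulate_single_thread(events):
    print(f"{name:5s} arrived={arrival:.3f} started={start:.3f} delay={delay:.3f}")
```

In this simulation the IMU callbacks that arrive while the slow callback is running are started late and then processed in a burst, which would look like a short pause followed by catch-up in a plot.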

Any details on how the two nodes are connected to one another? Wifi, for example, is known to cause lag.

Humpelstilzchen (2016-11-16 02:44:13 -0500)

I do have multiple callbacks in almost all nodes (7 nodes with up to 5 callbacks each), but there are no sleeps or other time-consuming functions in the callbacks. The nodes are connected by ssh over Wifi, so this might be the reason. I also found out that the machines are properly synchronized.

maxb (2016-11-16 18:06:28 -0500)

Assuming the wifi is indeed the problem: Do you know any tricks to make the connection more reliable? Both in-ROS (maybe decrease number of topics/messages) and 'outside' of ROS?

maxb (2016-11-16 20:05:17 -0500)
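Before reducing topics or messages, it may help to estimate how much payload the link has to carry. The following is a rough sketch only: the rates are taken from the question, but the topic names and message sizes are assumed placeholders, and transport overhead is ignored. Actual values can be measured with `rostopic hz` and `rostopic bw`.

```python
def required_bandwidth_bps(topics):
    """Sum the payload bandwidth of all topics, in bits per second.

    topics: dict mapping topic name -> (rate_hz, message_size_bytes).
    """
    return sum(rate * size * 8 for rate, size in topics.values())


# Hypothetical topics: rates from the question, sizes are assumptions.
topics = {
    "imu": (50, 320),         # 50 Hz, assumed ~320 bytes per message
    "controller": (10, 100),  # 10 Hz, assumed ~100 bytes per message
}

total = required_bandwidth_bps(topics)
print(f"payload: {total / 1e3:.1f} kbit/s")
```

If the estimate is far below what the Wifi link normally sustains, reducing message rates is unlikely to help much, and reception quality or latency spikes are the more plausible suspects.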

I had the problem that bad reception combined with a lot of traffic resulted in low throughput (low Bit Rate in iwconfig). For my robot, a bigger antenna helped.

Humpelstilzchen (2016-11-17 00:40:06 -0500)

Provided your application allows for it (can cope with lost msgs) and your msgs aren't too large (smaller than a datagram), you could try and see whether UDPROS works better. See Transport Hints for info.

gvdhoorn (2016-11-17 01:59:24 -0500)

Personally, I never understood the transport hints. Does one really need to patch every node to use UDP? I'd wish for a global setting...

Humpelstilzchen (2016-11-17 03:01:34 -0500)

Btw, we are indeed not sure yet whether Wifi is the problem. I'd recommend checking whether the problems also occur over Ethernet.

Humpelstilzchen (2016-11-17 03:02:47 -0500)
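Besides the Ethernet test, one possible way to narrow this down from recorded data is to compare the timestamps set by the publisher with the times at which the messages arrived. This is a minimal sketch in plain Python, assuming each message carries a stamp set on the publishing machine and that the receive time was logged on the subscribing machine. The function names, the threshold factor, and the sample data are hypothetical. Only intervals within each clock are compared, so the two clocks do not need to agree.

```python
def find_gaps(times, nominal_period, factor=2.0):
    """Return (index, interval) for every interval between consecutive
    times that exceeds factor * nominal_period."""
    gaps = []
    for i in range(1, len(times)):
        interval = times[i] - times[i - 1]
        if interval > factor * nominal_period:
            gaps.append((i, interval))
    return gaps


def classify(stamps, arrivals, nominal_period, factor=2.0):
    """Guess where a pause originated from the two time series."""
    if find_gaps(stamps, nominal_period, factor):
        return "publisher stalled"  # the stamps themselves have a gap
    if find_gaps(arrivals, nominal_period, factor):
        return "delivery delayed"   # published on time, arrived late
    return "ok"


# Hypothetical 50 Hz data: stamps are evenly spaced, but the last six
# messages arrive in a burst after a pause on the receiving side.
stamps = [i * 0.02 for i in range(10)]
arrivals = [0.001, 0.021, 0.041, 0.061, 0.500, 0.501, 0.502, 0.503, 0.504, 0.505]

print(classify(stamps, arrivals, nominal_period=0.02))
```

Evenly spaced stamps combined with bursty arrivals would point at the connection, while gaps in the stamps themselves would point at the publishing node or its machine.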