Nodelets, getMTNodeHandle, timers and publish slowdown
We have a nodelet that we allow to run multithreaded by using the nodelet::Nodelet::getMTNodeHandle() function.
Next, we use ros::NodeHandle::createTimer(0.001) to create four 1 ms timers that do not explicitly share any context. Within each timer callback we call ros::Publisher::publish() to publish a short message (8 bytes or so) to a shared topic with a queue size of 100. Each timer tries to read from a hardware device (a CAN bus) that publishes at over 100 Hz, and if it finds input it is interested in, it publishes it to the topic.
Measuring the execution time of publish(), we find that about 4 to 8 times per second the publish() call alone takes longer than 1 ms, sometimes up to 20 ms.
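For anyone wanting to reproduce this kind of measurement, here is a minimal sketch of timing a single call with a monotonic clock and counting how often it exceeds a threshold. It uses only the C++ standard library; CallTimingStats and timeCall are hypothetical names made up for illustration and are not part of roscpp. In a timer callback, the lambda would wrap the actual publish() call. The stats object is not thread-safe, so with several timers running on different threads this assumes one stats object per timer.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <thread>
#include <utility>

// Hypothetical helper (not a roscpp type): accumulates call durations.
struct CallTimingStats {
    std::size_t calls = 0;              // total number of timed calls
    std::size_t slow_calls = 0;         // calls that took longer than the threshold
    std::chrono::nanoseconds worst{0};  // longest single call observed
};

// Runs fn() once, measures it with a monotonic clock, and updates stats.
// Not thread-safe: assumes each timer callback owns its own stats object.
template <typename Fn>
void timeCall(CallTimingStats& stats, std::chrono::nanoseconds threshold, Fn&& fn) {
    assert(threshold.count() > 0);
    const auto start = std::chrono::steady_clock::now();
    std::forward<Fn>(fn)();
    const auto elapsed = std::chrono::duration_cast<std::chrono::nanoseconds>(
        std::chrono::steady_clock::now() - start);
    ++stats.calls;
    if (elapsed > threshold) {
        ++stats.slow_calls;
    }
    if (elapsed > stats.worst) {
        stats.worst = elapsed;
    }
}
```

A call such as timeCall(stats, std::chrono::milliseconds(1), [&] { /* the publish() call goes here */ }); would then count every call that takes longer than 1 ms. steady_clock is used because it is monotonic and is not affected by wall-clock adjustments.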
I know from previous experience that publish() can cause a context switch, and there are a number of other threads running, but nothing that should be hogging the system for 20 ms.
Has anyone experienced something like this? What might be the cause? Are timers re-entrant? Is publish() thread-safe?
This sounds a lot like what would happen if you don't use a real-time OS with deterministic scheduling for your threads and a real-time capable node.
Additionally: "normal Linux" kernels and userlands might have difficulty running (multiple) timers consistently at rates of 1 kHz and more. If you are not using preempt_rt (or something similar), it's all going to be best effort. Missing deadlines is unfortunately something that can happen then.
Note: I'm not saying this is the ultimate cause of the issue you describe. I just wanted to make you aware of (one of) the limitations of a non-real-time OS.
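One way to check how much of the delay comes from the scheduler rather than from ROS is to measure wake-up lateness of a plain 1 ms periodic loop on the same machine. The sketch below is only an illustration using the C++ standard library; JitterResult and measureWakeupLateness are made-up names, and the loop is not a substitute for a proper real-time latency test.

```cpp
#include <algorithm>
#include <chrono>
#include <thread>

// Hypothetical result type for this sketch.
struct JitterResult {
    int ticks = 0;                               // number of wake-ups performed
    std::chrono::nanoseconds worst_lateness{0};  // largest delay past a deadline
};

// Sleeps until each successive deadline (spaced `period` apart) and records
// how late the thread actually woke up, keeping the worst case.
JitterResult measureWakeupLateness(std::chrono::nanoseconds period, int ticks) {
    using clock = std::chrono::steady_clock;
    JitterResult result;
    auto deadline = clock::now() + period;
    for (int i = 0; i < ticks; ++i) {
        std::this_thread::sleep_until(deadline);
        const auto late = std::chrono::duration_cast<std::chrono::nanoseconds>(
            clock::now() - deadline);
        result.worst_lateness = std::max(result.worst_lateness, late);
        ++result.ticks;
        deadline += period;  // absolute deadlines, so lateness does not accumulate
    }
    return result;
}
```

Running measureWakeupLateness(std::chrono::milliseconds(1), 10000) while the rest of the system is under its usual load gives a rough idea of the worst wake-up delay. If that alone reaches several milliseconds, the spikes seen around publish() may have little to do with publish() itself.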
Note that even with an RT OS you may run into issues, as roscpp is not real-time safe.