Revision history - ROS Answers: Open Source Q&A Forum

After spending quite some time looking for information about ROS callback queues, multi-threaded spinners and timers, I think I can answer my own questions. It seems the ROS documentation could be improved on this topic.

A very useful article about multi-threaded spinners and multiple callback queues can be found here: https://levelup.gitconnected.com/ros-spinning-threading-queuing-aac9c0a793f

As is well known, ROS does some threading internally. It receives subscriber data in a receiver thread and schedules timers in a timer thread, which allows data to be received independently of the spinner. However, these threads do not execute the callbacks, they only queue them. For a subscriber, the incoming message is added to the subscriber queue; for a timer, the timer callback is added to the callback queue. Timer callbacks therefore end up in the same callback queue as subscriber callbacks.

Now, on each spin the contents of the subscriber queue are added to the callback queue. The number of elements to keep in the subscriber queue can be set (the queue size); if the spinner runs at a much lower frequency than messages arrive, messages might be dropped. The complete callback queue is then processed in first-in-first-out order, which means that in each spin every element in the callback queue gets processed.
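
To make the dropping behaviour concrete, here is a minimal sketch in plain C++ (no ROS involved). It is a toy model of what is described above, not roscpp's actual implementation: the class `BoundedSubscriberQueue`, its methods and `simulateSlowSpinner` are made-up names, messages are plain integers, and I assume that the oldest message is the one discarded when the queue is full.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <vector>

// Toy model (not roscpp): a per-subscriber message queue with a fixed size.
class BoundedSubscriberQueue {
public:
    // capacity == 0 is treated as "unbounded" in this model.
    explicit BoundedSubscriberQueue(std::size_t capacity) : capacity_(capacity) {}

    // Receiver side: called whenever a message arrives.
    void push(int msg) {
        if (capacity_ > 0 && messages_.size() == capacity_) {
            messages_.pop_front();  // assumption: the oldest message is dropped
            ++dropped_;
        }
        messages_.push_back(msg);
    }

    // Spinner side: hand every pending message to the callback, oldest first.
    std::size_t spinOnce(const std::function<void(int)>& callback) {
        std::size_t processed = 0;
        while (!messages_.empty()) {
            const int msg = messages_.front();
            messages_.pop_front();
            callback(msg);
            ++processed;
        }
        return processed;
    }

    std::size_t dropped() const { return dropped_; }

private:
    std::size_t capacity_;
    std::deque<int> messages_;
    std::size_t dropped_ = 0;
};

// Five messages arrive between two spins while the queue size is 2.
std::vector<int> simulateSlowSpinner() {
    BoundedSubscriberQueue queue(2);
    for (int msg = 1; msg <= 5; ++msg) {
        queue.push(msg);
    }
    std::vector<int> seen;
    queue.spinOnce([&seen](int msg) { seen.push_back(msg); });
    return seen;
}
```

In this model `simulateSlowSpinner()` returns only messages 4 and 5; messages 1 to 3 are dropped before the spinner ever gets to them.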

Callbacks should always be fast, because several instances of the same callback can end up in the queue if the subscriber queue size is not limited to 1. For a timer to work properly, the spinner has to run regularly and at a high frequency: the ROS timer only adds its callback to the callback queue, and the callback is executed when spin is called, possibly only after other callbacks have been processed.

Also, for this reason, it makes sense to use timers instead of limiting the frequency of the spinner. Limiting the spin frequency means that all callbacks are delayed just because of maybe one function that should be executed at a certain frequency. The spinner does much more than that, so we should not limit it.

Since the spinner becomes a bottleneck if processing the callback queue takes too much time, it makes sense to use a multi-threaded spinner. If you use a multi-threaded spinner or an AsyncSpinner, keep in mind that a lock is applied per callback (no concurrency by default), so multiple callbacks of the same type are not processed in parallel. However, if enough threads are available, the next unlocked callback in the queue will be called (https://stackoverflow.com/a/48544551/8623933). This also means that if a timer callback takes longer than the timer period, the next one will start with a delay. No new timer callback will be added to the callback queue during this time (https://answers.ros.org/question/248656/does-callbacks-get-drop-from-queue-when-it-is-exceeded-it-expected-execution-time-by-too-much/).
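
The scheduling rule above can be sketched as a small deterministic simulation in plain C++ (no ROS and no real threads). This is only my reading of the behaviour, not roscpp code: `QueuedCallback`, `Execution` and `simulateSpinner` are made-up names, all callbacks are assumed to be in the queue at time 0, and the per-callback lock is modelled as "an owner that is still running cannot be started again".

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <map>
#include <string>
#include <vector>

// Toy model (not roscpp): one entry of the callback queue.
struct QueuedCallback {
    std::string owner;  // the subscriber or timer the callback belongs to
    int duration_ms;    // how long the callback body takes
};

struct Execution {
    std::string owner;
    int start_ms;
    int end_ms;
};

// Simulates num_threads spinner threads working on one shared callback queue.
std::vector<Execution> simulateSpinner(std::deque<QueuedCallback> pending,
                                       std::size_t num_threads) {
    std::vector<Execution> timeline;
    if (num_threads == 0) {
        return timeline;
    }
    std::vector<int> thread_free_at(num_threads, 0);
    std::map<std::string, int> owner_busy_until;

    while (!pending.empty()) {
        // The thread that becomes idle first looks at the queue next.
        auto thread = std::min_element(thread_free_at.begin(), thread_free_at.end());
        const int now = *thread;

        // Take the first queued callback whose owner is not running right now.
        auto next = std::find_if(pending.begin(), pending.end(),
                                 [&](const QueuedCallback& cb) {
                                     return owner_busy_until[cb.owner] <= now;
                                 });

        if (next == pending.end()) {
            // Everything left is locked: idle until the first owner is released.
            int wake = owner_busy_until[pending.front().owner];
            for (const QueuedCallback& cb : pending) {
                wake = std::min(wake, owner_busy_until[cb.owner]);
            }
            *thread = wake;
            continue;
        }

        const int end = now + next->duration_ms;
        timeline.push_back({next->owner, now, end});
        owner_busy_until[next->owner] = end;
        *thread = end;
        pending.erase(next);
    }
    return timeline;
}
```

For a queue holding two 30 ms callbacks of a hypothetical `slow_sub` followed by one 10 ms callback of `fast_sub`, the model with two threads starts `fast_sub` at 0 ms, in parallel with the first `slow_sub` callback, while the second `slow_sub` callback has to wait until 30 ms. With a single thread, `fast_sub` only starts at 60 ms.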

If some processes are more time-critical and cannot wait until the elements ahead of them in the callback queue are processed, we can use multiple callback queues and assign subscribers and timers (http://docs.ros.org/en/diamondback/api/roscpp/html/structros_1_1TimerOptions.html) to them. There are different ways to do this. The gitconnected article linked above describes how to create another spinner in a new thread. Another option is to add an AsyncSpinner with its own callback queue in addition to the global callback queue, as shown here: https://gist.github.com/bgromov/e6f5eb142346b3c88e9f96bce17eee92

In summary, the solution to my problem is to create an AsyncSpinner, which can process the timer callback in parallel so that it does not block the faster callbacks. If this is not enough, I would start using multiple callback queues for critical processes.

There are still two open questions:
- Some sources say that the complete callback queue is locked, others say that only callbacks of a certain type are blocked. The first case would mean that callbacks are not actually processed in parallel, which disagrees with this tutorial: https://roboticsbackend.com/ros-asyncspinner-example/
- Does the AsyncSpinner need to wait until all threads are finished before updating the callback queue and executing another spin? Or can spins continue in "parallel" if threads are available?