As part of the ROS Real-time Working Group (RTWG), we plan to create several guides to help ROS 2 users develop real-time applications. Please take a look at the RTWG documentation and our current roadmap: https://ros-realtime.github.io/. As you can see, some topics are planned but still WIP, with no information available yet. Hopefully, we will be able to complete all these tasks in the next few months.
In the meantime, the remainder of this answer collects some guidelines you can follow to improve your application's real-time performance.
With respect to the Linux kernel, if your application really needs real-time performance, we suggest that you use the full Preempt-RT kernel. We created a guide that explains how to build and configure a Raspberry Pi 4 with Preempt-RT using a Docker-based tool, linux-real-time-kernel-builder. While it only covers the RPi 4, the guide should help you understand the steps to build the kernel and the settings we are using. Additionally, you can take a look at this presentation, which gives a good overview of how to configure the kernel:
Depending on the computing device you are using, you may have to configure additional settings (e.g., disabling hyper-threading, disabling NMIs, etc.).
Finally, you should use cyclictest (https://wiki.linuxfoundation.org/real...) to check the real-time performance you are getting with your system and your configuration. This is important because it lets you catch issues with your configuration and your system, and it gives you a baseline of the latency you can expect.
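To compare baselines across configurations, you could pull the worst-case latency out of the cyclictest output with a small script. This is only a sketch: it assumes the per-thread summary lines look like the sample below (latencies in microseconds), and the sample text and the helper names parse_cyclictest and worst_case_latency are made up for illustration.

```python
import re

# Assumed shape of a cyclictest per-thread summary line (values in microseconds):
# T: 0 ( 1234) P:80 I:1000 C:  10000 Min:      2 Act:    5 Avg:    4 Max:      25
LINE_RE = re.compile(
    r"T:\s*(?P<thread>\d+).*?Min:\s*(?P<min>\d+).*?Avg:\s*(?P<avg>\d+).*?Max:\s*(?P<max>\d+)"
)


def parse_cyclictest(output):
    """Return one dict (thread, min, avg, max) per summary line found."""
    results = []
    for line in output.splitlines():
        match = LINE_RE.search(line)
        if match:
            results.append({key: int(value) for key, value in match.groupdict().items()})
    return results


def worst_case_latency(output):
    """Return the largest Max value across all threads, or None if nothing parsed."""
    rows = parse_cyclictest(output)
    return max(row["max"] for row in rows) if rows else None


# Made-up sample output, for illustration only.
SAMPLE = """\
T: 0 ( 1234) P:80 I:1000 C:  10000 Min:      2 Act:    5 Avg:    4 Max:      25
T: 1 ( 1235) P:80 I:1500 C:   6667 Min:      3 Act:    6 Avg:    5 Max:      41
"""

if __name__ == "__main__":
    print("worst-case latency (us):", worst_case_latency(SAMPLE))
```

The Max column is the number to watch, since a single large outlier is what breaks a deadline even when the average looks fine.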
Using CPU isolation or CPU affinity could help to improve the performance of your application. Note that this approach reduces the overall CPU bandwidth available to your system, because your non-real-time applications won't be scheduled on the isolated CPUs. It makes sense if you have a clear separation between real-time and non-real-time applications in your system, and enough CPUs for all of them.
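As a rough illustration of the affinity part, here is a minimal Linux-only sketch that pins the caller to a single CPU using Python's os.sched_setaffinity. The helper name pin_to_cpu and the choice of CPU are hypothetical; isolating the CPU itself (for example with the isolcpus kernel parameter) is a separate boot-time setting that this snippet does not handle.

```python
import os


def pin_to_cpu(cpu):
    """Pin the caller to a single CPU and return the previous affinity set.

    pid 0 means "the caller"; on Linux this applies to the calling thread.
    An OSError is raised if the CPU does not exist or is not permitted.
    """
    previous = os.sched_getaffinity(0)
    os.sched_setaffinity(0, {cpu})
    return previous


if __name__ == "__main__":
    allowed = sorted(os.sched_getaffinity(0))
    target = allowed[-1]  # hypothetical choice: the highest CPU we may run on
    previous = pin_to_cpu(target)
    print("pinned to", sorted(os.sched_getaffinity(0)), "was", sorted(previous))
    os.sched_setaffinity(0, previous)  # restore the original affinity
```

In a C++ node, the equivalent would typically be done per thread, so that only the real-time threads land on the reserved CPUs.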
With respect to assigning RT priorities, one thing you could do is to separate node tasks into different callback groups depending on their priority, and run them in different threads with different priorities. Here is an example: https://github.com/ros2/examples/tree.... Additionally, some DDS implementations allow you to fine-tune their internal thread priorities and CPU affinities. For example, with CycloneDDS it is possible to set the stack size, scheduling class, and scheduling priority for each thread (https://github.com/eclipse-cyclonedds...). Note that if your real-time application depends on network communications, you will have to tune the kernel network-related threads according to the priorities you're setting in your application (see https://arxiv.org/pdf/1808.10821.pdf).
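The underlying mechanism for giving a thread an RT priority is the scheduling policy. The sketch below is not the callback-group example linked above; it only shows, under Linux and with a hypothetical helper name try_set_fifo and a hypothetical priority value, how the caller can request SCHED_FIFO and detect that it lacks the privilege to do so.

```python
import os


def try_set_fifo(priority):
    """Try to switch the caller to SCHED_FIFO at the given priority.

    Returns True on success and False when the OS refuses the request
    (typically because the process lacks real-time privileges).
    """
    lowest = os.sched_get_priority_min(os.SCHED_FIFO)
    highest = os.sched_get_priority_max(os.SCHED_FIFO)
    if not lowest <= priority <= highest:
        raise ValueError(f"priority must be in [{lowest}, {highest}]")
    try:
        # pid 0 means "the caller"; on Linux this applies to the calling thread.
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
    except PermissionError:
        return False
    return True


if __name__ == "__main__":
    wanted = 80  # hypothetical value; choose it relative to your other RT threads
    if try_set_fifo(wanted):
        print("running with SCHED_FIFO priority", wanted)
    else:
        print("not allowed to use SCHED_FIFO; check rtprio limits or run as root")
```

Checking the return value matters: if the request is refused and nobody notices, the application silently keeps running under the default scheduler.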
Here are some additional points related to memory management, blocking calls, etc.:
- Lock the process memory to avoid memory page faults
- Allocate the memory before the ...
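For the memory-locking point, a minimal sketch of the idea is to call mlockall once at startup, before entering the real-time loop. The version below goes through ctypes and assumes Linux; the MCL_* values are the ones used on x86 and ARM and may differ on other architectures, and the helper name lock_process_memory is hypothetical.

```python
import ctypes
import os

# Assumption: values from <sys/mman.h> on Linux x86 and ARM.
MCL_CURRENT = 1  # lock pages that are currently mapped
MCL_FUTURE = 2   # also lock pages mapped later


def lock_process_memory():
    """Call mlockall(MCL_CURRENT | MCL_FUTURE) and return (ok, message)."""
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.mlockall(MCL_CURRENT | MCL_FUTURE) != 0:
        # Typically EPERM or ENOMEM when the memlock limit is too low.
        return False, os.strerror(ctypes.get_errno())
    return True, "memory locked"


if __name__ == "__main__":
    ok, message = lock_process_memory()
    print("mlockall:", message)
```

A C or C++ application would call mlockall directly from sys/mman.h; in both cases a failure usually points to the memlock resource limit of the user running the process.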