
ROS dropping messages - How to predict it

asked 2021-10-24 14:08:25 -0600 by camilo.chiang

updated 2021-10-26 09:09:41 -0600 by lucasw

Good day.

Situation: On a Raspberry Pi 4 (8 GB RAM, Ubuntu 18.04 server), I'm recording a rosbag with three different topics: an IMU at 200 Hz with messages of 0.35 kB; a lidar at 10 Hz with messages of 190 kB; and video (compressed images at 30 Hz, at different resolutions). All of them use C++/ROS packages.

When recording the bag with 640x480 video, there is no problem during replay, as the frequency is correct for all topics. BUT when recording at 1920x1080 (400 kB per message), set at 30 Hz, the recorded bag actually contains the video at only 11 Hz, and the IMU topic is affected too, its frequency dropping to only 160 Hz.

I'm using a SanDisk Extreme microSD card (up to 70 MB/s), and I need to record all messages in the right order, so increasing the buffer size will not help.

Questions:

  • ROS related: Is there a way in ROS to identify or predict the limiting factor (e.g. RAM usage)?
  • Additional question: Any good solutions for this? Will increasing RAM solve this problem?

Thanks in advance!


1 Answer


answered 2021-10-25 08:02:45 -0600

updated 2021-10-25 10:08:13 -0600

Just to be clear: You are recording the image_raw/compressed topic, and the compression type is set to JPEG, correct? I believe the CPU power required for JPEG compression on the Raspberry Pi is the bottleneck, so it has nothing to do directly with ROS, writing to the SD card, rosbag, or the amount of RAM. You could check this by running rostopic hz on the compressed image topic on the live system. Running top or htop could also help: it should show you whether the CPU is maxed out.

Unfortunately, saving the uncompressed images is probably also not an option, since the data rate is in the order of 1920×1080×3×30/(1024×1024) = 178 MB/s. If you are going to record very short rosbags only, you can increase the amount of RAM buffer that rosbag uses with the command line option rosbag record --buffsize=SIZE, where SIZE is the buffer size in MB (default: 256 MB), but at that data rate, it will only last a minute or so.
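The back-of-the-envelope data rate above is easy to verify with a few lines of plain arithmetic (no ROS needed); a minimal sketch:

```python
# Raw (uncompressed) data rate of a bgr8 image stream, as computed in the text:
# width * height * bytes-per-pixel * frames-per-second, converted to MiB/s.
width, height, bytes_per_pixel, fps = 1920, 1080, 3, 30

rate_mib_s = width * height * bytes_per_pixel * fps / (1024 * 1024)
print(f"{rate_mib_s:.0f} MiB/s")  # ~178 MiB/s, far beyond a ~70 MB/s SD card
```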

I guess recording a 30 Hz image stream in 1080p is just not going to work on a Raspi4.

Edit in response to comment:

Online compression to JPEG for an even bigger image stream (3840x2160@30Hz) will be a challenge on most "embedded" CPUs. You can benchmark your own system by installing the libjpeg-turbo-test package (sudo apt install libjpeg-turbo-test) and running tjbench. On my laptop CPU (Intel(R) Core(TM) i9-10885H CPU @ 2.40GHz), I get a compression throughput of 233 Megapixels/sec and a decompression throughput of 272 Megapixels/sec. The DJI Manifold 2-C has an Intel Core i7-8550U, which gets a decompression throughput of 176 Megapixels/sec according to this page, so the compression throughput should be about 150 Megapixels/sec (around 3 times faster than the Raspberry Pi 4).

You would need:

  • for 640×480@30Hz: 9.2 Megapixels/sec
  • for 1920×1080@30Hz: 62.2 Megapixels/sec
  • for 3840×2160@30Hz: 248.8 Megapixels/sec
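These required throughputs are just pixels per second; a quick sketch to reproduce them for any resolution and frame rate:

```python
def required_mpix_per_sec(width, height, fps):
    """JPEG compression throughput needed to keep up with a stream, in Megapixels/sec."""
    return width * height * fps / 1e6

for w, h in [(640, 480), (1920, 1080), (3840, 2160)]:
    print(f"{w}x{h}@30Hz: {required_mpix_per_sec(w, h, 30):.1f} Megapixels/sec")
```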

Note that the values from tjbench are for the libjpeg-turbo library, but OpenCV (which is used by the ROS image_transport_plugin for compression) uses the slower libjpeg library, which can be up to 2-6x slower. It is possible to build OpenCV with libjpeg-turbo and then rebuild all ROS packages that depend on OpenCV. I have done it, but it's a huge hassle and I wouldn't recommend it if it can be avoided.

The Raspberry Pi's ARMv8 Cortex-A72 has a decompression throughput of 61 Megapixels/sec according to this page, so let's guess that the compression throughput is around 52 Megapixels/sec. What you have actually observed is 1920×1080@11Hz = 22.8 Megapixels/sec. This could be explained by the libjpeg vs. libjpeg-turbo issue discussed above, so allow for a large margin of error when speccing your CPU. In other words, I guess that the DJI Manifold 2-C realistically has a compression throughput closer to 60 Megapixels/sec, so it can just barely handle 1920×1080@30Hz and will definitely ...
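The derating applied in the last paragraph can be made explicit. Taking the observed 11 Hz as ground truth, the effective throughput on the Pi is well below the tjbench estimate; a rough sketch (the 52 Megapixels/sec figure is the guess from above, not a measured value):

```python
# Estimated libjpeg-turbo compression throughput of the Pi 4 (guess from above).
tjbench_estimate = 52.0

# Throughput actually achieved: 1080p recorded at only ~11 Hz.
observed = 1920 * 1080 * 11 / 1e6        # ~22.8 Megapixels/sec
derating = observed / tjbench_estimate   # ~0.44, plausibly the libjpeg penalty

# Applying the same factor to the Manifold 2-C's estimated 150 Megapixels/sec
# lands near the "closer to 60 Megapixels/sec" figure above.
print(f"{150 * derating:.0f} Megapixels/sec")
```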



Hei Martin! 1. Thanks for your answer. You were right: after rechecking the live system, the CPU usage is above 100% at 30 and 15 Hz. I can still use it at 6 Hz (25% CPU usage), but that is not ideal. 2. I will then assume there is no standard way for ROS to report a lack of CPU power to reach the required frequency. My option is to change the code and check loop_rate.sleep(); if the remaining time is lower than 0, I will add a warning. 3. Non-ROS related: Do you have any suggestions for how I could work out which CPU would be able to do this? I'm planning to use higher resolutions in the future (3840x2160 at 30 Hz; I have been thinking about the DJI Manifold 2-C), but I'm afraid the new CPU might not manage it.
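The overrun check described in point 2 can be sketched without ROS. The idea matches roscpp's Rate: if the remaining time before the next deadline is negative, the loop cannot keep up. A plain-Python illustration (in a real node the counter would be a ROS warning and work() would be the camera callback):

```python
import time

def run_at_rate(hz, work, n_iters):
    """Run work() at a target rate; count deadlines missed because work() was too slow."""
    period = 1.0 / hz
    next_deadline = time.monotonic() + period
    missed = 0
    for _ in range(n_iters):
        work()
        remaining = next_deadline - time.monotonic()
        if remaining < 0:
            missed += 1          # a real node would emit a warning here
        else:
            time.sleep(remaining)
        next_deadline += period
    return missed

# A 20 ms work function cannot keep up with a 100 Hz (10 ms) loop:
print(run_at_rate(100, lambda: time.sleep(0.02), 10))  # all 10 deadlines missed
```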

camilo.chiang (2021-10-25 08:54:41 -0600)

See my edit for my response!

Martin Günther (2021-10-25 10:01:42 -0600)

This needs more upvotes.

Instead of vague claims based on intuition and previous experience, you actually used a representative benchmark and described some objective measurements.

This question (+1 @camilo.chiang for asking) and your answer Martin are a really nice example of how something may seem like it's bottlenecked "on ROS" or "message passing", but where it turns out even the basic IO of (de)compression is just too much for the system used.

I guess that the DJI Manifold 2-C realistically has a compression throughput closer to 60 Megapixels/sec, so it can just barely handle 1920×1080@30Hz [..]

Even stronger: it can barely handle a single stream. The OP describes several more, with different resolutions.

Edit: perhaps switching to something like swri-robotics/imagezero_transport (or another alternative compression) could help. But it will all depend on the performance of the system used.

gvdhoorn (2021-10-25 10:05:48 -0600)

And if the system has hw support for jpeg/png/x264 compression, there are image_transport plugins which can use those (for JPEG you could even just package the already compressed payload).

gvdhoorn (2021-10-25 10:11:12 -0600)

Yes, using HW support actually seems like the best way to do it. Many cameras have the option of outputting an mjpeg stream. E.g., this node reads the mjpeg stream from a network camera and directly publishes it as a compressed image. Many USB cameras can also output an mjpeg stream. If you can get an mjpeg stream from your camera and directly copy the bitstream into a CompressedImage message, that would eliminate any need for JPEG compression on the CPU and should easily work with huge images.

Martin Günther (2021-10-27 02:16:47 -0600)

In #q379644 we discussed some x264 image_transport plugins (hardware accelerated via FFmpeg's NVENC support, for instance).

May also be an option.

gvdhoorn (2021-10-27 02:42:04 -0600)

Thanks guys :) I learned so much from this! I really appreciate it.

camilo.chiang (2021-10-27 03:00:45 -0600)
