Slam Toolbox: Message Filter dropping message for reason 'discarding message because the queue is full'
Hello (I'll start by saying I've already checked whether it's a tf tree problem, and it's not).
I'm currently attempting to perform SLAM and navigation using a Jetson Xavier NX and an Intel RealSense D455. The Jetson's OS is Ubuntu 20.04 (installed via https://qengineering.eu/install-ubunt...) and I'm using ROS 2 Galactic built from source. The Jetson is running in the 20 W, 6-core power mode.
I run a launch file consisting of: the D455 node, robot_localization, depth_image_to_laserscan (I plan to use a 2D lidar, but I'm using the D455 for quick tests), ros2_laser_scan_matcher (from https://github.com/AlexKaravaev/ros2_...), and finally a static transform between base_link and camera_link. Up to this point, everything works exceptionally well. The tf tree looks as one would expect (odom -> base_link at an average rate of 20 Hz, as set in the robot_localization configuration; base_link -> camera_link is a static transform set by me; the rest are the camera's frames). (Sorry, I can't upload images yet.)
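For reference, the static transform at the end of that list can also be published straight from the command line; a minimal sketch using the standard tf2_ros tool (the zero offsets are placeholders, not my real mounting pose):

```shell
# Publish a static transform base_link -> camera_link.
# Argument order in Galactic: x y z yaw pitch roll parent_frame child_frame.
# Replace the zeros with the actual camera mounting offsets.
ros2 run tf2_ros static_transform_publisher 0 0 0 0 0 0 base_link camera_link
```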
In rviz I can also see that everything is working as expected: I set the fixed frame to odom, and as I move the camera the scans keep their position, only the camera's frames move. Great. Now I want to add slam_toolbox and Nav2.
I'm using slam_toolbox's default configuration (online_async), only remapping the robot's base frame to base_link. Here the problems start: the terminal spams messages like these:
[1641398181.499569062] [slam_toolbox]: Message Filter dropping message: frame 'camera_depth_frame' at time 1641398181.448 for reason 'discarding message because the queue is full'
In rviz I set the fixed frame to odom; the scans are shown and updated at a decent frequency, and I can see the map being built right at the start, but then it only updates every now and then, when I move the D455 significantly. If I set the fixed frame to map, I can't see the scans or the frames, except every now and then when the map is updated (I'm talking every 5 seconds or more, and I have to move the camera). The tf tree shows the map -> odom transform, but the average rate is 2590000 with buffer length 0. Meanwhile, every other transform behaves as expected (same as before starting slam_toolbox).
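For context, the slam_toolbox configuration I'm running is essentially the default online_async file with only the base frame changed; a sketch of the relevant fragment (parameter names as in slam_toolbox's online_async params file; the default values shown are my recollection, so treat them as an assumption):

```yaml
slam_toolbox:
  ros__parameters:
    odom_frame: odom
    map_frame: map
    base_frame: base_link   # changed from the default base_footprint
    scan_topic: /scan
```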
At first I didn't mind it (not yet realizing that it wasn't only the map updating slowly; the map -> odom tf was also messed up) and attempted to test Nav2 (again using the default launch file navigation_launch, only changing every use_sim_time to false). The terminal spams these:
[global_costmap.global_costmap]: Timed out waiting for transform from base_link to map to become available, tf error: Lookup would require extrapolation into the past. Requested time 1641398077.397696 but the earliest data is at time 1641398167.204133, when looking up transform from frame [base_link] to frame [map]
Unsurprising. And if I send a goal, the controller/planner crashes.
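To look at the broken map -> odom transform directly, the standard tf2 CLI tools help; a sketch, assuming a sourced ROS 2 Galactic environment (in older distros the first executable was named view_frames.py):

```shell
# Dump the tf tree to frames.pdf, including per-transform
# average rate and buffer length (the numbers quoted above).
ros2 run tf2_tools view_frames

# Watch the map -> odom transform as it is published.
ros2 run tf2_ros tf2_echo map odom
```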
Most solutions I see are regarding ...
That board has "low power" modes (i.e., low performance). What power mode are you using? Link to Nvidia power management
Hi, I'm using the 20 W, 6-core mode, which for some reason isn't listed at that link for the Xavier NX. It should be the best one performance-wise.
While slam is running, does
top
show any processes using 100% CPU?

The laser_scan_matcher is taking up 95-100%; realsense2 is around 40-50%; Xorg around 35%; lxterminal 20%; depthimage_to_laserscan 15%; and slam_toolbox 10%. The rest are all under 10%. So for now I should: switch the scan-matcher algorithm for another alternative (any suggestions?), and probably just use ssh, since Xorg taking up so much is likely due to using a monitor/keyboard/mouse together with the camera, right?
Yes, shutting down Xorg and using ssh is a good idea (although I don't recall Xorg using that much CPU).
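A sketch of one way to boot the Jetson without the graphical session (standard systemd targets, nothing Jetson-specific; this assumes the desktop is started via graphical.target, which is the usual Ubuntu setup):

```shell
# Boot to a text console (no Xorg) from the next reboot onwards;
# revert later with: sudo systemctl set-default graphical.target
sudo systemctl set-default multi-user.target
sudo reboot

# Or, one-off for the current session only:
sudo systemctl isolate multi-user.target
```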
Have you tried
rqt_console
(the GUI app) to identify which node is complaining about data loss?

Alright, I've reduced the FPS in the realsense wrapper configuration file from 30 to 15, which brought the laser_scan_matcher's CPU usage down to around 75%. It is now almost working as expected, both slam_toolbox and Nav2: slam_toolbox now publishes both the map and the transform (default 50 Hz) consistently, and Nav2 accepts goals. However, slam_toolbox still spams the same message (at a reduced rate of about 5 Hz), and Nav2 spams:
[global_costmap.global_costmap_rclcpp_node]: Message Filter dropping message: frame 'camera_depth_frame' at time * for reason 'the timestamp on the message is earlier than all the data in the transform cache'
and after sending a goal: [controller_server-1] [WARN] [controller_server]: Unable to transform robot pose into global plan's frame
[controller_server-1] [ERROR] [tf_help]: Transform data too old when converting from odom to map
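For reference, the FPS reduction mentioned above is just a parameter change in the realsense2_camera wrapper; a sketch of the fragment, assuming the parameter names of the Galactic-era ROS 2 wrapper (newer wrapper versions switched to profile strings instead, so treat the names as an assumption):

```yaml
camera:
  ros__parameters:
    depth_fps: 15.0   # was 30.0; fewer frames means less work for the scan matcher
    infra_fps: 15.0
    color_fps: 15.0
```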
I'll keep trying to reduce unnecessary CPU usage, since I plan to use a LIDAR like Velodyne's, which might demand even more from the ICP node. I'll also play around with its parameters (laser_scan_matcher's), as well as with the slam_toolbox and Nav2 parameters regarding buffers and timeouts.
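The buffer/timeout parameters I mean are mostly the transform tolerances and tf buffer settings; a sketch of where they live (standard slam_toolbox and Nav2 costmap parameters; the values are placeholders to experiment with, not recommendations):

```yaml
# slam_toolbox: how long to wait for tf, and how much tf history to keep
slam_toolbox:
  ros__parameters:
    transform_timeout: 0.5     # default is around 0.2 s
    tf_buffer_duration: 30.0

# Nav2 costmaps: how stale a transform may be before it is rejected
# (the same key exists under local_costmap)
global_costmap:
  global_costmap:
    ros__parameters:
      transform_tolerance: 0.5   # default is around 0.3 s
```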
By the way, which rqt_console plugin were you mentioning that would tell me about the data loss? The logging console, right?
Thanks for the help.
Regarding rqt_console, I forgot you are using ROS 2; I don't know if the tool is the same as in ROS 1.
Regarding the message timestamp complaint, I don't have any suggestions. That message seems very strange if all your nodes are running on the same host.
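For the record, rqt_console does exist in ROS 2 as well; a sketch of how to start it, assuming the rqt packages are installed (it shows a filterable live view of every node's log output, so the warnings above can be traced to their source node):

```shell
ros2 run rqt_console rqt_console
```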