Smearing / Ghosting of laser scan in move_base costmap
Hello ROS Community,
I'm seeing a problem where laser scans are exhibiting rotational "smearing" when marking the move_base costmaps. This results in "ghost" obstacles in both the global and local maps, and these have the potential to cause path planning to fail. Typically, these smearing effects are cleared out when subsequent scans detect free space, and so the issue does not always cause navigation to fail. However, it does sometimes cause sub-optimal behavior, such as a need to re-plan.
The robot spec:
- Kobuki base
- Raspberry Pi 2, running ROS Indigo / Ubuntu ARM
- Rhoeby Dynamics R2D LiDAR (@ 3 Hz scan rate)
- Navigation stack (gmapping, move_base)
In the below screenshot: observe the accumulated LaserScans (in white), the local costmap marked areas (in yellow), and the inflated obstacles (in blue). Notice the ghosting of the walls in the top-left part of the costmap (in yellow).
If I set "Decay Time" on the LaserScan topic in RViz to accumulate the scans, and then observe the LaserScan plot over the same time period as when the costmap smearing occurs: I see that there is no such smearing in the decayed LaserScan plot. So the errors in rotation only show up in the costmap, and not on the LaserScans themselves. How/why would that be happening? How is it possible for this difference to exist?
Another observation about this phenomenon is that the problem only occurs when the rotational velocity of the robot is a non-constant, eg. when beginning turning from stopped, or when slowing down from a turn which is in progress. So the problem is a transient one that occurs during theta acceleration. Also of note is the fact that there are no similar smearing effects in translation (robot accelerating along x/y).
The problem is:
- unique to theta
- only occurs during theta acceleration (or deceleration)
- does not seem to affect LaserScans themselves (as plotted in RViz)
- only appears on the move_base costmaps (global and local costmap)
The fact that theta acceleration must be non-zero, leads me to think the problem could be due to delays in the odom TF, perhaps with a root cause in the Kobuki. Is the Kobuki just slow to reflect changes in it's rotation? Perhaps the odom transform lags the actual robot rotation?
But why would this not also affect the "decayed" LaserScan data as plotted in RViz?
The fact that the LaserScan plot is not affected, leads me to think the error could be rooted in move_base, although I doubt it. Perhaps there is some delay in the uptake of changes in the TF? But why only on rotation?
Thanks for any comment.
ADDENDUM:
I've checked the TF frames, and it seems to look reasonable:
The frames utilized are:
The topic /laserdata is published in tf frame: baselaser (rigidly attached to base_footprint)
The topic /movebase/localcostmap/costmap is published in tf frame: odom
The Global frame is: map
local_costmap:
globalframe: odom robotbaseframe: /basefootprint
global_costmap:
globalframe: /map
robotbaseframe: /basefootprint
ADDENDUM 2:
I made the following change to obstacle_layer.cpp, as per the recommendation in answer 1 (tried 0.75 also):
// debug smearing: changes based on analysis by A. Hendrix
// see http://answers.ros.org/question/231560/smearing-ghosting-of-laser-scan-in-move_base-costmap/
#if 1
observation_notifiers_.back()->setTolerance(ros::Duration(0.34));
ROS_WARN("@@@ Setting observation_notifiers_ tolerance to 0.34");
#else
observation_notifiers_.back()->setTolerance(ros::Duration(0.05));
#endif
The trace:
process[move_base-7]: started with pid [13502]
[ INFO] [1460470261.939842115]: Reconfigure request : 0.800000 5.400000 1.000000 2.000000 1.500000
[ INFO] [1460470262.992933885]: Initialising nodelet... [kobuki_safety_controller]
[ INFO] [1460470263.414586645]: Kobuki initialised. Spinning up update thread ... [kobuki_safety_controller]
[ INFO] [1460470263.415913468]: Nodelet initialised. [kobuki_safety_controller]
[ INFO] [1460470265.514135603]: Using plugin "static_layer"
[ INFO] [1460470266.017338311]: Requesting the map...
[ INFO] [1460470266.258624717]: Resizing costmap to 544 X 576 at 0.050000 m/pix
[ INFO] [1460470266.356887790]: Received a 544 X 576 map at 0.050000 m/pix
[ INFO] [1460470266.402144769]: Using plugin "obstacle_layer"
[ INFO] [1460470266.446317321]: Subscribed to Topics: scan bump
[ WARN] [1460470266.671768155]: @@@ Setting observation_notifiers_ tolerance to 0.34
[ INFO] [1460470267.148540550]: Using plugin "inflation_layer"
[ INFO] [1460470268.311201644]: Using plugin "obstacle_layer"
[ INFO] [1460470268.733313727]: Subscribed to Topics: scan bump
[ WARN] [1460470268.898442998]: @@@ Setting observation_notifiers_ tolerance to 0.34
[ INFO] [1460470269.329662164]: Using plugin "inflation_layer"
[ INFO] [1460470270.321315914]: Created local_planner dwa_local_planner/DWAPlannerROS
[ INFO] [1460470270.357102684]: Sim period is set to 0.20
[ INFO] [1460470272.187904767]: Recovery behavior will clear layer obstacles
[ INFO] [1460470272.659158517]: Recovery behavior will clear layer obstacles
[ INFO] [1460470273.067576694]: odom received!
But the problem remains the same.
Asked by jordan on 2016-04-10 12:31:39 UTC
Answers
It looks like the costmap is smearing the laser scan because it doesn't quite correctly account for the amount of time that it takes to actually rotate the scanner and capture the scan. This probably isn't noticeable on faster scanners or slower robots, because the distortion is lower.
rviz is correct, and waits for transforms for the entire duration of the scan to arrive before attempting to transform it.
costmap2d waits for a fixed window of transforms to arrive before attempting to transform the scan. If your scan took longer to acquire than the fixed time that the costmap has hardcoded, then it will attempt to transform before all of the transforms have arrived, and will probably smear the scan.
If you're comfortable building navigation from source, you can manually increase the tolerance here: https://github.com/ros-planning/navigation/blob/ada542feae238cc85eeab9944cc39e37eee93481/costmap_2d/plugins/obstacle_layer.cpp#L179 to something at or above your scan time, rebuild navigation, and see if that helps. (For a 3Hz sensor, a tolerance of 0.34 should be about right).
If that fixes the issue, open a ticket against navigation's bug tracker.
For reference, the section of rviz that computes the correct tolerance is https://github.com/ros-visualization/rviz/blob/1804cbf63241dcf174b1b76449a302e4321c50d8/src/rviz/default_plugin/laser_scan_display.cpp#L87-L93
Asked by ahendrix on 2016-04-12 01:11:17 UTC
Comments
Thanks very much for the suggestion. I changed the code, but the problem remains the same. Yes, having a faster scan definitely reduces the degree to which this is a problem. And slowing the robot down also mitigates it. It's so curious that the RViz accumulated plot seems fine.
Asked by jordan on 2016-04-12 10:19:55 UTC
Are you certain that you've rebuilt and are actually running your modified version of navigation, instead of the system-installed version? Did you remember to build your workspace and source the workspace once it was built?
Asked by ahendrix on 2016-04-12 13:26:33 UTC
I like to add some kind of obvious print/ROS_WARN when I modify a package that's also installed on the system, and then look for that at run-time to make sure that it's actually running my modified version.
Asked by ahendrix on 2016-04-12 13:38:24 UTC
Yes, indeed, I usually put a print to confirm execution, just as you say. Please see ADDENDUM 2. If the problem is TF delay, how would that not affect translation too? Why is only theta acceleration implicated in this, and not constant angular velocity too?
Asked by jordan on 2016-04-12 15:25:08 UTC
Comments
Have you tried mapping a priori using
gmapping
and then pairingmove_base
withamcl
on the (now) known map? I'm asking because your costmap configuration might be assuming a static map, not one generated online using SLAM. Conversely, check your costmap configuration.Asked by spmaniato on 2016-04-10 12:43:14 UTC
Yes, I've done that, ie. taking a previously built map (and then using amcl), but it shows exactly the same behavior. Note: mapping and navigation are essentially work fine, modulo this transient rotational issue.
Asked by jordan on 2016-04-10 13:06:24 UTC
I see, OK. Then yeah, the next logical suspects are the Odometry messages and the odom TF. Have you checked/ran
tf view_frames
,tf tf_monitor
,roswtf
,rostopic echo odom
, etc.? Anything suspicious?Asked by spmaniato on 2016-04-10 14:00:49 UTC
I've checked the TF frames, and it seems to look reasonable (see ADDENDUM:). Appreciate your input.
Asked by jordan on 2016-04-10 16:05:06 UTC
The plot thickens. What are the values of your
global_costmap/robot_base_frame
,global_costmap/global_frame
,local_costmap/robot_base_frame
, andlocal_costmap/global_frame
parameters? (You've already mentioned a few of these in your addendum; just getting the full picture.)Asked by spmaniato on 2016-04-10 20:59:46 UTC
local_costmap: global_frame: odom robot_base_frame: /base_footprint
global_costmap: global_frame: /map robot_base_frame: /base_footprint
Asked by jordan on 2016-04-11 10:04:51 UTC