Robot drifts away from goal and laserscan skews on random interval
I have to apologize for the vague and/or convoluted title. We don't know where the issue is ourselves.
The robot I'm working on makes use of ROS Kinetic, the standard navigation stack with amcl
, teb_local_planner
, laser_scan_matcher
and move_base
. It is a front-wheeled differential drive with two caster wheels on the back with a rectangular body/footprint, built on the chassis of an electric wheelchair.
The issue we're facing is happening on a random interval: Whenever we set a goal in rviz, the robot does move towards his goal. After a random amount of time or after a certain distance has been traveled, two errors could occur:
The robot loses track of his position as the local costmap would become skewed, we see this by lines drawn from the LIDAR sensor, indicating cabinets and walls, becoming skewed and/or rotating from their location. This behaviour causes it to continue on "a path" until it does encounter an obstacle.
Each of these erratic behaviours is visible within rviz and the console output can mention the errors Map Update/Control loop missed its desired rate <...>
which we think might not be related to CPU load, as it is averaging at about 50% including when generating new paths. never has it gone above 90%.
Additionally, the robot would keep on its last set velocity (cmd_vel
output) and rviz would no longer be able to set and display a new goal, rendering the robot unresponsive. We're able to clear cmd_vel
by using manual control (teleop_twist_joy
) and stopping the robot that way. Upon hitting the reset
button on the bottom-right in rviz, we progressively lose the local and global costmap per press, accompanied by the warning "No map provided". Rebooting the ROS environment will restore everything until the "drift" occurs again.
We can consistently recreate this behaviour by having the robot drive various distances, believing it appears more often when traversing small distances of approximately 2-3 meters, on longer distances i.e 5-20 meters the errors occure less often, but the laserscan becoming askew is still prevalent. The LIDAR/local costmap becoming askew is easily reproducible by rotating the robot while stationary, when turning both left and right the laser scan and local costmap would rotate in short intervals and progressively move to the right a certain distance within rviz until no longer, where it is still skewed massively.
We've managed to make a quick recording of what happens when the robot is spinning around its axis once, alas, the video is rather shaky. https://www.dropbox.com/s/iywk2r9c6db0wda/VID_20190326_151328.mp4?dl=0 (The link will be removed if it's not allowed to post here.)
Below are the common configuration files requested:
navigation.launch
<node if="$(eval sim == 'true')" pkg="rosbag" type="play" name="play" args="$(arg path)/bags/t5_lidar.bag --delay=5 --clock"/> <node pkg="tf" type="static_transform_publisher" name="base_link_to_laser" args="0.0 0.0 0.0 0.0 0.0 0.0 /base_link /laser 40" /> <node pkg="tf" type="static_transform_publisher" name="base_link_to_sonar_front" args="0.0 0 0 0.0 0.0 0.0 /base_link /front 40"/> <node pkg="tf" type="static_transform_publisher" name="base_link_to_sonar_right" args="0.0 0 0 -1.57 0.0 0.0 /base_link /backRight 40"/> <node pkg="tf" type="static_transform_publisher" name="base_link_to_sonar_left" args="0.0 0 0 1.57 0.0 0.0 /base_link /backLeft 40"/> <node pkg="tf" type="static_transform_publisher" name="world_to_map" args="0.0 0 0 0.0 0.0 0.0 /world /map 40"/> <node pkg="tf" type="static_transform_publisher" name="world_to_map_nav" args="0.0 0 0 0.0 0.0 0.0 /world /map_nav 40"/> <!-- <node pkg="laser_filters" type="scan_to_scan_filter_chain" name="laser_filter"> <rosparam command="load" file="$(arg path)/config/laser_filter_config.yaml" /> </node> --> <node pkg="rviz" type="rviz" name="rviz" args="-d $(arg path)/navigation.rviz"/> <node pkg="laser_scan_matcher" type="laser_scan_matcher_node" name="laser_scan_matcher_node" output="screen"> <param name="base_frame" value = "/base_link"/> <param name="fixed_frame" value = "/odom"/> <param name="vel" value="/cmd_vel" /> <param name="use_cloud_input" value="false"/> <param name="publish_tf" value="true"/> <param name="publish_odom" value="true"/> <param name="use_odom" value="false"/> <param name="use_imu" value="false"/> <param name="use_vel" value="true"/> <param name="use_alpha_beta" value="true"/> <param name="max_iterations" value="10"/> <remap from="scan" to="scan_filtered" /> </node> <node pkg="map_server" type="map_server" name="map_server" args="$(arg path)/maps/T5_full_small.yaml" output="screen"/> <node pkg="map_server" type="map_server" name="map_server_nav" args="$(arg path)/maps/T5_full_small_move_base.yaml" output="screen"> <remap from="map" to="map_nav"/>
<param name="odom_alpha3" value="0.8"/> <param name="odom_alpha4" value="0.2"/> <param name="laser_z_hit" value="0.5"/> <param name="laser_z_short" value="0.05"/> <param name="laser_z_max" value="0.05"/> <param name="laser_z_rand" value="0.5"/> <param name="laser_sigma_hit" value="0.2"/> <param name="laser_lambda_short" value="0.1"/> <param name="laser_lambda_short" value="0.1"/> <param name="laser_model_type" value="likelihood_field"/> <param name="laser_likelihood_max_dist" value="2.0"/> <param name="update_min_d" value="0.2"/> <param name="update_min_a" value="0.5"/> <param name="odom_frame_id" value="odom"/> <param name="resample_interval" value="1"/> <param name="transform_tolerance" value="0.1"/> <param name="recovery_alpha_slow" value="0.0"/> <param name="recovery_alpha_fast" value="0.0"/> <remap from="scan" to="scan_filtered" />
baselocalplanner_params.yaml
controllerfrequency: 2.0 recoverybehaviorenabled: false clearingrotationallowed: false recoverybehaviors: [{name: conservativereset, type: clearcostmaprecovery/ClearCostmapRecovery}, {name: aggressivereset, type: clearcostmaprecovery/ClearCostmapRecovery}]
TebLocalPlannerROS:
odomtopic: odom mapframe: /odom
# Trajectory
tebautosize: True dtref: 0.3 dthysteresis: 0.1 globalplanoverwriteorientation: True maxglobalplanlookaheaddist: 2.0 feasibilitycheckno_poses: 4
# Robot
maxvelx: 0.7 minvelx: 0.3 maxvelxbackwards: 0.5 maxveltheta: 0.7 acclimx: 0.7 acclimtheta: 0.7 minturningradius: 0.4 footprintmodel: # types: "point", "circular", "twocircles", "line", "polygon" type: "line" linestart: [0.0, 0.0] line_end: [-0.6, 0.0] radius: 0.2
# GoalTolerance
xygoaltolerance: 0.2 yawgoaltolerance: 0.1 freegoalvel: False
# Obstacles
minobstacledist: 0.3 includecostmapobstacles: True costmapobstaclesbehindrobotdist: 1.0 obstacleposesaffected: 30 costmapconverterplugin: "costmapconverter::CostmapToPolygonsDBSMCCH" costmapconverterspinthread: True costmapconverterrate: 5
# Optimization
noinneriterations: 4 noouteriterations: 3 optimizationactivate: True optimizationverbose: False penaltyepsilon: 0.1 weightmaxvelx: 2 weightmaxveltheta: 1 weightacclimx: 0.5 weightacclimtheta: 6.0 weightkinematicsnh: 1000 weightkinematicsforwarddrive: 1 weightkinematicsturningradius: 1 weightoptimaltime: 1 weight_obstacle: 50
# Homotopy Class Planner
enablehomotopyclassplanning: True enablemultithreading: True simpleexploration: True maxnumberclasses: 2 roadmapgraphnosamples: 15 roadmapgraphareawidth: 5 hsignatureprescaler: 0.5 hsignaturethreshold: 0.1 obstaclekeypointoffset: 0.1 obstacleheadingthreshold: 0.45 visualizehc_graph: False
costmapcommonparams.yaml
plugins: - {name: static_layer, type: 'costmap_2d::StaticLayer'} - {name: obstacle_layer, type: 'costmap_2d::ObstacleLayer'} - {name: inflation_layer, type: 'costmap_2d::InflationLayer'} - {name: sonar, type: "range_sensor_layer::RangeSensorLayer"} footprint: [[0.25,-0.31],[0.25,0.31],[-0.69,0.31],[-0.69,-0.31]] footprint_padding: 0.03 inflation_layer: inflation_radius: 0.4 cost_scaling_factor: 2.58 sonar: topics: ['/sonar_data'] obstacle_layer: combination_method: 1 enabled: true footprint_clearing_enabled: true max_obstacle_height: 0.6 observation_sources: scan obstacle_range: 2.0 raytrace_range: 2.0 scan: {clearing: true, data_type: LaserScan, expected_update_rate: 0, marking: true, topic: /scan_filtered}
globalcostmapparams.yaml
global_costmap: global_frame: map robot_base_frame: /base_link update_frequency: 5.0 publish_frequency: 1.0 static_map: true rolling_window: false resolution: 0.05 transform_tolerance: 1.0 map_type: costmap
localcostmapparams.yaml
local_costmap: global_frame: odom robot_base_frame: base_link update_frequency: 5.0 publish_frequency: 1.0 static_map: true rolling_window: true width: 5.0 height: 5.0 resolution: 0.05 transform_tolerance: 1.0 map_type: costmap inflation_layer: inflation_radius: 0.2 cost_scaling_factor: 10
Asked by halberd on 2019-03-22 07:36:00 UTC
Comments
that sounds like a resource issue.
What sort of hw are you running this on (ie: CPU, memory, platform, etc)?
Asked by gvdhoorn on 2019-03-22 07:41:01 UTC
The hardware consists of a laptop with an i5-3210M CPU (2.50GHz x4) and 16GB RAM running Ubuntu 16.04. CPU usage (with ROS) averages at 40-50% idle and 70-80% when given a goal.
Asked by halberd on 2019-03-22 07:59:00 UTC
80% is quite close to maximum. Could be a red herring, but it'd be interesting to see what the utilisation level is when your problem manifests itself.
How did you install ROS? All using
apt
?Also btw: why so many
static_transform_publisher
s? Can you not create a URDF?Asked by gvdhoorn on 2019-03-22 08:03:28 UTC
We're the 5th group assigned to this project, therefore we don't know the full details of what has happened where (the documentation was rather incomplete). We can safely assume the packages were installed via
apt
and our group installed the planner that way. In terms of CPU consumption, one or two of the cores does spike up to 65% or higher when recalculating paths, though we don't see any unusual spikes when the problem manifests. We did capture one such instance of the issue on video. About the URDF model, we can look into that at a later stage.Asked by halberd on 2019-03-22 08:29:23 UTC
I'm not a navigation expert, so I don't believe I can help you. That's also why I only posted comments (so as to not discourage other board members from posting).
What I could suggest is to see whether this is a resource issue by monitoring CPU and memory usage over time. It could help to see whether a failing control loop (due to lack of resources) is the cause.
Asked by gvdhoorn on 2019-03-25 06:55:23 UTC
Also:
Are
Twist
s still being published bymove_base
? If not: you should really add a time-out to your mobile base driver on incomingTwist
s, so the robot stops automatically if none are received withing a certain amount of time. That would avoid runaway robots like this.Asked by gvdhoorn on 2019-03-25 06:56:57 UTC
In a recent test involving manual and autonomous driving, we're noticing that upon any kind of turn, the local costmap and laser scan skew and warp to the right within rviz. Currently, we made the robot spin in circles to demonstrate this behaviour, and the local costmap appropriately jumps on each revolution. To this point, we haven't yet found any reason why the laser output would skew after turning the robot around. Nor can we think which of the edits made in the past would cause this. We're still poking around and keeping track of what's what.
Asked by halberd on 2019-03-26 07:12:17 UTC
We've just discovered that the bug occurred again, confirming that
move_base
doesn't publish anything oncmd_vel
after the goal has been reached and a message with all zeroes has been sent.Asked by halberd on 2019-03-28 05:08:19 UTC
Are you
rosbag
ing everything? Could/would help with post-mortem debugging.re:
cmd_vel
: sounds like adding a time-out would be a good idea.Asked by gvdhoorn on 2019-03-28 05:11:22 UTC
We'll look into making a time-out for
cmd_vel
, as it is currently a rarely occurring issue at the moment. Our priority is on the map skewing issue, for which I've indeed made arosbag
just now.Asked by halberd on 2019-03-28 05:31:05 UTC
having such a time-out is considered a soft safety feature. I would not want to sit in a wheelchair that could potentially become uncontrollable ..
Asked by gvdhoorn on 2019-03-28 06:03:47 UTC
agreed. thankfully we only make use of the base of the wheelchair; nobody can sit on the robot either. On top of it is an emergency button we've used plenty of times as a result. During testing, I've found that there's several
map update loop missed its desired refresh rate
errors, as such I'll delve into those further, believing it might be related to the skewing issue.Asked by halberd on 2019-03-28 06:09:57 UTC
It might pay to spend some time installing a monitoring tool. Something like collectd. Or perhaps ethz-asl/ros-system-monitor. That way you could monitor CPU usage time-synced with the other messages in your bag.
that was what I pointed at in my first comment as well.
Asked by gvdhoorn on 2019-03-28 06:24:35 UTC
indeed. that might help us get closer to discovering the cause. In the past ~20 minutes i've driven the robot about, both manual and autonomous, and all three issues did not occur; the CPU load was still light (maximum 40%), the laserscan did not get skewed and the drifting issue only occured once a person walked in front of it (Might've been recovery behaviour, but it continued going backwards infinitely.) I've recorded this session in a bag file, too.
Asked by halberd on 2019-03-28 06:29:55 UTC