actionlib/SimpleActionClient sometimes not subscribe result topic

asked 2018-10-22 05:44:42 -0600

ksawa gravatar image

updated 2018-10-22 21:36:38 -0600

I have a simple reproduction code, it tries action communication in the loop. In my environment, waitForResult() will time out as a result of miss subscribe result by approximately 500 - 10000 tries. I'm running ROS Kinetic, actionlib 1.11.13, ros-comm 1.12.14, and Ubuntu 16.04.5 LTS AMD64, with kernel 4.15.0.

To investigate further, I checked result_pub_.getNumSubscribes() in action_server_impl.h publishResult(). The number of subscribers is one in the first execution, and the become two. When a timeout occurred, the number of subscribers on the resulting topic was decreasing. Perhaps when running in the loop, one of the subscribers is the previous subscriber and the other is the current subscriber.

Is this the wrong way to use actionlib or is it a bug? I suppose it is a bug that sometimes fails to register subscriber, so maybe it will be occur not only the loop but also first execution.

Client log:

[ INFO] [1539855711.275219044]: sendGoal [570]
[ INFO] [1539855711.672797568]: sendGoal [571]
[ INFO] [1539855711.979528457]: sendGoal [572]
[ INFO] [1539855712.278595804]: sendGoal [573]
[ INFO] [1539855712.588373298]: sendGoal [574]
[ INFO] [1539855712.888758167]: sendGoal [575]
[ INFO] [1539855713.092482734]: sendGoal [576]
[ INFO] [1539855713.401528966]: sendGoal [577]
[ INFO] [1539855713.702749769]: sendGoal [578]
[ERROR] [1539855718.703182372]: waitForResult timeout 578

Server log:

[ INFO] [1539855712.599485422]: getNumSubscribers() = 2
[ INFO] [1539855712.600284986]: setSucceeded
[ INFO] [1539855712.899739017]: getNumSubscribers() = 2
[ INFO] [1539855712.899964403]: setSucceeded
[ INFO] [1539855713.104094805]: getNumSubscribers() = 2
[ INFO] [1539855713.104387345]: setSucceeded
[ INFO] [1539855713.412796118]: getNumSubscribers() = 2
[ INFO] [1539855713.413093028]: setSucceeded
[ INFO] [1539855713.713709330]: getNumSubscribers() = 1
[ INFO] [1539855713.713923076]: setSucceeded

Client code:

#include <ros/ros.h>
#include <actionlib/client/simple_action_client.h>
#include <control_msgs/FollowJointTrajectoryAction.h>

void spinThread()
{
  ros::NodeHandle nh;
  ros::spin();
}

int main(int argc, char** argv)
{
  ros::init(argc, argv, "TestClient");

  boost::thread spin_thread(&spinThread);

  uint32_t i = 1;
  control_msgs::FollowJointTrajectoryGoal goal;

  while (ros::ok())
  {
    actionlib::SimpleActionClient<control_msgs::FollowJointTrajectoryAction> ac("follow_joint_trajectory_action", true);
    ac.waitForServer();

    ROS_INFO("sendGoal [%u]", i);
    ac.sendGoal(goal);

    if (!ac.waitForResult(ros::Duration(5.0)))
    {
      ROS_ERROR("waitForResult timeout %u", i);
      while(1);
    }

    i++;
  }

  ros::shutdown();
  spin_thread.join();

  return 0;
}

Server code:

#include <ros/ros.h>
#include <actionlib/server/simple_action_server.h>
#include <control_msgs/FollowJointTrajectoryAction.h>

class TestServer
{
protected:
  ros::NodeHandle nh_;
  actionlib::SimpleActionServer<control_msgs::FollowJointTrajectoryAction> as_;

public:
  TestServer(): as_(nh_, "follow_joint_trajectory_action", boost::bind(&TestServer::goalCb, this, _1), false)
  {
    as_.start();
  }

  ~TestServer()
  {
  }

  void goalCb(const control_msgs::FollowJointTrajectoryGoalConstPtr &goal)
  {
    ros::Duration(0.01).sleep();
    control_msgs::FollowJointTrajectoryResult result;
    as_.setSucceeded(result);
    ROS_INFO("setSucceeded");
  }
};

int main(int argc, char** argv)
{
  ros::init(argc, argv, "TestServer");
  TestServer ts;
  ros::spin();

  return 0;
}

Modified actionlib/include/actionlib/server/action_server_imp.h

template<class ActionSpec>
void ActionServer<ActionSpec>::publishResult(const actionlib_msgs::GoalStatus & status,
  const Result & result)
{
  boost::recursive_mutex::scoped_lock lock(this->lock_);
  // we'll create a shared_ptr to pass to ROS to limit copying
  boost::shared_ptr<ActionResult> ar(new ActionResult);
  ar->header.stamp = ros::Time::now();
  ar->status = status;
  ar->result = result;
  ROS_DEBUG_NAMED("actionlib", "Publishing result for goal with id: %s and stamp: %.2f",
    status.goal_id.id.c_str(), status.goal_id.stamp.toSec());
  ROS_INFO("getNumSubscribers() = %u", result_pub_.getNumSubscribers()); // ADD
  result_pub_.publish(ar);
  publishStatus();
}
edit retag flag offensive close merge delete

Comments

I believe this is a known issue with the ROS 1 implementation of actionlib.

Please see the issue tracker for related issues.

gvdhoorn gravatar imagegvdhoorn ( 2018-10-22 05:54:31 -0600 )edit

Thank you for your advice. I've posted as new issue here.

ksawa gravatar imageksawa ( 2018-11-05 05:47:12 -0600 )edit

My suggestion was to see whether one of the current bugreports already captures what you found and then provide more information there. Not necessarily to open a new issue.

gvdhoorn gravatar imagegvdhoorn ( 2018-11-05 06:52:34 -0600 )edit