pthread_mutex_lock segfault on parallelized SimpleActionGoal publish

asked 2020-10-29 20:04:42 -0500

mmiles19

I'm trying to perform multithreaded action server calls using independent SimpleActionClients/Servers, and I'm getting a segfault with the following GDB output:

#0 __GI___pthread_mutex_lock (mutex=0x110) at ../nptl/pthread_mutex_lock.c:65

#1 0x00007ffff74793d0 in ros::Publication::hasSubscribers() () from /opt/ros/melodic/lib/libroscpp.so

#2 0x00007ffff746a246 in ros::TopicManager::publish(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, boost::function<ros::SerializedMessage ()> const&, ros::SerializedMessage&) () from /opt/ros/melodic/lib/libroscpp.so

#3 0x00007fffd7b66c6d in ros::Publisher::publish<my_package::MyActionGoal_<std::allocator<void> > const> (this=0x7fff50000d08, message=...) at /opt/ros/melodic/include/ros/publisher.h:88

#4 0x00007fffd7b58b8a in actionlib::ActionClient<my_package::MyAction_<std::allocator<void> > >::sendGoalFunc (this=0x7fff50000b60, action_goal=...) at /opt/ros/melodic/include/actionlib/client/action_client.h:209

I do this on the client side by, in parallel, constructing a new SimpleActionClient for each thread, sending the corresponding goal, and waiting for the result. It looks something like this:

void threadedFunc(int index){
   // One client per thread, each bound to its own action name
   SimpleActionClient client("some_action_"+to_string(index));
   SimpleActionGoal goal;
   goal.index = index;
   client.sendGoal(goal);
   client.waitForResult();
   SimpleActionResult result = client.getResult();
}
int main(){
   for (int i=0; i<numThreads; i++){
     // Schedule a callable; threadedFunc(i) alone would run the call inline
     threadpool.schedule([i]{ threadedFunc(i); });
   }
   threadpool.wait();
}

On the server side:

void serverCallback(SimpleActionGoal goal){
  ROS_INFO("Server callback for thread %d", goal.index);
  SimpleActionResult result;
  // Route the result to the server that owns this goal
  switch(goal.index){
    case 0: server_0.setSucceeded(result); break;
    case 1: server_1.setSucceeded(result); break;
    ...
  }
}
int main(){
  SimpleActionServer server_0("some_action_0", serverCallback);
  SimpleActionServer server_1("some_action_1", serverCallback);
  ...
  spin();
}

It's sort of hacky, I know, but I'm not sure how to do this with the bare actionlib library, so I've been avoiding it. If you have tips, I'd love to hear them. Anyway, this hacky method has worked fine before, so I'm not sure what changed. The only major changes I've made recently are speed increases, so this could be a race condition. What I've read about pthread_mutex_lock segfaults supports that notion, suggesting some thread-unsafe vector access is going on, but I'm not sure how to go about fixing it.
