For a few months I sidestepped this problem. But today I've found the deadlock. I reckon there's a bug in

I wasn't able to get the Python debugger to produce stack traces that would explain the problem. But after looking at the roslog debug messages and the source code, I came up with the following reconstruction of the events:

Thread A (rospy's message dispatcher?)

  • Deadlocked in: self.execute_condition.acquire()
  • In function: SimpleActionServer.internal_goal_callback(), which was called from: ActionServer.internal_goal_callback()
  • This thread has ActionServer.lock and wants to acquire SimpleActionServer.lock (the condition variable was initialised with the latter lock).

Thread B (SimpleActionServer's executeLoop thread)

  • Deadlocked in: with self.action_server.lock
  • In function: ServerGoalHandle.set_accepted(), which was called from: SimpleActionServer.accept_new_goal(), which was called from: SimpleActionServer.executeLoop(), which at that point is holding SimpleActionServer.lock
  • This thread wants ActionServer.lock and has SimpleActionServer.lock

In summary, if a new goal arrives at the same time executeLoop is trying to accept a previous (but still new) goal, SimpleActionServer will deadlock.
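To make the lock ordering concrete, here is a minimal sketch of the same pattern in plain Python. It does not use actionlib at all: the two locks are hypothetical stand-ins for ActionServer.lock and SimpleActionServer.lock, and the timeout exists only so the demo terminates (the real threads would block forever).

```python
import threading


def demonstrate_lock_inversion(timeout=0.5):
    """Two threads take the same two locks in opposite order and block each other."""
    # Hypothetical stand-ins for ActionServer.lock and SimpleActionServer.lock.
    action_server_lock = threading.Lock()
    simple_server_lock = threading.Lock()

    both_hold_first = threading.Barrier(2)  # both threads own their first lock
    both_tried = threading.Barrier(2)       # both threads finished their attempt
    got_second = {}

    def worker(name, first, second):
        with first:
            both_hold_first.wait()
            # The real code would wait here forever; the timeout ends the demo.
            acquired = second.acquire(timeout=timeout)
            got_second[name] = acquired
            if acquired:
                second.release()
            # Keep holding `first` until the other thread has also given up.
            both_tried.wait()

    # Thread A: ActionServer.lock first, then SimpleActionServer.lock.
    thread_a = threading.Thread(
        target=worker, args=("A", action_server_lock, simple_server_lock))
    # Thread B: SimpleActionServer.lock first, then ActionServer.lock.
    thread_b = threading.Thread(
        target=worker, args=("B", simple_server_lock, action_server_lock))
    thread_a.start()
    thread_b.start()
    thread_a.join()
    thread_b.join()
    return got_second


if __name__ == "__main__":
    print(demonstrate_lock_inversion())
```

Neither thread gets its second lock, so both entries in the returned dictionary are False. That is the same cycle as Thread A and Thread B above.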

I suspect the solution involves calling accept_new_goal() without holding SimpleActionServer.lock. My intuition is that simply setting a flag will do, but I will have to study the code a bit more to make sure there are no side effects.
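A rough sketch of what I mean by "setting a flag", again in plain Python and not as a patch to actionlib. ToyServer, outer_lock, inner_lock, new_goal_flag and accepted are all hypothetical names; the real accept_new_goal() does more than bump a counter, so this only illustrates the locking order, not the side effects I still need to check.

```python
import threading


class ToyServer:
    """Hypothetical stand-in for the two-lock structure; not actionlib code."""

    def __init__(self):
        self.outer_lock = threading.Lock()  # plays the role of ActionServer.lock
        self.inner_lock = threading.Lock()  # plays the role of SimpleActionServer.lock
        self.new_goal_flag = False
        self.accepted = 0

    def goal_callback(self):
        # Dispatcher side: outer lock first, then inner lock (as in Thread A).
        with self.outer_lock:
            with self.inner_lock:
                self.new_goal_flag = True

    def execute_once(self):
        # Executor side: read and clear the flag under the inner lock only.
        with self.inner_lock:
            pending = self.new_goal_flag
            self.new_goal_flag = False
        # The inner lock is released before the outer lock is requested,
        # so this thread never holds one lock while waiting for the other.
        if pending:
            with self.outer_lock:
                self.accepted += 1  # stands in for accepting the goal
        return pending


def run_demo(rounds=1000):
    """Hammer both code paths concurrently and report whether they finished."""
    server = ToyServer()

    def dispatcher():
        for _ in range(rounds):
            server.goal_callback()

    def executor():
        for _ in range(rounds):
            server.execute_once()

    threads = [threading.Thread(target=dispatcher, daemon=True),
               threading.Thread(target=executor, daemon=True)]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=10)
    finished = not any(t.is_alive() for t in threads)
    if finished:
        server.execute_once()  # pick up a goal flagged after the executor stopped
    return finished, server.accepted


if __name__ == "__main__":
    print(run_demo())
```

Because the executor only ever takes the locks one at a time, the cycle from the reconstruction above cannot form. Note that goals arriving back to back collapse into a single flag here, so the accepted count can be lower than the number of callbacks.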

I will try to code the solution myself unless someone fixes it quicker :)

Best, Miguel S.