Ask Your Question

Revision history [back]

click to hide/show revision 1
initial version

For a few months I sidestepped this problem. But today I've found the deadlock. I reckon there's a bug in simple_action_server.py.

I wasn't able to get the python debugger to produce the stack traces that explained the problem. But after looking at the roslog debug messages and the source code, I came up with the following reconstruction of the events:

Thread A (rospy's message dispatcher?)

  • Deadlocked in: self.execute_condition.acquire()
  • In function: SimpleActionServer.internal_goal_callback() [simple_action_server.py:211] which was called from: ActionServer.internal_goal_callback() [action_server.py:293]
  • This thread has ActionServer.lock and wants to acquire SimpleActionServer.lock (condition variable was initialised with the latter lock).

Thread B (SimpleActionServer's executeLoop thread)

  • Deadlocked in: with self.action_server.lock
  • In function ServerGoalHandle.set_accepted() [server_goal_handle.py:71] which was called from: SimpleActionServer.accept_new_goal()[simple_action_server.py:131] which was called from: SimpleActionServer.executeLoop()[simple_action_server.py:284] which at that point is holding SimpleActionServer.lock
  • This thread wants ActionServer.lock and has SimpleActionServer.lock

In summary, if if a new goal arrives at the same time executeLoop is trying to get a previous (but still new, SimpleActionServer will deadlock.

I suspect the solution involves calling accept_new_goal() [simple_action_server.py:284] without holding SimpleActionServer.lock. My intuition is that simply setting a flag will do, but I will have to study the code a bit more to make sure there no side-effects.

I will try to code the solution myself unless someone fixes it quicker :)

Best, Miguel S.