ROS Resources: Documentation | Support | Discussion Forum | Index | Service Status | ros @ Robotics Stack Exchange
Ask Your Question

Shouldn't smach transition even if the action process has died?

asked 2015-06-11 15:49:38 -0500

updated 2016-05-31 23:56:32 -0500

lucasw gravatar image


I am using a smach concurrent state machine and I want it to transition, even if one of the child states has died. It seems like it sits there waiting for the pre-emption to return, when it never will, since the process has died. This leads to undesirable behavior where the state machine will not transition to other states when command messages are sent.

My concurrent state is composed of the following states:

  • A simple action state that executes a long running behavior, implemented as a ROS nodelet.
  • A monitor state that subscribes to a ROS topic, and transitions when that topic is received.

This all works fine if the nodelet is running as expected. I can publish a topic which the monitor state listens for, the monitor state becomes invalid, and terminates, causing the outcome of the concurrent state to transition to a different state.

The problem comes when the simple action state has died (due to software bugs causing a crash.) Of course, I am fixing these bugs to prevent future crashes of the nodelet process. I am also now setting the problem nodelet to respawn in the launch file. However, for robustness, I would like for the state machine to still transition, even if this process has died.

Is this behavior by design or is there a way to force the concurrence behavior to transition, even if one of the children does not respond to the pre-emption request? In my outcome map, I am using the following, which I thought would cause it to terminate from the monitor state alone, but I think it is waiting for a response from the crashed nodelet.

child_termination_cb=(lambda so: True)

Thanks in advance!

Edit: I am adding my subclassed code for SimpleActionState: Edit2: The code is working now and properly aborts the state:

import roslib
import rospy
import smach
import smach_ros
from smach import State, StateMachine, CBState
from smach_ros import SimpleActionState, IntrospectionServer
from std_msgs.msg import Empty

import threading
import traceback
import copy

from actionlib.simple_action_client import SimpleActionClient, GoalStatus

import smach
from smach.state import *

__all__ = ['WatchdogActionState']

class WatchdogActionState(SimpleActionState):
    """Watchdog Simple action client state.  
       Subclasses the Smach SimpleActionState to provide watchdog capability on the actionserver process execution.

"""Constructor for SimpleActionState action client wrapper."""
# Overridden constructor:
def __init__(self,
        # Action info
        # Feedback (heartbeat) timeout = how long to wait for a process that has died
        # before aborting this state.
        feedback_wait_timeout = rospy.Duration(3.0),
        # Default goal 
        goal = None,
        goal_key = None,
        goal_slots = [],
        goal_cb = None,
        goal_cb_args = [],
        goal_cb_kwargs = {},
        # Result modes
        result_key = None,
        result_slots = [],
        result_cb = None,
        result_cb_args = [],
        result_cb_kwargs = {},
        # Keys
        input_keys = [],
        output_keys = [],
        outcomes = ['succeeded','aborted','preempted'],
        # Timeouts
        exec_timeout = None,
        preempt_timeout = rospy.Duration(1.0),
        server_wait_timeout = rospy.Duration(1.0)

# Initialize base class
rospy.loginfo("init watchdog_action_state for action_name [%s], action_spec [%s]", action_name, action_spec)
    # The feedback (hearbeat) timeout
    self._feedback_wait_timeout = feedback_wait_timeout
rospy.loginfo("server_wait_timeout [%f]", self._feedback_wait_timeout.to_sec())
    self._feedback_time = rospy.Time ...
edit retag flag offensive close merge delete


I found a related issue, but it doesn't say how to address the problem with smach: simpleactionstate-behaviour-if-action-server-crashes

ceverett gravatar image ceverett  ( 2015-06-11 16:08:33 -0500 )edit

I have it working now. I pasted the correct code now, which subclasses SimpleActionState and will abort the state in a separate heartbeat thread if it hasn't received feedback in 3 seconds. If you have any feedback that would be great, but I am marking this as answered. Thanks again.

ceverett gravatar image ceverett  ( 2015-06-22 09:25:39 -0500 )edit

1 Answer

Sort by ยป oldest newest most voted

answered 2015-06-11 18:25:59 -0500

jbohren gravatar image

updated 2015-06-11 18:32:55 -0500

You would need to create a NotSoSimpleActionState which accepts a timeout that can be used to monitor the actionlib status topic for the action. Have the state abort if this heartbeat timeout is violated.

The problem is that when the concurrence tries to preempt the SimpleActionState, it's trying to preempt the action, which will never preempt, because it has died.

edit flag offensive delete link more


Thank you very much. I added the code for my first pass at this suggestion to the question above. However, I am not sure how I can abort? Please take a look at the method/thread _heartbeat_timer() where I try to abort. What should I call to force an abort? Thanks!

ceverett gravatar image ceverett  ( 2015-06-19 16:25:35 -0500 )edit

Question Tools



Asked: 2015-06-11 15:49:38 -0500

Seen: 523 times

Last updated: May 31 '16