controller_manager behaviour when a hardware error occurs

asked 2020-08-18 13:05:52 -0500

grouchy
53 ●5 ●7 ●9

I'm integrating ros_control and thus controller_manager with a new robot, but I can't really figure out the best way to handle robot hardware errors.

Let's say a robot goes into an error state while a joint_trajectory_controller is running and the joints stop servoing, right now I call update() with reset_controllers == true which aborts any active action goal, but the controller is restarted and will accept new goals. If another action is sent to the controller, it tries to run happily but it actually isn't moving the arm because the arm has stopped servoing!

It seems like I almost need a stop_controllers flag in the update function, or otherwise an interface in controller_manager to stop the controllers without having to call the switch_controllers service which seems messy.

I tried completely destroying the controller_manager while the arm is disabled but I'd quite like to keep passive controllers like the joint_state_controller running.

What's the recommended way to deal with this?

edit retag flag offensive close merge delete

add a comment

answered 2020-08-18 13:58:50 -0500

gvdhoorn
86574 ●283 ●1432 ●1054 http://cor.tudelft.nl/

updated 2020-08-19 03:01:46 -0500

I'm not sure whether they're recommended, but I've seen two approaches:

use an external node which monitors your robot (make your hardware_interface publish some sort of robot state which includes whether it's errored or not) and then have the external node call the switch_controllers service of the controller_manager. An example of this approach would be Universal_Robots_ROS_Driver (overview in readme, controller_stopper node).
as part of your hardware_interface, check the state of your robot, and whenever you encounter an error, use ControllerManager::getControllerByName(..) to retrieve all controllers you'd like/need to stop, and when any of them isRunning(), call stopRequest(..). I've seen this approach used in a proprietary component, so I can't link you to it.

Both approaches result in the same thing: a set of controllers, preconfigured by you (or your user), will be stopped upon the occurence/detection of an error (either in the remote system, or your own hardware_interface).

an interface in controller_manager to stop the controllers without having to call the switch_controllers service which seems messy

It is, but on the other hand, you do use the interfaces as they were intended to be used.

At least the first approach removes the responsiblity of managing controller runtime state from the hardware_interface, and places it in a separate node. That seems cleaner than embedding the logic in it (at the very least the separate node allows to relatively easily change the policy).

Edit:

As you say, the controllers are not the responsibility of the hardware interface, they were owned by whatever client is using the controllers,

If you understood it that way, that's not what I meant to say.

I agree the controllers are not the responsibility of the hardware_interface, but I don't agree they are owned by the "client". They are managed by the controller_manager, which in the end is owned by the node which wraps the entire hardware_interface.

thus the client should decide when to stop controllers when things go wrong, which is closer in concept to option 1.

It would be a different client, but yes: keeping that separate from the hardware_interface itself seems like an OK idea to me.

I think a good compromise would be a dynamic resource based approach where the joint resources could be made unavailable at any time by the hardware interface (e.g. position interfaces made unavailable, but joint states still available when the arm goes into disabled mode), and controllers can react accordingly by cancelling actions and refusing to start new ones until their resources are returned.

I would have to reason about it a bit more, but I'm not sure I like the idea of removing command handles upon error. Semantically, it would seem like it "gives the wrong message": it's not the command facilities of your robot suddenly don't exist any more, it's the state of those resources has changed (ie: command handles can't actually command the joint they represent).

Seeing as this ... (more)

edit flag offensive delete link

Comments

I reckon option 2 fits best my understanding of how ros_control works, though most likely its my understanding that's wrong in the first place. As you say, the controllers are not the responsibility of the hardware interface, they were owned by whatever client is using the controllers, and thus the client should decide when to stop controllers when things go wrong, which is closer in concept to option 1.

I think a good compromise would be a dynamic resource based approach where the joint resources could be made unavailable at any time by the hardware interface (e.g. position interfaces made unavailable, but joint states still available when the arm goes into disabled mode), and controllers can react accordingly by cancelling actions and refusing to start new ones until their resources are returned.

grouchy ( 2020-08-18 16:41:15 -0500 )edit

add a comment

controller_manager behaviour when a hardware error occurs

1 Answer

Comments

Question Tools

Stats

Related questions

controller_manager behaviour when a hardware error occurs edit

1 Answer

Comments

Question Tools

Stats

Related questions

controller_manager behaviour when a hardware error occurs