Actionlib reconnect problem (c++ , with test case)
I'm unable to get my action clients to reconnect to their servers for more than a single request when the connection is temporarily lost (e.g. when the server node is restarted). It may be that I'm doing something wrong with threading. I created a very simple test case [1] (server.cpp and client.cpp) to illustrate the problem.
To see the problem, simply start the client, and leave it running. It will report its connection status every second, and attempt to send a goal if it's connected. In another terminal, run the server, then stop it, and then start it again. The following is the full output of the client terminal, with comments denoting when I start and stop the server.
$ rosrun actionlib_cpp_disconnect client # in first terminal client not connected, skipping goal=0! client not connected, skipping goal=1! #### here i started the server, second terminal client connected, sending goal=2! client connected, sending goal=3! #### here i stopped the server, second terminal client not connected, skipping goal=4! client not connected, skipping goal=5! #### here i started the server again, second terminal [ WARN] [1375230239.830277763]: goalConnectCallback: Trying to add [/server] to goalSubscribers, but it is already in the goalSubscribers list [ WARN] [1375230239.830734224]: cancelConnectCallback: Trying to add [/server] to cancelSubscribers, but it is already in the cancelSubscribers list client connected, sending goal=6! # the first one after the reconnect always works ... client not connected, skipping goal=7! # wtf? the server is running ... client not connected, skipping goal=8! client not connected, skipping goal=9!
Is this standard behaviour? So far, the only workaround I've found is to manually delete the action client whenever it reports that it's disconnected, and re-construct a new one in its place.
I'm using ROS Groovy on Ubuntu Precise 12.04 with actionlib version 1.9.11-0precise-20130325-2034-+0000.
Here's the server code:
void cb_goal(actionlib::ServerGoalHandle<actionlib::testaction> srv_handle) { printf("server received new goal, goal=%d!\n", srv_handle.getGoal()->goal); srv_handle.setRejected(); } int main(int argc, char **argv) { ros::init(argc, argv, "server"); ros::NodeHandle nh; actionlib::ActionServer<actionlib::testaction> server(nh, "action", cb_goal, false); /* auto_start = false */ server.start(); ros::spin(); return 0; }
Here's the client code:
void ros_spin() { ros::spin(); } int main(int argc, char** argv) { ros::init(argc, argv, "client"); ros::NodeHandle nh; actionlib::ActionClient<actionlib::testaction> client("action"); boost::thread mythd(ros_spin); actionlib::TestGoal goal; goal.goal = 0; ros::Rate loop_rate(1.0); while (ros::ok()) { if (client.isServerConnected()) { printf("client connected, sending goal=%d!\n", goal.goal); client.sendGoal(goal); } else { printf("client not connected, skipping goal=%d!\n", goal.goal); } loop_rate.sleep(); goal.goal++; } return 0; }
This is an extremely good question. Tried to figure out for an hour what's going on, but gave up. I started out by replacing Client and Server by their Simple* versions, just to make sure you got all state transitions right (see https://gist.github.com/mintar/6132716, FWIW). Sounds like a bug to me.
I submitted a bug against actionlib here: https://github.com/ros/actionlib/issues/7. I'll keep this question updated.