20-60 second delays in actionlib.SimpleActionClient.wait_for_server()

asked 2017-12-11 04:54:57 -0500

lindzey gravatar image

updated 2017-12-11 04:55:23 -0500

I am sporadically seeing actionlib’s wait_for_server() take tens of seconds to return when starting a new client node, even when the server node is already up, running, and responding to goals from other nodes.

It’s disconcerting to have this sometimes return instantly, and sometimes require up to a minute.

Is this expected? Has anybody seen this before? Does it reflect some error in how we’ve configured the system or how we're using actionlib?


We are running Indigo on Ubuntu 14.04, using the python implementation of SimpleActionClient/SimpleActionServer, and a single node implements multiple servers.

Unfortunately, I have been unable to reproduce this delay in simulation or on the bench. So far, it has only (sporadically) shown up when testing in the field. I don’t have enough samples to say for sure, but the delays might get longer the longer we’re operating. Additionally, I have not seen this the first time a client connects to the server; it seems to happen on subsequent connections.

Based on the SimpleActionClient’s and SimpleActionServer’s log files, I think the issue is that the /cancel and /goal topics aren’t connecting immediately: note the 20 second delay between 13:51:55 -> 13:52:15.

If I understand the logs, the problem was that rosmaster needed to push out updates regarding who was publishing/subscribing in order for the SimpleActionServer to properly set up inbound connections.

I’m only seeing it in this particular node, and only for inbound. (While waiting for the client to start, we used rostopic echo to create new connections on other topics.)


Here are what I think are the relevant bits of the logs:

  • Server (/motions)
  • Client (/planner) (Note that there was already another client node connected to the server at this time, on the same topics.
  • master.log (I removed +PUB, +SUB, publisherUpdate notifications for other topics, but left in all errors)
edit retag flag offensive close merge delete

Comments

Without an MWE this is going to be difficult to debug, so let's hope someone has seen this before and happens upon your post.

In any case this sounds like a proper problem, so perhaps opening an issue over at ros/actionlib/issues would be appropriate.

gvdhoorn gravatar image gvdhoorn  ( 2017-12-11 06:38:32 -0500 )edit

Do first please check the open issues, as it may already have been reported (ros/actionlib#75 is C++, but sounds similar fi).

gvdhoorn gravatar image gvdhoorn  ( 2017-12-11 06:40:08 -0500 )edit
1

I've read through all open issues, and none seem to be the same thing. I've submitted this as ros/actionlib#93)

lindzey gravatar image lindzey  ( 2018-01-07 06:46:00 -0500 )edit