Robotics StackExchange | Archived questions

Subscriber Resubscription Problem

Hello, I have yet to find a solution to this problem, and I have not been able to replicate it artificially. I am wondering whether anyone else has come across it.

I have a program that is built upon a ROS network. There are times when publishers are frozen or taken offline, and later unfrozen or respawned. Sometimes after this, the subscriber does not receive any messages: it simply does nothing until it is shut down and restarted, at which point it starts receiving messages again. Has anyone come across this before?

I have tried to induce this artificially by manually shutting down and restarting a publisher, but the subscriber always reconnects to the topic.

Some information that may be useful: we are currently running ROS Hydro on a custom WinROS build. The deployed system is split across two computers.

ROS_IP is set

ROS_MASTER_URI is set

roslaunch is not being used, so bringing down the first node does not bring down roscore (https://answers.ros.org/question/10171/ros-msgs-not-received-if-publisher-starts-after-subscriber/)

Is there a way to explicitly tell the ROS Master to send another callback to nodes that are subscribed to a topic informing them that there is a new publisher?
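For illustration, here is a minimal sketch of what such a manual notification could look like, using the ROS 1 XML-RPC Master API (getSystemState, lookupNode) and Slave API (publisherUpdate). It is written for Python 3; on the Python 2.7 that ships with Hydro the module is xmlrpclib. The helper names publishers_of and resend_publisher_update, and the caller ID /manual_update, are made up for this example. It assumes the master's registration data is correct; if the master holds a stale URI for the subscriber, this call would presumably fail in the same way the master's own call does.

```python
import xmlrpc.client  # named xmlrpclib on Python 2.7


def publishers_of(system_state, topic):
    """Return the node names the master lists as publishers of `topic`.

    `system_state` is the payload of a getSystemState reply:
    [publishers, subscribers, services], each a list of [name, [node, ...]].
    """
    publishers = system_state[0]
    for name, nodes in publishers:
        if name == topic:
            return list(nodes)
    return []


def resend_publisher_update(master_uri, topic, subscriber,
                            caller_id="/manual_update"):
    """Repeat the master's publisherUpdate call to one subscriber node."""
    master = xmlrpc.client.ServerProxy(master_uri)
    code, msg, state = master.getSystemState(caller_id)
    if code != 1:
        raise RuntimeError("getSystemState failed: %s" % msg)

    # Resolve each publishing node's name to its XML-RPC URI.
    pub_uris = []
    for node in publishers_of(state, topic):
        code, msg, uri = master.lookupNode(caller_id, node)
        if code == 1:
            pub_uris.append(uri)

    # Resolve the subscriber and hand it the current publisher list.
    code, msg, sub_uri = master.lookupNode(caller_id, subscriber)
    if code != 1:
        raise RuntimeError("lookupNode failed: %s" % msg)
    sub = xmlrpc.client.ServerProxy(sub_uri)
    return sub.publisherUpdate(caller_id, topic, pub_uris)
```

It would be called as resend_publisher_update(master_uri, topic, subscriber_node_name), with the value of ROS_MASTER_URI as the first argument. Whether the subscriber actually reconnects after receiving this call is untested here.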

EDIT:

I finally produced the error and roswtf can detect it with: "the following nodes should be connected but aren't"

I have continued digging and still have no solution, though I have found more information that may help. The ROS Master logs list the following errors, which led me to this open issue: https://github.com/ros/ros_comm/issues/572


[rosmaster.master][INFO] 2018-04-04 15:51:15,819: publisherUpdate[/node2] -> http://192.168.0.134:55036/
[rosmaster.threadpool][ERROR] 2018-04-04 15:51:16,823: Traceback (most recent call last):
  File "C:\opt\ros\hydro\x64\lib\site-packages\rosmaster\threadpool.py", line 218, in run
    result = cmd(*args)
  File "C:\opt\ros\hydro\x64\lib\site-packages\rosmaster\master_api.py", line 189, in publisher_update_task
    xmlrpcapi(api).publisherUpdate('/master', topic, pub_uris)
  File "C:\Python27\lib\xmlrpclib.py", line 1224, in __call__
    return self.__send(self.__name, args)
  File "C:\Python27\lib\xmlrpclib.py", line 1578, in __request
    verbose=self.__verbose
  File "C:\Python27\lib\xmlrpclib.py", line 1264, in request
    return self.single_request(host, handler, request_body, verbose)
  File "C:\Python27\lib\xmlrpclib.py", line 1292, in single_request
    self.send_content(h, request_body)
  File "C:\Python27\lib\xmlrpclib.py", line 1439, in send_content
    connection.endheaders(request_body)
  File "C:\Python27\lib\httplib.py", line 991, in endheaders
    self._send_output(message_body)
  File "C:\Python27\lib\httplib.py", line 844, in _send_output
    self.send(msg)
  File "C:\Python27\lib\httplib.py", line 806, in send
    self.connect()
  File "C:\Python27\lib\httplib.py", line 787, in connect
    self.timeout, self.source_address)
  File "C:\Python27\lib\socket.py", line 571, in create_connection
    raise err
error: [Errno 10061] No connection could be made because the target machine actively refused it
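Errno 10061 is the Windows "connection refused" socket error, meaning nothing was listening at the URI the master tried to contact. As a rough way to check whether a registered node URI is still alive, a sketch like the following could be used. The function name xmlrpc_endpoint_reachable is made up for this example, and it only tests that the TCP port accepts connections, not that a ROS node is answering behind it.

```python
import socket
from urllib.parse import urlparse  # the urlparse module on Python 2.7


def xmlrpc_endpoint_reachable(uri, timeout=1.0):
    """Return True if a TCP connection to the host:port in `uri` succeeds."""
    parsed = urlparse(uri)
    if parsed.hostname is None or parsed.port is None:
        # ROS node URIs always carry an explicit port; treat anything else as dead.
        return False
    try:
        conn = socket.create_connection((parsed.hostname, parsed.port), timeout)
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
    conn.close()
    return True
```

For example, xmlrpc_endpoint_reachable("http://192.168.0.134:55036/") would probe the URI from the log above. A False result for a URI the master still has registered would suggest a stale registration.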


[rosmaster.master][INFO] 2018-04-04 16:03:50,592: publisherUpdate[/node] -> http://192.168.0.134:55698/
[rosmaster.threadpool][ERROR] 2018-04-04 16:03:50,592: Traceback (most recent call last):
  File "C:\opt\ros\hydro\x64\lib\site-packages\rosmaster\threadpool.py", line 218, in run
    result = cmd(*args)
  File "C:\opt\ros\hydro\x64\lib\site-packages\rosmaster\master_api.py", line 189, in publisher_update_task
    xmlrpcapi(api).publisherUpdate('/master', topic, pub_uris)
  File "C:\Python27\lib\xmlrpclib.py", line 1224, in __call__
    return self.__send(self.__name, args)
  File "C:\Python27\lib\xmlrpclib.py", line 1578, in __request
    verbose=self.__verbose
  File "C:\Python27\lib\xmlrpclib.py", line 1264, in request
    return self.single_request(host, handler, request_body, verbose)
  File "C:\Python27\lib\xmlrpclib.py", line 1297, in single_request
    return self.parse_response(response)
  File "C:\Python27\lib\xmlrpclib.py", line 1473, in parse_response
    return u.close()
  File "C:\Python27\lib\xmlrpclib.py", line 793, in close
    raise Fault(**self._stack[0])
Fault:

Asked by EpicZa on 2018-03-23 14:57:13 UTC

Comments

"we are currently running ROS Hydro on a custom WinROS build"

This will most likely make it very difficult for people to reproduce this and / or help you. You're also almost 5 ROS releases and 3 years behind current developments. The problem you are experiencing could have already been fixed.

Asked by gvdhoorn on 2018-03-23 15:39:24 UTC

I know. I'm interested to see if anyone has come across the issue and can point me in any direction. So far my research hasn't given any results. If it has been fixed, that would be great; I'd at least know what is causing the problem.

Asked by EpicZa on 2018-03-23 16:14:23 UTC

Would using bond to have your subscribers automatically be recreated whenever the publisher is killed be a useful addition?

Asked by jarvisschultz on 2018-03-23 16:26:16 UTC

That may help, I'll look into it, thank you.

Asked by EpicZa on 2018-03-24 13:32:55 UTC

I finally produced the error and roswtf can detect it with: the following nodes should be connected but aren't...

Asked by EpicZa on 2018-03-27 10:13:13 UTC

That's good to hear. Perhaps add some details on how to reproduce it? That could potentially help people.

If you can create an MWE and it is also an issue with current HEAD, then that would really help.

Asked by gvdhoorn on 2018-03-27 10:36:56 UTC

Unfortunately I cannot give an MWE. I have figured out, though, that when the nodes aren't connected, the node which I have respawned has not been given a new port number by ROS, which leads me to believe it may be a shutdown issue.

Asked by EpicZa on 2018-04-02 11:39:34 UTC

Answers