Subscriber Resubscription Problem [closed]
Hello, I have yet to find a solution to this problem, and I have not been able to replicate it artificially. I am wondering if anyone has come across it.
I have a program that is built upon a ROS network. There are times when publishers are frozen or taken offline, and later unfrozen or respawned. Sometimes after this, the subscriber does not receive any messages; it simply does nothing until it is shut down and restarted, at which point it starts receiving messages again. Has anyone come across this before?
I have tried to induce this artificially by manually shutting down and restarting a publisher but the subscriber always reconnects to the topic.
Some information that may be useful: we are currently running ROS Hydro on a custom WinROS build. The deployed system is split across two computers.
ROS_IP is set
ROS_MASTER_URI is set
ROSLaunch is not being used, so bringing down the first node does not bring down ROSCore ( https://answers.ros.org/question/1017... )
Is there a way to explicitly tell the ROS Master to send another callback to nodes that are subscribed to a topic informing them that there is a new publisher?
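For illustration, here is a rough sketch of what such a manual "resend" could look like from an external script, using the Master API calls getSystemState and lookupNode and the same publisherUpdate Slave API call that appears in the master traceback. The names CALLER_ID, publisher_nodes and resend_publisher_update are hypothetical helpers made up for this example, not part of ROS, and I have not verified this against Hydro on WinROS.

```python
try:
    import xmlrpclib  # Python 2, as used by ROS Hydro
except ImportError:
    import xmlrpc.client as xmlrpclib  # Python 3

CALLER_ID = '/resend_publisher_update'  # hypothetical caller name


def publisher_nodes(system_state, topic):
    """Return the names of the nodes the master lists as publishing `topic`.

    `system_state` is the third element of the getSystemState() reply:
    [publishers, subscribers, services], each a list of [name, [node, ...]].
    """
    for name, nodes in system_state[0]:
        if name == topic:
            return list(nodes)
    return []


def resend_publisher_update(master, topic, subscriber_node,
                            proxy_factory=xmlrpclib.ServerProxy):
    """Push the master's current publisher list for `topic` to one subscriber."""
    code, msg, state = master.getSystemState(CALLER_ID)
    if code != 1:
        raise RuntimeError('getSystemState failed: %s' % msg)

    # Resolve each publishing node to its XML-RPC URI.
    pub_uris = []
    for node in publisher_nodes(state, topic):
        code, msg, uri = master.lookupNode(CALLER_ID, node)
        if code == 1:
            pub_uris.append(uri)

    code, msg, sub_uri = master.lookupNode(CALLER_ID, subscriber_node)
    if code != 1:
        raise RuntimeError('lookupNode failed: %s' % msg)

    # Same Slave API call the master makes when a publisher registers.
    proxy_factory(sub_uri).publisherUpdate(CALLER_ID, topic, pub_uris)
    return pub_uris
```

It would be called with something like resend_publisher_update(xmlrpclib.ServerProxy(os.environ['ROS_MASTER_URI']), '/some_topic', '/some_subscriber'), where the topic and node names are placeholders. Note that if the URI the master holds for the subscriber is itself stale, this call would presumably fail in the same way the master's own update does.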
EDIT:
I finally produced the error and roswtf can detect it with: "the following nodes should be connected but aren't"
I have continued digging and still have no solution, though I have found more information that may help. The ROS Master logs list the following errors, which lead to the following open issue: https://github.com/ros/ros_comm/issue...
[rosmaster.master][INFO] 2018-04-04 15:51:15,819: publisherUpdate[/node2] -> http://192.168.0.134:55036/
[rosmaster.threadpool][ERROR] 2018-04-04 15:51:16,823: Traceback (most recent call last):
  File "C:\opt\ros\hydro\x64\lib\site-packages\rosmaster\threadpool.py", line 218, in run
    result = cmd(*args)
  File "C:\opt\ros\hydro\x64\lib\site-packages\rosmaster\master_api.py", line 189, in publisher_update_task
    xmlrpcapi(api).publisherUpdate('/master', topic, pub_uris)
  File "C:\Python27\lib\xmlrpclib.py", line 1224, in __call__
    return self.__send(self.__name, args)
  File "C:\Python27\lib\xmlrpclib.py", line 1578, in __request
    verbose=self.__verbose
  File "C:\Python27\lib\xmlrpclib.py", line 1264, in request
    return self.single_request(host, handler, request_body, verbose)
  File "C:\Python27\lib\xmlrpclib.py", line 1292, in single_request
    self.send_content(h, request_body)
  File "C:\Python27\lib\xmlrpclib.py", line 1439, in send_content
    connection.endheaders(request_body)
  File "C:\Python27\lib\httplib.py", line 991, in endheaders
    self._send_output(message_body)
  File "C:\Python27\lib\httplib.py", line 844, in _send_output
    self.send(msg)
  File "C:\Python27\lib\httplib.py", line 806, in send
    self.connect()
  File "C:\Python27\lib\httplib.py", line 787, in connect
    self.timeout, self.source_address)
  File "C:\Python27\lib\socket.py", line 571, in create_connection
    raise err
error: [Errno 10061] No connection could be made because the target machine actively refused it
[rosmaster.master][INFO] 2018-04-04 16:03:50,592: publisherUpdate[/node] -> http://192.168.0.134:55698/
[rosmaster.threadpool][ERROR] 2018-04-04 16:03 ...
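Errno 10061 means the TCP connection to the subscriber's XML-RPC address was refused, so one thing that may be worth checking is whether the URI the master is sending publisherUpdate to still has anything listening on it. A minimal sketch of such a probe, using only the Python standard library (xmlrpc_uri_reachable is a hypothetical helper name, not a ROS function):

```python
import socket

try:
    from urlparse import urlparse  # Python 2
except ImportError:
    from urllib.parse import urlparse  # Python 3


def xmlrpc_uri_reachable(uri, timeout=1.0):
    """Return True if a TCP connection to the host and port in `uri` succeeds."""
    parsed = urlparse(uri)
    try:
        sock = socket.create_connection((parsed.hostname, parsed.port), timeout)
    except (socket.error, socket.timeout):
        # Refused, timed out, or otherwise unreachable.
        return False
    sock.close()
    return True
```

For example, xmlrpc_uri_reachable('http://192.168.0.134:55036/') could be run from the master's machine against the URI in the log. A False result would suggest the master is holding a stale registration for that node, although it does not by itself explain why.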
This will most likely make it very difficult for people to reproduce this and/or help you. You're also almost 5 ROS releases and 3 years behind current developments. The problem you are experiencing could have already been fixed.
I know. I'm interested to see if anyone has come across the issue and can point me in any direction. So far my research hasn't given any results. If it has been fixed, that would be great; I'd at least know what is causing the problem.
Would using bond to have your subscribers automatically be recreated whenever the publisher is killed be a useful addition?
That may help, I'll look into it, thank you.
I finally produced the error and roswtf can detect it with: the following nodes should be connected but aren't...
That's good to hear. Perhaps add some details on how to reproduce it? That could potentially help people.
If you can create an MWE and it is also an issue with current HEAD, then that would really help.
Unfortunately I cannot give an MWE. I have figured out, though, that when the nodes aren't connected, the node which I have respawned is not given a new port number by ROS, which leads me to believe it may be a shutdown issue.