Debugging - TCP/IP connection failed errors
I have a codebase that runs reliably on a particular pc with ubuntu 16.04/kinetic, I tried running this code on another PC with 18.04/Melodic and the setup only works (i.e the ros server correctly parses messages between nodes and the robot behaves as expected) intermittently. While trying to diagnose the problem on the new PC I notice that:
"Inbound TCP/IP connection failed: connection from sender terminated before handshake header received. 0 bytes were received. Please check sender for additional details. "
Is often printed in the console. In addition, if I look in the roslaunch logs I can see many of the nodes reporting errors like:
[rosout][DEBUG] 2019-06-27 05:25:42,632: connecting to thehive 34707
[rospy.internal][INFO] 2019-06-27 05:25:42,662: topic impl's ref count is zero, deleting topic /optic_flow_node/matched_filter_out...
[rospy.internal][INFO] 2019-06-27 05:25:42,662: topic[/optic_flow_node/matched_filter_out] removing connection to
http://thehive:37109/
[rosout][DEBUG] 2019-06-27 05:25:42,663: connecting to thehive 41453
[rospy.topics][ERROR] 2019-06-27 05:25:42,665: Traceback (most recent call last):
File "/opt/ros/melodic/lib/python2.7/dist-packages/rospy/topics.py", line 326, in close
c.close()
File "/opt/ros/melodic/lib/python2.7/dist-packages/rospy/impl/tcpros_base.py", line 841, in close
self.socket.close()
AttributeError: 'NoneType' object has no attribute 'close'
[rospy.internal][WARNING] 2019-06-27 05:25:42,666: Unknown error initiating TCP/IP socket to thehive:41453 (http://thehive:37109/): Traceback (most recent call last):
File "/opt/ros/melodic/lib/python2.7/dist-packages/rospy/impl/tcpros_base.py", line 558, in connect
self.read_header()
File "/opt/ros/melodic/lib/python2.7/dist-packages/rospy/impl/tcpros_base.py", line 653, in read_header
self._validate_header(read_ros_handshake_header(sock, self.read_buff, self.protocol.buff_size))
AttributeError: 'NoneType' object has no attribute 'buff_size'
or
[rosout][INFO] 2019-06-27 05:25:42,668: receiving optic flow topic
[rospy.internal][INFO] 2019-06-29 15:15:56,082: topic[/clock] removing connection to http://thehive:41299/
[rospy.internal][WARNING] 2019-06-29 15:15:56,083: Unknown error initiating TCP/IP socket to thehive:56837 (http://thehive:41299/): Traceback (most recent call last):
File "/opt/ros/melodic/lib/python2.7/dist-packages/rospy/impl/tcpros_base.py", line 558, in connect
self.read_header()
File "/opt/ros/melodic/lib/python2.7/dist-packages/rospy/impl/tcpros_base.py", line 653, in read_header
self._validate_header(read_ros_handshake_header(sock, self.read_buff, self.protocol.buff_size))
AttributeError: 'NoneType' object has no attribute 'buff_size'
[rospy.topics][ERROR] 2019-06-29 15:15:56,084: Traceback (most recent call last):
File "/opt/ros/melodic/lib/python2.7/dist-packages/rospy/topics.py", line 326, in close
c.close()
File "/opt/ros/melodic/lib/python2.7/dist-packages/rospy/impl/tcpros_base.py", line 841, in close
self.socket.close()
AttributeError: 'NoneType' object has no attribute 'close'
I'm not 100% sure that these errors are causing the functional problems I am seeing - but so that I can at least rule it out can anyone advise what this type of warning message might suggest, and/or how I can prevent it?
Asked by jannsta on 2019-06-29 10:15:57 UTC
Comments
Are you creating
Publisher
s orSubscriber
s in the bodies of callbacks? That could explain this -- as they'd go out of scope so fast that the middleware has no chance of actually delivering the bytes to clients.Asked by gvdhoorn on 2019-06-29 10:33:23 UTC
I'm pretty sure I don't do this - I do some post processing of data in the callbacks so I guess there is a chance that new data would come in while this processing is happening - could this be a problem? I'm trying to pin down if there is a particular topic that is causing the failure. From the traceback above ("topic impl's ref count is zero...") - can I infer that /optic_flow_node/matched_filter_out is causing the TCP/IP error? Or should this information be treated separately?
Asked by jannsta on 2019-06-29 11:50:36 UTC
Did you solve this error? I have the same problem.
Asked by fabbro on 2020-07-09 02:28:42 UTC
Unfortunately I didn't ever find the root cause of this. It's a while since I worked on this task but I would be keen to know what the issue was if you manage to resolve it.
Asked by jannsta on 2020-07-09 02:38:48 UTC
I had problem with ROS tcp that data from topics was lost when the network connection was lost. The solution was configure the /etc/hosts file with the right hostname of the machine. Try to check your ros network configuration. http://wiki.ros.org/ROS/NetworkSetup#Configuring_.2Fetc.2Fhosts
Asked by VitorHirozawa on 2021-03-26 10:03:19 UTC