rosout logging + topic subscription: not thread safe
I believe I have found a threading issue with ros logging.
If a node subscribes to a topic in one thread and logs to rosout in another, a conflict can occur, since the nodes topic list may be altered while the rosout logging traverses it.
The following error is reported:
Unable to report rosout: Set changed size during iteration
And what is worse is that tcpros protocol synchronization is lost:
[ERROR] [1520594758.451498758] [/rosout]: a message of over a gigabyte was predicted in tcpros. that seems highly unlikely, so I'll assume protocol synchronization is lost.
[ERROR] [1520594758.451553409] [/rosout]: Exception thrown when deserializing message of length [0] from [/rosout_test]: Buffer Overrun
[ERROR] [1520594758.451622886] [/rosout]: Exception thrown when deserializing message of length [39] from [/rosout_test]: Buffer Overrun
Are there any locks or similar in place, so I can safely subscribe/unsubscribe to topics?
This python node reproduces the problem:
#!/usr/bin/env python
import rospy
from std_msgs.msg import Float64
from threading import Thread
def topic_callback(msg):
pass
def my_topic_thread():
"""endless subscribing and unsubscribing to a list of topics"""
topic_names = ["/some/topic{}".format(i) for i in range(0, 100)]
while not rospy.is_shutdown():
rospy.loginfo("subscribing")
subs = []
for topic in topic_names:
subs.append(rospy.Subscriber(topic, Float64, topic_callback))
rospy.sleep(0.001)
rospy.loginfo("unsubscribing")
for sub in subs:
sub.unregister()
if __name__ == '__main__':
rospy.init_node('demo')
topic_thread = Thread(target=my_topic_thread)
topic_thread.start()
while not rospy.is_shutdown():
# This logging fails intermittently, because the nodes
# topic list is changed during logging
# Error: protocol synchronization is lost
rospy.loginfo("dummy message")
rospy.sleep(0.001)
topic_thread.join()
kinetic, ubuntu 16.04
This is not an answer, but a recommendation: ROS Answers might not be the best place to report this sort of problem. If you have an MWE and believe it is a bug, it might be more efficient to post to ros/ros_comm/issues directly. If you do, please link back.
Thanks for the input. Now reported here