rosmaster unresponsive
Hi guys,
I have a somewhat weird problem with rosmaster. In certain situations rosmaster becomes completely unresponsive and reports 100% CPU usage. This occurs in ROS Electric under both Ubuntu Natty (2.6.38-13) and Oneiric (3.0.0-14), 64-bit.
An executable p (written in C++) subscribes and publishes to a single topic t and sends a message every 100 ms via UDPROS. After about 10 seconds the executable calls ros::shutdown() and terminates.
Executables are started in the following pattern:
while(true) {
  for(i: 1..10) {
    for(j: 1..10) {
      Start p i-times in parallel.
      Wait for all processes to terminate
      Sleep(0.5s)
    }
  }
}
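For anyone who wants to reproduce this, the launch pattern above can be sketched in Python roughly as follows. This is only a minimal sketch, not my actual launcher: the function names `run_batch` and `stress` and their parameters are made up for illustration, and `cmd` stands for whatever command line starts p.

```python
import subprocess
import sys
import time


def run_batch(cmd, copies):
    """Start `copies` instances of `cmd` in parallel and wait for all of them.

    Returns the list of exit codes, in start order.
    """
    procs = [subprocess.Popen(cmd) for _ in range(copies)]
    return [proc.wait() for proc in procs]


def stress(cmd, outer_rounds, max_parallel=10, repeats=10, pause_s=0.5):
    """Run the nested launch pattern and return the number of batches started.

    `outer_rounds` bounds the outer loop (the pattern above uses while(true)).
    """
    batches = 0
    for _ in range(outer_rounds):
        for i in range(1, max_parallel + 1):
            for _j in range(repeats):
                run_batch(cmd, i)    # start p i times in parallel, wait for all
                time.sleep(pause_s)  # Sleep(0.5s) in the pattern above
                batches += 1
    return batches
```

With a hypothetical package name, a call such as `stress(["rosrun", "my_pkg", "p"], outer_rounds=5)` would correspond to five passes through the outer loop, i.e. 500 runs through the innermost loop.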
After about 300 runs through the innermost loop, rosmaster becomes unresponsive without any error message. Tools such as roswtf or rostopic fail to communicate with it and do not report any error. In addition, the CPU usage of rosmaster climbs to 100% in top. This happens whether or not the nodes initialise ROS with the AnonymousName init option.
Is there any suggested way to diagnose this problem? Is there an upper limit on the number of processes/nodes a roscore can manage?
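One thing that might help narrow this down is probing the master directly over XML-RPC with a timeout, instead of relying on roswtf or rostopic. The following is a minimal sketch under a few assumptions: the function name `master_is_responsive` and the caller id `/master_probe` are made up, the master URI would normally come from ROS_MASTER_URI, and the only master call used is `getUri(caller_id)` from the Master API.

```python
import socket

try:
    import xmlrpc.client as xmlrpclib  # Python 3
except ImportError:
    import xmlrpclib  # Python 2


def master_is_responsive(master_uri, timeout_s=2.0):
    """Return True if the master answers getUri() within `timeout_s` seconds."""
    previous = socket.getdefaulttimeout()
    # The XML-RPC client opens a new socket per request, so the process-wide
    # default timeout applies to it; it is restored afterwards.
    socket.setdefaulttimeout(timeout_s)
    try:
        proxy = xmlrpclib.ServerProxy(master_uri)
        # Master API calls return (code, statusMessage, value); code 1 is success.
        code, _status, _uri = proxy.getUri("/master_probe")
        return code == 1
    except Exception:
        # Connection refused, timeout, or malformed reply: treat as unresponsive.
        return False
    finally:
        socket.setdefaulttimeout(previous)
```

Calling this once per batch, e.g. `master_is_responsive("http://localhost:11311")`, would show at which point the master stops answering, which could then be matched against the timestamps in master.log.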
Update:
The master.log contains many errors similar to the following:
[rosmaster.threadpool][ERROR] 2012-02-03 12:28:02,770: Traceback (most recent call last):
  File "/opt/ros/electric/stacks/ros_comm/tools/rosmaster/src/rosmaster/threadpool.py", line 218, in run
    result = cmd(*args)
  File "/opt/ros/electric/stacks/ros_comm/tools/rosmaster/src/rosmaster/master_api.py", line 189, in publisher_update_task
    xmlrpcapi(api).publisherUpdate('/master', topic, pub_uris)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1224, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1575, in __request
    verbose=self.__verbose
  File "/usr/lib/python2.7/xmlrpclib.py", line 1264, in request
    return self.single_request(host, handler, request_body, verbose)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1297, in single_request
    return self.parse_response(response)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1473, in parse_response
    return u.close()
  File "/usr/lib/python2.7/xmlrpclib.py", line 793, in close
    raise Fault(**self._stack[0])
Fault: <Fault -1: 'publisherUpdate: unknown method name'>
The log ends with the following error:
[rosmaster.threadpool][ERROR] 2012-02-03 12:28:38,945: Traceback (most recent call last):
  File "/opt/ros/electric/stacks/ros_comm/tools/rosmaster/src/rosmaster/threadpool.py", line 218, in run
    result = cmd(*args)
  File "/opt/ros/electric/stacks/ros_comm/tools/rosmaster/src/rosmaster/master_api.py", line 189, in publisher_update_task
    xmlrpcapi(api).publisherUpdate('/master', topic, pub_uris)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1224, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1575, in __request
    verbose=self.__verbose
  File "/usr/lib/python2.7/xmlrpclib.py", line 1264, in request
    return self.single_request(host, handler, request_body, verbose)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1292, in single_request
    self.send_content(h, request_body)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1439, in send_content
    connection.endheaders(request_body)
  File "/usr/lib/python2.7/httplib.py", line 951, in endheaders
    self ...