Why do roscpp publishers and subscribers take so much longer to connect than their rospy counterparts?

asked 2020-12-21 03:42:09 -0600

grouchy gravatar image

updated 2020-12-21 03:46:26 -0600

When a roscpp subscriber is created on a topic which has existing lively or latched publishers already publishing to it, it takes ~200ms before messages start to be received. Compare this to a rospy subscriber which takes ~10ms before messages start to be received. Note: These timings were obtained with both ends running in separate processes but on the same machine.

This might not seem like much, but it takes the delay from virtually imperceptible to somewhat noticeable, and thus prohibits subscribers from being created as a result of a user action, such as the pressing of a button, or in the middle of a time critical sequence. The canonical solution seems to be to create subscribers well before they are ever needed which is what I ended up doing, but this increases my program's complexity and introduces data races which weren't there before. What's more, it seems roscpp services can connect to each other typically in less than 1ms, so this delay seems artificial. Also, I noticed that if a node already has another subscriber to the same topic, the delay is reduced into the realms of imperceivability, so "pre-subscribing" to topics may be a valid option.

I've created a Github repo for the testing I've performed: https://github.com/stevegolton/ros_su... The results can be seen below running on a relatively modern Intel machine.

roscpp

Starting tests...
Average time spent in service_client.waitForExistence = 0.000481 seconds
Average time spent in topic::waitForMessage = 0.198579 seconds
Average time spent in topic::waitForMessage (presub) = 0.000958 seconds
Average time spent in action_client.waitForServer = 0.240173 seconds
Average time spent in action_client.waitForServer (presub) = 0.102728 seconds
Average time spent in action_client.waitForServer (prewait) = 0.000002 seconds
Average time spent waiting for TF = 0.187526 seconds
Average time spent waiting for TF (presub) = 0.007806 seconds
Testing complete

rospy

Allowing ample time for the server to come up...
Starting tests...
Average time spent in rospy.wait_for_message = 0.012007 seconds
Average time spent in action_client.wait_for_server = 0.002346 seconds
Average time spent waiting for TF = 0.010064 seconds
Testing complete

Digging into ros_comm a little, it seems the problem comes down to this block of code in xmlrpc_manager.cpp:line 266.

// Update the XMLRPC server, blocking for at most 100ms in select()
{
  boost::mutex::scoped_lock lock(functions_mutex_);
  server_.work(0.1);
}

Reducing the timeout passed to server_.work() to e.g. 0.01 has a corresponding reduction in the magnitude of the message delay tim, which appears to solve my problem. However, I found this made other parts of the system unstable when I tried it before, which makes this solution feel very much like a hack.

roscpp + patched ros_comm 0.01ms block time

Starting tests...
Average time spent in service_client.waitForExistence = 0.000425 seconds
Average time spent in topic::waitForMessage = 0.020430 seconds
Average time spent in topic::waitForMessage (presub) = 0.000939 seconds
Average time spent in action_client.waitForServer = 0.084940 seconds
Average time spent in ...
(more)
edit retag flag offensive close merge delete