Message Size-dependent Queueing/Latency
ROS Indigo (and/or Lunar), Ubuntu 14.04 / Docker (ros-indigo-base, ros-lunar-base)
I've noticed some delays up to 1/(publish_frequency) for medium sized messages. I've reproduced the behavior just using rostopic below:
rostopic pub -r 50 /test std_msgs/String "data: '`printf "%0.s-" {1..100}`'"
rostopic hz -w 50 /test
average rate: 49.983
min: 0.018s max: 0.022s std dev: 0.00043s window: 50
Pretty normal, what you might expect. Now how about 1000 bytes of hyphens:
rostopic pub -r 50 /test std_msgs/String "data: '`printf "%0.s-" {1..1000}`'"
average rate: 50.033
min: 0.019s max: 0.020s std dev: 0.00018s window: 50
average rate: 49.994
min: 0.019s max: 0.021s std dev: 0.00019s window: 50
average rate: 49.990
min: 0.019s max: 0.021s std dev: 0.00021s window: 50
average rate: 49.973
min: 0.019s max: 0.021s std dev: 0.00021s window: 50
average rate: 49.027
min: 0.000s max: 0.040s std dev: 0.01368s window: 50
average rate: 51.041
min: 0.000s max: 0.041s std dev: 0.01996s window: 50
average rate: 51.035
min: 0.000s max: 0.044s std dev: 0.01963s window: 50
Notice that after about 3 seconds, we see a min of 0 seconds and a max of .04 seconds.
Now for a weirdness, this doesn't seem to happen with big messages. 100,000 bytes of hyphens:
rostopic pub -r 50 /test std_msgs/String "data: '`printf "%0.s-" {1..100000}`'"
average rate: 50.037
min: 0.019s max: 0.020s std dev: 0.00020s window: 50
average rate: 49.994
min: 0.019s max: 0.021s std dev: 0.00027s window: 50
average rate: 49.979
min: 0.019s max: 0.021s std dev: 0.00020s window: 50
average rate: 50.020
min: 0.018s max: 0.022s std dev: 0.00042s window: 50
average rate: 49.995
min: 0.019s max: 0.021s std dev: 0.00020s window: 50
average rate: 50.007
min: 0.019s max: 0.021s std dev: 0.00017s window: 50
So, no problem for large messages! At 50 Hz, the size of message that seems to do this is ~600 bytes - 65,000 bytes. Unfortunately for my use case, I can't just make all my messages bigger (or smaller). Introspecting in the specific use case, it seems like queueing in that there will be a miss and then 2 messages the next expected cycle. However, after it starts, this continues to happen with regularity, unless the messages are bigger or smaller. Any idea why this is happening?
I've reproduced in roscpp and rospy, with and without tcdnodelay, zero queues, and sychonous publishing (queu_size=None). It always exhibits this same behavior. I've also tried this in the indigo-ros-core docker container as well as the lunar-ros-core docker container. Lastly, I've also adjusted process priority, which also has no effect. Obviously, this is all on one machine as ...
More info: I've found the EXACT byte range that will cause this error (at least with string messages) Strings lengths between 516 bytes and 65,474 bytes will cause this issue. Lower or higher works! I think this is some hard-coded buffer issue internal to ros_comm.
Also, the problem only exhibits for rates above 25 Hz.
To be clear, this does not seem to be a throughput issue, as the same byte range has caused the same issue for publish rates of 50, 100, and 200 Hz.
If you have a self-contained MWE then this might be something to report over at ros/ros_comm/issues.
Note btw that the ROS middleware gives no guarantees, nor does it make any claims, about message throughput, max/min rates or other types of QoS. I would imagine you'd be in a better place with ROS 2 for those things.
Nevertheless, this is strange and would be good to investigate.
Posted this to ros_comm issues as per gvdhoorn's suggestion:
https://github.com/ros/ros_comm/issue...