Weird increasing delay in rosbridge despite queue_size=1 and buff_size=2**24
Hi everyone,
I've got a very weird problem here. I have a topic published on at around 50hz. On the other end there's a subscriber. The callback of the subscriber is a relatively long running process, slower than those 50hz.
That subscriber is initialized with queue_size=1
and buff_size=2**24
. That way I want to make sure, messages come in as fast as possible and if there's currently one message processed, further messages get dropped. That's the desired behavior. I don't need all messages, but I need them fast.
When I leave queue_size=1
and buff_size=2**24
out, I see an increasing delay when I take a look at the age of the messages processed (now - header.stamp).
queue_size=1
and buff_size=2**24
work when I use rospy
directly. However, it does not with rosbridge
. I have already changed rosbridge
in a way that the actual rospy.Subscriber
is created in the exact same way as descrbed above, including queue_size=1
and buff_size=2**24
. it does not help. I still see the increasing delay.
Where do I have to look to make sure I don't have that delay when using rosbridge
?
Thanks
Update 1
Here's a code example with which I can reproduce the problem. I'm just not sure if it's caused by roslibpy
or by rosbridge
. Maybe someone with a little spare time could help me investigate this behavior. Or tell me that they cannot reproduce it and that I'm the only one seeing this behavior. Or that I'm doing something inherently wrong.
import roslibpy
import time
import json
import logging
logger = logging.getLogger("roslibpy")
handler = logging.StreamHandler()
logger.addHandler(handler)
def to_epoch(stamp):
stamp_secs = stamp["secs"]
stamp_nsecs = stamp["nsecs"]
return stamp_secs + stamp_nsecs*1e-9
def from_epoch(stamp):
stamp_secs = int(stamp)
stamp_nsecs = (stamp - stamp_secs) * 1e9
return {"secs": stamp_secs, "nsecs": stamp_nsecs}
def sub(msg):
# The `age` of the message should remain constant, but it increases seemingly indef.
age = time.time() - to_epoch(msg["stamp"])
print("Age of message: %6.3f seconds" % age)
time.sleep(1./5.)
ros = roslibpy.Ros("ws://localhost:9090")
ros.run()
tt = roslibpy.Topic(ros, "/test", "std_msgs/Header", queue_size=1)
ts = roslibpy.Topic(ros, "/test", "std_msgs/Header", queue_length=1)
ts.subscribe(sub)
try:
seq = 0
while True:
seq += 1
header = {"frame_id": "test", "stamp": from_epoch(time.time()), "seq": seq}
tt.publish(roslibpy.Message(header))
except KeyboardInterrupt:
ros.terminate()
Update 2
Three things that I've already ruled out:
- in
rosbridge_library subscribers.py
settingrospy.Subscriber(..., queue_size = 1, ...)
has no effect - in
rosbridge_library subscribers.py
settingrospy.Subscriber(..., buff_size = 2**24, ...)
has no effect - in
rosbridge_server rosbridge_websocket.py
settinglistenWS(..., backlog = 1, ...)
orbacklog = 0
has no effect
Update 3
I've now tested rosbridge_server/rosbridge_udp
. It shows the same behavior! So it is not related to websockets.
Update 4
As suggested below I've added a sleep to the send loop. It has no effect on my side.
import roslibpy
import time
import json
import logging
logger = logging.getLogger("roslibpy")
handler = logging.StreamHandler()
logger.addHandler(handler)
def to_epoch(stamp):
stamp_secs = stamp["secs"]
stamp_nsecs = stamp["nsecs"]
return stamp_secs + stamp_nsecs*1e-9
def from_epoch(stamp):
stamp_secs = int(stamp)
stamp_nsecs = (stamp - stamp_secs) * 1e9
return {"secs": stamp_secs, "nsecs": stamp_nsecs}
def sub(msg):
# The `age` of the message should remain constant, but it increases seemingly indef.
age = time.time() - to_epoch(msg["stamp"])
print("Age of message: %6.3f seconds" % age)
time.sleep(1./5.)
print("Trying to connect to ros bridge")
ros = roslibpy.Ros("ws://localhost:9090")
ros.run()
print("Connected to ros bridge")
tt = roslibpy.Topic(ros, "/test", "std_msgs/Header", queue_size=1)
ts = roslibpy.Topic(ros, "/test", "std_msgs/Header", queue_length=1)
ts.subscribe(sub)
try:
seq = 0
print("Fire!")
while True:
seq += 1
header = {"frame_id": "test", "stamp": from_epoch(time.time()), "seq": seq}
tt.publish(roslibpy.Message(header))
time.sleep(0.01)
except KeyboardInterrupt:
ros.terminate()
Trying to connect to ros bridge
Connected to ros bridge
Fire!
Age of message: 0.180 seconds
Age of message: 0.317 seconds
Age of message: 0.486 seconds
Age of message: 0.676 seconds
Age of message: 0.845 seconds
Age of message: 1.035 seconds
Age of message: 1.226 seconds
Age of message: 1.416 seconds
Age of message: 1.848 seconds
Age of message: 2.039 seconds
Age of message: 2.208 seconds
Age of message: 2.366 seconds
Age of message: 2.557 seconds
Age of message: 2.747 seconds
Age of message: 2.927 seconds
The behavior persists.
Asked by Hendrik Wiese on 2019-10-08 10:03:36 UTC
Answers
So I made a change and got your code to work on my machine. Initially got the same results as you.
Remove time.sleep(1./5.)
from the callback and add time.sleep(0.01)
to the while true loop.
Perhaps the main needs to give some time for the callbacks to function.
#!/usr/bin/python
import roslibpy
import time
import json
import logging
logger = logging.getLogger("roslibpy")
handler = logging.StreamHandler()
logger.addHandler(handler)
def to_epoch(stamp):
stamp_secs = stamp["secs"]
stamp_nsecs = stamp["nsecs"]
return stamp_secs + (stamp_nsecs*1e-9)
def from_epoch(stamp):
stamp_secs = int(stamp)
stamp_nsecs = (stamp - stamp_secs) * 1e9
return {"secs": stamp_secs, "nsecs": stamp_nsecs}
def sub(msg):
# The `age` of the message should remain constant, but it increases seemingly indefinetly
age = time.time() - to_epoch(msg["stamp"])
print("Age of message: %6.3f seconds" % age)
# time.sleep(1./5.)
ros = roslibpy.Ros("ws://localhost:9090")
ros.run()
tt = roslibpy.Topic(ros, "/test", "std_msgs/Header", queue_size=1)
ts = roslibpy.Topic(ros, "/test", "std_msgs/Header", queue_length=1)
ts.subscribe(sub)
try:
seq = 0
while True:
seq += 1
header = {"frame_id": "test", "stamp": from_epoch(time.time()), "seq": seq}
tt.publish(roslibpy.Message(header))
time.sleep(0.01)
except KeyboardInterrupt:
ros.terminate()
Age of message: 0.003 seconds
Age of message: 0.002 seconds
Age of message: 0.003 seconds
Age of message: 0.002 seconds
Age of message: 0.002 seconds
Age of message: 0.003 seconds
Age of message: 0.001 seconds
Age of message: 0.001 seconds
Age of message: 0.003 seconds
Age of message: 0.002 seconds
Age of message: 0.002 seconds
Asked by billy on 2019-10-10 02:33:11 UTC
Comments
Thanks for your effort but the script that I gave was only a minimal working example to reproduce the actual problem. I don't have a time.sleep
in my actual code that I could remove. The callback (or rather the process with the callback) of my code just is slow enough to cause this behavior. I can't do much about that fact. I need to find the actual cause, the very root of the problem, and get a fix for it. Or I need to know that this is expected behavior (which it can't be because rospy
alone, without rosbridge
in between, does not behave like this if I set the queue_sizes accordingly)
Asked by Hendrik Wiese on 2019-10-10 04:52:09 UTC
But it's good to know that I'm not alone and that you can reproduce the behavior
Asked by Hendrik Wiese on 2019-10-10 04:53:01 UTC
Removing the sleep from the callback didn't fix it for me. It was adding the sleep to the while true
loop that fixed it. Have you tried that yet?
Asked by billy on 2019-10-10 18:15:37 UTC
Yes, thanks. Unfortunately, it has no effect.
Asked by Hendrik Wiese on 2019-10-11 03:57:49 UTC
Comments
Possible you have buffering not related to ROS, like in a network switch or card. Is data through rosbridge coming through network? I've had more than one issue where images get backed up with frames 10 - 15 seconds old coming appearing, related to location of the image source in the house. Issue only occurs in wifi dead zones, and when coming back in range images from long ago will show up, and not always in the order you would expect.
So while ROS be properly handling the queue = 1, your HW may have other ideas.
Asked by billy on 2019-10-09 01:23:36 UTC
Cannot rule that out, but the sources are all on the same system, so everything runs via loopback. I've even throttled the topic to 1hz and still the age of the messages keeps increasing at the very same rate, which is exceptionally weird.
It only happens when I use
rosbridge_suite
in conjunction withroslibpy
. If it also happens when I useroslibjs
I'll have to test...rospy
works fine. I'm just refraining from using rospy because my code has hardpython 3
dependencies androspy
appears to be incompatible withpython 3
, which is a real shame.I'll soon add a small test script that you might be willing to run in order to verify the behavior.
Asked by Hendrik Wiese on 2019-10-09 03:15:45 UTC
I just wonder how can you connect to udp rosbridge. thanks.
Asked by huyhoangk50 on 2020-03-30 02:06:17 UTC
Any solution to this? Having the same problem
Asked by Bender_From_Futurama on 2021-09-08 05:39:10 UTC
Much to my regret, no. I've not been on the project anymore for almost 2 years. I've never tried to solve it. If I find the time, I'll go check it out again in my spare time, however, that'd take a while. If you find a solution, please, post it here!
Asked by Hendrik Wiese on 2021-09-08 05:49:13 UTC