ROS Resources: Documentation | Support | Discussion Forum | Index | Service Status | ros @ Robotics Stack Exchange
Ask Your Question
0

Weird increasing delay in rosbridge despite queue_size=1 and buff_size=2**24

asked 2019-10-08 10:03:36 -0500

Hendrik Wiese gravatar image

updated 2019-10-11 03:59:52 -0500

Hi everyone,

I've got a very weird problem here. I have a topic published on at around 50hz. On the other end there's a subscriber. The callback of the subscriber is a relatively long running process, slower than those 50hz.

That subscriber is initialized with queue_size=1 and buff_size=2**24. That way I want to make sure, messages come in as fast as possible and if there's currently one message processed, further messages get dropped. That's the desired behavior. I don't need all messages, but I need them fast.

When I leave queue_size=1 and buff_size=2**24 out, I see an increasing delay when I take a look at the age of the messages processed (now - header.stamp).

queue_size=1 and buff_size=2**24 work when I use rospy directly. However, it does not with rosbridge. I have already changed rosbridge in a way that the actual rospy.Subscriber is created in the exact same way as descrbed above, including queue_size=1 and buff_size=2**24. it does not help. I still see the increasing delay.

Where do I have to look to make sure I don't have that delay when using rosbridge?

Thanks

Update 1

Here's a code example with which I can reproduce the problem. I'm just not sure if it's caused by roslibpy or by rosbridge. Maybe someone with a little spare time could help me investigate this behavior. Or tell me that they cannot reproduce it and that I'm the only one seeing this behavior. Or that I'm doing something inherently wrong.

import roslibpy
import time
import json
import logging
logger = logging.getLogger("roslibpy")
handler = logging.StreamHandler()
logger.addHandler(handler)

def to_epoch(stamp):
    stamp_secs = stamp["secs"]
    stamp_nsecs = stamp["nsecs"]
    return stamp_secs + stamp_nsecs*1e-9

def from_epoch(stamp):
    stamp_secs = int(stamp)
    stamp_nsecs = (stamp - stamp_secs) * 1e9
    return {"secs": stamp_secs, "nsecs": stamp_nsecs}


def sub(msg):
    # The `age` of the message should remain constant, but it increases seemingly indef.
    age = time.time() - to_epoch(msg["stamp"])
    print("Age of message: %6.3f seconds" % age)
    time.sleep(1./5.)


ros = roslibpy.Ros("ws://localhost:9090")
ros.run()

tt = roslibpy.Topic(ros, "/test", "std_msgs/Header", queue_size=1)

ts = roslibpy.Topic(ros, "/test", "std_msgs/Header", queue_length=1)
ts.subscribe(sub)

try:
    seq = 0
    while True:
        seq += 1
        header = {"frame_id": "test", "stamp": from_epoch(time.time()), "seq": seq}
        tt.publish(roslibpy.Message(header))
except KeyboardInterrupt:
    ros.terminate()

Update 2

Three things that I've already ruled out:

  1. in rosbridge_library subscribers.py setting rospy.Subscriber(..., queue_size = 1, ...) has no effect
  2. in rosbridge_library subscribers.py setting rospy.Subscriber(..., buff_size = 2**24, ...) has no effect
  3. in rosbridge_server rosbridge_websocket.py setting listenWS(..., backlog = 1, ...) or backlog = 0 has no effect

Update 3

I've now tested rosbridge_server/rosbridge_udp. It shows the same behavior! So it is not related to websockets.

Update 4

As suggested below I've added a sleep to the send loop. It has no effect on my side.

import roslibpy
import time
import json
import ...
(more)
edit retag flag offensive close merge delete

Comments

Possible you have buffering not related to ROS, like in a network switch or card. Is data through rosbridge coming through network? I've had more than one issue where images get backed up with frames 10 - 15 seconds old coming appearing, related to location of the image source in the house. Issue only occurs in wifi dead zones, and when coming back in range images from long ago will show up, and not always in the order you would expect.

So while ROS be properly handling the queue = 1, your HW may have other ideas.

billy gravatar image billy  ( 2019-10-09 01:23:36 -0500 )edit

Cannot rule that out, but the sources are all on the same system, so everything runs via loopback. I've even throttled the topic to 1hz and still the age of the messages keeps increasing at the very same rate, which is exceptionally weird.

It only happens when I use rosbridge_suite in conjunction with roslibpy. If it also happens when I use roslibjs I'll have to test... rospy works fine. I'm just refraining from using rospy because my code has hard python 3 dependencies and rospy appears to be incompatible with python 3, which is a real shame.

I'll soon add a small test script that you might be willing to run in order to verify the behavior.

Hendrik Wiese gravatar image Hendrik Wiese  ( 2019-10-09 03:15:45 -0500 )edit

I just wonder how can you connect to udp rosbridge. thanks.

huyhoangk50 gravatar image huyhoangk50  ( 2020-03-30 02:06:17 -0500 )edit

Any solution to this? Having the same problem

Bender_From_Futurama gravatar image Bender_From_Futurama  ( 2021-09-08 05:39:10 -0500 )edit

Much to my regret, no. I've not been on the project anymore for almost 2 years. I've never tried to solve it. If I find the time, I'll go check it out again in my spare time, however, that'd take a while. If you find a solution, please, post it here!

Hendrik Wiese gravatar image Hendrik Wiese  ( 2021-09-08 05:49:13 -0500 )edit

1 Answer

Sort by ยป oldest newest most voted
0

answered 2019-10-10 02:33:11 -0500

billy gravatar image

updated 2019-10-10 02:35:24 -0500

So I made a change and got your code to work on my machine. Initially got the same results as you.

Remove time.sleep(1./5.) from the callback and add time.sleep(0.01) to the while true loop.

Perhaps the main needs to give some time for the callbacks to function.

#!/usr/bin/python
import roslibpy
import time
import json
import logging
logger = logging.getLogger("roslibpy")
handler = logging.StreamHandler()
logger.addHandler(handler)

def to_epoch(stamp):
    stamp_secs = stamp["secs"]
    stamp_nsecs = stamp["nsecs"]
    return stamp_secs + (stamp_nsecs*1e-9)

def from_epoch(stamp):
    stamp_secs = int(stamp)
    stamp_nsecs = (stamp - stamp_secs) * 1e9
    return {"secs": stamp_secs, "nsecs": stamp_nsecs}


def sub(msg):
    # The `age` of the message should remain constant, but it increases seemingly indefinetly
    age = time.time() - to_epoch(msg["stamp"])
    print("Age of message: %6.3f seconds" % age)
 #   time.sleep(1./5.)


ros = roslibpy.Ros("ws://localhost:9090")
ros.run()

tt = roslibpy.Topic(ros, "/test", "std_msgs/Header", queue_size=1)

ts = roslibpy.Topic(ros, "/test", "std_msgs/Header", queue_length=1)
ts.subscribe(sub)

try:
    seq = 0
    while True:
        seq += 1
        header = {"frame_id": "test", "stamp": from_epoch(time.time()), "seq": seq}
        tt.publish(roslibpy.Message(header))
        time.sleep(0.01)
except KeyboardInterrupt:
    ros.terminate()


Age of message:  0.003 seconds
Age of message:  0.002 seconds
Age of message:  0.003 seconds
Age of message:  0.002 seconds
Age of message:  0.002 seconds
Age of message:  0.003 seconds
Age of message:  0.001 seconds
Age of message:  0.001 seconds
Age of message:  0.003 seconds
Age of message:  0.002 seconds
Age of message:  0.002 seconds
edit flag offensive delete link more

Comments

Thanks for your effort but the script that I gave was only a minimal working example to reproduce the actual problem. I don't have a time.sleep in my actual code that I could remove. The callback (or rather the process with the callback) of my code just is slow enough to cause this behavior. I can't do much about that fact. I need to find the actual cause, the very root of the problem, and get a fix for it. Or I need to know that this is expected behavior (which it can't be because rospy alone, without rosbridge in between, does not behave like this if I set the queue_sizes accordingly)

Hendrik Wiese gravatar image Hendrik Wiese  ( 2019-10-10 04:52:09 -0500 )edit

But it's good to know that I'm not alone and that you can reproduce the behavior

Hendrik Wiese gravatar image Hendrik Wiese  ( 2019-10-10 04:53:01 -0500 )edit

Removing the sleep from the callback didn't fix it for me. It was adding the sleep to the while true loop that fixed it. Have you tried that yet?

billy gravatar image billy  ( 2019-10-10 18:15:37 -0500 )edit

Yes, thanks. Unfortunately, it has no effect.

Hendrik Wiese gravatar image Hendrik Wiese  ( 2019-10-11 03:57:49 -0500 )edit

Question Tools

2 followers

Stats

Asked: 2019-10-08 10:03:36 -0500

Seen: 1,543 times

Last updated: Oct 11 '19