2

Huge delay in calling disconnect callback of an inactive publisher

asked 2012-01-17 20:05:39 -0600

Stephan

updated 2012-01-17 23:07:06 -0600

Consider the following example:

#include <ros/ros.h>
#include <std_msgs/String.h>

void connectCallback(const ros::SingleSubscriberPublisher&)
{
  ROS_INFO("connectCallback");
}

void disconnectCallback(const ros::SingleSubscriberPublisher&)
{
  ROS_INFO("disconnectCallback");
}

int main(int argc, char **argv)
{
  ros::init(argc, argv, "test_node");
  ros::NodeHandle nh;
  ros::Publisher pub = nh.advertise<std_msgs::String>("talk", 1, &connectCallback, &disconnectCallback);
  ros::Rate r(1);
  while (nh.ok())
  {
    // getNumSubscribers() returns uint32_t, so %u is the matching format specifier.
    ROS_INFO("Number of subscribers: %u", pub.getNumSubscribers());
    ros::spinOnce();
    r.sleep();
  }
  return 0;
}

When subscribing using rostopic echo /talk, connectCallback is called immediately. However, when shutting down the rostopic process, the reported number of subscribers is still 1 until disconnectCallback gets called about one minute later. Why? Is there any way to shorten this timeout? When the publisher is active (i.e. publishing something in the loop), disconnectCallback is called immediately.

EDIT

Here is another subscriber (different from rostopic) that I used for testing with the same result:

#include <iostream>
#include <ros/ros.h>
#include <std_msgs/String.h>

void callback(const std_msgs::StringConstPtr& msg)
{
  std::cout << msg->data << std::endl;
}

int main(int argc, char **argv)
{
  ros::init(argc, argv, "subscriber");
  ros::NodeHandle nh;
  ros::Subscriber sub = nh.subscribe("talk", 1, &callback);

  ros::Time start_time = ros::Time::now();
  while (ros::ok() && (ros::Time::now() - start_time).toSec() < 10)
  {
    ros::spinOnce();
  }
  return 0;
}


Comments

One guess would be that shutting down rostopic doesn't properly close the connection (although it should), so the node only realizes it's disconnected when you try to publish.
dornhege (2012-01-17 20:21:29 -0600)
I just tried using a self-made subscriber, same result.
Stephan (2012-01-17 20:40:31 -0600)
I would guess that the rostopic process does not shut down correctly and that, after a minute, your node notices that the socket is invalid. I would write a simple roscpp node that subscribes, stays connected for a while and then shuts down. If that works, you know it's rostopic's fault.
Lorenz (2012-01-17 22:46:08 -0600)
@Lorenz: I tried that already, I'll edit the question to clarify this.
Stephan (2012-01-17 23:04:05 -0600)

2 Answers

3

answered 2012-01-18 08:30:14 -0600

Patrick Mihelich

updated 2012-01-18 08:33:05 -0600

This is ticketed at https://code.ros.org/trac/ros/ticket/3403. There I wrote:

I've noticed this issue with unsubscribe callbacks too. However, it's not true that they're never called; they do get called exactly 60s after you subscribe.

In practice this shouldn't normally be an issue. In each nodelet we use connect callbacks to control whether it hooks into a data source (camera or ROS topic). As long as it's getting data, it publishes some processed output. So if all subscribers to the output unsubscribe, the next call to publish() triggers the disconnect callback, which unhooks from the data source. So as long as you're streaming data, disconnects will be timely.

It's still annoying and can cause odd behavior in practice - e.g. if the register nodelet doesn't get the tf data it needs, it can't publish(), so it takes the full 60s to disconnect. So I'd like to see this fixed, or at least know why it happens.
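To make the lazy pattern described above concrete, here is a rough, ROS-free sketch. The names LazyRelay, hook_input and unhook_input are made up for illustration; in a real node, onConnect() and onDisconnect() would presumably be called from the connect and disconnect callbacks passed to advertise(). It is only a model of the bookkeeping, not how any particular nodelet is implemented.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical model of a lazy node: it hooks into its input only while
// at least one subscriber is connected to its output.
class LazyRelay
{
public:
  LazyRelay(std::function<void()> hook_input, std::function<void()> unhook_input)
    : hook_input_(std::move(hook_input)), unhook_input_(std::move(unhook_input)) {}

  // Intended to be called from the publisher's connect callback.
  void onConnect()
  {
    if (subscribers_++ == 0)
      hook_input_();  // first subscriber: start consuming the input
  }

  // Intended to be called from the publisher's disconnect callback.
  void onDisconnect()
  {
    assert(subscribers_ > 0);
    if (--subscribers_ == 0)
      unhook_input_();  // last subscriber gone: stop consuming the input
  }

  bool active() const { return subscribers_ > 0; }

private:
  std::function<void()> hook_input_;
  std::function<void()> unhook_input_;
  unsigned subscribers_ = 0;
};
```

With this bookkeeping, the input stays hooked until onDisconnect() runs, so a disconnect callback that arrives 60s late means 60s of unnecessary input processing.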

Generally this hasn't been an issue for us - if we need the disconnect callback called immediately, it's because we're shutting down a publisher that's continuously spewing data. Do you have a use case where you require a timely disconnect callback for an inactive publisher?
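One plausible reading of why publish() makes the difference (not confirmed by the ticket, and not taken from the roscpp sources): a process that neither reads, writes nor polls a socket is not told that the peer went away. The Linux-only sketch below illustrates this with plain sockets. The helper names peerHasClosed and writeFails are made up for this example.

```cpp
#include <cassert>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>  // close(), for callers that own the descriptors

// True if the kernel already knows that the peer of `fd` has closed.
// This asks the kernel explicitly; an idle process that never touches
// the socket does not receive this information on its own.
bool peerHasClosed(int fd)
{
  assert(fd >= 0);
  pollfd p{};
  p.fd = fd;
  p.events = POLLIN;
  if (poll(&p, 1, 0) <= 0)  // zero timeout: just sample the current state
    return false;
  if (p.revents & (POLLHUP | POLLERR))
    return true;
  char byte;
  // Readable with zero bytes available means end-of-stream.
  return (p.revents & POLLIN) && recv(fd, &byte, 1, MSG_PEEK | MSG_DONTWAIT) == 0;
}

// True if a write attempt fails, which is how an active sender would
// notice a closed peer. MSG_NOSIGNAL suppresses SIGPIPE.
bool writeFails(int fd)
{
  assert(fd >= 0);
  const char byte = 'x';
  return send(fd, &byte, 1, MSG_NOSIGNAL) < 0;
}
```

To try it, create a connected pair with socketpair(AF_UNIX, SOCK_STREAM, 0, fds), close one end, and call both helpers on the other end. A UNIX socket pair is only a stand-in here; a real TCP connection to a remote subscriber can behave differently in the details.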


Comments

Thanks Patrick, I'll mark your answer as accepted, as the solution will come up in ticket 3403. Use case: a lazy node that subscribes to its inputs only if its output is needed, and publishes depending on the input data (e.g. when an object is found). So if the object disappears, publish() is not called. This is similar to your tf example.
Stephan (2012-01-18 18:36:33 -0600)
2

answered 2012-01-18 19:00:27 -0600

Adolfo Rodriguez T

I have updated Issue 3403 with a comment. Without having delved deep into the internals of roscpp, I suspect the reported issue might have something to do with how long a closed socket lingers in the kernel before it is discarded. On Linux, the tcp_fin_timeout parameter (strictly speaking it governs how long an orphaned connection stays in FIN_WAIT_2, although it is often associated with TIME_WAIT) defaults to 60s:

cat /proc/sys/net/ipv4/tcp_fin_timeout
60
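If you want to check the value shown above from inside a node rather than from the shell, a small helper along these lines should do. The function name readIntSetting is made up for this sketch; it simply parses the first integer in a text file and does not change any kernel setting.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Reads a single integer from a text file such as a /proc/sys entry.
// Returns false if the file cannot be opened or does not start with a number;
// `value` is only meaningful when the function returns true.
bool readIntSetting(const std::string& path, int& value)
{
  assert(!path.empty());
  std::ifstream in(path);
  int parsed = 0;
  if (!(in >> parsed))
    return false;
  value = parsed;
  return true;
}
```

For example, readIntSetting("/proc/sys/net/ipv4/tcp_fin_timeout", seconds) fills seconds with the configured timeout on a Linux machine, and returns false on systems where that file does not exist.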


Stats

Asked: 2012-01-17 20:05:39 -0600

Seen: 1,065 times

Last updated: Jan 18 '12