
Publish messages immediately within a subscribe callback?

asked 2015-10-20 02:19:43 -0500

drewm1980

updated 2015-10-20 04:16:40 -0500

What's the cleanest way to publish a message immediately, without triggering any subscriber callbacks? I would expect calling ros::spinOnce() inside a callback to (potentially) result in stack overflow as the callback calls itself.

Use case: I am trying to make a heartbeat function that "just works" in callbacks and such. Here is a broken initial design:

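  #include <algorithm>
  #include <string>
  #include <unordered_map>

  #include <ros/ros.h>
  #include <std_msgs/Empty.h>

  class Heart {
   private:
    ros::NodeHandle nh;
    std::unordered_map<std::string, ros::Publisher> publishers;
    const char* file_;
   public:
    explicit Heart(const char* file) : file_(file) {}
    void beat(int line) {
      // One topic per call site: heartbeat_<file>_<line>, with '.' replaced by '_'.
      std::string keystring = std::string("heartbeat_") + file_ + "_" + std::to_string(line);
      std::replace(keystring.begin(), keystring.end(), '.', '_');
      // Advertise lazily the first time this call site is hit.
      auto resultpair = publishers.find(keystring);
      if (resultpair == publishers.end())
      {
        resultpair = publishers.emplace(keystring, nh.advertise<std_msgs::Empty>(keystring, 1)).first;
      }
      resultpair->second.publish(std_msgs::Empty());
      // Concern from the question: this spins the global callback queue, so
      // beat() inside a subscriber callback may run subscriber callbacks again.
      ros::spinOnce();
    }
  };
  // Usage:
  Heart heart(__FILE__);
  heart.beat(__LINE__);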

The idea is that you spray a bunch of:

heart.beat(__LINE__);

all over your code, record with:

rosbag record --regex "/heartbeat_(.*)" -O heartbeats.bag

And look at the gloriously clearly sorted lines in:

rqt_bag heartbeats.bag

It is broken in the sense that spinOnce() isn't enough to result in even remotely reliable timings.


1 Answer


answered 2015-10-20 04:00:26 -0500

gvdhoorn

updated 2015-10-20 07:31:07 -0500

> What's the cleanest way to publish a message immediately, without triggering any subscriber callbacks?

Just use ros::Publisher::publish(..)? As long as you're not running multithreaded or asynchronous spinners, callback queue processing is single-threaded, so you should not run into issues.

> I would expect calling ros::spinOnce() inside a callback to (potentially) result in stack overflow as the callback calls itself.

Calling spinOnce() inside a callback is not a good idea in any case, whatever the reason. It generally points to bad data / control flow design.

> Use case: Publishing messages as timing statements in different parts of a long running callback. rosbag record just the topics for the timing statements, look at their timing in rqt_bag

Could you describe your design for this a bit more? If you're publishing msg type X for your timing info, and your long running callback is triggered by publications of type Y, where do you expect there to be an issue with self-triggering?


edit:

> It is broken in the sense that spinOnce() isn't enough to result in even remotely reliable timings.

I'd expect this to be the case because the infrastructure was never designed for this use case (the stamp field is there for a reason).

As to your issue:

  • you could probably get around the "spinOnce() calls my callback while I'm in the callback" by some judicious use of multiple callback queues and associated spinners. You are responsible for spinning custom queues, so that should allow you to call spinOnce() without triggering execution of callback_Y
  • perhaps (this is really a guess) see if using UDPROS improves anything. Since it uses UDP, it should not do any buffering like TCP does, which might reduce the latency you observe
  • use the stamp field and accept that for analysis / visualisation you can't directly use rqt_bag, but would need to process the messages and extract the timestamp
  • use proper profiling tools (like perf, oprofile, gprof, etc)
  • see if you can use rqt_graphprofiler and rosprofiler (but really only for messaging analysis). Random presentation about this: ROS System Profiling and Visualization.
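
To make the first bullet concrete, here is a minimal, ROS-free sketch of the re-entrancy argument. `ModelQueue` and `maxCallbackDepth` are hypothetical names invented for this illustration; in roscpp the corresponding pieces would be ros::CallbackQueue and ros::NodeHandle::setCallbackQueue(). The model only covers which callbacks get executed. It says nothing about latency or about when a message actually leaves the socket.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// ROS-free stand-in for a callback queue (an assumption for illustration):
// callbacks are stored when enqueued and run only when this queue is spun.
class ModelQueue {
 public:
  void enqueue(std::function<void()> cb) { pending_.push_back(std::move(cb)); }

  // Run the callbacks that are pending at entry; return how many ran.
  std::size_t spinOnce() {
    std::size_t ran = 0;
    for (std::size_t n = pending_.size(); n > 0 && !pending_.empty(); --n) {
      std::function<void()> cb = std::move(pending_.front());
      pending_.pop_front();
      cb();  // may itself spin a queue, which is the case being modelled
      ++ran;
    }
    return ran;
  }

 private:
  std::deque<std::function<void()>> pending_;
};

// Queue three "Y" messages, spin the main queue once, and report how deeply
// callback_Y ended up nested inside itself.
int maxCallbackDepth(bool separate_heartbeat_queue) {
  ModelQueue main_queue;
  ModelQueue heartbeat_queue;
  // The queue that the heartbeat spins from inside callback_Y.
  ModelQueue& beat_queue =
      separate_heartbeat_queue ? heartbeat_queue : main_queue;

  int depth = 0;
  int max_depth = 0;
  std::function<void()> callback_y = [&]() {
    ++depth;
    if (depth > max_depth) max_depth = depth;
    beat_queue.spinOnce();  // stands in for the spinOnce() inside beat()
    --depth;
  };

  for (int i = 0; i < 3; ++i) main_queue.enqueue(callback_y);
  main_queue.spinOnce();
  assert(depth == 0);  // every callback has returned
  return max_depth;
}
```

In this model, `maxCallbackDepth(false)` nests callback_Y once per outstanding message, while `maxCallbackDepth(true)` never goes deeper than one level, because the heartbeat only spins a queue that callback_Y is not registered on.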

In general I'd say that if your bottleneck is in messaging, then yes, you could use bags and their visualisation to get an idea of that, and your approach may make sense (but I'd use other tools for that). If the bottleneck is in the callbacks, then it is an algorithm issue, and you should be profiling those algorithms, without the middleware interfering.
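
As a rough sketch of the "use the stamp field" route, the heartbeat can record its timestamp at the call site and leave publishing for later, so the measured time no longer depends on the transport. `StampedHeart` and `Beat` are hypothetical names for this illustration, and std::chrono::steady_clock stands in for ros::Time::now(). In a real node the stamp would go into a message type that carries a header, since std_msgs/Empty has no fields at all.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>
#include <vector>

// One heartbeat sample. `stamp` plays the role of the header's stamp field:
// it is taken where beat() is called, not where the message is received.
struct Beat {
  std::string key;
  std::chrono::steady_clock::time_point stamp;
};

class StampedHeart {
 public:
  explicit StampedHeart(const char* file) {
    assert(file != nullptr);
    file_ = file;
  }

  // Record a sample for the given source line. No middleware call happens
  // here, so nothing needs to be spun.
  void beat(int line) {
    std::string key = "heartbeat_" + file_ + "_" + std::to_string(line);
    std::replace(key.begin(), key.end(), '.', '_');  // same rule as the question
    beats_.push_back(Beat{key, std::chrono::steady_clock::now()});
  }

  const std::vector<Beat>& beats() const { return beats_; }

  // Milliseconds between two recorded samples, addressed by index.
  double elapsedMs(std::size_t from, std::size_t to) const {
    assert(from < beats_.size() && to < beats_.size());
    return std::chrono::duration<double, std::milli>(
               beats_[to].stamp - beats_[from].stamp)
        .count();
  }

 private:
  std::string file_;
  std::vector<Beat> beats_;
};
```

The recorded samples could then be published once the long-running callback has returned, with the analysis reading the stamp instead of the receive time. The price, as noted in the list above, is that rqt_bag no longer shows the timing directly.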


Comments

If I publish type X in a callback triggered by publications of type Y, and then call spinOnce() in the callback, the callback for type Y could be called again. Calling publish() alone isn't enough to push the message out immediately on the network socket, which is what I want for timing.

drewm1980 (2015-10-20 04:08:57 -0500)

I'm not sure I follow: why would callback_Y be called again? Unless you're expecting there to be more outstanding msg_Y instances in your queue?

gvdhoorn (2015-10-20 04:11:56 -0500)

As to timing: publishing is an asynchronous affair through-and-through. Once you let go of the msg, you don't have any control over it anymore. If capturing timing info is important, use the stamp field in the header.

gvdhoorn (2015-10-20 04:13:18 -0500)

I'm trying to make something that isn't terribly accurate, but that is really easy to use, for debugging delays on the order of a tenth of a second. I updated the question with code

drewm1980 (2015-10-20 04:18:41 -0500)

and yes, another node is publishing messages of type Y in our discussion, and faster than my callback for type Y can handle them. I am manually instrumenting this callback to try to fix the problem. (really I am hoping to come up with a general solution to this problem, though)

drewm1980 (2015-10-20 04:21:29 -0500)

I'm looking into your first couple of suggestions. I don't want to have to write a separate analysis tool (again). Sampling-based profiling tools don't give you a good idea where your latencies are, or what your control flow was during the recorded history, especially if you're dropping data.

drewm1980 (2015-10-21 09:00:38 -0500)

Using a separate callback queue doesn't seem to work. And upon reflection, the delays I am seeing are way too big to be just a TCP vs. UDP thing. Here's a version with a separate callback queue that wasn't better: http://paste.ubuntu.com/12892878/

drewm1980 (2015-10-22 03:59:48 -0500)
