ros_tutorials roscpp talker/listener loses first message or two
This must have been seen before. I ran into it while trying to track down a lost message from a command-line app that publishes one message and then exits. I'm running ROS Kinetic on Ubuntu 16.04.4. My colleague has reproduced the problem using the ros_tutorials, and found that it also happens on Indigo!
I downloaded the ros_tutorials and when I run the listener then the talker, I see the first one or two messages are not received by the listener. Note: I am careful to start the listener many seconds before the talker, so I know it's sitting there waiting to receive messages on the topic.
terminal 1:
$ rosrun roscpp_tutorials listener
[ INFO] [1522884464.985198808]: I heard: [hello world 3]
[ INFO] [1522884465.084941089]: I heard: [hello world 4]
[ INFO] [1522884465.184948624]: I heard: [hello world 5]
terminal 2:
$ rosrun roscpp_tutorials talker
[ INFO] [1522884464.684117550]: hello world 0
[ INFO] [1522884464.784338065]: hello world 1
[ INFO] [1522884464.884241317]: hello world 2
[ INFO] [1522884464.984259097]: hello world 3
[ INFO] [1522884465.084324610]: hello world 4
[ INFO] [1522884465.184348827]: hello world 5
^C[ INFO] [1522884465.284327998]: hello world 6
EDIT: The use case is that I'm using a command-line utility to inject a fault-notification DiagnosticStatus message onto /diagnostics, where the diagnostic_aggregator is already up and running and other publishers have been publishing for some time. In this case it is important not to lose messages: it is not an "emit sensor-data" use case, and the service model cannot be used.
FWIW the Python talker/listener do not lose initial messages. The problem goes away if I put ros::Duration(1).sleep() between the call to advertise and the call to publish, but does not go away if I put a ros::Rate sleep() for 10 seconds between the advertise and the publish.
Once the messages start flowing they all come through - it's the first one or two that get lost.
EDIT2: The following code snippet in the talker seems to prevent initial packet loss. Advertising:
ros::Publisher chatter_pub = n.advertise<std_msgs::String>("chatter", 10, true);
ros::Duration(0.5).sleep();
ros::spinOnce();
ros::Duration(0.5).sleep();
Inside the loop:
chatter_pub.publish(msg);
ros::spinOnce();
ros::Duration(0.5).sleep();
Something seems broken. Does anyone have any clues?
EDIT3: Per suggestion from @gvdhoorn I added chatter_pub.getNumSubscribers() between the ROS_INFO that prints what the talker is about to publish, and the actual publish.
Talker output:
$ rosrun experiments talker1
[ INFO] [1522935255.630339354]: hello world 0
[ INFO] [1522935255.630375560]: Number of subscribers before publishng: 0
[ INFO] [1522935255.730489884]: hello world 1
[ INFO] [1522935255.730574901]: Number of subscribers before publishng: 0
[ INFO] [1522935255.830478047]: hello world 2
[ INFO] [1522935255.830555734]: Number of subscribers before publishng: 0
[ INFO] [1522935255.930438125]: hello world 3
[ INFO] [1522935255.930517228]: Number of subscribers before publishng: 1
[ INFO] [1522935256.030550391]: hello world 4
[ INFO] [1522935256.030636388]: Number of subscribers before publishng: 1
Listener output:
$ rosrun experiments listener2
[ INFO] [1522935255.931268459]: I heard: [hello world 3]
[ INFO] [1522935256.031134685]: I heard: [hello world ...
Subscriptions still take time, and msgs could be published before they are registered. If you add a getNumSubscribers() to the tutorial node, does it change?
Thanks for the suggestion, gvdhoorn. I updated the post with EDIT3, with a code snippet that waits until getNumSubscribers is non-zero, and that fixed it. I wonder if I should put a note on the wiki page for the pub/sub tutorial.
I believe this is not specific to this tutorial, but a general characteristic of how pub-sub works (or: is implemented in ROS). I wouldn't know where to put this on the wiki so that it gets the attention it deserves though, so perhaps adding a note to the tutorial would be ok.
Note also that the answer by @knxa is the answer here. I only provided one possible way to "work around" this characteristic of pub-sub.
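For anyone landing here later, a minimal sketch of the "wait until getNumSubscribers is non-zero" workaround discussed above. This is not the exact code from the thread: waitUntil and FakePublisher are hypothetical names introduced only for illustration, and the sketch is plain C++ so that it compiles without ROS. In a real roscpp node the predicate would query the actual publisher, as shown in the trailing comment. A timeout is included as an assumption, so that a one-shot command-line tool does not hang forever if nobody subscribes.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper: polls `ready` until it returns true or `timeout`
// elapses. Returns true if the condition became true in time.
bool waitUntil(const std::function<bool()>& ready,
               std::chrono::milliseconds timeout,
               std::chrono::milliseconds poll = std::chrono::milliseconds(10)) {
  assert(poll.count() > 0);  // a zero poll interval would busy-spin
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  while (!ready()) {
    if (std::chrono::steady_clock::now() >= deadline) {
      return false;  // gave up: the condition never became true
    }
    std::this_thread::sleep_for(poll);
  }
  return true;
}

// Hypothetical stand-in for ros::Publisher, used only to exercise the helper
// without ROS. It reports zero subscribers for the first
// `polls_until_connected` queries and one subscriber afterwards.
struct FakePublisher {
  explicit FakePublisher(int n) : polls_until_connected(n), polls(0) {}
  unsigned getNumSubscribers() {
    ++polls;
    return polls > polls_until_connected ? 1u : 0u;
  }
  int polls_until_connected;
  int polls;
};

// In a roscpp node the call would look roughly like this (assumption: the
// chatter_pub publisher from the talker tutorial):
//
//   if (!waitUntil([&] { return chatter_pub.getNumSubscribers() > 0; },
//                  std::chrono::seconds(5))) {
//     ROS_WARN("No subscriber connected; the message may be lost");
//   }
//   chatter_pub.publish(msg);
```

Note that this only confirms that at least one subscriber has connected, not that a particular subscriber (e.g. the diagnostic_aggregator) has, so it remains a workaround for the connection delay rather than a delivery guarantee.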