# How should a subscriber interpret multiple messages upon subscription?

If I have multiple nodes that each latch a message onto a particular topic, a node that later subscribes to that topic will receive multiple messages immediately upon subscription. If the message type does not include a timestamped header, how should that set of messages be interpreted?

The way I wish (and previously thought) topics were conceptualized, a topic would represent a stream of messages that supersede each other (like, the current position of something). In that case, it seems like latching should be "per topic". When a new subscriber subscribes to the topic, it makes perfect sense that they would like to see what the current state of the topic is, even before another message is published. In this case, I would expect that when a publisher latches a message, any other latched message should be unlatched -- future subscribers will want the most recent message on that topic, regardless of source, and messages prior to the most recent one are now obsolete. And this is, in fact, the behavior that publishers in the same node have.

But, when the publishers are in different nodes, multiple messages are sent to a new subscriber. This is different from my conceptualization. I understand that the way topics work, in fact, is that latched messages are resent by their publisher (on a per-node basis) to a new subscriber whenever it detects that new subscriber. But WHY does this behavior happen? What is the rationale behind having all publishers with latched messages send them upon a new subscription? In my conceptualization, the idea is "a new subscriber should be able to know the current state of the topic upon subscription, and that state is defined by the most recent message published to the topic". What is the equivalent idea for receiving multiple messages upon subscription?

EDIT: To clarify, the difference between a new subscriber connecting to a topic and immediately receiving two latched messages, and an old subscriber receiving those messages when they were published, is that approximate time synchronization is lost. Assume a topic with superseding messages (and see gvdhoorn's answer for why that might not be a good assumption) that don't include a timestamped header. If publisher 1 publishes and latches message 1, and then an hour later, publisher 2 publishes and latches message 2, the old subscriber knows that message 2 represents the true state of the topic. The new subscriber receives message 1 and message 2 in a random order upon subscription and has no way of knowing which message represents the state of the topic. See answers for responses to this.
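The scenario above can be sketched as a toy model in plain Python. This is not the ROS API: `LatchedPublisher`, `Topic`, `advertise` and `subscribe` are hypothetical names, and the shuffle only stands in for the fact that arrival order across publishers is not guaranteed.

```python
import random


class LatchedPublisher:
    """Toy stand-in for one node's latched publisher (not the ROS API)."""

    def __init__(self, name):
        self.name = name
        self.latched = None  # last message published by this publisher only

    def publish(self, msg):
        # Replaces only this publisher's latched message, nobody else's.
        self.latched = msg


class Topic:
    """Toy topic: a list of publishers with no shared 'current state'."""

    def __init__(self):
        self.publishers = []

    def advertise(self, name):
        pub = LatchedPublisher(name)
        self.publishers.append(pub)
        return pub

    def subscribe(self, rng=random):
        # Each publisher resends its own latched message to the newcomer.
        # Shuffling models the lack of any ordering guarantee.
        msgs = [p.latched for p in self.publishers if p.latched is not None]
        rng.shuffle(msgs)
        return msgs


topic = Topic()
pub1 = topic.advertise("node_1")
pub2 = topic.advertise("node_2")
pub1.publish("message 1")  # published first
pub2.publish("message 2")  # published an hour later in the scenario above

received = topic.subscribe()
print(received)  # both messages, in either order
```

Without a timestamp or similar field in the message, nothing in `received` tells the late subscriber which of the two was published last.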


(I'm just a user, as you are, so the below is based on my understanding of what is there, not what it was designed to do or how it was designed)

This probably does not completely answer your question, but may be something to think about: say I have n publishers on a single topic (something like temperatures). Each msg includes an id field or similar, and each publisher sets this id field to indicate the originating temperature sensor (maybe the location, maybe just a numeric ID).

If we follow your idea -- each topic contains the 'latest state' -- then we have a problem: which of the n sensor msgs represents 'the latest state'? There is no single message that contains the state of the entire system (which is an ambiguous thing anyway). Only by keeping track of msgs that have been received, and storing the data in some internal model would consumers be able to piece together what the aggregate state of all temperature sensors is.
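A minimal sketch of such an internal model, in plain Python rather than an actual ROS node. The `Temperature` type, its `sensor_id` and `celsius` fields, and the `TemperatureModel` class are assumptions made up for illustration; in a real node `on_message` would be the subscriber callback.

```python
from dataclasses import dataclass


@dataclass
class Temperature:
    sensor_id: str  # hypothetical id field naming the originating sensor
    celsius: float


class TemperatureModel:
    """Keeps the most recent reading per sensor id."""

    def __init__(self):
        self.latest = {}

    def on_message(self, msg):
        # A new message supersedes only the entry for its own sensor.
        self.latest[msg.sensor_id] = msg.celsius


model = TemperatureModel()
for msg in [
    Temperature("ambient", 21.5),
    Temperature("engine", 90.0),
    Temperature("ambient", 22.0),
]:
    model.on_message(msg)

print(model.latest)  # {'ambient': 22.0, 'engine': 90.0}
```

No single message holds the whole state here; the aggregate exists only in the consumer's dictionary.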

It could still make sense for these publishers to latch their publications though: it allows late joiners to instantly become aware of the latest available state of all temperature sensors independent of their sampling & publication period. That is what latching is for.

There is no requirement (or even a convention) for topics to represent "the current state" of anything, unless the author of the node, and thus the designer of the dataflows, decides that it makes sense to him. Named, typed topics are basically an addressing scheme for dataflows, nothing more. It is then the responsibility of the consumers of those dataflows to figure out how to handle incoming messages.

Adding timestamps (i.e. headers) to messages is one way in which producers can make it easier for consumers to make sense of incoming messages. The id in the example above is similar to the frame_id field of a std_msgs/Header, which also carries a stamp field, but the id only gives the space coordinate of a sensor. The time coordinate is missing from the message.

> I would expect that when a publisher latches a message, any other latched message should be unlatched -- future subscribers will want the most recent message on that topic, regardless of source, and messages prior to the most recent one are now obsolete.

Your main assumption here is that all topics always carry some sort of message that encodes some kind of 'state union'. That is not true.

> And this is, in fact, the behavior that publishers in the same node have.

Slight nuance: only if those publishers publish to the same topic.

> In my conceptualization, the idea is "a new subscriber should be able to know the current state of the topic upon subscription, and that state is defined by the most recent message published to the topic".

and that is how latching works, but it's up to the subscriber to make sense of the incoming messages. See my earlier example for why your idea ...


Based on my understanding of gvdhoorn's answer, my answer to this question is that topics can be multi-channel or single-channel. My question describes single-channel topics -- that is, topics that describe the state of exactly one thing. I think any message type that has neither a channel-differentiating field nor a message-prioritizing field must constitute a single-channel topic, and I don't think multiple latched messages make any sense in this situation.

For instance, consider a small LCD that subscribes to topic /status_msg which carries std_msgs/String. Whenever it receives a message, it prints that message on the LCD. There's no space for multiple messages; any new message erases the previous one. If two publishers latch messages to this topic, when the LCD subscribes, we get undefined and undesirable behavior -- the desired behavior would be for only the most recent message to be delivered to the LCD upon subscription.

This can mostly be resolved by adding a message-prioritizing field such as a timestamp. If the /status_msg topic above carried a message containing both a string field and time field, then the LCD could discard the older of the two latched messages even if it happened to arrive after the newer message upon subscription.
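A rough sketch of that filtering, in plain Python rather than a real ROS subscriber. `StampedStatus`, its `stamp` and `text` fields, and the `Lcd` class are hypothetical names; `stamp` is assumed to be seconds on a clock shared by all publishers.

```python
from dataclasses import dataclass


@dataclass
class StampedStatus:
    stamp: float  # publication time in seconds (assumed shared clock)
    text: str


class Lcd:
    """Shows only the newest status, whatever order messages arrive in."""

    def __init__(self):
        self.shown = None

    def on_message(self, msg):
        # Drop anything older than what is already on the display.
        if self.shown is not None and msg.stamp < self.shown.stamp:
            return
        self.shown = msg


lcd = Lcd()
# The two latched messages arrive newest-first upon subscription.
lcd.on_message(StampedStatus(stamp=3600.0, text="battery low"))
lcd.on_message(StampedStatus(stamp=0.0, text="booting"))

print(lcd.shown.text)  # battery low
```

This only works as well as the publishers' clocks agree, which is one reason the fix is "mostly" rather than fully a resolution.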

If topics were only single-channel, I think that would be an ugly solution -- we would potentially have multiple zombie publishers consuming resources yet producing messages (to new subscribers) that will never be useful, and it would make many message types essentially invalid (in the general case) for use in topics.

But, topics are not only single-channel. A single topic may contain many channels, each channel representing the state of exactly one thing. A good, simple example is gvdhoorn's where the topic /temperatures carries a message type that contains an id field, describing the source of the temperature measurement ("ambient", "intake", "engine", etc), and a float field for temperature. In that case, different publishers have current information for different channels so it makes sense for multiple publishers to latch messages simultaneously. A more complicated example is how TF works on /tf_static.

One alternative to multi-channel topics might have been (prior to when ROS was actually implemented) to make all topics single-channel, and then have more sophisticated subscription rules (like subscribing to /temperatures/*, which would listen for messages from /temperatures/sensor1, /temperatures/sensor2, etc). But, that just trades topic content complexity for subscription complexity. And, it limits the channel identifiers to strings (sections of topic address), which the actual implementation does not limit. So, I think I see the design philosophy now and it seems pretty reasonable given those considerations. I happen to work nearly exclusively with single-channel topics so I miss some capabilities and constraints that would exist if not for multi-channel topics, but at least now I think I understand.
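To make that alternative concrete, here is a toy sketch of wildcard subscription in plain Python. ROS topics do not work this way; `WildcardBus` and its methods are invented for illustration, and the pattern matching uses the standard library's `fnmatchcase`, whose `*` also matches across `/`.

```python
from fnmatch import fnmatchcase


class WildcardBus:
    """Toy message bus where subscriptions are topic-name patterns."""

    def __init__(self):
        self.subscriptions = []  # list of (pattern, callback) pairs

    def subscribe(self, pattern, callback):
        self.subscriptions.append((pattern, callback))

    def publish(self, topic, value):
        # Deliver to every subscription whose pattern matches the topic name.
        for pattern, callback in self.subscriptions:
            if fnmatchcase(topic, pattern):
                callback(topic, value)


readings = {}
bus = WildcardBus()
bus.subscribe("/temperatures/*", lambda topic, value: readings.update({topic: value}))

bus.publish("/temperatures/sensor1", 21.5)
bus.publish("/temperatures/sensor2", 90.0)
bus.publish("/status_msg", "ignored")  # does not match the pattern

print(readings)  # {'/temperatures/sensor1': 21.5, '/temperatures/sensor2': 90.0}
```

The channel identifier is now the last segment of the topic name, so it can only ever be a string, which is the limitation noted above.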
