Robotics StackExchange | Archived questions

ros2 tf2 get latest transform not working as expected

I am using tf2::canTransform() and tf2::lookupTransform() with time set to TimePointZero. My understanding was that these calls would return the latest available transform, i.e., canTransform() would return true if any transform was available and lookupTransform() would return the latest available one.

In my application, I am publishing the transform periodically at a fairly high rate. Another node subscribes to this transform, but occasionally canTransform() and/or lookupTransform() fails, especially when there are processor load spikes on the ROS machine. I would have expected that once a transform has been published, canTransform() and lookupTransform() should never fail if you are asking for the latest available transform using TimePointZero. Maybe I am misunderstanding what is meant by "latest available transform".

Asked by thejeeb on 2023-05-26 13:09:00 UTC

Comments

Answers

"I would have expected that once a transform has been published, canTransform() and lookupTransform() should never fail"

This is a bad assumption. The TFBuffer object only keeps the /tf messages for a limited amount of time (10 seconds by default). The /tf_static messages, however, do not expire.
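A toy model may make the retention window concrete. The sketch below is not the tf2 implementation: `ToyTransformCache` and its methods are hypothetical names, stamps are plain seconds, and pruning relative to the newest stored stamp is an assumption of the sketch that should be checked against the tf2 source before relying on it.

```cpp
#include <cassert>
#include <deque>

// Toy model of a time-limited transform cache (NOT the tf2 implementation).
// Only the stamp of each sample is stored, in seconds.
class ToyTransformCache {
public:
  explicit ToyTransformCache(double max_age_s) : max_age_s_(max_age_s) {}

  // Store a sample, then drop everything older than (newest stamp - max_age_s).
  // Assumes stamps arrive in increasing order.
  void insert(double stamp_s) {
    stamps_.push_back(stamp_s);
    while (stamps_.front() < stamp_s - max_age_s_) {
      stamps_.pop_front();
    }
  }

  // "Latest available" query: succeeds as soon as anything is stored.
  bool latest(double* stamp_s) const {
    if (stamps_.empty()) return false;
    *stamp_s = stamps_.back();
    return true;
  }

  // Query at a specific time: fails outside the retained window.
  bool contains(double stamp_s) const {
    return !stamps_.empty() &&
           stamp_s >= stamps_.front() && stamp_s <= stamps_.back();
  }

private:
  double max_age_s_;
  std::deque<double> stamps_;
};
```

In this model, a "latest" query on an empty cache fails, while a query for a specific time fails once that time has fallen out of the retained window.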

Asked by Mike Scheutzow on 2023-05-27 06:54:29 UTC

Comments

I realize that, and maybe I didn't state it correctly. The dynamic transform I am publishing at a high rate is highly unlikely to go unpublished for 10 s, which is why I am perplexed that I occasionally see that it cannot get that transform. At the rate I am publishing the transforms, I always expect the buffer to hold this transform without it expiring, so I don't see how canTransform() and lookupTransform() can fail when getting the latest transform.

Asked by thejeeb on 2023-05-27 12:28:01 UTC

Have you checked the time-of-day clocks on both the publishing machine and the subscribing machine?

Asked by Mike Scheutzow on 2023-05-27 14:29:06 UTC

Good thought. But both publisher and subscriber are on the same machine.

Asked by thejeeb on 2023-05-27 19:49:01 UTC

Then I would look for issues in the source code, for example spinners not running frequently enough, or one callback blocking another callback. You could create a ROS node that does only the transform lookup, and see if it has the same failure. That could narrow down whether the issue is on the publish side or the subscribe side.

Asked by Mike Scheutzow on 2023-05-28 07:54:28 UTC

"I would have expected that once a transform has been published, canTransform() and lookupTransform() should never fail if you are asking for the latest available transform using TimePointZero."

This expectation is generally true if the information is in the local buffer that you're querying. However, as @Mike Scheutzow mentions, there are edge cases you can hit, such as very high latency or synchronization errors longer than the buffer duration, which cause the lookup to fail.

But the main problem with your stated assumption is that you assume that data being published means that it's guaranteed to have arrived in the buffer that you're querying. In a distributed system there is inherently latency in delivery, and depending on how the network is configured and how heavily it is loaded, delivery is not guaranteed. The most common place people run into this problem is creating a new instance of a transform buffer and expecting it to have data available immediately, even before connections to the publishers could have been established.

To understand your problem further, please put together a minimal self-contained example. It's likely that while creating it you'll identify what's different from your test case and be able to answer your own question. But if not, please edit your question and we'll try to help more.
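The gap between "published" and "arrived in the buffer" can be sketched with a small simulation. This is a toy model, not tf2 or any ROS API: `ToyMessage`, `publishAtRate`, and `latestArrived` are hypothetical names, and the fixed latency is an assumption chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of delivery latency (NOT tf2 itself).
struct ToyMessage {
  double publish_time_s;  // when the publisher sent the message
  double arrival_time_s;  // when the subscriber's buffer receives it
};

// Publish `count` messages at a fixed period, each delayed by a fixed latency.
std::vector<ToyMessage> publishAtRate(double period_s, double latency_s, int count) {
  std::vector<ToyMessage> sent;
  for (int i = 0; i < count; ++i) {
    ToyMessage m;
    m.publish_time_s = i * period_s;
    m.arrival_time_s = m.publish_time_s + latency_s;
    sent.push_back(m);
  }
  return sent;
}

// "Latest available" from the subscriber's point of view at time now_s:
// only messages that have already arrived count.
bool latestArrived(const std::vector<ToyMessage>& sent, double now_s, double* stamp_s) {
  bool found = false;
  for (std::size_t i = 0; i < sent.size(); ++i) {
    if (sent[i].arrival_time_s <= now_s &&
        (!found || sent[i].publish_time_s > *stamp_s)) {
      *stamp_s = sent[i].publish_time_s;
      found = true;
    }
  }
  return found;
}
```

With a 100 Hz publisher and 50 ms of latency, a query at 30 ms finds nothing even though several messages have already been published; a later query succeeds but returns a stamp that lags the publisher by the latency.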

Asked by tfoote on 2023-05-30 13:27:56 UTC

Comments

Yes, startup synchronization is a problem that we have experienced and have had to address. As you say, when you create a transform buffer, it may take some time for the published transform to be available in the buffer. This is not the case I am experiencing. I am already receiving the data in the buffer. But some time later, it occasionally returns false, indicating the data is not there. I am publishing the transform at 100 Hz, so it is unlikely that the data is expiring due to no data for 10 s.

Asked by thejeeb on 2023-05-30 15:48:39 UTC

Do you have any reparenting or other conflicting publications which could break connectivity?
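As a rough illustration of how reparenting could break connectivity, here is a toy frame tree. It is not tf2: `ToyFrameTree`, `rootOf`, and `connected` are hypothetical names, the frame names in the usage note are made up, and the sketch assumes the parent links contain no cycles.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy frame tree (NOT tf2): maps each child frame to its single parent.
typedef std::map<std::string, std::string> ToyFrameTree;

// Walk parent links up to the root of the tree containing `frame`.
// Assumes there are no cycles in the parent links.
std::string rootOf(const ToyFrameTree& tree, std::string frame) {
  ToyFrameTree::const_iterator it = tree.find(frame);
  while (it != tree.end()) {
    frame = it->second;
    it = tree.find(frame);
  }
  return frame;
}

// Two frames can be related only if they end up at the same root.
bool connected(const ToyFrameTree& tree, const std::string& a, const std::string& b) {
  return rootOf(tree, a) == rootOf(tree, b);
}
```

For example, with map -> odom -> base_link, `connected(tree, "map", "base_link")` is true. If a second publisher then announces base_link under a different parent that is not attached to map, the same call returns false in this model.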

Asked by tfoote on 2023-05-30 16:03:48 UTC