
tf impact on perf (tf design)

asked 2014-08-07 08:41:34 -0600

arennuit

updated 2014-08-07 10:38:07 -0600

Hi all,

I have been playing with tf a bit and I am wondering about its performance, mainly for design reasons. Maybe some of you can shed some light on my thoughts... Here they are:

tf is kind of a Swiss Army knife for transforms: it ensures communication between nodes, even across machines, buffers the data for historical access, performs transformations between frames automatically... This is really great for ease of use, though:

  1. tf buffers the transforms for historical access. The data is sorted for fast access, but even so, searching a sorted list comes at a cost.
  2. Frames are accessed by name (e.g. listener.lookupTransform("/ee_link", "/world", ros::Time(0), transform);); this implies another search, this time for the name: argh!
  3. robot_state_publisher sends data we may not use, and since it goes over TCP/IP (and not via shared memory or similar) it has an impact on performance.
    • Especially as in most use cases nodes exchange transform data on the same machine.
  4. Interpolation / extrapolation...

I am coming from the RT physics simulation / control world, where so far I would have banned all the points mentioned above and traded them all off for a super-fast raw data structure with uber-simple dereferencing for access. Is there something I missed, or does using tf take an enormous hit on performance?
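To make the trade-off in point 2 concrete, here is a minimal sketch (illustration only, not tf code; all names are hypothetical) of the two access patterns the question contrasts: a name-keyed hash table, which is convenient but hashes the string on every call, versus resolving the name once and then dereferencing by raw index.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Stand-in for a full transform.
struct Transform { double x, y, z; };

// Hypothetical frame table: dense transform storage plus a name -> index map.
struct FrameTable {
    std::vector<Transform> transforms;            // dense storage
    std::unordered_map<std::string, size_t> ids;  // name -> index

    // Registers a frame and returns its numeric id.
    size_t add(const std::string& name, const Transform& t) {
        ids[name] = transforms.size();
        transforms.push_back(t);
        return transforms.size() - 1;
    }

    // Convenient path: hash the string on every access.
    const Transform& byName(const std::string& name) const {
        return transforms.at(ids.at(name));
    }

    // Fast path: resolve the name once, then dereference by index.
    const Transform& byId(size_t id) const { return transforms.at(id); }
};
```

The hash lookup is O(1) on average but still touches the string; the index dereference is a single array access, which is the "uber-simple dereferencing" the question has in mind.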




2 Answers


answered 2014-08-07 11:56:10 -0600

tfoote

updated 2014-08-07 16:52:23 -0600

You're right that tf was designed with a focus on ease of use. I'll comment on each of your points.

  1. We have tested a binary tree structure for lookups, which is faster for arbitrary lookups. However, we actually observed higher lookup times and more overhead from maintaining the structure than with the linked list. Maintaining the tree is clearly more expensive, since you need to insert into and prune the tree, whereas most data arrives in time order, so inserting into the list almost never requires walking it, and pruning a sorted linked list is also trivial. As for lookup times: although lookup complexity is O(n) in the length of the list (about n/2 comparisons on average), in most use cases it is much faster, because queries tend to be very close to the front of the list and the walk starts from the newest data. The other thing to keep in mind is that while the user/developer mostly sees the lookup times, there are many more insertions and background updates than lookups.
  2. Names are stored in a hash map, which does not require a search. (Yes, it is not realtime-safe to be processing strings, but names mean a lot more to developers and users than numeric ids. We originally tested using numeric frame ids and it was wholly undebuggable.)
  3. When sending data on the same machine, the kernel is quite optimized and effectively passes the data through shared memory over the loopback interface. I've never seen tf data be a performance bottleneck on loopback. (It does become a problem over wifi.)
  4. I'm not sure what you mean by this bullet point. Interpolation is required to estimate transforms across unsynchronized data sources. It also allows lower publish rates for data, reflecting the bandwidth of the signal of a link: fast-moving transforms require high-speed publishing, while slow-moving links can publish infrequently yet remain accurate. Without this, every link would have to be published at every timestep you wanted to query, so interpolation is actually a significant performance boost. And extrapolation is discouraged and disabled by default.
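Points 1 and 4 above can be sketched together. Below is a minimal, simplified illustration (not the actual tf source; names and the 1-D "transform" are stand-ins) of a per-frame time cache kept sorted with the newest sample at the front: insertion is usually a cheap push to the front because data arrives in time order, lookup walks from the newest entry so recent queries terminate quickly, and queries between two stamps are linearly interpolated.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// A timestamped sample; `value` stands in for a full transform.
struct Sample {
    double stamp;  // time in seconds
    double value;
};

// Simplified time cache: front = newest, back = oldest.
struct TimeCache {
    std::deque<Sample> samples;

    // Data almost always arrives in time order, so this is usually a single
    // comparison followed by a push to the front; out-of-order samples walk
    // down the list until they fit.
    void insert(const Sample& s) {
        auto it = samples.begin();
        while (it != samples.end() && it->stamp > s.stamp) ++it;
        samples.insert(it, s);
    }

    // Walks from the newest entry; queries near "now" stop after a few steps,
    // which is why the linear scan beats a tree in practice. Stamps between
    // two samples are linearly interpolated; stamps outside the buffered
    // range are clamped to the nearest sample (no extrapolation).
    double lookup(double stamp) const {
        assert(!samples.empty());
        if (stamp >= samples.front().stamp) return samples.front().value;
        if (stamp <= samples.back().stamp) return samples.back().value;
        for (size_t i = 0; i + 1 < samples.size(); ++i) {
            const Sample& newer = samples[i];
            const Sample& older = samples[i + 1];
            if (stamp <= newer.stamp && stamp >= older.stamp) {
                double t = (stamp - older.stamp) / (newer.stamp - older.stamp);
                return older.value + t * (newer.value - older.value);
            }
        }
        return samples.front().value;  // unreachable
    }
};
```

For example, with samples at stamps 1.0, 2.0, and 3.0 holding values 10, 20, and 30, a lookup at stamp 2.5 interpolates to 25 after inspecting only the two newest entries.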

Overall there are things which could be optimized, but in typical testing on a desktop a tf lookup is measured in microseconds, so we haven't worked on optimizing it further; there are many other things to optimize first. The biggest bottleneck, as mentioned above, is bandwidth usage over wifi, and there are several different techniques for downsampling and forwarding data over a limited-bandwidth link.



Haha! Thanks a lot for this very clear explanation. I understand the solutions and trade-offs better now. It definitely looks less scary than it initially seemed.

arennuit  ( 2014-08-08 03:24:46 -0600 )

answered 2014-08-07 10:19:40 -0600

You are basically right with your observations. They are somewhat offset by the fact that tf makes many tasks very easy and convenient that would otherwise take a lot of time to implement. Given that current computers are really fast, the advantages often outweigh the (in many cases largely negligible) performance impact of using tf.

That being said, it certainly helps to be aware of how tf works behind the scenes. For instance, designing some sort of system that launches hundreds of nodes all subscribing to tf likely wouldn't end well.



Thanks Stefan, your answer is clear, I get the big picture. Now there is surely something better to do than accessing the frames by string id at each time step (as done in *lookupTransform()* - which AFAIK is the only way to read a tf), no?

arennuit  ( 2014-08-07 10:50:21 -0600 )

See Tully's comprehensive answer. "Better" is always a matter of the metric you apply. Yes, using integer IDs rather than string IDs is better if all that matters is raw speed, but in most practical cases lookups are fast enough, and there is so much happening in a robotic system that a few string comparisons (or hash lookups) are really negligible.

Stefan Kohlbrecher  ( 2014-08-07 15:06:17 -0600 )

