
How can I find the bottleneck in my code while a node is running?

asked 2021-08-27 05:36:26 -0500

benthebear93

So I have a node that subscribes to PointCloud2 messages generated by a moving line scanner (about 200 Hz publish rate). I tried to concatenate (append) the point cloud in every subscribe callback, but it seems very slow: about 20 Hz. I tested this by publishing back from the callback function, which could slow things down a little, yet 20 Hz seems too slow. (I used ros::spin().)

There is definitely a bottleneck in my code. Is there any way I can find the bottleneck while the ROS node is running? The only idea I have is to print the current time at every line that matters, but this seems very tedious, so if there is any solution it would be great! Any idea is also welcome :) Thx.

More general info about what I am trying to do:

Each point cloud has 800 points (xyz). The point cloud data will be accumulated, and when the node finishes it will save the total point cloud.


1 Answer


answered 2021-08-27 07:38:34 -0500

Per Edwardsson

Finding bottlenecks in code is called profiling, and it is of interest in many fields. There are many tools for it, but none that work fully automatically - they all require some fiddling, so they are not necessarily _easier_ than printing the current time and working backward, but they will likely tell you more. Perhaps you can start by searching for tutorials on tools like gprof to make your code performant.

That said, accumulating 800 3D points into an (ever-growing) storage 200 times per second sounds like a very compute-heavy process. In particular, arrays typically cannot be resized in any performant way. When you think you resize an array, what more likely happens is that another array of the new size is allocated and all data from the previous iteration is copied into it. Copying data, by the way, is a very expensive process as well. I don't think you should expect to get all 200 scans accumulated every second.

So, what can be done in this situation? Think about your problem for a moment - what are you trying to do? Do you actually need 800 points in every scan? Perhaps some filtering can be done to reduce that number? Do you need 200 scans per second, or could you build a sufficiently good image with 100 or 50 scans? What about the accumulation - could you pre-allocate the memory you need so that you don't have to resize arrays live? Can this process be parallelized? Is ROS the right transport choice here, or can you get a faster connection by editing the source code of the driver? There are many avenues to explore, and I hope you find a good solution for your problem.



Thx a lot! I will look into the link you gave me and also try different methods to reduce the computational expense.

benthebear93  ( 2021-08-28 05:29:34 -0500 )
