cuda performance [closed]

asked 2012-02-08 12:29:47 -0500

I'm currently using ROS Electric and have a simple Gazebo model consisting of a rectangular base, 3 wheels (3 continuous joints), and a steering joint (1 revolute joint). I've set ODE to run at 100 Hz. When I use the CPU solver ("<steptype>quick</steptype>"), I get better performance than with the CUDA solver ("<steptype>parallel_quick</steptype>"). I've tried tweaking the CUDA batch sizes and block sizes, but the results seem to be worse than with the default setup. I've also spawned 10 of the same vehicle and compared CPU and CUDA performance; the CPU wins there as well. Is there something I'm missing? My guess is that the performance hit comes from the memory transfer between GPU and CPU, and that CUDA is not a good fit for my setup. I'm using a Quadro FX 4800 with 192 CUDA cores.

Thanks a lot, WW

Closed by tfoote for the following reason: question is not relevant or outdated
close date 2013-07-23 05:59:35


That's really funny. I'm seeing the same thing on my robot using pcl_cuda. Have you profiled your code and verified that the performance drain is due to the memory transfer?

mortonjt (2013-05-31 11:04:48 -0500)
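
The guess in the question (a fixed transfer cost swamping a small problem) can be sanity-checked with a rough cost model before reaching for a profiler. The sketch below is only an illustration: the function names and every number in it are hypothetical, not measurements from Gazebo, ODE, or the Quadro FX 4800. It assumes each physics step costs a fixed per-step overhead on the GPU (kernel launches plus host/device copies) and a per-constraint cost on either device.

```python
# Rough per-step cost model. All names and numbers are hypothetical
# placeholders for illustration; they are not measured values.

def step_time_cpu(n_constraints, cpu_us_per_constraint):
    """CPU step time in microseconds: cost grows linearly with constraints."""
    return n_constraints * cpu_us_per_constraint


def step_time_gpu(n_constraints, gpu_us_per_constraint, fixed_overhead_us):
    """GPU step time in microseconds: a fixed overhead (launch + transfers)
    plus a smaller per-constraint cost."""
    return fixed_overhead_us + n_constraints * gpu_us_per_constraint


def break_even_constraints(cpu_us, gpu_us, fixed_overhead_us):
    """Constraint count above which the GPU becomes faster.
    Returns None if the GPU is never faster under this model."""
    if cpu_us <= gpu_us:
        return None
    return fixed_overhead_us / (cpu_us - gpu_us)


if __name__ == "__main__":
    # Assumed values: 1.0 us/constraint on CPU, 0.1 us/constraint on GPU,
    # 100 us of fixed GPU overhead per step.
    cpu_us, gpu_us, overhead_us = 1.0, 0.1, 100.0
    for n in (20, 200, 1000):
        print(n,
              step_time_cpu(n, cpu_us),
              step_time_gpu(n, gpu_us, overhead_us))
    print("break-even:", break_even_constraints(cpu_us, gpu_us, overhead_us))
```

Under these assumed numbers the CPU wins for small constraint counts and the GPU only pays off once the fixed overhead is amortized. Whether a handful of joints, or ten copies of the vehicle, falls above or below that point can only be settled by profiling the actual run, as the comment suggests.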