Ask Your Question

Sharing CUDA pointers between nodes

asked 2016-01-18 22:40:12 -0500

KenYN

Looking through the existing answers, I see that one solution is to use nodelets to share data easily. Thinking about a more general solution, though, I can see a number of ways of doing it, so I'm wondering whether any of these already exist and/or are advisable:

  1. Pretend the CUDA memory pointer is just a value and cast it at either end.

    This falls over if the message gets published to a different device, although I could, for instance, add something like a publisher device ID to detect that case.

  2. (or 1.5) Limit message propagation to the same device.

    Can ROS actually do this?

  3. Write a smart Publish/Subscribe routine that will transparently copy the GpuMat to a standard Mat if it detects the target is a different device.

    Has someone tried this? It seems like the best solution, but quite a lot of effort.

  4. Some other technique I am not aware of.

Any hints towards the best course of action are most welcome.


2 Answers


answered 2016-01-19 01:52:05 -0500

ahendrix

This StackOverflow post seems to indicate that it isn't possible to share CUDA pointers between processes.

Since nodes are different processes, I don't think it's possible to share CUDA pointers across nodes.

You _might_ be able to share CUDA pointers between nodelets in the same process, but even then, you'd have to share not just the pointer, but the entire CUDA context, and that seems like it will be somewhere between awkward and intractable.


answered 2016-01-19 02:32:06 -0500

I think the best course of action is to have a single nodelet that performs all your CUDA-related processing. Nodelets are meant to provide an efficient means of passing messages (i.e. the message-passing part of a ROS API), not to "transport" another library's API, such as CUDA's. It might be possible to distribute CUDA processing across nodelets in a hacky way by passing around pointers (or data about whole contexts) in messages, but this sounds brittle.

You could, for instance, use pluginlib inside your single CUDA nodelet to selectively load CUDA processing plugins at start- or run-time, which to some extent would emulate the flexibility of nodelets. The drawback is that this approach hasn't been standardized, so there is no existing code.

Ecto might also be interesting to look at, as it also aims at creating processing pipelines. I haven't seen CUDA related code for it though.
