
2D image point to 3D

asked 2015-07-20 05:34:50 -0500

nwanda


I have a setup with a downward-facing camera (a simple webcam pointed at the ground). Alongside the camera is an IMU, which means that I have the attitude of the camera. I also have an estimate of the distance from the camera to the ground.

Since I'm detecting targets on the ground (I'm able to detect them and get their position in the image using OpenCV), I would like to extract their position in the world frame. However, I'm lost on how to do it.

Is there a ROS package that implements this? How should I do it?

Thanks in advance!


2 Answers


answered 2015-07-21 07:21:08 -0500

Airuno2L

It sounds like you have the 3D point that describes the location of the camera, and since you have the IMU with it you know its orientation as well. Since you also know the estimated location of the ground plane, this can be treated as a ray-plane intersection problem, a common problem in computer graphics: find the x,y,z point where a ray intersects a plane.

The line that starts at the camera and passes through the ground is the ray, and the ground of course is the plane. Since the object you're detecting in the camera image is not always at the center pixel, you'll need to add a pan and tilt angle to the ray depending on which pixel the center of the object corresponds to. The image_geometry package has tools to help do just that.

As mig mentioned, you'll want to use tf to help keep track of all the transformation frames, and a urdf can make this even easier.

Once you know the ray, google "line-plane intersection". There is even a Wikipedia article about it to get started. You might even get lucky searching for "how to do line-plane intersection in C++" (or Python, or whatever language you want to use) and find some ready-to-use code.
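For illustration, here is a minimal sketch of the line-plane intersection in Python with NumPy. The function name `intersect_ray_plane` and the example numbers are made up for this sketch and are not part of any ROS package. It assumes the ray origin, ray direction, and plane are all expressed in the same frame.

```python
import numpy as np

def intersect_ray_plane(origin, direction, plane_point, plane_normal, eps=1e-9):
    """Return the point where a ray hits a plane, or None if there is no hit.

    Hypothetical helper: all inputs are 3-vectors in the same frame.
    """
    origin = np.asarray(origin, dtype=float)
    direction = np.asarray(direction, dtype=float)
    plane_point = np.asarray(plane_point, dtype=float)
    plane_normal = np.asarray(plane_normal, dtype=float)

    denom = np.dot(plane_normal, direction)
    if abs(denom) < eps:
        return None  # ray is (nearly) parallel to the plane

    # Solve n . (origin + t * direction - plane_point) = 0 for t.
    t = np.dot(plane_normal, plane_point - origin) / denom
    if t < 0:
        return None  # plane lies behind the ray origin
    return origin + t * direction

# Assumed example: camera 2 m above the ground plane z = 0,
# ray pointing forward and down at 45 degrees.
hit = intersect_ray_plane([0, 0, 2], [1, 0, -1], [0, 0, 0], [0, 0, 1])
print(hit)
```

The direction does not need to be normalized: `t` simply scales with its length, and the returned point is the same.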



good points there!

mgruhler (2015-07-21 07:50:50 -0500)

That's actually what I had in mind, using the plane z=0 (the ground). However I'm unsure how to obtain the ray, since I only have a 3x3 matrix of intrinsic parameters (camera matrix). And how should I get the scale factor? I'm a little bit lost.
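As a rough sketch of how the ray can come from the 3x3 camera matrix alone: multiplying the homogeneous pixel by the inverse of K gives a direction in the camera optical frame. The unknown scale factor is then fixed by where that ray meets the ground plane. The function name `pixel_to_ray` and the values in `K` below are assumptions for illustration, and the pixel is assumed to be already undistorted.

```python
import numpy as np

def pixel_to_ray(u, v, K):
    """Back-project pixel (u, v) to a unit direction in the camera optical frame.

    Hypothetical helper: assumes a pinhole model and an undistorted pixel.
    """
    K = np.asarray(K, dtype=float)
    # Solve K * ray = [u, v, 1] instead of forming the inverse explicitly.
    ray = np.linalg.solve(K, np.array([u, v, 1.0]))
    return ray / np.linalg.norm(ray)

# Made-up intrinsics: fx = fy = 500 px, principal point at (320, 240).
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])

center_ray = pixel_to_ray(320, 240, K)  # pixel at the principal point
right_ray = pixel_to_ray(820, 240, K)   # pixel 500 px to the right of it
print(center_ray, right_ray)
```

The resulting ray is still in the camera frame, so it has to be rotated by the camera attitude before intersecting it with a plane defined in the world frame.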

nwanda (2015-07-22 05:10:11 -0500)

After reading about image_geometry, I think I know how to proceed. Using projectPixelTo3dRay I'm able to obtain the ray, and after the intersection with the ground plane I should be getting an x,y,z for the target. Right? By the way, which library is usually used for linear algebra computation in ROS?

nwanda (2015-07-22 05:26:58 -0500)

That's exactly right. The tf library has some built in tools for linear algebra. Here is a small tutorial that shows python use. TF actually uses

Airuno2L (2015-07-22 09:06:30 -0500)

I'm not sure what people do when using C++. Looking at that tutorial, tf::Transform is just a btTransform from Bullet, but there is nothing stopping you from using other ways. Eigen is good too.

Airuno2L (2015-07-22 09:12:45 -0500)

Thanks for all the help! I will look into the Eigen library.

nwanda (2015-07-22 11:26:03 -0500)

answered 2015-07-21 01:07:08 -0500

mgruhler

You're saying you have the position of the camera in 3D already, right?

Then, you would have to write this yourself, but this is fairly easy with the tf library (best read through the documentation and the tutorials). However, you need to have your camera and IMU set up correctly in the urdf. With the position of the target in the camera frame (i.e. x,y,z with respect to the camera), you can call transformPoint (see documentation here) to transform the point from the camera frame into any other frame you have available (e.g. the base_link frame of your robot or any map or world frame you have set up).
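To show what such a frame transformation does numerically, here is a small NumPy sketch that applies a rigid transform by hand. It is not the tf API. The helper names `make_transform` and `transform_point` and the camera pose are assumptions for illustration. With tf, the rotation and translation would come from the transform tree instead of being hard-coded.

```python
import numpy as np

def make_transform(rotation, translation):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a translation."""
    T = np.eye(4)
    T[:3, :3] = np.asarray(rotation, dtype=float)
    T[:3, 3] = np.asarray(translation, dtype=float)
    return T

def transform_point(T, point):
    """Apply a 4x4 homogeneous transform to a 3D point."""
    p = np.append(np.asarray(point, dtype=float), 1.0)
    return T.dot(p)[:3]

# Assumed pose: camera optical frame 2 m above the world origin, looking
# straight down (optical z along world -z, optical y along world -y).
R_world_cam = np.array([[1.0,  0.0,  0.0],
                        [0.0, -1.0,  0.0],
                        [0.0,  0.0, -1.0]])
T_world_cam = make_transform(R_world_cam, [0.0, 0.0, 2.0])

# Target expressed in the camera frame: 2 m along the optical axis,
# slightly off-center.
target_cam = [0.5, 0.2, 2.0]
target_world = transform_point(T_world_cam, target_cam)
print(target_world)
```

In this made-up setup the target ends up on the ground plane z = 0, which is a quick sanity check that the pose and the point are consistent.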



Do I need to use the urdf model? Is it not enough to define a static tf between the camera and IMU? How should I get the x,y,z with respect to the camera? I only have the x,y in the image, and camera_calibration only outputs the intrinsic parameters (from what I can tell).

nwanda (2015-07-21 06:49:06 -0500)

Static tf is fine. I just assumed you'd have a robot... You say you have an estimate of the distance from the camera to the ground, which gives you z. Otherwise, from a monocular camera you cannot tell the distance to an object (unless you know the exact parameters of the object and use them to estimate it).

mgruhler (2015-07-21 06:58:04 -0500)

I couldn't find any tutorial relevant to this question over there. Could you pinpoint exactly which one covers 2D-to-3D coordinate conversion?

Jägermeister (2019-02-08 08:09:38 -0500)


Seen: 4,061 times

Last updated: Jul 21 '15