ROS Resources: Documentation | Support | Discussion Forum | Index | Service Status | ros @ Robotics Stack Exchange
Ask Your Question
0

Projecting 3D points into pixel using image_geometry::Pinholecameramodel

asked 2022-01-09 17:00:30 -0500

KhalidOwlWalid gravatar image

updated 2022-01-13 04:50:24 -0500

Currently, I have this set of problems where I am trying to simulate bounding boxes in the simulation, and the general idea that I have drawn out is basically:

  1. Subscribe to ground truth which helps me to detect the object I am trying to find its bounding boxes inside gazebo
  2. Currently, the position of the object as subscribed from the ground truth is with respect to the base_link. So, I want to find the transform between the base_link and the camera_link, which I could simply use a lookupTransform to find its translation.
  3. Then, I would have to do some maths so that the position of the object is now with respect to the camera_link
  4. From here, I am able to determine the object's 3d position with respect to the camera (camera is now the origin)
  5. I could then project the 3d coordinates into pixel simply by using image_geometry::PinholeCameraModel::project3dToPixel

However, the problem that I am facing is that the projection from 3d coordinates into pixel coordinate is way off from the expected position (as in it is not in the frame of my camera altho the cone is right there). I cant seem to find where I did it wrongly. Here is a really simple code for me to test in my simulation

import rclpy
import time
import cv2

from cv_bridge import CvBridge, CvBridgeError
from rclpy.qos import qos_profile_sensor_data

from rclpy.node import Node
from sensor_msgs.msg import Image, CameraInfo
from image_geometry import PinholeCameraModel

class BoundingBox(Node):

     def __init__(self):
        super().__init__('boundary_box_node')

        self.model = PinholeCameraModel()

        # self.cone_position = (6.029119415283203, 2.77589750289917, 0.76)
        self.cone_position = (12.79, 3.012697458267212, 0.76)
        # self.cone_position = (0,0,1)
        self.need_info = True
        self.cam_info = self.create_subscription(CameraInfo, "zed/right/camera_info", self.info_callback, 
                              qos_profile_sensor_data)


        # while self.need_info:
        #      time.sleep(0.01)

        self.get_logger().info(f'Got info from right!')

        self.bridge = CvBridge()
        self.sub_image = self.create_subscription(Image, 'zed/right/image_rect_color', self.image_callback, 
                                  qos_profile_sensor_data)
        # self.pub_image = self.create_publisher('/image_out', Image, 10)

    def info_callback(self, info):

        if self.need_info:
            self.get_logger().info(f'Inside info callback')
            self.model.fromCameraInfo(info)
            self.need_info = False

    def image_callback(self, image):

        translated_cone_pixel = []

        u, v = self.model.project3dToPixel(self.cone_position)

        trannslated_cone_pixel.append((int(u), int(v)))

        try:
            cv_image = self.bridge.imgmsg_to_cv2(image, desired_encoding="bgr8")
            cv2.circle(cv_image, translated_cone_pixel[0], 10, (255,0,0))
           #cv2.rectangle(cv_image, corners[0],corners[1],(0,255,0),3)

            cv2.imshow('test', cv_image)
            cv2.waitKey(1)

        except CvBridgeError as e:
            print(e)


def main(args=None):
    rclpy.init(args=args)

    bounding_box = BoundingBox()

    try:
        while rclpy.ok():
            rclpy.spin_once(bounding_box)

    except KeyboardInterrupt:
        pass

    bounding_box.destroy_node()
    rclpy.shutdown()


if __name__ == '__main__':
     main()

In this code, I have extracted the cone's position myself and transform it manually, and I was just testing out only for the 3d projection part!

Note: For confidential reason, I had to omit my model in the first image, and change it to a cube.

The position of the cube inside gazebo is (-12.3647,10.2,0) and the cone ... (more)

edit retag flag offensive close merge delete

Comments

The values of self.cone_position seem rather suspect.. (12.79, 3.012697458267212, 0.76) means the point is 12.8 meters to the east and 3 meters to the south of the camera_optical_frame, is what I understand - in which case the point is way outside the view frustum of the camera.

JunTuck gravatar image JunTuck  ( 2022-01-10 06:55:41 -0500 )edit

Hi, I have editted my question and included some diagrams to make it easier as to why I obtain those coordinates.

KhalidOwlWalid gravatar image KhalidOwlWalid  ( 2022-01-10 07:50:40 -0500 )edit

Thanks for the diagrams, they're very helpful.

In step 2, you wrote " So, I want to find the transform between the base_link and the camera_link ...". I believe the correct frame is camera_color_optical_frame, not camera_link. These are not the same.

In any case, you can obtain the correct frame by looking the CameraInfo message you subscribed to. Do $rostopic echo zed/right/camera_info; what is the frame_id of the header?

JunTuck gravatar image JunTuck  ( 2022-01-11 14:13:36 -0500 )edit

Here is a part of the echo:

    header:
  stamp:
    sec: 23
    nanosec: 176000000
  frame_id: zed_left_camera_optical_frame
height: 720
width: 1280
distortion_model: plumb_bob

So the frame id is zed_left_camera_optical_frame

Thank you for pointing that out. The translation is still the same and testing it out still gives me a really odd pixel coordinate value.

KhalidOwlWalid gravatar image KhalidOwlWalid  ( 2022-01-12 14:46:40 -0500 )edit

I've edited and added a simple theory that I had in mind. Would be grateul if you could give some insights

KhalidOwlWalid gravatar image KhalidOwlWalid  ( 2022-01-12 15:08:03 -0500 )edit

So zed_left_camera_optical_frame is the correct frame, but I believe your math is still wrong. I'm guessing self.cone_position = (-3.012697458267212, 0.76, 12.79) gives you the correct result.

There is a simple sanity check. Can you run RVIZ and display TF? Specifically, observe zed_camera_link and zed_left_camera_optical_frame.

JunTuck gravatar image JunTuck  ( 2022-01-12 19:58:17 -0500 )edit

Yes, you are absolutely correct about the cone_position. May I ask why is that specifically? I've edited my question to include the tf diagram

KhalidOwlWalid gravatar image KhalidOwlWalid  ( 2022-01-13 04:46:22 -0500 )edit

Ahh, I see. is it because the axis use for opencv is slightly different hence the change in axis as shown here camera info pipeline`

KhalidOwlWalid gravatar image KhalidOwlWalid  ( 2022-01-13 04:54:42 -0500 )edit

1 Answer

Sort by ยป oldest newest most voted
1

answered 2022-01-12 18:27:37 -0500

KhalidOwlWalid gravatar image

updated 2022-01-13 04:44:23 -0500

Actually, I manage to solve this issue. For some reason,

u, v = self.model.project3dToPixel(self.cone_position)

actually takes in (y,z,x) format instead of (x,y,z).

Now, I am able to get the right position of the pixel coordinates. I think I might have made a mistake when calculating the cone's coordinate (though I doubt it since visualizing it in gazebo confirms that I made the right calculation.

edit flag offensive delete link more

Comments

No, this is incorrect. The function absolutely takes (x,y,z) as per its docstring.

JunTuck gravatar image JunTuck  ( 2022-01-12 19:47:39 -0500 )edit
1

I now realises that the axis in gazebo and axes provided here is different hence the "odd" coordinate I am using CameraInfo

KhalidOwlWalid gravatar image KhalidOwlWalid  ( 2022-01-13 07:40:11 -0500 )edit

Question Tools

1 follower

Stats

Asked: 2022-01-09 17:00:30 -0500

Seen: 895 times

Last updated: Jan 13 '22