different encodings of depth image for real & simulated kinect

asked 2018-06-28 06:52:25 -0500

RedJohn
31 ●2 ●4 ●8

Hi all

I've got a Gazebo simulation of a kinect camera, where I'm subscribing to the ROS topic /camera/depth/image_raw, which is of type sensor_msgs/Image. I also have a real Kinect for Xbox 360 camera connected to my PC via USB, where I can also get the /camera/depth/image_raw topic.

Unfortunately, these two messages are not exactly the same:

message from real camera:

header: 
  seq: 0
  stamp: 
    secs: 1527861646
    nsecs: 997300811
  frame_id: "camera_depth_optical_frame"
height: 480
width: 640
encoding: "16UC1"
is_bigendian: 0
step: 1280
data: [0, 0, ... ] # array is 614'400 long, i.e. 2 entries per pixel

message from simulated camera:

header: 
  seq: 0
  stamp: 
    secs: 27
    nsecs: 854000000
  frame_id: "camera_link"
height: 480
width: 640
encoding: "32FC1"
is_bigendian: 0
step: 2560
data: [9, 51, 243, ... ] # array is 1'228’800 long, i.e. 4 entries per pixel

The important difference here is the different encodings used, which results in different length of the data array. I don't know why this is the case, but nevertheless I would like to have the same encoding for both the real and the simulated depth image (I don't really care which one). How can I convert one encoding into the other? Also, is it possible to convert the data array so that it contains height * width entries, where each entry corresponds to the distance for this pixel in mm (or meter)?

I'm using the freenect drivers for the real kinect camera (sudo apt-get install libfreenect-dev, sudo apt-get install ros-indigo-freenect-launch), because the Openni drivers didn't work for me, as I'm on a Ubuntu VM, where Openni has their problems with.

Thanks & Cheers!

edit retag flag offensive close merge delete

Comments

http://wiki.ros.org/cv_bridge/Tutoria...

Try this to change the encoding .

I think , the main think you need to change is the frame_id . Not the encoding .

chrissunny94 ( 2018-06-28 09:14:54 -0500 )edit

Thanks for your advise! Sorry for responding late, I wasn't around for a while. The cv_bridge did work more or less, but I couldn't encode from 32FC1 to 16UC1, as this results in a 480*640 array of only 0's. Vice verca it was ok, but the value ranges differed (~0.5 from simulated vs. ~500 from real)

RedJohn ( 2018-07-09 04:04:42 -0500 )edit

So I simply normalized both arrays to be between 0 and 255 (I will post my code when it's working properly). But why do I need to change the frame_id and not the encoding, and how can this be done? Cheers

RedJohn ( 2018-07-09 04:07:13 -0500 )edit

add a comment

2 Answers

Sort by » oldest newest most voted

answered 2019-04-05 08:45:43 -0500

Andrew Price

66 ●2 ●3

@PeteBlackerThe3rd is right about the units. The package depth_image_proc has functions to convert between the two versions. If the rest of your code is expecting the uint16_t version, you can do something like the following:

if ("16UC1" == depth_msg->encoding)
{
    try
    {
        cv_depth_ptr = cv_bridge::toCvCopy(depth_msg, sensor_msgs::image_encodings::TYPE_16UC1);
    }
    catch (cv_bridge::Exception& e)
    {
        ROS_ERROR("cv_bridge exception: %s", e.what());
        return false;
    }
}
else if ("32FC1" == depth_msg->encoding)
{
    try
    {
        cv_depth_ptr = cv_bridge::toCvCopy(depth_msg, sensor_msgs::image_encodings::TYPE_32FC1);
    }
    catch (cv_bridge::Exception& e)
    {
        ROS_ERROR("cv_bridge exception: %s", e.what());
        return false;
    }

    cv::Mat convertedDepthImg(cv_depth_ptr->image.size(), CV_16UC1);

    const int V = cv_depth_ptr->image.size().height;
    const int U = cv_depth_ptr->image.size().width;

    #pragma omp parallel for
    for (int v = 0; v < V; ++v)
    {
        for (int u = 0; u < U; ++u)
        {
            convertedDepthImg.at<uint16_t>(v, u)
                = depth_image_proc::DepthTraits<uint16_t>::fromMeters(cv_depth_ptr->image.at<float>(v, u));
        }
    }

    cv_depth_ptr->encoding = "16UC1";
    cv_depth_ptr->image = convertedDepthImg;
}

with datatypes:

sensor_msgs::ImageConstPtr depth_msg;
cv_bridge::CvImagePtr cv_depth_ptr;

Although, it might make more sense to convert the mm version to float, since you probably actually care about the depth in meters anyway.

edit flag offensive delete link

add a comment

answered 2018-07-13 00:34:22 -0500

RedJohn
31 ●2 ●4 ●8

I managed it to get it done with cv_bridge, as suggested by chrissunny94. As I mentioned in the comments to my answer, it worked only to change from 16UC1 to 32FC1 encoding, not vice verca. Because the value ranges after the conversion differed greatly (despite being at roughly the same distance in real & in the simulation), I simply normalized both converted image arrays (the simulated & the real one) to be between 0 and 255. If anyone knows a better solution to this (or why they have different value ranges at all) please let me know. Anyway, this is what I've implemented in my Python program:

bridge = CvBridge()
data = None
cv_image = None
while(data is None or cv_image is None):
    try:
        data = rospy.wait_for_message('/camera/depth/image_raw', Image, timeout=5)
        # encoding of simulated depth image: 32FC1, encoding of real depth image: 16UC1
        # converting & normalizing image, so that simulated & real depth image have the same encoding & value range
        # see http://wiki.ros.org/cv_bridge/Tutorials/ConvertingBetweenROSImagesAndOpenCVImagesPython
        cv_image = bridge.imgmsg_to_cv2(data, desired_encoding="32FC1")
    except rospy.exceptions.ROSException:
        print("failed to read depth image from simulated kinect camera")
        print(e)
    except CvBridgeError:
        print("failed to convert depth image into another encoding")
        print(e) 
# while

image = np.array(cv_image, dtype=np.float)
print(type(image)) # <type 'numpy.ndarray'>
print(image.shape) # (480, 640)
# normalizing image
cv2.normalize(image, image, 0, 255, cv2.NORM_MINMAX)
# round to integer values
image = np.round(image).astype(np.uint8)

edit flag offensive delete link

Comments

The different value ranges is due to the different units used by convention. Floating point depth images are in meters while 16 bit integer depth images are in mm. So a factor of 1000 different.

PeteBlackerThe3rd ( 2018-07-13 01:39:44 -0500 )edit

add a comment

different encodings of depth image for real & simulated kinect

Comments

2 Answers

Comments

Question Tools

Stats

Related questions

different encodings of depth image for real & simulated kinect edit

Comments

2 Answers

Comments

Question Tools

Stats

Related questions

different encodings of depth image for real & simulated kinect