Revision history [back]

I can't seem to find to find the root cause of the problem. However, I created a face detector ROS package that uses's Haar's cascade and CV Bridge.

This approach works for my use case.

I can't seem to find to find the root cause of the problem. However, I created a face detector ROS package that uses's Haar's cascade and CV Bridge.

This approach works for my use case.

Edit: Solution

I created a node that subscribes to /camera/rgb/image_raw and /camera/depth/image_raw.

The ROI of detected face can be obtained by applying Haar's cascade to Kinect's RGB image channel. Then, the same ROI is used to get the Depth image of detected face. This Depth image is used to compute the 3D point cloud (x, y, z) of the detected face.

RGB image callback:

 1. Convert RGB image message data to cv2 image.
 2. Apply Haar's cascade to the converted cv2 image to obtain the ROI of detected face.

Since now you know ROI of the face, you can basically use this ROI on the Depth image channel.

Depth image callback:

 1. Convert Depth image message data to cv2 image.
 2. Apply the ROI of detected face to the converted cv2 image.
 3. Convert the depth image to 3D point cloud data.

Use this CV Bridge guide to convert raw image message data to cv2 image.

As for converting the depth image to point cloud, I referred to this Conference Paper. For convenience, the multiplier constant M_x and M_y was not given in the paper however I found it some time ago at some forum.

Multiplier of X coordinate = 1.12032
Multiplier of Y coordinate = 0.84024

If successful, you should be getting three coordinates (x, y and z) which is basically the point cloud of detected face.