
Convert Pixel Coordinates (U,V) to Pointcloud2 (X, Y, Z) (Python)

asked 2021-02-24 09:15:33 -0500 by LeeAClift, updated 2021-07-09 04:06:30 -0500

Hi All,

I am currently writing a Python script where I want to take a pixel pair from my depth camera and have it spit out its XYZ coordinates.

I am currently subscribing to two topics: camera/color/image_raw and camera/depth/color/points.

I am quite new to this, and I had assumed that once I got the pixel position from image_raw, I could do something with it to find the matching depth point, giving me the needed XYZ.

It seems that they don't have the same resolution either, the image is 640x480 whereas the pointcloud is 1280x720.

If someone could help me with this, that would be a great help, I'm really struggling to understand how to get this :)

EDIT: With the help of Ranjit, I have found a solution to my issue.

Using image_geometry.PinholeCameraModel() and camera_model.projectPixelTo3dRay((U,V)), then normalising as suggested, works well enough for X and Y (accurate to 3 cm from 1 metre away with an Intel RealSense D435).

From here, instead of using a generator, I put the point cloud directly into a ros_numpy array like this: get_xyz_points(pointcloud2_to_array(data), remove_nans=True, dtype=np.float32).

From here, we simply searched through the array using a cKDTree: tree = cKDTree(points).

Finally, add the Z to the pre-existing XY to get a decent enough result.

EDIT 2: On request, here is the final function created to call the camera, and get the coordinates.

### Callback for converting a pixel XY and depth to XYZ ###
# Requires: numpy as np, scipy.spatial.cKDTree, and
# ros_numpy.point_cloud2's pointcloud2_to_array and get_xyz_points
def xyzCallback(self, data):

    # Get the global pixel coordinates and check assignment
    global intY
    global intX

    # Get the point cloud into a numpy array
    xyz_array = get_xyz_points(pointcloud2_to_array(data), remove_nans=True, dtype=np.float32)

    # Use the current camera model to get a somewhat accurate 3D position for X & Y
    vector = cam_model.projectPixelTo3dRay((intX, intY))

    # Normalise the 3D ray so that its Z component is exactly 1
    ray_z = [el / vector[2] for el in vector]

    # Assign the ray's X and Y to more easily accessible variables
    a = vector[0]
    b = vector[1]

    # Use a cKDTree to find the cloud point whose X, Y are closest to the input ray's
    points = xyz_array[:, 0:2]  # drop the z column
    tree = cKDTree(points)
    idx = tree.query((a, b))[1]  # query returns (distance, index); we want the index
    result = xyz_array[idx, 2]

    # Assign the new Z to the normalised X, Y
    ray_z[2] = result
    positions = ray_z
    print("Final position = ", positions)

Comments

Could you kindly share the complete code here? Thanks in advance :)

thedarknight (2021-07-08 13:05:54 -0500)

Of course, here's my GitHub, with the completed code. Specifically, you will want to look at XYZCallback, starting on line 96.

https://github.com/LeeAClift/gazebo_e...

LeeAClift (2021-07-08 15:58:38 -0500)

I can't seem to find the repository on your profile. Could you please check if it's the right link or have you made the repo private?

thedarknight (2021-07-08 23:37:11 -0500)

Many apologies, the GitHub is private as it contains work that hasn't been published yet. I will attach the function to the original post though.

LeeAClift (2021-07-09 04:05:24 -0500)

Hello @LeeAClift, I'm working on a similar project now and I'm stuck. Are you willing to share your Python code here? Do you run the Python code in one terminal and the MoveIt launch file in another? Do you know how to visualise an x, y, z arrow in the RViz/MoveIt display area? I want it to be like this video (an x, y, z arrow appearing on the detected object): https://www.youtube.com/watch?v=iJWHR...

I'm quite new to this, so sorry to be such a nuisance with all these questions, but I will be really grateful if you could help me!

TapleFlib (2023-04-22 05:18:53 -0500)

1 Answer


answered 2021-02-24 10:09:59 -0500 by Ranjit Kathiriya, updated 2021-02-25 10:54:58 -0500

Hello there,

I think that by following these steps you can achieve the result you're after.

a) First, subscribe to the camera_info, pixel coordinate, and point cloud topics.

b) Then create an image_geometry.PinholeCameraModel() and give it your camera information by calling its fromCameraInfo(__Camera_INFO__DATA) method. Note: this function should be called only once.

c) Call the get_depth function to obtain the distance of that pixel in the point cloud.

def get_depth(self, x, y):
    # Read only the 'z' field of the cloud at pixel (x, y); yields a one-element tuple
    gen = pc2.read_points(self.pc, field_names='z', skip_nans=False, uvs=[(x, y)])
    return next(gen)
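Note that the uvs argument indexes the cloud like an image, so this only makes sense for an organized cloud (one whose width and height match the image); at a pixel where the sensor had no valid return, the z value comes back as NaN. A quick check might look like this (a sketch, assuming self.pc holds the latest PointCloud2):

    depth = self.get_depth(320, 240)  # e.g. the centre pixel of a 640x480 cloud
    print(depth)  # a one-element tuple such as (1.234,), or (nan,) if no valid depth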

d) Call camera_model.projectPixelTo3dRay((U,V)) and you will get the output as a ray vector. Note: Z will be 1, or very close to it.

e) You will then have to normalise the vector so that Z is exactly 1, and multiply the vector's X and Y values by the distance returned by get_depth. A combined sketch follows below.
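Put together, steps b) to e) look roughly like this (a sketch only: u, v, camera_info_msg, and cloud_msg stand in for values your node already has, and pc2 is sensor_msgs.point_cloud2):

    import image_geometry
    from sensor_msgs import point_cloud2 as pc2

    # b) build the model once from a CameraInfo message
    cam_model = image_geometry.PinholeCameraModel()
    cam_model.fromCameraInfo(camera_info_msg)

    # c) depth of the pixel, read from the point cloud
    depth = next(pc2.read_points(cloud_msg, field_names='z', skip_nans=False, uvs=[(u, v)]))

    # d) ray through the pixel; its Z is close to, but not exactly, 1
    vector = cam_model.projectPixelTo3dRay((u, v))

    # e) rescale the ray so Z == 1, then scale it by the measured depth
    ray_z = [el / vector[2] for el in vector]
    position = [el * depth[0] for el in ray_z]  # [X, Y, Z] in the camera frame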

For more information, please refer to this post.

It seems that they don't have the same resolution either, the image is 640x480 whereas the pointcloud is 1280x720.

When you subscribe to the camera_info topic and use image_geometry.PinholeCameraModel().fromCameraInfo(), I think this will handle the different resolutions of the point cloud and colour image for you.

edit flag offensive delete link more

Comments


Thanks for your response, Ranjit; it seems very close to the result I am looking for.

After implementing your suggestion, as seen below, I am stuck on the last part (part e). Parts a and b are done in different functions, but I believe I have done those bits correctly.

### Callback for converting a pixel XY and depth to XYZ ###
def xyzCallback(self, data):
    print(intY)
    print(intX)

    gen = point_cloud2.read_points(data, field_names='z', skip_nans=False, uvs=[(intX, intY)])
    nextGen = next(gen)

    XY = np.array(cam_model.projectPixelTo3dRay((intX, intY)))
    print(XY)
    print(XY * nextGen)

The initial results just for XY print out an X, Y and Z coordinate which don't seem correct; I assume this is because I haven't normalised them. When I then multiply them by next(gen), I get an array of three NaNs.

Have you got any ideas about what I am doing wrong?

LeeAClift (2021-02-25 10:46:18 -0500)

Hello LeeAClift,

XY = np.array(cam_model.projectPixelTo3dRay((intX,intY)))

In this line you are obtaining the ray vector, i.e. where the pixel points in 3D space on the point cloud. The output of this line should be X, Y, Z. Note: your Z will always be 1 or around 1.

But before doing this, did you check that you are getting a distance for that particular pixel from the point cloud using the get_depth() function from point c)? Once you receive the distance, then start finding the vector, which is your XY = np.array(cam_model.projectPixelTo3dRay((intX,intY))).

For your reference, I am attaching a link that may be helpful. If you have any questions, feel free to ask.

Ranjit Kathiriya (2021-02-25 11:02:36 -0500)

Continuing from above:

As I mentioned in the comment above, the Z value will be 1 or near to 1, e.g. 0.98 or some other such number. To make it exactly 1, you first have to normalise by the Z value and then multiply your X and Y by the obtained depth.

vector = self.camera_model.projectPixelTo3dRay((X,Y))
ray_z = [el / vector[2] for el in vector]      # rescale the ray so Z == 1
positions = [el * depth[0] for el in ray_z]    # scale by the measured depth

Here, positions will be the X, Y, Z of the point in the point cloud.

Ranjit Kathiriya (2021-02-25 11:15:40 -0500)

If you want to check whether your output in 3D space (the point cloud) is correct, you can reverse-engineer it by recovering U, V from X, Y, Z.

For that you can use the function camera_model.project3dToPixel((positions[0], positions[1], positions[2])): pass in your X, Y, Z and you will get U, V back, which you can compare with your original U, V. If your code is right, the two should be very similar.
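As a small sketch of that round trip (using the positions list and the original pixel X, Y from the snippet above; camera_model is the same PinholeCameraModel):

    u2, v2 = camera_model.project3dToPixel((positions[0], positions[1], positions[2]))
    print(u2 - X, v2 - Y)  # both differences should be close to zero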

Ranjit Kathiriya (2021-02-25 11:32:08 -0500)

Hi Ranjit, thank you for the continuation.

Just so my understanding is correct: in the code above, vector gives the non-normalised X and Y.

From here, we then normalise the Z with ray_z, and finally combine the now-normalised vector with the depth we got from get_depth?

After running this, I am still finding that gen = point_cloud2.read_points(data, field_names='z', skip_nans=False, uvs=[(intX, intY)]) always gives a NaN, and when combined via positions = [el * depth[0] for el in ray_z], I get an array of three NaNs.

Am I doing something wrong? Does get_depth have to be its own function, rather than just a line of code in my current function?

LeeAClift (2021-02-25 11:41:41 -0500)

vector gives the non-normalised X and Y

Yes, you will get X, Y, Z, but note: Z will be 1 or around 1. If you are getting a Z around 0.89 or 0.98 that is normal, so you have to normalise by Z with ray_z = [el / vector[2] for el in vector]; here vector[2] is your Z value.

Now, to obtain your Z value you need to call gen = point_cloud2.read_points(data, field_names='z', skip_nans=False, uvs=[(intX, intY)]) followed by nextGen = next(gen); it should not be NaN. This gives you the distance from the camera to the U, V pixel (in metres, per ROS convention). I think you are doing something wrong if you are getting NaN values here.

positions = [el * depth[0] for el in ray_z]

By multiplying, you will be able to get the pixel's position in the 3D world (the point cloud).

Ranjit Kathiriya (2021-02-25 11:58:42 -0500)

Thank you once again for your quick responses, Ranjit; it is very much appreciated.

It seems I am doing something wrong here, but what, I do not know. I am quite new to robotic vision as a field, hence all my questions. I seem to get NaN whenever I access my point cloud via point_cloud2.read_points, but if I use get_xyz_points I can see the points.

I know it's a big ask, and you've done so much already, but is there any chance you could provide a complete class or function from the many snippets you've provided? I'm certain I am doing something wrong somewhere, and something like that would probably solve it.

LeeAClift (2021-02-25 14:06:05 -0500)

Hey LeeAClift,

I was giving you code snippets from the link below, but that code is for finding a 3D point with a depth camera and plotting it in a point cloud. In place of the point cloud, you can also use the depth camera to obtain the distance.

Ranjit Kathiriya (2021-02-25 14:58:32 -0500)
