# Transformation of poses between camera frame and base frames [closed]

Many variations of these questions seem to have been asked quite a few times, however, I was not able to find and compute the right answer for my problem.

I am trying out a custom picking of an object using a Robotic Arm. I have a separate perception module that detects objects and estimates the pose for grasping. It, however (as expected) is in the camera frame and follows image processing convention of coordinate frames, i.e. with right: +x-axis, forward: +z-axis, down: +y-axis. From this perception module, I get two values - 3x3 Rotation matrix and 1x3 translation vector.

Example T1 :

Tra: [0.09014122 0.16243269 0.6211668 ]
Rot: [[ 0.          0.03210089 -0.99948463]
[ 0.          0.99948463  0.03210089]
[ 1.         -0.          0.        ]]


(i.e. I have to grasp at that location and in that orientation)

My robot to camera transform is understandably in the right hand coordinate system. Here is an example of the same, say T2:

translation:
x: 0.0564581200121
y: 0.318823912978
z: 0.452250135698
rotation:
x: -0.6954818376
y: 0.693982204231
z: -0.13156524004
w: 0.13184954074


Now, to get the pose of the object from the robot, is a simple transformation T2 times T1. However, T1 follows a different convention from T2.

How would I go about this ? A detailed explanation using this example will be highly appreciated! I am trying to understand from scratch, hence I would prefer to arrive at a transformation matrix on my own to apply to the above ones to get the final pose

Additional note: The robot to camera transformation is using the depth_camera_optical_frame to base_link. Should I use this or just depth_camera_frame?

I have checked out similar issues like 33091, gazebo-4226, and other ones. However, all of them seem to give solutions for 3D points, and I could not find any that is applicable for the pose transformation matrix.

edit retag reopen merge delete

### Closed for the following reason the question is answered, right answer was accepted by Shrutheesh R Iyer close date 2020-05-21 11:56:03.739239

Sort by ยป oldest newest most voted

This is just a matter of unifying the representations of T1 and T2. Quaternions are the preferred orientation in ROS, and the TF C++ and Python APIs offer methods to convert between them and rotation matrices. But you shouldn't bother multiplying transformation matrices directly in ROS. TF was made to take care of this for you.

The easiest way to get the position of your object should be:

1) Create a PoseStamped object called p.

2) Set the frame_id of its header to depth_camera_optical_frame (the frame that your object is detected in)

3) Set p.pose.position to the translation vector you posted, and p.pose.orientation to the quaternion that you obtain from the rotation matrix.

4) To obtain the quaternion in C++, set the rotation of a Transform with this function and this matrix type, then get the quaternion from this function. In Python, simply use quaternion_from_matrix. Don't forget that tf and geometry_msgs quaternions are not the same.

5) Use the transformPose function of the TransformListener (Python API) to transform p to the frame base_link.

more