# Transformation of poses between camera frame and base frames [closed]

Many variations of this question have been asked before; however, I was not able to find an answer that works for my problem.

I am trying out custom picking of an object using a robotic arm. I have a separate perception module that detects objects and estimates the grasp pose. That pose, however (as expected), is in the camera frame and follows the image-processing coordinate convention, i.e. `right: +x-axis, forward: +z-axis, down: +y-axis`. From this perception module I get two values: a `3x3` rotation matrix and a `1x3` translation vector.

Example `T1`:

```
Tra: [0.09014122  0.16243269  0.6211668 ]
Rot: [[ 0.          0.03210089 -0.99948463]
     [ 0.          0.99948463  0.03210089]
     [ 1.         -0.          0.        ]]
```

(i.e. I have to grasp at that location and in that orientation)
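For concreteness, here is a minimal sketch (assuming NumPy) of how I assemble that output into a single `4x4` homogeneous transform:

```python
import numpy as np

# Perception output in the camera (optical) frame:
# right = +x, down = +y, forward = +z.
R1 = np.array([[0.0, 0.03210089, -0.99948463],
               [0.0, 0.99948463,  0.03210089],
               [1.0, 0.0,         0.0       ]])
t1 = np.array([0.09014122, 0.16243269, 0.6211668])

# Homogeneous 4x4 pose T1 of the grasp, expressed in the camera frame.
T1 = np.eye(4)
T1[:3, :3] = R1
T1[:3, 3] = t1
```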

My robot-to-camera transform is, understandably, in the right-handed coordinate system. Here is an example of the same, say `T2`:

```
translation:
x: 0.0564581200121
y: 0.318823912978
z: 0.452250135698
rotation:
x: -0.6954818376
y: 0.693982204231
z: -0.13156524004
w: 0.13184954074
```
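To work with both poses in matrix form, I convert this translation-plus-quaternion pose into a `4x4` matrix as well. A sketch, using the standard unit-quaternion-to-rotation-matrix formula (quaternion order `x, y, z, w`, as tf reports it):

```python
import numpy as np

def quat_to_matrix(x, y, z, w):
    """Convert a unit quaternion (x, y, z, w) to a 3x3 rotation matrix."""
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - z*w),     2*(x*z + y*w)],
        [2*(x*y + z*w),     1 - 2*(x*x + z*z), 2*(y*z - x*w)],
        [2*(x*z - y*w),     2*(y*z + x*w),     1 - 2*(x*x + y*y)],
    ])

# Pose of depth_camera_optical_frame in base_link, from the values above.
R2 = quat_to_matrix(-0.6954818376, 0.693982204231,
                    -0.13156524004, 0.13184954074)
T2 = np.eye(4)
T2[:3, :3] = R2
T2[:3, 3] = [0.0564581200121, 0.318823912978, 0.452250135698]
```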

Now, getting the pose of the object in the robot frame is a simple transformation: `T2 * T1`. However, `T1` follows a different convention from `T2`.

How would I go about this? A detailed explanation using this example would be highly appreciated! I am trying to understand from scratch, so I would prefer to arrive on my own at a transformation matrix to apply to the ones above to get the final pose.
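To be explicit about the composition I have in mind, here is a sketch with placeholder values (identity rotations, hypothetical translations; the real `T1` and `T2` would be built from the numbers above), assuming both matrices already use the same axis convention:

```python
import numpy as np

# Placeholder 4x4 homogeneous poses; substitute the real T1 and T2.
T1 = np.eye(4)                    # grasp pose in the camera frame
T1[:3, 3] = [0.09, 0.16, 0.62]
T2 = np.eye(4)                    # camera pose in the robot base frame
T2[:3, 3] = [0.06, 0.32, 0.45]

# Compose: pose of the grasp expressed in the robot base frame.
T_base_grasp = T2 @ T1
grasp_position = T_base_grasp[:3, 3]   # where to move the end-effector
grasp_rotation = T_base_grasp[:3, :3]  # how to orient the gripper
```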

Additional note: the robot-to-camera transformation uses `depth_camera_optical_frame` to `base_link`. Should I use this or just `depth_camera_frame`?

I have checked out similar issues like 33091, gazebo-4226, and others. However, all of them give solutions for 3D points, and I could not find one that applies to a full pose transformation matrix.