
Question on ROS frame convention

asked 2022-12-13 05:48:02 -0500

pointsnadpixels

Hello, I keep getting confused by this every once in a while, and I can't find a definitive resource for the convention. The frame_id is documented as: #Frame this data is associated with

To represent the pose of an object in the map, I'm not sure what convention is used, i.e. whether the data should be in the frame of the object (camera, robot, etc.) or in the frame of the map. In CV and SLAM literature, the 'pose' represented by a rotation (R) and a translation (t) is usually in the frame of the camera,
i.e. R contains the coordinates of the bases of the map frame in the camera frame, and t is the translation in the camera frame, or X_camera = R*X_map + t, where X_map is a point in the map frame and X_camera is the same point in the camera frame.

From what I understand, it seems like the frame_id should be Camera for orientation R (as a quaternion) and position t, and map for orientation inv(R) (as a quaternion) and position -inv(R)*t, but I've lost count of the number of times I've gotten confused by this. I would appreciate it if you could point me to some documentation on this convention.
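For readers following along, the relation X_camera = R*X_map + t can be inverted numerically to get the camera's position and orientation in map coordinates. The following is only a minimal pure-Python sketch. The helper names (transpose, matvec, invert_extrinsics) and the example values of R and t are made up for illustration and do not come from any ROS library.

```python
import math


def transpose(R):
    """Transpose a 3x3 matrix given as a list of rows."""
    return [list(col) for col in zip(*R)]


def matvec(R, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(r * x for r, x in zip(row, v)) for row in R]


def invert_extrinsics(R, t):
    """Given X_camera = R * X_map + t, return (R_inv, t_inv)
    such that X_map = R_inv * X_camera + t_inv."""
    R_inv = transpose(R)                    # inverse of a rotation is its transpose
    t_inv = [-x for x in matvec(R_inv, t)]  # -inv(R) * t
    return R_inv, t_inv


# Hypothetical extrinsics: 90 degree rotation about z plus a translation.
c, s = math.cos(math.pi / 2), math.sin(math.pi / 2)
R = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
t = [1.0, 2.0, 3.0]

R_cam_in_map, cam_position_in_map = invert_extrinsics(R, t)

# Sanity check: the camera origin, expressed in map coordinates, must map
# back to (0, 0, 0) in the camera frame under X_camera = R * X_map + t.
origin_check = [a + b for a, b in zip(matvec(R, cam_position_in_map), t)]
print(cam_position_in_map, origin_check)
```

Here cam_position_in_map is the -inv(R)*t term from the question, and R_cam_in_map is inv(R) before conversion to a quaternion.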


2 Answers


answered 2022-12-20 05:00:11 -0500

pointsnadpixels

updated 2022-12-20 05:00:46 -0500

The documentation for the transform and pose conventions I was looking for can be found here: http://wiki.ros.org/tf/Overview/Trans...
tl;dr:

To represent the pose of an object in the map, I'm not sure what convention is used, i.e. whether the data should be in the frame of the object (camera, robot, etc.) or in the frame of the map.

For an object whose pose has frame_id map, the data should be expressed in the coordinates of the map. This is the opposite of the CV/SLAM convention, where the pose of a camera gives the coordinates of the map in the frame of the camera.
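To make the tl;dr concrete, here is a minimal pure-Python sketch. It assumes the usual ROS reading in which the orientation quaternion (x, y, z, w) rotates vectors from the object frame into the frame named by frame_id, and the position is the object's origin in that frame. The dictionary layout and the helper names quat_rotate and object_point_to_parent are hypothetical; they only mimic the fields of a stamped pose and are not ROS APIs.

```python
import math


def quat_rotate(q, v):
    """Rotate 3-vector v by the unit quaternion q = (x, y, z, w)."""
    x, y, z, w = q
    # t = 2 * cross(q_vec, v)
    tx = 2.0 * (y * v[2] - z * v[1])
    ty = 2.0 * (z * v[0] - x * v[2])
    tz = 2.0 * (x * v[1] - y * v[0])
    # v' = v + w * t + cross(q_vec, t)
    return [
        v[0] + w * tx + (y * tz - z * ty),
        v[1] + w * ty + (z * tx - x * tz),
        v[2] + w * tz + (x * ty - y * tx),
    ]


def object_point_to_parent(pose, point_in_object):
    """Express a point given in the object's own frame in pose['frame_id']."""
    rotated = quat_rotate(pose["orientation"], point_in_object)
    return [r + p for r, p in zip(rotated, pose["position"])]


# Hypothetical pose: object sits at (2, 0, 0) in 'map', yawed 90 degrees about z.
half = math.pi / 4
pose = {
    "frame_id": "map",
    "position": [2.0, 0.0, 0.0],
    "orientation": (0.0, 0.0, math.sin(half), math.cos(half)),  # (x, y, z, w)
}

# The object's origin lands exactly on the pose position ...
origin_in_map = object_point_to_parent(pose, [0.0, 0.0, 0.0])
# ... and a point 1 m along the object's x axis ends up 1 m along map's y axis.
point_in_map = object_point_to_parent(pose, [1.0, 0.0, 0.0])
print(origin_in_map, point_in_map)
```

Under this reading, the CV-style extrinsics (R, t) of the same object would be the inverse of the transform encoded by the pose.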


answered 2022-12-15 08:37:48 -0500

Per Edwardsson

The map frame is special. It is a frame that should be constant in time regardless of sensors, fixed to the world. Any frame T that has a transform map->T should be something that is equally fixed to the world. Off the top of my head, the starting location of the robot (i.e. the odom frame) is one such thing. Another might be a navigation goal.

A detected object which sources information from the camera is not one such frame. It is clearly reliant on where the camera is, and so it should be specified in the camera frame. Any node which needs the object can get the position in any frame which has a connection to the object anyway, thanks to the power of the transform tree.

You can read more about the frame convention here https://www.ros.org/reps/rep-0105.htm...
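The idea that a pose given in the camera frame can be re-expressed in any connected frame comes down to chaining transforms. The sketch below shows only the arithmetic in pure Python. In practice the transform library does this for you, and the names compose, T_map_camera, and T_camera_object, along with the numeric values, are assumptions made for illustration.

```python
def matmul(A, B):
    """Multiply two 3x3 matrices given as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def matvec(R, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(r * x for r, x in zip(row, v)) for row in R]


def compose(T_ab, T_bc):
    """T_ab takes coordinates in frame b to frame a, T_bc takes c to b.
    The result takes coordinates in frame c to frame a."""
    R_ab, t_ab = T_ab
    R_bc, t_bc = T_bc
    R_ac = matmul(R_ab, R_bc)
    t_ac = [x + y for x, y in zip(matvec(R_ab, t_bc), t_ab)]
    return R_ac, t_ac


# Hypothetical camera pose in map: at (1, 0, 0), yawed 90 degrees about z.
T_map_camera = ([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],
                [1.0, 0.0, 0.0])
# Hypothetical detection: object 2 m along the camera's x axis, no rotation.
T_camera_object = ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
                   [2.0, 0.0, 0.0])

# Object pose expressed in the map frame.
R_map_object, t_map_object = compose(T_map_camera, T_camera_object)
print(t_map_object)
```

The detection itself stays in the camera frame. Only the consumer that needs map coordinates performs the composition.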


Comments

Hello, I'm not sure if my question was confusing, but this is not what I meant. The question was more about the convention used for representing the data 'in a frame'. I used 'Camera' and 'map' in my question in the context of SLAM, but it could refer to the pose of any object in any frame.

pointsnadpixels (2022-12-15 09:58:47 -0500)

Perhaps another example would be useful as an edit to the question, because I understood your question in nearly the same way (a frame and the data associated with it).

ljaniec (2022-12-15 11:43:48 -0500)

Hmm. Do you understand what a frame is in this context? I think of them as origins for a coordinate system. The camera frame, then, is a coordinate system based on the camera. A transform is a matrix that moves frame A to frame B via matrix multiplication. The location of the object in that frame is well defined as a Pose (http://docs.ros.org/en/noetic/api/geo...), i.e. a 3D point and a 4D quaternion, and always in relation to the frame. You typically don't need to consider other representations unless you need something specific, since tf2_ros takes care of transforming between them.

Per Edwardsson (2022-12-16 03:14:08 -0500)

"The location of the object in that frame is well defined as a Pose (http://docs.ros.org/en/noetic/api/geo...), ie a 3D point and a 4D quaternion, and always in relation to the frame." This 'relation' depends on convention, which is what my question is about. In a pose message, when frame_id is 'map', is the pose then the coordinates of the 'object' in the 'map', or vice versa (the coordinates of the bases of 'map' in the frame of the object)? In SLAM, 3D reconstruction, etc. literature, a pose wrt a map is the latter.

EDIT: To clarify, the 'vice versa' is the coordinates of the map/world/whatever frame in the frame of the 'camera/object/..'

pointsnadpixels (2022-12-16 04:25:10 -0500)

@pointsnadpixels

In SLAM, 3D reconstruction,etc. literature a pose wrt a map is the latter.

This statement shows a deep misunderstanding of the calculation being done. A Pose can be represented by a 4x4 matrix, and the object T used to transform PoseA from frameA to frameB can also be represented by a 4x4 matrix. This transform matrix is NOT a Pose. And if we take the inverse of the transform matrix, it's still not a Pose.

Mike Scheutzow (2022-12-17 09:09:35 -0500)

How so? A pose can very much be seen as a transform. When computing the projection matrix for a camera, the extrinsic parameters (R, t) are quite literally the pose of the camera wrt a world, where R in this case is the coordinates of the bases of the world frame in the camera frame, and t is the position of the origin of the world frame in the camera coordinate system,
i.e. X_c = R*X_w + t, where X_w is a point in the world frame and X_c is the same point in the camera frame. There is plenty of literature that you can very easily look up.

pointsnadpixels (2022-12-17 10:35:07 -0500)

Hello @Mike Scheutzow, can you please expand on why you think this interpretation is a misunderstanding?

pointsnadpixels (2022-12-19 03:55:36 -0500)

@pointsnadpixels, wrt your earlier comment, if a message of type PoseStamped has a header with frame_id of 'map', then the pose described in that message is given in terms of the 'map' frame. The nomenclature is maybe a little unfortunate here: in ROS, a frame is essentially a coordinate system, defined in the world (through a 3D point and a 4D quaternion). A transform takes you from one frame to another, such that B = XA, where A and B are frames, and X is a transform. Poses, here, are a type of message, linked above. It is very possible to transfer the information contained in a frame (3D point + quaternion) into another medium. Typically, the camera 3x4 projection matrix is not used. They both contain the same information, but saying that a transform is a pose is going to be confusing in this context.

Per Edwardsson (2022-12-20 02:27:48 -0500)
