I would normally request that you update the question with your config and a sample input message from each sensor, but in this case I can probably answer it anyway.

Just so we're clear: the package outputs the pose of the camera, and the camera's transform is relative to base_link? This is, sadly, not a use case that is well-supported by robot_localization.
The state estimation nodes in r_l always transform velocity and acceleration data into the base_link_frame frame, and always transform pose data into the world_frame (typically _map_ or _odom_) frame.

Let's say you get a pose message in the _zed_optical_frame_, and that pose says we are at (10, 0). That will be interpreted as a pose that is 10 meters _in front of_ your robot's camera, so whichever way your robot is facing, we'll try to fuse a pose that is 10 meters in front of it.
What we really need to be doing is transforming the _base_link_->_zed_optical_frame_ transform into the world frame, then applying that transformed offset to the camera's pose data. I suppose any time we're fusing pose data and the required transform "crosses" the world_frame->base_link_frame transform, we should treat the data differently. But I don't know if that will get implemented before fuse, the spiritual successor to r_l, is done.
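For concreteness, here is the bare math of that correction as a small numpy sketch. This is my own illustration rather than anything r_l does internally; the matrix names are made up, and it assumes the ORB-SLAM2 pose measurement includes orientation:

```python
import numpy as np

def base_pose_in_world(T_world_cam, T_base_cam):
    """Recover the base_link pose in the world frame from the measured camera
    pose (T_world_cam) and the static base_link -> zed_optical_frame
    transform (T_base_cam), both given as 4x4 homogeneous matrices."""
    # T_world_base = T_world_cam * T_cam_base = T_world_cam * inv(T_base_cam)
    return np.dot(T_world_cam, np.linalg.inv(T_base_cam))

# Example: camera mounted 0.2 m ahead of base_link with no rotation, and
# ORB-SLAM2 reports the camera at x = 10. The base_link pose comes out at x = 9.8.
T_base_cam = np.eye(4); T_base_cam[0, 3] = 0.2
T_world_cam = np.eye(4); T_world_cam[0, 3] = 10.0
print(base_pose_in_world(T_world_cam, T_base_cam))
```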
In the meantime, you have a few options:
- In the camera driver, obtain the _base_link_ -> _zed_optical_frame_ transform (which is probably static), and store that. Call it _T_. Every time you generate a measurement from the ORB-SLAM2 package, rotate _T_ by the robot's current orientation, and then apply the inverse of that transform to the camera's pose data. Then change the frame_id to whatever your EKF world_frame is, and publish. (See the sketch after this list.)
- Use differential mode, but then the filter will assume that data is in the base_link_frame frame. Not great, I know.
- If ORB_SLAM2 produces velocity data, fuse that instead. The covariance in your state estimate will grow without bound, though.
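If it helps, here is a rough sketch of the first option as a small ROS 1 node in Python. The topic name, frame names, message types, and the placeholder covariance are all assumptions you'd adjust for your setup, and rather than literally rotating _T_ by the current orientation estimate, it composes full 4x4 homogeneous transforms, which achieves the same correction when the measured pose includes orientation:

```python
#!/usr/bin/env python
# Sketch: republish the ORB-SLAM2 camera pose as a base_link pose in the EKF world_frame.
import numpy as np
import rospy
import tf2_ros
from geometry_msgs.msg import PoseStamped, PoseWithCovarianceStamped
from tf.transformations import quaternion_matrix, quaternion_from_matrix

WORLD_FRAME = 'odom'                 # whatever your EKF world_frame is
BASE_FRAME = 'base_link'
CAMERA_FRAME = 'zed_optical_frame'

def transform_to_matrix(t):
    """TransformStamped -> 4x4 homogeneous matrix."""
    q = t.transform.rotation
    m = quaternion_matrix([q.x, q.y, q.z, q.w])
    m[0:3, 3] = [t.transform.translation.x,
                 t.transform.translation.y,
                 t.transform.translation.z]
    return m

def pose_to_matrix(p):
    """Pose -> 4x4 homogeneous matrix."""
    q = p.orientation
    m = quaternion_matrix([q.x, q.y, q.z, q.w])
    m[0:3, 3] = [p.position.x, p.position.y, p.position.z]
    return m

class CameraPoseRepublisher(object):
    def __init__(self):
        self.buffer = tf2_ros.Buffer()
        self.listener = tf2_ros.TransformListener(self.buffer)
        self.pub = rospy.Publisher('camera_pose_base', PoseWithCovarianceStamped,
                                   queue_size=10)
        # '/orb_slam2/pose' is a placeholder topic name.
        self.sub = rospy.Subscriber('/orb_slam2/pose', PoseStamped, self.callback)

    def callback(self, msg):
        try:
            # Static base_link -> zed_optical_frame transform, i.e. T in the text.
            t = self.buffer.lookup_transform(BASE_FRAME, CAMERA_FRAME, rospy.Time(0))
        except (tf2_ros.LookupException, tf2_ros.ConnectivityException,
                tf2_ros.ExtrapolationException):
            return
        T_base_cam = transform_to_matrix(t)
        T_world_cam = pose_to_matrix(msg.pose)
        # Pose of base_link in the world frame:
        # T_world_base = T_world_cam * inv(T_base_cam)
        T_world_base = np.dot(T_world_cam, np.linalg.inv(T_base_cam))

        out = PoseWithCovarianceStamped()
        out.header.stamp = msg.header.stamp
        out.header.frame_id = WORLD_FRAME   # matches the EKF world_frame
        out.pose.pose.position.x = T_world_base[0, 3]
        out.pose.pose.position.y = T_world_base[1, 3]
        out.pose.pose.position.z = T_world_base[2, 3]
        qx, qy, qz, qw = quaternion_from_matrix(T_world_base)
        out.pose.pose.orientation.x = qx
        out.pose.pose.orientation.y = qy
        out.pose.pose.orientation.z = qz
        out.pose.pose.orientation.w = qw
        # Placeholder covariance; use something meaningful for your sensor.
        out.pose.covariance = list(np.diag([0.1] * 6).flatten())
        self.pub.publish(out)

if __name__ == '__main__':
    rospy.init_node('camera_pose_republisher')
    CameraPoseRepublisher()
    rospy.spin()
```

You would then point a poseN input of your EKF at the republished topic instead of the raw ORB-SLAM2 pose.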