Robotics StackExchange | Archived questions

No tf received inside docker

Hello,

I am trying to run Apollo 3.0 with the CARLA simulator through a ROS bridge.

Currently I am stuck on the following issue:

In order to run the perception module, Apollo needs to receive certain tfs. These tfs are published by static transform publishers. However, the Apollo perception module throws the following error:

E0908 23:30:11.273775  6725 transform_input.cc:44] Cannot transform frame:  novatel to frame velodyne64 , err: . canTransform returned after 10 timeout was 10.. Frames: Frame velodyne16 exists with parent novatel.
Frame radar exists with parent short_camera.
Frame short_camera exists with parent velodyne64.
Frame radar_front exists with parent velodyne64.
Frame velodyne64 exists with parent novatel.
Frame long_camera exists with parent short_camera.
Frame localization exists with parent world.

I have had a closer look at this issue and saw that, while the static transform publisher nodes and the tf topics are visible (with rosnode info ... and rostopic info ...) inside the docker environment, no tf data is received. I tried to run tf view_frames, tf_echo and tf_monitor inside and outside the docker. Outside the docker all the tf data was received correctly, while inside the docker no tf data was received.

I am starting the docker environment like so:

${DOCKER_CMD} run -it \
        -d \
        --privileged \
        --name apollo_dev \
        ${MAP_VOLUME_CONF} \
        --volumes-from ${LOCALIZATION_VOLUME} \
        --volumes-from ${YOLO3D_VOLUME} \
        -e ROS_MASTER_URI=http://172.17.0.1:11311 \
        -e ROS_IP=172.17.0.1 \
        -e DISPLAY=$display \
        -e DOCKER_USER=$USER \
        -e USER=$USER \
        -e DOCKER_USER_ID=$USER_ID \
        -e DOCKER_GRP="$GRP" \
        -e DOCKER_GRP_ID=$GRP_ID \
        -e DOCKER_IMG=$IMG \
        ${EXTRA_VOLUMES} \
        $(local_volumes) \
        --net host \
        -w /apollo \
        --add-host in_dev_docker:127.0.0.1 \
        --add-host ${LOCAL_HOST}:127.0.0.1 \
        --hostname in_dev_docker \
        --shm-size 2G \
        --pid=host \
        -v /dev/null:/dev/raw1394 \
        $IMG \
        /bin/bash

At this point I really have no idea why I can't receive any tf data inside the docker, while being able to see the publishing nodes and the topics. Does anybody have an idea what the source of this issue is?

If you need any additional information, just let me know.

Thanks in advance!

Asked by udeto on 2019-09-08 16:57:58 UTC

Comments

Can you ping the host from inside the docker container by name? If not, that could be the problem (ie: nodes running on host report unresolvable hostname to nodes-in-container -> no traffic).

Asked by gvdhoorn on 2019-09-09 02:09:58 UTC

What exactly do you mean by "ping the host by name"? Do you mean the roscore?

Asked by udeto on 2019-09-09 05:09:10 UTC

"the host" == the machine running docker.

You set ROS_MASTER_URI=http://172.17.0.1:11311, which will work for the nodes inside the container, as the master will probably bind on all IPs or Docker routing will take care of reaching the host's IP from within the container.

But nodes running outside the container will receive connection requests from nodes inside the container, and they may be returning a hostname that nodes inside the container cannot resolve.

By trying to ping the "host machine" (ie: the one running docker), you could get an indication for whether DNS is working (sufficiently) for nodes inside your container to be able to resolve the hostname that nodes outside the container may be returning.

ROS nodes connect to each other directly, not through the master. So nodes must be able to resolve each other's hostnames or only use IP addresses.

You're using --net host, so it may be that this doesn't matter, but I'd check it anyway.

Asked by gvdhoorn on 2019-09-09 05:13:53 UTC
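
The name-resolution check suggested in the comment above could be scripted roughly as follows. This is only a sketch: can_resolve is a hypothetical helper name, and it assumes getent is available (as it is on typical glibc-based images). It tests whether a name resolves at all, not whether the resulting address is reachable, so a ping is still worthwhile afterwards.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the given hostname resolves through
# the system resolver (/etc/hosts, DNS, ...), fails otherwise.
can_resolve() {
    getent hosts "$1" > /dev/null 2>&1
}

# Check every name passed on the command line and report the result.
for name in "$@"; do
    if can_resolve "$name"; then
        echo "OK    $name"
    else
        echo "FAIL  $name"
    fi
done
```

It would be run once inside the container with the hostname of the machine running docker as argument, and once on that machine with the container's hostname (in_dev_docker in the run command shown in the question).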

OK, I understand, thank you! I checked and I am able to ping the machine running the docker environment by its hostname from inside the docker. So I figure the nodes should be able to resolve each other's hostnames, right?

Asked by udeto on 2019-09-09 05:31:26 UTC

There is a good chance it should work, yes, but I've seen stranger things.

You could test whether setting ROS_IP outside your Docker container makes any difference. Set it to the IP of the PC running docker.

Docker containers can essentially be considered "other hosts" when it comes to networking. So all the problems with DNS, routes and discoverability and communication that come up with multiple hosts can also affect Docker containers.

Running --privileged and with --net=host makes things somewhat easier, but there's still enough that may not work.

Asked by gvdhoorn on 2019-09-09 05:34:48 UTC

I tried setting the ROS_IP to the IP address I got when I ran hostname -I outside of the docker. But then I wasn't able to contact any nodes inside the docker container (i.e. nodes were not visible when running rqt_graph outside the docker container). Therefore I changed the ROS_MASTER_URI as well and tried to run a roscore outside of the docker, but that didn't work either, as no ROS-related commands inside the docker produced any output.

So I have now changed both back to the IP address of the docker environment. I additionally checked whether I am able to ping the docker environment from outside the docker using its hostname. It turns out that while I am able to ping the host running the docker from inside the docker, I am not able to ping the docker environment from outside the docker. Is that how it should be, or might that be the source of the issue?

Asked by udeto on 2019-09-09 11:07:48 UTC

I tried setting the ROS_IP to the IP address I got,

Just to make sure: you're not setting the -e ROS_IP of your container to the IP of the host, are you?

That would not be what I meant. If you did, then set ROS_IP in the environment of the host (so not the container) to the IP of the host. Leave the ROS_IP of the Docker container as you already show it.

Asked by gvdhoorn on 2019-09-09 11:42:30 UTC

Well, yes I did do that :D

Now I tried to set ROS_IP of the environment to the host IP address with the command:

export ROS_IP=192.168.178.27

Is that what you meant?

Asked by udeto on 2019-09-09 12:19:37 UTC

If 192.168.178.27 is the IP of the machine running docker, then yes, that is what I meant.

Be sure to set it in all shells that you start your container from (or add it to your .bashrc) and in all shells that start ROS nodes (when not starting them from Docker containers).

But again: this is only a test. It could be the actual cause is something completely different.

Asked by gvdhoorn on 2019-09-09 12:25:16 UTC

I tested setting the ROS_IP to the host's IP, but that didn't change anything.

Asked by udeto on 2019-09-10 16:43:00 UTC

Next I tried to publish the tf data with a static_transform_publisher node started from a launch file inside the docker. For example:

<node pkg="tf"  type="static_transform_publisher" name="novatel_to_velodyne64" args="0 0 0 0 0 0 /novatel /velodyne64 10" />

After launching the node I can see the tf by running rosrun tf view_frames. However, when I run rosrun tf tf_echo /novatel /velodyne64 inside the docker, nothing happens: there is no error of any kind and no data. If I run tf_echo outside the docker I receive the published tf data.

So to me that means, that even if I publish the tf inside the docker I am not able to read that published data, even though I am able to see the tf connection. I really do not understand what's going on here. I thought I was dealing with a communication issue between the docker and the host machine, but if that were the case, everything should work when I publish and subscribe inside the docker, especially as the roscore is running there as well.

Asked by udeto on 2019-09-10 16:43:54 UTC

Let's take a step back: can you subscribe to any topics in a docker container and receive messages (published on the outside)?

Have you tried to run rostopic pub outside the container and rostopic echo inside of one and then receive messages? And the other way around?

It's best to approach this sort of thing step-by-step instead of 'randomly' focusing on one specific aspect or node (in this case: treating TF as if it's special).

Asked by gvdhoorn on 2019-09-11 02:20:40 UTC
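
The step-by-step test described in the comment above might look like the sketch below. The topic name /docker_tf_debug and the function names are made up for illustration. It assumes a sourced ROS 1 environment on both sides (rostopic on the PATH) and the coreutils timeout command.

```shell
#!/usr/bin/env bash
# Hypothetical topic name, used only for this connectivity test.
TEST_TOPIC="/docker_tf_debug"

# Publishing side: send a std_msgs/String message once per second
# until interrupted with Ctrl-C.
publish_test() {
    rostopic pub -r 1 "$TEST_TOPIC" std_msgs/String "data: 'hello'"
}

# Receiving side: wait up to 10 seconds for a single message.
receive_test() {
    if timeout 10 rostopic echo -n 1 "$TEST_TOPIC"; then
        echo "received a message on $TEST_TOPIC"
    else
        echo "no message on $TEST_TOPIC within 10 s"
        return 1
    fi
}

# Pick the role from the first argument.
case "${1:-}" in
    pub)  publish_test ;;
    echo) receive_test ;;
    *)    echo "usage: $0 pub|echo" ;;
esac
```

Start it with pub outside the container and with echo inside, then swap the two sides. If both directions deliver a message, plain topic transport works and the problem is more likely specific to tf or to the Apollo setup.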

If I run tf_echo outside the docker I receive the published tf data.

So to me that means, that even if I publish the tf inside the docker I am not able to read that published data, even though I am able to see the tf connection.

So you set -e ROS_IP=172.17.0.1, but have you made sure your container is given this IP address? Afaik .1 is given to the host, and the IP is used to reach the host from within a container.

Containers get other addresses from the same range, but in any case they are handed out by DHCP (sort of), so if the IP changes, you need to update -e ROS_IP=172.17.0.1 as well.

I had sort-of assumed that you had this covered, but perhaps this is what is going wrong.

Asked by gvdhoorn on 2019-09-11 02:23:21 UTC
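
One way to check the point raised in the comment above is to compare ROS_IP with the addresses actually configured in the environment where the nodes run. The following is a sketch only: ip_in_list is a hypothetical helper, and hostname -I (already used earlier in this thread) is assumed to be available. Note that with --net host the container shares the host's network stack, so the list printed inside the container is expected to match the one on the host.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if address $1 appears in the
# space-separated address list $2.
ip_in_list() {
    local wanted="$1" addr
    for addr in $2; do
        [ "$addr" = "$wanted" ] && return 0
    done
    return 1
}

# Addresses configured on this machine or container (empty if the
# hostname command does not support -I).
local_ips="$(hostname -I 2>/dev/null || true)"

if ip_in_list "${ROS_IP:-}" "$local_ips"; then
    echo "ROS_IP=${ROS_IP} is assigned to a local interface"
else
    echo "ROS_IP=${ROS_IP:-<unset>} is NOT among: ${local_ips}"
fi
```

Run inside the container, a "NOT among" result would mean nodes there advertise an address that does not belong to them, which matches the failure mode described above.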

Answers