Comms with ROS2 on QEMU

asked 2020-04-11 07:09:41 -0500

broomstick

I am running a demo ROS2 talker on a FreeBSD-based OS on QEMU. It is called minimal_publisher and is publishing to chatter. The talker is from demo_nodes_cpp provided by the vanilla install of ROS2 dashing.

I then run the corresponding listener on the host machine, subscribing to chatter.

The talker on QEMU looks like it's publishing just fine (printing a stream of Publishing: 'Hello, world! [count]' lines); however, the listener doesn't pick anything up.

ros2 node list shows both nodes, but ros2 topic list does not show the chatter topic. It only shows parameter_events and rosout.

This setup worked for me with ROS 1, running roscore on the host. In that case, though, I was able to explicitly point the talker running on QEMU to the master via ROS_HOSTNAME and ROS_MASTER_URI, I think.

I would appreciate some help in getting my ROS2 publishers and subscribers similarly working across the QEMU boundary.

Comments

First thing to check: make sure UDP multicast is working between all involved hosts/emus.

See the first part of ROS 2 Overview > Troubleshooting to check. Also: Getting started with ros2doctor.

gvdhoorn ( 2020-04-11 07:11:53 -0500 )

The troubleshooting recommends running ros2 multicast and, if that fails, then modifying firewall rules and checking for MULTICAST tags via ifconfig.

My target environment doesn't support python (yet), so I'm not able to build and use ros2cli (unless I'm mistaken, which would make me happy...). I can send and receive multicast packets on the host (via ros2 multicast), but I can't run ros2 multicast send in the QEMU environment and ros2 multicast receive on the host...

ifconfig on QEMU shows MULTICAST in the flags section of both interfaces, and I've confirmed that the firewall is not enabled.

To get this demo going, is there a way to simplify this process by reverting to something closer to the ROS1 discovery/communication paradigm? Eventually, I want to be able to use the fully featured, distributed discovery provided by ROS2, but I need to ...

broomstick ( 2020-04-11 08:53:24 -0500 )

You could see whether configuring things for unicast discovery works. There are a couple of Q&As on this site about that. It comes down to editing the Fast RTPS configuration to disable multicast.

There's no guarantee of course that this is your problem. It was just the first thing I would check.

Also: you don't absolutely need to use the ros2 multicast verb. Any tool which can check multicast capabilities could be used. Software-based routers/interfaces sometimes don't implement the necessary routing (IGMP snooping, etc.). It could be that's what's happening here, but it could be something else completely as well.

gvdhoorn ( 2020-04-11 09:37:12 -0500 )
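As an illustration of the kind of tool meant above, here is a minimal multicast probe sketch using only the Python standard library. It is not the ros2 multicast implementation; the group address, port, and function names are assumptions chosen for this example. Since the QEMU guest in this thread has no Python, it would only run on a side that does (for example the host), with an equivalent C program on the other side.

```python
import socket
import struct

GROUP = "239.255.0.1"  # assumed test group, not taken from the thread
PORT = 50000           # assumed free UDP port, not taken from the thread


def membership_request(group, iface="0.0.0.0"):
    """Pack an ip_mreq: group address followed by the local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(iface)


def receive_probe(group=GROUP, port=PORT, iface="0.0.0.0", timeout=10.0):
    """Join the group and return (payload, sender) for the first datagram."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", port))
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                        membership_request(group, iface))
        sock.settimeout(timeout)
        return sock.recvfrom(1024)  # raises socket.timeout if nothing arrives
    finally:
        sock.close()


def send_probe(payload=b"multicast-probe", group=GROUP, port=PORT, ttl=1):
    """Send one datagram to the group and return the number of bytes sent."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        # TTL 1 keeps the probe on the local network segment.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL,
                        struct.pack("b", ttl))
        return sock.sendto(payload, (group, port))
    finally:
        sock.close()
```

Calling receive_probe() on one machine and send_probe() on the other, then swapping roles, checks each direction separately. A pass in only one direction would be worth noting before looking at the DDS layer.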

I ran a fairly quick test based on this code, which (I think) demos multicast capability. It worked just fine back and forth between the host and QEMU.

I also tried to establish unicast discovery based on this ROS answer and eProsima documentation (see section on Initial Peers and subsequent tip on disabling multicast). I've posted my DEFAULT_FASTRTPS_PROFILES.xml file.

Still, when I run the talker from within QEMU, it shows up on the host in response to ros2 node list, but the topic does not appear when I run ros2 topic list on the host.

broomstick ( 2020-04-11 16:46:43 -0500 )

Some other observations: ros2 node info /minimal_publisher produces the following error:

ros2_dashing $ ros2 node info /minimal_publisher
/minimal_publisher
[ERROR] [rmw_fastrtps_shared_cpp]: Unable to find GUID for node: minimal_publisher
Failed to get_subscriber_names_and_types: Unable to find GUID for node , at /home/broomstick/ros2_dashing/src/ros2/rmw_fastrtps/rmw_fastrtps_shared_cpp/src/rmw_node_info_and_types.cpp:87

broomstick ( 2020-04-11 17:11:09 -0500 )

If I run the talker and then the listener on QEMU (or vice versa), the second one fails with the following error:

1586640423.685765 [42]     100051: UDP make_socket failed for multicast port 17900
[ERROR] [1586640423.781729296] [rmw_cyclonedds_cpp]: rmw_create_node: failed to create domain, error Error

>>> [rcutils|error_handling.c:108] rcutils_set_error_state()
This error state is being overwritten:

  'error not set, at /home/broomstick/cheri/ros2_min/src/ros2/rcl/rcl/src/rcl/node.c:336'

with this new error message:

  'rcl node's rmw handle is invalid, at /home/broomstick/cheri/ros2_min/src/ros2/rcl/rcl/src/rcl/node.c:485'

rcutils_reset_error() should be called after error handling to avoid this.
<<<
[ERROR] [1586640423.895518056] [rcl]: Failed to fini publisher for node: 1
Terminating due to uncaught exception 0x48960300 of type rclcpp::exceptions::RCLError
Abort

broomstick ( 2020-04-11 17:11:43 -0500 )
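A note on where port 17900 in the log above comes from: assuming the bracketed 42 in the Cyclone DDS trace line is the DDS domain ID, the number matches the default well-known port mapping from the DDSI-RTPS specification. The sketch below reproduces that arithmetic with the specification's default constants; the function names are made up for this example, and a DDS configuration can override these defaults.

```python
# Default DDSI-RTPS port mapping parameters.
PB = 7400  # port base
DG = 250   # domain ID gain
PG = 2     # participant ID gain
D0 = 0     # offset: discovery (metatraffic) multicast
D1 = 10    # offset: discovery (metatraffic) unicast
D2 = 1     # offset: user data multicast
D3 = 11    # offset: user data unicast


def discovery_multicast_port(domain_id):
    return PB + DG * domain_id + D0


def user_multicast_port(domain_id):
    return PB + DG * domain_id + D2


def discovery_unicast_port(domain_id, participant_id):
    return PB + DG * domain_id + D1 + PG * participant_id


def user_unicast_port(domain_id, participant_id):
    return PB + DG * domain_id + D3 + PG * participant_id


print(discovery_multicast_port(42))  # 7400 + 250 * 42 + 0 = 17900
```

Under that assumption, the failing socket is the shared discovery multicast port for domain 42, which every participant in the domain on the same machine needs to open.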

It does look like something isn't behaving as it should. Particularly in the network stack/interfacing code.

Seeing your other questions: could this be caused by your compiler infrastructure? Have you tried running any of the examples/tests that come with the DDS implementation? That would remove one or more layers of abstraction, while still being more complex than a bare multicast test.

There is always the chance it's actually a problem in the ROS 2 code, but that's not entirely apparent right now.

gvdhoorn ( 2020-04-12 04:41:57 -0500 )

As I'm currently building with -DBUILD_TESTING=NO, the tests in rmw don't get built. I've tried to find an easy workaround for this, but it'll take some time, as I also need to build the google test suite.

To avoid communicating host -> guest, I tried running one of the composed publisher/subscribers. It seems to do better, but I get the following when I start it. As you can see, the subscriber prints an empty message:

broomstick ( 2020-04-13 10:27:26 -0500 )

1 Answer

answered 2020-04-17 17:30:44 -0500

broomstick

There are two issues here:

  1. My QEMU setup allows communication guest -> host, but not the other way (this is pretty much default QEMU behaviour, though bidirectional comms can be established). I think this explains why a node on the host can identify that there is a node on the guest, but the host nodes can't see or subscribe to published topics originating on the guest because they can't negotiate host -> guest.

  2. The observed behaviour of the publisher/subscriber, where the publisher publishes a std_msgs::String and the subscriber only sees an empty string, is due to a bug in rmw_cyclonedds on big-endian systems. Specifically, the length of the serialised field in the published message was not correct (it was zero), so the subscriber could not deserialise it correctly. The subscriber therefore registered that it had received a message, but thought the message contained an empty string. The fix has been merged. There is extensive discussion and troubleshooting here.
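
To make the second point concrete, here is a simplified sketch of a CDR-style string: a 4-byte length (counting the terminating NUL) followed by the bytes. It is not the rmw_cyclonedds code and does not reproduce the actual bug; encode_string, decode_string, and zero_length are hypothetical helpers that only show how a zero length field makes a reader see an empty string even though the payload bytes arrived.

```python
import struct


def encode_string(text, big_endian):
    """Length-prefixed string: uint32 length (including NUL), bytes, NUL."""
    data = text.encode("utf-8") + b"\x00"
    fmt = ">I" if big_endian else "<I"
    return struct.pack(fmt, len(data)) + data


def decode_string(buf, big_endian):
    """Read the uint32 length, then that many bytes minus the trailing NUL."""
    fmt = ">I" if big_endian else "<I"
    (length,) = struct.unpack_from(fmt, buf, 0)
    if length == 0:
        return ""  # the reader trusts the length field and stops here
    return buf[4:4 + length - 1].decode("utf-8")


def zero_length(buf):
    """Simulate the faulty message: same payload, length field set to zero."""
    return b"\x00\x00\x00\x00" + buf[4:]


good = encode_string("Hello, world! 1", big_endian=True)
print(repr(decode_string(good, big_endian=True)))               # 'Hello, world! 1'
print(repr(decode_string(zero_length(good), big_endian=True)))  # ''
```

The second call mirrors the symptom described above: a message is received and decoded without error, but the string comes out empty.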

Thanks to @gvdhoorn for pointing me to the cyclonedds issue tracker. They were incredibly prompt and helpful.

Comments

@broomstick: please accept your own answers to mark the question as answered by clicking the checkmark to the left of your answer. You have sufficient karma to do that now.

And good to hear things got resolved.

gvdhoorn ( 2020-04-18 03:24:53 -0500 )