Failed to receive subscribed topic in WSL

asked 2018-12-21 10:39:55 -0500

Joe_Q

updated 2018-12-21 13:10:40 -0500

gvdhoorn

Is there any setting or configuration that I need to be aware of when running ROS in Windows Subsystem for Linux (WSL)? I'm running into some weird results where communication only works in one direction.

Here's my setup:

  • A node (let's call it robot_node) that subscribes to one topic (e.g. /robot_command) and publishes another (e.g. /robot_pos), both using custom messages that I created. This node interfaces with real hardware and runs on a real Linux box with Ubuntu 16.04 and ROS Kinetic.
  • Another node (let's call it control_node) that does the reverse of the robot_node, i.e. it subscribes to /robot_pos and publishes to /robot_command. It simply sends the same /robot_command over and over, and prints /robot_pos whenever it receives a message.
  • The control_node runs on Windows Subsystem for Linux (WSL) with Ubuntu 18.04 and ROS Melodic; the Windows host is Windows 10 Professional.

Issue:

  • When I run the control_node, I do not get any /robot_pos messages.
  • I confirmed that the /robot_command messages do get sent out to the robot_node.
  • I ran this node (with the exact same source code) on another Linux PC and on another Windows 10 PC (Home edition) using WSL (same Ubuntu and ROS versions). Both of those PCs can run the control_node and receive the /robot_pos messages.
  • This tells me that the code for the control_node is correct, and that I have set up my network configuration (i.e. ROS_MASTER_URI, ROS_IP, /etc/hosts, etc.) properly.
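For anyone reproducing this setup, the network settings mentioned above can be sanity-checked with a short script. This is only a sketch: the function name check_ros_network_env is made up for illustration, and it only inspects the environment variables named in the post (plus ROS_HOSTNAME, the standard alternative to ROS_IP). It does not contact the master or verify /etc/hosts.

```python
import os
from urllib.parse import urlparse


def check_ros_network_env(env=None):
    """Return a list of human-readable problems with the ROS network variables.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    This helper is hypothetical and is not part of ROS.
    """
    env = os.environ if env is None else env
    problems = []

    master = env.get("ROS_MASTER_URI", "")
    if not master:
        problems.append("ROS_MASTER_URI is not set")
    else:
        parsed = urlparse(master)
        try:
            port = parsed.port  # raises ValueError on a malformed port
        except ValueError:
            port = None
        if parsed.scheme != "http" or not parsed.hostname or port is None:
            problems.append("ROS_MASTER_URI should look like http://<host>:<port>")

    # A node advertises itself using ROS_IP or ROS_HOSTNAME; if neither is
    # set, it falls back to the machine's hostname, which other machines
    # may not be able to resolve.
    if not env.get("ROS_IP") and not env.get("ROS_HOSTNAME"):
        problems.append("neither ROS_IP nor ROS_HOSTNAME is set")

    return problems


if __name__ == "__main__":
    for problem in check_ros_network_env():
        print("warning:", problem)
```

Running it on both machines and getting no warnings only rules out the most common misconfigurations; it says nothing about firewalls or WSL-specific behaviour.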

Steps to debug:

  • Used Wireshark to look at the underlying traffic. I can see:

    1. The initial XMLRPC "registerSubscriber" method call from the control_node to the ROS master, with the master responding with the robot_node's IP and port for the /robot_pos topic.
    2. The XMLRPC communication between the robot_node and the control_node for the "requestTopic" method, with the robot_node replying with the IP and port to use for the actual message communication.
    3. The initial 3-way TCP handshake from the control_node to the robot_node: the control_node sends SYN, the robot_node replies with SYN-ACK, and the control_node sends ACK.
    4. After this, I don't see any more communication on this stream. (From Wireshark captures of communication with the other PCs, I know that the control_node should now send the packet with the caller ID, checksum of the message, topic, etc., and the robot_node would then continuously send the data for the /robot_pos topic.)
    5. Furthermore, I can see the control_node keep initiating new streams to try to communicate with the robot_node (i.e. more new 3-way TCP handshakes), but none of these streams gets any further than the 3-way TCP handshake.
  • Wrote my own simple socket program to connect to the IP and port of the robot_node that is supposed to deliver the /robot_pos topic.

  • Once the connection was made, I sent a packet containing the caller ID, checksum of the message, topic, etc. (by reusing the packet captured in Wireshark), and I was able to get the robot_node to send ...
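For readers who want to repeat the socket test described above, here is a rough Python sketch of how the packet the subscriber is expected to send (the TCPROS connection header) can be built and sent by hand. It is not the program used in the question. The function names are made up for illustration, and the message type and md5sum values below are placeholders: a real test needs the values from the actual custom message (for example as seen in a Wireshark capture). The wire format assumed here is the documented TCPROS one: a 4-byte little-endian total length, followed by fields that are each a 4-byte little-endian length plus a "name=value" string.

```python
import socket
import struct


def encode_tcpros_header(fields):
    """Encode a dict of str -> str as a TCPROS connection header."""
    body = b""
    for key, value in fields.items():
        entry = f"{key}={value}".encode("utf-8")
        body += struct.pack("<I", len(entry)) + entry
    return struct.pack("<I", len(body)) + body


def decode_tcpros_header(data):
    """Decode a TCPROS connection header (including its length prefix)."""
    (total,) = struct.unpack_from("<I", data, 0)
    fields, offset, end = {}, 4, 4 + total
    while offset < end:
        (length,) = struct.unpack_from("<I", data, offset)
        offset += 4
        entry = data[offset:offset + length].decode("utf-8")
        offset += length
        key, _, value = entry.partition("=")
        fields[key] = value
    return fields


def recv_exact(sock, n):
    """Read exactly n bytes from the socket or raise ConnectionError."""
    chunks = b""
    while len(chunks) < n:
        chunk = sock.recv(n - len(chunks))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks += chunk
    return chunks


def probe_publisher(host, port, fields, timeout=5.0):
    """Send a subscriber header to a publisher and return its reply header.

    host and port are the values returned by the "requestTopic" call.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(encode_tcpros_header(fields))
        prefix = recv_exact(sock, 4)
        (total,) = struct.unpack("<I", prefix)
        return decode_tcpros_header(prefix + recv_exact(sock, total))


# Placeholder values: replace type and md5sum with those of the real message.
subscriber_fields = {
    "callerid": "/control_node",
    "topic": "/robot_pos",
    "type": "my_msgs/RobotPos",  # hypothetical message type
    "md5sum": "0" * 32,          # hypothetical checksum
}
header = encode_tcpros_header(subscriber_fields)
```

Calling probe_publisher(host, port, subscriber_fields) with the publisher's address then shows whether the publisher answers a header that actually arrives, which is the same check as the hand-written socket test in the question.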

Comments

This is quite an extensive post, and the real question is stated only at the very end. It might help if you'd restructure your post a little.

You might also want to make sure that the list of steps is correctly formatted. Right now it's a bit of a wall-of-text.

Finally: as this is a Win10-WSL ..

gvdhoorn ( 2018-12-21 12:31:29 -0500 )

.. question, clearly tag the question as such and mention it in the title and state it up front.

There are quite a few known issues with Linux and ROS running on WSL, so there's every chance this is a known one.

gvdhoorn ( 2018-12-21 12:32:53 -0500 )

@gvdhoorn, thanks for the comments. I've rearranged the question a bit. I'm not sure I can cut more, since I want to show which steps I've gone through so as to avoid suggestions that I didn't set up the network settings properly. Hopefully this is easier to read now?

Joe_Q ( 2018-12-21 12:51:18 -0500 )

You might be running into Microsoft/WSL/issues/1391.

gvdhoorn ( 2018-12-21 13:11:46 -0500 )

It is indeed! After pulling bxwllzz's code, my node now works! Thanks! So when will this fix be merged back into the main ROS repository?

Joe_Q ( 2018-12-21 16:07:49 -0500 )

Now that I think about it, why would the issue affect only one PC running WSL and not the other? The one with the issue is running Windows 10 Professional, and the one without it is running Windows 10 Home.

Joe_Q ( 2018-12-21 23:28:53 -0500 )