Kinetic to Melodic migration question

asked 2021-11-27 02:12:38 -0500

Nemesis

updated 2021-11-28 09:45:53 -0500

Mike Scheutzow

As this is my first post, I apologise if this question has been asked before. My searches here and on Google (restricted to site:answers.ros.org) did not reveal any insight into my challenge.

OVERVIEW
In both scenarios I use a ROS Master (robot), and a ROS Client/Slave (Laptop)
WiFi network provides DHCP on router without DNS registration or DNS services
I have also tried static IPs on both, with a cable directly between the robot and laptop, and get the same error
No firewalls on either versions of Ubuntu
Multicast DNS is what is relied upon using avahi-daemon in default config
For the purpose of this question the FQDN of Robot and Laptop are as below:
Robot FQDN: rosbot.local Robot hostname: rosbot
Laptop FQDN: laptop.local Laptop hostname: laptop

Robot .bashrc entries:
ROS_MASTER_URI=http://rosbot.local:11311
ROS_HOSTNAME=rosbot.local

Laptop .bashrc entries:
ROS_MASTER_URI=http://rosbot.local:11311
ROS_HOSTNAME=laptop.local

WORKING ROBOT
In the Kinetic/Ubuntu 16.04 scenario all is working fine
On the robot I launch roscore, and on the laptop roswtf, rviz, and rqt_graph all work as expected
Network validation checks have been completed as per http://wiki.ros.org/ROS/NetworkSetup

NOT WORKING ROBOT
In the Melodic/Ubuntu 18.04 scenario, it seems hostname and NOT FQDN is now the preferred choice for the likes of roswtf, rviz, and rqt graph
Network validation checks have once again been completed as before
As per Kinetic, I launch roscore and on laptop I run roswtf and get the following error message:
running graph rules...
ERROR: Unknown host [rosbot] for node [/rosout]
... done running graph rules

ERROR Errors connecting to the following services:
* service [/rosout/set_logger_level] appears to be malfunctioning: Unable to communicate with service [/rosout/set_logger_level], address [rosrpc://rosbot:53773]
* service [/rosout/get_loggers] appears to be malfunctioning: Unable to communicate with service [/rosout/get_loggers], address [rosrpc://rosbot:53773]
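
The addresses in these errors show which name the node advertised to the master. As a small illustration (a sketch using only the Python 3 standard library; host_from_uri is a made-up helper name, and the URI is copied from the output above), the host part can be extracted to confirm that it is the bare hostname rather than the FQDN:

```python
from urllib.parse import urlparse


def host_from_uri(uri):
    """Return (host, port) from a URI such as rosrpc://host:port."""
    parsed = urlparse(uri)
    return parsed.hostname, parsed.port


if __name__ == "__main__":
    # Address taken from the roswtf error output.
    host, port = host_from_uri("rosrpc://rosbot:53773")
    print("advertised host:", host)
    print("uses .local FQDN:", host.endswith(".local"))
```

If the node had picked up ROS_HOSTNAME, the host part here would be expected to read rosbot.local.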

Launching rviz and rqt_graph also does not work

During this testing, I did the following as a temporary fix and this proved my suspicion about hostname and NOT FQDN being used in Melodic:
I added a host IP entry for the name rosbot into the /etc/hosts file of the laptop
After doing this roswtf, rviz, and rqt graph all work again
Unfortunately in a dynamic IP environment, writing a script workaround on the laptop to do this every time the robot gets a new IP is not the preferred direction I would like to take.

Something has obviously changed between Kinetic and Melodic, and any assistance in identifying what it is and some direction I can take from here would really be appreciated.
Respectfully,
Ant


Comments

2

Are you sure that this is a difference in the ROS code and not in Ubuntu?

Humpelstilzchen (2021-11-28 02:34:40 -0500)

Is this your personal wifi network, or from a school or corporation? It's really unusual that it would not provide DNS lookup for the DHCP-assigned hosts.

Mike Scheutzow (2021-11-28 08:46:27 -0500)

@Humpelstilzchen - I am not sure at this stage. I can tell you that both kinetic and melodic systems are simple build installs which I completed myself. The only modifications to system have been for the robot usb assignments.
But since the issue is occurring on the laptop, the build method follows these simple steps:
Install Ubuntu desktop from USB (made using rufus and ISO)
During build I set hostname and timezone and leave all else at defaults
Install ROS using respective instructions on ros.org website
Install required ROS packages

@Mike Scheutzow - It's a maker space workshop where we have no control over the gateway. I agree it is unusual, but it's not the only place I have encountered this. Besides, the Kinetic scenario is working at this location, and Melodic is not?

Nemesis (2021-11-28 12:59:42 -0500)

Since the first indication of issues is seen when I run roswtf on the laptop:

running graph rules...
ERROR: Unknown host [rosbot] for node [/rosout]
... done running graph rules

I started to look into differences in the roswtf python files.
These can be found in the following path (note that [distro] refers to version of ROS):
/opt/ros/[distro]/lib/python2.7/dist-packages/roswtf/
I can see that the error message above from running roswtf is generated from environment.py
I can also see differences in 6 out of the 13 files in that folder, including a file called network.py
What I am unable to do is determine in detail why it appears that Kinetic uses the FQDN and Melodic uses just the hostname when roswtf is run.

I do not speak python, and do not really have any experience with it. If I can find a way to ...(more)

Nemesis (2021-11-29 02:11:13 -0500)

As pointed out by Mike Scheutzow in the answer, it is probably a difference in name resolution in Ubuntu, not in ROS, so there is no need to go through the Python files. What happens when you simply run "ping rosbot" on Ubuntu 16 and 18?

Humpelstilzchen (2021-11-29 02:24:31 -0500)

"ping rosbot" fails on both versions as there is no local DNS.
NOTE: Kinetic roswtf works, but Melodic roswtf does not?!

Nemesis (2021-11-29 02:36:21 -0500)

2 Answers

0

answered 2021-11-28 08:37:17 -0500

Mike Scheutzow

updated 2021-11-28 09:38:02 -0500

You are describing a systemd behavior, not a ros issue. By removing the ".local" from the hostname, you've probably excluded those machines from using Multicast DNS. By default, systemd is going to form a FQDN using the hostname + subnet name provided by the DHCP server.

I have never needed to use multicast DNS, but have you tried creating hostname aliases in /etc/hosts on both hosts? Or maybe see if avahi-daemon can be configured to opt-in your two specific machines?

Update: look at this man page: `man avahi-daemon.conf`

Update 2: from what I'm reading, you locally configure the host's name without any subnet name, but you need to access the host using the FQDN <hostname>.local. The MDNS guides are not indicating that avahi-daemon.conf needs to be modified for this situation.


Comments

Thanks for the input Mike.
Adding the equivalent of a CNAME record to /etc/hosts so that the hostname resolves to the FQDN is not possible without knowing the IP, as the format of the hosts file is [IP Address] [name] [aliases...]
I did manually add a host entry (format: [ip] [hostname]) to the /etc/hosts file on the Melodic platform and this did resolve the issue, but it means I would have to find the robot's IP manually every time it reboots and update the laptop.
Regarding Update, I have read through avahi-daemon man page looking for indications of any differences between Kinetic and Melodic and am yet to find any. I have also compared these conf files across both machines and they are the same, unmodified from initial build.
Regarding Update 2, you are correct, avahi-daemon defaults to .local domain name resolution out of the box

Nemesis (2021-11-28 13:18:17 -0500)

First: verify that MDNS is working over the network without involving ROS. If you configure hostname like I describe in "update 2", can you ping the robot from the laptop using the robot's modified hostname?

When I do this experiment between two ubuntu 18 hosts on my network, the ping is successful.

Mike Scheutzow (2021-11-29 07:12:53 -0500)
1

Hi Mike,
Thanks for taking the time to test this, it is really appreciated

Both Ubuntu 16 and 18 systems are configured identically with respect to hostname and avahi-daemon
Ping results for both are also identical:
'ping rosbot' fails
'ping rosbot.local' is successful
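
The same two lookups can be repeated from Python without ping. This is only a sketch: resolve_all is a hypothetical helper, the names are the ones used in this thread, and the result depends entirely on the local resolver configuration (for example whether mDNS lookups are enabled on the machine).

```python
import socket


def resolve_all(names, resolver=socket.gethostbyname):
    """Map each name to its resolved IPv4 address, or None if lookup fails."""
    results = {}
    for name in names:
        try:
            results[name] = resolver(name)
        except OSError:
            # socket.gaierror (name not found) is a subclass of OSError.
            results[name] = None
    return results


if __name__ == "__main__":
    for name, address in resolve_all(["rosbot", "rosbot.local"]).items():
        print(name, "->", address if address else "unresolved")
```

The resolver argument exists only so the function can be exercised without a network; by default it uses the system resolver.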

Please note that as per http://wiki.ros.org/ROS/NetworkSetup, section 2.4 describes my setup and all tests relevant to that setup at the system level on both versions of Ubuntu are working as described in that guide. MDNS is working correctly, and the .local is appended to system hostname automatically by the avahi-daemon

'test': It is only when I try to run roswtf on the laptop (ubuntu 18) after starting roscore on the robot (ubuntu 18) that I start to see the ERROR described.
The same 'test' on ubuntu 16 works fine without error.

Nemesis (2021-11-29 12:36:52 -0500)

The error message says you are trying to contact "rosbot". This is not the correct hostname. You have to use "rosbot.local", just like you did with ping, because that is the name MDNS knows.

Mike Scheutzow (2021-11-29 13:35:31 -0500)

When you run 'roswtf' you do not specify a hostname or FQDN; it uses a Python call that queries the environment, which relies on the .bashrc entries.
Note: .bashrc uses FQDN (see original question) for both Kinetic and Melodic

Below is the python method they use taken from the roswtf scripts:

Python 2.7.17 (default, Feb 27 2021, 15:10:58) 
[GCC 7.5.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> import rosgraph
>>> env = os.environ
>>> env.get(rosgraph.ROS_MASTER_URI, None)
'http://rosbot.local:11311'
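
A slightly broader check along the same lines, offered as a sketch rather than anything taken from roswtf: it reports which of the two variables from the .bashrc are absent from a process's environment. The helper name missing_ros_vars is made up for this example, and treating both variables as required is an assumption that matches this setup only.

```python
import os

# The two variables this setup relies on (see the .bashrc entries in the question).
EXPECTED_VARS = ("ROS_MASTER_URI", "ROS_HOSTNAME")


def missing_ros_vars(env=None):
    """Return the expected variable names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in EXPECTED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_ros_vars()
    if missing:
        print("not visible to this process:", ", ".join(missing))
    else:
        print("both variables are visible to this process")
```

It needs to be run on each machine, from the same kind of shell that launches roscore or roswtf, because it only sees what that shell passes down to child processes.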
Nemesis (2021-11-29 16:23:26 -0500)

Further investigation and trial and error has brought me to observe another clue to what is going on.
I have rebuilt from scratch the robot platform as follows:

Ubuntu 18.04.6 minimal install, static ethernet ip, no gateway (note I also retested this network config on kinetic and it still works ok)
Following the Ubuntu install guide, installed ros-melodic-ros-base and the dependencies for building packages. Set up and initialised catkin.
Robot .bashrc entries:
ROS_MASTER_URI=http://rosbot.local:11311
ROS_HOSTNAME=rosbot.local

Observation: When I launch roscore on the Melodic platform the std output with reference to ROS_MASTER_URI only shows the hostname (NOT hostname.local)

When I repeat this on Kinetic, it displays hostname.local
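
One possible reading of this difference, sketched under assumptions: as far as I understand it, a ROS 1 process advertises ROS_HOSTNAME if it is set in its environment, then ROS_IP, and otherwise falls back to the machine's own hostname. The function below is a simplified illustration of that order, not the actual rosgraph code, and guess_advertised_host is a made-up name.

```python
import socket


def guess_advertised_host(env, system_hostname=None):
    """Simplified guess at the name a node advertises (not the real ROS logic)."""
    if env.get("ROS_HOSTNAME"):
        return env["ROS_HOSTNAME"]
    if env.get("ROS_IP"):
        return env["ROS_IP"]
    # Neither variable reached the process: fall back to the system hostname.
    if system_hostname is None:
        system_hostname = socket.gethostname()
    return system_hostname


if __name__ == "__main__":
    print(guess_advertised_host({"ROS_HOSTNAME": "rosbot.local"}, "rosbot"))
    print(guess_advertised_host({}, "rosbot"))
```

Under that assumption, a bare hostname in the roscore output would suggest that ROS_HOSTNAME never reached the roscore process.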

I am currently working through the roslaunch python files to try identify how it gets a value for the variable self.uri
I am hoping it leads me a little closer to understanding what I have missed or ...(more)

Nemesis (2021-12-01 07:15:45 -0500)

In the apt package avahi-utils, there is an app named avahi-resolve. Maybe it can help you figure out if the IP address being returned for <host>.local is what you expect.

Mike Scheutzow (2021-12-01 11:29:04 -0500)

Thanks again for the support Mike, however I found the answer that I will now not forget.

Nemesis (2021-12-01 18:17:07 -0500)
0

answered 2021-12-01 18:25:43 -0500

Nemesis

The rabbit hole of investigating known-working ROS packages was clearly leading me in the wrong direction, and Mike seemed adamant that ROS could not be at fault (thanks, Mike), so I decided to compare the entire home folder from both robots in detail. This exposed my mistake in the .bashrc files.

Melodic .bashrc:

ROS_MASTER_URI=http://rosbot.local:11311
ROS_HOSTNAME=rosbot.local

Kinetic .bashrc:

export ROS_MASTER_URI=http://rosbot.local:11311
export ROS_HOSTNAME=rosbot.local

As can be seen the 'export' command was missing!!!
Case closed
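
Why this matters: a plain NAME=value line in .bashrc sets a variable in the shell itself, but only exported variables are passed on to programs started from that shell, such as roscore. The sketch below demonstrates the difference by running each form in /bin/sh and listing what a child process inherits. It assumes a POSIX system with sh and env on the PATH; child_environment is a made-up helper name.

```python
import os
import subprocess


def child_environment(shell_snippet):
    """Run shell_snippet in sh, then return the environment a child inherits."""
    # `env` runs as a child of the shell, so it only sees exported variables.
    result = subprocess.run(
        ["sh", "-c", shell_snippet + "; env"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        # Start from a nearly empty environment so nothing leaks in from ours.
        env={"PATH": os.environ.get("PATH", "/usr/bin:/bin")},
        check=True,
    )
    lines = result.stdout.splitlines()
    return dict(line.split("=", 1) for line in lines if "=" in line)


if __name__ == "__main__":
    plain = child_environment("ROS_HOSTNAME=rosbot.local")
    exported = child_environment("export ROS_HOSTNAME=rosbot.local")
    print("without export:", plain.get("ROS_HOSTNAME"))
    print("with export:", exported.get("ROS_HOSTNAME"))
```

After correcting .bashrc, a new shell (or `source ~/.bashrc`) is needed before the change takes effect.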


Comments

Thank you for sharing. Glad you got it

osilva (2021-12-01 19:18:43 -0500)


2 followers

Stats

Asked: 2021-11-26 19:43:14 -0500

Seen: 183 times

Last updated: Dec 01 '21