Difference between 'ros2 launch' and 'ros2 run' from a program point of view

asked 2020-05-06 06:18:31 -0500

Phgo

updated 2020-06-12 09:23:11 -0500

I am currently debugging a program in ROS Eloquent that throws an exception ONLY if it is started from a launch file; it runs without any problems if the executable is invoked directly with ros2 run. The error persists even with a minimal launch file such as:

from launch import LaunchDescription
import launch_ros.actions

def generate_launch_description():
    return LaunchDescription([
        launch_ros.actions.Node(
            package='my_package', node_executable='my_executable', output='screen'),
    ])

This issue only occurs when explicitly selecting FastRTPS as the ROS middleware implementation (RMW_IMPLEMENTATION=rmw_fastrtps_cpp). Interestingly, if I do not specify the ROS middleware at all (or set RMW_IMPLEMENTATION=""), the program runs as expected with both ros2 launch and ros2 run, even though it then uses the default DDS, which is again FastRTPS. CycloneDDS also works fine in both cases, but unfortunately it is not an option for this application due to missing shared memory support.

So I am wondering what could cause behavior that depends on how the node is executed and on whether I specify the ROS middleware explicitly or implicitly.

Is there any difference (e.g. visibility of environment variables, default settings, ...) between running a node directly and running the same node from a simple launch file?
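One way to check the environment-variable part of this question is to dump the environment once from a process started via ros2 run and once from a process started via ros2 launch, then compare the two dumps. The following is only a minimal sketch: the helper names snapshot_environment and diff_environments and the example values are hypothetical, and the snapshot call would have to be placed in whatever wrapper or node is actually being started.

```python
import json
import os


def snapshot_environment(path):
    """Write the current process environment to `path` as sorted JSON."""
    with open(path, "w") as f:
        json.dump(dict(os.environ), f, indent=2, sort_keys=True)


def diff_environments(env_a, env_b):
    """Return {name: (value_in_a, value_in_b)} for every variable that differs.

    A variable that is missing on one side is reported with None for that side.
    """
    names = set(env_a) | set(env_b)
    return {
        name: (env_a.get(name), env_b.get(name))
        for name in sorted(names)
        if env_a.get(name) != env_b.get(name)
    }


# Illustrative values only, not captured from a real system.
run_env = {"HOME": "/home/user", "RMW_IMPLEMENTATION": ""}
launch_env = {"HOME": "/home/user", "RMW_IMPLEMENTATION": "rmw_fastrtps_cpp"}
print(diff_environments(run_env, launch_env))
```

In practice the two dictionaries would be loaded with json.load from the two snapshot files rather than written by hand.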


Detailed description of specific application:

The following exception is thrown on Ubuntu 18.04 with ROS Eloquent when running ros2_intel_realsense in combination with a third-party camera driver for an industrial version of the Intel RealSense sensor. It occurs right at the beginning, when the node and its connections are initialized. (Note that ROS 2 is not officially supported by Intel yet.)

I am mainly interested in why this exception occurs ONLY under the circumstances described above, and whether this might be a ROS-related issue or just a bug in the camera driver/ROS wrapper.

Exception (captured with gdb/Valgrind):

Thread 1 "realsense_node" received signal SIGSEGV, Segmentation fault.
0x00007ffff1596abf in std::system_error::system_error(std::error_code, std::string const&) () from /usr/lib/framos/camerasuite/libCameraSuite.so

Invalid read of size 8
at 0xAB48ABF: std::system_error::system_error(std::error_code, std::string const&) (in /usr/lib/framos/camerasuite/libCameraSuite.so)
(...)
Address 0x18 is not stack'd, malloc'd or (recently) free'd

Backtrace (for last calls):

#0  0x000000000ab48abf in std::system_error::system_error(std::error_code, std::string const&) ()
   from /usr/lib/framos/camerasuite/libCameraSuite.so
No symbol table info available.
#1  0x000000000ab4ac11 in asio::detail::do_throw_error(std::error_code const&, char const*) ()
   from /usr/lib/framos/camerasuite/libCameraSuite.so
No symbol table info available.
#2  0x000000002636bc69 in asio::detail::throw_error (location=0x2660783c "bind", err=...)
    at /usr/include/asio/detail/throw_error.hpp:41
No locals.
#3  asio::basic_socket<asio::ip::udp, asio::datagram_socket_service<asio::ip::udp> >::bind (
    endpoint=..., this=0x1ffeff5290) at /usr/include/asio/basic_socket.hpp:593
        ec = {_M_value = 98, _M_cat = 0xb297908 <asio::system_category()::instance>}
        ec = <optimized out>
#4  eprosima::fastrtps::rtps::UDPv4Transport::OpenAndBindInputSocket (this=this@entry=0x26eb36f0, 
    sIp="0.0.0.0", port=<optimized out>, is_multicast=<optimized out>)
    at ./src/cpp/transport/UDPv4Transport.cpp:280
        socket = {<asio::basic_socket<asio::ip::udp, asio::datagram_socket_service<asio::ip::udp> >> = {<asio::basic_io_object<asio::datagram_socket_service<asio::ip::udp>, true>> ...

Comments

Please edit your question to include the exception that is thrown and, if possible, a small example of how to reproduce it. There are potentially a lot of subtle variations between launch and run, such as how the console output is captured or whether there is tty emulation. Some of this may even be system or environment dependent, such as which shell you're running in. You're more likely to resolve your issue if you can help isolate the specific difference that matters for you, instead of asking for all potential problems that might cause an exception and might be specific to your platform. A simple reproducible example is by far the best way to get help. Otherwise people will just have to guess what's going wrong based only on the information in your description.

tfoote (2020-05-07 12:21:52 -0500)
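To help isolate differences of the kind mentioned above (console capture, tty emulation, shell), a small report can be printed from a process started both ways and the two outputs compared. This is a rough sketch under stated assumptions: describe_runtime_context is a hypothetical helper, and the fields it collects are only a guess at what might differ between ros2 run and ros2 launch.

```python
import os


def describe_runtime_context():
    """Collect a few facts about how the current process was started."""
    return {
        # os.isatty returns False (instead of raising) if the descriptor is closed.
        "stdin_is_tty": os.isatty(0),
        "stdout_is_tty": os.isatty(1),
        "stderr_is_tty": os.isatty(2),
        "cwd": os.getcwd(),
        "parent_pid": os.getppid(),
        "shell": os.environ.get("SHELL"),
        "rmw_implementation": os.environ.get("RMW_IMPLEMENTATION"),
    }


for key, value in describe_runtime_context().items():
    print(f"{key}: {value!r}")
```

Running this once directly and once from a launch file gives two short listings that can be compared line by line.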

@tfoote thank you very much for your feedback! I updated my question with the thrown exception and the corresponding backtrace. Unfortunately, I was not able to come up with a suitable minimal example, as the error occurs only in the setup described above and involves a third-party camera driver.

Phgo (2020-05-15 05:43:36 -0500)

1 Answer


answered 2020-05-07 12:58:47 -0500

ivanpauno

Can you copy the error message? It would be good to have complete steps to reproduce the issue (with a real executable).

There shouldn't be any difference between using ros2 run and ros2 launch.

Comments

Thank you for your answer. I updated my question with the thrown exception and the corresponding backtrace. Unfortunately, I cannot provide a suitable minimal example in this case, as the setup is rather complex and involves third-party software.

Phgo (2020-05-15 05:33:31 -0500)

Hi @Phgo,

The error doesn't seem to be related to using ros2 launch or ros2 run, nor to specifying the rmw implementation explicitly or implicitly.
I'm not sure why it happens; I would need more information to be able to help.

Best, Ivan

ivanpauno (2020-06-01 13:18:15 -0500)

Hi @ivanpauno, thank you for your reply. I agree with you that generally there should not be any relevant differences between the described methods to run the node and specify the rmw implementation. Nevertheless, I can reproduce the same behavior in a docker container, so I am confident it is not merely random. Maybe it comes down to slightly different timings between run and launch, but I do not understand why specifying the rmw implementation explicitly or implicitly should make any difference at all.

It looks like the segmentation fault is not the original exception here, but happens while handling an asio error (address_in_use) from eprosima::fastrtps::rtps::UDPv4Transport::OpenAndBindInputSocket. The actual segfault occurs in std::system_error::system_error, where, as far as I can tell from a first look, the error string might have gone out of scope. If this is true this is actually a bug in ...

Phgo (2020-06-12 09:06:51 -0500)
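The address_in_use condition mentioned above is the ordinary operating-system error for binding a socket to a port that is already taken; on Linux its value is 98, which matches `_M_value = 98` in frame #3 of the backtrace. The sketch below only illustrates that error with two plain UDP sockets. It does not reproduce the Fast RTPS code path, and try_double_bind is a hypothetical helper name.

```python
import errno
import socket


def try_double_bind():
    """Bind two UDP sockets to the same port; return the second bind's errno.

    Returns None if the second bind unexpectedly succeeds.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        first.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        port = first.getsockname()[1]
        try:
            second.bind(("127.0.0.1", port))  # same port, no SO_REUSEADDR
        except OSError as exc:
            return exc.errno
        return None
    finally:
        first.close()
        second.close()


code = try_double_bind()
print(code, code == errno.EADDRINUSE)
```

A failed bind on its own is a recoverable error, which is consistent with the observation that the segfault happens later, while the error is being turned into an exception.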

Hi @Phgo,

I'm not quite sure what could cause the issue. I've tried something similar to what you described, and I didn't run into that error.

If you can share a reproducible example, consider opening an issue in rmw_fastrtps or fastrtps together with the steps to reproduce the problem. If the steps are too complex, consider sharing a Dockerfile that automates the build.

Best, Ivan

ivanpauno (2020-06-24 13:57:20 -0500)
