node crashes when initializing object from shared lib
I've been banging my head against my keyboard for the last week or so trying to figure this one out and still haven't, so here I am asking the internet for help!
I have a package I've been porting from ROS1 to ROS2; let's call it lidar_driver.
The vendor provides an SDK as a static lib (.a) file, which I link to in the CMakeLists.txt file, and I also use some additional header files from this SDK.
For ROS1 I use the normal find_package(), specify the path to the static lib, and can link to it just fine and run the ROS1 node.
This all works as expected and is overall a very pleasant experience in ROS1.
Now when I try to migrate over to ROS2, still using find_package() and specifying the path, the library is still found (the _FOUND CMake variable is 1) and all the other CMake variables I need are set and look correct.
I am using ament_auto_add_library to create the library containing the node, so all the depends in my package.xml are accounted for, and I am also creating a ROS2 component via rclcpp_components_register_node.
This all compiles fine using colcon build, but when I run the node using ros2 run it executes up until I use the C++ object provided by the static lib, like so: shared_ptr.reset(new sdk::Lidar())
So my question is: what step am I missing that is different between catkin and ament/colcon when linking to a static lib?
Any help or advice would be greatly appreciated!
UPDATE3: thank you to @404RobotNotFound for the tips. I ended up using gdb to debug the problem and finally got an error to print out! After following the answer here on how to run gdb for a ros2 node, I was able to see this error:
Thread 1 "some_node" received signal SIGSEGV, Segmentation fault.
and it went on to print the exact function in which the error was occurring. It was happening inside the StaticLib, so that means we are successfully linking to it and are perhaps just missing some dependency of it. I will dig into this with the library provider and see what we can come up with. Thanks again so much for the support digging into this problem.
UPDATE2: the StaticLib does use some UDP functions to open a UDP socket for receiving/sending. I wonder if that is interfering with something ROS2 already has open for its underlying DDS communication? I am going to ask the provider of the SDK what exactly is going on there.
UPDATE: here is the basic CMakeLists.txt file that is causing me issues in ROS2:
cmake_minimum_required(VERSION 3.5.2)
project(some_driver)

# Default to C++14
if(NOT CMAKE_CXX_STANDARD)
  set(CMAKE_CXX_STANDARD 14)
endif()

if(CMAKE_COMPILER_IS_GNUCXX OR CMAKE_CXX_COMPILER_ID MATCHES "Clang")
  add_compile_options(-Wall -Wextra -Wpedantic)
endif()

find_package(ament_cmake_auto REQUIRED)
ament_auto_find_build_dependencies()

unset(STATIC_LIB_DEV)
find_package(StaticLib
  PATHS ${CMAKE_CURRENT_SOURCE_DIR}/StaticLib/Apps/sdk/lib/cmake/libStaticLib
  REQUIRED)

get_target_property(StaticLib_INCLUDE_DIRS
  StaticLib INTERFACE_INCLUDE_DIRECTORIES)
get_target_property(StaticLib_LIB_DIRS
  StaticLib INTERFACE_LIBRARY_DIRECTORIES)
get_target_property(StaticLib_LINK_LIBRARIES
  StaticLib INTERFACE_LINK_LIBRARIES)
get_target_property(StaticLib_STATIC_LIB ...
Is there a specific error you are seeing when your code tries to reset the shared pointer? That might help in identifying the problem.
On a separate note, in ROS 1 libraries were built as shared libraries by default (.so files on Linux), but in ROS 2 you need to specify the library type or they build as static libraries (.a files). You need to pass "SHARED" to the ament_auto_add_library call to make it a shared library again.
Also, in ROS 2 you generally have to install build artifacts to be able to use them. See the ament cmake documentation on building a library for the correct way to install the library and export the library target.
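As a rough sketch of the two points above: the library type can be passed directly to ament_auto_add_library, and ament_auto_package() then generates the install and export rules for targets created through the ament_auto macros. The target name some_driver_node and the source path are hypothetical placeholders, not taken from the question.

```cmake
cmake_minimum_required(VERSION 3.5.2)
project(some_driver)

find_package(ament_cmake_auto REQUIRED)
ament_auto_find_build_dependencies()

# SHARED forces a .so; without an explicit type, CMake's add_library
# falls back to BUILD_SHARED_LIBS (static when that is unset).
# Target and source names below are placeholders.
ament_auto_add_library(some_driver_node SHARED
  src/some_driver_node.cpp)

# Installs the targets created by the ament_auto_* macros and
# exports them for downstream packages.
ament_auto_package()
```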
None whatsoever... that's the frustrating bit. It just crashes. I am already using SHARED in my ament_auto_add_library step, so that's not it. As far as installing properly goes, my understanding of ament_auto is that it does all the installation for you automatically. I might be wrong though?
I'm leaning more towards the static lib that I'm trying to use somehow not being added as a runtime dependency. Do you know how to add that in so ament knows to install it too?
Without looking at the CMakeLists.txt file it is hard to guess, but another problem could be that ament_auto_add_library only calls target_link_libraries("${target}" ${${PROJECT_NAME}_LIBRARIES}), so nothing external would be caught by that. After calling ament_auto_add_library, you might also need to call target_link_libraries(<LIB_TARGET> ${<external_pkg>_LIBRARIES}), assuming it is exported correctly from that package. How was it built in ROS 1?
@404RobotNotFound thanks so much for your help with this! I think I've done both of those as well. To help, I've added the problem CMakeLists.txt file. The same find_package approach as above works for ROS1, but for some reason doesn't work for ROS2 and ament.
As another follow-up, when it crashes do you get some kind of stack trace or anything?
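The explicit linking step suggested in the answer above could look roughly like the sketch below. It assumes the SDK's imported target is named StaticLib, as in the get_target_property calls in the question, and uses a hypothetical library target some_driver_node and source file. Treat it as an illustration of the suggestion, not a confirmed fix for the crash.

```cmake
cmake_minimum_required(VERSION 3.5.2)
project(some_driver)

find_package(ament_cmake_auto REQUIRED)
ament_auto_find_build_dependencies()

# Locate the vendored SDK (path copied from the question's CMake file).
find_package(StaticLib
  PATHS ${CMAKE_CURRENT_SOURCE_DIR}/StaticLib/Apps/sdk/lib/cmake/libStaticLib
  REQUIRED)

# Hypothetical target and source names.
ament_auto_add_library(some_driver_node SHARED
  src/some_driver_node.cpp)

# The SDK is not listed in package.xml, so ament_auto cannot link it
# on its own; link the imported target explicitly. Linking an imported
# target also applies its INTERFACE include directories and
# INTERFACE link libraries to this target.
target_link_libraries(some_driver_node StaticLib)

ament_auto_package()
```

Because the SDK is a static archive, its code is copied into the shared library at link time, so there is no separate SDK file that needs to be installed at runtime.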
There is nothing printed out in the console, and I've added prints just before and after the shared_ptr.reset call, so I know it is that line.
I've also checked the log files generated in $HOME/.ros/log, but they are absolutely empty (nothing in them).
Any tips on how to turn on some sort of "debug mode"?
If your node is closing unexpectedly, there should at least be something like "Segmentation fault" or "process has died, exit code [some code]". In either case, though, the logs probably won't capture that info.
To turn on a debug mode, you can build with debug symbols by adding --cmake-args -DCMAKE_BUILD_TYPE=Debug or --cmake-args -DCMAKE_BUILD_TYPE=RelWithDebInfo to your colcon build command. I believe both work (I usually use RelWithDebInfo). This is most useful when debugging a core dump, which you can do with GDB to track down which call is having the issue. You already know which line is failing, though, so I am not sure whether the problem is finding the library in the first place or some kind of error happening when you create the object. That is something GDB could probably tell you by following the stack trace.
@404RobotNotFound thanks for the tips on gdb! I updated the question with some updates, but I'm also posting this here for anyone else who ends up here. To run gdb for a ros2 node:
https://answers.ros.org/question/2672...
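For anyone who prefers not to pass the build type on every build, the debug-symbol build types mentioned above can also be defaulted from the CMakeLists.txt file. This is a common CMake idiom and not something the original posters described using; a minimal sketch:

```cmake
# Default to an optimized build with debug symbols when the user has
# not chosen a build type. A -DCMAKE_BUILD_TYPE=... passed through
# colcon's --cmake-args still takes precedence, because the variable
# is then already set when this check runs.
if(NOT CMAKE_BUILD_TYPE AND NOT CMAKE_CONFIGURATION_TYPES)
  set(CMAKE_BUILD_TYPE RelWithDebInfo CACHE STRING "Build type" FORCE)
endif()
```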