Node exit code -11

asked 2019-11-19 20:11:47 -0600

mrrius

updated 2019-11-20 02:59:44 -0600

Hello,

I am creating a very simple subscriber to a published message. The message comes from another node that processes images and produces odometry. My node consists of three files: RoverMoveMain.cpp, RoverMoveSimple.cpp, and RoverMoveSimple.h.

RoverMoveSimple.h

#include <stdlib.h>
#include <cmath>

#include "helper/constant.h"
#include "helper/typedef.h"
#include "helper/helper.h"
#include "boost/atomic.hpp"
#include "tf2/convert.h"
#include "tf2/LinearMath/Matrix3x3.h"

#ifndef ROVER_MOVE_H
#define ROVER_MOVE_H

using namespace std;

class RoverMoveSimple {
    protected:
        // Attribute
        NodeHandle n;

        Sub cur_pos_sub;
    public:
        // method
        RoverMoveSimple(NodeHandle* nh);

        void UpdatePosCB(const OdometryPtr& cur_pos);
};

#endif  // ROVER_MOVE_H

RoverMoveSimple.cpp

#include <iostream>

#include "rover_move/RoverMoveSimple.h"

RoverMoveSimple::RoverMoveSimple(NodeHandle* nh) {
    this->cur_pos_sub = nh->subscribe("odom", 1000, &RoverMoveSimple::UpdatePosCB, this);
}

void RoverMoveSimple::UpdatePosCB(const OdometryPtr& cur_pos){
    cout << cur_pos->pose.pose.position.x << ";" << cur_pos->pose.pose.position.y << "\n";
}

RoverMoveMain.cpp

#include "rover_move/RoverMoveSimple.h"

int main(int argc, char** argv) {
    ros::init(argc, argv, "RoverMove");

    NodeHandle n;
    RoverMoveSimple rm = RoverMoveSimple(&n);
    ros::spin();
    return 0;
}

The problem is that my node runs fine for some time, then stops working with an error. The reported exit code is -11. From what I have checked, this is probably related to a segmentation fault. Is this correct? And how can this error happen? This is really just a simple node.
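For reference, a negative exit code reported by a launcher usually means the process was terminated by a signal rather than returning normally, and on Linux signal 11 is SIGSEGV. The sketch below shows that mapping under the assumption that the launcher reports "-N" for "killed by signal N". The helper name `describeExitCode` is hypothetical and not part of ROS.

```cpp
#include <csignal>
#include <string>

// Hypothetical helper (not a ROS API): interprets an exit code in the style
// "-N means terminated by signal N", which is an assumption about the launcher.
std::string describeExitCode(int code) {
    if (code >= 0) {
        // Non-negative values are ordinary exit statuses.
        return "exited normally with status " + std::to_string(code);
    }
    const int sig = -code;  // recover the signal number
    if (sig == SIGSEGV) {
        return "killed by SIGSEGV (segmentation fault)";
    }
    if (sig == SIGABRT) {
        return "killed by SIGABRT (abort)";
    }
    return "killed by signal " + std::to_string(sig);
}
```

Calling `describeExitCode(-11)` on a system where SIGSEGV is 11 returns the segmentation fault description, which matches the guess in the question.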

Update: backtrace using gdb. It seems there is some problem with the Boost thread used by ROS when it deletes a message or something.

#0  0x729012b0 in ?? ()
#1  0x76f85f0c in boost::detail::sp_counted_base::release() ()
   from /home/pi/rover_workspace_for_dat2019/devel/lib/libRoverMove.so
#2  0x76ee5820 in boost::detail::sp_counted_impl_pd<ros::MessageDeserializer*, boost::detail::sp_ms_deleter<ros::MessageDeserializer> >::dispose() () from /opt/ros/kinetic/lib/libroscpp.so
#3  0x76ee9e38 in ros::SubscriptionQueue::call() () from /opt/ros/kinetic/lib/libroscpp.so
#4  0x76e8ed3c in ros::CallbackQueue::callOneCB(ros::CallbackQueue::TLS*) () from /opt/ros/kinetic/lib/libroscpp.so
#5  0x76e8fffc in ros::CallbackQueue::callAvailable(ros::WallDuration) () from /opt/ros/kinetic/lib/libroscpp.so
#6  0x76eee420 in ros::SingleThreadedSpinner::spin(ros::CallbackQueue*) () from /opt/ros/kinetic/lib/libroscpp.so
#7  0x76ed45c0 in ros::spin() () from /opt/ros/kinetic/lib/libroscpp.so
#8  0x00011048 in main ()

Backtrace after building in debug mode:

Thread 1 "RoverMoveNode" received signal SIGSEGV, Segmentation fault.
0x73001ca8 in ?? ()
(gdb) bt
#0  0x73001ca8 in ?? ()
#1  0x76f86388 in boost::detail::sp_counted_base::release (this=0x73000ad8)
    at /usr/local/include/boost/smart_ptr/detail/sp_counted_base_spin.hpp:103
#2  0x76ee5820 in boost::detail::sp_counted_impl_pd<ros::MessageDeserializer*, boost::detail::sp_ms_deleter<ros::MessageDeserializer> >::dispose() () from /opt/ros/kinetic/lib/libroscpp.so
#3  0x76ee9e38 in ros::SubscriptionQueue::call() () from /opt/ros/kinetic/lib/libroscpp.so
#4  0x76e8ed3c in ros::CallbackQueue::callOneCB(ros::CallbackQueue::TLS*) () from /opt/ros/kinetic/lib/libroscpp.so
#5  0x76e8fffc in ros::CallbackQueue::callAvailable(ros::WallDuration) () from /opt/ros/kinetic/lib/libroscpp.so
#6  0x76eee420 in ros::SingleThreadedSpinner::spin(ros::CallbackQueue*) () from /opt/ros/kinetic/lib/libroscpp.so
#7  0x76ed45c0 in ros::spin() () from /opt/ros/kinetic/lib/libroscpp.so
#8  0x00011048 in main (argc=1, argv=0x7effef54)
    at /home/pi/rover_workspace_for_dat2019/src/rover_move/src/RoverMoveMain.cpp:6
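A note on reading frames #1 and #2: `sp_counted_base::release()` and `sp_counted_impl_pd<...>::dispose()` belong to the reference-counting machinery behind `boost::shared_ptr`, not to Boost.Thread. The fault therefore occurs while the last owner of a shared object lets go of it. The following is only a minimal sketch of that release-then-dispose order, using `std::shared_ptr` as a stand-in for `boost::shared_ptr`. `FakeMessage` and `runRefCountDemo` are hypothetical names for illustration.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for an object owned through a shared pointer.
struct FakeMessage {
    int id;
};

// Returns a log of events showing when the deleter ("dispose") runs.
std::vector<std::string> runRefCountDemo() {
    std::vector<std::string> log;

    // Custom deleter so the moment of disposal is visible in the log.
    std::shared_ptr<FakeMessage> first(new FakeMessage{1},
                                       [&log](FakeMessage* m) {
                                           log.push_back("dispose");
                                           delete m;
                                       });
    std::shared_ptr<FakeMessage> second = first;  // two owners now

    first.reset();  // count drops to 1, object stays alive
    log.push_back("first released");

    second.reset();  // count drops to 0, deleter runs here
    log.push_back("second released");

    return log;
}
```

The deleter runs only when the second owner releases, which is the same point at which the backtrace shows the crash.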

Comments

One thing that would help is if you launch your node in gdb: http://wiki.ros.org/roslaunch/Tutoria.... If you configure your build (catkin, ament) to have -DCMAKE_BUILD_TYPE=RelWithDebInfo (or Debug) it will allow you to get a backtrace in gdb. When xterm opens your node and you get a gdb prompt, type run, wait for a seg fault, then type bt. That will show which line is triggering the seg fault. If you get that output you can edit your question with the new information.

Thomas D ( 2019-11-19 20:39:15 -0600 )

Did you forget to add an include for nav_msgs/Odometry.h? Otherwise, I don't think you can know what OdometryPtr is.

Thomas D ( 2019-11-19 20:42:18 -0600 )

I have included it and made an alias for it in typedef.h. The problem is that the node runs for several seconds before it dies, so the error occurs at runtime.

mrrius ( 2019-11-19 21:01:13 -0600 )

Also, can you explain how to use gdb without xterm? I run it over ssh and there is no xterm window. I followed the tutorial from your link, and it seems it will produce a log file in $ROS_HOME/core.pid.

mrrius ( 2019-11-19 21:14:15 -0600 )

I got the backtrace and updated the question with it.

mrrius ( 2019-11-19 21:27:51 -0600 )

GDB backtraces are nice, but only really informative if you've compiled your code with Debug symbols enabled.

For a Catkin workspace, you can do that either by forcing the Debug (or RelWithDebInfo) CMAKE_BUILD_TYPE in your CMakeLists.txt, or by invoking Catkin with: catkin_make -DCMAKE_BUILD_TYPE=Debug.

Then run your node in gdb again.

gvdhoorn ( 2019-11-20 02:10:27 -0600 )

I have tried compiling in debug mode and running the node in gdb. It shows the same error:

Thread 1 "RoverMoveNode" received signal SIGSEGV, Segmentation fault. 0x72901ca8 in ?? ()

and I printed the backtrace. It is still the same, with no new information. Is there any method to get more information? I am using ssh to the Raspberry Pi, so I cannot use xterm now. I cannot set any breakpoints when the node runs from a separate launch.

mrrius ( 2019-11-20 02:23:39 -0600 )

It shows same error:

Thread 1 "RoverMoveNode" received signal SIGSEGV, Segmentation fault. 0x72901ca8 in ?? ()

and I print the backtrace. Still same, no new info

then you haven't built your workspace with Debug symbols enabled.

You must remove your build and devel folders from your workspace (or at least those of the package), otherwise it's likely that nothing gets actually rebuilt.

At the very least, the last line (i.e., frame 8, "in main()") should show the exact line number where the segfault is triggered.

gvdhoorn ( 2019-11-20 02:24:57 -0600 )
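One rough way to check whether a rebuild really picked up the Debug build type is to have the node report whether `NDEBUG` is defined. This relies on an assumption about CMake's default flags with GCC: Release and RelWithDebInfo add `-DNDEBUG`, while Debug does not. It cannot detect the `-g` flag itself, so it only separates Debug from the other build types. `buildFlavor` is a hypothetical helper name.

```cpp
#include <string>

// Hypothetical helper: reports which style of build flags this translation
// unit was compiled with, based only on the NDEBUG macro.
// Assumption: the build system defines NDEBUG for Release-style builds,
// as CMake's default GCC flags do.
std::string buildFlavor() {
#ifdef NDEBUG
    return "NDEBUG defined (Release or RelWithDebInfo style flags)";
#else
    return "NDEBUG not defined (Debug style flags)";
#endif
}
```

Printing `buildFlavor()` once at startup, for example right after `ros::init`, would show whether the running binary came from the build that was expected.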