Node exit code -11
Hello,
I am creating a really simple subscriber to a published message. The message comes from another node that processes images and produces odometry. My node consists of three files: RoverMoveMain.cpp, RoverMoveSimple.cpp, and RoverMoveSimple.h.
RoverMoveSimple.h
#include <stdlib.h>
#include <cmath>
#include "helper/constant.h"
#include "helper/typedef.h"
#include "helper/helper.h"
#include "boost/atomic.hpp"
#include "tf2/convert.h"
#include "tf2/LinearMath/Matrix3x3.h"
#ifndef ROVER_MOVE_H
#define ROVER_MOVE_H
using namespace std;
class RoverMoveSimple {
protected:
    // Attribute
    NodeHandle n;
    Sub cur_pos_sub;
public:
    // method
    RoverMoveSimple(NodeHandle* nh);
    void UpdatePosCB(const OdometryPtr& cur_pos);
};
RoverMoveSimple.cpp
RoverMoveSimple::RoverMoveSimple(NodeHandle* nh) {
    this->cur_pos_sub = nh->subscribe("odom", 1000, &RoverMoveSimple::UpdatePosCB, this);
}
void RoverMoveSimple::UpdatePosCB(const OdometryPtr& cur_pos) {
    cout << cur_pos->pose.pose.position.x << ";" << cur_pos->pose.pose.position.y << "\n";
}
RoverMoveMain.cpp
#include "rover_move/RoverMoveSimple.h"
int main(int argc, char** argv) {
    ros::init(argc, argv, "RoverMove");
    NodeHandle n;
    RoverMoveSimple rm = RoverMoveSimple(&n);
    ros::spin();
    return 0;
}
The problem is that my node runs fine for some time, then stops working with an error. The reported exit code is -11. From what I have checked, this is probably related to a segmentation fault. Is this correct? And how can this error happen? This is really just a simple node.
Update: backtrace using gdb. It seems there is some problem in the boost thread used by ROS when it deletes a message or something.
#0 0x729012b0 in ?? ()
#1 0x76f85f0c in boost::detail::sp_counted_base::release() ()
from /home/pi/rover_workspace_for_dat2019/devel/lib/libRoverMove.so
#2 0x76ee5820 in boost::detail::sp_counted_impl_pd<ros::MessageDeserializer*, boost::detail::sp_ms_deleter<ros::MessageDeserializer> >::dispose() () from /opt/ros/kinetic/lib/libroscpp.so
#3 0x76ee9e38 in ros::SubscriptionQueue::call() () from /opt/ros/kinetic/lib/libroscpp.so
#4 0x76e8ed3c in ros::CallbackQueue::callOneCB(ros::CallbackQueue::TLS*) () from /opt/ros/kinetic/lib/libroscpp.so
#5 0x76e8fffc in ros::CallbackQueue::callAvailable(ros::WallDuration) () from /opt/ros/kinetic/lib/libroscpp.so
#6 0x76eee420 in ros::SingleThreadedSpinner::spin(ros::CallbackQueue*) () from /opt/ros/kinetic/lib/libroscpp.so
#7 0x76ed45c0 in ros::spin() () from /opt/ros/kinetic/lib/libroscpp.so
#8 0x00011048 in main ()
After building in debug mode:
Thread 1 "RoverMoveNode" received signal SIGSEGV, Segmentation fault.
0x73001ca8 in ?? ()
(gdb) bt
#0 0x73001ca8 in ?? ()
#1 0x76f86388 in boost::detail::sp_counted_base::release (this=0x73000ad8)
at /usr/local/include/boost/smart_ptr/detail/sp_counted_base_spin.hpp:103
#2 0x76ee5820 in boost::detail::sp_counted_impl_pd<ros::MessageDeserializer*, boost::detail::sp_ms_deleter<ros::MessageDeserializer> >::dispose() () from /opt/ros/kinetic/lib/libroscpp.so
#3 0x76ee9e38 in ros::SubscriptionQueue::call() () from /opt/ros/kinetic/lib/libroscpp.so
#4 0x76e8ed3c in ros::CallbackQueue::callOneCB(ros::CallbackQueue::TLS*) () from /opt/ros/kinetic/lib/libroscpp.so
#5 0x76e8fffc in ros::CallbackQueue::callAvailable(ros::WallDuration) () from /opt/ros/kinetic/lib/libroscpp.so
#6 0x76eee420 in ros::SingleThreadedSpinner::spin(ros::CallbackQueue*) () from /opt/ros/kinetic/lib/libroscpp.so
#7 0x76ed45c0 in ros::spin() () from /opt/ros/kinetic/lib/libroscpp.so
#8 0x00011048 in main (argc=1, argv=0x7effef54)
at /home/pi/rover_workspace_for_dat2019/src/rover_move/src/RoverMoveMain.cpp:6
Asked by mrrius on 2019-11-19 21:11:47 UTC
Comments
One thing that would help is if you launch your node in gdb: http://wiki.ros.org/roslaunch/Tutorials/Roslaunch%20Nodes%20in%20Valgrind%20or%20GDB. If you configure your build (catkin, ament) to have -DCMAKE_BUILD_TYPE=RelWithDebInfo (or Debug) it will allow you to get a backtrace in gdb. When xterm opens your node and you get a gdb prompt, type run, wait for a seg fault, then type bt. That will show which line is triggering the seg fault. If you get that output you can edit your question with the new information.
Asked by Thomas D on 2019-11-19 21:39:15 UTC
Did you forget to add an include for nav_msgs/Odometry.h? Otherwise, I don't think you can know what OdometryPtr is.
Asked by Thomas D on 2019-11-19 21:42:18 UTC
I have included it and made an alias for it in typedef.h. The problem is that the node runs for several seconds before it dies, so the error occurs at runtime.
Asked by mrrius on 2019-11-19 22:01:13 UTC
Also, can you explain how to use gdb without xterm? I run it over ssh and there is no xterm window. I have followed the tutorial from your link and it seems it will produce a log file in $ROS_HOME/core.pid.
Asked by mrrius on 2019-11-19 22:14:15 UTC
I got the bt. Updated in question.
Asked by mrrius on 2019-11-19 22:27:51 UTC
GDB backtraces are nice, but only really informative if you've compiled your code with Debug symbols enabled.
For a Catkin workspace, you can do that either by forcing the Debug (or RelWithDebInfo) CMAKE_BUILD_TYPE in your CMakeLists.txt, or by invoking Catkin with: catkin_make -DCMAKE_BUILD_TYPE=Debug.
Then run your node in gdb again.
Asked by gvdhoorn on 2019-11-20 03:10:27 UTC
I have tried compiling in debug mode and running the node in gdb. It shows the same error:
Thread 1 "RoverMoveNode" received signal SIGSEGV, Segmentation fault. 0x72901ca8 in ?? ()
and I printed the backtrace. It is still the same, with no new info. Is there any method to get more information? I am using ssh to the Raspberry Pi, so I cannot use xterm now. I cannot set any breakpoint when running in a separate launch.
Asked by mrrius on 2019-11-20 03:23:39 UTC
Then you haven't built your workspace with Debug symbols enabled.
You must remove your build and devel folders from your workspace (or at least those of the package), otherwise it's likely that nothing gets actually rebuilt.
At the very least, the last line (ie: frame 8 "in main()") should show the exact line number where the SEGFAULT is caused.
Asked by gvdhoorn on 2019-11-20 03:24:57 UTC
OK, I have done that and updated the post. It seems the line where the error occurs is ros::spin().
Asked by mrrius on 2019-11-20 03:59:11 UTC
The gdb output does not clearly show me where the error is. However, a lot of information is missing. In your include, the helper includes are not shown. The NodeHandle, Sub, and OdometryPtr types are not defined. Your package.xml and CMakeLists.txt are not shown, and that can be a source of errors. Based on what is shown I can't find where the seg fault is being triggered. I modified your files locally and fixed up a few things (missing data types, removing unused includes, adding an #endif) and everything ran fine for me. I was able to use rostopic pub to publish on the odom topic and this node printed out position. If you can provide the same setup that is not working for you I would be happy to look again.
Asked by Thomas D on 2019-11-20 08:45:40 UTC
So, I did what you did and published the odom topic myself. It gave the same error. I had set the publisher frequency to 30 Hz and that seems to be the cause. Probably, due to the high rate, there is a condition where the data is read late, after it has already been released by ROS. If I set it to 10 Hz, it seems to work fine. I run this on a Raspberry Pi, which probably has limitations.
The weird thing is, if I change the message type to a simple string, there is no problem at 30 Hz. Maybe the message size also affects this. The Odometry type is more complicated and may take more time to process than a string.
Asked by mrrius on 2019-11-20 17:49:35 UTC