Ask Your Question
0

Ros Node Hang up

asked 2014-11-24 15:40:28 -0500

satk gravatar image

updated 2014-11-25 11:26:41 -0500

Hi,

I am writing a subscriber node with a long callback function. The node calls the callback function once or twice and then hangs up sometimes inside the callback function or sometimes outside the callback function. I have been able to detect where it hangs by publishing ROS_INFO messages at various points in the code.

Inside the callback function, I modify global variables (matrix). Could that be the reason for the node getting stuck during runtime ?

Any other guesses for why a node will hang such that I cant even ctrl+c and stop it ?

Here is the code:

#include "ros/ros.h"
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <Eigen/Dense>
#include <Eigen/LU>
#include "std_msgs/String.h"
#include <std_msgs/Float32.h>
#include <std_msgs/Int32.h>
#include <tf/transform_broadcaster.h>
#include <nav_msgs/Odometry.h>
#include <coop_est/FeatureB.h>
#include <coop_est/FeatureBArray.h>
#include <nav_msgs/Odometry.h>
#include <tf/transform_datatypes.h>
#include "/opt/ros/hydro/include/tf/LinearMath/Matrix3x3.h"
#include "/opt/ros/hydro/include/tf/LinearMath/Transform.h"
#include "tf/tf.h"
#include <message_filters/subscriber.h>
#include <message_filters/time_synchronizer.h>
#include <message_filters/synchronizer.h>
#include <message_filters/sync_policies/approximate_time.h>
#include <sensor_msgs/Image.h>
#include <sensor_msgs/CameraInfo.h>

using namespace Eigen;
using namespace message_filters;
using namespace sensor_msgs;

// Filter Initialization

VectorXf X(3); // Define Starting position and orientation as the inertial frame
MatrixXf Q(3,3);
MatrixXf P(3,3);

void FeaturesReceived(const coop_est::FeatureBArray::ConstPtr& msg, const nav_msgs::Odometry::ConstPtr& odo) /
{

   DO A NUMBER OF MATRIX OPERATIONS ON THE GLOBAL MATRICES Q, P, Xf

}


int main(int argc, char** argv)
{

  ros::init(argc, argv, "coop_node");

  ros::NodeHandle n;

  int should_continue = 1;
  ros::Rate r(5); // 10 hz


  message_filters::Subscriber<coop_est::FeatureBArray> fea_sub(n, "Feature", 1);
  message_filters::Subscriber<nav_msgs::Odometry> odo_sub(n, "odom", 1);
  typedef sync_policies::ApproximateTime<coop_est::FeatureBArray, nav_msgs::Odometry> MySyncPolicy;
  Synchronizer<MySyncPolicy> sync(MySyncPolicy(1), fea_sub, odo_sub);
  sync.registerCallback(boost::bind(&FeaturesReceived, _1, _2));

  ros::spin();

  return 0;

}

------ UPDATE------

After making the callback function much simpler, I found that the callback function got stuck at the same line repeatedly. I have a for loop inside the callback function and when I change the number of iterations on the for loop, the place where the node gets stuck changes. If the number of iterations are very low then the node does not get stuck.

void FeaturesReceived(const coop_est::FeatureBArray::ConstPtr& msg, const nav_msgs::Odometry::ConstPtr& odo) /
{

for (int i = 0; i< n; i++)
{
   DO A NUMBER OF MATRIX OPERATIONS ON THE GLOBAL MATRICES Q, P, Xf
}
}

My thoughts are that as I am operating on the global Eigen2 type matrix inside the callback, when a second callback is created before the execution of the first one is complete, the program hangs as two callback instances are trying to modify the same memory location.

How can I sequentially execute the callback in ROS and wait until the previous callback has fnished executing ?

Thank you for your help !

edit retag flag offensive close merge delete

2 Answers

Sort by ยป oldest newest most voted
2

answered 2014-11-24 21:55:46 -0500

ahendrix gravatar image

If your node is hanging inside your callback, there's probably some kind of infinite loop in your callback.

ROS registers a sigint handler and uses it to perform an orderly shutdown of your node once all callbacks are done, so if you callback runs forever, your node will hang.

You should still be able to kill your node with SIGQUIT ( ctrl-\ )

edit flag offensive delete link more

Comments

Thank you for your reply ahendrix. I am able to stop the node using ctrl-\ My problem is still not solved. I need a way for ROS to not callback a function again until the previous callback has returned. I am updating the original question with the new findings.

satk gravatar imagesatk ( 2014-11-25 10:20:03 -0500 )edit
1

Unless you're using a multithreaded spinner (you aren't), ROS executes callbacks serially. This is almost certainly a bug in your code.

ahendrix gravatar imageahendrix ( 2014-11-25 20:18:11 -0500 )edit

I agree with ahendrix, there must probably be an infinite loop in your code. In this loop: for (int i = 0; i< n; i++){...} how do you define n? are you sure you don't have your infinite loop right there?

Martin Peris gravatar imageMartin Peris ( 2014-12-02 19:18:15 -0500 )edit
0

answered 2014-12-02 15:55:40 -0500

satk gravatar image

updated 2014-12-03 10:09:35 -0500

I solved the problem by using the mutex lock feature to get the node from getting stuck.

The code that worked for me is below:

#include "ros/ros.h"
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <tf/transform_broadcaster.h>
#include <message_filters/subscriber.h>
#include <message_filters/time_synchronizer.h>
#include <message_filters/synchronizer.h>
#include <message_filters/sync_policies/approximate_time.h>
#include <boost/thread.hpp>

static boost::mutex mutex;
    void Callback(const coop_est::FeatureBArray::ConstPtr& msg, const nav_msgs::Odometry::ConstPtr& odo) /
    {

       DO A NUMBER OF MATRIX OPERATIONS ON THE GLOBAL MATRICES Q, P, Xf

    }


    int main(int argc, char** argv)
    {

      ros::init(argc, argv, "coop_node");

      ros::NodeHandle n;

      mutex.lock();   

      message_filters::Subscriber<coop_est::FeatureBArray> fea_sub(n, "Feature", 1);
      message_filters::Subscriber<nav_msgs::Odometry> odo_sub(n, "odom", 1);
      typedef sync_policies::ApproximateTime<coop_est::FeatureBArray, nav_msgs::Odometry> MySyncPolicy;
      Synchronizer<MySyncPolicy> sync(MySyncPolicy(1), fea_sub, odo_sub);
      sync.registerCallback(boost::bind(&Callback, _1, _2));

      mutex.unlock();
      ros::spin();

      return 0;

    }
edit flag offensive delete link more

Comments

1

There is a discrepancy in your code... you are registering sync.registerCallback(boost::bind(&FeaturesReceived, _1, _2)); but your callback function is named Callback. The mutex while defining the message filters and registering the callback has no effect. You should check your code again

Martin Peris gravatar imageMartin Peris ( 2014-12-02 19:13:57 -0500 )edit

If your problem is solved, please mark the right answer as correct.

BennyRe gravatar imageBennyRe ( 2014-12-03 01:09:44 -0500 )edit

Hi, I have corrected the discrepancy on the blog. In my actual program the two names are the same. I was just trying to make it more understandable for this blog.

satk gravatar imagesatk ( 2014-12-03 10:09:08 -0500 )edit

This still does make no sense. There is only one place for the mutex and unless you are calling main reentrant (don't!) this should not have any effect.

dornhege gravatar imagedornhege ( 2014-12-03 10:19:59 -0500 )edit
1

Still as @Martin Peris stated the mutex you added has no effect! Your problem was somewhere else (or maybe still is, and just cause temporarily no trouble anymore!!!) The code is single threaded where you added the mutex locks, no need for mutexes there!!!

Wolf gravatar imageWolf ( 2014-12-03 10:22:20 -0500 )edit

Ok, I will look into the code. Thank you all.

satk gravatar imagesatk ( 2014-12-03 10:40:28 -0500 )edit

Your Answer

Please start posting anonymously - your entry will be published after you log in or create a new account.

Add Answer

Question Tools

1 follower

Stats

Asked: 2014-11-24 15:40:28 -0500

Seen: 1,146 times

Last updated: Dec 03 '14