ROS Resources: Documentation | Support | Discussion Forum | Index | Service Status | ros @ Robotics Stack Exchange
Ask Your Question
0

ROS_INFO "Deadlocks" [closed]

asked 2017-01-25 10:33:56 -0500

updated 2017-01-26 05:28:50 -0500

gvdhoorn gravatar image

Hello, I am developing a roscpp multi-threaded application and I am getting one error that I never had:

Two different threads get locked to each other at ROS_INFO calls.

  • One of them is waiting in a boost::unique_lock inside ros::console::print(ros::console::FiterBase*....
  • The other one, is not wating in a lock. It is waiting in a simple write operation inside ros::console::print(ros::console::FiterBase*..... In fact it is blocked by the operative system at the write function in syscall-template.S.

So technically it is not a "dead lock", but for some reason the operative system does not finish the write operation.

Anyone of you have ever had this problem? Do you know how to solve it?

Can it be something related with the buffer size? May I not be calling with enough frequency ros::spinOnce()?


Edit: The method I follow to say the app is "deadlocked" is the following:

  • First I check that the node does not behave externally as usual (via topics and services). It is frozen.

  • Then I attach the debugger to the running application and I see how the debugger is stopped in a system write operation (ROS_INFO),

  • I later, check the state of the rest of the threads of the system. One of them is located in a ROS_INFO operation too, but waiting in the unique_lock

  • I try to do step over in both, but, they both freeze waiting to the SO response

These are the threads I have:

  • ROS_INFO call in the main thread, singleThreadSpinner, in this case attending to a service call guarded by a lock.
  • ros::ROSOutAppender
  • ros::InternalCallbackThreadFunc-> ros::callAvailable -ros::XMLRPCManager::serverThreadFunc
  • second ROS_INFO thread. That thread runs in a loop and uses boost::thread- I guess this should be implemented using a timer, but it is a third party embedded library (hector_mapping). It does not use a spinOnce(), only r.sleep() and ros::ok() (ref )
  • ros::TimerManager (the application also uses a timer)
  • the tf thread
  • the ros::PollManager thread

To me looks like the second ROS_INFO thread should go on a timer. But it is still weird because I think it has been working like this for a while... (but now I am not sure)

edit retag flag offensive reopen merge delete

Closed for the following reason question is not relevant or outdated by tfoote
close date 2019-09-20 02:18:52.236356

Comments

Please avoid posting updates as answers, this is not a forum. I've moved the contents of your post to the OP, but please keep it mind for next time.

gvdhoorn gravatar image gvdhoorn  ( 2017-01-26 05:29:50 -0500 )edit

I agree. Thanks.

Pablo Iñigo Blasco gravatar image Pablo Iñigo Blasco  ( 2017-01-27 03:09:46 -0500 )edit

1 Answer

Sort by » oldest newest most voted
0

answered 2017-01-25 20:19:02 -0500

tfoote gravatar image

I'd suggest looking into why the write is not finishing. If that's actually blocking, whatever you do at the higher level is not going to get very far.

Are you sure it's blocking or is it not just getting overloaded and every time you sample it's in the the same write call but a different instantion of it? It's quite easy to have the console output be the limiting factor for things turning over if you have a lot of output.

If you can provide a small self contained example that would be very helpful in figuring out what's going on.

edit flag offensive delete link more

Comments

Hello tfoote. Thanks for your response, I have answered in a new response because it is enough large.

Pablo Iñigo Blasco gravatar image Pablo Iñigo Blasco  ( 2017-01-26 04:51:27 -0500 )edit

Question Tools

4 followers

Stats

Asked: 2017-01-25 10:33:56 -0500

Seen: 827 times

Last updated: Jan 26 '17