pthread_recursive_mutex - assertion failed error

asked 2020-05-06 16:03:19 -0500

updated 2022-05-23 09:58:14 -0500

I'm getting the following error. I have the latest ROS codebase and I don't have any action servers; I only have publishers, subscribers and services. I'm attaching the backtrace as well.

/usr/include/boost/thread/pthread/recursive_mutex.hpp:113: void boost::recursive_mutex::lock(): Assertion `!pthread_mutex_lock(&m)' failed

#0  0x00007fffeebaefff in raise () from /lib/x86_64-linux-gnu/
#1  0x00007fffeebb042a in abort () from /lib/x86_64-linux-gnu/
#2  0x00007fffeeba7e67 in ?? () from /lib/x86_64-linux-gnu/
#3  0x00007fffeeba7f12 in __assert_fail () from /lib/x86_64-linux-gnu/
#4  0x000055555566120d in boost::recursive_mutex::lock() ()
#5  0x000055555566fce5 in boost::unique_lock<boost::recursive_mutex>::lock() ()
#6  0x00007ffff2ed1795 in ros::Connection::drop(ros::Connection::DropReason) () from /opt/ros/melodic/lib/
#7  0x00007ffff2f4ca97 in ros::TransportTCP::close() () from /opt/ros/melodic/lib/
#8  0x00007ffff2f4daa5 in ros::TransportTCP::socketUpdate(int) () from /opt/ros/melodic/lib/
#9  0x00007ffff2f8b39e in ros::PollSet::update(int) () from /opt/ros/melodic/lib/
#10 0x00007ffff2f0ae75 in ros::PollManager::threadFunc() () from /opt/ros/melodic/lib/
#11 0x00007ffff0206116 in ?? () from /usr/lib/x86_64-linux-gnu/
#12 0x00007fffef9c44a4 in start_thread () from /lib/x86_64-linux-gnu/
#13 0x00007fffeec64d0f in clone () from /lib/x86_64-linux-gnu/

The codebase is very big and I cannot share it due to confidentiality. I don't have any clue where to start in solving this problem. All the threads I have read about this problem deal only with pthread_recursive_mutex errors in action servers.

I also saw that some updates were released in ros/ros_comm, which may have caused this problem. With a ros_comm package version from last year I don't get the pthread_recursive_mutex error. Were any parameters changed in this year's updates that might cause this issue? I'm more than willing to accommodate a fix in our codebase, but I would like to know what the potential causes might be.

I may be doing something very silly, and I can provide more information as needed. I'm not able to reproduce this behavior consistently, and the backtrace shows nothing but ROS library calls (probably into the ros_comm libraries).

Help much appreciated.

That's fine. I should have read the guidelines. I already had the screenshot on hand, which is why I posted it instead of the backtrace in text format. I have updated the question and added some additional comments that might be useful. Could you please re-open the question? Thanks!

venkisagunner  ( 2020-05-10 12:45:45 -0500 )

1 Answer


answered 2020-05-14 12:49:25 -0500

updated 2020-05-14 13:14:42 -0500

The problem was in the ros/ros_comm library itself and has been fixed in a pull request. Refer to ros/ros_comm#1950 in the ros/ros_comm repository.
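For anyone trying to read the assertion itself: the message `!pthread_mutex_lock(&m)` means that pthread_mutex_lock() returned a non-zero error code inside boost::recursive_mutex::lock(). The sketch below is only an illustration of that return-code convention using plain pthreads. It does not reproduce the ros_comm bug, and the helper name relock_result is made up for this example. It assumes a POSIX system; build with -pthread.

```cpp
#include <pthread.h>
#include <cassert>
#include <cerrno>

// Hypothetical helper (not part of Boost or ROS): creates a mutex of the
// given type, locks it twice from the same thread, and returns the result
// of the second lock call.
// Only pass PTHREAD_MUTEX_RECURSIVE or PTHREAD_MUTEX_ERRORCHECK here;
// a normal mutex would deadlock on the second lock.
int relock_result(int mutex_type) {
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, mutex_type);

    pthread_mutex_t m;
    pthread_mutex_init(&m, &attr);
    pthread_mutexattr_destroy(&attr);

    // The same style of check that the failing Boost assertion performs:
    // a return value of 0 means the lock was acquired.
    int first = pthread_mutex_lock(&m);
    assert(first == 0);

    // Recursive mutex: returns 0 again.
    // Error-checking mutex: returns EDEADLK instead of blocking.
    int second = pthread_mutex_lock(&m);

    // Release once per successful lock, then clean up.
    if (second == 0) {
        pthread_mutex_unlock(&m);
    }
    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return second;
}
```

Calling relock_result(PTHREAD_MUTEX_RECURSIVE) gives 0, while relock_result(PTHREAD_MUTEX_ERRORCHECK) gives EDEADLK. A healthy recursive mutex should therefore never trip the assertion just because the same thread locks it twice, so a failure like the one in the backtrace suggests the lock call itself reported an error. Which error occurred in ros::Connection::drop() is not stated in this thread; see the linked pull request for the actual cause.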

