1

Why could the realtime loop die?

asked 2011-04-27 20:42:11 -0600

Guido

updated 2014-11-22 17:05:41 -0600

ngrennan

Hi all,

I'm facing a strange problem. While running a custom controller for all of the PR2 joints, realtime-loop2 dies. It happens randomly when a trajectory is sent to the controller: sometimes on the first trajectory, other times after the Nth trajectory. After the realtime loop dies, the whole robot dies. The robot is started with the command roslaunch /etc/ros/robot.launch (and not sudo robot start; I don't know if this makes a difference). Here are some logs.

Realtime-loop2:

[roscpp_internal] [2011-04-28 10:17:46,613] [thread 0x7f821803a770]: [DEBUG] TCP socket [58] closed 
[roscpp_internal] [2011-04-28 10:17:49,618] [thread 0x7f820c5d8950]: [DEBUG] Accepted connection on socket [9], new socket [58]
[roscpp_internal] [2011-04-28 10:17:49,618] [thread 0x7f820c5d8950]: [DEBUG] TCPROS received a connection from [10.68.0.1:37309]
[roscpp_internal] [2011-04-28 10:17:49,618] [thread 0x7f820c5d8950]: [DEBUG] Connection: Creating ServiceClientLink for service [/pr2_controller_manager/list_controllers] connected to [callerid=[/arms_controllers_stopper] address=[TCPROS connection to [10.68.0.1:37309 on socket 58]]]
[roscpp_internal] [2011-04-28 10:17:49,618] [thread 0x7f820c5d8950]: [DEBUG] Service    client [/arms_controllers_stopper] wants service [/pr2_controller_manager/list_controllers] with md5sum [39c8d39516aed5c7d76284ac06c220e5]
[roscpp_internal] [2011-04-28 10:17:49,619] [thread 0x7f821803a770]: [DEBUG] TCP socket [58] closed
[roscpp_internal] [2011-04-28 10:17:49,695] [thread 0x7f820c5d8950]: [DEBUG] Accepted connection on socket [9], new socket [58]
[roscpp_internal] [2011-04-28 10:17:49,695] [thread 0x7f820c5d8950]: [DEBUG] TCPROS received a connection from [10.68.0.1:37314]
[roscpp_internal] [2011-04-28 10:17:49,695] [thread 0x7f820c5d8950]: [DEBUG] Connection: Creating ServiceClientLink for service [/l_forearm_cam_trigger/set_waveform] connected to [callerid=[/camera_synchronizer_node] address=[TCPROS connection to [10.68.0.1:37314 on socket 58]]]
[roscpp_internal] [2011-04-28 10:17:49,696] [thread 0x7f820c5d8950]: [DEBUG] Service client [/camera_synchronizer_node] wants service [/l_forearm_cam_trigger/set_waveform] with md5sum [cbb7e900a71a9a437da9999c8d39fff4]
[roscpp_internal] [2011-04-28 10:17:49,696] [thread 0x7f821803a770]: [DEBUG] TCP socket [58] closed
[roscpp_internal] [2011-04-28 10:17:49,697] [thread 0x7f820c5d8950]: [DEBUG] Accepted connection on socket [9], new socket [58]
[roscpp_internal] [2011-04-28 10:17:49,697] [thread 0x7f820c5d8950]: [DEBUG] TCPROS received a connection from [10.68.0.1:37315]
[roscpp_internal] [2011-04-28 10:17:49,697] [thread 0x7f820c5d8950]: [DEBUG] Connection: Creating ServiceClientLink for service [/r_forearm_cam_trigger/set_waveform] connected to [callerid=[/camera_synchronizer_node] address=[TCPROS connection to [10.68.0.1:37315 on socket 58]]]
[roscpp_internal] [2011-04-28 10:17:49,697] [thread 0x7f820c5d8950]: [DEBUG] Service client [/camera_synchronizer_node] wants service [/r_forearm_cam_trigger/set_waveform] with md5sum [cbb7e900a71a9a437da9999c8d39fff4]
[roscpp_internal] [2011-04-28 10:17:49,698] [thread 0x7f821803a770]: [DEBUG] TCP socket [58] closed
[roscpp_internal] [2011-04-28 10:17:49,698] [thread 0x7f820c5d8950]: [DEBUG] Accepted connection on socket [9], new socket [58]
[roscpp_internal] [2011-04-28 10:17:49,698] [thread 0x7f820c5d8950]: [DEBUG] TCPROS received a connection from [10.68.0.1:37316]
[roscpp_internal] [2011-04-28 10:17:49,698] [thread 0x7f820c5d8950]: [DEBUG] Connection: Creating ServiceClientLink for service [/head_camera_trigger/set_waveform] connected to [callerid=[/camera_synchronizer_node] address=[TCPROS connection to [10.68.0.1:37316 on socket 58]]]
[roscpp_internal] [2011-04-28 10:17:49,698] [thread 0x7f820c5d8950]: [DEBUG] Service client [/camera_synchronizer_node] wants service [/head_camera_trigger/set_waveform] with md5sum [cbb7e900a71a9a437da9999c8d39fff4]
[roscpp_internal] [2011-04-28 10:17:49,699] [thread 0x7f820c5d8950]: [DEBUG] Accepted connection on socket [9], new socket ...

2 Answers

0

answered 2011-05-01 21:51:37 -0600

Guido

I finally found the problem. As explained in this answer, there are several threads to take into account in a controller. In my case, I have two threads to look after: the realtime loop and the one that runs the callbacks.

My callback was loading new trajectories in a way that was not thread-safe, so the realtime loop could read a region of memory while the callback was still writing to it.

I fixed it with a shared pointer, in the same way as the ROS controllers do.
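For anyone hitting the same issue, here is a minimal sketch of the shared-pointer idea in plain C++. It is not my actual controller code and not the code of the ROS controllers: `Trajectory`, `TrajectoryBox` and `sample` are hypothetical names made up for illustration, and it uses `std::shared_ptr` with a `std::mutex` as an assumption. The point is that the callback builds the new trajectory completely before publishing it, and the realtime loop only ever reads a fully built snapshot.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical trajectory type: a plain list of joint set-points.
struct Trajectory {
  std::vector<double> points;
};

// Hypothetical holder shared by the callback thread and the realtime loop.
class TrajectoryBox {
 public:
  TrajectoryBox() : current_(std::make_shared<Trajectory>()) {}

  // Callback side: build the new trajectory first, then publish it by
  // swapping the pointer. The lock only covers the pointer swap.
  void set(std::vector<double> points) {
    std::shared_ptr<Trajectory> built = std::make_shared<Trajectory>();
    built->points = std::move(points);
    std::shared_ptr<const Trajectory> fresh = built;
    {
      std::lock_guard<std::mutex> lock(mutex_);
      current_.swap(fresh);
    }
    // `fresh` now holds the old trajectory; this thread drops its
    // reference here, outside the lock.
  }

  // Reader side: copy the pointer to get a snapshot that stays valid and
  // unchanged even if set() is called again afterwards.
  std::shared_ptr<const Trajectory> get() const {
    std::lock_guard<std::mutex> lock(mutex_);
    assert(current_ != nullptr);  // invariant: there is always a trajectory
    return current_;
  }

 private:
  mutable std::mutex mutex_;
  std::shared_ptr<const Trajectory> current_;
};

// Realtime-loop side: read one set-point from a consistent snapshot.
// Returns `fallback` when the index is past the end of the trajectory.
double sample(const TrajectoryBox& box, std::size_t index, double fallback = 0.0) {
  const std::shared_ptr<const Trajectory> snapshot = box.get();
  return index < snapshot->points.size() ? snapshot->points[index] : fallback;
}
```

In this sketch the realtime side can still block briefly on the mutex, and it may end up freeing the old trajectory when its snapshot goes out of scope. A real realtime loop would probably want to avoid both, for example by using a non-blocking lock attempt and keeping the previous snapshot when the lock is busy.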

2

answered 2011-04-28 07:15:00 -0600

sglaser

Hi Guido,

If your controller crashes, the entire realtime loop will die. In the PR2 controller manager, all controllers are run in the same process, so if one crashes, the entire process crashes.

Try attaching gdb before loading your controller (you will probably need to run it as root).


Comments

Thanks for the answer. Regardless of the controller, attaching gdb seems to kill the controllers: the realtime loop dies and the motors stop. Odom, IMU and narrow stereo complain about missing transforms and services. gdb: 0x00007fbae6b7956d in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
Guido (2011-04-28 22:05:32 -0600)

