Problems expected if logs deleted during process?

asked 2012-07-02 07:00:45 -0500

updated 2014-01-28 17:12:52 -0500

ngrennan

Question:

What would be the expected outcome of deleting logs while a process is running?

Specifically, which of the following categories would that fall under? :)

  • Fine
  • Undefined
  • Very Bad

Use Case:

Our logs are growing too big because of a bug in Diamondback where roscpp_internal does not respect log levels. As a workaround we have a script that deletes particular log files every 30 seconds or so. We've also seen a crash (assert/-6), not yet diagnosed, that closely followed our log workaround change, and we currently suspect our log deletions as the cause.


Comments

Maybe it's a silly question but, could you upgrade to Electric or Fuerte to fix that bug? :)

Eric Perko (2012-07-02 07:56:52 -0500)

Unfortunately the system where this is occurring is a production machine which is locked to diamondback for the time being.

Asomerville (2012-07-02 07:58:53 -0500)

Not silly, but I think we do still need to be able to build, test and release Diamondback fixes.

joq (2012-07-02 08:01:56 -0500)

If you need help debugging this assertion, could you please try to send more information about what happens? (A backtrace would be a good start.)

Thomas (2012-07-02 09:03:49 -0500)

Thanks for the offer. Because of the nature of the situation (remote machine, not connected to the internet), we're going to treat first and diagnose later. I'll update if we find anything interesting.

Asomerville (2012-07-02 09:08:47 -0500)

2 Answers


answered 2012-07-02 08:00:17 -0500

Thomas

updated 2012-07-02 09:00:26 -0500

Actually, what you are trying to do will not help. If you take a look at /proc/XXX/fd, where XXX is the process ID of your ROS node, you will see the files it currently has open, including the log files you thought were deleted.

Don't forget that an open file descriptor counts as a reference to the file, so as long as rosconsole keeps the file open, its data will not actually be removed from disk; only the name disappears.

This explains why what you do is perfectly safe on Linux :)
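To make the point about open file descriptors concrete, here is a minimal sketch in plain Python (not rosconsole itself, so treat it as an illustration of the filesystem behaviour only). The function name `demo_unlink_while_open` is made up for this example, and it assumes a Linux or other POSIX system.

```python
import os
import tempfile


def demo_unlink_while_open():
    """Delete a file while it is still open, then keep writing to it."""
    fd, path = tempfile.mkstemp(suffix=".log")
    with os.fdopen(fd, "w+") as log:
        log.write("before delete\n")
        log.flush()

        os.unlink(path)                     # same effect as `rm` on the path
        path_exists = os.path.exists(path)  # the name is gone from the directory

        log.write("after delete\n")         # still works through the open descriptor
        log.flush()

        info = os.fstat(log.fileno())
        links = info.st_nlink               # expected to be 0 on Linux: no names left
        size = info.st_size                 # the data is still there, and still growing

        log.seek(0)
        content = log.read()
    # Only here, once the descriptor is closed, can the kernel free the blocks.
    return path_exists, links, size, content


if __name__ == "__main__":
    print(demo_unlink_while_open())
```

The write after the `unlink` succeeds without any error, which matches the claim that deleting the logs does not disturb the running process but also does not free any space.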

So maybe this assertion is simply due to you using up all the disk space and making the process crash? Monitor your disk space: you should see it decrease until your process crashes, the file descriptors get released, and the kernel reclaims the space held by the old log files...
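One way to check this theory on the affected machine is sketched below. It relies on Linux showing the target of a deleted file in /proc/XXX/fd with a trailing " (deleted)". The helper names `deleted_open_files` and `free_bytes` are made up for this example, and reading another user's /proc entries may need extra permissions.

```python
import os
import shutil


def deleted_open_files(pid="self"):
    """Return (fd, target, size_bytes) for files a process still holds open
    after their names were deleted. Linux only: relies on /proc/<pid>/fd."""
    fd_dir = f"/proc/{pid}/fd"
    found = []
    for name in os.listdir(fd_dir):
        link = os.path.join(fd_dir, name)
        try:
            target = os.readlink(link)
            if target.endswith(" (deleted)"):
                # stat() on the /proc link follows it to the still-open file
                found.append((int(name), target, os.stat(link).st_size))
        except OSError:
            continue  # descriptor was closed while we were scanning
    return found


def free_bytes(path="/"):
    """Free space, in bytes, on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free


if __name__ == "__main__":
    for fd, target, size in deleted_open_files():
        print(fd, target, size)
    print("free bytes:", free_bytes("/"))
```

Passing the PID of the ROS node instead of the default `"self"` should list the "deleted" logs along with how much space each one still occupies.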

So I think it would be better to really fix the error instead of trying to work around it, if possible ;)


Comments

@Thomas is right, that is how POSIX filesystems work. The space will not be recovered until all references to the inode are gone.

joq (2012-07-02 08:05:38 -0500)

Duly noted regarding file deletions while the fd is still held.

Asomerville (2012-07-02 08:36:29 -0500)

answered 2012-07-02 07:45:26 -0500

joq

I would expect it to work OK in terms of system stability.

But, the status of future log messages from that run might be undefined (e.g. they might get lost).

What do you expect? What actually happens?


Comments

Well, we had a crash (assert/-6) in a system where we had been deleting the logs. We don't have enough info to do a postmortem, but the current assumption is that it was caused by our log deletion. I'll add some of this info to the original question.

Asomerville (2012-07-02 07:48:39 -0500)

Seems plausible. Maybe construct a small test case that logs continuously, while some other process periodically deletes the log?

joq (2012-07-02 08:00:29 -0500)
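A rough sketch of the kind of test case suggested above, in plain Python rather than roscpp, so it only exercises the filesystem behaviour and not rosconsole's own handling. A thread stands in for the separate deleting process. The file name `node.log`, the function name, and the timing values are all made up for illustration.

```python
import os
import shutil
import tempfile
import threading
import time


def run_delete_while_logging(lines=200, delete_every=0.005):
    """Write `lines` log lines through one open handle while a second thread
    keeps deleting the log path. Returns (lines_written, deletions, errors)."""
    log_dir = tempfile.mkdtemp()
    path = os.path.join(log_dir, "node.log")  # hypothetical log file name
    stop = threading.Event()
    deletions = []
    errors = []

    def deleter():
        # Stand-in for the cleanup script: unlink the path over and over.
        while not stop.is_set():
            try:
                os.unlink(path)
                deletions.append(time.time())
            except FileNotFoundError:
                pass  # already deleted; the writer never recreates it
            time.sleep(delete_every)

    written = 0
    with open(path, "a") as log:
        worker = threading.Thread(target=deleter)
        worker.start()
        try:
            for i in range(lines):
                log.write(f"message {i}\n")
                log.flush()
                written += 1
                time.sleep(0.001)
        except OSError as exc:
            errors.append(exc)  # a failed write would show up here
        finally:
            stop.set()
            worker.join()
    shutil.rmtree(log_dir, ignore_errors=True)
    return written, len(deletions), errors


if __name__ == "__main__":
    print(run_delete_while_logging())
```

If the writer finishes with no errors, the deletion on its own is unlikely to be what crashes the process, at least at the filesystem level. Messages written after the delete are lost once the handle is closed, though.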

Hopefully we'll get a chance to do some more tests, but it may instead end up being a treat-first-diagnose-later sort of situation due to time/budget constraints.

Asomerville (2012-07-02 08:37:57 -0500)
