Recover corrupted bag file
I have a large recording (~16GB) of lidar data (point clouds, objects...) in a rosbag .bag file (done with rqt_bag).
The file produced an error when playing back:
Error: Received an invalid TCPROS header. Each line must have an equals sign.
at line 103 in /tmp/binarydeb/ros-noetic-cpp-common-0.7.2/src/header.cpp
[FATAL] [1598618534.165082911]: Error reading FILE_HEADER record
I used rosbag reindex and the resulting file no longer gives any errors; however, no topics seem to be recognized. rosbag info only shows:
version: 2.0
size: 16.0 GB
I then tried to use rosbag fix, which produces an empty bag (4.1kb in size).
During recording, the computer may have gone into sleep mode for a brief moment, which I suspect is the reason for the corruption. When viewing the file in vim, I see that the data should be there, so I hope there is some way of recovering it. The experiment is hard to repeat, so I would be extremely grateful for some hint about what I can do.
I uploaded the first lines of the bag file before reindexing here: https://drive.google.com/file/d/1Tzab...
ROS distro: noetic
Corruption with .bags is typically non-trivial to fix, and there aren't really any tools which automate the process. Re-indexing is typically also not something which fixes these kinds of problems.
I'm not saying it's impossible, but unless it's a trivial fix, I would not expect an easy solution.
What did you do after the PC resumed? A common problem with .bags is for them to not have the required blocks at the end which rosbag needs to use the file.
Is the FILE_HEADER error thrown immediately at the beginning, or after some time?
I did not do anything after the PC resumed - I only saw that the visualization in rviz was ok and rosbag was also still running and the file kept growing, so I assumed everything was ok. The FILE_HEADER error is thrown immediately.
Is there some documentation on the required blocks at the end? I only found http://wiki.ros.org/Bags/Format/2.0 on rosbags. I've also uploaded the last lines of the file to https://drive.google.com/file/d/1-ZCp... .
Thanks a lot for your help!
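For reference, the Bags/Format/2.0 page linked above describes the start of a bag as a `#ROSBAG V2.0` version line followed by a bag header record whose fields (`index_pos`, `conn_count`, `chunk_count`, `op`) are each stored as a 4-byte little-endian length plus `name=value`. The following is only a minimal sketch of reading that record with plain Python, assuming that layout; `read_bag_header` is a hypothetical helper and not part of rosbag. A field without `=` is rejected here, which is presumably the same condition the error message in the question complains about.

```python
import struct

MAGIC = b"#ROSBAG V2.0\n"  # version line at the very start of the file (13 bytes)


def read_bag_header(path):
    """Parse the bag header record that follows the version line.

    Hypothetical helper, assuming the layout from the Bags/Format/2.0 page:
    <header_len:uint32 LE> then fields of <field_len:uint32 LE><name>=<value>.
    """
    with open(path, "rb") as f:
        if f.read(len(MAGIC)) != MAGIC:
            raise ValueError("missing '#ROSBAG V2.0' version line")
        raw = f.read(4)
        if len(raw) != 4:
            raise ValueError("file too short for a record header")
        (header_len,) = struct.unpack("<I", raw)
        if header_len == 0 or header_len > 4096:
            raise ValueError("implausible header length: %d" % header_len)
        header = f.read(header_len)
    if len(header) != header_len:
        raise ValueError("truncated bag header record")

    fields = {}
    pos = 0
    while pos < len(header):
        if pos + 4 > len(header):
            raise ValueError("truncated field length")
        (flen,) = struct.unpack_from("<I", header, pos)
        pos += 4
        if flen == 0 or pos + flen > len(header):
            raise ValueError("field length out of range")
        name, sep, value = header[pos:pos + flen].partition(b"=")
        if not sep or not name:
            raise ValueError("header field without an equals sign")
        fields[name.decode("ascii", "replace")] = value
        pos += flen

    for required in ("op", "index_pos", "conn_count", "chunk_count"):
        if required not in fields:
            raise ValueError("bag header is missing field %r" % required)

    return {
        "op": fields["op"][0],  # 0x03 for the bag header record
        "index_pos": struct.unpack("<Q", fields["index_pos"])[0],
        "conn_count": struct.unpack("<I", fields["conn_count"])[0],
        "chunk_count": struct.unpack("<I", fields["chunk_count"])[0],
    }
```

If this raises on a damaged file, the message at least indicates which part of the header is broken. On a bag that was never closed properly, `index_pos` may not point at valid records.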
I have a few ideas, and I'm willing to take a look (in the next couple weeks), but I'm going to need (much) more of the .bag and I can't guarantee anything (ie: whether it's fixable or not).
If you're willing/able/allowed to share it: it's not compressed right now (at least not the chunks I could identify), could you try something like this to compress it and see what the resulting size is:
Make sure you have the pxz package installed.
Note: this will create corrupted.bag.xz in /path/to and use all CPU cores available, so preferably use a high spec machine.
Also: as a first attempt: try removing the bytes from offset 0x0D to 0x50 (ie: remove the sequence of 68 bytes starting at offset 0x0D). I'm not sure what ...
I tried removing the bytes you mentioned. There's a new error:
It compressed to ~500MB and I uploaded it here: https://drive.google.com/file/d/1yFLn...
I'll also keep trying to get a better understanding of bags and fix it and will keep you updated here.
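The byte removal suggested above (offsets 0x0D through 0x50 inclusive, i.e. 68 bytes) can be done without editing a 16 GB file in place. This is a minimal sketch that writes a trimmed copy and leaves the original untouched; `copy_without_range` and the output path are hypothetical names chosen for illustration.

```python
import shutil


def copy_without_range(src, dst, start, end_inclusive, bufsize=1 << 20):
    """Copy src to dst, skipping the bytes at offsets start..end_inclusive."""
    if start < 0 or end_inclusive < start:
        raise ValueError("invalid byte range")
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        # Keep everything before the range (small here: 13 bytes for 0x0D).
        fout.write(fin.read(start))
        # Jump past the range and stream the rest in chunks.
        fin.seek(end_inclusive + 1)
        shutil.copyfileobj(fin, fout, bufsize)


# Example call (paths are placeholders):
# copy_without_range("/path/to/corrupted.bag", "/path/to/trimmed.bag", 0x0D, 0x50)
```

Working on a copy keeps the original recording available for other recovery attempts.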
I replaced the first part of the file with that of another bag file with the same setup. There's no error anymore, but it does not recognize any messages. Also tried opening it in spectx, which recognizes a few messages, but only from less than the first second of the recording. rosbag info shows the correct topics, but 0 messages. Don't know if what I'm doing makes much sense, but I want to try removing the first messages from the file. Is there some marker that separates messages?
That works to 'silence' the errors, but not to make the bag usable. The bag header structure is located in that "first part" and it actually contains important information about the data in the .bag. Without that information, rosbag will not be able to find anything in it.
I've tried a few things, and I've gotten a bit further, but there is something really strange about the .bag. It's as if rosbag (or any of the underlying OS systems) decided to distribute the contents (ie: bytes) of some blocks all around the file at random locations. I'm not entirely sure yet, but it's strange to "suddenly" see some bytes (which appear to belong to a CHUNK_INFO record, for instance) appear in the middle of a list of INDEX_DATA records ...
Just a quick update: I've not been able to make much progress. There still appear to be parts in the bag overwritten by others.
The most efficient way forward would likely be to scan the bag for CHUNKs and CONNECTION_INFO headers and then piece together a new bag (sort of what reindex does).
Thanks for all the effort! I will look into identifying these headers, but at this point, I think it will be simpler to arrange to repeat the experiment. If I do make some progress recovering the file, I will update this post.
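As a starting point for identifying these headers, the scan could be sketched roughly as follows. This is not what `reindex` does internally; it is only an illustration that assumes the record layout from the Bags/Format/2.0 page (a 4-byte little-endian header length, then `name=value` fields, then a 4-byte data length) and the op codes listed there (0x05 for chunk records, 0x07 for connection records). `parse_record_header` and `scan_records` are hypothetical helpers. The scan looks for the byte pattern of an `op` field and then walks backwards to find an offset from which the whole header parses cleanly.

```python
import struct

# Op codes as listed on the Bags/Format/2.0 page.
OP_NAMES = {0x02: "MSG_DATA", 0x03: "FILE_HEADER", 0x04: "INDEX_DATA",
            0x05: "CHUNK", 0x06: "CHUNK_INFO", 0x07: "CONNECTION"}


def parse_record_header(buf, pos, max_header_len=4096):
    """Try to parse a record header starting at pos.

    Returns (fields, data_len_pos) or None if the bytes do not form a
    consistent header. max_header_len is an arbitrary sanity limit.
    """
    if pos < 0 or pos + 4 > len(buf):
        return None
    (header_len,) = struct.unpack_from("<I", buf, pos)
    if header_len == 0 or header_len > max_header_len:
        return None
    p = pos + 4
    end = p + header_len
    if end + 4 > len(buf):  # the data length must follow the header
        return None
    fields = {}
    while p < end:
        if p + 4 > end:
            return None
        (flen,) = struct.unpack_from("<I", buf, p)
        p += 4
        if flen == 0 or p + flen > end:
            return None
        name, sep, value = bytes(buf[p:p + flen]).partition(b"=")
        if not sep or not name:
            return None
        fields[name] = value
        p += flen
    if len(fields.get(b"op", b"")) != 1:
        return None
    return fields, end


def scan_records(buf, op, max_back=256):
    """Yield (offset, fields) for records with the given op code.

    buf can be bytes or a read-only mmap of the bag file.
    """
    needle = b"\x04\x00\x00\x00op=" + bytes([op])  # field length 4, then "op=<op>"
    hit = buf.find(needle)
    while hit != -1:
        # The record starts somewhere before the op field; try nearby offsets.
        for back in range(max_back):
            pos = hit - 4 - back
            if pos < 0:
                break
            parsed = parse_record_header(buf, pos)
            if parsed is None:
                continue
            fields, data_len_pos = parsed
            if fields[b"op"] == bytes([op]) and hit < data_len_pos:
                yield pos, fields
                break
        hit = buf.find(needle, hit + 1)


# Example use on a file (path is a placeholder):
# import mmap
# with open("/path/to/corrupted.bag", "rb") as f, \
#         mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
#     for pos, fields in scan_records(m, 0x05):
#         print(hex(pos), OP_NAMES[0x05], fields.get(b"size"))
```

Connection records stored inside chunks are only visible to this kind of scan when the chunks are uncompressed, which appears to be the case for this bag. False positives are possible, so any offsets found would still need to be checked before a new bag is assembled from them.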