ideas to salvage corrupt tie()ed DB_File file?

botfood

After extensive testing I am stuck with the realization that a database
file salvaged from a corrupted webserver has been messed up just enough
that tie() fails to open it. I don't know whether it was a virus, a
mechanical crash, or a software bug. The -f and -r checks
pass the file as existing and readable, but the tie() using DB_File
fails and $! reports 'No such file or directory'.
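One quick check before anything else is whether the file still starts with a recognizable Berkeley DB header. The sketch below is only an illustration, not something from the thread. It assumes the usual magic numbers (0x061561 for hash, 0x053162 for btree) and the usual positions (offset 0 for DB 1.85 files, offset 12 for later versions), so treat those constants as assumptions to verify against your own Berkeley DB headers. The sub names are made up for the example.

```perl
use strict;
use warnings;

# Assumed Berkeley DB magic numbers (check db.h on your system).
my %MAGIC = (
    0x00061561 => 'hash',
    0x00053162 => 'btree',
);

# Look for a magic number in the first 16 bytes of the file.
# Assumption: DB 1.85 keeps it at offset 0, later versions at offset 12.
# Either byte order can occur, depending on the machine that wrote the file.
sub identify_bdb {
    my ($header) = @_;
    return 'too short' if length($header) < 16;
    for my $offset (0, 12) {
        for my $template ('N', 'V') {    # N = big-endian, V = little-endian
            my $value = unpack($template, substr($header, $offset, 4));
            if (my $type = $MAGIC{$value}) {
                my $era   = $offset == 0 ? '1.85-style' : '2.x-or-later-style';
                my $order = $template eq 'N' ? 'big-endian' : 'little-endian';
                return "$type, $era, $order";
            }
        }
    }
    return 'no Berkeley DB magic number found';
}

# Read the first 16 bytes in binary mode; the path comes from the caller.
sub read_header {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "open $path: $!";
    my $n = read($fh, my $buf, 16);
    close $fh;
    return defined $n ? $buf : '';
}

# Usage: perl check_magic.pl yourfile.db
if (@ARGV) {
    print identify_bdb(read_header($ARGV[0])), "\n";
}
```

If no magic number turns up, the first page of the file is probably damaged, which would explain why tie() refuses the file even though -f and -r succeed.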

How I determined that has taken two days and some kicking around in
a different thread. What I am after now is any thoughts people might
have on how to 'fix' the file to recover as much of the data as
possible.

To work with the file I pulled a copy from the Linux server to my
PC with a binary transfer, in hopes I can pick the data out...

If I open the file with WordPad (it is too big for Notepad) I think I see a
pattern of readable text, with lines of binary gibberish mixed in that
I assume has to do with DB_File indexing and Berkeley DB internal markers.
The trouble is that the human-readable data comes in chunks, and
the key of the first record in each chunk isn't discernible the way it is for
the rest of the records in the chunk, which look like:
156100.1DANICA||ROMEROI...and more fields
where the value set off by the squares is what I recognize as the key. The
first record in each chunk typically has a long string of gibberish
with the values starting right afterward, rather than the key between
'block' characters, so I don't see how to pick out the key for the first
value string in each block of human-readable text.

What I am wondering to the group here is whether it is worth my time to
attempt to extract the pattern of readable text from the file created
as a tie()ed DB_File, save as plain text, and then write another import
tool to write the text back into a tie()ed file.

or... lost cause?

I tried to paste in a more complete 'chunk' of what I am looking at so
you can see the pattern, but some of the special characters won't go
into the Google message composition window.
 
Diomidis Spinellis

botfood said:
after extensive testing I am stuck with the realization that a database
file salvaged from a corrupted webserver has been messed up just enough
so that tie() fails to open it. I don't know whether it was a virus, a
mechanical crash, or caused by a software bug. using -f and -r checks
pass the file as existing and readable, but the tie() using DB_File
fails and the $! reports 'No such file or directory'.
[...]

What I am wondering to the group here is whether it is worth my time to
attempt to extract the pattern of readable text from the file created
as a tie()ed DB_File, save as plain text, and then write another import
tool to write the text back into a tie()ed file.

Assuming that your data is tied to a Berkeley DB file, try using the
db_dump command that comes with Berkeley DB to dump the data in a
readable format. The command's -r and -R options allow you to salvage
data from a corrupt database. If the command's documentation is not
available on your system, you can find it online at
<http://www.sleepycat.com/docs/utility/db_dump.html>.
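Once db_dump has produced something, the dump still has to be turned back into key/value pairs before it can be reloaded. Here is a rough sketch of that step. It assumes the dump was made in print format (db_dump's -p option, for example `db_dump -r -p yourfile.db > dump.txt`), where the records sit between HEADER=END and DATA=END, each key and value is on its own line with one leading space, and non-printable bytes are written as a backslash plus two hex digits. The sub names are invented for the example, and salvaged output may need hand editing before it parses cleanly.

```perl
use strict;
use warnings;

# Undo the escaping of print-format dumps: "\\" is a literal backslash
# and "\xx" (two hex digits) is one non-printable byte.
sub unescape {
    my ($s) = @_;
    $s =~ s/\\(\\|[0-9a-fA-F]{2})/$1 eq '\\' ? '\\' : chr(hex($1))/ge;
    return $s;
}

# Turn the lines of a print-format dump into a hash of key => value.
sub parse_dump {
    my @lines = @_;
    my (%records, @items);
    my $in_data = 0;
    for my $line (@lines) {
        $line =~ s/\r?\n\z//;
        if ($line eq 'HEADER=END') { $in_data = 1; next; }
        if ($line eq 'DATA=END')   { $in_data = 0; next; }
        next unless $in_data;
        $line =~ s/^ //;    # each key and value line starts with one space
        push @items, unescape($line);
    }
    # Keys and values alternate; a trailing key with no value is dropped.
    while (@items >= 2) {
        my ($key, $value) = splice(@items, 0, 2);
        $records{$key} = $value;
    }
    return %records;
}

# Usage: perl parse_dump.pl dump.txt
if (@ARGV) {
    open(my $fh, '<', $ARGV[0]) or die "open $ARGV[0]: $!";
    my %records = parse_dump(<$fh>);
    close $fh;
    printf "%d records recovered\n", scalar keys %records;
}
```

From there, the recovered pairs could be assigned into a hash tied to a fresh DB_File, or the dump could be fed to db_load instead, if that utility is available on the host.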
 
botfood

Diomidis said:
....
Assuming that your data is tied to a Berkeley DB file, try using the
db_dump command that comes with Berkeley DB to dump the data in a
readable format. The command's -r and -R options allow you to salvage
data from a corrupt database. If the command's documentation is not
available on your system, you can find it online at
<http://www.sleepycat.com/docs/utility/db_dump.html>.
---------------------------------------

great idea, I will see if I can figure out how to run it remotely...
The complicating factor is that the DB file is on my remote webhost on
Linux, and all I have at home is Windoze. I might be able to get at it
via shell commands.

thanks for the idea,
Dan
 
Anno Siegel

botfood said:
Diomidis Spinellis wrote:

[try db_dump]

If the DB is damaged, I'm afraid db_dump will also reject it. Then
again, it's worth a try. It may be more explicit about what is wrong.

botfood said:
great idea, I will see if I can figure out how to run it remotely...
The complicating factor is that the DB file is on my remote webhost on
Linux, and all I have at home is Windoze. I might be able to get at it
via shell commands.

Oh, there's a Windows machine involved? Have the database files ever
been transferred between the two? Could the transfer have happened
in some kind of ascii- or text mode? If so, the damage may be reparable.
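A rough way to test for that kind of damage: a text-mode transfer from Unix to Windows turns every LF byte into CR LF, so a mangled copy will contain no bare LF at all, which would be unusual for a binary file of any size. The following is only a heuristic sketch with made-up sub names, and the repair is lossy if the original file happened to contain genuine CR LF pairs. A transfer in the other direction (CR LF collapsed to LF) cannot be undone this way.

```perl
use strict;
use warnings;

# Count LF bytes, and how many of them are preceded by a CR.
sub newline_stats {
    my ($bytes) = @_;
    my $lf   = () = $bytes =~ /\n/g;
    my $crlf = () = $bytes =~ /\r\n/g;
    return ($lf, $crlf);
}

# Heuristic: suspicious if the data has LF bytes and none of them is bare.
sub looks_text_mangled {
    my ($bytes) = @_;
    my ($lf, $crlf) = newline_stats($bytes);
    return $lf > 0 && $lf == $crlf;
}

# Reverse an LF -> CR LF translation. Work on a copy of the file only.
sub undo_crlf {
    my ($bytes) = @_;
    (my $fixed = $bytes) =~ s/\r\n/\n/g;
    return $fixed;
}

# Usage: perl crlf_check.pl damaged.db repaired.db
if (@ARGV == 2) {
    my ($in, $out) = @ARGV;
    open(my $ifh, '<:raw', $in) or die "open $in: $!";
    my $bytes = do { local $/; <$ifh> };
    close $ifh;
    if (looks_text_mangled($bytes)) {
        open(my $ofh, '>:raw', $out) or die "open $out: $!";
        print {$ofh} undo_crlf($bytes);
        close $ofh or die "close $out: $!";
        print "wrote candidate repair to $out\n";
    }
    else {
        print "no sign of a text-mode transfer\n";
    }
}
```

A repaired copy is only a candidate; it still has to open with tie() or survive db_dump before it can be trusted.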

Anno
 
Diomidis Spinellis

Anno said:
botfood said:
Diomidis Spinellis wrote:

[try db_dump]

If the DB is damaged, I'm afraid db_dump will also reject it. Then
again, it's worth a try. It may be more explicit about what is wrong.

db_dump has two options for recovering damaged files:

-r Salvage data from a possibly corrupt file.

-R Aggressively salvage data from a possibly corrupt file. The -R
flag differs from the -r option in that it will return all possible data
from the file at the risk of also returning already deleted or otherwise
nonsensical items. Data dumped in this fashion will almost certainly
have to be edited by hand or other means before the data is ready for
reload into another database.
 
botfood

Anno said:
botfood said:
Diomidis Spinellis wrote:

[try db_dump]

If the DB is damaged, I'm afraid db_dump will also reject it. Then
again, it's worth a try. It may be more explicit about what is wrong.

Oh, there's a Windows machine involved? Have the database files ever
been transferred between the two? Could the transfer have happened
in some kind of ascii- or text mode? If so, the damage may be reparable.
-----------------------------------

Yes, Windoze is involved, BUT it is probably not the culprit in this case. I did
write a little Perl script that goes out every night and FTPs copies of
these files in binary mode, and I've tested for restore capability.

This specific problem occurred when a host webserver crashed hard and
corrupted files on the way down. The stars were aligned and my last backup
pulled copies that had already been corrupted by a mechanical head crash,
a virus, or god knows what; then the computer literally melted down,
all disks lost, including the disk-to-disk RAID backups. So my backups
were corrupt, and their hard disks are slag.

So... here I am up the creek without a viable backup except a 3-week-old
flat text file I pulled for a special report. I am going to import
that first, and then attempt to extract whatever else I can with the
db_dump utilities.
 
