Crash during file writing: how to recover?

Joseph

Hi
I'm writing a commercial program which must be reliable. It has to do
some basic reading and writing to and from files on the hard disk,
and also to a floppy.


I have foreseen a potential problem. The program may crash
unexpectedly while writing to the file. If so, my program should
detect this during startup, and then (during startup) probably delete the
data added to the file and redo the writing operation.

Are file writing operations atomic? That is, when you write to a file,
will it either complete successfully, OR fail halfway (e.g. write a few
letters and not finish), OR not commit any changes to the file if a crash
occurs at this point?

My next question is how is this handled in commercial programming? I
plan on writing a flag (say, a simple char) to another file (this
would signal that a file write is about to begin), and then
removing this char after the file writing operation is completed.
Then on startup I just check the flag. If the flag hasn't been removed, a
crash occurred, so I have to open the file and get rid of any garbage.

Has anyone done anything similar before? If so, how did you handle this
crash scenario? My application could totally stuff up if I don't
handle this right.

By the way, I'm using the Java language and API. This might affect
how files are written, so I thought I should mention it.


MANY THANKS
Joseph
 
Roedy Green

Joseph said:
Are file writing operations atomic? That is, when you write to a file,
will it either complete successfully, OR fail halfway (e.g. write a few
letters and not finish), OR not commit any changes to the file if a crash
occurs at this point?

Imagine a floppy being written. The power fails half way through
writing one of your tracks. You then have gibberish on your floppy.
Further, the FAT and directory are likely out of sync. CHKDSK /F
tries to fix this, but the file still contains gibberish. You
should probably bulk erase and reformat such a floppy.

A very easy way to detect the problem is to write a small file before
you start. You do your thing, and on exit you erase the file. You
can do this for any app by doing the creating, testing and deleting in
a batch (.bat) file.

When you start, you check for the presence of the file. If you see it
you assume the worst, and demand a restore from backup or whatever you
need to do to get going again safely.
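
The marker-file check described above could look roughly like this in Java. This is only a sketch, assuming Java 7 or later (java.nio.file); the class name CrashMarker and its method names are made up for illustration and do not come from the thread.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Illustrative marker-file helper; names are hypothetical. */
final class CrashMarker {
    private final Path marker;

    CrashMarker(Path marker) {
        this.marker = marker;
    }

    /** True if a marker from an earlier run is still present, i.e. that run never reached end(). */
    boolean previousRunCrashed() {
        return Files.exists(marker);
    }

    /** Create the marker before touching the data files. */
    void begin() throws IOException {
        // createFile throws FileAlreadyExistsException if the marker is
        // already there, so a stale marker cannot be silently overlooked.
        Files.createFile(marker);
    }

    /** Remove the marker once all writes have completed. */
    void end() throws IOException {
        Files.deleteIfExists(marker);
    }
}
```

At startup you would call previousRunCrashed() first and only then begin(). Note that this sketch does not force the marker to the physical disk, so it detects a program crash more reliably than a power failure.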
 
Liz

I had an interesting situation a few years ago. I was using Stacker
on my hard disk, since it was only 20 megabytes, and I happened to
be doing a squeeze-type operation when my battery died. At the next
power-up, Stacker displayed a message saying it had detected a crash
during a squeeze, and proceeded to clean up automatically.
I was impressed.
 
Calum

One approach is to write to a temporary file, then when writing has
completed successfully, and the file has been closed, rename the
temporary file to the target filename. That way you won't run out of
disk space either. If you need to overwrite an old file, delete it just
before renaming. If your program crashes during temporary file
creation, you'll be left with a damaged temp file that is never used
again - no big deal.

Calum
 
Norm Dresner

Always write to a new file and then delete the old one and rename the new.
Also, consider writing a journal file before writing to the new file so you
can at least "recover" what's missing.

Norm,
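
A journal along the lines Norm suggests might be sketched as follows. This is a minimal illustration, assuming Java 7 or later and one-line text records; SimpleJournal and its methods are hypothetical names, and a real journal would also need to cope with a torn final record.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

/** Illustrative append-only journal; names are hypothetical. */
final class SimpleJournal {
    /** Append one single-line record and force it to disk before the real write begins. */
    static void log(Path journal, String record) throws IOException {
        try (FileChannel ch = FileChannel.open(journal,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            ByteBuffer buf = ByteBuffer.wrap((record + "\n").getBytes(StandardCharsets.UTF_8));
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            ch.force(true); // ask the OS to flush the record to the device
        }
    }

    /** Records left over from a run that never reached clear(). */
    static List<String> pending(Path journal) throws IOException {
        if (!Files.exists(journal)) {
            return new ArrayList<>();
        }
        return Files.readAllLines(journal, StandardCharsets.UTF_8);
    }

    /** Call after the data file has been written successfully. */
    static void clear(Path journal) throws IOException {
        Files.deleteIfExists(journal);
    }
}
```

On startup, a non-empty result from pending() means the last run logged changes it never finished applying, and those records tell you what needs to be redone.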
 
Kasper Dupont

Norm said:
Always write to a new file and then delete the old one and rename the new.

No need to delete the old one first. Renaming will
automatically delete the old file.
 
Nate Smith

Kasper said:
No need to delete the old one first. Renaming will
automatically delete the old file.


Often that will lead to a "can't rename, file already exists"
sort of error....


- nate
 
Kasper Dupont

Nate said:
Often that will lead to a "can't rename, file already exists"
sort of error....

I'm pretty sure the POSIX standard requires rename to
atomically remove and replace the target if it already
exists. But I don't have access to the standard, so
somebody else will have to check.

And using rename to delete the file is the correct way
to do it because of the atomic behaviour. Deleting the old
file before renaming would introduce a race condition.
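
For Java readers, the write-to-temp-then-rename approach can be sketched with java.nio.file (Java 7 or later, which postdates parts of this discussion). The class AtomicReplace is a hypothetical name, and the comments note where behaviour is platform dependent.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

/** Illustrative helper: replace a file via a sibling temp file and an atomic rename. */
final class AtomicReplace {
    static void write(Path target, String content) throws IOException {
        // Create the temp file next to the target so both live on one filesystem.
        Path dir = target.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "replace", ".tmp");
        try {
            try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
                ByteBuffer buf = ByteBuffer.wrap(content.getBytes(StandardCharsets.UTF_8));
                while (buf.hasRemaining()) {
                    ch.write(buf);
                }
                ch.force(true); // ask the OS to flush data and metadata to the device
            }
            // Request an atomic rename. This throws AtomicMoveNotSupportedException
            // if the platform cannot do it; whether an existing target is replaced
            // is implementation specific (a POSIX rename() replaces it).
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            Files.deleteIfExists(tmp); // cleans up after a failure; no-op after success
        }
    }
}
```

If the program dies before the move, the old file is untouched and only a stray .tmp file remains. There is no separate delete step, so there is no window in which the target name is missing.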
 
Roedy Green

Kasper said:
No need to delete the old one first. Renaming will
automatically delete the old file.

Here is the sort of code I use to rewrite the contents of a file.


// create a tempfile in the same directory as
// the input file we have just processed.
File tempfile = HunkIO.createTempFile( "temp", ".tmp", fileBeingProcessed );
FileWriter emit = new FileWriter( tempfile );
emit.write( result );
emit.close();
// successfully created output in same directory as input,
// Now make it replace the input file.
fileBeingProcessed.delete();
tempfile.renameTo( fileBeingProcessed );

This effectively makes the tempfile disappear, but without the delete
it would not make the old version disappear. Or would it?

It would be nice to have the delete/rename atomic.
 
Kasper Dupont

Roedy said:
Here is the sort of code I use to rewrite the contents of a file.

// create a tempfile in the same directory as
// the input file we have just processed.
File tempfile = HunkIO.createTempFile( "temp", ".tmp", fileBeingProcessed );
FileWriter emit = new FileWriter( tempfile );
emit.write( result );
emit.close();
// successfully created output in same directory as input,
// Now make it replace the input file.
fileBeingProcessed.delete();
tempfile.renameTo( fileBeingProcessed );

Well, I don't write Java code; I usually use C, so I don't
know exactly how those methods are implemented. But I
know it is impossible to delete a file using any kind of
handle; you need to use the name. So what exactly is the
meaning of `fileBeingProcessed.delete();'? Does it delete
whatever file has the name originally used to open
fileBeingProcessed?

In the next line it looks like fileBeingProcessed is a
string, but then you wouldn't be able to delete the file
the way it is done in the code.

Roedy said:
This effectively makes the tempfile disappear, but without the delete
it would not make the old version disappear. Or would it?

If the renameTo method calls the rename system call, it
will make the old file disappear.

Roedy said:
It would be nice to have the delete/rename atomic.

You have it. At least on any POSIX-compliant system you do.
 
Nick Landsberg

Just another followup, possibly about a condition that
you have not considered.

Do you need to guard against a hard-disk crash while
writing? If your program does not, by some definitions
this is "not reliable." (Is restoring from "last
week's backup" OK with the customer?)

You can only guard against a single hardware failure
at a time. As I mentioned elsethread, DBMSs use
a log file to log the changes. This log file must
be on a separate hardware device to guard against
a single hardware failure. Thus, either the logfile
or the data file survives. If the logfile is on the
device that fails, then, no problem. If the data-file
is on the device that fails, it may be reconstructed
from the last backup of the data files and applying
all the log-files since the backup.

I am not sure if the "rename" strategy mentioned by
other posters will be atomic over multiple physical
devices nor do I know about what size files you
are talking about. If it's several GB, then the
copy from one disk to another will take considerable
time. Then again, you may not need to be at this
level of paranoia. :)
 
Kasper Dupont

Nick said:
You can only guard against a single hardware failure
at a time.

Actually you can guard against multiple hardware
failures, but it will get expensive.

Nick said:
I am not sure if the "rename" strategy mentioned by
other posters will be atomic over multiple physical
devices

That depends. If you use a filesystem on a RAID, it
should be atomic. But RAID is not 100% safe: with
an unfortunate sequence of events, even a RAID will
lose data. If we are talking about independent filesystems,
the rename will just report an error.
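
In Java 7 and later, the "rename reports an error" case shows up as an AtomicMoveNotSupportedException from Files.move when ATOMIC_MOVE is requested. The sketch below shows one possible way to handle it; CrossDeviceInstall is a hypothetical name, and the fallback (copy next to the target, then rename there) is an assumption about what you would want, not something proposed in the thread.

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Illustrative helper: install a finished file over a target that may be on another device. */
final class CrossDeviceInstall {
    static void install(Path source, Path target) throws IOException {
        try {
            // Works when source and target are on the same filesystem.
            Files.move(source, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (AtomicMoveNotSupportedException e) {
            // Different filesystems: stage a copy beside the target first,
            // so that the final step is again a rename within one filesystem.
            Path dir = target.toAbsolutePath().getParent();
            Path staged = Files.createTempFile(dir, "staged", ".tmp");
            try {
                Files.copy(source, staged, StandardCopyOption.REPLACE_EXISTING);
                Files.move(staged, target, StandardCopyOption.ATOMIC_MOVE);
            } finally {
                Files.deleteIfExists(staged); // no-op after a successful move
            }
            Files.delete(source);
        }
    }
}
```

As Nick points out, the copy can take considerable time for large files; only the final rename is atomic, not the copy itself.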
 
goose

Joseph wrote:

My next question is how is this handled in commercial programming? I
plan on writing a flag (say, a simple char) to another file (this
would signal that a file write is about to begin), and then
removing this char after the file writing operation is completed.
Then on startup I just check the flag. If the flag hasn't been removed, a
crash occurred, so I have to open the file and get rid of any garbage.

Why not just write your 'dirty flag' in the same file?

<snipped>

goose
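
A dirty flag kept inside the data file itself might look like the sketch below. It assumes a made-up layout in which byte 0 is the flag and the payload follows; FlaggedFile and its methods are hypothetical names.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;

/** Illustrative in-file dirty flag: byte 0 is the flag, the payload starts at byte 1. */
final class FlaggedFile {
    private static final int CLEAN = 0;
    private static final int DIRTY = 1;

    static void update(Path file, byte[] payload) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            raf.seek(0);
            raf.write(DIRTY);
            raf.getFD().sync(); // flag must reach the disk before the payload changes

            raf.setLength(1 + payload.length);
            raf.seek(1);
            raf.write(payload);
            raf.getFD().sync(); // payload must reach the disk before the flag is cleared

            raf.seek(0);
            raf.write(CLEAN);
            raf.getFD().sync();
        }
    }

    /** True if the file is empty or its flag byte was never reset to CLEAN. */
    static boolean isDirty(Path file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            return raf.length() == 0 || raf.read() != CLEAN;
        }
    }
}
```

Keep in mind that this only detects an interrupted write. Because the payload is overwritten in place, the previous contents are already gone, so recovery still needs a backup or a journal.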
 
Nick Landsberg

Kasper said:
Actually you can guard against multiple hardware
failures, but it will get expensive.

True. I inadvertently left out the word
"economically".
 
Gerry Quinn

Kasper said:
I'm pretty sure the POSIX standard requires rename to
atomically remove and replace the target if it already
exists. But I don't have access to the standard, so
somebody else will have to check.

And using rename to delete the file is the correct way
to do it because of the atomic behaviour. Deleting the old
file before renaming would introduce a race condition.

That sounds like a dangerous approach to me.

Why not rename the old file *first*, before writing the new one. Then
if the program starts and finds the most recently written file is
corrupt due to a crash, the last good file remains as a backup.
Renaming an existing file should be quick, and you can wait until it's
done before starting to write.

- Gerry Quinn
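
Gerry's rename-the-old-file-first idea could be sketched like this, assuming Java 7 or later. BackupFirst and the ".bak" suffix are illustrative choices, and deciding whether the newest file is corrupt is left to the caller.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Illustrative helper: keep the previous version as a backup before writing the new one. */
final class BackupFirst {
    static Path backupOf(Path target) {
        return target.resolveSibling(target.getFileName().toString() + ".bak");
    }

    static void save(Path target, String content) throws IOException {
        if (Files.exists(target)) {
            // Move the last good version out of the way first.
            Files.move(target, backupOf(target), StandardCopyOption.REPLACE_EXISTING);
        }
        // From here until the write completes, the target name is missing or incomplete.
        Files.write(target, content.getBytes(StandardCharsets.UTF_8));
    }

    /** Put the last good version back, e.g. after startup finds the target corrupt. */
    static void restore(Path target) throws IOException {
        Files.move(backupOf(target), target, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

The comment inside save() marks the trade-off of this variant: the last good data always survives under the backup name, but for a while nothing valid exists under the target name.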
 
Chris Smith

For the most part, I think this thread demonstrates confusion caused by
cross-posting. We've got answers from people in
comp.lang.java.programmer answering as if this were entirely a Java
question and people from comp.os.linux.development.apps answering as if
it's a Linux question... and we don't know who's right!

Nevertheless, some clarification about Java:

Kasper said:
Well, I don't write Java code; I usually use C, so I don't
know exactly how those methods are implemented. But I
know it is impossible to delete a file using any kind of
handle; you need to use the name. So what exactly is the
meaning of `fileBeingProcessed.delete();'? Does it delete
whatever file has the name originally used to open
fileBeingProcessed?

In the next line it looks like fileBeingProcessed is a
string, but then you wouldn't be able to delete the file
the way it is done in the code.

Java's standard API class java.io.File is confusingly named. It
represents an abstract path name, not a file. Having a java.io.File
object doesn't even imply the existence of a file in the filesystem,
though File does expose an API method called exists() that tells you
whether this file exists in the filesystem or not. Certain operations
that deal with directory management (rename, delete, etc.) are
implemented for objects of the File class. So Roedy's calls make
perfect sense because they don't operate on a file descriptor, but
rather on a file name.

Kasper said:
If the renameTo method calls the rename system call, it
will make the old file disappear.

Not surprisingly, it's not specified whether File.renameTo results in a
call to the rename system call. More surprisingly, it's not specified
whether File.renameTo will succeed if a file already exists by the
target name. That said, renameTo returns a success flag (which is ugly
in Java, but nevertheless happens). So it's entirely possible to write:

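if (!newFile.renameTo(fileBeingProcessed))
{
    // The rename failed, possibly because the target already exists:
    // delete the target and try once more.
    fileBeingProcessed.delete();
    newFile.renameTo(fileBeingProcessed);
}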

Additional error checking would be nice in case the rename fails, for
example, on a Windows machine because of open file descriptors to the
file. Windows file handling doesn't separate the existence of a file
from its directory entry in the way POSIX does.
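
With the additional error checking mentioned above, the same java.io.File logic might be written as follows. This is a sketch only; LegacyReplace is a hypothetical name, and the fallback path is not atomic.

```java
import java.io.File;
import java.io.IOException;

/** Illustrative helper: rename with a delete-and-retry fallback, reporting every failure. */
final class LegacyReplace {
    static void replace(File newFile, File target) throws IOException {
        if (newFile.renameTo(target)) {
            return; // the platform replaced the target in a single step
        }
        // Fallback: NOT atomic. Between delete and rename, no file exists under
        // the target name, and another process could create one.
        if (target.exists() && !target.delete()) {
            throw new IOException("could not delete " + target);
        }
        if (!newFile.renameTo(target)) {
            throw new IOException("could not rename " + newFile + " to " + target);
        }
    }
}
```

Throwing an IOException turns the easily ignored boolean results of delete() and renameTo() into failures the caller has to deal with.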

--
www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation
 
Kasper Dupont

I just checked: SUSv3 does require rename to make
sure that, at any point in time, the target name
refers to either the old or the new file. And if
rename fails, the target must be unaffected.

Gerry said:
That sounds like a dangerous approach to me.

Why not rename the old file *first*, before writing the new one. Then
if the program starts and finds the most recently written file is
corrupt due to a crash, the last good file remains as a backup.
Renaming an existing file should be quick, and you can wait until it's
done before starting to write.

But then you'd have a window where no file exists with
the given name. The approach I suggested is safe. When
creating the new file, first create it with a different
name. And when you have finished writing, you rename it
such that it atomically replaces the old file.

There is nothing dangerous to it.
 
Kasper Dupont

Chris said:
Not surprisingly, it's not specified whether File.renameTo results in a
call to the rename system call. More surprisingly, it's not specified
whether File.renameTo will succeed if a file already exists by the
target name.

Well I don't know anything about Java. But I know that
the right way to do this requires use of the rename
system call to atomically remove the old file and
replace it with the new. If that is not what happens it
means either the Java program or the Java VM is broken.
 
CBFalconer

Gerry said:
That sounds like a dangerous approach to me.

Why not rename the old file *first*, before writing the new one.
Then if the program starts and finds the most recently written
file is corrupt due to a crash, the last good file remains as a
backup. Renaming an existing file should be quick, and you can
wait until it's done before starting to write.

That is not his point. If a rename can fail because the target
file pre-exists, the delete/rename sequence has a hole between
delete and rename in which some other process can create that file
name, and cause a failure. This is a race condition. It is
especially likely to occur with database systems which inherently
tend to service multiple processes from the same database, and
have to 'take steps' to ensure the self-consistency of that
database.

One cure is to provide atomic operations, often by the use of
critical sections or other synchronization primitives. Another is
the concept of 'transactions'.
 
