Funky file contents when os.rename or os.remove is interrupted


Russell Warren

I've got a case where I'm seeing text files that are either all null
characters, or are trailed with nulls due to interrupted file access
resulting from an electrical power interruption on the WinXP pc.

In tracking it down, it seems that what is being interrupted is either
os.remove(), or os.rename(). Has anyone seen this behaviour, or have
any clue what is going on?

On first pass I would think that both of those calls are single step
operations (removing/changing an entry in the FAT, or FAT-like thing,
on the HDD) and wouldn't result in an intermediate, null-populated,
step, but the evidence seems to indicate I'm wrong...

Any insight from someone with knowledge of the internal operations of
os.remove and/or os.rename would be greatly appreciated, although I
expect the crux may be at the os level and not in python.

Russ
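
For anyone who wants to check a batch of files for the symptom described above (all nulls, or real data trailed by nulls), here is a minimal sketch. The function name `null_damage` and its return labels are made up for illustration and are not from the thread. It only looks at file contents and says nothing about what caused the damage.

```python
import os


def null_damage(path, chunk_size=64 * 1024):
    """Classify a file as 'empty', 'all_null', 'trailing_null', or 'ok'.

    Hypothetical helper: the name and labels are illustrative only.
    """
    if os.path.getsize(path) == 0:
        return "empty"

    saw_data = False   # True once any non-null byte has been seen
    last_byte = b""    # final byte of the file, updated per chunk

    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # strip() with an argument removes leading/trailing nulls;
            # anything left over means the chunk holds real data.
            if chunk.strip(b"\x00"):
                saw_data = True
            last_byte = chunk[-1:]

    if not saw_data:
        return "all_null"
    if last_byte == b"\x00":
        return "trailing_null"
    return "ok"
```

Running this over the data directory after each power-cycle test would show whether the problem has recurred. Note that a text file that legitimately ends in a null byte would be reported as 'trailing_null' too.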
 

hg

Russell said:
I've got a case where I'm seeing text files that are either all null
characters, or are trailed with nulls due to interrupted file access
resulting from an electrical power interruption on the WinXP pc.

In tracking it down, it seems that what is being interrupted is either
os.remove(), or os.rename(). Has anyone seen this behaviour, or have
any clue what is going on?

On first pass I would think that both of those calls are single step
operations (removing/changing an entry in the FAT, or FAT-like thing,
on the HDD) and wouldn't result in an intermediate, null-populated,
step, but the evidence seems to indicate I'm wrong...

Any insight from someone with knowledge of the internal operations of
os.remove and/or os.rename would be greatly appreciated, although I
expect the crux may be at the os level and not in python.

Russ

Taking a quick look at the code, it looks like MoveFileW (Windows API)
is eventually being called by posixmodule.c.

My gut feeling is that you are correct and are facing a Windows issue
rather than a Python one (sigh) ... you might want to test your problem
on an NTFS file system and see if the problems are similar.

Regards,

hg
 

Sybren Stuvel

Russell Warren enlightened us with:
On first pass I would think that both of those calls are single step
operations (removing/changing an entry in the FAT, or FAT-like
thing, on the HDD) and wouldn't result in an intermediate,
null-populated, step, but the evidence seems to indicate I'm
wrong...

They require multiple blocks to be written to disc, so if you're not
using a journaling filesystem, bad things can happen.

Any insight from someone with knowledge of the internal operations
of os.remove and/or os.rename would be greatly appreciated, although
I expect the crux may be at the os level and not in python.

You're right about that.

Sybren
 

Martin v. Löwis

Russell said:
Any insight from someone with knowledge of the internal operations of
os.remove and/or os.rename would be greatly appreciated, although I
expect the crux may be at the os level and not in python.

Just to confirm what others have said: Python has nothing to do with
that. It calls the relevant Win32 API rather directly.

Then, Windows has nothing to do with it, either. It calls the routines
of the file system driver rather directly.

It's the FAT file system that may suffer from metadata corruption in
case of power loss. If you lose power on a disk that has a FAT file
system on it, you need to run chkdsk before using the file system
again, and you *still* may see corruption. As others have said:
use NTFS if you want a reasonable chance of getting in a clean state
in case of a power loss.

Regards,
Martin
 

Russell Warren

Thanks, guys... this has all been very useful information.

The machine this is happening on is already running NTFS.

The good news is that we just discovered/remembered that there is a
write-caching option (in Device Manager -> HDD -> Properties ->
Policies tab) available in XP. The note right beside the
write-cache-enable checkbox says:

"This setting enables write caching to improve disk performance, but a
power outage or equipment failure might result in data loss or
corruption."

Well waddya know... write-caching was enabled on the machine. It is
now disabled and we'll be power-cycle testing to see if it happens
again.

Regarding the comment on journaling file systems, I looked into it and
it looks like NTFS actually does do journaling to some extent, and some
effort was expended to make NTFS less susceptible to the exact problem
I'm experiencing. I'm currently hopeful that the corrupted files we've
seen are entirely due to the mistake of having write-caching enabled
(the default).

Martin said:
Then, Windows has nothing to do with it, either. It calls the routines
of the file system driver rather directly.

It looks like that is not entirely true... this write-caching appears
to sit above the file system itself. In any case, it is certainly not
a Python issue!

One last non-python question... a few things I read seemed to vaguely
indicate that the journaling feature of NTFS is an extension/option.
Wording could also indicate a simple feature, though. Are there
options you can set on your file system (aside from block size and
partition)?! I've certainly never heard of that, but want to be sure.
I definitely need this system to be as crash-proof as possible.

Thanks again,
Russ
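
On the Python side, a common way to reduce exposure to this kind of corruption is to write new contents to a temporary file, force them to disk, and only then rename over the original. The sketch below illustrates that pattern; it is not something proposed in the thread. The name `write_durably` is made up, and `os.replace` needs Python 3.3 or later (the plain `os.rename` discussed above fails on Windows when the destination already exists). `os.fsync` asks the OS to flush its buffers, but a drive with its own write cache enabled may still hold data in volatile memory, so this complements the write-caching setting and does not replace it.

```python
import os
import tempfile


def write_durably(path, data):
    """Write bytes to `path` via a temp file, fsync, then rename.

    Hypothetical helper: a sketch of the write-then-rename pattern,
    assuming Python 3.3+ for os.replace.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the final rename
    # stays on one volume.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()              # push Python's buffer to the OS
            os.fsync(f.fileno())   # ask the OS to push it to the device
        # Swap the new file into place, overwriting any existing target.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.remove(tmp_path)
        except OSError:
            pass
        raise
```

With this pattern an interruption should leave either the old file or the new one in place, plus possibly a stray `.tmp-` file to clean up at startup. Whether the rename itself survives a power cut intact still depends on the file system and hardware.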
 

Fredrik Lundh

Russell said:
One last non-python question... a few things I read seemed to vaguely
indicate that the journaling feature of NTFS is an extension/option.

http://www.microsoft.com/whdc/system/winpreinst/ntfs-preinstall.mspx

NTFS is a journaling file system. NTFS writes a log of changes being
made, which offers significant benefit in cases where a system loses
power, experiences an unexpected reset, or crashes.

http://en.wikipedia.org/wiki/NTFS

A file system journal is used in order to guarantee the integrity of
the file system itself (but not of each individual file). Systems
using NTFS are known to have improved reliability compared to FAT file
systems.

</F>
 
