Knowing when a file is flushed to disk

John Pote

Hello,

I'm using a Python CGI script on a web server to log data from a remote site
every few minutes. I do not want to lose any data for whatever rare reason -
a power outage or OS crash at just the wrong moment, etc. So I would like to
know when the data is actually written to disk and the file closed. At that
point I can signal the remote site, which has very limited storage, to delete
its copy of the data.

Is there some way from my Python script to know when the data is actually on
the disk. BTW server OS is Linux. Presumably calling flush() and close() on
the output file will initiate the disk write, but do they wait for the
actual disk write or immediately return leaving the OS to do the write when
it sees fit?

Any thoughts appreciated,

John
 
daftspaniel

John said:
Is there some way from my Python script to know when the data is actually on
the disk. BTW server OS is Linux. Presumably calling flush() and close() on
the output file will initiate the disk write, but do they wait for the
actual disk write or immediately return leaving the OS to do the write when
it sees fit?

All you can do in Python (or similar) is call flush & close and hope
for the best :)

There are many factors outwith the control of the language e.g.
* Library behaviour
* OS behaviour
* Hardware cache on the disk itself

That said, I've only found it an issue when a computer is under heavy
load.

Hope this helps,
Davy Mitchell

http://www.latedecember.com/sites/personal/davy/
 
Neil Hodgson

John Pote:
Is there some way from my Python script to know when the data is actually on
the disk. BTW server OS is Linux. Presumably calling flush() and close() on
the output file will initiate the disk write, but do they wait for the
actual disk write or immediately return leaving the OS to do the write when
it sees fit?

No, commonly they will schedule these operations and return quickly.
You can try os.fsync but there are no real guarantees about what that
does either. There's an amusing message from Tim Peters about this:
http://mail.zope.org/pipermail/zodb-dev/2004-July/007689.html
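
For what it's worth, here is a minimal sketch of the flush-then-fsync sequence, assuming a plain append-only log file. The function name append_durably is made up for illustration. flush() only moves data from Python's buffer into the OS cache; os.fsync() then asks the kernel to push the file to the device. As noted above, a drive with its own write cache may still report completion early, so treat this as "best effort", not a guarantee.

```python
import os

def append_durably(path, record):
    """Append one record, then ask the OS to write it to the storage device.

    Hypothetical helper for illustration only.
    """
    with open(path, "a", encoding="utf-8") as f:
        f.write(record + "\n")
        f.flush()             # Python's userspace buffer -> OS page cache
        os.fsync(f.fileno())  # ask the kernel to write the file to the device
    # Only after fsync() has returned without raising would the script
    # tell the remote site that it may delete its copy.
    return True
```

If the log file has just been created, Linux also wants an fsync() on the containing directory before the new directory entry itself can be considered durable.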

Neil
 
John Pote

Thanks for the replies. I guessed the situation would be flush() and trust.
The probability of a crash between flush() returning and the data actually
being written, resulting in a trashed disk, must be very small. But if you
can be certain without too much effort it's got to be a good idea, so I
thought I'd ask anyway.

How does the banking industry handle this sort of thing? Could be big bucks
if something goes wrong for them!

Thanks again,

John
 
Slawomir Nowaczyk

On Wed, 09 Aug 2006 16:13:19 +0000 (GMT)

#> Is there some way from my Python script to know when the data is actually on
#> the disk. BTW server OS is Linux. Presumably calling flush() and close() on
#> the output file will initiate the disk write, but do they wait for the
#> actual disk write or immediately return leaving the OS to do the write when
#> it sees fit?

You may want to look into SQLite -- it is a single-file SQL database
which is known to be extremely robust in the face of the problems you
describe. One of its design goals was to provide a replacement for
file storage. There is a Python binding, http://pysqlite.org, which is,
IIRC, supposed to be in the stdlib for Python 2.5.

That said, if your disk and/or OS is lying about whether it has
actually written the data or not, there is not much you can do.
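
A rough sketch of what that could look like with the sqlite3 module that ships in the standard library of current Python versions. The table name, column names and the log_reading function are invented for the example. The idea is that the CGI script acknowledges the remote site only after the commit has returned.

```python
import sqlite3

def log_reading(db_path, payload):
    """Store one reading and return its row id once the commit has returned.

    Hypothetical helper; schema and names are assumptions for illustration.
    """
    conn = sqlite3.connect(db_path)
    try:
        # FULL asks SQLite to sync to disk at the critical points of a commit.
        conn.execute("PRAGMA synchronous = FULL")
        conn.execute(
            "CREATE TABLE IF NOT EXISTS readings ("
            " id INTEGER PRIMARY KEY,"
            " received_at TEXT DEFAULT CURRENT_TIMESTAMP,"
            " payload TEXT NOT NULL)"
        )
        with conn:  # commits on success, rolls back if an exception is raised
            cur = conn.execute(
                "INSERT INTO readings (payload) VALUES (?)", (payload,)
            )
        return cur.lastrowid
    finally:
        conn.close()
```

The same caveat applies: the commit is only as durable as the OS and the drive are honest.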

--
Best wishes,
Slawomir Nowaczyk
( (e-mail address removed) )

If vegetarians love animals so much, why do they eat all their food???
 
Dennis Lee Bieber

John Pote:
How does the banking industry handle this sort of thing? Could be big bucks
if something goes wrong for them!

Redundancy via transactional databases and journalling systems; such
that data is first written to one area, and then later that area is used
to update the main area.

A write failure will only corrupt one of the two areas at a time, so
restarts can examine the system and rebuild a known configuration --
possibly reporting those transactions that were lost so they can be
re-run. If the write from the journal to main fails, you restore a
backup, and rerun all the journal entries. If the write to the journal
failed, the main has not been corrupted and is valid up to the last
successful journal->main transfer -- back up the main, flush the journal,
and rerun the transactions that were made after that last successful
transfer.
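
A toy sketch of the two-area idea, not how any real banking system does it. It assumes each transaction is a small JSON object of the form {"key": ..., "value": ...} and that replaying an entry twice is harmless; the file layout and function names are made up for illustration.

```python
import json
import os

def _write_synced(path, text, mode):
    """Write text and fsync before returning (hypothetical helper)."""
    with open(path, mode, encoding="utf-8") as f:
        f.write(text)
        f.flush()
        os.fsync(f.fileno())

def journal_append(journal_path, txn):
    """Step 1: record the transaction in the journal before touching main."""
    _write_synced(journal_path, json.dumps(txn) + "\n", "a")

def apply_journal(journal_path, main_path):
    """Step 2: fold journal entries into main, then empty the journal.

    Safe to rerun after a crash because setting key=value is idempotent.
    """
    state = {}
    if os.path.exists(main_path):
        with open(main_path, encoding="utf-8") as f:
            state = json.load(f)
    if os.path.exists(journal_path):
        with open(journal_path, encoding="utf-8") as f:
            for line in f:
                try:
                    txn = json.loads(line)
                except ValueError:
                    break  # torn last entry from an interrupted write
                state[txn["key"]] = txn["value"]
    tmp_path = main_path + ".tmp"
    _write_synced(tmp_path, json.dumps(state), "w")
    os.replace(tmp_path, main_path)      # swap in the new main in one step
    _write_synced(journal_path, "", "w")  # journal is now safe to clear
    return state
```

On restart the program would simply call apply_journal() again: a crash before the replace leaves the old main plus the full journal, and a crash after it leaves the new main plus a journal whose entries can be replayed without harm.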
--
Wulfraed Dennis Lee Bieber KD6MOG
(e-mail address removed) (e-mail address removed)
HTTP://wlfraed.home.netcom.com/
(Bestiaria Support Staff: (e-mail address removed))
HTTP://www.bestiaria.com/
 
