write to the same file from multiple processes at the same time?

gabor

hi,

what i want to achieve:
i have a cgi script that writes an entry to a text file,
like a log entry (when it was invoked, when its work ended).
it's one line of text.

the problem is:
what happens if 2 users invoke the cgi at the same time?

and it will happen, because i am now trying to stress test it, so i will
start 5-10 requests in parallel and so on.

so, how does one synchronize several processes in python?

first idea was that the cgi will create a new temp file every time,
and at the end of the stress-test, i'll collect the content of all those
files. but that seems a stupid way to do it :(

another idea was to use a simple database (sqlite?) which probably has
this problem solved already...

any better ideas?

thanks,
gabor
 
Paul Rubin

gabor said:
so, how does one synchronize several processes in python?

first idea was that the cgi will create a new temp file every time,
and at the end of the stress-test, i'll collect the content of all
those files. but that seems a stupid way to do it :(

There was a thread about this recently ("low-end persistence
strategies") and for Unix the simplest answer seems to be the
fcntl.flock function. For Windows I don't know the answer.
Maybe os.open with O_EXCL works.
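
For concreteness, here is a minimal sketch of the Unix flock approach Paul
mentions (the log file name is invented for illustration):

import fcntl

def append_log_line(line):
    f = open("cgi.log", "a")           # invented name; append mode
    try:
        fcntl.flock(f, fcntl.LOCK_EX)  # block until we hold an exclusive lock
        f.write(line + "\n")
        f.flush()
    finally:
        f.close()                      # closing the file releases the lock

Each CGI process blocks in flock() until the previous writer has finished,
so whole lines never interleave.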
 
Roy Smith

gabor said:
so, how does one synchronize several processes in python?

This is a very hard problem to solve in the general case, and the answer
depends more on the operating system you're running on than on the
programming language you're using.

On the other hand, you said that each process will be writing a single line
of output at a time. If you call flush() after each message is written,
that should be enough to ensure that each line gets written in a single
write system call, which in turn should be good enough to ensure that
individual lines of output are not scrambled in the log file.

If you want to do better than that, you need to delve into OS-specific
things like the flock function in the fcntl module on unix.
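
A rough sketch of what Roy describes, with an invented file name. Note the
hedge: a line only goes out in one write() call if it fits within the stdio
buffer.

log = open("cgi.log", "a")      # append mode: the OS positions each
                                # write at the current end of file
def log_line(msg):
    log.write(msg + "\n")       # one short, complete line
    log.flush()                 # push it out in a single write() call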
 
Peter Hansen

Roy said:
On the other hand, you said that each process will be writing a single line
of output at a time. If you call flush() after each message is written,
that should be enough to ensure that each line gets written in a single
write system call, which in turn should be good enough to ensure that
individual lines of output are not scrambled in the log file.

Unfortunately this assumes that the open() call will always succeed,
when in fact it is likely to fail sometimes when another process has
already opened the file but not yet completed writing to it, AFAIK.
Roy also said:
If you want to do better than that, you need to delve into OS-specific
things like the flock function in the fcntl module on unix.

The OP was probably on the right track when he suggested that things
like SQLite (conveniently wrapped with PySQLite) had already solved this
problem.

-Peter
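
For illustration, a minimal sketch of the SQLite route using the sqlite3
module bundled with newer Pythons (at the time of this thread it meant
installing PySQLite); the database and table names are invented. SQLite
serializes writers with its own file locking, and the timeout makes a
second writer wait instead of failing immediately with "database is locked":

import sqlite3

def log_line(msg):
    con = sqlite3.connect("log.db", timeout=10)
    con.execute("CREATE TABLE IF NOT EXISTS log (entry TEXT)")
    con.execute("INSERT INTO log (entry) VALUES (?)", (msg,))
    con.commit()
    con.close()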
 
Paul Rubin

Peter Hansen said:
The OP was probably on the right track when he suggested that things
like SQLite (conveniently wrapped with PySQLite) had already solved
this problem.

But they haven't. They depend on messy things like server processes
constantly running, which goes against the idea of a cgi that only
runs when someone calls it.
 
Roy Smith

Peter Hansen said:
The OP was probably on the right track when he suggested that things
like SQLite (conveniently wrapped with PySQLite) had already solved this
problem.

Perhaps, but a relational database seems like a pretty heavy-weight
solution for a log file.
 
Gerhard Haering

Roy Smith said:
Perhaps, but a relational database seems like a pretty heavy-weight
solution for a log file.

On the other hand, it works ;-)

-- Gerhard
--
Gerhard Häring - (e-mail address removed) - Python, web & database development

 
jean-marc

Sorry, why is the temp file solution 'stupid'? (not
aesthetic-pythonistic???) It looks OK: simple and direct, and
certainly less 'heavy' than any db stuff (even embedded).

And collating into an 'official' log file can be done periodically by
another process, on a time-scale that is 'useful' if not
instantaneous...

Just trying to understand here...

JMD
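
For reference, a minimal sketch of the temp-file-per-request approach
jean-marc describes; the directory and helper names are invented:

import tempfile, os, glob

LOG_DIR = "/tmp/cgi-log"                    # invented location

def log_line(msg):
    # each request writes its own uniquely named file,
    # so there is no contention at all
    fd, path = tempfile.mkstemp(suffix=".log", dir=LOG_DIR)
    os.write(fd, (msg + "\n").encode())
    os.close(fd)

def collate(target="combined.log"):
    # run periodically (e.g. from cron) to merge and clean up
    out = open(target, "a")
    for path in sorted(glob.glob(os.path.join(LOG_DIR, "*.log"))):
        out.write(open(path).read())
        os.remove(path)
    out.close()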
 
Grant Edwards

Peter Hansen said:
Unfortunately this assumes that the open() call will always succeed,
when in fact it is likely to fail sometimes when another process has
already opened the file but not yet completed writing to it, AFAIK.

Not in my experience. At least under Unix, it's perfectly OK
to open a file while somebody else is writing to it. Perhaps
Windows can't deal with that situation?
 
Peter Hansen

Grant said:
Not in my experience. At least under Unix, it's perfectly OK
to open a file while somebody else is writing to it. Perhaps
Windows can't deal with that situation?

Hmm... just tried it: you're right! On the other hand, the results were
unacceptable: each process has a separate file pointer, so it appears
whichever one writes first will have its output overwritten by the
second process.

Change the details, but the heart of my objection is the same.

-Peter
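
What Peter observed can be reproduced in a few lines ("w" mode is used here
for a self-contained demo; the file name is invented). Both handles start
with their own file pointer at offset 0, so the later write clobbers the
earlier one, while append mode sends every write to the current end of file:

a = open("demo.txt", "w")
b = open("demo.txt", "w")           # second open truncates again
a.write("first\n");  a.flush()
b.write("SECOND\n"); b.flush()      # overwrites "first\n" at offset 0
print(open("demo.txt").read())      # -> SECOND

a = open("demo.txt", "a")
b = open("demo.txt", "a")           # append mode instead
a.write("first\n");  a.flush()
b.write("second\n"); b.flush()      # lands after the existing content
print(open("demo.txt").read())      # -> SECOND, first, second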
 
Christopher Weimann

Peter Hansen said:
Hmm... just tried it: you're right! On the other hand, the results were
unacceptable: each process has a separate file pointer, so it appears
whichever one writes first will have its output overwritten by the
second process.

Did you open the files for 'append' ?
 
Peter Hansen

Peter said:
Hmm... just tried it: you're right!

Umm... the part you were right about was NOT the possibility that
Windows can't deal with the situation, but the suggestion that it might
actually be able to (since apparently it can). Sorry to confuse.

-Peter
 
Peter Hansen

Christopher said:
Did you open the files for 'append' ?

Nope. I suppose that would be a rational thing to do for log files,
wouldn't it? I wonder what happens when one does that...

-Peter
 
Do Re Mi chel La Si Do

Hi!

On Windows, with PyWin32, here is a little sample of code:


import time
import win32file, win32con, pywintypes

def flock(file):
    # take an exclusive lock on the first 0xffff bytes of the file
    hfile = win32file._get_osfhandle(file.fileno())
    win32file.LockFileEx(hfile, win32con.LOCKFILE_EXCLUSIVE_LOCK,
                         0, 0xffff, pywintypes.OVERLAPPED())

def funlock(file):
    hfile = win32file._get_osfhandle(file.fileno())
    win32file.UnlockFileEx(hfile, 0, 0xffff, pywintypes.OVERLAPPED())


file = open("FLock.txt", "r+")
flock(file)
file.seek(123)
for i in range(500):
    file.write("AAAAAAAAAA")
    print i
    time.sleep(0.001)

#funlock(file)   # not strictly needed: closing the file releases the lock
file.close()




Michel Claveau
 
ucntcme

Well, I just tried it on Linux anyway. I opened the file in two Python
processes using append mode.

I then wrote a simple function to write, then flush, whatever it is passed:

foo = open("myfile.txt", "a")   # the file, opened in append mode

def write(msg):
    foo.write("%s\n" % msg)
    foo.flush()

I then opened another terminal and did 'tail -f myfile.txt'.

It worked just fine.

Maybe that will help. Seems simple enough to me for basic logging.

Cheers,
Bill
 
gabor

jean-marc said:
Sorry, why is the temp file solution 'stupid'? (not
aesthetic-pythonistic???) It looks OK: simple and direct, and
certainly less 'heavy' than any db stuff (even embedded).

And collating into an 'official' log file can be done periodically by
another process, on a time-scale that is 'useful' if not
instantaneous...

Just trying to understand here...

actually this is what i implemented after asking the question, and it
works fine :)

i just thought that maybe there is a solution where i don't have to deal
with 4000 files in the temp folder :)

gabor
 
Piet van Oostrum

Isn't a write to a file that's opened for append atomic in most operating
systems? At least in modern Unix systems. man open(2) should give more
information about this.

Like:
f = file("filename", "a")
f.write(line)
f.flush()

if the line fits into the stdio buffer. Otherwise os.write can be used.

As this depends on the OS support for append, it is not portable. But
neither is locking. And I am not sure if it works for NFS-mounted files.
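
A sketch of the os-level variant Piet mentions, with an invented file name.
With O_APPEND, the seek-to-end and the write happen as one atomic step in
the kernel, which is what makes the append safe (on local filesystems; NFS
makes no such guarantee):

import os

fd = os.open("cgi.log", os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
os.write(fd, b"one complete log line\n")   # a single write(2) call
os.close(fd)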
 
Steve Holden

Roy said:
Perhaps, but a relational database seems like a pretty heavy-weight
solution for a log file.

Excel seems like a pretty heavyweight solution for most of the
applications it's used for, too. Most people are interested in solving a
problem and moving on, and while this may lead to bloatware it can also
lead to the inclusion of functionality that can be hugely useful in
other areas of the application.

regards
Steve
 
