low-end persistence strategies?

John Lenton

Also, if you use something where the process doesn't terminate between
calls (such as mod_python, I guess), you have to be sure to write the
try/finallys around your locking code, because the OS only cleans up
the lock when the process exits.

This is what I feared. What happens in the case of a power failure?
Am I left with locked files floating around?

no.
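
For reference, a minimal sketch of the try/finally pattern mentioned above, for a long-lived process such as a mod_python handler. The helper name `locked` and the use of `fcntl.flock` are assumptions for illustration only; nothing here is taken from code posted in the thread.

```python
import fcntl
from contextlib import contextmanager

@contextmanager
def locked(path):
    """Hold an exclusive advisory flock on `path` for the duration of the block.

    `locked` is a hypothetical helper name, not part of any library.
    """
    f = open(path, "a+")  # create the file if needed, never truncate it
    try:
        fcntl.flock(f, fcntl.LOCK_EX)      # blocks until the lock is granted
        try:
            yield f
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)  # release even if the body raised
    finally:
        f.close()
```

Usage would look like `with locked("/tmp/app.lock"): ...`. Because the release sits in a `finally`, an exception in the body does not leave the lock held for the rest of the process's life.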

 
Michele Simionato

Ok, I have yet another question: what is the difference
between fcntl.lockf and fcntl.flock? The man page of
my Linux system says that flock is implemented independently
of fcntl, however it does not say if I should use it in preference
over fcntl or not.

Michele Simionato
 
Pierre Quentel

If a dozen people click the url in the next day, several of
them will probably do so in the first minute or so after the email goes out.
So two simultaneous clicks isn't implausible.
More generally, I don't like writing code with bugs even if the bugs
have fairly low chance of causing trouble. So I'm looking for the
easiest way to do this kind of thing without bugs.

Even if the 12 requests occur in the same 5 minutes, the time needed for
a read or write operation on a small base of any kind (flat file, dbm,
shelve, etc) is so small that the probability of concurrence is very
close to zero.

If you still want to avoid it, you'll have to pay some price. The most
simple and portable is a client/server mode, as suggested for KirbyBase
for instance. Yes, you have to run the server 24 hours a day, but you're
already running the web server 24/7 anyway.
 
Nick Craig-Wood

Michele Simionato said:
Ok, I have yet another question: what is the difference
between fcntl.lockf and fcntl.flock? The man page of
my Linux system says that flock is implemented independently
of fcntl, however it does not say if I should use it in preference
over fcntl or not.

flock() and lockf() are two different library calls.

With lockf() you can lock parts of a file. I've always used flock().

From man lockf() "On Linux, this call [lockf] is just an interface for
fcntl(2). (In general, the relation between lockf and fcntl is
unspecified.)"

see man lockf and man flock
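
As an illustration of the "lock parts of a file" point, here is a small sketch using Python's `fcntl.lockf`, which takes a length, a start offset and a whence argument. The function names `lock_range` and `unlock_range` are made up for this example. Note that an exclusive lockf lock needs a descriptor opened for writing, and that these locks belong to the process, so a conflict only shows up between different processes.

```python
import fcntl
import os

def lock_range(fd, start, length):
    """Take an exclusive advisory lock on bytes [start, start + length).

    Non-blocking: raises OSError if another process holds a conflicting lock.
    `lock_range` is a hypothetical helper name.
    """
    fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, length, start, os.SEEK_SET)

def unlock_range(fd, start, length):
    """Release a lock previously taken on the same byte range."""
    fcntl.lockf(fd, fcntl.LOCK_UN, length, start, os.SEEK_SET)
```

With flock() there is no equivalent of the range arguments: the lock always covers the whole file.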
 
Paul Rubin

Pierre Quentel said:
Even if the 12 requests occur in the same 5 minutes, the time needed
for a read or write operation on a small base of any kind (flat file,
dbm, shelve, etc) is so small that the probability of concurrence is
very close to zero

I prefer "equal to zero" over "close to zero". Also, there are times
when the server is heavily loaded, and any file operation can take a
long time.

Pierre Quentel said:
If you still want to avoid it, you'll have to pay some price. The most
simple and portable is a client/server mode, as suggested for
KirbyBase for instance. Yes, you have to run the server 24 hours a
day, but you're already running the web server 24/7 anyway.

If I have to run a db server 24/7, that's not simple or portable.
There's lots of hosting environments where that's plain not permitted.
Using file locks is much simpler and more portable. The penalty is
that I can only handle one request at a time, but as mentioned, this
is for a low-usage app, so that serialization is ok.
 
John Lenton

Michele Simionato said:
Ok, I have yet another question: what is the difference
between fcntl.lockf and fcntl.flock? The man page of
my Linux system says that flock is implemented independently
of fcntl, however it does not say if I should use it in preference
over fcntl or not.

it depends on what you want to do: as the manpages say, flock is
present on most Unices, but lockf is POSIX; flock is BSD, lockf is
SYSV (although it's in XPG4.2, so you have it on newer Unices of any
flavor); on Linux, lockf works over NFS (if the server supports it),
and gives you access to mandatory locking if you want it. You can't
mix lockf and flock (by this I mean: the two kinds of lock don't
exclude each other, so you can get a LOCK_EX via flock and via lockf
on the same file at the same time).

So: use whichever you feel more comfortable with, although if you are
pretty confident your program will run mostly on Linux there is a bias
towards lockf given its extra capabilities there.

 
Michele Simionato

Uhm ... I am reading /usr/src/linux-2.4.27/Documentation/mandatory.txt
The last section says:

"""
6. Warning!
-----------

Not even root can override a mandatory lock, so runaway processes
can wreak havoc if they lock crucial files. The way around it is to
change the file permissions (remove the setgid bit) before trying to
read or write to it. Of course, that might be a bit tricky if the
system is hung :-(
"""

so lockf locks do not look completely harmless ...

Michele Simionato
 
John Lenton

Michele Simionato said:
Uhm ... I am reading /usr/src/linux-2.4.27/Documentation/mandatory.txt
The last section says:

"""
6. Warning!
-----------

Not even root can override a mandatory lock, so runaway processes
can wreak havoc if they lock crucial files. The way around it is to
change the file permissions (remove the setgid bit) before trying to
read or write to it. Of course, that might be a bit tricky if the
system is hung :-(
"""

so lockf locks do not look completely harmless ...

if you read the whole file, you will have read that turning on
mandatory locks is not trivial. I never said it was harmless, and in
fact (as that section explains) it's a bad idea for most cases; there
are some (very few) situations where you need it, however, and so you
can get at that functionality. Having to mount your filesystem with
special options and twiddling the permission bits is a pretty strong
hint that the implementors didn't think it was a good idea for most
cases, too.

Hmm, just read that file, and it doesn't mention the "have to mount
with special options" bit. But if you look in mount(8), you see an
entry under the options list,

mand Allow mandatory locks on this filesystem. See fcntl(2)

and if you look in fcntl(2), you see that

Mandatory locking
(Non-POSIX.) The above record locks may be either advisory
or mandatory, and are advisory by default. To make
use of mandatory locks, mandatory locking must be
enabled (using the "-o mand" option to mount(8)) for the
file system containing the file to be locked and enabled
on the file itself (by disabling group execute permission
on the file and enabling the set-GID permission
bit).

Advisory locks are not enforced and are useful only
between cooperating processes. Mandatory locks are
enforced for all processes.

if I have come across as recommending mandatory locks in this thread,
I apologize, as it was never my intention. It is a cool feature for
the .001% of the time when you need it (and the case in discussion in
this thread is not one of those), but other than that it's a very,
very bad idea. In the same league of badness as SysV IPC; I'd still
mention SysV IPC to someone who asked about IPC on Linux, however,
because there are places where it is useful even though most times
it's a stupid way to do things (yes, Oracle, *especially* you).

 
pyguy2

You do not need to use a 24/7 process for low end persistence, if you
rely on the fact that only one thing can ever succeed in making a
directory. I haven't seen a filesystem where this isn't the case. This
type of locking works cross-thread/process whatever.

An example of that type of locking can be found at:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/252495

The only problem with this locking is if a process dies without cleaning
up its lock: how do you know when to remove it?
If you have the assumption that the write to the database is quick
(ok for low end), just have the locks time out after a minute. And if
you need to keep a lock longer, unpeel appropriately and reassert
it.

With 2 lock directories, 2 files and 1 temporary file, you end up with
a hard to break system. The cost is disk space, which for low end
should be fine.
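
To make the idea concrete, here is a rough sketch of directory-based locking with a timeout. It is not the code from the recipe linked above; the function names, the 60-second default and the retry parameters are all assumptions chosen for illustration.

```python
import os
import time

def acquire_dir_lock(lock_dir, stale_after=60.0, wait=0.1, attempts=50):
    """Try to take the lock by creating lock_dir; return True on success."""
    for _ in range(attempts):
        try:
            os.mkdir(lock_dir)      # only one caller can succeed
            return True
        except FileExistsError:
            try:
                age = time.time() - os.stat(lock_dir).st_mtime
            except FileNotFoundError:
                continue            # holder just released it; retry at once
            if age > stale_after:
                # Assumed policy: an old lock was abandoned by a dead process.
                try:
                    os.rmdir(lock_dir)
                except OSError:
                    pass            # someone else cleaned it up first
                continue
            time.sleep(wait)
    return False

def release_dir_lock(lock_dir):
    """Give the lock back by removing the directory."""
    os.rmdir(lock_dir)
```

One caveat with a sketch this simple: breaking a stale lock is itself racy, since two processes can both decide the lock is stale and one may remove a lock the other has just re-created. That is presumably part of why a more robust scheme needs the extra lock directories mentioned above.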

Basically, the interesting question is, how far can one,
cross-platform, actually go in building a persistence system without long
term process business.

john
 
Paul Rubin

Michele Simionato said:
Ok, I have yet another question: what is the difference between
fcntl.lockf and fcntl.flock? The man page of my Linux system says
that flock is implemented independently of fcntl, however it does
not say if I should use it in preference over fcntl or not.

It looks to me like flock is 4.2-BSD-style locking and fcntl.lockf is
Sys V style. I'm not sure exactly what the differences are, but
generally speaking, BSD did this type of thing better than Sys V. On
the other hand, it looks like lockf has more features, like the
ability to ask for a SIGIO notification if someone tries opening a
file that you have locked. In Python, flock is certainly easier to
use, since you pass an integer mode flag instead of building up the
weird fcntl structure.

There's one subtlety, which is that I'm not sure locking your file
before updating it is guaranteed to do the right thing. You may have
to use a second file as a lock. E.g., suppose the data file is small,
like a few dozen bytes. So maybe:
1) Process A opens the file. The contents get read into a cache buffer.
2) Process B opens the file, locks it, updates it, releases the lock.
3) Process A locks and updates the file. Is the cached stuff guaranteed
   to get invalidated even in some awful RFS environment? Or could
   A clobber B's changes?
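
One way to sidestep the ordering problem in steps 1-3 is sketched below: lock a second file first, and only open and read the data file once the lock is held. The function name `update_counter` and the integer-in-a-file format are invented for the example. This only guarantees that a process never uses contents it read before taking the lock; it does not settle whether a remote filesystem keeps its caches coherent.

```python
import fcntl

def update_counter(data_path, lock_path):
    """Increment an integer stored in data_path while holding a lock on lock_path.

    `update_counter` is a hypothetical example, not code from the thread.
    """
    with open(lock_path, "a") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)
        try:
            # Open and read only after the lock is held, so this process
            # never works from contents fetched before acquiring it.
            try:
                with open(data_path) as f:
                    value = int(f.read().strip() or 0)
            except FileNotFoundError:
                value = 0
            value += 1
            with open(data_path, "w") as f:
                f.write(str(value))
            return value
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)
```

Every writer has to follow the same convention, since these are advisory locks and nothing stops a process that ignores the lock file.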
 
