Semaphore Techniques

John D Giotta

I'm looking to run a process with a limit of 3 instances, but each
execution is over a crontab interval. I've been investigating the
threading module and using daemons to limit active thread objects, but
I'm not very successful at grasping the documentation.

Is it possible to do what I'm trying to do, and if so, does anyone
know of a useful example to get started?
 
David Bolen

John D Giotta said:
I'm looking to run a process with a limit of 3 instances, but each
execution is over a crontab interval. I've been investigating the
threading module and using daemons to limit active thread objects, but
I'm not very successful at grasping the documentation.

Is it possible to do what I'm trying to do, and if so, does anyone
know of a useful example to get started?

Does it have to be built into the tool, or are you open to handling the
restriction right in the crontab entry?

For example, a crontab entry like:

* * * * * test `pidof -x script.py | wc -w` -ge 3 || <path>/script.py

should attempt to run script.py every minute (adjust the period as
required) unless there are already three of them running. And if pidof
isn't precise enough you can put anything in there that would
accurately check your processes (grep a ps listing or whatever).

This works because if the test expression is true it returns 0 which
terminates the logical or (||) expression.

There may be some variations based on cron implementation (the above
was tested against Vixie cron), but some similar mechanism should be
available.

If you wanted to build it into the tool, it can be tricky in terms of
managing shared state (the count) amongst purely sibling/cooperative
processes. It's much easier to ensure no overlap (1 instance), but
once you want 'n' instances you need an accurate process-wide counter.
I'm not positive, but don't think Python's built-in semaphores or
shared memory objects are cross-process. (Maybe something in
multiprocessing in recent Python versions would work, though they may
need the sharing processes to all have been executed from a parent
script)
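To illustrate that caveat: if a single parent script launches all the
workers, a plain multiprocessing semaphore is shared just fine, because
the children inherit it from the parent. A minimal sketch (the
worker/run_demo names, the barrier used to make the contest simultaneous,
and the limit of 3 are all illustrative):

```python
import multiprocessing as mp
import time

LIMIT = 3

def worker(sem, barrier, results, i):
    """Try to grab one of LIMIT slots; report whether we got one."""
    barrier.wait()                    # line everyone up so the contest is real
    got = sem.acquire(block=False)    # non-blocking: bow out if all slots taken
    results.put(got)
    if got:
        time.sleep(1.0)               # hold the slot (simulated work)
        sem.release()

def run_demo(n=5):
    sem = mp.BoundedSemaphore(LIMIT)  # shared because every worker is our child
    barrier = mp.Barrier(n)
    results = mp.Queue()
    procs = [mp.Process(target=worker, args=(sem, barrier, results, i))
             for i in range(n)]
    for p in procs:
        p.start()
    winners = sum(results.get() for _ in range(n))
    for p in procs:
        p.join()
    return winners                    # how many workers won a slot
```

With five workers racing, exactly three should win a slot and two should
be turned away - but note this only works because one parent started
them all, which is exactly the restriction being discussed.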

I do believe there are some third party interfaces (posix_ipc,
shm/shm_wrapper) that would provide access to posix shared-process
objects. A semaphore may still not work as I'm not sure you can
obtain the current count. But you could probably do something with
a shared memory counter in conjunction with a mutex of some sort, as
long as you were careful to clean it up on exit.

Or, you could stick PIDs into the shared memory and count PIDs on
a new startup (double checking against running processes to help
protect against process failures without cleanup).

You could also use the filesystem - have a shared directory where each
process dumps its PID, after first counting how many other PIDs are in
the directory and exiting if too many.
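That last approach might look something like this sketch (the directory
path is an assumption; the stale-PID check guards against processes that
died without cleaning up after themselves):

```python
import atexit
import os
import sys

PID_DIR = "/tmp/script-instances"    # assumed shared directory
LIMIT = 3

os.makedirs(PID_DIR, exist_ok=True)

def live_pids():
    """PIDs registered in PID_DIR whose processes are still alive."""
    alive = []
    for name in os.listdir(PID_DIR):
        if not name.isdigit():
            continue
        pid = int(name)
        try:
            os.kill(pid, 0)          # signal 0: existence check, nothing sent
            alive.append(pid)
        except OSError:              # stale file left by a crashed process
            os.remove(os.path.join(PID_DIR, name))
    return alive

if len(live_pids()) >= LIMIT:
    sys.exit(0)                      # enough instances running; bow out

pidfile = os.path.join(PID_DIR, str(os.getpid()))
open(pidfile, "w").close()           # register ourselves
atexit.register(lambda: os.path.exists(pidfile) and os.remove(pidfile))
# ... real work goes here ...
```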

Of course all of these (even with a PID check) are risky in the
presence of unexpected failures. It would be worse with something
like C code, but it should be reasonably easy to ensure that your
script has cleanup code even on an unexpected termination, and it's
not that likely the Python interpreter itself would crash. Then
again, something external could kill the process. Ensuring accuracy
and cleanup of shared state can be non-trivial.

You don't mention if you can support a single master daemon, but if
you could, things get a little easier, since the daemon can maintain
and protect access to the state - each worker process could hold open
a socket connection to the master daemon, letting it detect when
workers terminate and keep an accurate count, and it could simply
reject connections from new processes if too many are already running.
Of course, if the master daemon goes away then nobody would run, which
may or may not be an acceptable failure mode.
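A minimal sketch of that master-daemon idea (the address and the
two-byte OK/NO protocol are assumptions): the master counts live
connections, prunes ones whose worker has exited, and refuses new
workers beyond the limit.

```python
import select
import socket

LIMIT = 3
HOST, PORT = "127.0.0.1", 48765      # assumed address for the master daemon

def prune(workers):
    """Drop sockets whose worker exited (peer close makes recv return b"")."""
    readable, _, _ = select.select(workers, [], [], 0)
    for s in readable:
        if not s.recv(1):
            s.close()
            workers.remove(s)
    return workers

def master():
    srv = socket.socket()
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((HOST, PORT))
    srv.listen()
    workers = []
    while True:
        conn, _ = srv.accept()
        workers = prune(workers)     # free up slots of dead workers first
        if len(workers) < LIMIT:
            conn.sendall(b"OK")
            workers.append(conn)
        else:
            conn.sendall(b"NO")
            conn.close()

def try_start():
    """Worker-side check: keep the returned socket open for our lifetime."""
    s = socket.create_connection((HOST, PORT))
    if s.recv(2) != b"OK":
        s.close()
        return None
    return s                         # closing this later frees our slot
```

A worker simply calls try_start() at startup and exits if it gets None;
its slot is reclaimed automatically when it terminates, because the OS
closes its socket.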

All in all, unless you need the scripts to enforce this behavior even
in the presence of arbitrary use, I'd just use an appropriate crontab
entry and move on to other problems :)

-- David
 
Carl Banks

I'm looking to run a process with a limit of 3 instances, but each
execution is over a crontab interval. I've been investigating the
threading module and using daemons to limit active thread objects, but
I'm not very successful at grasping the documentation.

Is it possible to do what I'm trying to do, and if so, does anyone
know of a useful example to get started?

It seems like you want to limit the number of processes to three; the
threading module won't help you there because it deals with threads
within a single process.

What I'd do is to simply run the system ps to see how many processes
are running (ps is pretty versatile on most systems and can find
specifically targeted processes like your program), and exit if there
are already three.
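A sketch of that check in Python (the process name is an assumption;
calling ps directly rather than through a shell pipeline sidesteps the
grep-matching-its-own-pipeline problem):

```python
import subprocess
import sys

LIMIT = 3
NAME = "script.py"                   # assumed pattern identifying the program

def instance_count():
    """Count running processes whose command line mentions NAME."""
    out = subprocess.check_output(["ps", "-eo", "args"], text=True)
    return sum(NAME in line for line in out.splitlines())

if instance_count() >= LIMIT:
    sys.exit(0)                      # enough instances already; bow out
# ... real work goes here ...
```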

If you really are talking about multiple threads on a single server
process, then you want to use a thread pool (create three threads, and
give them tasks as necessary). But you'll have to have a way for the
process started by crontab to communicate with the server.


Carl Banks
 
Piet van Oostrum

CB> It seems like you want to limit the number of processes to three; the
CB> threading module won't help you there because it deals with threads
CB> within a single process.
CB> What I'd do is to simply run the system ps to see how many processes
CB> are running (ps is pretty versatile on most systems and can find
CB> specifically targeted processes like your program), and exit if there
CB> are already three.

That will surely run into some race conditions. If the limit of 3
processes is soft then that wouldn't be a big deal, however.
 
John D Giotta

I'm working with up to 3 process "sessions" per server, each process
running three threads.
I was hoping to tie the 3 "sessions"/server back to a semaphore, but
everything (and everyone) says semaphores are only good per process.
 
John D Giotta

That was my original idea. Restricting the number of processes by pid:

#bash
procs=`ps aux | grep script.py | grep -v grep | wc -l`

if [ "$procs" -lt 3 ]; then
    python2.4 script.py config.xml
else
    exit 0
fi
 
Carl Banks

That will surely run into some race conditions.

What, the OS might not have gotten around to update the process table
to include a process started minutes ago? (He said he was starting
the processes over crontab intervals, not that he communicated what he
wanted well.)


Carl Banks
 
Piet van Oostrum

CB> What, the OS might not have gotten around to update the process table
CB> to include a process started minutes ago? (He said he was starting
CB> the processes over crontab intervals, not that he communicated what he
CB> wanted well.)

No, but between the time you get the ps output and decide not to start
a new process, one of the processes might have exited. As I said, it
probably is not a big deal, but you (he) should be aware of it, I think.

The other possible race condition: two processes starting at
approximately the same time and both not detecting the other will
probably not occur because of the time distance between starting the
processes by cron. Unless the system is so busy that ps takes a loooong
time.
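One way to close that window is to hold an exclusive lock across the
whole check-and-register step, e.g. with fcntl file locking - a sketch,
with the counter-file path assumed (a crashed instance would still leak
its slot until something resets the file, so the cleanup caveats above
still apply):

```python
import fcntl
import sys

COUNT_FILE = "/tmp/script.count"     # assumed shared counter file
LIMIT = 3

def acquire_slot():
    """Check-and-increment the instance count atomically under a lock."""
    f = open(COUNT_FILE, "a+")
    fcntl.flock(f, fcntl.LOCK_EX)    # no other instance can check right now
    f.seek(0)
    count = int(f.read() or 0)
    if count >= LIMIT:
        f.close()                    # closing the file also drops the lock
        return None
    f.seek(0)
    f.truncate()
    f.write(str(count + 1))
    f.flush()
    fcntl.flock(f, fcntl.LOCK_UN)
    return f                         # keep; hand back to release_slot() on exit

def release_slot(f):
    """Decrement the count when this instance finishes."""
    fcntl.flock(f, fcntl.LOCK_EX)
    f.seek(0)
    count = int(f.read() or 0)
    f.seek(0)
    f.truncate()
    f.write(str(max(count - 1, 0)))
    f.flush()
    f.close()                        # releases the lock
```

Because the check and the increment happen under the same lock, two
instances starting at the same moment can no longer both see a free slot.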

The problem is similar to the sleeping barber problem (3 sleeping
barbers actually).
 