Parallelization with Python: which, where, how?

Mathias

Dear NG,

I have a (pretty much) "embarrassingly parallel" problem and am looking for the
right toolbox to parallelize it over a cluster of homogeneous Linux
workstations. I don't need automatic loop parallelization or the like,
since I prefer to prepare the work packets "by hand".
I simply need
- to specify a list of clients
- a means of sending a work packet to a free client and receiving the
result (hopefully automatically, without needing to log in to each one)
- optionally a timeout mechanism if a client doesn't respond
- optionally help for debugging of remote clients

So far I've seen scipy's COW (cluster of workstations) package, but
couldn't find documentation or even examples for it (and the small
example in the code crashes...).
I've noticed PYRO as well, but haven't looked into it much yet.

Can someone recommend a parallelization approach? Are there examples or
documentation? Does anyone have experience with stability and efficiency?

Thanks a lot,
Mathias
 
Fredrik Lundh

Mathias said:
I have a (pretty much) "embarrassingly parallel" problem and am looking for the right toolbox to
parallelize it over a cluster of homogeneous Linux workstations. I don't need automatic
loop parallelization or the like, since I prefer to prepare the work packets "by hand".
I simply need
- to specify a list of clients
- a means of sending a work packet to a free client and receiving the
result (hopefully automatically, without needing to log in to each one)
- optionally a timeout mechanism if a client doesn't respond
- optionally help for debugging of remote clients

So far I've seen scipy's COW (cluster of workstations) package, but couldn't find documentation or
even examples for it (and the small example in the code crashes...).
I've noticed PYRO as well, but haven't looked into it much yet.

Can someone recommend a parallelization approach? Are there examples or documentation? Does anyone
have experience with stability and efficiency?

googling for "parallel python" brings up lots of references; tools like

http://pympi.sourceforge.net/
http://datamining.anu.edu.au/~ole/pypar/

(see https://geodoc.uchicago.edu/climatewiki/DiscussPythonMPI for
a comparison)

seem to be commonly used.

</F>
 
Ganesan R

Mathias said:
Dear NG,
I have a (pretty much) "embarrassingly parallel" problem and am looking for
the right toolbox to parallelize it over a cluster of homogeneous Linux
workstations. I don't need automatic loop parallelization or the like,
since I prefer to prepare the work packets "by hand".
I simply need
- to specify a list of clients
- a means of sending a work packet to a free client and receiving the
result (hopefully automatically, without needing to log in to each one)
- optionally a timeout mechanism if a client doesn't respond
- optionally help for debugging of remote clients

pypvm or pympi? See http://pypvm.sourceforge.net/ and
http://pympi.sourceforge.net/.

Ganesan
 
Michael Hoffman

Mathias said:
I have a (pretty much) "embarrassingly parallel" problem and am looking for the
right toolbox to parallelize it over a cluster of homogeneous Linux
workstations.

We have a >1000-node cluster here and use the commercial Platform LSF to
manage it. My Poly package
<http://www.ebi.ac.uk/~hoffman/software/poly/> makes that trivial to use
from Python and also avoids many of the pitfalls of programming farms
that large, such as accidental distributed denial of service attacks on
your own fileserver ;)

Due to the cost and difficulty of setup, LSF is probably not what you
want, or you would already have it. But MPI is probably not what you
want if your problem is embarrassingly parallel. I would
look into OpenPBS <http://www.openpbs.org/>. If you want to write a Poly
plugin for OpenPBS, I would be happy to accept it. ;)
 
Albert Hofkamp

Can someone recommend a parallelization approach? Are there examples or
documentation? Does anyone have experience with stability and efficiency?

If you think a light-weight approach of distributing work and collecting
the output afterwards (using ssh/rsh) fits your problem, send me an
email.

Albert
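
For readers wondering what a light-weight ssh-based approach of this kind might look like, here is a minimal sketch. It is not Albert's code, only an illustration under some assumptions: password-less ssh logins to every host are already set up, each work packet is a shell command that prints its result to stdout, and the function name run_packets and the host names are made up for the example. One thread per host pulls the next packet from a shared queue, so a free client always gets the next piece of work, and a timeout marks packets whose host does not answer.

```python
import queue
import subprocess
import threading


def run_packets(hosts, packets, timeout=60.0,
                launcher=("ssh", "-o", "BatchMode=yes")):
    """Run each shell command in `packets` on whichever host is free.

    Returns a dict mapping packet index to the command's stdout,
    or to None if the command failed, timed out, or could not start.
    `launcher` is the command prefix placed before the host name
    (hypothetical parameter, handy for swapping ssh for rsh).
    """
    todo = queue.Queue()
    for item in enumerate(packets):
        todo.put(item)

    results = {}
    lock = threading.Lock()

    def worker(host):
        # Each host keeps taking packets until the queue is empty.
        while True:
            try:
                index, command = todo.get_nowait()
            except queue.Empty:
                return
            try:
                proc = subprocess.run(
                    [*launcher, host, command],
                    capture_output=True, text=True, timeout=timeout,
                )
                output = proc.stdout if proc.returncode == 0 else None
            except (subprocess.TimeoutExpired, OSError):
                # Host did not respond in time, or the launcher is missing.
                output = None
            with lock:
                results[index] = output

    threads = [threading.Thread(target=worker, args=(h,)) for h in hosts]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# Hypothetical usage (host names and script path are placeholders):
#   out = run_packets(["node01", "node02"],
#                     ["python work.py 0", "python work.py 1"],
#                     timeout=300)
```

A failed or timed-out packet is simply recorded as None here; a real setup would probably want to requeue it on another host and keep stderr around for debugging the remote side.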