Parallel Python

Nick Maclaren

|>
|> Most threads on this planet are not used for number crunching jobs,
|> but for "organization of execution".

That is true, and it is effectively what POSIX and Microsoft threads
are suitable for. With reservations, even there.

|> Things like MPI and IPC are just for the area of "small message, big
|> job" - typically scientific number crunching, where you collect the
|> results "at the end of the day". It's more a slow-network technique.

That is completely false. Most dedicated HPC systems use MPI for
heavy message passing over high-speed networks.

|> > They use it for the communication, but don't expose it to the
|> > programmer. It is therefore easy to put the processes on different
|> > CPUs, and get the memory consistency right.
|>
|> Thus communicated data is "serialized" - not directly used as with
|> threads or with custom shared memory techniques like POSH object
|> sharing.

Data is not used as directly with threads as you might think, either.
Even POSIX and Microsoft threads require synchronisation primitives,
and threading models like OpenMP and BSP make that control explicit.

Also, MPI has asynchronous (non-blocking) communication.
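
For instance, a minimal sketch of non-blocking communication from
Python, assuming the third-party mpi4py bindings (my choice for
illustration; any MPI binding with isend/irecv would do):

# Run under an MPI launcher, e.g.: mpiexec -n 2 python isend_demo.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    # isend returns at once with a request handle; the transfer
    # proceeds in the background while this rank keeps working.
    req = comm.isend({'payload': range(10)}, dest=1, tag=7)
    # ... useful computation here, overlapped with the send ...
    req.wait()              # block only when completion matters
elif rank == 1:
    req = comm.irecv(source=0, tag=7)
    data = req.wait()       # wait() returns the received object
    print "rank 1 got", data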


Regards,
Nick Maclaren.
 
sturlamolden

robert said:
Thus communicated data is "serialized" - not directly used as with threads or with custom shared memory techniques like POSH object sharing.

Correct, and that is precisely why MPI code is a lot easier to write
and debug than thread code. The OP used a similar technique in his
'parallel python' project.

This does not mean that MPI is inherently slower than threads, however,
as there is overhead associated with thread synchronization as well.
With 'shared memory' between threads, a lot more fine-grained
synchronization and scheduling is needed, which impairs performance and
often introduces obscure bugs.
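
A minimal sketch of the kind of obscure bug I mean (a toy example,
not from any real code base): in CPython, '+=' on a shared integer is
several byte-code steps, so concurrent increments can be lost unless
every access is locked.

import threading

counter = 0
lock = threading.Lock()

def unsafe_inc(n):
    # load counter, add 1, store back: another thread can interleave
    global counter
    for _ in xrange(n):
        counter += 1

def safe_inc(n):
    # taking the lock serialises the read-modify-write, at a cost
    global counter
    for _ in xrange(n):
        lock.acquire()
        counter += 1
        lock.release()

threads = [threading.Thread(target=unsafe_inc, args=(100000,))
           for _ in xrange(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Often prints less than 400000; swap in safe_inc and it is exact.
print "expected 400000, got", counter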
 
Sergei Organov

Nick Maclaren said:
I mean precisely the first.

The C99 standard uses a bizarre consistency model, which requires serial
execution, and its consistency is defined only in terms of volatile
objects and external I/O. Any form of memory access, signalling or
whatever is outside that model, and is undefined behaviour.

POSIX uses a different but equally bizarre one, based on some function
calls being "thread-safe" and others forcing "consistency" (which is
not actually defined, and there are many possible, incompatible,
interpretations). It leaves all language aspects (including allowed
code movement) to C.

There are no concepts in common between C's and POSIX's consistency
specifications (even when they are precise enough to use), and so no
way of mapping the two standards together.

Ah, now I see what you mean. Even though I only partly agree with what
you've said above, I'll stop arguing as it gets too off-topic for this
group.

Thank you for explanations.

-- Sergei.
 
robert

sturlamolden said:
Correct, and that is precisely why MPI code is a lot easier to write
and debug than thread code. The OP used a similar technique in his
'parallel python' project.

Thus there are different levels of parallelization:

1 file/database based; multiple batch jobs
2 Message Passing, IPC, RPC, ...
3 Object Sharing
4 Sharing of global data space (Threads)
5 Local parallelism / Vector computing, MMX, 3DNow,...

There are good reasons for all of these levels.
Yet "parallel python" to me fakes to be on level 3 or 4 (or even 5 :) ), while its just a level 2 system, where "passing", "remote", "inter-process" ... are the right vocables.

With all these fakes popping up, a GIL-free CPython is a major feature request for Py3K - a name that at least promises to run on 3rd-millennium CPUs ...

This does not mean that MPI is inherently slower than threads however,
as there are overhead associated with thread synchronization as well.

Level 2 communication is slower; it just won't matter a lot for selected apps.
With 'shared memory' between threads, a lot more fine-grained
synchronization and scheduling is needed, which impairs performance and
often introduces obscure bugs.

It's a question of chances, costs and the nature of the application.
Yet one can easily restrict inter-thread communication to be as simple
and modular as IPC, or even simpler. Search e.g. "Python CallQueue" and
"BackgroundCall" on Google.
Thread programming is less complicated than it seems. (It's just that
Python's stdlib offers cumbersome 'non-functional' classes.)
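
A rough sketch of that CallQueue/BackgroundCall idea (my own toy
version, not the actual code those searches turn up): confine all
inter-thread traffic to a single result queue per call.

import threading, Queue

class BackgroundCall(object):
    """Run func(*args) on a worker thread; result() blocks for it."""
    def __init__(self, func, args=()):
        self._done = Queue.Queue(1)
        t = threading.Thread(target=self._run, args=(func, args))
        t.setDaemon(True)
        t.start()

    def _run(self, func, args):
        try:
            self._done.put(('ok', func(*args)))
        except Exception, e:
            self._done.put(('err', e))

    def result(self):
        # The queue is the only shared state; no other locking needed.
        status, value = self._done.get()
        if status == 'err':
            raise value
        return value

call = BackgroundCall(sum, (range(1000),))
print "result:", call.result()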


Robert
 
Nick Maclaren

|>
|> Thus there are different levels of parallelization:
|>
|> 1 file/database based; multiple batch jobs
|> 2 Message Passing, IPC, RPC, ...
|> 3 Object Sharing
|> 4 Sharing of global data space (Threads)
|> 5 Local parallelism / Vector computing, MMX, 3DNow,...
|>
|> There are good reasons for all of these levels.

Well, yes, but to call them "levels" is misleading, as they are closer
to communication methods of a comparable level.

|> > This does not mean that MPI is inherently slower than threads however,
|> > as there are overhead associated with thread synchronization as well.
|>
|> Level 2 communication is slower; it just won't matter a lot for selected apps.

That is false. It used to be true, but that was a long time ago. The
reasons why what seems to be a more heavyweight mechanism (message
passing) can be faster than an apparently lightweight one (data sharing)
are both subtle and complicated.


Regards,
Nick Maclaren.
 
Konrad Hinsen

The 'parallel python' site seems very sparse on the details of how it
is implemented, but it looks like all it is doing is spawning some
subprocesses and using some simple IPC to pass details of the calls
and results. I can't tell from reading it what it is supposed to add
over any of the other systems which do the same.

Combined with the closed-source 'no redistribution' licence I can't
really see anyone using it.

I'd also like to see more details - even though I'd probably never
use any Python module distributed in .pyc form only.

From the bit of information there is on the Web site, the
distribution strategy looks quite similar to my own master-slave
distribution model (based on Pyro) which is part of ScientificPython.
There is an example at

http://dirac.cnrs-orleans.fr/hg/ScientificPython/main/?f=08361040f00a;file=Examples/master_slave_demo.py

and the code itself can be consulted at

http://dirac.cnrs-orleans.fr/hg/ScientificPython/main/?f=bce321680116;file=Scientific/DistributedComputing/MasterSlave.py


The main difference seems to be that my implementation doesn't start
compute jobs itself; it leaves it to the user to start any number he
wants by any means that works for his setup, which allows a lot of
flexibility. In particular, it can work with a variable number of
slave jobs and even handles disappearing slave jobs gracefully.

Konrad.
--
---------------------------------------------------------------------
Konrad Hinsen
Centre de Biophysique Moléculaire, CNRS Orléans
Synchrotron Soleil - Division Expériences
Saint Aubin - BP 48
91192 Gif sur Yvette Cedex, France
Tel. +33-1 69 35 97 15
E-Mail: (e-mail address removed)
---------------------------------------------------------------------
 
parallelpython

sturlamolden said:

Thanks for bringing that into consideration.

I am well aware of MPI and have written several programs in C/C++ and
Fortran which use it.
I would agree that MPI is the most common solution for running software
on a cluster (computers connected by a network). Although there is
another parallelization approach: PVM (Parallel Virtual Machine),
http://www.csm.ornl.gov/pvm/pvm_home.html. I would say ppsmp is more
similar to the latter.

By the way there are links to different python parallelization
techniques (including MPI) from PP site:
http://www.parallelpython.com/component/option,com_weblinks/catid,14/Itemid,23/

The main difference between MPI python solutions and ppsmp is that with
MPI you have to organize both the computations
{MPI_Comm_rank(MPI_COMM_WORLD, &id); if id==1 then ... else ....} and
the data distribution (MPI_Send / MPI_Recv) yourself, while with ppsmp
you just submit a function with arguments to the execution server and
retrieve the results later.
That makes the transition from serial python software to parallel much
simpler with ppsmp than with MPI.

To make this point clearer here is a short example:
--------------------serial code 2 lines------------------
for input in inputs:
    print "Sum of primes below", input, "is", sum_primes(input)
--------------------parallel code 3 lines----------------
jobs = [(input, job_server.submit(sum_primes, (input,), (isprime,),
        ("math",))) for input in inputs]
for input, job in jobs:
    print "Sum of primes below", input, "is", job()
---------------------------------------------------------------
In this example parallel execution was added at the cost of 1 line of
code!

The other difference from MPI is that ppsmp dynamically decides where
to run each given job. For example, if there are other active processes
running in the system, ppsmp will make greater use of the processors
which are free. Since with MPI the whole task is usually divided
equally between processors at the beginning, the overall runtime will
be determined by the slowest-running process (the one which shares a
processor with another running program). In this particular case ppsmp
will outperform MPI.

The third, probably less important, difference is that with MPI-based
parallel python code you must have MPI installed on the system.

Overall ppsmp is still work in progress and there are other interesting
features which I would like to implement. This is the main reason why I
do not open the source of ppsmp - to have better control of its future
development, as advised here: http://en.wikipedia.org/wiki/Freeware :)

Best regards,
Vitalii
 
parallelpython

robert said:
Thus there are different levels of parallelization:

1 file/database based; multiple batch jobs
2 Message Passing, IPC, RPC, ...
3 Object Sharing
4 Sharing of global data space (Threads)
5 Local parallelism / Vector computing, MMX, 3DNow,...

There are good reasons for all of these levels.
Yet "parallel python" to me fakes to be on level 3 or 4 (or even 5 :) ), while its just a level 2
system, where "passing", "remote", "inter-process" ... are the right vocables.
In one of the previous posts I've mentioned that ppsmp is based on
processes + IPC, which makes it a system with level 2 parallelization,
the same level as MPI.
It's also obvious from the fact that it's written completely in python,
as python objects cannot be shared due to the GIL (POSH can do sharing
because it's an extension written in C).
 
Paul Boddie

parallelpython said:
The main difference between MPI python solutions and ppsmp is that with
MPI you have to organize both the computations
{MPI_Comm_rank(MPI_COMM_WORLD, &id); if id==1 then ... else ....} and
the data distribution (MPI_Send / MPI_Recv) yourself, while with ppsmp
you just submit a function with arguments to the execution server and
retrieve the results later.

Couldn't you just provide similar conveniences on top of MPI? Searching
for "Python MPI" yields a lot of existing work (as does "Python PVM"),
so perhaps someone has already done so. Also, what about various grid
toolkits?
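
To sketch what I mean (a toy example of mine using the mpi4py
bindings, one of the projects such a search turns up; the sum_primes
function mirrors the ppsmp example above): the rank/send/receive
plumbing can be hidden behind one scatter and one gather, so the user
code stays close to the serial version.

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

def isprime(n):
    return n > 1 and all(n % d for d in xrange(2, int(n ** 0.5) + 1))

def sum_primes(n):
    return sum(k for k in xrange(2, n) if isprime(k))

# One input per rank: scatter the arguments, compute locally, gather
# the results back on rank 0 - no explicit MPI_Send/MPI_Recv in sight.
inputs = [100000 + 100 * i for i in xrange(size)] if rank == 0 else None
n = comm.scatter(inputs, root=0)
results = comm.gather(sum_primes(n), root=0)

if rank == 0:
    for inp, res in zip(inputs, results):
        print "Sum of primes below", inp, "is", res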

[...]
Overall ppsmp is still work in progress and there are other interesting
features which I would like to implement. This is the main reason why I
do not open the source of ppsmp - to have better control of its future
development, as advised here: http://en.wikipedia.org/wiki/Freeware :)

Despite various probable reactions from people who will claim that
they're comfortable with binary-only products from a single vendor, I
think more people would be inclined to look at your software if you did
distribute the source code, even if they then disregarded what you've
done. My own experience with regard to releasing software is that even
with an open source licence, most people are more likely to ignore your
projects than to suddenly jump on board and take control, and even if
your project somehow struck a chord and attracted a lot of interested
developers, would it really be such a bad thing? Many developers have
different experiences and insights which can only make your project
better, anyway.

Related to your work, I've released a parallel execution solution
called parallel/pprocess [1] under the LGPL and haven't really heard
about anyone really doing anything with it, let alone forking it and
showing my original efforts in a bad light. Perhaps most of the
downloaders believe me to be barking up the wrong tree (or just
barking) with the approach I've taken, but I think the best thing is to
abandon any fears of not doing things the best possible way and just be
open to improvements and suggestions.

Paul

[1] http://www.python.org/pypi/parallel
 
Nick Maclaren

|> >
|> > The main difference between MPI python solutions and ppsmp is that with
|> > MPI you have to organize both the computations
|> > {MPI_Comm_rank(MPI_COMM_WORLD, &id); if id==1 then ... else ....} and
|> > the data distribution (MPI_Send / MPI_Recv) yourself, while with ppsmp
|> > you just submit a function with arguments to the execution server and
|> > retrieve the results later.
|>
|> Couldn't you just provide similar conveniences on top of MPI? Searching
|> for "Python MPI" yields a lot of existing work (as does "Python PVM"),
|> so perhaps someone has already done so.

Yes. No problem.

|> Also, what about various grid toolkits?

If you can find one that is robust enough for real work by someone who
is not deeply into developing Grid software, I will be amazed.


Regards,
Nick Maclaren.
 
robert

Paul said:
[...]
Related to your work, I've released a parallel execution solution
called parallel/pprocess [1] under the LGPL and haven't really heard
about anyone really doing anything with it, let alone forking it and
showing my original efforts in a bad light. Perhaps most of the
downloaders believe me to be barking up the wrong tree (or just
barking) with the approach I've taken, but I think the best thing is to
abandon any fears of not doing things the best possible way and just be
open to improvements and suggestions.

Paul

[1] http://www.python.org/pypi/parallel

I'd be interested in an overview.
For ease of use, a major criterion for me would be a pure python
solution which also does the job of starting and controlling the
other process(es) automatically and correctly (by default) on common
platforms.
Which of the existing (RPC) solutions are that nice?


Robert
 
Neal Becker

parallelpython said:
Has anybody tried to run parallel python applications?
It appears that if your application is computation-bound, using the
'thread' or 'threading' modules will not get you any speedup. That is
because the python interpreter uses the GIL (Global Interpreter Lock)
for internal bookkeeping. The latter allows only one python byte-code
instruction to be executed at a time, even if you have a multiprocessor
computer.
To overcome this limitation, I've created ppsmp module:
http://www.parallelpython.com
It provides an easy way to run parallel python applications on smp
computers.
I would appreciate any comments/suggestions regarding it.
Thank you!

Looks interesting, but is there any way to use this for a cluster of
machines over a network (not smp)?
 
Paul Boddie

robert said:

I'd be interested in an overview.

I think we've briefly discussed the above solution before, and I don't
think you're too enthusiastic about anything using interprocess
communication, which is what the above solution uses. Moreover, it's
intended as a threading replacement for SMP/multicore architectures
where one actually gets parallel execution (since it uses processes).
For ease of use, a major criterion for me would be a pure python
solution which also does the job of starting and controlling the
other process(es) automatically and correctly (by default) on common
platforms.
Which of the existing (RPC) solutions are that nice?

Many people have nice things to say about Pyro, and there seem to be
various modules attempting parallel processing, or at least some kind
of job control, using that technology. See Konrad Hinsen's
ScientificPython solution for an example of this - I'm sure I've seen
others, too.

Paul
 
Konrad Hinsen

Paul said:
[...] My own experience with regard to releasing software is that even
with an open source licence, most people are more likely to ignore your
projects than to suddenly jump on board and take control [...]

My experience is exactly the same. And looking into the big world of
Open Source programs, the only case I ever heard of in which a
project was forked by someone else is the Emacs/XEmacs split. I'd be
happy if any of my projects ever reached that level of interest.
Related to your work, I've released a parallel execution solution
called parallel/pprocess [1] under the LGPL and haven't really heard
about anyone really doing anything with it, let alone forking it [...]

That's one more project... It seems that there is significant
interest in parallel computing in Python. Perhaps we should start a
special interest group? Not so much in order to work on a single
project; I believe that at the current state of parallel computing we
still need many different approaches to be tried. But an exchange of
experience could well be useful for all of us.

Konrad.
--
---------------------------------------------------------------------
Konrad Hinsen
Centre de Biophysique Moléculaire, CNRS Orléans
Synchrotron Soleil - Division Expériences
Saint Aubin - BP 48
91192 Gif sur Yvette Cedex, France
Tel. +33-1 69 35 97 15
E-Mail: (e-mail address removed)
---------------------------------------------------------------------
 
Paul Boddie

Konrad said:
That's one more project... It seems that there is significant
interest in parallel computing in Python. Perhaps we should start a
special interest group? Not so much in order to work on a single
project; I believe that at the current state of parallel computing we
still need many different approaches to be tried. But an exchange of
experience could well be useful for all of us.

I think a special interest group might be productive, but I've seen
varying levels of special interest in the different mailing lists
associated with such groups: the Web-SIG list started with enthusiasm,
produced a cascade of messages around WSGI, then dried up; the XML-SIG
list seems to be a sorry indication of how Python's XML scene has
drifted onto other matters; other such groups have also lost their
momentum.

It seems to me that a more useful first step would be to create an
overview of the different modules and put it on the python.org Wiki:

http://wiki.python.org/moin/FrontPage
http://wiki.python.org/moin/UsefulModules (a reasonable entry point)

If no-one beats me to it, I may write something up over the weekend.

Paul
 
Konrad Hinsen

Paul said:
It seems to me that a more useful first step would be to create an
overview of the different modules and put it on the python.org Wiki:

http://wiki.python.org/moin/FrontPage
http://wiki.python.org/moin/UsefulModules (a reasonable entry point)

If no-one beats me to it, I may write something up over the weekend.

That sounds like a good idea. I won't beat you to it, but I'll have a
look next week and perhaps add information that I have.

Konrad.
--
---------------------------------------------------------------------
Konrad Hinsen
Centre de Biophysique Moléculaire, CNRS Orléans
Synchrotron Soleil - Division Expériences
Saint Aubin - BP 48
91192 Gif sur Yvette Cedex, France
Tel. +33-1 69 35 97 15
E-Mail: (e-mail address removed)
---------------------------------------------------------------------
 
mheslep

Konrad Hinsen wrote:
[...] Perhaps we should start a
special interest group? Not so much in order to work on a single
project; I believe that at the current state of parallel computing we
still need many different approaches to be tried. But an exchange of
experience could well be useful for all of us.
+ 1

-Mark
 
parallelpython

Neal Becker said:
Looks interesting, but is there any way to use this for a cluster of
machines over a network (not smp)?

Networking capabilities will be included in the next release of
Parallel Python software (http://www.parallelpython.com), which is
coming soon.

Paul said:
Couldn't you just provide similar conveniences on top of MPI? Searching
for "Python MPI" yields a lot of existing work (as does "Python PVM"),
so perhaps someone has already done so.

Yes, it's possible to do it on top of any environment which
supports IPC.
Konrad said:
That's one more project... It seems that there is significant
interest in parallel computing in Python. Perhaps we should start a
special interest group? Not so much in order to work on a single
project; I believe that at the current state of parallel computing we
still need many different approaches to be tried. But an exchange of
experience could well be useful for all of us.
Well, I may just add that everybody is welcome to start discussion
regarding any parallel python project or idea in this forum:
http://www.parallelpython.com/component/option,com_smf/Itemid,29/board,2.0
 
A.T.Hofkamp


robert said:
I'd be interested in an overview.
For ease of use, a major criterion for me would be a pure python
solution which also does the job of starting and controlling the
other process(es) automatically and correctly (by default) on common
platforms.

Let me add a few cents to the discussion with this announcement:

About three years ago, I wrote two Python modules, one called
'exec_proxy', which uses ssh to run another exec_proxy instance at a
remote machine, thus providing light-weight transparent access to a
machine across a network.

The idea behind this module was/is that by just using ssh you get
network transparency in a much more light-weight way than with most
other distributed modules, where you have to start daemons at all
machines.
Recently, the 'rthread' module was announced, which seems (judging from
the announcement) to take the same approach. I have not compared the
two modules with each other.


The more interesting Python module, called 'batchlib', sits on top of
the former (or any other module that provides transparency across the
network). It handles distribution of computation jobs in the form of a
'start-computation' and 'get-results' pair of functions.

That is, you give it a set of machines it may use, you tell the entry
point "compute for me this-and-this function with this-and-this
parameters", and batchlib does the rest.
(That is, it finds a free machine, copies the parameters over the
network, runs the job, and transports the result back; you can then
retrieve the result of a computation using the same (unique)
identification you supplied when the job was given to batchlib.)
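
To make the pattern concrete, here is a toy start-computation /
get-results pair (illustrative only - it is not batchlib's actual API,
and it uses local threads where batchlib farms jobs out to machines):

import threading

class JobPool(object):
    def __init__(self):
        self._results = {}
        self._events = {}

    def start_computation(self, ident, func, args):
        # 'ident' is the caller-chosen identification of the job.
        self._events[ident] = done = threading.Event()
        def run():
            self._results[ident] = func(*args)
            done.set()
        threading.Thread(target=run).start()

    def get_results(self, ident):
        # Block until the job with this identification has finished.
        self._events[ident].wait()
        return self._results.pop(ident)

pool = JobPool()
for n in (10, 20, 30):
    pool.start_computation(n, sum, (range(n),))
for n in (10, 20, 30):
    print "job", n, "->", pool.get_results(n)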

We used it as a computation backend for optimization problems, but
since a 'computation job' may mean anything, the module should be very
generically applicable.


Compared to most other parallel/distributed modules, I think that the
other modules more or less compare with exec_proxy (that is, they stop
at transparent network access), whereas exec_proxy was designed to have
minimal impact on the required infrastructure (i.e. just ssh or rsh,
which is generally already available) and thus does without many of the
features available from the other modules.

Batchlib starts where exec_proxy ends, namely lifting the network
primitives to the level of providing a simple way of doing distributed
computations (in the case of exec_proxy, without adding network
infrastructure such as daemons).




Until now, both modules were used in-house, and it was not clear what
we wanted to do further with the software. Recently, we decided that we
have no further use for it (we think we want to move in a different
direction), clearing the way to release it to the community.

You can get the software from my home page: http://seweb.se.wtb.tue.nl/~hat
Both packages can be downloaded, and include documentation and an example.
The bad news is that I will not be able to do further development of these
modules. The code is 'end-of-life' for us.


Maybe you find the software useful,
Albert
 
