Multithreading on a multiprocessor computer

ajikoe

Hello,

Does anyone have experience running multithreaded Python code in
parallel on a multiprocessor machine? Is it possible?

Can Python control which CPU runs each thread?

Sincerely Yours,
Pujo
 
Pierre Barbier de Reuille

(e-mail address removed) wrote:
Hello,

Does anyone have experience running multithreaded Python code in
parallel on a multiprocessor machine? Is it possible?

Can Python control which CPU runs each thread?

Sincerely Yours,
Pujo

There's just no way you can run Python threads in parallel on a
multi-processor machine, because the GIL (Global Interpreter Lock) will
prevent two threads from running Python code concurrently. When I saw
this discussed, the Python developers favored multi-process designs for
multi-processor machines. I think I even heard some discussion about an
efficient inter-process messaging system, but I can't remember where :o)

Hope it'll help.

Pierre
 
ajikoe

Hello Pierre,

That's a pity, since running parallel work on a single
processor is really not efficient. I think using more computers is
cheaper than buying a supercomputer in a developing country.

Sincerely Yours,
pujo aji
 
Alan Kennedy

[[email protected]]
That's a pity, since running parallel work on a single
processor is really not efficient. I think using more computers is
cheaper than buying a supercomputer in a developing country.

Although CPython has a GIL that prevents multiple Python threads *in the
same Python process* from running *inside the Python interpreter* at the
same time (I/O is not affected, for example), you can get around this
by using multiple processes, each bound to a different CPU, and using
some form of IPC (Pyro, CORBA, bespoke, etc.) to communicate between
those processes.
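
A minimal sketch of the multi-process approach described above, assuming a current CPython where the standard library's `multiprocessing` module is available (the thread itself does not mention it). `count_primes` is a made-up CPU-bound workload, and the sketch leaves CPU placement to the OS scheduler rather than binding each process to a CPU.

```python
import math
from multiprocessing import Pool


def count_primes(limit):
    """CPU-bound toy task: count the primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, math.isqrt(n) + 1)):
            count += 1
    return count


if __name__ == "__main__":
    limits = [20_000, 30_000, 40_000, 50_000]
    # Each worker is a separate Python process with its own interpreter and
    # its own GIL; arguments and results are pickled and sent between them.
    with Pool(processes=4) as pool:
        results = pool.map(count_primes, limits)
    print(dict(zip(limits, results)))
```

The pickling of arguments and results is the de/serialization cost of IPC, so this pays off mainly when each task does a lot of computation per byte transferred.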

This solution is not ideal, because it will probably involve
restructuring your app. Also, all of the de/serialization involved in
the IPC will slow things down, unless you're using POSH, a shared memory
based system that requires System V IPC.

http://poshmodule.sf.net

Alternatively, you could simply use either Jython or IronPython, neither of
which has a central interpreter lock (because they rely on JVM/CLR
garbage collection), and thus will support transparent migration of
threads to multiple processors in a multi-CPU system, if the underlying
VM supports that.

http://www.jython.org
http://www.ironpython.com

And you shouldn't have to restructure your code, assuming that it is
already thread-safe?

For interest, I thought I'd mention PyLinda, a distributed object system
that takes a completely different, higher level, approach to object
distribution: it creates "tuple space", where objects live. The objects
can be located and sent messages. But (Py)Linda hides most of the gory
details of how objects actually get distributed, and the mechanics of
actually connecting with those remote objects.

http://www-users.cs.york.ac.uk/~aw/pylinda/

HTH,
 
John Lenton

Hello Pierre,

That's a pity, since running parallel work on a single
processor is really not efficient. I think using more computers is
cheaper than buying a supercomputer in a developing country.

and buying more cheap computers gives you more processing power than
buying fewer multi-processor computers. So the best thing you can do
is learn to leverage some distributed computing scheme. Take a look at
Pyro, and its Event server.

--
John Lenton ([email protected]) -- Random fortune:
When the cup is full, carry it level.

 
Aahz

Does anyone have experience running multithreaded Python code in
parallel on a multiprocessor machine? Is it possible?

Yes. The GIL prevents multiple Python threads from running
simultaneously, but C extensions can release the GIL; all I/O functions
in the standard library do, so threading Python makes sense for e.g. web
spiders. See the slides on my website for more info.

Can Python control which CPU runs each thread?

Nope.
--
Aahz ([email protected]) <*> http://www.pythoncraft.com/

"The joy of coding Python should be in seeing short, concise, readable
classes that express a lot of action in a small amount of clear code --
not in reams of trivial code that bores the reader to death." --GvR
 
Paul Rubin

John Lenton said:
and buying more cheap computers gives you more processing power than
buying fewer multi-processor computers.

The day is coming when even cheap computers have multiple CPUs.
See hyperthreading and the coming multi-core P4s, and the finally
announced Cell processor.

Conclusion: the GIL must die.
 
Aahz

The day is coming when even cheap computers have multiple CPUs.
See hyperthreading and the coming multi-core P4s, and the finally
announced Cell processor.

Conclusion: the GIL must die.

It's not clear to what extent these processors will perform well with
shared memory space. One of the things I remember most about Bruce
Eckel's discussions of Java and threading is just how broken Java's
threading model is in certain respects when it comes to CPU caches
failing to maintain cache coherency. It's always going to be true that
getting fully scaled performance will require more CPUs with non-shared
memory -- that's going to mean IPC with multiple processes instead of
threads.

Don't get me wrong -- I'm probably one of the bigger boosters of threads.
But it bugs me when people think that getting rid of the GIL will be the
Holy Grail of Python performance. No way. No how. No time.
--
Aahz ([email protected]) <*> http://www.pythoncraft.com/
 
Jack Diederich

I'm looking forward to Multi-core P4s (and Opterons). The Cell is a
non-starter for general purpose computing. Ars Technica has a couple of
good pieces on it; the upshot is that it is one normal processor
with eight strange floating point co-processors hanging off it.

It's not clear to what extent these processors will perform well with
shared memory space. One of the things I remember most about Bruce
Eckel's discussions of Java and threading is just how broken Java's
threading model is in certain respects when it comes to CPU caches
failing to maintain cache coherency. It's always going to be true that
getting fully scaled performance will require more CPUs with non-shared
memory -- that's going to mean IPC with multiple processes instead of
threads.

Don't get me wrong -- I'm probably one of the bigger boosters of threads.
But it bugs me when people think that getting rid of the GIL will be the
Holy Grail of Python performance. No way. No how. No time.

"Me too!" For a small number of processors (four) it is easier (and
usually sufficient) to pipeline functional parts into different
processes than it is to thread the whole monkey. As a bonus this usually
gives you scaling across machines (and not just CPUs) for cheap. I'm
aware there are some problems where this isn't true. From reading this
thread every couple of months on c.l.py for the last few years, it is my
opinion that the number of people who think threading is the only solution
to their problem greatly outnumbers the number of people who actually have
such a problem (like, nearly all of them).
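
One way to picture "pipelining functional parts into different processes" is the sketch below. It assumes the standard library's `multiprocessing` module (not mentioned in the thread); the two stages `parse` and `total` and the `STOP` sentinel are invented for illustration.

```python
from multiprocessing import Process, Queue

STOP = None  # sentinel that tells a stage to shut down


def parse(line):
    """Stage 1 work: turn a text line into a list of ints."""
    return [int(tok) for tok in line.split()]


def total(numbers):
    """Stage 2 work: reduce the parsed numbers to one value."""
    return sum(numbers)


def stage(func, inbox, outbox):
    """Run `func` on every item from `inbox` until the sentinel arrives."""
    for item in iter(inbox.get, STOP):
        outbox.put(func(item))
    outbox.put(STOP)  # pass the shutdown signal downstream


if __name__ == "__main__":
    raw, parsed, done = Queue(), Queue(), Queue()
    workers = [
        Process(target=stage, args=(parse, raw, parsed)),
        Process(target=stage, args=(total, parsed, done)),
    ]
    for w in workers:
        w.start()
    for line in ["1 2 3", "10 20", "7"]:
        raw.put(line)
    raw.put(STOP)
    print(list(iter(done.get, STOP)))
    for w in workers:
        w.join()
```

Because the stages only talk through queues, swapping the queues for a network transport is what would let the same structure spread across machines.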

Killing the GIL is proposing a silver bullet where there is no werewolf-ly,

-Jack
 
Courageous

Killing the GIL is proposing a silver bullet where there is no werewolf-ly,

About the only reason for killing the GIL is /us/. We, purists,
pythonistas, language nuts, or what not, who for some reason or
other simply hate the idea of the GIL. I'd view it as an artistic
desire, unurgent, something to plan for the future canvas upon
which our painting is painted...

C//
 
Mike Meyer

Jack Diederich said:
From reading this
thread every couple of months on c.l.py for the last few years, it is my
opinion that the number of people who think threading is the only solution
to their problem greatly outnumbers the number of people who actually have
such a problem (like, nearly all of them).

Hear, hear. I find that threading typically introduces worse problems
than it purports to solve.

<mike
 
Courageous

Hear, hear. I find that threading typically introduces worse problems
than it purports to solve.

I recently worked on a software effort, arguably one of the most
important software efforts in existence, in which individuals
responsible for critical performance of the application threw
arbitrarily large numbers of threads at a problem, on a multi
processor machine, on a problem that was intrinsically IO-bound.

The ease with which one can get into problems with threads (and
these days, also with network comms) leads to many problems if
the engineers aren't acquainted sufficiently with the theory.

Don't get me started on the big clusterfucks I've seen evolve
from CORBA...

C//
 
Nick Coghlan

Mike said:
Hear, hear. I find that threading typically introduces worse problems
than it purports to solve.

In my experience, threads should mainly be used if you need asynchronous access
to a synchronous operation. You spawn the thread to make the call, it blocks on
the relevant API, then notifies the main thread when it's done.

Since any sane code will release the GIL before making the blocking call, this
scales to multiple CPUs just fine.
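
A rough sketch of that pattern: a worker thread makes the blocking call and hands the result back through a queue. `blocking_call` and `run_in_background` are hypothetical names, and `time.sleep` (which releases the GIL) stands in for the real synchronous API.

```python
import queue
import threading
import time


def blocking_call(seconds):
    """Stand-in for a synchronous API; time.sleep releases the GIL."""
    time.sleep(seconds)
    return f"finished after {seconds}s"


def run_in_background(func, *args):
    """Start `func` in a worker thread; return a queue that receives its result."""
    results = queue.Queue(maxsize=1)
    worker = threading.Thread(
        target=lambda: results.put(func(*args)), daemon=True
    )
    worker.start()
    return results


if __name__ == "__main__":
    pending = run_in_background(blocking_call, 0.5)
    print("main thread is free to do other work")
    print(pending.get())  # blocks only until the worker posts its result
```

Passing the result through a `queue.Queue` means the two threads never touch shared state directly, which sidesteps most locking concerns.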

Another justification for threads is when you have a multi-CPU machine, and a
processor intensive operation you'd like to farm off to a separate CPU. In that
case, you can treat the long-running operation like any other synchronous call,
and farm off a thread that releases the GIL before starting the time-consuming
operation.

The only time the GIL "gets in the way" is if the long-running operation you
want to farm off is itself implemented in Python.

However, consider this: threads run on a CPU, so if you want to run multiple
threads concurrently, you either need multiple CPUs or a time-slicing scheduler
that fakes it.

Here's the trick: PYTHON THREADS DO NOT RUN DIRECTLY ON THE CPU. Instead, they
run on a Python Virtual Machine (or the JVM/CLR Runtime/whatever), which then
runs on the CPU. So, if you want to run multiple Python threads concurrently,
you need multiple PVMs or a timeslicing scheduler. The GIL represents the latter.

Now, Python *could* try to provide the ability to have multiple virtual machines
in a single process in order to more effectively exploit multiple CPUs. I have
no idea if Java or the CLR work that way - my guess is that they do (or
something that looks the same from a programmer's POV). But then, they have
Sun/Microsoft directly financing the development teams.

A much simpler suggestion is that if you want a new PVM, just create a new OS
process to run another copy of the Python interpreter. The effectiveness of your
multi-CPU utilisation will then be governed by your OS's ability to correctly
schedule multiple processes rather than by the PVM's ability to fake multiple
processes using threads (Hint: the former is likely to be much better than the
latter).
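
As an illustration of "just create a new OS process to run another copy of the Python interpreter", here is a small sketch using the standard `subprocess` module. The child program in `CHILD_CODE` and the helper names are made up for the example.

```python
import subprocess
import sys

# Hypothetical child program: sums the squares below the number given on argv.
CHILD_CODE = "import sys; print(sum(i * i for i in range(int(sys.argv[1]))))"


def start_child(n):
    """Launch a second Python interpreter as its own OS process."""
    return subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE, str(n)],
        stdout=subprocess.PIPE,
        text=True,
    )


def sum_squares_in_children(values):
    # Start every child first so the OS can schedule them on different CPUs,
    # then collect the results.
    children = [start_child(n) for n in values]
    return [int(child.communicate()[0]) for child in children]


if __name__ == "__main__":
    print(sum_squares_in_children([10, 100, 1000]))
```

Each child has its own interpreter and its own GIL, so how well the CPUs are used is left entirely to the OS scheduler.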

Additionally, schemes for inter-process communication are often far more
scaleable than those for inter-thread communication, since the former generally
can't rely on shared memory (although good versions may utilise it for
optimisation purposes). This means they can usually be applied to clustered
computing rather effectively.

I would *far* prefer to see effort expended on making the idiom mentioned in the
last couple of paragraphs simple and easy to use, rather than on a misguided
effort to "Kill the GIL".

Cheers,
Nick.

P.S. If the GIL *really* bothers you, check out Stackless Python. As I
understand it, it does its best to avoid the C stack (and hence threads) altogether.
 
Aahz

Hear, hear. I find that threading typically introduces worse problems
than it purports to solve.

Depends what you're trying to do with threads. Threads are definitely a
good technique for managing long-running work in a GUI application.
Threads are also good for handling blocking I/O. Threads can in theory
be useful for computational processing, but Python provides almost no
support for that.
--
Aahz ([email protected]) <*> http://www.pythoncraft.com/
 
Mike Meyer

Threads are also good for handling blocking I/O.

Actually, this is one of the cases I was talking about. I find it
saner to convert to non-blocking I/O and use select() for
synchronization. That solves the problem, without introducing any of
the headaches related to shared access and locking that come with
threads.
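
For readers who have not seen the select() style, a minimal single-threaded sketch follows. It uses the standard `select` and `socket` modules; `wait_for_messages` is a hypothetical helper, and `socket.socketpair()` stands in for real network connections so no server is needed.

```python
import select
import socket


def wait_for_messages(sockets, expected, timeout=1.0):
    """Collect `expected` messages from the given sockets in one thread."""
    received = []
    while len(received) < expected:
        # select() blocks until at least one socket is readable (or timeout).
        readable, _, _ = select.select(sockets, [], [], timeout)
        if not readable:
            break  # nothing arrived in time
        for sock in readable:
            received.append(sock.recv(1024))
    return received


if __name__ == "__main__":
    a_read, a_write = socket.socketpair()
    b_read, b_write = socket.socketpair()
    for s in (a_read, b_read):
        s.setblocking(False)
    a_write.sendall(b"from a")
    b_write.sendall(b"from b")
    print(sorted(wait_for_messages([a_read, b_read], expected=2)))
    for s in (a_read, a_write, b_read, b_write):
        s.close()
```

All the waiting happens in one place, so there is no shared state to lock; the cost is that the program has to be structured around the select() loop.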

<mike
 
Courageous

Actually, this is one of the cases I was talking about. I find it
saner to convert to non-blocking I/O and use select() for
synchronization. That solves the problem, without introducing any of
the headaches related to shared access and locking that come with
threads.

Threads aren't always the right entity for dealing with asynchronicity,
one might say.

C//
 
Paul Rubin

Mike Meyer said:
Actually, this is one of the cases I was talking about. I find it
saner to convert to non-blocking I/O and use select() for
synchronization. That solves the problem, without introducing any of
the headaches related to shared access and locking that come with
threads.

It's just a different style with its own tricks and traps. Threading
for blocking i/o is a well-accepted idiom and if Python supports
threads at all, people will want to use them that way.
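
The threads-for-blocking-I/O idiom might look like the sketch below. It assumes the standard `concurrent.futures` module; `fake_fetch` is a made-up stand-in in which `time.sleep` plays the role of a blocking download, and the URLs are placeholders.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_fetch(url, delay=0.2):
    """Stand-in for a blocking download; time.sleep releases the GIL."""
    time.sleep(delay)
    return url, len(url)


def fetch_all(urls):
    # One thread per URL: the waits overlap because a sleeping (or
    # I/O-blocked) thread does not hold the GIL.
    with ThreadPoolExecutor(max_workers=len(urls)) as pool:
        return list(pool.map(fake_fetch, urls))


if __name__ == "__main__":
    urls = [f"http://example.invalid/page{i}" for i in range(5)]
    start = time.perf_counter()
    pages = fetch_all(urls)
    print(len(pages), "pages in", round(time.perf_counter() - start, 2), "s")
```

The total time stays close to a single delay rather than the sum of all of them, which is the whole appeal of this style for I/O-bound work.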
 
Aahz

Actually, this is one of the cases I was talking about. I find
it saner to convert to non-blocking I/O and use select() for
synchronization. That solves the problem, without introducing any of
the headaches related to shared access and locking that come with
threads.

It may be saner, but Windows doesn't support select() for file I/O, and
Python's threading mechanisms make this very easy. If one's careful
with application design, there should be no locking problems. (Have you
actually written any threaded applications in Python?)
--
Aahz ([email protected]) <*> http://www.pythoncraft.com/
 
Frans Englich

[email protected] (Aahz) said:
It may be saner, but Windows doesn't support select() for file I/O, and
Python's threading mechanisms make this very easy. If one's careful
with application design, there should be no locking problems. (Have you
actually written any threaded applications in Python?)

Hehe.. the first thing a Google search on "python non-blocking io threading"
returns is "Threading is Evil".

Personally I need a solution which touches this discussion. I need to run
multiple processes, which I communicate with via stdin/out, simultaneously,
and my plan was to do this with threads. Any favorite document pointers,
common traps, or something else which could be good to know?
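
One possible shape for this, offered as a sketch rather than a recommendation: one thread per child process, each using the standard `subprocess` module to feed stdin and collect stdout. The child program in `CHILD_CODE` and the helper names are invented for the example.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Hypothetical child: reads all of stdin and writes it back upper-cased.
CHILD_CODE = "import sys; sys.stdout.write(sys.stdin.read().upper())"


def talk_to_child(text):
    """Run one child process, feed it `text` on stdin, return its stdout."""
    # subprocess.run() writes stdin and reads stdout together, which avoids
    # the deadlock that can occur when a pipe buffer fills up.
    done = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input=text,
        capture_output=True,
        text=True,
        check=True,
    )
    return done.stdout


def talk_to_many(texts):
    # One thread per child; each thread spends its time blocked on pipes,
    # so the children run simultaneously.
    with ThreadPoolExecutor(max_workers=len(texts)) as pool:
        return list(pool.map(talk_to_child, texts))


if __name__ == "__main__":
    print(talk_to_many(["spam", "eggs"]))
```

A common trap with hand-rolled pipe handling is writing all of stdin before reading any stdout; if the child fills its output pipe first, both sides block.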


Cheers,

Frans
 
