dual processor

John Brawley

Greetings, all.
I have a program I'm trying to speed up by putting it on a new machine.
The new machine is a Compaq W6000 2.0 GHz workstation with dual XEON
processors.
I've gained about 7x speed over my old machine, which was a 300 MHz AMD
K6II, but I think there ought to be an even greater speed gain due to the
two XEONs.
However, the thought occurs that Python (2.4.1) may not have the ability to
take advantage of the dual processors, so my question:
Does it?
If not, who knows where there might be info from people trying to make
Python run 64-bit, on multiple processors?
Thanks!

John Brawley


--
peace
JB
(e-mail address removed)
http://tetrahedraverse.com
NOTE! Charter is not blocking viruses,
Therefore NO ATTACHMENTS, please;
They will not be downloaded from the Charter mail server.
__Prearrange__ any attachments, with me first.
 

Paul Rubin

John Brawley said:
However, the thought occurs that Python (2.4.1) may not have the ability to
take advantage of the dual processors, so my question:
Does it?
No.

If not, who knows where there might be info from people trying to make
Python run 64-bit, on multiple processors?
Thanks!

Nobody is trying it in any serious way.
 

Jeremy Jones

John said:
Greetings, all.
I have a program I'm trying to speed up by putting it on a new machine.
The new machine is a Compaq W6000 2.0 GHz workstation with dual XEON
processors.
I've gained about 7x speed over my old machine, which was a 300 MHz AMD
K6II, but I think there ought to be an even greater speed gain due to the
two XEONs.
However, the thought occurs that Python (2.4.1) may not have the ability to
take advantage of the dual processors, so my question:
Does it?
Sure, but you have to write the program to do it. One Python process
will only saturate one CPU (at a time) because of the GIL (global
interpreter lock). If you can break up your problem into smaller
pieces, you can do something like start multiple processes to crunch the
data and use shared memory (which I haven't tinkered with...yet) to pass
data around between processes. Or an idea I've been tinkering with
lately is to use a BSD DB between processes as a queue just like
Queue.Queue in the standard library does between threads. Or you could
use Pyro between processes. Or CORBA.
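Jeremy's "multiple processes with a queue" idea can be sketched with the multiprocessing module Python grew later (in 2.6 — it did not exist when this thread was written); the function and variable names below are purely illustrative:

```python
# Sketch of "start multiple processes to crunch the data", using the
# multiprocessing module added in Python 2.6 (it did not exist when
# this thread was written). Names here are illustrative.
import multiprocessing

def worker(in_q, out_q):
    """Pull chunks off the queue, crunch them, push results back."""
    while True:
        chunk = in_q.get()
        if chunk is None:          # sentinel: no more work
            break
        out_q.put(sum(x * x for x in chunk))

def crunch(data, nworkers=2):
    ctx = multiprocessing.get_context("fork")   # Unix; keeps the sketch simple
    in_q, out_q = ctx.Queue(), ctx.Queue()
    procs = [ctx.Process(target=worker, args=(in_q, out_q))
             for _ in range(nworkers)]
    for p in procs:
        p.start()
    chunks = [data[i::nworkers] for i in range(nworkers)]  # strided split
    for c in chunks:
        in_q.put(c)
    for _ in procs:
        in_q.put(None)             # one sentinel per worker
    total = sum(out_q.get() for _ in chunks)
    for p in procs:
        p.join()
    return total

print(crunch(range(1000)))         # 332833500
```

Each worker is an independent Python process with its own GIL, so on an SMP box the kernel can schedule them on different CPUs.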
If not, who knows where there might be info from people trying to make
Python run 64-bit, on multiple processors?
Thanks!

HTH,

JMJ
 

Paul Rubin

Jeremy Jones said:
to pass data around between processes. Or an idea I've been tinkering
with lately is to use a BSD DB between processes as a queue just like
Queue.Queue in the standard library does between threads. Or you
could use Pyro between processes. Or CORBA.

I think that doesn't count as using the multiple processors; it's
just multiple programs that could be on separate boxes.
Multiprocessing means shared memory.

This module might be of interest: http://poshmodule.sf.net
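For comparison, the standard library itself much later grew a shared-memory facility in the same spirit (multiprocessing.shared_memory, Python 3.8); a minimal sketch, shown single-process for brevity:

```python
# Minimal sketch of stdlib shared memory (Python 3.8+), in the spirit
# of what POSH offered. Shown single-process for brevity; a second
# process would attach to the same segment by shm.name.
from multiprocessing import shared_memory

shm = shared_memory.SharedMemory(create=True, size=16)
shm.buf[0] = 42              # a writer process stores a byte
value = shm.buf[0]           # a reader attached by name would see 42
shm.close()
shm.unlink()                 # free the segment
print(value)                 # 42
```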
 

Jeremy Jones

Paul said:
I think that doesn't count as using the multiple processors; it's
just multiple programs that could be on separate boxes.
Multiprocessing means shared memory.
I disagree. My (very general) recommendation implies multiple
processes, very likely multiple instances (on the consumer side) of the
same "program". The OP wanted to know how to get Python to "take
advantage of the dual processors." My recommendation does that. Not in
the sense of a single process fully exercising multiple CPUs, but it's
an option, nonetheless. So, in that respect, your initial "no" was
correct. But,
This module might be of interest: http://poshmodule.sf.net
Yeah - that came to mind. Never used it. I need to take a peek at
that. This module keeps popping up in discussions like this one.

JMJ
 

William Park

John Brawley said:
Greetings, all. I have a program I'm trying to speed up by putting it
on a new machine. The new machine is a Compaq W6000 2.0 GHz
workstation with dual XEON processors. I've gained about 7x speed
over my old machine, which was a 300 MHz AMD K6II, but I think there
ought to be an even greater speed gain due to the two XEONs. However,
the thought occurs that Python (2.4.1) may not have the ability to
take advantage of the dual processors, so my question: Does it? If
not, who knows where there might be info from people trying to make
Python run 64-bit, on multiple processors? Thanks!

Break up your problem into 2 independent parts, and run 2 Python
processes. Your kernel should be an SMP kernel, though.
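William's split-in-two suggestion might look like this bare-bones sketch (Unix-only, using os.fork; the helper names are made up for illustration):

```python
# Bare-bones version of "break the problem into 2 parts, run 2
# processes" using os.fork (Unix only). On an SMP kernel each half can
# land on its own CPU. Helper names are made up for illustration.
import os
import pickle

def run_split(func, part_a, part_b):
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:                   # child: crunch one half, report, exit
        os.close(r)
        os.write(w, pickle.dumps(func(part_b)))   # small results only
        os._exit(0)
    os.close(w)
    result_a = func(part_a)        # parent crunches the other half meanwhile
    with os.fdopen(r, "rb") as f:
        result_b = pickle.loads(f.read())
    os.waitpid(pid, 0)
    return result_a + result_b

data = list(range(100000))
half = len(data) // 2
print(run_split(sum, data[:half], data[half:]))   # 4999950000
```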

--
William Park <[email protected]>, Toronto, Canada
ThinFlash: Linux thin-client on USB key (flash) drive
http://home.eol.ca/~parkw/thinflash.html
BashDiff: Super Bash shell
http://freshmeat.net/projects/bashdiff/
 

Nick Craig-Wood

Jeremy Jones said:
One Python process will only saturate one CPU (at a time) because
of the GIL (global interpreter lock).

I'm hoping python won't always be like this.

If you look at another well known open source program (the Linux
kernel) you'll see the progression I'm hoping for. At the moment
Python is at the Linux 2.0 level. It supports multiple processors,
but has a single lock (Python == Global Interpreter Lock, Linux == Big
Kernel Lock).

Linux then took the path of splitting the BKL into smaller and smaller
locks, increasing the scalability over multiple processors.
By 2.6 we have a fully preemptible kernel, lock-less
read-copy-update, etc.

Splitting the GIL introduces performance and memory penalties. It's
been tried before in Python (I can't find the link at the moment -
sorry!). Exactly the same complaint was heard when Linux started
splitting its BKL.

However it's crystal clear now that the future is SMP. Modern chips seem to
have hit the GHz barrier, and now the easy meat for the processor
designers is to multiply silicon and make multiple thread / core
processors all in a single chip.

So, I believe Python has got to address the GIL, and soon.

A possible compromise (also used by Linux) would be to have two Python
binaries: one with the GIL, which will be faster on uniprocessor
machines, and one with a system of fine-grained locking for
multiprocessor machines. This would be selected at compile time using
C Macro magic.
 

Alan Kennedy

[Jeremy Jones]
[Nick Craig-Wood]
I'm hoping python won't always be like this.

Me too.
However it's crystal clear now that the future is SMP.
Definitely.

So, I believe Python has got to address the GIL, and soon.

I agree.

I note that PyPy currently also has a GIL, although it should hopefully
go away in the future.

"""
Armin and Richard started to change genc so that it can handle the
new external objects that Armin had to introduce to implement
threading in PyPy. For now we have a simple GIL but it is not
really deeply implanted in the interpreter so we should be able to
change that later. After two days of hacking they were finished.
Despite that it is still not possible to translate PyPy with
threading because we are missing dictionaries with int keys on the
RPython level.
"""

http://codespeak.net/pipermail/pypy-dev/2005q3/002287.html

The more I read about such global interpreter locks, the more I think
that the difficulty in getting rid of them lies in implementing portable
and reliable garbage collection.

Read this thread to see what Matz has to say about threading in Ruby.

http://groups.google.com/group/comp.lang.ruby/msg/dcf5ca374e6c5da8

One of these years I'm going to have to set aside a month or two to go
through and understand the cpython interpreter code, so that I have a
first-hand understanding of the issues.
 

Scott David Daniels

Nick said:
Splitting the GIL introduces performance and memory penalties....
However it's crystal clear now that the future is SMP. Modern chips seem to
have hit the GHz barrier, and now the easy meat for the processor
designers is to multiply silicon and make multiple thread / core
processors all in a single chip.
So, I believe Python has got to address the GIL, and soon.
However, there is no reason to assume that those multiple cores must
work in the same process. One of the biggest issues in running python
in multiple simultaneously active threads is that the Python opcodes
themselves are no longer indivisible. Making a higher-level language
that allows updates to work with multiple threads involves lots of
coordination between threads simply to know when data structures are
correct and when they are in transition.

Even processes sharing some memory (in a "raw binary memory" style) are
easier to write and test. You'd lose too much processor time to coordination
effort which is likely unnecessary. The simplest example I can think
of is decrementing a reference count. Only one thread can be allowed to
DECREF at any given time for fear of leaking memory, even though it will
most often turn out the objects being DECREF'ed by distinct threads are
themselves distinct.

In short, two Python threads running simultaneously cannot trust that
any basic Python data structures they access are in a consistent state
without some form of coordination.
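The coordination Scott describes shows up even at the Python source level: a read-modify-write like `value += 1` is several interpreter steps (load, add, store), so truly concurrent threads must serialize it themselves. A minimal sketch with an explicit lock (names are illustrative):

```python
# Minimal illustration of the coordination problem: value += 1 is
# several interpreter steps (load, add, store), so concurrent threads
# must take a lock around it or risk losing updates.
import threading

class Counter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:           # one thread at a time past this point
            self.value += 1

def hammer(counter, times):
    for _ in range(times):
        counter.increment()

c = Counter()
threads = [threading.Thread(target=hammer, args=(c, 10000))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(c.value)                     # 40000, reliably, thanks to the lock
```

A free-threaded interpreter would need this kind of protection around every internal structure, refcounts included.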

--Scott David Daniels
(e-mail address removed)
 

Nick Craig-Wood

Scott David Daniels said:
However, there is no reason to assume that those multiple cores must
work in the same process.

No, of course not. However, if they aren't, then you've got the horrors
of IPC to deal with, which is difficult to do fast and portably. Much
easier to communicate with another thread, especially with the lovely
Python threading primitives.
One of the biggest issues in running python in multiple
simultaneously active threads is that the Python opcodes themselves
are no longer indivisible. Making a higher-level language that
allows updates to work with multiple threads involves lots of
coordination between threads simply to know when data structures
are correct and when they are in transition.

Sure! No one said it was easy. However I think it can be done for all
of Python's native data types, and in a way that is completely
transparent to the user.
Even processes sharing some memory (in a "raw binary memory" style) are
easier to write and test. You'd lose too much processor time to coordination
effort which is likely unnecessary. The simplest example I can think
of is decrementing a reference count. Only one thread can be allowed to
DECREF at any given time for fear of leaking memory, even though it will
most often turn out the objects being DECREF'ed by distinct threads are
themselves distinct.

Yes, locking is expensive. If we placed a lock in every Python object,
that would bloat memory usage and burn CPU time grabbing and releasing
all those locks. However, if it meant your threaded program could use
90% of all 16 CPUs, rather than 100% of one, I think it's obvious where
the payoff lies.

Memory is cheap. Multiple cores (SMP/SMT) are everywhere!
In short, two Python threads running simultaneously cannot trust
that any basic Python data structures they access are in a
consistent state without some form of coordination.

Aye, lots of locking is needed.
 

Paul Rubin

Nick Craig-Wood said:
Yes, locking is expensive. If we placed a lock in every Python object,
that would bloat memory usage and burn CPU time grabbing and releasing
all those locks. However, if it meant your threaded program could use
90% of all 16 CPUs, rather than 100% of one, I think it's obvious where
the payoff lies.

Along with fixing the GIL, I think PyPy needs to give up on this
BASIC-style reference counting and introduce real garbage collection.
Lots of work has been done on concurrent GC and the techniques for it
are reasonably understood by now, especially if there's no hard
real-time requirement.
 

Terry Reedy

Paul Rubin said:
Along with fixing the GIL, I think PyPy needs to give up on this
BASIC-style reference counting and introduce real garbage collection.
Lots of work has been done on concurrent GC and the techniques for it
are reasonably understood by now, especially if there's no hard
real-time requirement.

I believe that the GC method (ref count versus other) either is now or will be
a PyPy compile option. Flexibility in certain implementation details
leading to flexibility in host systems is, I also recall, part of their EC
funding rationale. But check their announcements, etc., for verification
and details.

Terry J. Reedy
 

Steve Jorgensen

I'm hoping python won't always be like this.

I don't get that. Python was never designed to be a high-performance
language, so why add complexity to its implementation by giving it
high-performance capabilities like SMP? You can probably get a bigger speed
improvement for most tasks by writing them in C than by running them on 2
processors in an interpreted language.

Instead of trying to make Python into a high-performance language, why not
try to factor out the smallest possible subset of the program that really
needs the performance boost, write that as a library in C, then put all the
high-level control logic, UI, etc. in Python? The C code can then use threads
and forks if need be to benefit from SMP.
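The flavor of that split can be seen even without writing any C, using C-implemented builtins as a stand-in for a custom extension (a sketch of the principle, not Steve's actual proposal; the function names are made up):

```python
# Sketch of "keep the hot loop in C": the C-implemented builtins
# map/sum/operator.mul stand in here for a hand-written C extension,
# while the high-level control logic stays in Python either way.
import operator

def dot_pure(xs, ys):
    # every element bounces through the bytecode interpreter
    total = 0
    for x, y in zip(xs, ys):
        total += x * y
    return total

def dot_c(xs, ys):
    # map, operator.mul and sum all iterate in C, so the
    # per-element work never re-enters the bytecode loop
    return sum(map(operator.mul, xs, ys))

xs = list(range(1000))
print(dot_pure(xs, xs) == dot_c(xs, xs))   # True
print(dot_c(xs, xs))                       # 332833500
```

A real C library could go further and release the GIL (or fork) inside the hot loop, which is exactly the SMP benefit Steve describes.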
 

Michael Sparks

Steve said:
I don't get that. Python was never designed to be a high performance
language, so why add complexity to its implementation by giving it
high-performance capabilities like SMP?

It depends on personal perspective. If in a few years time we all have
machines with multiple cores (eg the CELL with effective 9 CPUs on a chip,
albeit 8 more specialised ones), would you prefer that your code *could*
utilise your hardware sensibly rather than not?

Or put another way - would you prefer to write your code mainly in a
language like python, or mainly in a language like C or Java? If python,
it's worth worrying about!

If it was Python (or similar) you might "only" have to worry about
concurrency issues. If it's a language like C you might have to worry
about memory management, typing AND concurrency (oh my!).
(Let alone C++'s TMP :)

Regards,


Michael
 

Steve Jorgensen

Michael Sparks said:
It depends on personal perspective. If in a few years time we all have
machines with multiple cores (eg the CELL with effective 9 CPUs on a chip,
albeit 8 more specialised ones), would you prefer that your code *could*
utilise your hardware sensibly rather than not?

Or put another way - would you prefer to write your code mainly in a
language like python, or mainly in a language like C or Java? If python,
it's worth worrying about!

If it was Python (or similar) you might "only" have to worry about
concurrency issues. If it's a language like C you might have to worry
about memory management, typing AND concurrency (oh my!).
(Let alone C++'s TMP :)

Regards,


Michael

That argument makes some sense, but I'm still not sure I agree. Rather than
make Python programmers have to deal with concurrency issues in every app to
get it to make good use of the hardware it's on, why not have many of the
common libraries that Python uses to do processing take advantage of SMP when
you use them. A database server is a good example of a way we can already do
some of that today. Also, what if things like hash table updates were made
lazy (if they aren't already) and could be processed as background operations
to have the table more likely to be ready when the next hash lookup occurs.
 

Carl Friedrich Bolz

Terry said:
I believe that gc method (ref count versus other) either is now or will be
a PyPy compile option. Flexibility in certain implementation details
leading to flexibility in host systems is, I also recall, part of their EC
funding rationale. But check their announcements, etc., for verification
and details.

At the moment it is possible to choose between a refcounting GC and the
Boehm-Demers-Weiser garbage collector (a conservative mark&sweep GC) as
an option when you translate PyPy to C (when translating to LLVM only
the Boehm collector is supported). We plan to add more sophisticated
(and exact) GCs during the next phase of the project. Some amount of
work on this was done during my Summer of Code project (see
http://codespeak.net/pypy/dist/pypy/doc/garbage_collection.html)
although the results are not yet completely integrated into the
translation process.

In addition we plan to add threading with some sort of more fine-grained
locking as a compile time option although it is not really clear yet how
that will work in detail :). Right now you can translate with a GIL or
with no thread-support at all.

Carl Friedrich Bolz
 

Grant Edwards

I'm hoping python won't always be like this.

Quite a few people are. :)
So, I believe Python has got to address the GIL, and soon.

It would be nice if greater concurrency was possible, but it's
open source: Python hasn't "got" to do anything.
 

Jeremy Jones

Michael said:
It depends on personal perspective.
Ummmm....not totally. It depends on what you're doing. If what you're
doing is not going to be helped by scaling across a bunch of CPUs, then
you're just as well off if Python still has the GIL. Sort of. Steve
brings up an interesting argument of making the language do some of your
thinking for you. Maybe I'll address that momentarily....

I'm not saying I wish the GIL would stay around. I wish it would go.
As the price of computers goes down, the number of CPUs per computer
goes up, and the price per CPU in a single system goes down, the ability
to utilize a bunch of CPUs is going to become more important. And maybe
Steve's magical thinking programming language will have a ton of merit.
If in a few years time we all have
machines with multiple cores (eg the CELL with effective 9 CPUs on a chip,
albeit 8 more specialised ones), would you prefer that your code *could*
utilise your hardware sensibly rather than not?
I'm not picking on you. Trust me. But let me play devil's advocate for
a sec. Let's say we *could* fully utilize a multi CPU today with
Python. What has that bought us (as the amorphous "we" of the Python
community)? I would almost bet money that the majority of code would
not be helped by that at all. I'd almost bet that the vast majority of
Python code out there runs single threaded and would see no performance
boost whatsoever. Who knows, maybe that's why we still have the GIL.
Or put another way - would you prefer to write your code mainly in a
language like python, or mainly in a language like C or Java?
There are benefits to writing code in C and Java apart from
concurrency. Most of them are masochistic, but there are benefits
nonetheless. For my programming buck, Python wins hands down.

But I agree with you. Python really should start addressing solutions
for concurrent tasks that will benefit from simultaneously utilizing
multiple CPUs.
JMJ
 

Jeremy Jones

Steve said:
That argument makes some sense, but I'm still not sure I agree. Rather than
make Python programmers have to deal with concurrency issues in every app to
get it to make good use of the hardware it's on, why not have many of the
common libraries that Python uses to do processing take advantage of SMP when
you use them. A database server is a good example of a way we can already do
some of that today. Also, what if things like hash table updates were made
lazy (if they aren't already) and could be processed as background operations
to have the table more likely to be ready when the next hash lookup occurs.
Now, *this* is a really interesting line of thought. I've got a feeling
that it'd be pretty tough to implement something like this in a
language, though. An application like an RDBMS is one thing, an
application framework another, and a programming language is yet a
different species altogether. It'd have to be insanely intelligent
code, though. If you had bunches of Python processes, would they all
start digging into each list or generator or hash to try to predict what
the code is going to potentially need next? Is this predictive behavior
going to chew up more CPU time than it should? What about memory?
You've got to store the predictive results somewhere. Sounds great.
Has some awesomely beneficial implications. Sounds hard as anything to
implement well, though.

JMJ
 

Steve Jorgensen

Jeremy Jones wrote:
Now, *this* is a really interesting line of thought. I've got a feeling
that it'd be pretty tough to implement something like this in a
language, though. An application like an RDBMS is one thing, an
application framework another, and a programming language is yet a
different species altogether. It'd have to be insanely intelligent
code, though. If you had bunches of Python processes, would they all
start digging into each list or generator or hash to try to predict what
the code is going to potentially need next? Is this predictive behavior
going to chew up more CPU time than it should? What about memory?
You've got to store the predictive results somewhere. Sounds great.
Has some awesomely beneficial implications. Sounds hard as anything to
implement well, though.

I think you're making the concept harder than it needs to be. From what I'm
told by folks who know SMP way better than me, it's usually best to just look
for stuff that's waiting to be done and do it without regard to whether the
result will ever be used. That tends to be more efficient than trying to
figure out the most useful tasks to do.

In this case, it would just be keeping a list of dirty hash tables, and
having a process that pulls the next one from the queue, and cleans it.
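A toy version of that dirty-table queue might look like this (modern spelling; in the Python of this thread the module was Queue, and all the names here are made up):

```python
# Toy version of the "dirty hash table" queue: a background thread
# pulls dirty tables off a queue and rebuilds them, so lookups are
# more likely to find a clean table. All names are illustrative; in
# 2005-era Python the queue module was spelled Queue.
import queue
import threading

def cleaner(dirty_q, clean_tables):
    """Background worker: rebuild each dirty table as it arrives."""
    while True:
        name, raw_items = dirty_q.get()
        if name is None:           # sentinel: shut down
            break
        clean_tables[name] = dict(raw_items)   # "clean" the table

dirty_q = queue.Queue()
clean_tables = {}
worker = threading.Thread(target=cleaner, args=(dirty_q, clean_tables))
worker.start()

# Producers just mark tables dirty and move on.
dirty_q.put(("users", [("alice", 1), ("bob", 2)]))
dirty_q.put(("hosts", [("db", "10.0.0.5")]))
dirty_q.put((None, None))          # no more work
worker.join()
print(sorted(clean_tables))        # ['hosts', 'users']
```

With a thread the GIL still serializes the rebuild; the dedicated-process version Steve describes is the same shape with a process-safe queue instead.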
 
