threading support in python


skip

Sandra> However, I don't expect that the GIL can be safely removed from
Sandra> CPython.

It was removed at one point in the dim, dark past (circa Python 1.4) on an
experimental basis. Aside from the huge amount of work, it resulted in
significantly lower performance for single-threaded apps (that is, the
common case). Maybe more effort should have been put in at that time to
improve performance, but that didn't happen. Much more water has gone under
the bridge at this point, so extracting the GIL from the core would be
correspondingly more difficult.

Skip
 

km

Hi all,
And yet, Java programmers manage to write threaded applications all
day long without getting bitten (once they're used to the issues),
despite usually being less skilled than Python programmers ;-).
These days, even semi-entry-level consumer laptop computers have dual
core CPU's, and quad Opteron boxes (8-way multiprocessing using X2
processors) are quite affordable for midrange servers or engineering
workstations, and there's endless desire to write fancy server apps
completely in Python. There is no point paying for all that
multiprocessor hardware if your programming language won't let you use
it. So, Python must punt the GIL if it doesn't want to keep
presenting undue obstacles to writing serious apps on modern hardware.

True.
The GIL implementation must have had its own good reasons when it was
designed, but as the language evolves it is essential to broaden its
scope so that it fits many more usage areas (e.g. scientific
applications using multiprocessors).

In the modern scientific age, where a
__multiprocessor_execution_environment__ is quite common, I feel there
is a need to rethink the introduction of true parallelization
capabilities in Python.
I know many of my friends who did not choose Python for the obvious
reason that the nature of thread execution in the presence of the GIL
means wasting sophisticated hardware resources.


##########################################
if __name__ == '__multiprocessor_execution_environment__':
    for python_version in ('2.4.x', '2.5.x', '3.x'):
        if python_version.GIL:
            print 'unusable for computation-intensive multiprocessor architecture'
        else:
            print cmp(python, java)
############################################

regards,
KM
 

Bryan Olson

Paul said:
Shared memory means there's a byte vector (the shared memory region)
accessible to multiple processes. The processes don't use the same
machine addresses to reference the vector. Any data structures
(e.g. those containing pointers) shared between the processes have to
be marshalled in and out of the byte vector instead of being accessed
normally.

I think it's even worse. The standard Python library offers
shared memory, but not cross-process locks. Sharing read-write
memory looks like an automatic race condition. I guess one could
implement one of the primitive spin-lock based mutual exclusion
algorithms, but I think even that would depend on non-portable
assumptions about cache consistency.
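To make that concrete, here is a minimal Python 2-era sketch (assuming Unix,
so os.fork() is available; the 4-byte counter layout is made up for
illustration) of two processes updating an anonymous shared mmap with no
cross-process lock. The unsynchronized read-modify-write is exactly the
race being described:

import mmap, os, struct

# An anonymous mmap created before fork() is shared by both processes.
shared = mmap.mmap(-1, 4)
shared[:4] = struct.pack('i', 0)

def bump(n):
    for _ in xrange(n):
        value, = struct.unpack('i', shared[:4])    # read
        shared[:4] = struct.pack('i', value + 1)   # unsynchronized write

pid = os.fork()
bump(100000)
if pid:
    os.waitpid(pid, 0)
    # With no cross-process lock the two read-modify-write loops can
    # interleave, so the final count usually falls well short of 200000.
    print struct.unpack('i', shared[:4])[0]
else:
    os._exit(0)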
 

Richard Brodie

I know many of my friends who did not choose Python for the obvious
reason that the nature of thread execution in the presence of the GIL
means wasting sophisticated hardware resources.

It would probably be easier to find smarter friends than to remove the
GIL from Python.
 

km

True, since smartness is a comparison, my friends who have chosen Java
over Python for its true threading support are smarter, which makes me
a dumbo! :)

KM
 

Richard Brodie

True, since smartness is a comparison, my friends who have chosen Java
over Python for its true threading support are smarter, which makes me
a dumbo! :)

No, but I think you're making unwise assumptions about performance.
You have to ask yourself: is Amdahl's law really hurting me?

In some situations Python could no doubt benefit from fine-grained
locking. However, scientific programming is likely not one of them,
because most of the heavy lifting is done in C or C++ extensions,
which can run in parallel if they release the GIL. Or you are going to
use a compute farm, and fork as many worker processes as you have
cores.
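The "fork as many worker processes as you have cores" route needs nothing
beyond the standard library. A rough sketch, assuming Unix; crunch() and
the chunking scheme are hypothetical stand-ins for real number-crunching:

import os

def crunch(chunk):
    # stand-in for the real computation
    return sum(x * x for x in chunk)

def parallel_sum(data, nworkers=4):
    # Fork one worker per core; each child writes its partial result to
    # a pipe and exits, so the GIL never serializes the heavy work.
    size = (len(data) + nworkers - 1) // nworkers
    readers = []
    for i in range(nworkers):
        r, w = os.pipe()
        if os.fork() == 0:                          # child
            os.close(r)
            os.write(w, str(crunch(data[i * size:(i + 1) * size])))
            os._exit(0)
        os.close(w)                                 # parent keeps only r
        readers.append(r)
    total = 0
    for r in readers:
        total += int(os.read(r, 64))
        os.close(r)
        os.wait()
    return total

print parallel_sum(range(1000000))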

You might find these slides from SciPy 2004 interesting:
http://datamining.anu.edu.au/~ole/pypar/py4cfd.pdf
 

Steve Holden

Sandra> However, I don't expect that the GIL can be safely removed from
Sandra> CPython.

It was removed at one point in the dim, dark past (circa Python 1.4) on an
experimental basis. Aside from the huge amount of work, it resulted in
significantly lower performance for single-threaded apps (that is, the
common case). Maybe more effort should have been put in at that time to
improve performance, but that didn't happen. Much more water has gone under
the bridge at this point, so extracting the GIL from the core would be
correspondingly more difficult.
Given the effort that GIL-removal would take, I'm beginning to wonder if
PyPy doesn't offer a better way forward than CPython, in terms of
execution speed improvements returned per developer-hour.

regards
Steve
 

skip

Steve> Given the effort that GIL-removal would take, I'm beginning to
Steve> wonder if PyPy doesn't offer a better way forward than CPython,
Steve> in terms of execution speed improvements returned per
Steve> developer-hour.

How about execution speed improvements per hour of discussion about removing
the GIL? ;-)


Skip
 

skip

Richard> It would probably be easier to find smarter friends than to
Richard> remove the GIL from Python.

And if the friends you find are smart enough, they can remove the GIL for
you!

Skip
 

sjdevnull

Bryan said:
I think it's even worse. The standard Python library offers
shared memory, but not cross-process locks.

File locks are supported by the standard library (at least on Unix,
I've not tried on Windows). They work cross-process and are a normal
method of interprocess locking even in C code.
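From Python that looks roughly like this (a sketch assuming Unix; the
lock-file path is hypothetical), with fcntl.lockf acting as a blocking
cross-process mutex:

import fcntl

# Advisory file lock: every cooperating process must take the same lock.
lockfile = open('/tmp/myapp.lock', 'w')      # hypothetical path

fcntl.lockf(lockfile, fcntl.LOCK_EX)         # blocks until we hold the lock
try:
    pass   # ... touch the shared resource here ...
finally:
    fcntl.lockf(lockfile, fcntl.LOCK_UN)     # let the next process in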
 

skip

Andre> This seems to be an important issue and fit for discussion in the
Andre> context of Py3k. What is Guido's opinion?

Dunno. I've never tried channeling Guido before. You'd have to ask him.
Well, maybe Tim Peters will know. He channels Guido on a fairly regular
basis.

Skip
 

Sandra-24

You can do the same on Windows if you use CreateProcessEx to create the
new processes and pass a NULL SectionHandle. I don't think this helps
in your case, but I was correcting your impression that "you'd have to
physically double the computer's memory for a dual core, or quadruple
it for a quadcore". That's just not even near true.

Sorry, my bad. What I meant to say is that for my application I would
have to increase the memory linearly with the number of cores. I have
about 100mb of memory that could be shared between processes, but
everything else would really need to be duplicated.
As I said, Apache used to run on Windows with multiple processes; using
a version that supports that is one option. There are good reasons not
to do that, though, so you could be stuck with threads.

I'm not sure it has done that since the 1.3 releases. mod_python will
work for that, but involves going way back in it's release history as
well. I really don't feel comfortable with that, and I don't doubt I'd
give up a lot of things I'd miss.
Having memory protection is superior to not having it--OS designers
spent years implementing it, why would you toss out a fair chunk of it?
Being explicit about what you're sharing is generally better than not.

Actually, I agree. If shared memory will prove easier, then why not use
it, if the application lends itself to that.
But as I said, threads are a better solution if you're sharing the vast
majority of your memory and have complex data structures to share.
When you're starting a new project, really think about whether they're
worth the considerable tradeoffs, though, and consider the merits of a
multiprocess solution.

There are merits, the GIL being one of those. I believe I can fairly
easily rework things into a multi-process environment by duplicating
memory. Over time I can make the memory usage more efficient by sharing
some data structures out, but that may not even be necessary. The
biggest problem is learning my way around Linux servers. I don't think
I'll choose that option initially, but I may work on it as a project in
the future. It's about time I got more familiar with Linux anyway.
It's almost certainly not worth rewriting a large established
codebase.

Lazy me is in perfect agreement.
I disagree with this, though. The benefits of deterministic GC are
huge and I'd like to see ref-counting semantics as part of the language
definition. That's a debate I just had in another thread, though, and
don't want to repeat.

I just took it for granted that a GC like Java and .NET use is better.
I'll dig up that thread and have a look at it.
I didn't say that. It can be a big hit or it can be unnoticeable. It
depends on your application. You have to benchmark to know for sure.

But if you're trying to make a guess: if you're doing a lot of heavy
lifting in native modules then the GIL may be released during those
calls, and you might get good multithreading performance. If you're
doing lots of I/O requests the GIL is generally released during those
and things will be fine. If you're doing lots of heavy crunching in
Python, the GIL is probably held and can be a big performance issue.
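That is easy to check for yourself. A minimal timing sketch of the
pure-Python-crunching case (the loop body and counts are arbitrary):

import threading, time

def crunch(n=2500000):
    # pure-Python busy work: the GIL stays held the whole time
    while n:
        n -= 1

def timed(nthreads):
    threads = [threading.Thread(target=crunch) for _ in range(nthreads)]
    start = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.time() - start

# On a multi-core box the two-thread run typically takes about twice as
# long as the single-thread run, even though a second core is idle,
# because only one thread at a time can execute Python bytecode.
print '1 thread :', timed(1)
print '2 threads:', timed(2)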

I don't do a lot of work in native modules, other than the standard
library things I use, which doesn't count as heavy lifting. However I
do a fair amount of database calls, and either the GIL is released by
MySQLdb, or I'll contribute a patch so that it is. At any rate, I will
measure, and I suspect the GIL will not be an issue.

-Sandra
 

Paul Rubin

It was removed at one point in the dim, dark past (circa Python 1.4) on an
experimental basis. Aside from the huge amount of work, it resulted in
significantly lower performance for single-threaded apps (that is, the
common case).

That's probably because they had to put locking and unlocking around
every access to a reference count. A real GC might have fixed that.
 

Paul Rubin

File locks are supported by the standard library (at least on Unix,
I've not tried on Windows). They work cross-process and are a normal
method of interprocess locking even in C code.

I may be missing your point but I didn't realize you could use file
locks to synchronize shared memory in any useful way. File locks are
usually made and released when the file is opened and closed, or at
best through flock or fcntl calls. Shared memory locks should
generally be done with mechanisms like futex, that in the no-wait case
should not involve any system calls.
 

sjdevnull

Paul said:
I may be missing your point but I didn't realize you could use file
locks to synchronize shared memory in any useful way.

You can, absolutely. If you're sharing memory through mmap it's
usually the preferred solution; fcntl locks ranges of an open file, so
you lock exactly the portions of the mmap that you're using at a given
time.

It's not an unusual use at all, Unix programs have used file locks in
this manner for upwards of a decade--things like the Apache public
runtime use fcntl or flock for interprocess mutexes, and they're quite
efficient. (The futexes you mentioned are a very recent Linux
innovation).
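A sketch of that pattern (assuming Unix; the file path and fixed-size
record layout are made up for illustration): fcntl.lockf takes a length
and a start offset, so the lock covers only the bytes of the mmap'd file
that are actually being touched.

import fcntl, mmap, os, struct

SIZE = 4       # bytes per record (hypothetical layout)
RECORD = 0     # index of the record we want to update

# The backing file must already exist and be non-empty to be mmap'd.
fd = os.open('/tmp/shared.dat', os.O_RDWR)
m = mmap.mmap(fd, 0)

# Exclusive lock on just this record's byte range of the backing file.
fcntl.lockf(fd, fcntl.LOCK_EX, SIZE, RECORD * SIZE)
try:
    count, = struct.unpack('i', m[RECORD * SIZE:(RECORD + 1) * SIZE])
    m[RECORD * SIZE:(RECORD + 1) * SIZE] = struct.pack('i', count + 1)
finally:
    fcntl.lockf(fd, fcntl.LOCK_UN, SIZE, RECORD * SIZE)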
 

Paul Rubin

You can, absolutely. If you're sharing memory through mmap it's
usually the preferred solution; fcntl locks ranges of an open file, so
you lock exactly the portions of the mmap that you're using at a given
time.

How can it do that without having to touch the PTE for every single
page in the range, which might be gigabytes? For that matter, how can
it do that on regions smaller than a page? And how does another
process query whether a region is locked, without taking a kernel trap
if it's locked? This sounds absolutely horrendous compared to a
futex, which should usually be just one or two user-mode instructions
and no context switches.
It's not an unusual use at all, Unix programs have used file locks in
this manner for upwards of a decade--things like the Apache public
runtime use fcntl or flock for interprocess mutexes, and they're quite
efficient. (The futexes you mentioned are a very recent Linux
innovation).

Apache doesn't use shared memory in the same way that something like a
database does, so maybe it can more easily tolerate the overhead of
fcntl. Futex is just a somewhat standardized way to do what
programmers have done less portably since the dawn of multiprocessors.
 

Bryan Olson

File locks are supported by the standard library (at least on Unix,
I've not tried on Windows). They work cross-process and are a normal
method of interprocess locking even in C code.

Ah, O.K. Like Paul, I was unaware of how Unix file locking worked
with mmap.
 
