Will multithreading make python less popular?

sturlamolden

And the story begins here. As i search on the net, I have found that
because of the natural characteristics of python such as GIL, we are
not able to write multi threaded programs. Oooops, in a kind of time
with lots of cpu cores and we are not able to write multi threaded
programs.

The GIL does not prevent multithreaded programs. If it did, why does
Python have a "threading" module?

The GIL prevents one use of threads: parallel processing in plain
Python. You can still do parallel processing using processes. Just
import "multiprocessing" instead of "threading". The two modules have
fairly similar APIs. You can still use threads to run tasks in the
background.
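
As a minimal sketch of how close the two APIs are, the helper below starts the same worker either as threads or as processes simply by swapping the class. The names square_into and run_workers are made up for illustration, and a multiprocessing.Queue is used in both cases only so that one code path serves both worker types.

```python
import multiprocessing
import threading


def square_into(results, n):
    """Worker body: put the square of n on the shared queue."""
    results.put(n * n)


def run_workers(worker_cls, values):
    """Start one worker per value.

    worker_cls is expected to be threading.Thread or
    multiprocessing.Process; both accept target= and args=.
    """
    results = multiprocessing.Queue()
    workers = [worker_cls(target=square_into, args=(results, n)) for n in values]
    for w in workers:
        w.start()
    # Drain the queue before joining so no worker blocks on a full pipe.
    out = sorted(results.get(timeout=10) for _ in workers)
    for w in workers:
        w.join()
    return out


if __name__ == "__main__":
    # Same call, different worker class.
    print(run_workers(threading.Thread, [1, 2, 3]))
    print(run_workers(multiprocessing.Process, [1, 2, 3]))
```

The `if __name__ == "__main__":` guard matters for the process variant on platforms that start child processes by re-importing the main module.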

The GIL by the way, is an implementation detail. Nobody likes it very
much. But it is used for making it easier to extend Python with C
libraries (Python's raison d'etre). Not all C libraries are thread-
safe. The GIL is also used to synchronize access to reference counts.
In fact, Ruby is finally going to get a GIL as well. So it can't be
that bad.

As for parallel processing and multicore processors:

1. Even if a Python script can only exploit one core, we are always
running more than one process on the computer. For some reason this
obvious fact has to be repeated.

2. Parallel processing implies "need for speed". We have a 200x speed
penalty from Python alone. The "need for speed" is better served by
moving computational bottlenecks to C or Fortran. And in this case,
the GIL does not prevent us from doing parallel processing. The GIL
only affects the Python portion of the code.

3. Threads are not designed to be an abstraction for parallel
processing. For this they are awkward, tedious, and error prone.
Current threading APIs were designed for asynchronous tasks. Back in
the 1990s when multithreading became popular, SMPs were a luxury only
a few could afford, and multicore processors were unheard of.

4. The easy way to do parallel processing is not threads but OpenMP,
MPI, or HPF. Threads are used internally by OpenMP and HPF, but those
implementation details are taken care of by the compiler. Parallel
computers have been used by scientists and engineers for three decades
now, and threads have never been found a useful abstraction for manual
coding. Unfortunately, this knowledge has not been passed on from
physicists and engineers to the majority of computer programmers.
Today, there is a whole generation of misguided computer scientists
thinking that threads must be the way to use the new multicore
processors. Take a lesson from those actually experienced with
parallel computers and learn OpenMP!

5. If you still insist on parallel processing with Python threads --
ignoring what you can do with multiprocessing and native C/Fortran
extensions -- you can still do that as well. Just compile your script
with Cython or Pyrex and release the GIL manually. The drawback is
that you cannot touch any Python objects (only C objects) in your GIL-
free blocks. But after all, the GIL is used to synchronize reference
counting, so you would have to synchronize access to the Python
objects anyway.


# Cython source (.pyx), not plain Python: "with nogil" is Cython syntax
import threading

def _threadproc():
    with nogil:
        # we do not hold the GIL here
        pass
    # now we have got the GIL back
    return

def foobar():
    t = threading.Thread(target=_threadproc)
    t.start()
    t.join()

That's it.





Sturla Molden
 
sturlamolden

As you mentioned, using multi cores makes programs more fast and more
popular. But what about stackless python? Does it interpret same set
of python libraries with Cpython or Does it have a special sub set?

Stackless and CPython have a GIL, Jython and IronPython do not.

S.M.
 
Hyuga

Hi everybody,
I am an engineer. I am trying to improve my software development
abilities. I have started programming with ruby. I like it very much
but i want to add something more. According to my previous research i
have designed a learning path for myself. It's like something below.
      1. Ruby (Mastering as much as possible)
      2. Python (Mastering as much as possible)
      3. Basic C++ or Basic Java
And the story begins here. As i search on the net,  I have found that
because of the natural characteristics of python such as GIL, we are
not able to write multi threaded programs. Oooops, in a kind of time
with lots of cpu cores and we are not able to write multi threaded
programs. That is out of fashion. How a such powerful language doesn't
support multi threading. That is a big minus for python. But there is
something interesting, something like multi processing. But is it a
real alternative for multi threading. As i searched it is not, it
requires heavy hardware requirements (lots of memory, lots of cpu
power). Also it is not easy to implement, too much extra code...

After all of that, i start to think about omitting python from my
career path and directly choosing c++ or java. But i know google or
youtube uses python very much. How can they choose a language which
will be killed by multi threading a time in near future. I like python
and its syntax, its flexibility.

What do you think about multi threading and its effect on python. Why
does python have such a break and what is the fix. Is it worth to make
investment of time and money to a language it can not take advantage
of multi cores?

Though I'm sure this has already been shot to death, I would just add
that maybe the better question would be: "Will Python make
multithreading less popular?" My answer would be something along the
lines of, that would be nice, but it just depends on how many people
adopt Python for their applications, realize they can't use threads to
take advantage of multiple CPUs, ask this same bloody question for the
thousandth time, and are told to use the multiprocessing module.
 
Graham Dumpleton

multiprocessing is already implemented for you in the standard
library. Of course it does not require heavy hardware requirements.

You can take advantage of multi cores, just not with threads but with
processes, which BTW is the right way to go in most situations. So
(assuming you are not a troll) you are just mistaken in thinking that
the only way to use multicores is via multithreading.

It is also a mistaken belief that you cannot take advantage of multi
cores with multiple threads inside of a single process using Python.

What no one seems to remember is that when calls are made into Python
extension modules implemented in C code, they have the option of
releasing the Python GIL. By releasing the Python GIL at that point,
it would allow other Python threads to run at the same time as
operations are being done in C code in the extension module.

Obviously if the extension module needs to manipulate Python objects
it will not be able to release the GIL, but not all extension modules
are going to be like this and could have quite sizable sections of
code that can run with the GIL released. Thus, you could have many
threads running sections of C code at the same time, alongside the one
thread currently running Python code.

A very good example of this is when embedding Python inside of
Apache. So much stuff is actually done inside of Apache C code with
the GIL released, that there is more than ample opportunity for
multiple threads to be running across cores at the same time.

Graham
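
A small sketch of the point above, using only the standard library: CPython's hashlib documentation says the GIL is released while hashing inputs larger than 2047 bytes, so several threads can spend time inside that C code at once. The helper names digest and digest_all are invented for this example, and any actual speedup depends on the machine and the Python build.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def digest(block):
    """Hash one block; the hashing itself runs in C code."""
    return hashlib.sha256(block).hexdigest()


def digest_all(blocks, workers=4):
    """Hash every block using a pool of ordinary Python threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order in its results.
        return list(pool.map(digest, blocks))


if __name__ == "__main__":
    # Four 8 MiB blocks, large enough for the hashing to dominate.
    blocks = [bytes([i]) * (8 * 1024 * 1024) for i in range(4)]
    for hexdigest in digest_all(blocks):
        print(hexdigest)
```

The threads still take turns whenever they touch Python objects; only the time spent inside the hashing call can overlap.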
 
Michele Simionato

It is also a mistaken belief that you cannot take advantage of multi
cores with multiple threads inside of a single process using Python.

What no one seems to remember is that when calls are made into Python
extension modules implemented in C code, they have the option of
releasing the Python GIL. By releasing the Python GIL at that point,
it would allow other Python threads to run at the same time as
operations are being done in C code in the extension module.

You are perfectly right and no one forgot this point,
I am sure. However I think we were answering to the question
"can pure Python code take advantage of multiple CPUs
via multithreading" and the answer is no. Of course a
C extension can do that, but that is beside the point.
It is still worth noticing - as you did -
that some well known Python (+C extension) library -
such as mod_wsgi - is already well equipped to manage
multithreading on multiple cores without any further
effort from the user. This is a good point.

Michele Simionato
 
rushenaly

Thank you for all your answers...

I think i am going to pick Java instead of Python...

Rushen
 
Steve Holden

Thank you for all your answers...

I think i am going to pick Java instead of Python...

Well, good luck. See what a helpful bunch of people you meet in the
Python world? Glad you found all the advice helpful. Come back when you
want to try Python!

regards
Steve
 
rushenaly

Thank you Steve,

I really wanted to learn python, but as i said i don't want to make a
dead investment. I hope someone can fix these design errors and maybe
can write an interpreter in python :)

Thank you so much great community...
Rushen
 
Tim Rowe

2009/2/19 said:
Thank you Steve,

I really wanted to learn python, but as i said i don't want to make a
dead investment. I hope someone can fix these design errors and maybe
can write an interpreter in python :)

Good luck with Java, and with your search for a perfect language. I
think it will be a long search.
 
rushenaly

Thank you Tim...

It is not a search for perfect language. It is a search for a capable
language to modern worlds' needs.

Rushen
 
Tim Rowe

2009/2/19 said:
Thank you Tim...

It is not a search for perfect language. It is a search for a capable
language to modern worlds' needs.

That would be just about any of the ones you mentioned, then. Unless
you mean the needs of a specific project, in which case the
suitability will depend on the project.
 
sturlamolden

I really wanted to learn python, but as i said i don't want to make a
dead investment. I hope someone can fix these design errors and maybe
can write an interpreter in python :)

Java and Python have different strengths and weaknesses. There is no
such thing as "the perfect language". It all depends on what you want
to do.

Just be aware that scientists and engineers who need parallel
computers do not use Java or C#. A combination of a scripting language
with C or Fortran seems to be preferred. Popular scripting languages
for numerical computing include Python, R, IDL, Perl, and Matlab.

You will find that in the Java community, threads are generally used
for other tasks than parallel computing, and mainly asynchronous I/O.
Java does not have a GIL, nor does Jython or Microsoft's IronPython.
But if you use threads for I/O, the GIL does not matter. Having more
than one CPU does not make your harddisk or network connection any
faster. The GIL does not matter before crunching numbers on the CPU
becomes the bottleneck. And when you finally get there, perhaps it is
time to look into some C programming? Those that complain about
CPython's GIL (or the GIL of Perl/PHP/Ruby for that matter) seem to be
developers who have no or little experience with parallel computers.
Yes, the GIL prevents Python threads from being used in a certain way.
But do you really need to use threads like that? Or do you just think
you do?
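
To illustrate the I/O case, here is a rough sketch in which time.sleep stands in for a blocking read (a blocked thread releases the GIL, as a real socket or disk wait would). The functions fake_io and timed_run are hypothetical names for this example only.

```python
import threading
import time


def fake_io(delay):
    """Pretend to wait on a disk or socket; the GIL is released while blocked."""
    time.sleep(delay)


def timed_run(n_threads=4, delay=0.2):
    """Run n_threads simulated I/O waits concurrently and return elapsed seconds."""
    threads = [threading.Thread(target=fake_io, args=(delay,)) for _ in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = timed_run()
    # Four 0.2 s waits overlap, so this is expected to be far below 0.8 s.
    print(f"elapsed: {elapsed:.2f} s")
```

Because the waits overlap, the total should stay close to a single delay rather than the sum of all of them, GIL or no GIL.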

S.M.
 
Richard Brodie

The GIL does not matter before crunching numbers on the CPU
becomes the bottleneck. And when you finally get there, perhaps it is
time to look into some C programming?

Or numpy on a 512 core GPGPU processor, because using the CPU
for crunching numbers is just *so* dated. ;)
 
Paul Rubin

sturlamolden said:
Yes, the GIL prevents Python threads from being used in a certain way.
But do you really need to use threads like that? Or do you just think
you do?

How old is your computer, why did you buy it, and is it the first one
you ever owned?

For most of us, I suspect, it is not our first one, and we bought it
to get a processing speedup relative to the previous one. If such
speedups were useless or unimportant, we would not have blown our hard
earned cash replacing perfectly good older hardware, so we have to
accept the concept that speed matters and ignore those platitudes that
say otherwise.

It used to be that new computers were faster than the old ones because
they ran at higher clock rates. That was great, no software changes
at all were required to benefit from the higher speed. Now, they get
the additional speed by having more cores. That's better than nothing
but making use of it requires fixing the GIL.
 
Steve Holden

Thank you Steve,

I really wanted to learn python, but as i said i don't want to make a
dead investment. I hope someone can fix these design errors and maybe
can write an interpreter in python :)

Thank you so much great community...
Rushen

By the way, since you have chosen Java you might be interested to know
that the Jython implementation (also open source) generates JVM
bytecode, and allows you to freely mix Java and Python classes.

There is no Global Interpreter Lock in Jython ...

regards
Steve
 
Falcolas

...  If such
speedups were useless or unimportant, we would not have blown our hard
earned cash replacing perfectly good older hardware, so we have to
accept the concept that speed matters and ignore those platitudes that
say otherwise.

That's fair, but by using a high level language in the first place,
you've already made the conscious decision to sacrifice speed for ease
of programming. Otherwise, you would probably be programming in C.

The question really is "Is it fast enough", and the answer usually is
"Yes". And when the answer is "No", there are many things which can be
done before the need to multi-thread the whole script comes about.

It's a proposition that used to bother me, until I did some actual
programming of real world problems in Python. I've yet to really find
a case where the application was slow enough to justify the cost of
using multiple Python processes.

~G
 
Paul Rubin

Falcolas said:
That's fair, but by using a high level language in the first place,
you've already made the conscious decision to sacrifice speed for ease
of programming. Otherwise, you would probably be programming in C.

That Python is so much slower than C is yet another area where Python
can use improvement.

It's a proposition that used to bother me, until I did some actual
programming of real world problems in Python. I've yet to really find
a case where the application was slow enough to justify the cost of
using multiple Python processes.

Right, that's basically the issue here: the cost of using multiple
Python processes is unnecessarily high. If that cost were lower then
we could more easily use multiple cores to make our apps faster.
 
rushenaly

Hi again

I really want to imply that i am not in search of a perfect language.
Python for programming productivity is a great language but there are
some real world facts. Some people want a language that provides great
flexibility. A language can provide threads and processes and
programmer choose the way. I really believe that GIL is a design
error.

Thanks.

Rushen
 
Tim Wintle

That's fair, but by using a high level language in the first place,
you've already made the conscious decision to sacrifice speed for ease
of programming. Otherwise, you would probably be programming in C.

My parents would have gone mad at me for saying that when I was young -
C is just about the highest-level language they ever used - Assembly/hex
all the way!

So if you really want speed then why don't you write your code in
assembly? That's the only "perfect language" - it's capable of doing
everything in the most efficient way possible on your machine.

Of course that's a hassle, so I guess you may as well use C, since
that's almost certainly only linearly worse than using assembly, and it
takes far less time to use.

Oh, but then you may as well use python, since (again) that's probably
only linearly worse than C, and well-programmed C at that - I certainly
wouldn't end up with some of the optimisations that have gone into the
python interpreter!

That's basically what my mind goes through whenever I choose a language
to use for a task - and why I almost always end up with Python.

It's a proposition that used to bother me, until I did some actual
programming of real world problems in Python. I've yet to really find
a case where the application was slow enough to justify the cost of
using multiple Python processes.

I deal with those situations a fair bit - but the solutions are normally
easy - if it's limited by waiting for IO then I use threads, if it's
limited by waiting for CPU time then I use multiple processes, or share
the job over another application (such as MySQL), or run a task over a
cluster.
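
That rule of thumb can be sketched with the standard concurrent.futures module, which offers thread and process pools behind one interface. The helper run_tasks and its cpu_bound flag are assumptions made for this example, not an established API, and count_primes is just deliberately slow filler work.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def run_tasks(func, items, cpu_bound=False, max_workers=4):
    """Map func over items, using processes for CPU-bound work and threads otherwise."""
    executor_cls = ProcessPoolExecutor if cpu_bound else ThreadPoolExecutor
    with executor_cls(max_workers=max_workers) as executor:
        return list(executor.map(func, items))


def count_primes(limit):
    """Naive prime count below limit; stands in for a CPU-heavy task."""
    return sum(
        1
        for n in range(2, limit)
        if all(n % d for d in range(2, int(n ** 0.5) + 1))
    )


if __name__ == "__main__":
    # CPU-bound: spread across processes so each can use its own core.
    print(run_tasks(count_primes, [20000, 30000, 40000], cpu_bound=True))
```

With processes, func and its arguments must be picklable, which is why count_primes is defined at module level.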

If you have a task where the linear optimisation offered by multiple
cores is really important then you can either:
* Run it over multiple processes, or multiple machines in Python
or
* spend a year writing it in C or assembly, by which time you can buy a
new machine that will run it fine in Python.


Yes, we're coming to a point where we're going to have tens of cores in
a chip, but by that time someone far cleverer than me (possibly someone
who's on this list) will have solved that problem. The top experts in
many fields use Python, and if they weren't able to make use of multiple
core chips, then there wouldn't be any demand for them.

Tim Wintle
 
