JIT compilers for Python, what is the latest news?

John Ladasky

I'm revisiting a project that I haven't touched in over a year. It was written in Python 2.6, and executed on 32-bit Ubuntu 10.10. I experienced a 20% performance increase when I used Psyco, because I had a computationally-intensive routine which occupied most of my CPU cycles and always received the same data type. (Multiprocessing also helped, and I was using that too.)

I have now migrated to 64-bit Ubuntu 12.10.1 and Python 3.3. I would rather not revert to my older configuration. That being said, it would appear from my initial reading that 1) Psyco is considered obsolete and is no longer maintained, 2) Psyco is being superseded by PyPy, and 3) PyPy doesn't support Python 3.x or 64-bit optimizations.

Do I understand all that correctly?

I guess I can live with the 20% slower execution, but sometimes my code would run for three solid days...
 
MRAB

I'm revisiting a project that I haven't touched in over a year. It
was written in Python 2.6, and executed on 32-bit Ubuntu 10.10. I
experienced a 20% performance increase when I used Psyco, because I
had a computationally-intensive routine which occupied most of my CPU
cycles, and always received the same data type. (Multiprocessing
also helped, and I was using that too.)

I have now migrated to a 64-bit Ubuntu 12.10.1, and Python 3.3. I
would rather not revert to my older configuration. That being said,
it would appear from my initial reading that 1) Psyco is considered
obsolete and is no longer maintained, 2) Psyco is being superseded by
PyPy, 3) PyPy doesn't support Python 3.x, or 64-bit optimizations.

Do I understand all that correctly?

I guess I can live with the 20% slower execution, but sometimes my
code would run for three solid days...
Have you looked at Cython? Not quite the same, but still...
 
Chris Angelico

I'm revisiting a project that I haven't touched in over a year. It was written in Python 2.6, and executed on 32-bit Ubuntu 10.10. I experienced a 20% performance increase when I used Psyco, because I had a computationally-intensive routine which occupied most of my CPU cycles, and always received the same data type. (Multiprocessing also helped, and I was using that too.)

I guess I can live with the 20% slower execution, but sometimes my code would run for three solid days...

Two things to try, in order:

1) Can you optimize your algorithms? Three days of processing is... a LOT.

2) Rewrite some key portions in C, possibly using Cython (as MRAB suggested).

You may well find that you don't actually need to make any
language-level changes. If there's some critical mathematical function
that already exists in C, making use of it might make all the
difference you need.
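
To illustrate the point, here is a small, hypothetical sketch of the difference between an inner loop that runs in the interpreter and the same reduction done by a function that already exists in C (`math.fsum` here stands in for whatever critical function applies):

```python
# Hypothetical comparison: the same reduction via a pure-Python loop
# versus math.fsum, which is implemented in C.
import math
import timeit

data = [0.1] * 100_000

def py_sum(xs):
    # Pure-Python loop: every addition goes through the interpreter.
    total = 0.0
    for x in xs:
        total += x
    return total

def c_sum(xs):
    # math.fsum is a C function (and also tracks rounding error).
    return math.fsum(xs)

loop_t = timeit.timeit(lambda: py_sum(data), number=20)
c_t = timeit.timeit(lambda: c_sum(data), number=20)
print(f"pure Python: {loop_t:.3f}s  C-backed: {c_t:.3f}s")
```

The exact speedup depends on the workload, but the shape of the win is the same: push the loop below the interpreter.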

ChrisA
 
John Ladasky

Have you looked at Cython? Not quite the same, but still...

I'm already using Numpy, compiled with what is supposed to be a fast LAPACK. I don't think I want to attempt to improve on all the work that has gone into Numpy.
 
John Ladasky

1) Can you optimize your algorithms? Three days of processing is... a LOT.

Neural network training. Yes, it takes a long time. Still, it's not the most tedious code I run. I also do molecular-dynamics simulations with GROMACS, those runs can take over a week!
2) Rewrite some key portions in C, possibly using Cython (as MRAB suggested).

And as I replied to MRAB, my limiting code is within Numpy. I've taken care to look for ways that I might have been using Numpy itself inefficiently (and I did find a problem once: fixing it tripled my execution speed). But I would like to think that Numpy itself, since it is already a C extension, should be optimal.
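
The kind of misuse that can triple (or worse) a run time usually looks something like this hypothetical sketch, where the sigmoid is just a stand-in for whatever the inefficient call pattern was: iterating over array elements in Python instead of letting NumPy apply the operation to the whole array at once.

```python
# Hypothetical sketch of a common NumPy inefficiency: an element-by-element
# Python loop versus one vectorized expression over the whole array.
import numpy as np

x = np.linspace(0.0, 1.0, 20_000)

def slow_sigmoid(arr):
    # Each iteration crosses the C/Python boundary once per element.
    out = np.empty_like(arr)
    for i in range(arr.size):
        out[i] = 1.0 / (1.0 + np.exp(-arr[i]))
    return out

def fast_sigmoid(arr):
    # One vectorized expression: the loop runs inside NumPy's C code.
    return 1.0 / (1.0 + np.exp(-arr))

assert np.allclose(slow_sigmoid(x), fast_sigmoid(x))
```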
 
Chris Angelico

Neural network training. Yes, it takes a long time. Still, it's not the most tedious code I run. I also do molecular-dynamics simulations with GROMACS, those runs can take over a week!


And as I replied to MRAB, my limiting code is within Numpy. I've taken care to look for ways that I might have been using Numpy itself inefficiently (and I did find a problem once: fixing it tripled my execution speed). But I would like to think that Numpy itself, since it is already a C extension, should be optimal.

Ahh, yeah, that's gonna take a while. Your minimum processing time is
likely to remain fairly high. There won't be any stupidly easy
improvements to make (like one of my favorite examples from
databasing: an overnight job became a three-second run, just by making
proper use of a Btrieve file's index).

ChrisA
 
Robert Kern

Neural network training. Yes, it takes a long time. Still, it's not the most tedious code I run. I also do molecular-dynamics simulations with GROMACS, those runs can take over a week!


And as I replied to MRAB, my limiting code is within Numpy. I've taken care to look for ways that I might have been using Numpy itself inefficiently (and I did find a problem once: fixing it tripled my execution speed). But I would like to think that Numpy itself, since it is already a C extension, should be optimal.

Well, Psyco obviously wasn't optimizing numpy. I believe the suggestion is to
identify the key parts of the code that Psyco was optimizing to get you the 20%
performance increase and port those to Cython.
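
One way to identify those parts is to profile a run and sort by cumulative time; whatever pure-Python functions dominate are the Cython candidates. A minimal sketch (the function names are hypothetical placeholders):

```python
# Profile a run with cProfile and report the top entries by cumulative
# time. Whatever pure-Python code dominates here is what Psyco was
# likely optimizing, and what's worth porting to Cython.
import cProfile
import pstats
import io

def inner_update(w):
    # Stand-in for a pure-Python inner loop worth porting.
    return [v * 0.99 for v in w]

def train(steps=200):
    w = [1.0] * 500
    for _ in range(steps):
        w = inner_update(w)
    return w

prof = cProfile.Profile()
prof.enable()
train()
prof.disable()

buf = io.StringIO()
pstats.Stats(prof, stream=buf).sort_stats("cumulative").print_stats(5)
report = buf.getvalue()
print(report)
```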

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
 
Ian Foote

I'm revisiting a project that I haven't touched in over a year. It was written in Python 2.6, and executed on 32-bit Ubuntu 10.10. I experienced a 20% performance increase when I used Psyco, because I had a computationally-intensive routine which occupied most of my CPU cycles, and always received the same data type. (Multiprocessing also helped, and I was using that too.)

I have now migrated to a 64-bit Ubuntu 12.10.1, and Python 3.3. I would rather not revert to my older configuration. That being said, it would appear from my initial reading that 1) Psyco is considered obsolete and is no longer maintained, 2) Psyco is being superseded by PyPy, 3) PyPy doesn't support Python 3.x, or 64-bit optimizations.

Do I understand all that correctly?

I guess I can live with the 20% slower execution, but sometimes my code would run for three solid days...

PyPy is working on porting to Python 3. They are accepting donations:
http://pypy.org/py3donate.html

Regards,
Ian F
 
Ian Kelly

And as I replied to MRAB, my limiting code is within Numpy. I've taken care to look for ways that I might have been using Numpy itself inefficiently (and I did find a problem once: fixing it tripled my execution speed). But I would like to think that Numpy itself, since it is already a C extension, should be optimal.

That doesn't seem to follow from your original post. Because Numpy is
a C extension, its performance would not be improved by psyco at all.
The 20% performance increase that you reported must have been a result
of the JIT compiling of some Python code, and if you can identify that
and rewrite it in C, then you may be able to see the same sort of
boost you had from psyco.
 
John Ladasky

That doesn't seem to follow from your original post. Because Numpy is
a C extension, its performance would not be improved by psyco at all.

What about the fact that Numpy accommodates Python's dynamic typing? You can pass arrays of integers, floats, bytes, or even PyObjects. I don't know exactly how all that is implemented.

In my case, I was always passing floats. So what I assumed that psyco was doing for me was compiling a neural network class that always expected floats.
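
For reference, NumPy records the element type in the array's dtype; a float64 array takes the C fast path, while an object-dtype array holds boxed PyObjects and falls back to Python-level calls per element. Pinning the dtype up front is how "always passing floats" is expressed:

```python
# How NumPy represents the types mentioned above. Only the object dtype
# involves boxed PyObjects; the numeric dtypes stay in C.
import numpy as np

floats = np.array([1.0, 2.0, 3.0])               # dtype float64, C fast path
ints = np.array([1, 2, 3])                       # integer dtype (platform-dependent width)
objs = np.array([1.0, 2.0, 3.0], dtype=object)   # boxed PyObjects, slow path

assert floats.dtype == np.float64
assert objs.dtype == object

# Forcing a known dtype up front avoids per-call type surprises:
coerced = np.asarray([1, 2, 3], dtype=np.float64)
assert coerced.dtype == np.float64
```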
 
Ian Kelly

What about the fact that Numpy accommodates Python's dynamic typing? You can pass arrays of integers, floats, bytes, or even PyObjects. I don't know exactly how all that is implemented.

I don't know exactly either, but psyco JIT compiles Python, not C. In
the PyObject case you might see some benefit if numpy ends up calling
back into methods that are implemented in Python.
In my case, I was always passing floats. So what I assumed that psyco was doing for me was compiling a neural network class that always expected floats.

Right, so if you take that routine and rewrite it as a C function that
expects floats and handles them internally as such, I would think that
you might see a similar improvement.
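
The glue for calling such a C function from Python can be as small as a ctypes declaration. A minimal sketch, assuming a Unix-like system where libm can be found (on Windows the math routines live elsewhere):

```python
# Calling a C function that takes and returns C doubles, via ctypes.
# libm's sqrt stands in for a custom C routine compiled for floats.
import ctypes
import ctypes.util

libm = ctypes.CDLL(ctypes.util.find_library("m"))
libm.sqrt.argtypes = [ctypes.c_double]   # declare the C signature
libm.sqrt.restype = ctypes.c_double      # so ctypes converts correctly

assert libm.sqrt(2.0) == 2.0 ** 0.5
```

The same pattern applies to a hand-written C routine compiled into a shared library: declare `argtypes`/`restype` once, then the per-call overhead is just the boundary crossing.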
 
Joshua Landau

I'm revisiting a project that I haven't touched in over a year. It was
written in Python 2.6, and executed on 32-bit Ubuntu 10.10. I experienced
a 20% performance increase when I used Psyco, because I had a
computationally-intensive routine which occupied most of my CPU cycles, and
always received the same data type. (Multiprocessing also helped, and I
was using that too.)

I have now migrated to a 64-bit Ubuntu 12.10.1, and Python 3.3. I would
rather not revert to my older configuration. That being said, it would
appear from my initial reading that 1) Psyco is considered obsolete and is
no longer maintained, 2) Psyco is being superseded by PyPy, 3) PyPy doesn't
support Python 3.x, or 64-bit optimizations.

Do I understand all that correctly?

I guess I can live with the 20% slower execution, but sometimes my code
would run for three solid days...

If you're not willing to go far, I've heard really, really good things
about Numba. I've not used it, but seriously:
http://jakevdp.github.io/blog/2012/08/24/numba-vs-cython/.

Also, PyPy is fine for 64 bit, even if it doesn't gain much from it. So
going back to 2.7 might give you that 20% back for almost free. It depends
how complex the code is, though.
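
For context, the Numba usage pattern looks roughly like the sketch below (hedged: if numba isn't installed, a no-op stand-in decorator is substituted so the function still runs as plain Python):

```python
# Sketch of the Numba decorator pattern. The tight float loop is exactly
# the shape of code a JIT specializes well.
import numpy as np

try:
    from numba import jit
except ImportError:
    def jit(**kwargs):
        # No-op stand-in so the sketch runs without numba installed.
        def wrap(fn):
            return fn
        return wrap

@jit(nopython=True)
def dot(a, b):
    total = 0.0
    for i in range(a.shape[0]):
        total += a[i] * b[i]
    return total

print(dot(np.array([1.0, 2.0]), np.array([3.0, 4.0])))  # 11.0
```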
 
rusi

I guess I can live with the 20% slower execution, but sometimes my code would run for three solid days...


Oooff! Do you know where your goal-posts are?
I.e., if your code were redone in (top-class) C or Fortran, would it go
from 3 days to 2 days, or to 2 hours?
[The 'top-class' qualification is needed because it could also go from
3 days to 5!]
 
Stefan Behnel

Joshua Landau, 06.04.2013 12:27:
If you're not willing to go far, I've heard really, really good things
about Numba. I've not used it, but seriously:
http://jakevdp.github.io/blog/2012/08/24/numba-vs-cython/.

Also, PyPy is fine for 64 bit, even if it doesn't gain much from it. So
going back to 2.7 might give you that 20% back for almost free. It depends
how complex the code is, though.

I would guess that the main problem is rather that PyPy doesn't support
NumPy (it has its own array implementation, but that's about it). John
already mentioned that most of the heavy lifting in his code is done by NumPy.

Stefan
 
