pyvm -- faster python

  • Thread starter Stelios Xanthakis

Paul Rubin

Kay Schluehr said:
Delete the "standard" and You still obtain huge librarys for .Net, Java
and Python. I also regret that Prothon starved in infancy but it might
be exeggerated to demand that each language designer or one of his
apostels should manage a huge community that enjoys doing redundant
stuff like writing Tk-bindings, regexp-engines and all that.

Maybe there needs to be some kind of standard FFI (foreign function
interface) like Lisp implementations have.
Sooner or later Python will go the LISP way of having a standardized
"Common-Python" (std_objectspace) and a number of dialects and DSLs
running in their own derived object spaces. Maybe Python 3000 is an
illusion and will fade away like a Fata Morgana the closer we seem to come.

Right now Python feels about the way Maclisp must have felt in the 1970s
(that was before my time, so I can't know for certain). Lots of
hackerly excitement, lots of cruft. It needs to ascend to the next
level. PyPy looks like the best vehicle for that so far. See

http://catb.org/~esr/jargon/html/M/MFTL.html

for the canonical remark about languages that can't be used to
implement their own compilers. Python is fun and useful, but it
really isn't mature until PyPy is released for production use.
 

Paul Rubin

Roger Binns said:
Some examples are GUI toolkits (e.g. wxPython), SSL (e.g. M2Crypto, pyopenssl)
and databases (pysqlite, APSW). These aren't shipped with the Python
library but are widely used.

M2Crypto is a straightforward SWIG wrapper around OpenSSL, I thought.
I don't know about wxPython or pysqlite. It seems to me that some
kind of SQL client should be part of the stdlib. But why isn't a SWIG
wrapper enough? The CPython stdlib has, possibly for good reasons,
avoided SWIG, but a new implementation doesn't need to.
You don't have to stay backwards compatible. It is best to provide
some sort of way of using the old extensions even if it is suboptimal
(e.g. some sort of mapping shim).

Yeah, there's something to be said for that, if it doesn't cause too
much pain.
I already get burnt out on the matrix of CPython versions and different
platforms. Adding another interpreter would make life even harder
for extension authors.

There's already Jython and Python.net is on its way. This is
something Lisp users have dealt with for decades and they've developed
workable machinery for it. Maybe Python can adopt some of those
methods.
 

Stelios Xanthakis

Roger said:
That will rule out all the gui frameworks, SSL, cryptography
and numerous other packages. Have a look at what happened to
Prothon. What ultimately killed it was the problem of having
a decent library. You don't have to make the C library
compatibility totally high performance, but having any form of
it there will make adoption by others easier.

There are two kinds of C modules: those that have knowledge
of the C API (like sre, tkinter, etc.) and those that are just C/C++
libraries which are simply wrapped as modules. For the latter there
are two solutions besides adding a wrapper which makes pyvm appear
as libpython:
- an advanced ctypes module that makes dlopening libraries and
wrapping their symbols behind Python functions a matter of pure
Python code (see the sketch below). I'm considering this approach
to provide things like 'raw_input'.
- hacking SWIG. Shouldn't be too hard and would instantly give
us access to wx, qt, etc.
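
For the ctypes route, a minimal sketch of the idea, written against the
ctypes module as it exists for CPython (pyvm's own FFI would differ in
detail, and strlen here is just a stand-in for any library symbol):

    import ctypes
    import ctypes.util

    # dlopen the C runtime and put a Python face on one of its symbols:
    # no C compiler, no libpython-style glue, just Python code.
    libc = ctypes.CDLL(ctypes.util.find_library("c"))
    libc.strlen.argtypes = [ctypes.c_char_p]
    libc.strlen.restype = ctypes.c_size_t

    def strlen(s):
        # c_char_p expects bytes; encode if handed a text string
        if not isinstance(s, bytes):
            s = s.encode("utf-8")
        return libc.strlen(s)

    print(strlen("hello world"))   # -> 11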

The thing is that the C API of pyvm is, IMHO, superior and much more fun.
You can wrap the entire sockets module in a couple of hours and also
enjoy it. I wish I could clone myself to port the entire std library
to pyvm -- it's that much fun :)


thanks,

Stelios
 

Stelios Xanthakis

Paul said:
I hope that PyPy will replace CPython once it's solid enough. Trying
to stay backwards compatible with the legacy C API doesn't seem to me
to be that important a goal. Redoing the library may take more work
than the Prothon guy was willing to do for Prothon, but PyPy has more
community interest and maybe can attract more resources.

I didn't know much about PyPy. It seems that pyvm is *exactly* what
PyPy needs to boost its performance. Does PyPy have the VM in Python
as well? Does PyPy have a compiler that produces 2.4 bytecodes?

I think the next step in speeding up Python (not that it's slow; I'm
in the group of people who don't think Python is slow) is the AST
compiler. An AST compiler for CPython is in the works AFAIK, but it
would be more powerful if written in Python. It would also be easy to
have 'alternative' compilers that, for example, are not whitespace
sensitive, or where the code looks like Perl :), but which all produce
Python bytecode.
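
Much of the machinery for that already exists: CPython 2.x ships a
pure-Python 'compiler' package that parses source to an AST and emits
ordinary bytecode, so an alternative front end could in principle swap
out the parser and reuse the code generator. A tiny Python 2 illustration:

    import compiler

    # source -> AST, entirely in Python
    tree = compiler.parse("print 40 + 2")

    # source -> ordinary CPython bytecode, also entirely in Python
    code = compiler.compile("print 40 + 2", "<string>", "exec")
    exec code   # -> 42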


Stelios
 

Cameron Laird

I'd also be curious to know whether the performance gains would remain
once it gets fleshed out with things like closures, long numbers,
new-style classes and a C library compatibility shim.

Roger

And Unicode. And ...
 

Robert Kern

Stelios said:
There are two kinds of C modules: those that have knowledge
of the C API (like sre, tkinter, etc.) and those that are just C/C++
libraries which are simply wrapped as modules. For the latter there
are two solutions besides adding a wrapper which makes pyvm appear
as libpython:
- an advanced ctypes module that makes dlopening libraries and
wrapping their symbols behind Python functions a matter of pure
Python code. I'm considering this approach to provide things
like 'raw_input'.
- hacking SWIG. Shouldn't be too hard and would instantly give
us access to wx, qt, etc.

No, writing a pyvm module for SWIG won't give you compatibility with
most existing SWIG wrappers. Such wrappers are very, very rarely "pure
SWIG". Almost all nontrivial wrappers include ad hoc typemaps that use
the Python C API.

--
Robert Kern

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
 

Paul Rubin

Stelios Xanthakis said:
I didn't know much about PyPy. It seems that pyvm is *exactly* what
PyPy needs to boost its performance. Does PyPy have the VM in Python
as well? Does PyPy have a compiler that produces 2.4 bytecodes?

PyPy makes native machine code, not bytecode.
 

Mike Meyer

Stelios Xanthakis said:
- hacking SWIG. Shouldn't be too hard and will instantly give
us access to wx, qt, etc.

You can't assume that because some package is a C/C++ library wrapped
for Python that it uses SWIG. pyqt, for example, doesn't use SWIG at
all. It uses SIP, which is considerably more complicated than SWIG.

<mike
 

Paul Rubin

Greg Ewing said:
Which makes it clear that the remark is only intended to apply to
*compiled* languages.

Yes, there are several Python compilers already: Psyco, Jython followed
by your favorite JIT compiler, IronPython (similarly), Pyrex (not
quite the same input language, but I think it should be counted), etc.
It's true that CPython doesn't have a compiler and that's a serious
deficiency. A lot of Python language features don't play that well
with compilation, and they're often unnecessary. So I hope the baseline
implementation changes to a compiled one before the language evolves
too much more.
 

Kay Schluehr

Paul said:
Maybe there needs to be some kind of standard FFI (foreign function
interface) like Lisp implementations have.

Right now Python feels about the way Maclisp must have felt in the 1970s
(that was before my time, so I can't know for certain). Lots of
hackerly excitement, lots of cruft. It needs to ascend to the next
level. PyPy looks like the best vehicle for that so far. See

http://catb.org/~esr/jargon/html/M/MFTL.html

for the canonical remark about languages that can't be used to
implement their own compilers. Python is fun and useful, but it
really isn't mature until PyPy is released for production use.

Yes. What we are seeking, and this may be the meaning of Armin's
intentionally provocative statement about the speed of running HLLs, is a
successor to the C language, and not just another VM interpreter that is
written in C and limits all efforts to extend it in a flexible and
OO manner. Python is just the most promising dynamic OO language to
pursue this goal.

Ciao,
Kay
 

Bengt Richter

PyPy makes native machine code, not bytecode.
I thought they hoped to experiment with many targets, which would presumably
include generating code for different CPUs and VMs in various representations,
not necessarily just machine code via low-level C or ASM, but I haven't checked
the current status on that.

Regards,
Bengt Richter
 

Stelios Xanthakis

Kay said:
Yes. What we are seeking, and this may be the meaning of Armin's
intentionally provocative statement about the speed of running HLLs, is a
successor to the C language, and not just another VM interpreter that is
written in C and limits all efforts to extend it in a flexible and
OO manner. Python is just the most promising dynamic OO language to
pursue this goal.

A bytecode engine is the best method for dynamic code execution
("exec", eval, etc.). A low-level OOP language would be very suitable
for a Python VM.

pyvm has that. A big part of it is written in "lightweight C++" [1].
That makes it less portable, as the lwc preprocessor uses GNU C
extensions. However, these are the same extensions used by the Linux
kernel, and AFAIK the Intel compiler supports them too.

So probably the bigger "competitor" of pyvm is boost-python.
And that's one reason the release of the source is stalled until it
gets better.


Stelios

[1] http://students.ceid.upatras.gr/~sxanth/lwc/
 

Skip Montanaro

Mike> You can't assume that because some package is a C/C++ library
Mike> wrapped for Python that it uses SWIG. pyqt, for example, doesn't
Mike> use SWIG at all. It uses SIP, which is considerably more
Mike> complicated than SWIG.

PyGTK uses its own lisp-y thing as the wrapper IDL. Actually, it's specific
to GTK and AFAIK is used to wrap GTK for languages besides Python.

Skip
 

François Pinard

[Paul Rubin]
It's true that CPython doesn't have a compiler and that's
a serious deficiency.

Hi, Paul. I did not closely follow all of the thread, so maybe my
remark below only repeats what others might have said and I missed.

Deep down, why or how not having a [traditional, to-native-code]
compiler is a deficiency for CPython? We already know that such a beast
would not increase speed so significantly, while using much more memory.

It is true that a standard traditional compiler for CPython would allow
one to check the box:

[x] has a compiler

on the fashion-of-the-day language information sheet, and for some
readers, not having that box checked is a deficiency in itself. :)

So far, it seems that the only way to get speed is to attach static
type information to some variables. Some compilation avenues do it
through information added either in Python source code or in extraneous
declarative files; other approaches do it by delaying compilation
until such information is discovered at run-time. The former taints
the purity of real CPython as the only source. The latter often shows
spectacular speed gains, but not always, and may bloat size unboundedly.
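
For concreteness, the "delay until run time" route looks like this with
Psyco's documented interface: psyco.full() compiles functions as they
run, specializing on the argument types actually seen, with no type
declarations in the Python source.

    import psyco
    psyco.full()   # compile functions lazily, per observed types

    def dot(xs, ys):
        # a plain numeric loop, the kind of code Psyco speeds up most
        total = 0.0
        for x, y in zip(xs, ys):
            total += x * y
        return total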
 

Andrew Dalke

Paul said:
Yes, there are several Python compilers already ...
It's true that CPython doesn't have a compiler and that's a serious
deficiency. A lot of Python language features don't play that well
with compilation, and they're often unnecessary. So I hope the baseline
implementation changes to a compiled one before the language evolves
too much more.

Years ago, presented at one of the Python conferences, was a
program to generate C code from the bytecode. It would still
make calls to the Python run-time library (just as C does to
its run-time library).

The presenter did some optimizations, like not decrefing at the
end of one instruction when the next immediately increfs the same
object. The conclusion I recall was that it wasn't faster (at
best a few percent faster), and there was a big memory hit because
of all the duplicated code. One thought was that cache misses
caused some of the performance problems.
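
For a sense of what such a translator consumes, CPython will show you the
bytecode for any function (the exact opcodes vary by version):

    import dis

    def add(a, b):
        return a + b

    # prints roughly: LOAD_FAST a; LOAD_FAST b; BINARY_ADD; RETURN_VALUE
    dis.dis(add)

The translator's job is to emit a C function that performs that opcode
sequence directly, calling the same run-time routines the interpreter
loop would.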

Does that count as a compiler?

Andrew
 

Kay Schluehr

Stelios said:
A bytecode engine is the best method for dynamic code execution
("exec", eval, etc.). A low-level OOP language would be very suitable
for a Python VM.

Why is that? eval() consumes a string, produces a code object and executes
it. Whether the code object is bytecode or a chunk of machine code makes
a difference in the runtime but does not alter the high-level
behavioural description of eval(). Either way, the compile() function
behind eval is a JIT.
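
That contract is easy to see from Python itself; nothing below depends on
what representation the code object holds:

    # compile() turns the string into a code object; eval() runs it.
    # Whether the object contains bytecode or machine code is an
    # implementation detail of the VM.
    code = compile("x * 2", "<eval>", "eval")
    print(eval(code, {"x": 21}))   # -> 42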
pyvm has that. A big part of it is written in "lightweight C++" [1].
That makes it less portable, as the lwc preprocessor uses GNU C
extensions.

Hmmm... I'm not amazingly happy that it is not ANSI C++. To be honest, I
don't want to learn lw-C++ and the peculiarities of the translator, or
to debug through translated code. This may be an interesting study in
its own right, but it is just not me who is interested in it. Using an
experimental language to write an experimental VM is clearly outside my
motivation (meant only as a remark on the quest for contributions, not as
general criticism).
However, these are the same extensions used by the Linux
kernel, and AFAIK the Intel compiler supports them too.

So probably the bigger "competitor" of pyvm is boost-python.

How could boost-python be a "competitor"? Isn't it just an ANSI
C++ binding that relies heavily on templates, which are not even
supported by lw-C++?

Regards,
Kay
 

Paul Rubin

François Pinard said:
Deep down, why or how not having a [traditional, to-native-code]
compiler is a deficiency for CPython? We already know that such a beast
would not increase speed so significantly, while using much more memory.

I'd say the opposite. The 4x speedup from Psyco is quite significant.
The speedup would be even greater if the language itself were more
compiler-friendly.
So far, it seems that the only way to get speed is to attach static
type information to some variables.

No, of course not; there's quite a bit of overhead in interpretation in
general. Plus, having to do a dictionary lookup for 'bar' in every
call like foo.bar() adds overhead, and I don't think I'd call fixing
that similar to adding static type info.
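
The lookup cost is easy to observe without any compiler at all; this
little timing sketch (times will vary by machine and version) just hoists
the attribute lookup out of the loop:

    import time

    class Foo(object):
        def bar(self):
            return 1

    foo = Foo()
    N = 1000000

    t0 = time.time()
    for i in range(N):
        foo.bar()          # dictionary lookup of 'bar' on every call
    t1 = time.time()

    bar = foo.bar          # do the lookup once, keep a local
    t2 = time.time()
    for i in range(N):
        bar()
    t3 = time.time()

    print("lookup each call: %.3fs" % (t1 - t0))
    print("lookup hoisted:   %.3fs" % (t3 - t2))

A compiler can do the equivalent hoisting or caching automatically when
it can prove the attribute doesn't change, which is not the same thing
as demanding static type declarations.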
Some compilation avenues do it through information added either in
Python source code or in extraneous declarative files; other
approaches do it by delaying compilation until such information is
discovered at run-time.

Also, lots of times one can do type inference at compile time.
The former taints the purity of real CPython as the only source.

I don't understand this. What purity? Why is real CPython the
only source? There are dozens of C compilers and none of them is
the "only source". Why should Python be different?
 

Paul Rubin

Andrew Dalke said:
Years ago, presented at one of the Python conferences, was a program
to generate C code from the bytecode.... The conclusion I recall
was that it wasn't faster (at best a few percent faster), and there was a
big memory hit because of all the duplicated code. One thought was
that cache misses caused some of the performance problems. Does
that count as a compiler?

I would say it counts as a compiler, and other languages have
used a similar compilation approach and gotten much better speedups;
for example, Java JIT compilers. The DEC Scheme-to-C translator
and Kyoto Common Lisp also produced C output from their compilers
and got really significant speedups. Part of the problem may be
with the actual Python language. One of the reasons for wanting
real compilation as a high priority is that the presence of a
compiler will affect the future evolution of the language, in a way
that makes it more conducive to compilation.

Despite the shrieks of the "Python is not Lisp!" crowd, Python
semantics and Lisp semantics aren't THAT different, and yet compiled
Lisp implementations completely beat the pants off of interpreted
Python in terms of performance. I don't think Python can ever beat
carefully coded C for running speed, but it can and should aim for
parity with compiled Lisp.
 

david.tolpin

I don't think Python can ever beat
carefully coded C for running speed, but it can and should aim for
parity with compiled Lisp.

But Common Lisp compilers often beat C compilers in speed for similar
tasks of moderate complexity. In particular, CMUCL beats GCC in numerical
computations.

David
 
