Python compilers?


Yermat

simo wrote:

[...]
I thought Pyrex was a hybrid of C and Python (like Jython/Java), not
actually a Python-to-C converter? And Psyco is just a different VM,
isn't it?

Pyrex is Python plus access to C structures and type declarations,
but the current implementation creates intermediate C files.

Psyco is not a different VM; it's like Java's JIT. It's a
just-in-time compiler, i.e. it runs on top of the CPython VM.
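
For example, Psyco's whole API is tiny. A minimal sketch of typical
use (the fitness function here is invented for illustration):

import psyco

def fitness(chromosome):
    # some pure-Python hot spot; any numeric loop benefits
    total = 0.0
    for gene in chromosome:
        total += gene * gene
    return total

psyco.bind(fitness)    # JIT-specialize just this function,
                       # or psyco.full() to compile whatever it can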
 

Peter Hansen

Tor said:
No, because at some point you will stop writing 3's, either out of
boredom, exhaustion or because you need to pee. At that instant, you
introduce a rounding error, making 3 * 1/3 = 0.99999999999... instead
of 1.0

Actually, Grant probably meant the "..." (which is an ellipsis,
meaning it substitutes for something else that is left out) to
represent "3 repeating to infinity", the same as putting a dot
over the last 3, or a bar over the last three 3s, or whatever
other convention you might have seen. Of course, it's just a
convention, so perhaps someone else would think it meant "3s
repeating to the limit of the computer's precision" or something
like that...

-Peter
 

Peter Hansen

Yes, fast execution. I have been using C. In my applications there is
a population of "chromosomes" which are arrays of floats, about 2 to 5 k
in length. Then there are subroutines which operate on a chromosome
using pointers. For example, the "crossover" routine uses two pointers
to swap portions of two chromosomes. My software sometimes runs for
hours, performing many millions of operations like these. Clearly,
speed of execution is of dramatic importance.

The bottleneck is almost certainly in evaluating the fitness
function, not in performing the mutations and cross-overs.
What does your fitness function do with all those floats?
Perhaps it can be handled much faster with one of the numeric
extensions for Python... or with Pyrex.
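
(As a rough sketch of what a numeric extension buys here: with array
objects, the pointer-based crossover becomes a pair of slice copies
that run in C. numpy stands in below for the Numeric/numarray packages
of the day, and the names are invented.)

import numpy as np

def crossover(c1, c2, point):
    # swap the tails of two chromosome arrays past `point`, in place
    tail = c1[point:].copy()
    c1[point:] = c2[point:]
    c2[point:] = tail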

-Peter
 

Carl Banks

[[ Note for Peter Hansen: all uses of the word "compiler" below are
understood to refer to an optimizing compiler to machine language, and
I mean a real, not virtual, machine. ]]


Heiko said:
On Tuesday, 18 May 2004 13:41, Jacek Generowicz wrote:
Native compilers for other languages just as dynamic as Python
exist. These compilers manage to achieve very significant speed
increases[*].

You are again referring to LISP as an example of a dynamic language
which, when compiled, gives huge speed increases. This is true in
some respects, in others it isn't. LISP has the advantage that type
inference may be used throughout the program to create one version of
each function, which can then be compiled.

Can you give an example of a function Lisp is able to compile that
Python manifestly couldn't? I don't buy it.


[snip]
In Python this isn't true. Python, unlike LISP, is "completely"
dynamic, meaning that it's practically impossible to do type
inference for each function that is called (even checking types isn't
possible).

I don't follow you. In what way is Python dynamic that Lisp isn't?
And Python certainly can check types.

E.g. how do you expect type inference to work with the pickle
module? string -> something/Error would be the best description of
what pickle does. For the function which calls pickle, do you want to
create versions for each possible output of pickle? Which outputs of
pickle are possible?

You can write a pickling package in Lisp. I think the Lisp compiler
would handle such a package fine, and I see no reason why a
hypothetical Python compiler wouldn't.
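
(For concreteness, a minimal illustration of that "string -> something"
signature:)

import pickle

# loads() maps a byte string to an object whose type is unknown until
# run time - precisely the property that defeats static type inference
obj1 = pickle.loads(pickle.dumps(42))           # an int
obj2 = pickle.loads(pickle.dumps([1.0, 2.0]))   # a list
obj3 = pickle.loads(pickle.dumps({'a': 1}))     # a dict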

(depends on the loaded modules, which can be loaded at runtime)
There is no (sane) way to create machine-code which calls into the
appropriate (low-level) Python-runtime functions (such as Py_List*,
Py_Dict*, etc.) for such a method, at least not at compile-time.

That might be true, but pickling is only one module. The fact that
I'm able to write a pickling package in Lisp doesn't make it
impossible to write a Lisp compiler, and the fact that a pickling module
exists in Python doesn't make it impossible to write a Python
compiler.

What would a Lisp compiler do faced with a Lisp pickling package?

At runtime, this is possible. See what psyco does. There's a nice
presentation on the psyco website which explains what it does, and I
guess when you see it you'll understand why writing a "compiler" for
Python is pretty much impossible.

Well, I don't buy it, and I don't see any fundamental difference
between Python's and Lisp's dynamism. The only thing you've
demonstrated is that there is some code in Python that could make
optimizing difficult--a statement which is true of Lisp also--but
it's a specific case that is not generally applicable.
 

SeeBelow

Peter Hansen wrote:
The bottleneck is almost certainly in evaluating the fitness
function, not in performing the mutations and cross-overs.
What does your fitness function do with all those floats?
Perhaps it can be handled much faster with one of the numeric
extensions for Python... or with Pyrex.

You are right, the worst bottleneck is the fitness function. Every
population member is an ANN (artificial neural network) and the ANNout()
function must be called for each fitness evaluation. You can see there
is a lot of looping. This C code runs very quickly:
-----------------------------------------------------------------------
/* neursubs.c - for computing output of artificial neural network -
   for EvSail-2.2
   M. Timin, August, 2003

   piecewise parabolic approximator replaced conventional sigmoid,
   October, 2003 */

#include <math.h>
#include <stdlib.h>

#define NEUR_MAX 80                   /* maximum number of neurons in a layer */
#define LOOP(i,N) for(i=0; i<N; i++)  /* be careful using this! */

/* This 4-piece curve is a good sigmoid approximator. */
float sigAprox(register float x) {
    register float z;

    if(x <= -4.0)
        return 0.0;
    else if(x <= 0.0) {
        z = x + 4.0;
        return z*z/32;
    }
    else if(x < 4.0) {
        z = x - 4.0;
        return 1.0 - z*z/32;
    }
    else
        return 1.0;
}

/* oneNeuron() uses a vector of inputs and a vector of weights, and the
   sigmoid activity function, to compute the output of one neuron. It is
   assumed that an extra input of 1.0 is at the beginning of the input
   vector, and that there is a corresponding value at the beginning of
   the weight vector. This is actually the bias. So upon entering this
   function, wptr points to the bias and inptr points to 1.0. The
   inputCount should include the bias, so it should be one more than the
   number of inputs. */
float oneNeuron(float *inptr, float *wptr, int inputCount) {
    int i;
    float sum = 0.0;

    LOOP(i, inputCount) {     /* summation loop */
        sum += *inptr++ * *wptr++;
    }
    return sigAprox(sum);     /* this is the sigmoid formula */
}

/* This is the routine which calculates the outputs of the ANN. Before
   calling it the input values must be in the array pointed to by inptr.
   Values of the outputs will be placed in the array pointed to by
   outValues. */
void ANNout(int numIn,        /* number of inputs to the ANN */
            int numHid,       /* number of neurons that receive the inputs */
            int numOut,       /* number of final output neurons */
            float *inptr,     /* pointer to the array of input values */
            float *wptr,      /* pointer to array of weights & biases
                                 in a specific order */
            float *outValues) /* pointer to where to write the output */
{
    float t1[NEUR_MAX];       /* NEUR_MAX defined above */
    float t2[NEUR_MAX];
    int i;

    /* prepare the input array: */
    t1[0] = 1.0;
    LOOP(i, numIn)
        t1[i+1] = *inptr++;
    /* compute and store intermediate outputs: */
    t2[0] = 1.0;
    LOOP(i, numHid) {
        t2[i+1] = oneNeuron(t1, wptr, numIn+1);
        wptr += numIn+1;
    }
    /* do similar for final layer, writing to destination */
    LOOP(i, numOut) {
        outValues[i] = oneNeuron(t2, wptr, numHid+1);
        wptr += numHid+1;
    }
}
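
(For comparison, a hedged sketch of the same forward pass written
against a vectorized array package, as Peter suggested; numpy stands in
for the Numeric/numarray extensions of the time, and the names mirror
the C above but are otherwise invented.)

import numpy as np

def sig_aprox(x):
    # the same piecewise-parabolic sigmoid approximator, vectorized
    z = np.clip(x, -4.0, 4.0)
    return np.where(z <= 0.0, (z + 4.0)**2 / 32.0,
                    1.0 - (z - 4.0)**2 / 32.0)

def ann_out(in_values, w_hid, w_out):
    # forward pass; each weight matrix carries its bias in column 0
    t1 = np.concatenate(([1.0], in_values))
    t2 = np.concatenate(([1.0], sig_aprox(np.dot(w_hid, t1))))
    return sig_aprox(np.dot(w_out, t2))

The inner loops move into the array package's C code, which is usually
where most of the win from such extensions comes from.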
 

Greg Ewing

Yes, fast execution. I have been using C. In my applications there is
a population of "chromosomes" which are arrays of floats, about 2 to 5 k
in length. Then there are subroutines which operate on a chromosome
using pointers. For example, the "crossover" routine uses two pointers
to swap portions of two chromosomes.

Take a look at Pyrex. It may be just what you need:

http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/
 

Michael Hudson

Svein Ove Aas said:
What you're claiming, though, is that it's possible to write Python code
that can't easily be translated to equivalent Lisp code. Can you give an
example?

class C:
    def __eq__(self, other):
        return True

for example. For better or worse, Python *is* more dynamic than
Common Lisp, and this *does* contribute to making it harder to make
Python go fast.

I wrote a rant about this subject:

http://starship.python.net/crew/mwh/hacks/speeding-python.html

Cheers,
mwh
 

Harry George

Svein Ove Aas said:
Only for infinite counts of '9', and computers don't do infinity.

We need to stop randomly mixing exact math with computer
implementations (and approximations) thereof.

"<digit><digit>..." is the accepted ASCII rendition of the repeating
overbar, and thus explicitly means "on to infinity". 0.99... exactly
equals 1.

If you want to shift the discussion to computer implementations then
that is a different story. We can talk binary representations of
decimal fractions, and the value of rationals as a useful
representation.

But why not also complain that Python does not have a complete
representation of pi? After all, the value of pi is known to well
beyond the IEEE 80-bit or 64-bit or whatever precision an
implementation provides. Even if we did multi-precision math and
really long pi representations, they would of course not be exact.
"e" isn't handled completely either. Why not complain about those?

We don't complain because the available values are "good enough".
IEEE 754 64-bit reals (internally handled as 80-bit extended
precision) are good enough for most numerical needs. I'll admit, the
US debt needs some extended precision :-( but most numerical analysis
gets by with epsilons under 1e-14.

Hey, while we are on the subject of exact representation, what's with
multithreading? My computer has only one CPU. What's going on
here???? It's a lie, a LIE I tell you...
 

Carl Banks

Paul Rubin said:
...     def bar(self, x):
...         return x*x
...


Well, come on, of course there's going to be some things here and
there you can do in one and not the other. In what way is Python
dynamic that Lisp isn't, to such an extent that it would cripple any
attempts to compile it?
 

Heather Coppersmith

On 19 May 2004 14:54:48 -0700, Carl Banks wrote:
Well, come on, of course there's going to be some things here
and there you can do in one and not the other. In what way is
Python dynamic that Lisp isn't, to such an extent that it would
cripple any attempts to compile it?

It's not a matter of not being able to compile python; it's a
matter of what sort of benefits you'd gain.

For example:

if x.y.z == a.b.c:
    print 'equal'

What is x.y.z? Who knows? The object to which x is bound might
create y on the fly, based on information not available to the
compiler (see __getattr__ and properties). Once the run-time
system asks object x for attribute y, it (the run-time) has to go
through the whole process again to determine z (and, therefore,
x.y.z). Similar for a.b.c, and any of that code might have
redefined what it means for such objects to be equal, which means
that the compiler can't even know what sorts of equality tests
might be available at the time that the code executes, let alone
generate a simple compare instruction.
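
(A minimal sketch of such on-the-fly attribute creation; the class and
values are invented:)

class Node:
    def __getattr__(self, name):
        # called only when normal lookup fails; the "attribute" is
        # manufactured at run time, invisible to any compiler
        if name in ('y', 'b'):
            return Node()
        return 42

x = a = Node()
print x.y.z == a.b.c    # True, but only knowable by running it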

This is important: There is little, if any, difference between
that code and running everything through the interpreter anyway.

For that matter, simply accessing x.y might change a.b. (FWIW,
though, the ensuing programmer-cide would be entirely justified.)

"Plain" Common Lisp code has (mostly) the same issues, but
performance critical Common Lisp programs contain strategically
placed type declarations (thus reducing the dynamicity of the
language) to help the compiler to know what it's up against.

There are proposals to add type declarations to Python; google is
your friend (see also 'decorators'). Similar for JIT compilers
(psyco falls into this category).

Regards,
Heather
 

Paul Rubin

Well, come on, of course there's going to be some things here and
there you can do in one and not the other. In what way is Python
dynamic that Lisp isn't, to such an extent that it would cripple any
attempts to compile it?

The example above kills any attempt to turn a.bar() into a static
procedure call. There's more like it, e.g. the existence of the
locals() dictionary and the ability to modify it. However, it should
be possible to define a reasonable subset of Python that can compile
into good code. The stuff that makes compilation difficult makes the
code unmaintainable too.

I do think that Python's designers should wait till PyPy with
native-code backends has been deployed for a while before defining too
much of Python 3.0, so we can first gain some experience with compiled
Python. Python should evolve towards being compiled most of the time.
 

Carl Banks

Paul said:
The example above kills any attempt to turn a.bar() into a static
procedure call.

Of course it does--but it's one method. A compiler, if it's good,
would only make the optimization on methods named "bar", and it could
probably pare the number of possible classes it could apply to down
to only a few.

I mean you could have a Turing nightmare on your hands, with all kinds
of crazy setattrs and execs and stuff, in both Python and Lisp, and
then there's not much a compiler could do but emit totally general
code. I assume Lisp compilers do this sometimes.

There's more like it, e.g. the existence of the
locals() dictionary and the ability to modify it.

New feature? I didn't think modifying the dict returned by locals
affected the variables.

However, it should
be possible to define a reasonable subset of Python that can compile
into good code. The stuff that makes compilation difficult makes the
code unmaintainable too.

True, but even if you didn't do that, I think a compiler could do a
decent job with reasonable code.

I do think that Python's designers should wait till PyPy with
native-code backends has been deployed for a while before defining too
much of Python 3.0, so we can first gain some experience with compiled
Python. Python should evolve towards being compiled most of the time.

Agreed
 

Paul Rubin

Carl Banks said:
Of course it does--but it's one method. A compiler, if it's good,
would only make the optimization on methods named "bar", and it could
probably pare the number of possible classes it could apply to down
to only a few.

How could it possibly know? The reassignment of a.bar could happen
anytime, anywhere in the code. Maybe even in an eval.
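
(Concretely, the sort of run-time rebinding meant here; the names are
invented:)

class Foo:
    def bar(self, x):
        return x * x

a = Foo()
print a.bar(3)          # 9

a.bar = lambda x: -x    # rebound on the instance, at run time
print a.bar(3)          # -3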
I mean you could have a Turing nightmare on your hands, with all kinds
of crazy setattrs and execs and stuff, in both Python and Lisp, and
then there's not much a compiler could do but emit totally general
code. I assume Lisp compilers do this sometimes.

Lisp compilers might have to do that sometimes, but Python compilers
would have to do it ALL the time. Psyco took one way out, basically
generating code at runtime and caching it for specific operand types,
but the result is considerable code expansion compared to precompilation.
 

Michael Hudson

Carl Banks said:
New feature? I didn't think modifying the dict returned by locals
affected the variables.

Evidence of crime :)

Python 2.3.2
>>> print x
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
NameError: name 'x' is not defined
>>> locals()['x'] = 1
>>> print x
1


Not to mention stack frame magic available via inspect.*...

Well, fortunately *this* level of gimmickery is already somewhat
forbidden...

Cheers,
mwh

--
> so python will fork if activestate starts polluting it?
I find it more relevant to speculate on whether Python would fork
if the merpeople start invading our cities riding on the backs of
giant king crabs. -- Brian Quinlan, comp.lang.python
 

Duncan Booth

(e-mail address removed) (Konstantin Veretennicov) wrote:

New feature? I didn't think modifying the dict returned by locals
affected the variables.

Evidence of crime :)

Python 2.3.2
>>> print x
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
NameError: name 'x' is not defined
>>> locals()['x'] = 1
>>> print x
1

That works because you are using locals() to access your global variables.
Put the same code in a function and it behaves differently:
>>> def f():
...     x = 0
...     locals()['x'] = 1
...     print x
...
>>> f()
0

You cannot depend on the behaviour of modifying locals() remaining
unchanged over different releases of Python. Bottom line: don't do this.
 

Ville Vainio

Paul> I do think that Python's designers should wait til PyPy with
Paul> native-code backends has been deployed for a while before
Paul> defining too much of Python 3.0, so we can first gain some
Paul> experience with compiled Python. Python should evolve
Paul> towards being compiled most of the time.

I think we might be able to get significant benefits from the .NET
implementation (provided that Mono proves to be legally feasible in
the future) - a lot of the JIT work is done by "other people", and
being able to seamlessly combine Python with a statically typed
language (which probably produces faster code) would give Python a
sweet role in many programming projects - the role of the initial
implementation/specification language, with grunts doing any
mechanical porting of performance-critical areas.

We could have such a role already, but only in theory. C/C++
integration is not seamless enough, and the languages themselves are
considered to be too difficult, clumsy and unproductive to attract
companies.

One thing is pretty certain to benefit any future speedup effort,
whatever road is chosen: optional type declarations. Their
implementation could be left unspecified, and the initial Python
implementation could use them only for runtime type checking in debug
mode (or ignore them altogether). Type inference could also use them
to smooth things out.
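
(A minimal sketch of the cheapest form of that idea, using plain
assertions as stand-in "declarations": they act as debug-mode type
checks and vanish under python -O. The function is invented.)

def dot(xs, ys):
    # stand-in type "declarations", checked only in debug runs
    assert isinstance(xs, list) and isinstance(ys, list)
    s = 0.0
    for a, b in zip(xs, ys):
        s = s + a * b
    return s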
 
