Towards faster Python implementations - theory

Discussion in 'Python' started by John Nagle, May 8, 2007.

  1. John Nagle

    John Nagle Guest

    Some faster Python implementations are under development.
    JPython has been around for a while, and PyPy and ShedSkin
    continue to move forward. It's worth thinking about what slows
    down Python implementations.

    It isn't the dynamism, really. As others have pointed out
    in the Python literature, most of the time, the more elaborate
    dynamic features aren't being used for any given variable or
    object. If code has something like "x = 1.0", the odds are that
    "x" is going to stay a floating point number, and not suddenly turn
    into a list or an object reference. The type inference of Shed Skin
    builds on that assumption, adding some restrictions about changing of
    variable types.

    The Shed Skin effort indicates that explicit typing, via 'decorators'
    or otherwise, isn't really necessary. What's necessary is the avoidance
    of "surprises" in the language. In this context, a "surprise" is
    the use of a dynamic feature in a way that can't be seen at compile time.

    A typical "surprise" would be the use of "setattr" on an object from
    outside the compilation unit that defines the object. Within a module,
    "setattr" on an object in that module is no problem; the compiler can see
    it and generate the extra machinery needed to make an object dynamically
    alterable at run time. But if an object doesn't need that extra machinery
    and associated dictionary, it's a big win to discard the excess baggage
    and use a simpler fixed-size representation, comparable to a C struct,
    for the object.
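
    As an illustration (the module names here are invented):

        # mod_a.py -- defines the object
        class Point:
            def __init__(self):
                self.x = 0.0
                self.y = 0.0

        # mod_b.py -- far away from the definition
        import mod_a
        p = mod_a.Point()
        setattr(p, "color", "red")   # adds a field the compiler never saw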

    On the typing front, the neatest way to express typing is via
    initialization. With the Shed Skin restrictions, if all variables are
    initialized before use (preferably in __init__), there's no need to
    maintain an "undefined" flag for a variable. And, of course, if
    the type of a variable is simple and can't change, it doesn't have to
    be "boxed" (enclosed in an object), which is a big win.

    The point here is that we don't need language changes or declarations
    to make Python much faster. All we need are a few restrictions that
    ensure that, when you're doing something unusual, the compiler can
    tell.

    John Nagle
     
    John Nagle, May 8, 2007
    #1

  2. Paul Boddie

    Paul Boddie Guest

    On 8th May, 17:53, John Nagle <> wrote:
    > Some faster Python implementations are under development.
    > JPython


    Ahem: Jython!

    > has been around for a while, and PyPy and ShedSkin
    > continue to move forward. It's worth thinking about what slows
    > down Python implementations.


    It's the dynamicity! ;-) But things like clever memory allocation can
    pay dividends, too, and I wouldn't be surprised if this explained
    Python's better-than-expected showing when compared to other languages
    - such as that comparison you provoked with those claims of superior
    JavaScript performance. ;-)

    > It isn't the dynamism, really.


    In theory, no, but in practice CPython doesn't optimise away the
    dynamism. Recent experiments with method caches seem to have shown
    modest performance improvements, and I'd guess that such things are
    fairly established in other dynamic language implementations.

    > As others have pointed out
    > in the Python literature, most of the time, the more elaborate
    > dynamic features aren't being used for any given variable or
    > object. If code has something like "x = 1.0", the odds are that
    > "x" is going to stay a floating point number, and not suddenly turn
    > into a list or an object reference. The type inference of Shed Skin
    > builds on that assumption, adding some restrictions about changing of
    > variable types.


    The problem here, amongst many others, is knowing for sure whether the
    more elaborate features have been avoided. Approaches which attempt to
    avoid knowing such things include just-in-time compilation (you avoid
    knowing in advance) and type declarations (you give up thinking about
    whether it's possible and have the programmer do all the work).

    > The Shed Skin effort indicates that explicit typing, via 'decorators'
    > or otherwise, isn't really necessary. What's necessary is the avoidance
    > of "surprises" in the language. In this context, a "surprise" is
    > the use of a dynamic feature in a way that can't be seen at compile time.


    I concur with your assessment of the supposed necessity of explicit
    typing. However, you might be surprised as to what constitutes a
    "surprise" in Python...

    > A typical "surprise" would be the use of "setattr" on an object from
    > outside the compilation unit that defines the object. Within a module,
    > "setattr" on an object in that module is no problem; the compiler can see
    > it and generate the extra machinery needed to make an object dynamically
    > alterable at run time. But if an object doesn't need that extra machinery
    > and associated dictionary, it's a big win to discard the excess baggage
    > and use a simpler fixed-size representation, comparable to a C struct,
    > for the object.


    You don't even need to bring out setattr to make the inference
    activity a difficult one. Even straightforward attribute access needs
    to be repeatedly checked to see if the target of a normal attribute
    assignment or query has changed. Traditionally, people have shown some
    trivial function in isolation...

    def f(x):
        return x.a

    ...and said, "We don't know anything! It's all impossible!" But
    context is everything, as you know, and whole program analysis is the
    only way to go with the aforementioned goals in mind. What if the
    parameter to the above itself comes from attribute access?

    def g(y):
        return f(y.b)

    You can descend into some fairly demanding situations with respect to
    keeping track of all the possibilities.
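
    For example (invented classes, just to show the fan-out through f and
    g above):

        class Q1:
            def __init__(self):
                self.a = 1          # f will see an int via this route
        class Q2:
            def __init__(self):
                self.a = "one"      # and a string via this one
        class P:
            def __init__(self):
                self.b = Q1()
        class R:
            def __init__(self):
                self.b = Q2()

        g(P())   # f(x) returns an int here...
        g(R())   # ...and a string here, so "x.a" has no single type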

    > On the typing front, the neatest way to express typing is via
    > initialization. With the Shed Skin restrictions, if all variables are
    > initialized before use (preferably in __init__), there's no need to
    > maintain an "undefined" flag for a variable. And, of course, if
    > the type of a variable is simple and can't change, it doesn't have to
    > be "boxed" (enclosed in an object), which is a big win.


    I'm fairly sure, although my intuition unfortunately doesn't provide a
    ready example right now, that typing via initialisation, whilst an
    important tool, may either impose unacceptable restrictions if
    enforced too rigidly or limit the precision that one might desire in
    determining type information in the resulting system. But it is a
    worthwhile objective to identify fixed-size structures and unboxed
    values, in my opinion.

    > The point here is that we don't need language changes or declarations
    > to make Python much faster. All we need are a few restrictions that
    > ensure that, when you're doing something unusual, the compiler can
    > tell.


    Agreed. And I don't buy into the negative "lesser Python" labelling of
    such work, either. People seem to have forgotten how to use older,
    conservative Python features, preferring to show off with metaclasses
    and closures even for problems that could be solved using simple
    classes and objects. If that's "greater Python" then call me a
    minimalist!

    Paul
     
    Paul Boddie, May 8, 2007
    #2

  3. In <>, Paul Boddie
    wrote:

    >> On the typing front, the neatest way to express typing is via
    >> initialization. With the Shed Skin restrictions, if all variables are
    >> initialized before use (preferably in __init__), there's no need to
    >> maintain an "undefined" flag for a variable. And, of course, if
    >> the type of a variable is simple and can't change, it doesn't have to
    >> be "boxed" (enclosed in an object), which is a big win.


    A variable? :)

    Maybe that last sentence should start with: And, of course, if the type of
    objects bound to a name won't change…

    > I'm fairly sure, although my intuition unfortunately doesn't provide a
    > ready example right now, that typing via initialisation, whilst an
    > important tool, may either impose unacceptable restrictions if
    > enforced too rigidly or limit the precision that one might desire in
    > determining type information in the resulting system.


    I often bind a name to `None` first if it is going to be bound to its
    "real" value later on. So this initialization doesn't say anything about
    the type(s) that will be bound to it later.
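
    For instance:

        def lookup(key, table):
            result = None            # bound to None first...
            if key in table:
                result = table[key]  # ...and to its "real" value later; the
            return result            # None says nothing about that type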

    I don't see how this type inference for static types will work unless some
    of the dynamism of the language gets restricted. But is this really
    necessary? Isn't a JIT compiler and maybe type hinting good enough? By
    type hinting I really mean just recommendations from the programmer. So
    you can say this name should be an `int` and the JIT compiler produces
    code in advance, but it's okay to pass another object instead.
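
    Something like this invented decorator is what I have in mind (`hint`
    doesn't exist anywhere, it's purely illustrative):

        @hint(n=int)                   # recommendation only: specialise for
        def triangle(n):               # int, but any object may still be
            return n * (n + 1) // 2    # passed at run time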

    Ciao,
    Marc 'BlackJack' Rintsch
     
    Marc 'BlackJack' Rintsch, May 8, 2007
    #3
  4. John Nagle wrote:
    > Some faster Python implementations are under development.
    > JPython has been around for a while,


    s/JP/J/

    And FWIW, I'm not sure Jython is really faster than CPython...
     
    Bruno Desthuilliers, May 8, 2007
    #4
  5. John Nagle

    John Nagle Guest

    Marc 'BlackJack' Rintsch wrote:

    > I don't see how this type inference for static types will work unless some
    > of the dynamism of the language gets restricted. But is this really
    > necessary? Isn't a JIT compiler and maybe type hinting good enough?


    Not necessarily. One of the more powerful optimizations is to optimize
    reference count updates. Often, you can hoist reference count updates
    out of loops, which is a big win. But to do that, you need to be sure
    that the code executed by the loop won't change unexpectedly.
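
    A sketch of the kind of loop I mean (ordinary Python; the optimization
    itself would live inside the implementation):

        def total(items):
            s = 0.0
            for x in items:   # CPython adjusts reference counts on s and x
                s = s + x     # every iteration; if the compiler knew the loop
            return s          # body couldn't change, most of those updates
                              # could be hoisted out of the loop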

    My point is that while all the dynamism in Python is occasionally
    useful, most of the time for most of the variables most of it isn't
    being used. If we can discard the excess baggage, unnecessary
    dictionary lookups, and unnecessary reference count updates a
    large fraction of the time, it's a big win. This requires
    "surprise-free" language semantics, so that when something unusual
    is going on, it's visibly reflected in the code.

    Surprise-free semantics also make code easier to debug.


    John Nagle
     
    John Nagle, May 9, 2007
    #5
  6. Bruno Desthuilliers <> wrote:

    > John Nagle wrote:
    > > Some faster Python implementations are under development.
    > > JPython has been around for a while,

    >
    > s/JP/J/


    These days, yes, but it WAS originally called JPython (it was renamed to
    Jython later). So, "has been around a while" is an appropriate context
    for using the good old name.


    > And FWIW, I'm not sure Jython is really faster than CPython...


    Definitely not, last I measured. Not IronPython either, overall (though
    it's much better than Jython and does get to beat CPython on some
    specific benchmarks).


    Alex
     
    Alex Martelli, May 9, 2007
    #6
  7. Kay Schluehr

    Kay Schluehr Guest

    On May 9, 1:25 pm, John Nagle <> wrote:
    > Marc 'BlackJack' Rintsch wrote:
    > > I don't see how this type inference for static types will work unless some
    > > of the dynamism of the language gets restricted. But is this really
    > > necessary? Isn't a JIT compiler and maybe type hinting good enough?

    >
    > Not necessarily. One of the more powerful optimizations is to optimize
    > reference count updates. Often, you can hoist reference count updates
    > out of loops, which is a big win. But to do that, you need to be sure
    > that the code executed by the loop won't change unexpectedly.


    The advantage of having a JIT is just that it can record data at
    runtime and respond flexibly to it. It doesn't run into the global
    static analysis problems mentioned by Paul. A "semi-dynamical"
    compromise would be to use a profile of samples of runtime data and
    assert that they reflect "typical" behaviour. Then the system needs an
    ability to fall back to usual bytecode interpretation. Psyco does that
    by analyzing the next opcode and using a very clever dispatch mechanism.
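
    For reference, Psyco's usual entry points look like this (assuming a
    32-bit x86 CPython where Psyco is available):

        import psyco

        def hot_function(n):       # stand-in for whatever profiling says is hot
            return sum(range(n))

        psyco.bind(hot_function)   # specialise just this function...
        # psyco.full()             # ...or everything, trading memory for speed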

    A more code-oriented, profile-driven approach would factor source into
    natively compilable parts and those that have to be bytecode
    interpreted. I wonder whether bridges to Pyrex, Boost.Python or the
    celerid bridge to D could be used and what the performance penalties
    are for all the argument/return value wrapping and unwrapping.
    Unfortunately ShedSkin lacks CPython integration. We talked about this
    here recently.

    Kay
     
    Kay Schluehr, May 9, 2007
    #7
  8. "John Nagle" <> wrote:

    8<---------------- summary of existing work and thinking ------------------

    > The point here is that we don't need language changes or declarations
    > to make Python much faster. All we need are a few restrictions that
    > ensure that, when you're doing something unusual, the compiler can
    > tell.
    >


    I am relatively new on this turf, and from what I have seen so far, it
    would not bother me at all to tie a name's type to its first use, so that
    the name can only be bound to objects of the same type as the type
    of the object that it was originally bound to.
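
    In other words, something like this, which is legal today, would
    become an error:

        x = 1         # x is now tied to int for good
        x = "one"     # rejected under the restriction; fine in Python today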

    But maybe I am missing the point of dynamism.

    Would an implementation of the above break lots of stuff in practice?

    It seems to me that it could have the effect of a declaration without
    the wart of actually doing it.

    - Hendrik
     
    Hendrik van Rooyen, May 9, 2007
    #8
  9. Paul Boddie

    Paul Boddie Guest

    On 9 May, 08:09, "Hendrik van Rooyen" <> wrote:
    >
    > I am relatively new on this turf, and from what I have seen so far, it
    > would not bother me at all to tie a name's type to its first use, so that
    > the name can only be bound to objects of the same type as the type
    > of the object that it was originally bound to.


    But it's interesting to consider the kinds of names you could restrict
    in this manner and what the effects would be. In Python, the only
    names that can be considered difficult to modify arbitrarily "at a
    distance" - in other words, from outside the same scope - are locals,
    and even then there are things like closures and perverse
    implementation-dependent stack hacks which can expose local namespaces
    to modification, although any reasonable "conservative Python"
    implementation would disallow the latter.

    In a local namespace you could restrict names in this way, although
    I'd argue that with the limitations on locals, you don't gain as much
    as you would by restricting other names similarly. However, by
    restricting other kinds of names (eg. instance attributes) you have to
    start thinking about polymorphism: what if attribute x on instances of
    class C can have different types? If you aren't careful, you've
    introduced interfaces as the primary mechanism permitting some kind of
    polymorphism.
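
    A trivial example of the problem:

        class C:
            pass

        c1 = C(); c1.x = 1        # x is an int on this instance
        c2 = C(); c2.x = "one"    # and a string on this one - so what is
                                  # "the type" of attribute x on C?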

    Paul
     
    Paul Boddie, May 9, 2007
    #9
  10. Terry Reedy

    Terry Reedy Guest

    "Hendrik van Rooyen" <> wrote in message
    news:013d01c79210$5e441280$03000080@hendrik...
    | I am relatively new on this turf, and from what I have seen so far, it
    | would not bother me at all to tie a name's type to its first use, so that
    | the name can only be bound to objects of the same type as the type
    | of the object that it was originally bound to.
    |
    | But maybe I am missing the point of dynamism.
    |
    | Would an implementation of the above break lots of stuff in practice?

    For function local variables, if you mean 'originally bound to' in the
    current call invocation, that would sometimes be ok (and that is sort of
    what Psyco does). But if you mean the original binding in the first
    call invocation, then that would cripple many functions.
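
    For example:

        def first(seq):
            item = seq[0]        # "item" keeps one type per invocation: fine
            return item

        first([1, 2, 3])         # item is an int in this call
        first(["a", "b"])        # and a string here - crippled if the first
                                 # call's binding were made permanent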


    tjr
     
    Terry Reedy, May 9, 2007
    #10
  11. John Nagle

    John Nagle Guest

    Paul Boddie wrote:
    > On 9 May, 08:09, "Hendrik van Rooyen" <> wrote:
    >
    >>I am relatively new on this turf, and from what I have seen so far, it
    >>would not bother me at all to tie a name's type to its first use, so that
    >>the name can only be bound to objects of the same type as the type
    >>of the object that it was originally bound to.

    >
    >
    > But it's interesting to consider the kinds of names you could restrict
    > in this manner and what the effects would be. In Python, the only kind
    > of name that can be considered difficult to arbitrarily modify "at a
    > distance" - in other words, from outside the same scope - are locals,
    > and even then there are things like closures and perverse
    > implementation-dependent stack hacks which can expose local namespaces
    > to modification, although any reasonable "conservative Python"
    > implementation would disallow the latter.


    Modifying "at a distance" is exactly what I'm getting at. That's the
    killer from an optimizing compiler standpoint. The compiler, or a
    maintenance programmer, looks at a block of code, and there doesn't seem
    to be anything unusual going on. But, if in some other section of
    code, something does a "setattr" to mess with the first block of code,
    something unusual can be happening. This is tough on both optimizing
    compilers and maintenance programmers.

    Python has that capability mostly because it's free in an
    "everything is a dictionary" implementation. ("When all you have
    is a hash, everything looks like a dictionary".) But that limits
    implementation performance. Most of the time, nobody is using
    "setattr" to mess with the internals of a function, class, or
    module from far, far away. But the cost for that flexibility is
    being paid, unnecessarily.

    I'm suggesting that the potential for "action at a distance" somehow
    has to be made more visible.

    One option might be a class "simpleobject", from which other classes
    can inherit. ("object" would become a subclass of "simpleobject").
    "simpleobject" classes would have the following restrictions:

    - New fields and functions cannot be introduced from outside
    the class. Every field and function name must explicitly appear
    at least once in the class definition. Subclassing is still
    allowed.
    - Unless the class itself uses "getattr" or "setattr" on itself,
    no external code can do so. This lets the compiler eliminate the
    object's dictionary unless the class itself needs it.

    This lets the compiler see all the field names and assign them fixed slots
    in a fixed sized object representation. Basically, this means simple objects
    have a C/C++ like internal representation, with the performance that comes
    with that representation.
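
    Usage would look like this ("simpleobject" is hypothetical, of course):

        class Point(simpleobject):
            def __init__(self, x, y):
                self.x = x    # every field name appears in the class body,
                self.y = y    # so the compiler can assign fixed slots
            def norm2(self):
                return self.x * self.x + self.y * self.y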

    With this, plus the "Shed Skin" restrictions, plus the array features of
    "numarray", it should be possible to get computationally intensive code
    written in Python up to C/C++ levels of performance. Yet all the dynamic
    machinery of Python remains available if needed.

    All that's necessary is not to surprise the compiler.

    John Nagle
     
    John Nagle, May 9, 2007
    #11
  12. Paul Boddie

    Paul Boddie Guest

    John Nagle wrote:
    >
    > Modifying "at a distance" is exactly what I'm getting at. That's the
    > killer from an optimizing compiler standpoint. The compiler, or a
    > maintenance programmer, looks at a block of code, and there doesn't seem
    > to be anything unusual going on. But, if in some other section of
    > code, something does a "setattr" to mess with the first block of code,
    > something unusual can be happening. This is tough on both optimizing
    > compilers and maintenance programmers.


    Agreed. It's tempting to look at some code and say which types one
    thinks are being manipulated, but the actual program behaviour can be
    quite different. Adding explicit type declarations "anchors" the names
    to a very restrictive set of types - typically those permitted through
    interfaces or inheritance in many object-oriented languages - and the
    challenge, as we've already discussed, is to attempt to "anchor" the
    names when they are in a sense "floating free" without any explicit
    annotations from the programmer.

    Python also presents other challenges. Unlike systems programming
    languages, almost nothing is a local operation: each Python function
    is mostly a product of function calls and the occasional conditional
    or looping construct, themselves wired up using yet more function
    calls. Whilst the expectation is that most of these will return sensible
    results (eg. a call to the iter method on a sequence returns an
    iterator), there isn't any guarantee that this will be the case, and
    even then the precise iterator involved can vary. Some might claim
    that the class of operations which could be done locally are precisely
    the ones which would benefit from identification and optimisation,
    namely those involving primitive types, yet some pretty good solutions
    for such cases exist already: Pyrex, various numeric packages, and so
    on.
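
    To illustrate the lack of guarantees:

        class Odd:
            def __iter__(self):
                return 42    # legal to write; nothing complains at compile time

        iter(Odd())          # fails only at run time, with a TypeError
                             # complaining about a non-iterator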

    > Python has that capability mostly because it's free in an
    > "everything is a dictionary" implementation. ("When all you have
    > is a hash, everything looks like a dictionary".) But that limits
    > implementation performance. Most of the time, nobody is using
    > "setattr" to mess with the internals of a function, class, or
    > module from far, far away. But the cost for that flexibility is
    > being paid, unnecessarily.


    I've heard claims that virtual machines are evolving to make
    dictionary-based access more performant, and I guess Microsoft's DLR
    and any JVM improvements might point in that direction, but Shed Skin
    shows the way by avoiding such things entirely.

    > I'm suggesting that the potential for "action at a distance" somehow
    > has to be made more visible.
    >
    > One option might be a class "simpleobject", from which other classes
    > can inherit. ("object" would become a subclass of "simpleobject").
    > "simpleobject" classes would have the following restrictions:
    >
    > - New fields and functions cannot be introduced from outside
    > the class. Every field and function name must explicitly appear
    > at least once in the class definition. Subclassing is still
    > allowed.


    I suppose one could mandate that only methods defined inside class
    blocks belong to the class and subclasses, and that various kinds of
    optimisations can then be applied to the code in those methods such
    that the self parameter's type is constrained in a way probably
    already imposed by CPython. This would forbid the adding of functions
    to classes and instances after the fact, but in my fairly conservative
    world of programming, I hardly ever do such things. I do add
    attributes to instances outside those instances (by the above
    definition) quite often, though.

    > - Unless the class itself uses "getattr" or "setattr" on itself,
    > no external code can do so. This lets the compiler eliminate the
    > object's dictionary unless the class itself needs it.


    I think this is a natural progression from determining the access
    needs of particular classes and their objects.

    > This lets the compiler see all the field names and assign them fixed slots
    > in a fixed sized object representation. Basically, this means simple objects
    > have a C/C++ like internal representation, with the performance that comes
    > with that representation.


    Yes, I agree that this would be desirable.

    > With this, plus the "Shed Skin" restrictions, plus the array features of
    > "numarray", it should be possible to get computationally intensive code
    > written in Python up to C/C++ levels of performance. Yet all the dynamic
    > machinery of Python remains available if needed.
    >
    > All that's necessary is not to surprise the compiler.


    One thing I had cause to think about again today was the cost of
    function invocations. Python not only has pervasive dynamic dispatch,
    but there are also such things as *args and **kwargs and the cost of
    dealing with them. However, such things are very useful in integrating
    components and extending class hierarchies, so I wouldn't want to see
    them vanish in their entirety.
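
    The kind of thing I'd hate to lose, even though every call through it
    pays for the packing and unpacking:

        def logged(func):
            def wrapper(*args, **kwargs):        # generic forwarding
                print "calling", func.__name__
                return func(*args, **kwargs)     # tuple/dict built per call
            return wrapper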

    Paul
     
    Paul Boddie, May 9, 2007
    #12
  13. Klaas

    Klaas Guest

    On May 9, 10:02 am, John Nagle <> wrote:

    > One option might be a class "simpleobject", from which other classes
    > can inherit. ("object" would become a subclass of "simpleobject").
    > "simpleobject" classes would have the following restrictions:
    >
    > - New fields and functions cannot be introduced from outside
    > the class. Every field and function name must explicitly appear
    > at least once in the class definition. Subclassing is still
    > allowed.
    > - Unless the class itself uses "getattr" or "setattr" on itself,
    > no external code can do so. This lets the compiler eliminate the
    > object's dictionary unless the class itself needs it.
    >
    > This lets the compiler see all the field names and assign them fixed slots
    > in a fixed sized object representation. Basically, this means simple objects
    > have a C/C++ like internal representation, with the performance that comes
    > with that representation.


    Hey look, it already exists:

    >>> class A(object):
    ...     __slots__ = 'a b c d'.split()
    ...
    >>> a = A()
    >>> a.e = 2
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'A' object has no attribute 'e'
    >>> hasattr(a, '__dict__')
    False

    -Mike
     
    Klaas, May 10, 2007
    #13
  14. "Terry Reedy" <t..y@u,,.edu> wrote:

    > "Hendrik van Rooyen" <> wrote in message
    > news:013d01c79210$5e441280$03000080@hendrik...
    > | I am relatively new on this turf, and from what I have seen so far, it
    > | would not bother me at all to tie a name's type to its first use, so that
    > | the name can only be bound to objects of the same type as the type
    > | of the object that it was originally bound to.
    > |
    > | But maybe I am missing the point of dynamism.
    > |
    > | Would an implementation of the above break lots of stuff in practice?
    >
    > For function local variables, if you mean 'originally bound to' in the
    > current call invocation, that would sometimes be ok (and that is sort of
    > what Psyco does). But if you mean the original binding in the first
    > call invocation, then that would cripple many functions.
    >

    Errr - I was thinking a bit simplistically - I know I can write:

    def f(x):
        for y in x:
            print y   # using print here as short for doing something complicated

    And that would currently "work" with any iterable, as x could
    be anything.

    It seems that such functions are the problem, as something like this:

    x = [1, 2, 3, 4, 5]
    for y in x:
        print y

    does not have the same hassle for x, but can shift the problem to y:

    x = [1, 2, 3, 4, (1, 2)]
    for y in x:
        print y

    I can't see an easy way to put the patient back into his straitjacket.

    Makes you want to use predefined globals...

    - Hendrik
     
    Hendrik van Rooyen, May 10, 2007
    #14
  15. "John Nagle" <> wrote:


    > Paul Boddie wrote:
    > > On 9 May, 08:09, "Hendrik van Rooyen" <> wrote:
    > >
    > >>I am relatively new on this turf, and from what I have seen so far, it
    > >>would not bother me at all to tie a name's type to its first use, so that
    > >>the name can only be bound to objects of the same type as the type
    > >>of the object that it was originally bound to.

    > >
    > >
    > > But it's interesting to consider the kinds of names you could restrict
    > > in this manner and what the effects would be. In Python, the only kind
    > > of name that can be considered difficult to arbitrarily modify "at a
    > > distance" - in other words, from outside the same scope - are locals,
    > > and even then there are things like closures and perverse
    > > implementation-dependent stack hacks which can expose local namespaces
    > > to modification, although any reasonable "conservative Python"
    > > implementation would disallow the latter.

    >
    > Modifying "at a distance" is exactly what I'm getting at. That's the
    > killer from an optimizing compiler standpoint. The compiler, or a
    > maintenance programmer, looks at a block of code, and there doesn't seem
    > to be anything unusual going on. But, if in some other section of
    > code, something does a "setattr" to mess with the first block of code,
    > something unusual can be happening. This is tough on both optimizing
    > compilers and maintenance programmers.
    >
    > Python has that capability mostly because it's free in an
    > "everything is a dictionary" implementation. ("When all you have
    > is a hash, everything looks like a dictionary".) But that limits
    > implementation performance. Most of the time, nobody is using
    > "setattr" to mess with the internals of a function, class, or
    > module from far, far away. But the cost for that flexibility is
    > being paid, unnecessarily.
    >
    > I'm suggesting that the potential for "action at a distance" somehow
    > has to be made more visible.
    >
    > One option might be a class "simpleobject", from which other classes
    > can inherit. ("object" would become a subclass of "simpleobject").
    > "simpleobject" classes would have the following restrictions:
    >
    > - New fields and functions cannot be introduced from outside
    > the class. Every field and function name must explicitly appear
    > at least once in the class definition. Subclassing is still
    > allowed.
    > - Unless the class itself uses "getattr" or "setattr" on itself,
    > no external code can do so. This lets the compiler eliminate the
    > object's dictionary unless the class itself needs it.
    >
    > This lets the compiler see all the field names and assign them fixed slots
    > in a fixed sized object representation. Basically, this means simple objects
    > have a C/C++ like internal representation, with the performance that comes
    > with that representation.
    >
    > With this, plus the "Shed Skin" restrictions, plus the array features of
    > "numarray", it should be possible to get computationally intensive code
    > written in Python up to C/C++ levels of performance. Yet all the dynamic
    > machinery of Python remains available if needed.
    >
    > All that's necessary is not to surprise the compiler.
    >

    If this is all it takes, I would even be happy to have to declare which things
    could be surprising - some statement like:

    x can be anything

    Would that help?

    It kind of inverts the logic - and states that if you want what is now
    the default behaviour, you have to ask for it.

    - Hendrik
     
    Hendrik van Rooyen, May 10, 2007
    #15
  16. sturlamolden

    sturlamolden Guest

    On May 8, 5:53 pm, John Nagle <> wrote:

    > The point here is that we don't need language changes or declarations
    > to make Python much faster. All we need are a few restrictions that
    > ensure that, when you're doing something unusual, the compiler can
    > tell.


    Franz, CMUCL, SBCL and GCL teams made Lisp almost as fast as C. A
    dynamic language can be fast if the implementation is good.

    If you look at SBCL and GCL, no code is interpreted. It's all compiled
    on the fly to native machine code. The compiler begins with some code
    and some input data, and compiles as much as it can. Then the RT
    executes the machine code, compiles again, etc. Often long stretches
    of code can be compiled without a break, and tight performance-critical
    loops are usually compiled only once. In addition to this, one needs
    an efficient system to cache compiled code, in order to do the
    compilation work only once. Making a dynamic language fast is not
    rocket science.

    We should have something like "GPython", a Python RT on top of a GCC
    backend, similar to what the GCL team did for Lisp. There is no
    obvious reason why Lisp should have better performance than
    Python.
     
    sturlamolden, May 10, 2007
    #16
  17. Tim Golden

    Tim Golden Guest

    sturlamolden wrote:
    > On May 8, 5:53 pm, John Nagle <> wrote:
    >
    >> The point here is that we don't need language changes or declarations
    >> to make Python much faster. All we need are a few restrictions that
    >> ensure that, when you're doing something unusual, the compiler can
    >> tell.

    >
    > Franz, CMUCL, SBCL and GCL teams made Lisp almost as fast as C. A
    > dynamic language can be fast if the implementation is good.
    >
    > If you look at SBCL and GCL, no code is interpreted. It's all compiled
    > on the fly to native machine code. The compiler begins with some code
    > and some input data, and compiles as much as it can. Then the RT
    > executes the machine code, compiles again, etc. Often long stretches
    > of code can be compiled without a break, and tight performance-critical
    > loops are usually compiled only once. In addition to this, one needs
    > an efficient system to cache compiled code, in order to do the
    > compilation work only once. Making a dynamic language fast is not
    > rocket science.
    >
    > We should have something like "GPython", a Python RT on top of a GCC
    > backend, similar to what the GCL team did for Lisp. There is no
    > obvious reason why Lisp should have better performance than
    > Python.


    I doubt if anyone disputes the gist of what you're
    saying[*], viz that Python could be made faster by using
    technique (a), (b) or (c) which have been successful
    elsewhere. At least that it's worth investigating.

    But the relevant bit of your last paragraph is at the start:
    "We should...". Unless someone or someones has the time,
    inclination, money, backing, wherewithal etc. to implement
    this or any other measure of speeding-up, it's all
    pie-in-the-sky. Useful, maybe, as discussion of what
    options are viable, but a project of this magnitude
    doesn't just happen in some developer's lunchbreak.

    Many people (and I include myself) are quite
    happy with Python's speed. In fact, I'm quite happy
    with most things about Python. Others would like to
    see it faster. That's great. But unless people
    put their money where their mouths are, I don't
    see it happening.

    TJG

    [*] Actually, knowing this community, I'm sure *someone's*
    going to!
     
    Tim Golden, May 10, 2007
    #17
  18. Terry Reedy

    Terry Reedy Guest

    "sturlamolden" <> wrote in message
    news:...
    | Franz, CMUCL, SBCL and GCL teams made Lisp almost as fast as C. A
    | dynamic language can be fast if the implementation is good.
    |
    | If you look at SBCL and GCL, no code is interpreted. It's all compiled
    | on the fly to native machine code.

    Unfortunately, native machine code depends on the machine, or at least the
    machine being emulated by the hardware. Fortunately or not, the dominance
    of the x86 model makes this less of a problem.

    | The compiler begins with some code
    | and some input data, and compiles as much as it can. Then the RT
    | executes the machine code, compiles again, etc. Often long stretches
    | of code can be compiled without a break, and tight performance-critical
    | loops are usually compiled only once. In addition to this, one needs
    | an efficient system to cache compiled code, in order to do the
    | compilation work only once. Making a dynamic language fast is not
    | rocket science.

    This sounds somewhat similar to Psyco. If a code section is entered with
    different machine types, recompilation is needed. One of the problems with
    Psyco is that its efficient caching of multiple versions of compiled code
    is space-hungry.

    | We should have something like "GPython", a Python RT on top of a GCC
    | backend, similar to what the GCL team did for Lisp. There is no
    | obvious reason why Lisp should have better performance than
    | Python.

    In the 1980s, during the Lisp/AI boom, perhaps a billion dollars was put
    into making Lisp run faster. Plus decades of work in Computer Science
    departments. Other than that, I agree. Now, Python is slowly making
    inroads into academia as a subject of research and theses, and the PyPy
    group got at least one large EU grant.

    Terry Jan Reedy
     
    Terry Reedy, May 10, 2007
    #18
  19. Guest

    Guest

    On May 9, 5:02 pm, John Nagle <> wrote:
    > Paul Boddie wrote:
    >
    > Python has that capability mostly because it's free in an
    > "everything is a dictionary" implementation. ("When all you have
    > is a hash, everything looks like a dictionary".) But that limits
    > implementation performance. Most of the time, nobody is using
    > "setattr" to mess with the internals of a function, class, or
    > module from far, far away. But the cost for that flexibility is
    > being paid, unnecessarily.
    >


    I've had this idea in the back of my head for a year or so now, but
    have been too lazy or busy or both to give the implementation a try.

    The idea is to create a special dictionary for python internals that
    contains a pointer-to-a-pointer for the value side of the dictionary,
    instead of just the standard PyObject*. Then when you start to eval a
    code block, you could do a one-time lookup of the pointer-to-the-
    pointer for each method, attribute, etc; and store them in pre-
    allocated slots. The calls in the block to various GET_ and SET_
    functions can then just deal with pointer dereferencing instead of a
    dictionary lookup, and the dictionary can still get changed while this
    is going on without things blowing up.

    I think you'd get a big speed up by changing all those dictionary
    function calls to inlined pointer deref code.
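
    A rough pure-Python sketch of what I mean (the real thing would be C
    inside the interpreter):

        class Cell(object):
            __slots__ = ['value']

        namespace = {'f': Cell()}      # name -> cell, not name -> object
        namespace['f'].value = len     # binding writes through the cell

        cached = namespace['f']        # one-time lookup at block entry
        print cached.value('abc')      # later accesses just dereference: 3
        namespace['f'].value = max     # rebinding still works mid-flight...
        print cached.value(1, 2)       # ...and the cached cell sees it: 2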

    Anyone else buy this approach? Or see any fatal flaws?
     
    , May 10, 2007
    #19
  20. Terry Reedy

    Terry Reedy Guest

    <> wrote in message
    news:...
    | The idea is to create a special dictionary for python internals that
    | contains a pointer-to-a-pointer for the value side of the dictionary,
    | instead of just the standard PyObject*. Then when you start to eval a
    | code block, you could do a one-time lookup of the pointer-to-the-
    | pointer for each method, attribute, etc; and store them in pre-
    | allocated slots. The calls in the block to various GET_ and SET_
    | functions can then just deal with pointer dereferencing instead of a
    | dictionary lookup, and the dictionary can still get changed while this
    | is going on without things blowing up.
    |
    | I think you'd get a big speed up by changing all those dictionary
    | function calls to inlined pointer deref code.
    |
    | Anyone else buy this approach? Or see any fatal flaws?

    Within CPython functions, local variables are usually implemented as
    numbered slots in a C array (of PyObject pointers). That is why a) the
    compiler makes a first pass thru a function body (to determine what is
    local and what is not) and b) LOAD_FAST is faster than LOAD_GLOBAL.
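
    You can see the difference directly:

        >>> import dis
        >>> def f():
        ...     x = 1
        ...     return x + len("abc")
        ...
        >>> dis.dis(f)   # shows LOAD_FAST for "x", LOAD_GLOBAL for "len"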

    Terry Jan Reedy
     
    Terry Reedy, May 10, 2007
    #20
