Python is faster than C

Duncan Booth · Apr 4, 2004

Why can't the C implementation do the same thing?

It easily could, however sorting a list of ints sounds to me like a
comparatively unusual thing to do in any real application. Sorting strings,
or sorting more complex data structures are much more usual. In other words
it would be kind of pointless to introduce an optimisation that only helps
speedup benchmarks.

Also the builtin sort handles other cases which are important. Not all data
you want to sort is randomly ordered. The builtin sort method should
outperform quicksort in those cases where the input is already largely
sorted. The builtin sort is also stable, although for a list of random
integers you might be hard pushed to tell

Michel Claveau/Hamster · Apr 4, 2004

Hi !

Yes, but Psyco is writted in C & Python, and it use an C module.
Said "Psyco is faster than C", it's like said "Psyco is faster than Psyco".

@-salutations (and sorry for my *bad english*)

Michel Claveau/Hamster · Apr 4, 2004

Hi !

Sorry, but i had let down C language. I can't compare.
See also my answer to Josiah Carlson.

And Psyco is sufficiently good not to need ambiguous arguments.

@-salutations

Jeff Epler · Apr 4, 2004

from pprint import *
def install_hook():
def displayhook(v):
import __main__
if v is not None:
if isiterator(v):
print "<%s object at 0x%8x:" % (
type(v).__name__, id(v)),
v, copy = itertools.tee(v)
for elem in itertools.islice(copy, 3):
print saferepr(elem),
try:
copy.next()
except StopIteration:
pass
else:
print '...',
sys.stdout.softspace=0
print ">"
else:
pprint(v)
__main__._ = v

sys.displayhook = displayhook

Jeff

Andrew Dalke · Apr 4, 2004

Armin Rigo:

Another example would be 'a'*999999999: the result is a string, but
there is no reason that it takes 100MB of memory. Instead, store it
into a C structure that contains a pointer to the original string object
'a' and the repetition counter, but still give this C structure the
Python type str, so that the difference doesn't show up and the Python
language remains simple.

I've written lazy code like this for my own projects, where the
concrete object wasn't fully created until needed. One of the
problems I had with it was error behaviour. In this case, suppose
there isn't enough memory available. Python as is will attempt to
allocate enough space when you request it and raise a MemoryError
if there isn't enough space.

Python as you propose it may allocate the string-like object and
only later (eg, when passing the object to some external C
function which expects a char *) realize the full string. There
isn't enough memory so the allocation fails, raising a MemoryError.
But how is the programmer to realize which operations may raise
a MemoryError? Instead, will everything need a try/except guard
around it?

Andrew
(e-mail address removed)

Josiah Carlson · Apr 4, 2004

Yes, but Psyco is writted in C & Python, and it use an C module.

Said "Psyco is faster than C", it's like said "Psyco is faster than Psyco".

You probably meant "C is faster than C". That is a reasonable
statement, but users of Psyco don't need to write anything special
beyond a handful of extra statements in Python.

To the user, Psyco is a module that makes things fast. They wrote
everything in Python, so to them, "Python with Psyco is faster than C"
is far more accurate.

- Josiah

Carl Banks · Apr 5, 2004

Raymond said:
[Armin Rigo]

enumerate([6,7,8,9]) # uh ?

Click to expand...

<enumerate object at 0x401a102c>

Click to expand...

This got me thinking about how much I would like to see the contents
of an iterator at the interactive prompt.

I wonder if the read-eval-print loop could be trained to make a better
display:

# rough pseudo-code sketch
while 1:
command = raw_input()
result = eval(command)
if result is None:
continue
if is_iterator(result):
result, copy = itertools.tee(result)
print "<%s object at 0x%8x:" %
(type(result).__name__, id(result)),
for elem in itertools.islice(copy, 3):
print repr(elem),
else:
print '...',
print '>'
else:
print repr(result)
_ = result

# intended result[(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd'), (4, 'e'), (5, 'f'), (6, 'g'),
(7, 'h'), (8, 'i'), (9, 'j'), (10, 'k'), (11, 'l'), (12, 'm'), (13,
'n')]

I thought this myself, but what if the iterator is computationally
intensive?

Carl Banks · Apr 5, 2004

Armin said:
No, range should return an object that is a list, as far as you can tell
from Python, but which is represented more efficiently than an array of
objects internally. The distinction is between the language level (it
would be a list, with all operations, etc.) and the implementation
(there is no reason why all lists should be arrays of PyObjects
internally).

Another example would be 'a'*999999999: the result is a string, but
there is no reason that it takes 100MB of memory. Instead, store it
into a C structure that contains a pointer to the original string object
'a' and the repetition counter, but still give this C structure the
Python type str, so that the difference doesn't show up and the Python
language remains simple. (This is a bit difficult to implement
currently in CPython, but not impossible.)

Hmm. What would be really cool is an abstract sequence type with all
kinds of knobs to specify the operations you want to support. In
other words, rather than saying "I want a vector" or "I want a linked
list," you say, "I want fast indexing, fast sorting, no insertion, and
no resizing," or "I want no indexing, no sorting, fast insertion, fast
appending on both ends". Then, let the computer choose the actual
implementation.

Better yet, if the compilation is highly optimized, the compiler could
trace the path of the object (best as it can) through the program, and
maybe use profiling data too, to see for itself the how the object is
used, thus freeing the programmer from even specifying the operations.
The programmer could say, "I want a sequence," use it, and the
compiler could choose the best implementation.

The only thing is, I think that would be prohibitively difficult in an
on-the-fly compiled language like Python. Even having a list with one
or two optimized forms would have drawbacks: it would be hard to
optimize the fast parts with the interpretter second-guessing you all
the time, and it would be hard to write extension modules, since
they'd have to be aware of all the different layouts, or have to
always use an API.

And, frankly, I think having a simple implementation is important,
too. Complex implementations are more likely to have lots of bugs
that could burden users more than some distinct optimized types would.
In my opinion, Python is doing the right thing by having one internal
form per type.

Given that, and keeping in mind that some of these optimizations
really do help, the question to ask is: do the extra power and
efficiency the optimized version gives you justify the extra burden of
complexity on the programmer? I believe that the answer is
occasionally yes.

Consider tuples. Tuples are for the most part an optimized version
of list, but with a few powers list doesn't have, and without a few
powers list has. It's the same with iterators.

IMO, the the benefit an iterator gives you is far more than the
benefit a tuple gives you. I would say 'yes' for an iterator, it's
worth a bit of complexity; and probably 'no' for a tuple (although
it's close).

my-two-cents-ly y'rs,

Raymond Hettinger · Apr 5, 2004

[Armin Rigo]

Armin> Worse, and more importantly, the optimization starts to
Armin> become visible to the programmer. Iterators, for example,
Armin> are great in limited cases but I consider their
Armin> introduction a significant complication in the language;
Armin> before, you could expect that some function from which you
Armin> would expect a sequence returned a list. Python was all
Armin> lists and

[Ville Vainio]

Iterators are an optimization? I always considered them just a more
clean and elegant way to do the same thing .

In the planning discussions, optimization was not mentioned as goal.
In fact, I think everyone was surprised that they happened to run
faster than the __getitem__ hack. It was also a surprise at how much
they unified the language and made the various pieces work together a
little better.

The original goals were much smaller:
* less cumbersome file iteration (eliminating the need for xreadlines
and reducing the need for the fileinput module).
* providing an extensible interface
* making the code for non-sequence collections more concise and
readable

There was a thought that the performance of dictionary iteration would
improve. Also, while it was not discussed explicitly, everyone was
aware that iterators were more memory/cache friendly than their list
based counterparts.

Raymond Hettinger

Raymond Hettinger · Apr 5, 2004

[Jeff Epler]

from pprint import *
def install_hook():
def displayhook(v):
import __main__
if v is not None:
if isiterator(v):
print "<%s object at 0x%8x:" % (
type(v).__name__, id(v)),
v, copy = itertools.tee(v)
for elem in itertools.islice(copy, 3):
print saferepr(elem),
try:
copy.next()
except StopIteration:
pass
else:
print '...',
sys.stdout.softspace=0
print ">"
else:
pprint(v)
__main__._ = v

sys.displayhook = displayhook

This is very nice.

In my sketch, I used Py2.4's itertools.tee() to handle non-restartable
iterators. For Py2.3, a good alternative is:

copy = list(itertools.islice(v, 3))
v = itertools.chain(copy, v)

Now, copy can be used in the display and "v" produces the same values
as the original iterator.

Cheers,

Raymond Hettinger

P.S. If some experimentation shows your code to be useful at the
interactive prompt, please submit a patch on SF so it won't be lost.

Raymond Hettinger · Apr 5, 2004

enumerate('abcdefgh')

list(_)

Click to expand...

[(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd'), (4, 'e'), (5, 'f'), (6, 'g'),
(7, 'h'), (8, 'i'), (9, 'j'), (10, 'k'), (11, 'l'), (12, 'm'), (13,
'n')]

Click to expand...

[Carl Banks]

I thought this myself, but what if the iterator is computationally
intensive?

Since no more than three items are displayed, the interactive prompt
will still handle this much better than something like range(10000000)
which takes a while to compute and display.

Besides, everything you do in Python pays a little performance penalty
just to make sure you can break out of computationally intensive
tasks.

For me, these almost never arise at the interactive prompt.

Likewise, I program in a less hostile world than Andrew Dalke who has
to contend with pandora's box iterators which ruin the lives of mortal
men who think they can call .next() with impunity ;-)

Raymond Hettinger

Jacek Generowicz · Apr 5, 2004

Michel Claveau/Hamster said:
Yes, but Psyco is writted in C & Python, and it use an C module.
Said "Psyco is faster than C", it's like said "Psyco is faster than Psyco".

You appear to be missing the point completely.

The language used to implement the tools you use is irrelevant. What
is important is the language in which you are programming.

Armin's point is that (in certain circumstances) you can achieve a
significant speedup by one of two ways

- Recode some Python code in C

- import psyco; psyco.full()

Note: in the second case, this is _all_[*] you have to do; it takes
about 2 seconds of work. In the first case it take many
hours/days/weeks of tedious and error-prone work.

The point is: Why code in C when you can get just as good program
speed performance by coding in Python.

Put another way: Why code in a low-level, tedious, rigid,
error-inducing language, when you can get just as good program speed
performance by writing in a high-level, flexible, dynamic, pleasant
language.

[*] Yes, I'm deliberately keeping things simple in order not to draw
attention away from the real point.

Paul Prescod · Apr 5, 2004

Raymond said:
...

P.S. If some experimentation shows your code to be useful at the
interactive prompt, please submit a patch on SF so it won't be lost.

Why put this behaviour in the interactive prompt rather than in repr()?

Paul Prescod

Michael Hudson · Apr 5, 2004

Paul Rubin said:
Armin Rigo said:

Ideally: If you do x=range(100); x[50]='hi' then the interpreter first
builds this optimized range representation and assigns it to x; and when
in the next statement you modify this list x it says 'oops! i cannot do
that with this representation', so it reverts to an array-like
representation (i.e. it creates all 100 elements) and then changes the
50th. No gain here. If on the other hand you only ever do 'easy'
things with your list, like iterate over it or read elements, then it
can all be done with the range representation, without falling back to
the array representation.

Click to expand...

Maybe there is something to this.

Armin is usually worth listening too

You can find a step towards many of these ideas in PyPy, which should
surprise no one at all...

Cheers,
mwh

Andrew Dalke · Apr 5, 2004

Raymond Hettinger:

Likewise, I program in a less hostile world than Andrew Dalke who has
to contend with pandora's box iterators which ruin the lives of mortal
men who think they can call .next() with impunity ;-)

I did say "Should be rare though."

Andrew
(e-mail address removed)

(And now you've got me thinking about the Tomb Raider II movie.
Not a bad thing.)

Christian Tismer · Apr 5, 2004

Michel said:
Hi !

Yes, but Psyco is writted in C & Python, and it use an C module.
Said "Psyco is faster than C", it's like said "Psyco is faster than Psyco".

This is not so much of a contradiction.

First of all, Psyco creates assembly code. Not the fastest,
but it enters an area where Python has no entrance, yet.
So what it does is to produce faster code, although not as fast
as C could, but faster than the existing C implementation
of Python.

Seconde, depending on the algorithm you use, even Python can be
faster than C. For example, a few years ago, I implemented
a fast long integer multiplication in Python. At that time,
CPython still had asymptotic behavior of O(n**2), while
the Karatsuba algorithm I used hat something like O(2**1.53).
The difference showed up with several thousands of digits, but
it was remarkable.

Later on, Karatsuba was built into the core, and it became clear
to me that *I* caused this mess, because I had the chance to
get that algorithm into the core, long time ago. I just missed it.

What I'm trying to say: Finally it is always the algorithm.

ciao - chris

--
Christian Tismer :^) <mailto:[email protected]>
Mission Impossible 5oftware : Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/
14109 Berlin : PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34 home +49 30 802 86 56 mobile +49 173 24 18 776
PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04
whom do you want to sponsor today? http://www.stackless.com/

Stephen Horne · Apr 6, 2004

Armin Rigo said:
Armin Rigo said:

Ideally: If you do x=range(100); x[50]='hi' then the interpreter first
builds this optimized range representation and assigns it to x; and when
in the next statement you modify this list x it says 'oops! i cannot do
that with this representation', so it reverts to an array-like
representation (i.e. it creates all 100 elements) and then changes the
50th. No gain here. If on the other hand you only ever do 'easy'
things with your list, like iterate over it or read elements, then it
can all be done with the range representation, without falling back to
the array representation.

Click to expand...

Maybe there is something to this.

The problem is, once you start where do you stop.

At the moment, Armin suggests optimising the a list consisting of a
single repeating item. But what about optimising a repeating list with
one or two special cases, so that the "x[50]='hi'" above doesn't incur
a penalty?

Consider integer lists - what about optimising arithetic progressions?
Geometric progressions? Progressions with special cases? Progressions
that are the intersection, or union, or whatever of other
progressions?

If the internal implementation doesn't special-case the implementation
of operations on these, all you have is lazy evaluation. But if the
internal implementation adds special-case implementations of
operations, you either end up with an huge number of special case
implementation methods (other parameters can end up with special-case
implementations, not just 'self') or you have to implement what
amounts to a full algebraic solving engine in the Python interpreter.

Or else you can just choose to special case the really important types
and operations, which I believe Python already does to some degree (an
integer is a special case of a long integer, for instance, and an
iterator is a special case of a list with only a subset of the
operations available to a standard list) and provide the programmer
with the tools to easily implement any further special cases that he
may need.

Lua is faster than Fortran???	42	Jul 4, 2010
Taking list as an input from Python to C	1	Jul 21, 2022
Why is Python telling me variable is local not global?	3	Sep 2, 2023
IronPython faster than CPython?	3	Dec 18, 2007
Hello I have question for a programmer it`s regarding REDDIT but this forum is the best where to put it?	0	Sep 12, 2023
JIT compilers for Python, what is the latest news?	19	Apr 5, 2013
Why is it impossible to create a compiler than can compile Python tomachinecode like C?	25	Feb 28, 2013
Why is this so much faster?	3	Jun 2, 2011

Python is faster than C

Duncan Booth

Michel Claveau/Hamster

Michel Claveau/Hamster

Jeff Epler

Andrew Dalke

Josiah Carlson

Carl Banks

Carl Banks

Raymond Hettinger

Raymond Hettinger

Raymond Hettinger

Jacek Generowicz

Paul Prescod

Michael Hudson

Andrew Dalke

Christian Tismer

Stephen Horne

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads