Profiling, sum-comprehension vs reduce

cnb

This must be because of the implementation, right? Shouldn't reduce be
faster, since it iterates once over the list?
Doesn't sum first construct the list and then sum it?

-----------------------
reduce with named function: 37.9864357062
reduce with nested, named function: 39.4710288598
reduce with lambda: 39.2463927678
sum comprehension: 25.9530121845

reduce with named function: 36.4529584067
reduce with nested, named function: 37.6278529813
reduce with lambda: 38.2629448715
sum comprehension: 26.0197561422


from timeit import Timer

def add(x, y):
    return x + y

def rednamed(lst):
    return reduce(add, lst)

def rednn(lst):
    def add2(x, y):
        return x + y
    return reduce(add2, lst)

def redlambda(lst):
    return reduce(lambda x, y: x + y, lst)

def com(lst):
    return sum(x for x in lst)

s = xrange(101)

t1 = Timer('rednamed(s)', 'from __main__ import rednamed, s')
t2 = Timer('rednn(s)', 'from __main__ import rednn, s')
t3 = Timer('redlambda(s)', 'from __main__ import redlambda, s')
t4 = Timer('com(s)', 'from __main__ import com, s')

print "reduce with named function: ", t1.timeit()
print "reduce with nested, named function: ", t2.timeit()
print "reduce with lambda: ", t3.timeit()
print "sum comprehension: ", t4.timeit()
reduce with named function: 36.7560729087
reduce with nested, named function: 38.5393266463
reduce with lambda: 38.3852953378
sum comprehension: 27.9001007111
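For readers on a current interpreter: the benchmark above is Python 2 code (reduce as a builtin, xrange, print statements). A sketch of the same experiment ported to Python 3, where reduce lives in functools and range is lazy like xrange was (the iteration count here is reduced so it finishes quickly; absolute numbers depend on the machine):

```python
# Assumed Python 3 port of the original Python 2 benchmark.
from functools import reduce
from timeit import Timer

def add(x, y):
    return x + y

def rednamed(lst):
    return reduce(add, lst)

def rednn(lst):
    def add2(x, y):
        return x + y
    return reduce(add2, lst)

def redlambda(lst):
    return reduce(lambda x, y: x + y, lst)

def com(lst):
    return sum(x for x in lst)

s = range(101)  # lazy in Python 3, like Python 2's xrange
ns = {"rednamed": rednamed, "rednn": rednn,
      "redlambda": redlambda, "com": com, "s": s}

for label, stmt in [("reduce with named function", "rednamed(s)"),
                    ("reduce with nested, named function", "rednn(s)"),
                    ("reduce with lambda", "redlambda(s)"),
                    ("sum comprehension", "com(s)")]:
    print(label + ":", Timer(stmt, globals=ns).timeit(number=20000))
```

The relative ordering (sum comprehension fastest, the reduce variants clustered together) generally still holds.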
 
Marc 'BlackJack' Rintsch

This must be because of the implementation, right? Shouldn't reduce be faster,
since it iterates once over the list? Doesn't sum first construct the
list and then sum it?

No it doesn't. Why should it?
Also, using range instead of xrange doesn't seem to incur a
performance penalty:

(De)Allocating a list of length 100 isn't very slow. Try some million
elements. And watch the memory consumption too.
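Marc's point can be made concrete: for a few million elements, a materialised list costs real memory, while a lazy range keeps only a little bookkeeping state. A rough sketch (sizes and timings vary by interpreter and machine):

```python
# Compare the allocation cost of a materialised list vs a lazy range.
import sys
from timeit import timeit

n = 5_000_000

lst = list(range(n))                      # fully materialised list
print("list size (bytes): ", sys.getsizeof(lst))
print("range size (bytes):", sys.getsizeof(range(n)))  # tiny, constant

# Both iterate once and produce the same total; only allocation differs.
print("sum(list): ", timeit("sum(lst)", globals={"lst": lst}, number=10))
print("sum(range):", timeit("sum(range(n))", globals={"n": n}, number=10))
```

With only 100 elements, as in the original benchmark, the allocation cost is lost in the noise, which is why range and xrange timed about the same.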

Ciao,
Marc 'BlackJack' Rintsch
 
Harold Fellermann

Doesn't sum first construct the list and then sum it?

def com(lst):
    return sum(x for x in lst)

You construct a generator over an existing list in your code.
Try sum([x for x in lst]) to see the effect of the additional list
construction. And while you're at it, try the simple sum(lst).
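Harold's three variants can be timed side by side along these lines (a sketch; absolute numbers are machine-dependent, and the generator/list-comprehension ordering can vary by interpreter version):

```python
# Time: sum over a genexp, sum over a fresh list comprehension,
# and plain sum over the existing list.
from timeit import timeit

lst = list(range(101))
ns = {"lst": lst}

print("sum(GE): ", timeit("sum(x for x in lst)", globals=ns, number=100000))
print("sum(LC): ", timeit("sum([x for x in lst])", globals=ns, number=100000))
print("sum(lst):", timeit("sum(lst)", globals=ns, number=100000))
```

Plain sum(lst) should win clearly, since the other two add an extra layer of iteration (and, for the comprehension, an intermediate list) on top of a list that already exists.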

Cheers,

- harold -
 
Bas

This must be because of the implementation, right? Shouldn't reduce be
faster, since it iterates once over the list?
Doesn't sum first construct the list and then sum it?

No, sum also iterates over the sequence just once and doesn't create a
list. It is probably implemented similarly to

def sum(sequence, start=0):
    it = iter(sequence)
    total = start
    for i in it:
        total += i
    return total

but implemented in C for speed. Reduce is probably implemented
quite similarly, but with an arbitrary function in place of addition.
Make sure that you understand the difference between a generator
expression and a list comprehension, and that [f(x) for x in something]
is (almost) equal to list(f(x) for x in something), so you can emulate
a LC by using the list constructor on the equivalent GE.
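The equivalence Bas describes can be checked directly (a small sketch; f is a throwaway function made up for illustration):

```python
# A list comprehension and list() over the equivalent genexp
# produce the same list.
def f(x):
    return x * x  # hypothetical example function

something = range(5)

lc = [f(x) for x in something]       # list comprehension
ge = list(f(x) for x in something)   # list() over the equivalent genexp

assert lc == ge == [0, 1, 4, 9, 16]
print(lc)
```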

HTH,
Bas
 
Steven D'Aprano

This must be because of the implementation, right? Shouldn't reduce be faster,
since it iterates once over the list? Doesn't sum first construct the
list and then sum it?

What makes you think that?

Given the speed of sum(), it sure doesn't look like it's generating a
full list before summing. Why would it?

reduce with named function: 37.9864357062
reduce with nested, named function: 39.4710288598
reduce with lambda: 39.2463927678
sum comprehension: 25.9530121845

If you want to see reduce really shine, time it with a C-based function
rather than one written in pure Python:
... 'from operator import add').repeat(number=10000)
[19.724750995635986, 19.410486936569214, 19.614511013031006]

... 'def add(x, y): return x+y').repeat(number=10000)
[45.210143089294434, 44.814558982849121, 46.906874895095825]


You probably won't see much (if any) benefit for small lists, so make
sure your test is on a significantly-sized input list.

Of course, sum() is even faster than reduce:
[9.814924955368042, 8.7169640064239502, 9.5062401294708252]
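Steven's interactive session was mangled in the page extraction, and his exact Timer statements were truncated. The comparison he describes can be reconstructed roughly as follows (the statements and sizes here are assumed stand-ins, not his originals):

```python
# Reduce with a C-implemented add vs a pure-Python add, vs builtin sum.
from functools import reduce   # Python 3: reduce moved to functools
from operator import add       # C-implemented addition
from timeit import Timer

def py_add(x, y):              # pure-Python addition
    return x + y

data = list(range(1000))
ns = {"reduce": reduce, "add": add, "py_add": py_add, "data": data}

print(Timer("reduce(add, data)", globals=ns).repeat(repeat=3, number=200))
print(Timer("reduce(py_add, data)", globals=ns).repeat(repeat=3, number=200))
print(Timer("sum(data)", globals=ns).repeat(repeat=3, number=200))
```

The same ordering he reports should emerge: reduce with operator.add roughly halves the time of reduce with a Python function, and sum beats both.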
 
Terry Reedy

Steven said:
If you want to see reduce really shine, time it with a C-based function
rather than one written in pure Python:
... 'from operator import add').repeat(number=10000)
[19.724750995635986, 19.410486936569214, 19.614511013031006]
... 'def add(x, y): return x+y').repeat(number=10000)
[45.210143089294434, 44.814558982849121, 46.906874895095825]
...
Of course, sum() is even faster than reduce:
[9.814924955368042, 8.7169640064239502, 9.5062401294708252]

'Of course', because the irreducible difference between reduce(add, seq)
and sum(seq) is that reduce has to call an add function, while sum has
the operation built in, in place of a call.
 
Raymond Hettinger

Note that, despite appearances, it's not as built-in as one might
wish.  sum(seq) is still completely generic and works on all
number-like objects (in fact on all objects that define an __add__
operation except strings, which are explicitly forbidden).  This means
that it can't just go ahead and execute a C addition, it must properly
call PyNumber_Add (the C API call equivalent to Python's "+"
operator), which will then inspect the objects and invoke the
appropriate implementation of addition.
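That genericity is easy to demonstrate: sum() adds anything that defines __add__, yet explicitly rejects strings. A small sketch (Vec is a made-up class for illustration):

```python
# sum() works on arbitrary __add__-capable objects, but not strings.
class Vec:
    """Hypothetical 2-D vector supporting '+'."""
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __add__(self, other):
        return Vec(self.x + other.x, self.y + other.y)
    def __radd__(self, other):
        # sum() starts from 0, so handle 0 + Vec gracefully.
        return self if other == 0 else NotImplemented

total = sum([Vec(1, 2), Vec(3, 4)])
print(total.x, total.y)  # → 4 6

try:
    sum(["a", "b"])      # explicitly forbidden
except TypeError as e:
    print("TypeError:", e)
```

Each addition goes through the generic + protocol (PyNumber_Add at the C level), which is exactly the dispatch cost being discussed.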

The time machine strikes again! Try using Py2.6.
The built-in sum() function is much smarter and faster.
It does in fact use C addition.


Raymond
 
