Profiling, sum-comprehension vs reduce

cnb

This must be because of the implementation, right? Shouldn't reduce be
faster, since it iterates once over the list?
Doesn't sum first construct the list and then sum it?

-----------------------
reduce with named function: 37.9864357062
reduce with nested, named function: 39.4710288598
reduce with lambda: 39.2463927678
sum comprehension: 25.9530121845

reduce with named function: 36.4529584067
reduce with nested, named function: 37.6278529813
reduce with lambda: 38.2629448715
sum comprehension: 26.0197561422


from timeit import Timer

def add(x, y):
    return x + y

def rednamed(lst):
    return reduce(add, lst)

def rednn(lst):
    def add2(x, y):
        return x + y
    return reduce(add2, lst)

def redlambda(lst):
    return reduce(lambda x, y: x + y, lst)

def com(lst):
    return sum(x for x in lst)

s = xrange(101)

t1 = Timer('rednamed(s)', 'from __main__ import rednamed, s')
t2 = Timer('rednn(s)', 'from __main__ import rednn, s')
t3 = Timer('redlambda(s)', 'from __main__ import redlambda, s')
t4 = Timer('com(s)', 'from __main__ import com, s')

print "reduce with named function: ", t1.timeit()
print "reduce with nested, named function: ", t2.timeit()
print "reduce with lambda: ", t3.timeit()
print "sum comprehension: ", t4.timeit()
reduce with named function: 36.7560729087
reduce with nested, named function: 38.5393266463
reduce with lambda: 38.3852953378
sum comprehension: 27.9001007111
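For readers on a current interpreter: the benchmark above is Python 2 code (reduce as a builtin, xrange, print statements). A sketch of the same experiment ported to Python 3, where reduce lives in functools and range is lazy like xrange was (the iteration count here is reduced so it finishes quickly; absolute numbers depend on the machine):

```python
# Assumed Python 3 port of the original Python 2 benchmark.
from functools import reduce
from timeit import Timer

def add(x, y):
    return x + y

def rednamed(lst):
    return reduce(add, lst)

def rednn(lst):
    def add2(x, y):
        return x + y
    return reduce(add2, lst)

def redlambda(lst):
    return reduce(lambda x, y: x + y, lst)

def com(lst):
    return sum(x for x in lst)

s = range(101)  # lazy in Python 3, like Python 2's xrange
ns = {"rednamed": rednamed, "rednn": rednn,
      "redlambda": redlambda, "com": com, "s": s}

for label, stmt in [("reduce with named function", "rednamed(s)"),
                    ("reduce with nested, named function", "rednn(s)"),
                    ("reduce with lambda", "redlambda(s)"),
                    ("sum comprehension", "com(s)")]:
    print(label + ":", Timer(stmt, globals=ns).timeit(number=20000))
```

The relative ordering (sum comprehension fastest, the reduce variants clustered together) generally still holds.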
 
Marc 'BlackJack' Rintsch

This must be because of the implementation, right? Shouldn't reduce be faster,
since it iterates once over the list? Doesn't sum first construct the
list and then sum it?

No it doesn't. Why should it?
Also, using range instead of xrange doesn't seem to incur a
performance penalty:

(De)Allocating a list of length 100 isn't very slow. Try some million
elements. And watch the memory consumption too.
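Marc's point can be made concrete: for a few million elements, a materialised list costs real memory, while a lazy range keeps only a little bookkeeping state. A rough sketch (sizes and timings vary by interpreter and machine):

```python
# Compare the allocation cost of a materialised list vs a lazy range.
import sys
from timeit import timeit

n = 5_000_000

lst = list(range(n))                      # fully materialised list
print("list size (bytes): ", sys.getsizeof(lst))
print("range size (bytes):", sys.getsizeof(range(n)))  # tiny, constant

# Both iterate once and produce the same total; only allocation differs.
print("sum(list): ", timeit("sum(lst)", globals={"lst": lst}, number=10))
print("sum(range):", timeit("sum(range(n))", globals={"n": n}, number=10))
```

With only 100 elements, as in the original benchmark, the allocation cost is lost in the noise, which is why range and xrange timed about the same.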

Ciao,
Marc 'BlackJack' Rintsch
 
Harold Fellermann

Doesn't sum first construct the list and then sum it?

def com(lst):
    return sum(x for x in lst)

You construct a generator over an existing list in your code.
Try sum([x for x in lst]) to see the effect of the additional list
construction. And while you're at it, try the simple sum(lst).
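Harold's three variants can be timed side by side along these lines (a sketch; absolute numbers are machine-dependent, and the generator/list-comprehension ordering can vary by interpreter version):

```python
# Time: sum over a genexp, sum over a fresh list comprehension,
# and plain sum over the existing list.
from timeit import timeit

lst = list(range(101))
ns = {"lst": lst}

print("sum(GE): ", timeit("sum(x for x in lst)", globals=ns, number=100000))
print("sum(LC): ", timeit("sum([x for x in lst])", globals=ns, number=100000))
print("sum(lst):", timeit("sum(lst)", globals=ns, number=100000))
```

Plain sum(lst) should win clearly, since the other two add an extra layer of iteration (and, for the comprehension, an intermediate list) on top of a list that already exists.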

Cheers,

- harold -
 
Bas

This must be because of the implementation, right? Shouldn't reduce be
faster, since it iterates once over the list?
Doesn't sum first construct the list and then sum it?

No, sum also iterates over the sequence just once and doesn't create a
list. It is probably implemented similarly to

def sum(sequence, start=0):
    it = iter(sequence)
    total = start
    for i in it:
        total += i
    return total

but implemented in C for speed. Reduce is probably implemented
quite similarly, but with an arbitrary function in place of addition.
Make sure that you understand the difference between a generator
expression and a list comprehension, and that [f(x) for x in something]
is (almost) equal to list(f(x) for x in something), so you can emulate
a LC by using the list constructor on the equivalent GE.
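The equivalence Bas describes can be checked directly (a small sketch; f is a throwaway function made up for illustration):

```python
# A list comprehension and list() over the equivalent genexp
# produce the same list.
def f(x):
    return x * x  # hypothetical example function

something = range(5)

lc = [f(x) for x in something]       # list comprehension
ge = list(f(x) for x in something)   # list() over the equivalent genexp

assert lc == ge == [0, 1, 4, 9, 16]
print(lc)
```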

HTH,
Bas
 
Steven D'Aprano

This must be because of the implementation, right? Shouldn't reduce be faster,
since it iterates once over the list? Doesn't sum first construct the
list and then sum it?

What makes you think that?

Given the speed of sum(), it sure doesn't look like it's generating a
full list before summing. Why would it?

reduce with named function: 37.9864357062
reduce with nested, named function: 39.4710288598
reduce with lambda: 39.2463927678
sum comprehension: 25.9530121845

If you want to see reduce really shine, time it with a C-based function
rather than one written in pure Python:
... 'from operator import add').repeat(number=10000)
[19.724750995635986, 19.410486936569214, 19.614511013031006]

... 'def add(x, y): return x+y').repeat(number=10000)
[45.210143089294434, 44.814558982849121, 46.906874895095825]


You probably won't see much (if any) benefit for small lists, so make
sure your test is on a significantly-sized input list.

Of course, sum() is even faster than reduce:
[9.814924955368042, 8.7169640064239502, 9.5062401294708252]
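Steven's interactive session was mangled in the page extraction, and his exact Timer statements were truncated. The comparison he describes can be reconstructed roughly as follows (the statements and sizes here are assumed stand-ins, not his originals):

```python
# Reduce with a C-implemented add vs a pure-Python add, vs builtin sum.
from functools import reduce   # Python 3: reduce moved to functools
from operator import add       # C-implemented addition
from timeit import Timer

def py_add(x, y):              # pure-Python addition
    return x + y

data = list(range(1000))
ns = {"reduce": reduce, "add": add, "py_add": py_add, "data": data}

print(Timer("reduce(add, data)", globals=ns).repeat(repeat=3, number=200))
print(Timer("reduce(py_add, data)", globals=ns).repeat(repeat=3, number=200))
print(Timer("sum(data)", globals=ns).repeat(repeat=3, number=200))
```

The same ordering he reports should emerge: reduce with operator.add roughly halves the time of reduce with a Python function, and sum beats both.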
 
Terry Reedy

Steven said:
If you want to see reduce really shine, time it with a C-based function
rather than one written in pure Python:
... 'from operator import add').repeat(number=10000)
[19.724750995635986, 19.410486936569214, 19.614511013031006]
... 'def add(x, y): return x+y').repeat(number=10000)
[45.210143089294434, 44.814558982849121, 46.906874895095825]
...
Of course, sum() is even faster than reduce:
[9.814924955368042, 8.7169640064239502, 9.5062401294708252]

'Of course', because the irreducible difference between reduce(add, seq)
and sum(seq) is that reduce has to call an add function, while sum has
the operation built in, in place of a call.
 
Raymond Hettinger

Note that, despite appearances, it's not as built-in as one might
wish.  sum(seq) is still completely generic and works on all
number-like objects (in fact on all objects that define an __add__
operation except strings, which are explicitly forbidden).  This means
that it can't just go ahead and execute a C addition, it must properly
call PyNumber_Add (the C API call equivalent to Python's "+"
operator), which will then inspect the objects and invoke the
appropriate implementation of addition.
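That genericity is easy to demonstrate: sum() adds anything that defines __add__, yet explicitly rejects strings. A small sketch (Vec is a made-up class for illustration):

```python
# sum() works on arbitrary __add__-capable objects, but not strings.
class Vec:
    """Hypothetical 2-D vector supporting '+'."""
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __add__(self, other):
        return Vec(self.x + other.x, self.y + other.y)
    def __radd__(self, other):
        # sum() starts from 0, so handle 0 + Vec gracefully.
        return self if other == 0 else NotImplemented

total = sum([Vec(1, 2), Vec(3, 4)])
print(total.x, total.y)  # → 4 6

try:
    sum(["a", "b"])      # explicitly forbidden
except TypeError as e:
    print("TypeError:", e)
```

Each addition goes through the generic + protocol (PyNumber_Add at the C level), which is exactly the dispatch cost being discussed.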

The time machine strikes again! Try using Py2.6.
The built-in sum() function is much smarter and faster.
It does in fact use C addition.


Raymond
 
