Rhamphoryncus
Arnaud Delobelle wrote:
sum() works for any sequence of objects with an __add__ method, not
just floats! Your algorithm is specific to floats.
This occurred to me also, but then I tried

sum(['abc', 'efg'], '')   # raises TypeError: sum() can't sum strings

Interesting, I always thought that sum was a shortcut for
reduce(operator.add, ...), but I was mistaken. reduce() is more forgiving:

reduce(operator.add, ['abc', 'efg'], '')  # it works
'abcefg'
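To make that difference concrete, here's a small sketch of both calls (written for modern Python, where reduce lives in functools rather than being a builtin as in 2.5):

```python
from functools import reduce  # builtin in Python 2, moved to functools in Python 3
import operator

# sum() special-cases strings and refuses them outright;
# the TypeError points you at ''.join() instead.
try:
    sum(['abc', 'efg'], '')
except TypeError as e:
    print(e)

# reduce() has no such guard: it just applies operator.add pairwise.
print(reduce(operator.add, ['abc', 'efg'], ''))  # abcefg
```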
Hm, it works for lists:
sum(([1], [2]), [])
[1, 2]
However, I find the second-argument hack ugly.
Does the sum way have any performance advantages over the reduce way?
The relative differences are irrelevant. The important thing is that they
both perform quadratically: each addition builds a brand-new list, copying
everything accumulated so far. That makes them the wrong choice (unless
you can guarantee your inputs are trivial).
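You can see the quadratic copying directly, without a stopwatch. This sketch uses a hypothetical CountingList helper (not from the thread) whose __add__ counts how many elements each addition copies; sum() dispatches to it just as it would to list.__add__:

```python
class CountingList(list):
    """List subclass that tallies elements copied by every __add__ call."""
    copies = 0

    def __add__(self, other):
        # Each addition copies the whole accumulator plus the new chunk.
        CountingList.copies += len(self) + len(other)
        return CountingList(list(self) + list(other))

# 100 sublists of 10 elements each: 1000 elements total...
x = [CountingList([0] * 10) for _ in range(100)]
sum(x, CountingList())

# ...but sum() performs 10*(0+1+...+99) + 100*10 = 50500 element copies,
# because the accumulator is recopied on every step. That sum of an
# arithmetic series is the quadratic term.
print(CountingList.copies)  # 50500
```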
$ python2.5 -m timeit 'x = [[0]*1000]*10' 'sum(x, [])'
1000 loops, best of 3: 566 usec per loop
$ python2.5 -m timeit 'x = [[0]*1000]*100' 'sum(x, [])'
10 loops, best of 3: 106 msec per loop
$ python2.5 -m timeit 'x = [[0]*1000]*1000' 'sum(x, [])'
10 loops, best of 3: 18.1 sec per loop
Per element of the outer list, that's 0.0566, 1.06, 18.1 msec.
Nasty. Use this instead:
$ python2.5 -m timeit -s 'x = [[0]*1000]*10' 'y = []' 'for a in x: y.extend(a)'
10000 loops, best of 3: 80.2 usec per loop
$ python2.5 -m timeit -s 'x = [[0]*1000]*100' 'y = []' 'for a in x: y.extend(a)'
1000 loops, best of 3: 821 usec per loop
$ python2.5 -m timeit -s 'x = [[0]*1000]*1000' 'y = []' 'for a in x: y.extend(a)'
10 loops, best of 3: 24.3 msec per loop
$ python2.5 -m timeit -s 'x = [[0]*1000]*10000' 'y = []' 'for a in x: y.extend(a)'
10 loops, best of 3: 241 msec per loop
$ python2.5 -m timeit -s 'x = [[0]*1000]*100000' 'y = []' 'for a in x: y.extend(a)'
10 loops, best of 3: 2.35 sec per loop
Per element of the outer list, that's 0.00802, 0.00821, 0.0243,
0.0241, 0.0235 msec. At 1000 elements it seemed to cross some
threshold (maybe cache related, maybe allocation related), but overall
it scales linearly. Know your algorithm's complexity.
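For reuse, the extend loop from the timings above is easy to wrap as a helper; itertools.chain.from_iterable does the same job in one call, and both are linear (the function name flatten is just my choice here):

```python
from itertools import chain

def flatten(lists):
    # list.extend appends in amortized O(1) per element, so the total
    # work is O(total number of elements) -- linear, not quadratic.
    y = []
    for a in lists:
        y.extend(a)
    return y

x = [[1, 2], [3], [4, 5, 6]]
print(flatten(x))                    # [1, 2, 3, 4, 5, 6]
print(list(chain.from_iterable(x)))  # same result via itertools
```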