Creating a List of Empty Lists

D

Duncan Booth

The recommended way these days is usually:

a = [ [] for i in range(10) ]

That still has a loop and works by appending empty lists, but at least
its just a single expression. Also you can incorporate the next stage
of your initialisation quite easily:

a = [ for b in range(10) ]

I seem to remember the fastest way to do this was map(list,n*[[]])
from a couple of earlier threads, but is that true in 2.3?


I think I would prefer to maintain code with the list comprehension rather
than the map which, to my eyes, isn't immediately obvious. However, it
would appear that the list comprehension is much faster in any case, so in
this case I believe the clearest solution is also the fastest.

C:\>d:\python23\lib\timeit.py "[ [] for i in range(10) ]"
10000 loops, best of 3: 28.6 usec per loop

C:\>d:\python23\lib\timeit.py "map(list,10*[[]])"
10000 loops, best of 3: 102 usec per loop

C:\>d:\python23\lib\timeit.py -s ll=list -s mm=map "mm(ll,10*[[]])"
10000 loops, best of 3: 91.9 usec per loop
 
R

Robin Becker

unfortunately all of these tests are slower than 1 clock tick on my
machine. I believe the may be biassed.
 
R

Robin Becker

Duncan Booth's prompted me to repeat all the nonsense about empty lists
of a few years ago. I am amazed at the differences between pythons.
Lambda is now faster than list!!! I would really like to know why
list([]) is so much slower than list(()). Clearly comprehensions are now
fast, but still slower than the corresponding map with a lambda.


My results all obtained on the same win2k sp4 machine.
C:\tmp>\python20\python ttt.py
list () = 2.09
list [] = 1.19
comprehension = 1.97
copy = 4.69
cCopy.copy = 2.09
lambda z: z[:] = 1.56
lambda z: list(z) = 2.66
lambda z: [] = 1.42

C:\tmp>\python21\python ttt.py
list () = 2.33
list [] = 1.23
comprehension = 1.78
copy = 4.34
cCopy.copy = 2.22
lambda z: z[:] = 1.55
lambda z: list(z) = 2.33
lambda z: [] = 1.41

C:\tmp>\python22\python ttt.py
list () = 3.22
list [] = 1.33
comprehension = 1.59
copy = 4.05
cCopy.copy = 2.13
lambda z: z[:] = 1.69
lambda z: list(z) = 2.55
lambda z: [] = 1.64

C:\tmp>\python23\python ttt.py
list () = 1.73
list [] = 3.22
comprehension = 1.00
copy = 3.94
cCopy.copy = 1.59
lambda z: z[:] = 1.14
lambda z: list(z) = 4.77
lambda z: [] = 0.95

############### ttt.py
import time
s = 'list ()'
t0=time.time()
for y in xrange(1000):
x = map(list,1000*[()])
t1 = time.time()
print "%-20s = %.2f" % (s,(t1-t0))
assert x[0]==[] and x[0] is not x[-1], "%s failed identity" % s

s = 'list []'
t0=time.time()
for y in xrange(1000):
x = map(list,1000*[[]])
t1 = time.time()
print "%-20s = %.2f" % (s,(t1-t0))
assert x[0]==[] and x[0] is not x[-1], "%s failed identity" % s

s = "comprehension"
t0=time.time()
for y in xrange(1000):
x = [[] for i in xrange(1000)]
t1 = time.time()
print "%-20s = %.2f" % (s,(t1-t0))
assert x[0]==[] and x[0] is not x[-1], "%s failed identity" % s

from copy import copy
s = 'copy'
t0=time.time()
for y in xrange(1000):
x = map(copy,1000*[[]])
t1 = time.time()
print "%-20s = %.2f" % (s,(t1-t0))
assert x[0]==[] and x[0] is not x[-1], "%s failed identity" % s

try:
from cCopy import copy as ccopy
s = 'cCopy.copy'
t0=time.time()
for y in xrange(1000):
x = map(ccopy,1000*[[]])
t1 = time.time()
print "%-20s = %.2f" % (s,(t1-t0))
assert x[0]==[] and x[0] is not x[-1], "%s failed identity" % s
except ImportError:
pass

s = 'lambda z: z[:]'
t0=time.time()
for y in xrange(1000):
x = map(lambda z: z[:],1000*[[]])
t1 = time.time()
print "%-20s = %.2f" % (s,(t1-t0))
assert x[0]==[] and x[0] is not x[-1], "%s failed identity" % s

s = 'lambda z: list(z)'
t0=time.time()
for y in xrange(1000):
x = map(lambda z: list(z),1000*[[]])
t1 = time.time()
print "%-20s = %.2f" % (s,(t1-t0))
assert x[0]==[] and x[0] is not x[-1], "%s failed identity" % s

s = 'lambda z: []'
t0=time.time()
for y in xrange(1000):
x = map(lambda z: [],xrange(1000))
t1 = time.time()
print "%-20s = %.2f" % (s,(t1-t0))
assert x[0]==[] and x[0] is not x[-1], "%s failed identity" % s
 
F

Fredrik Lundh

Robin said:
C:\tmp>\python22\python ttt.py
list () = 3.22
list [] = 1.33
lambda z: list(z) = 2.55
C:\tmp>\python23\python ttt.py
list () = 1.73
list [] = 3.22
lambda z: list(z) = 4.77

is this part of the benchmark stable, or was your computer doing
something else in the background? I find it a bit disturbing that a
2.3 optimization would slow down a trivial operation that much...

</F>
 
D

Duncan Booth

Duncan Booth's prompted me to repeat all the nonsense about empty lists
of a few years ago. I am amazed at the differences between pythons.
Lambda is now faster than list!!! I would really like to know why
list([]) is so much slower than list(()). Clearly comprehensions are now
fast, but still slower than the corresponding map with a lambda.

May I ask you to repeat your tests but putting all the code inside
functions, please? I think it is a bit unfair to compare a list
comprehension with a map/lambda and force the list comprehension to access
a global variable every time round the loop.

I think if you let the list comprehension use a local variable it should
comfortably beat the best of your lambdas (at least it does on my system).

By the way, I didn't understand your earlier comment:
unfortunately all of these tests are slower than 1 clock tick on my
machine. I believe the may be biassed.

You obviously think something was wrong with my earlier timings, but I
can't understand from this what the problem was; timeit.py does some
reasonably clever things to try to give an accurate answer including
varying the number of times it repeats the test to ensure that the time per
loop is based on a reasonably large time interval.
 
R

Robin Becker

Robin said:
C:\tmp>\python22\python ttt.py
list () = 3.22
list [] = 1.33
lambda z: list(z) = 2.55
C:\tmp>\python23\python ttt.py
list () = 1.73
list [] = 3.22
lambda z: list(z) = 4.77

is this part of the benchmark stable, or was your computer doing
something else in the background? I find it a bit disturbing that a
2.3 optimization would slow down a trivial operation that much...

</F>
No this is stable I also find it really weird.
 
R

Robin Becker

You obviously think something was wrong with my earlier timings, but I
can't understand from this what the problem was; timeit.py does some
reasonably clever things to try to give an accurate answer including
varying the number of times it repeats the test to ensure that the time per
loop is based on a reasonably large time interval.
not at all, I was puzzled by the reversal of earlier stuff. You're right
about the comprehension as well, I had thought the variable was
internal, but making it local gives the comprehension the edge.

C:\tmp>\python23\python ttt.py
list () = 1.72
list [] = 3.27
comprehension = 0.81
copy = 3.94
cCopy.copy = 1.59
lambda z: z[:] = 1.14
lambda z: list(z) = 4.73
lambda z: [] = 0.95
f=lambda z: [] = 0.95

like /F though I am extremely puzzled about the list result.
 
D

Duncan Booth

C:\tmp>\python23\python ttt.py
list () = 1.72
list [] = 3.27
comprehension = 0.81
copy = 3.94
cCopy.copy = 1.59
lambda z: z[:] = 1.14
lambda z: list(z) = 4.73
lambda z: [] = 0.95
f=lambda z: [] = 0.95

like /F though I am extremely puzzled about the list result.

I just had a look at listobject.c, in particular the code for list_fill.

static int
list_fill(PyListObject *result, PyObject *v)
{
PyObject *it; /* iter(v) */
int n; /* guess for result list size */
int i;

n = result->ob_size;

/* Special-case list(a_list), for speed. */
if (PyList_Check(v)) {
if (v == (PyObject *)result )
return 0; /* source is destination, we're done */
return list_ass_slice(result, 0, n, v);
}

/* Empty previous contents */
if (n != 0) {
if (list_ass_slice(result, 0, n, (PyObject *)NULL) != 0)
return -1;
}

/* Get iterator. There may be some low-level efficiency to be gained
* by caching the tp_iternext slot instead of using PyIter_Next()
* later, but premature optimization is the root etc.
*/
... and so on
}

Timing 'python \python23\lib\timeit.py "list([])"' gave times around
4.5uSec per loop, whereas the same with "list(())" gives about 2.3uSec per
loop.

Putting #ifdef 0/#endif around the 'special case for speed' block makes the
first case go from 4.5uS to 2.4uS, and the second case also speeds up
marginally for good measure.

So it looks suspiciously like a premature optimisation 'for speed' should
say 'for reduced speed'.

Next I tried:
python \python23\lib\timeit.py -s"r=range(10000)" "list(r)"

and
python \python23\lib\timeit.py -s"r=tuple(range(10000))" "list(r)"

For this length of list the optimisation is a clear winner.

It looks to me as though the breakeven is around 100 elements in the list.
With fewer than 100 elements the optimisation slows things down, with more
it speeds them up.

I put the tests in a batch file:

------ test.bat --------
@echo off
@echo Length: 0
python \python23\lib\timeit.py -s"r=[]" "list(r)"
python \python23\lib\timeit.py -s"r=()" "list(r)"
@echo Length: 1
python \python23\lib\timeit.py -s"r=range(1)" "list(r)"
python \python23\lib\timeit.py -s"r=tuple(range(1))" "list(r)"
@echo Length: 10
python \python23\lib\timeit.py -s"r=range(10)" "list(r)"
python \python23\lib\timeit.py -s"r=tuple(range(10))" "list(r)"
@echo Length: 100
python \python23\lib\timeit.py -s"r=range(100)" "list(r)"
python \python23\lib\timeit.py -s"r=tuple(range(100))" "list(r)"
@echo Length: 1000
python \python23\lib\timeit.py -s"r=range(1000)" "list(r)"
python \python23\lib\timeit.py -s"r=tuple(range(1000))" "list(r)"
@echo Length: 10000
python \python23\lib\timeit.py -s"r=range(10000)" "list(r)"
python \python23\lib\timeit.py -s"r=tuple(range(10000))" "list(r)"
------------------------
With the original code I get:

C:\Pythonsrc\python\dist\src\PCbuild>test
Length: 0
100000 loops, best of 3: 4.38 usec per loop
100000 loops, best of 3: 2.32 usec per loop
Length: 1
100000 loops, best of 3: 5.26 usec per loop
100000 loops, best of 3: 2.4 usec per loop
Length: 10
100000 loops, best of 3: 5.39 usec per loop
100000 loops, best of 3: 2.84 usec per loop
Length: 100
100000 loops, best of 3: 7.44 usec per loop
100000 loops, best of 3: 7.26 usec per loop
Length: 1000
10000 loops, best of 3: 28.5 usec per loop
10000 loops, best of 3: 50.4 usec per loop
Length: 10000
1000 loops, best of 3: 258 usec per loop
1000 loops, best of 3: 505 usec per loop

With the optimisation code completely removed I get:
C:\Pythonsrc\python\dist\src\PCbuild>test
Length: 0
100000 loops, best of 3: 2.21 usec per loop
100000 loops, best of 3: 2.24 usec per loop
Length: 1
100000 loops, best of 3: 2.34 usec per loop
100000 loops, best of 3: 2.36 usec per loop
Length: 10
100000 loops, best of 3: 2.84 usec per loop
100000 loops, best of 3: 2.85 usec per loop
Length: 100
100000 loops, best of 3: 7.56 usec per loop
100000 loops, best of 3: 7.13 usec per loop
Length: 1000
10000 loops, best of 3: 53.2 usec per loop
10000 loops, best of 3: 50.4 usec per loop
Length: 10000
1000 loops, best of 3: 549 usec per loop
1000 loops, best of 3: 534 usec per loop

With the optimisation tweaked to ignore 0 length lists entirely and only
'optimise' for lists with more than 100 elements:

C:\Pythonsrc\python\dist\src\PCbuild>test
Length: 0
1000000 loops, best of 3: 1.39 usec per loop
100000 loops, best of 3: 2.31 usec per loop
Length: 1
100000 loops, best of 3: 2.28 usec per loop
100000 loops, best of 3: 2.33 usec per loop
Length: 10
100000 loops, best of 3: 2.85 usec per loop
100000 loops, best of 3: 2.84 usec per loop
Length: 100
100000 loops, best of 3: 7.43 usec per loop
100000 loops, best of 3: 7.14 usec per loop
Length: 1000
10000 loops, best of 3: 28.4 usec per loop
10000 loops, best of 3: 50.2 usec per loop
Length: 10000
1000 loops, best of 3: 256 usec per loop
1000 loops, best of 3: 532 usec per loop

The relevant part of list_fill now looks like this:

/* Special-case list(a_list), for speed. */
if (PyList_Check(v)) {
int vsize = ((PyListObject*)v)->ob_size;
if (v == (PyObject *)result || vsize==0)
return 0; /* nothing to copy */
if (vsize > 100)
return list_ass_slice(result, 0, n, v);
}
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,774
Messages
2,569,599
Members
45,163
Latest member
Sasha15427
Top