removing list comprehensions in Python 3.0

John Roth

It's funny how one of the
arguments for removing lambda -- you can do the same by defining a
named function -- does not apply for list comprehensions.

Which is a point a number of people have made many times,
with about as much effect as spitting into the wind.

Making a piece of functionality less convenient simply to
satisfy someone's sense of language esthetics doesn't seem
to me, at least, to be a really good idea.

John Roth
 
Steven Bethard

Devan said:
>>> import timeit
>>> t1 = timeit.Timer('list(i for i in xrange(10))')
>>> t1.timeit()
27.267753024476576
>>> t2 = timeit.Timer('[i for i in xrange(10)]')
>>> t2.timeit()
15.050426800054197
>>> t3 = timeit.Timer('list(i for i in xrange(100))')
>>> t3.timeit()
117.61078097914682
>>> t4 = timeit.Timer('[i for i in xrange(100)]')
>>> t4.timeit()
83.502424470149151

Hrm, okay, so generators are generally faster for iteration, but not
for making lists (at least for small sequences), so list comprehensions stay.

Ahh, thanks. Although, it seems like a list isn't very useful if you
never iterate over it. ;)

Also worth noting that in Python 3.0 it is quite likely that list
comprehensions and generator expressions will have the same underlying
implementation. So while your tests above satisfy my curiosity
(thanks!) they're not really an argument for retaining list
comprehensions in Python 3.0. And list comprehensions won't go away
before then because removing them will break loads of existing code.

STeVe
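
A rough way to rerun Devan's comparison on a current interpreter is sketched below. It assumes Python 3, so `range` stands in for `xrange`. The helper name `time_both` and the reduced repetition count are illustrative choices, not something from the posts above. Absolute numbers depend on the machine and interpreter version, so no expected timings are shown.

```python
import timeit


def time_both(n, number=10_000):
    """Return (genexp_seconds, listcomp_seconds) for building an n-item list.

    Hypothetical helper: `number` is far smaller than timeit's default of
    one million runs so that the comparison finishes quickly.
    """
    genexp = timeit.timeit("list(i for i in range(%d))" % n, number=number)
    listcomp = timeit.timeit("[i for i in range(%d)]" % n, number=number)
    return genexp, listcomp


for n in (10, 100):
    genexp, listcomp = time_both(n)
    print("n=%d  list(genexp): %.4fs  listcomp: %.4fs" % (n, genexp, listcomp))
```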
 
Steven Bethard

Raymond said:
The root of this discussion has been the observation that a list
comprehension can be expressed in terms of list() and a generator
expression.

As George Sakkis already noted, the root of the discussion was actually
the rejection of the dict comprehensions PEP.
However, the former is faster when you actually want a list result

I would hope that in Python 3.0 list comprehensions and generator
expressions would be able to share a large amount of implementation, and
thus that the speed differences would be much smaller. But maybe not...
and many people (including Guido) like the square brackets.
                             ^
                             |
This ------------------------+ of course, is always a valid point. ;)

STeVe
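
The observation Raymond refers to can be illustrated with a small sketch. The sample data is made up for the example; both spellings produce the same list, and only the syntax (and, as discussed in this thread, the speed) differs.

```python
items = [0, 1, "", "a", None, 2.5]

# The list comprehension form.
from_listcomp = [item for item in items if item]

# The same result spelled as list() wrapped around a generator expression.
from_genexp = list(item for item in items if item)

print(from_listcomp == from_genexp)
```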
 
Raymond Hettinger

[Steven Bethard]
I would hope that in Python 3.0 list comprehensions and generator
expressions would be able to share a large amount of implementation, and
thus that the speed differences would be much smaller. But maybe not...

Looking under the hood, you would see that the implementations are
necessarily as different as night and day. Only the API is similar.


Raymond
 
Raymond Hettinger

[Raymond Hettinger]
[George Sakkis]
Similar arguments can be given for dict comprehensions as well.

You'll find that "lever" arguments carry little weight in Python
language design (well, you did X in place Y so now you have to do it
everywhere even if place Z lacks compelling use cases).

For each variant, the balance is different. Yes, of course, list
comprehensions have pros and cons similar to set comprehensions, dict
comps, etc. However, there are marked differences in frequency of use
cases, desirability of having an expanded form, implementation issues,
varying degrees of convenience, etc.

The utility and generality of genexps raise the bar quite high for
these other forms. They would need to be darned frequent and have a
superb performance advantage.

Take it from the set() and deque() guy, we need set, dict, and deque
comprehensions like we need a hole in the head. The constructor with a
genexp does the trick just fine.

Why the balance tips the other way for list comps is both subjective
and subtle. I don't expect to convince you by a newsgroup post.
Rather, I can communicate how one of the core developers perceives the
issue. IMHO, the current design strikes an optimal balance.

'nuff said,


Raymond
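
A minimal sketch of the "constructor with a genexp" pattern Raymond describes, using made-up sample data. The variable names are illustrative only.

```python
from collections import deque

words = ["spam", "eggs", "spam", "ham"]

# set() consumes the generator expression and drops duplicates.
unique_lengths = set(len(w) for w in words)

# dict() accepts an iterable of (key, value) pairs.
length_by_word = dict((w, len(w)) for w in words)

# deque() accepts any iterable, including a generator expression.
upper_queue = deque(w.upper() for w in words)

print(unique_lengths, length_by_word, upper_queue)
```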
 
Steven Bethard

Raymond said:
[Steven Bethard]
I would hope that in Python 3.0 list comprehensions and generator
expressions would be able to share a large amount of implementation, and
thus that the speed differences would be much smaller. But maybe not...

Looking under the hood, you would see that the implementations are
necessarily as different as night and day. Only the API is similar.

Necessarily? It seems like list comprehensions *could* be implemented
as a generator expression passed to the list constructor. They're not
now, and at the moment, changing them to work this way seems like a bad
idea because list comprehensions would take a performance hit. But I
don't understand why the implementations are *necessarily* different.
Could you explain?

STeVe

P.S. The dis.dis output for list comprehensions makes what they're doing
pretty clear. But dis.dis doesn't seem to give me as much information
when looking at a generator expression:

py> def ge(items):
...     return (item for item in items if item)
...
py> dis.dis(ge)
  2           0 LOAD_CONST               1 (<code object <generator expression> at 0116FD20, file "<interactive input>", line 2>)
              3 MAKE_FUNCTION            0
              6 LOAD_FAST                0 (items)
              9 GET_ITER
             10 CALL_FUNCTION            1
             13 RETURN_VALUE

I tried to grep through the dist\src directories for what a generator
expression code object looks like, but without any luck. Any chance you
could point me in the right direction?
 
Bengt Richter

Raymond said:
[Steven Bethard]
I would hope that in Python 3.0 list comprehensions and generator
expressions would be able to share a large amount of implementation, and
thus that the speed differences would be much smaller. But maybe not...

Looking under the hood, you would see that the implementations are
necessarily as different as night and day. Only the API is similar.

Necessarily? It seems like list comprehensions *could* be implemented
as a generator expression passed to the list constructor. They're not
now, and at the moment, changing them to work this way seems like a bad
idea because list comprehensions would take a performance hit. But I
don't understand why the implementations are *necessarily* different.
Could you explain?

STeVe

P.S. The dis.dis output for list comprehensions makes what they're doing
pretty clear. But dis.dis doesn't seem to give me as much information
when looking at a generator expression:

py> def ge(items):
...     return (item for item in items if item)
...
py> dis.dis(ge)
  2           0 LOAD_CONST               1 (<code object <generator expression> at 0116FD20, file "<interactive input>", line 2>)
              3 MAKE_FUNCTION            0
              6 LOAD_FAST                0 (items)
              9 GET_ITER
             10 CALL_FUNCTION            1
             13 RETURN_VALUE

I tried to grep through the dist\src directories for what a generator
expression code object looks like, but without any luck. Any chance you
could point me in the right direction?

>>> import dis
>>> g = ge([1,2,0,3,'',4])
>>> dis.dis(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "d:\python-2.4b1\lib\dis.py", line 46, in dis
    raise TypeError, \
TypeError: don't know how to disassemble generator objects

but:
>>> dis.dis(ge)
  2           0 LOAD_CONST               1 (<code object <generator expression> at 02EE4FA0, file "<stdin>", line 2>)
              3 MAKE_FUNCTION            0
              6 LOAD_FAST                0 (items)
              9 GET_ITER
             10 CALL_FUNCTION            1
             13 RETURN_VALUE
>>> ge.func_code
>>> ge.func_code.co_consts
(None, <code object <generator expression> at 02EE4FA0, file "<stdin>", line 2>)
>>> ge.func_code.co_consts[1]
>>> dis.dis(ge.func_code.co_consts[1])
  2           0 SETUP_LOOP              28 (to 31)
              3 LOAD_FAST                0 ([outmost-iterable])
        >>    6 FOR_ITER                21 (to 30)
              9 STORE_FAST               1 (item)
             12 LOAD_FAST                1 (item)
             15 JUMP_IF_FALSE            8 (to 26)
             18 POP_TOP
             19 LOAD_FAST                1 (item)
             22 YIELD_VALUE
             23 JUMP_ABSOLUTE            6
        >>   26 POP_TOP
             27 JUMP_ABSOLUTE            6
        >>   30 POP_BLOCK
        >>   31 LOAD_CONST               0 (None)
             34 RETURN_VALUE

A little more info, anyway. HTH.

Regards,
Bengt Richter
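
For readers on Python 3, the same digging can be sketched roughly as follows. This assumes Python 3, where `func_code` is spelled `__code__`. Looking the nested code object up by type rather than by index 1 is a choice made for this example, since the position within `co_consts` is an implementation detail.

```python
import dis
import types


def ge(items):
    return (item for item in items if item)


# The generator expression is compiled to its own code object, which is
# stored among the enclosing function's constants.
genexp_code = next(
    const for const in ge.__code__.co_consts
    if isinstance(const, types.CodeType)
)

# A generator object also exposes that code object as gi_code, which
# sidesteps the "don't know how to disassemble generator objects" error.
g = ge([1, 2, 0, 3, "", 4])
print(g.gi_code is genexp_code)

# Collect the opcode names instead of printing the whole listing.
opnames = [instr.opname for instr in dis.get_instructions(genexp_code)]
print(genexp_code.co_name)
print("YIELD_VALUE" in opnames)
```

Calling `dis.dis(genexp_code)` prints the full listing. The exact opcodes vary between interpreter versions, so they will not match the 2.4 output above.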
 
EP

Well, I want to offer a more radical proposal: why not free squared
braces from the burden of representing lists at all? It should be
sufficient to write

list()

From a visual comprehension point of view, I would assert that the square form [] is much easier on the eyes than the subtler curved forms (e.g. "{" and "(").

Burdened with old eyes, small fonts, and an old, inflexible mind (;-), I count lists, [], and list comprehensions among the Python features nearest and dearest to me. But perhaps a more important point for 3.0 would be seamless consistency across the language (e.g. [list(), dict(), tuple()] or [ [], {}, () ] rather than [list(), {}, ()]), reflecting a cohesiveness in both underlying approach and symbology.

Even old guys can adjust to something new that is good and clean.
 
Steven Bethard

Steven said:
py> def ge(items):
...     return (item for item in items if item)
...
  2           0 LOAD_CONST               1 (<code object <generator expression> at 02EE4FA0, file "<stdin>", line 2>)
              3 MAKE_FUNCTION            0
              6 LOAD_FAST                0 (items)
              9 GET_ITER
             10 CALL_FUNCTION            1
             13 RETURN_VALUE
[snip]
dis.dis(ge.func_code.co_consts[1])
  2           0 SETUP_LOOP              28 (to 31)
              3 LOAD_FAST                0 ([outmost-iterable])
        >>    6 FOR_ITER                21 (to 30)
              9 STORE_FAST               1 (item)
             12 LOAD_FAST                1 (item)
             15 JUMP_IF_FALSE            8 (to 26)
             18 POP_TOP
             19 LOAD_FAST                1 (item)
             22 YIELD_VALUE
             23 JUMP_ABSOLUTE            6
        >>   26 POP_TOP
             27 JUMP_ABSOLUTE            6
        >>   30 POP_BLOCK
        >>   31 LOAD_CONST               0 (None)
             34 RETURN_VALUE

Outstanding. Thanks a lot! For comparison, here's the relevant dis.dis
output for list comprehensions.

py> def lc(items):
...     return [item for item in items if item]
...
py> dis.dis(lc)
  2           0 BUILD_LIST               0
              3 DUP_TOP
              4 STORE_FAST               1 (_[1])
              7 LOAD_FAST                0 (items)
             10 GET_ITER
        >>   11 FOR_ITER                24 (to 38)
             14 STORE_FAST               2 (item)
             17 LOAD_FAST                2 (item)
             20 JUMP_IF_FALSE           11 (to 34)
             23 POP_TOP
             24 LOAD_FAST                1 (_[1])
             27 LOAD_FAST                2 (item)
             30 LIST_APPEND
             31 JUMP_ABSOLUTE           11
        >>   34 POP_TOP
             35 JUMP_ABSOLUTE           11
        >>   38 DELETE_FAST              1 (_[1])
             41 RETURN_VALUE

Interestingly, the LC code and the code of a GE's "generator-expression"
code object look quite similar, with basically a LOAD_FAST/LIST_APPEND
replaced by a YIELD_VALUE.

But I don't know byte code well enough to guess how the dangling local
variable in LCs will be eliminated in Python 3.0 (as has been suggested
a number of times). One way to eliminate it would be (as suggested) to
make LCs syntactic sugar for list(<genexp>). But it also looks like it
might be possible to do a DELETE_FAST with an appropriately hidden name...

STeVe
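
One way to check from Python code whether a comprehension's loop variable stays visible after the comprehension finishes is sketched below. The helper name `loop_variable_leaks` is made up for this example, and the sketch assumes no global named `item` exists. On Python 3 interpreters the helper returns False, because the loop variable is no longer visible in the enclosing function. On the 2.x interpreters discussed in this thread the name would still be bound afterwards.

```python
def loop_variable_leaks(items):
    """Report whether `item` is still bound after the list comprehension."""
    kept = [item for item in items if item]
    try:
        item  # only resolvable here if the comprehension leaked its variable
    except NameError:
        return False
    return True


print(loop_variable_leaks([1, 0, 2]))
```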
 
Edvard Majakari

Steven Bethard said:
$ python -m timeit "for x in (i for i in xrange(10)): y = x"
100000 loops, best of 3: 4.75 usec per loop

Yowza! One of the features I really liked in Perl has shored Python island
somewhere in the 2.4'ies, it seems[1]. Thanks for the tip!

PS. In case it wasn't clear what I referred to, it was the ability to run
a given module as a script. Of course you could supply the full path to timeit.py:

$ python2.3 /usr/lib/python2.3/timeit.py \
"for x in [i for i in xrange(10)]: y = x"
100000 loops, best of 3: 9.96 usec per loop

But using -m makes it much more convenient.


Footnotes:
[1] Well, not exactly equal to -M in Perl, but close enough for timing stuff
 
Steven Bethard

Edvard said:
Steven Bethard said:
$ python -m timeit "for x in (i for i in xrange(10)): y = x"
100000 loops, best of 3: 4.75 usec per loop

Yowza! One of the features I really liked in Perl has shored Python island
somewhere in the 2.4'ies, it seems[1]. Thanks for the tip! [snip]
But using -m makes it much more convenient.

Yup, it showed up in Python 2.4. Great, isn't it?

STeVe
 
