What do list comprehensions do that generator expressions don't?

Mike Meyer

Ok, we've added list comprehensions to the language, and seen that
they were good. We've added generator expressions to the language, and
seen that they were good as well.

I'm left a bit confused, though - when would I use a list comp instead
of a generator expression if I'm going to require 2.4 anyway?

Thanks,
<mike
 
Robert Kern

Mike said:
Ok, we've added list comprehensions to the language, and seen that
they were good. We've added generator expressions to the language, and
seen that they were good as well.

I'm left a bit confused, though - when would I use a list comp instead
of a generator expression if I'm going to require 2.4 anyway?

Never. If you really need a list:

list(x*x for x in xrange(10))

Sadly, we can't remove list comprehensions until 3.0.

--
Robert Kern
(e-mail address removed)

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
 
Steven Bethard

Robert said:
Never. If you really need a list

list(x*x for x in xrange(10))

Not quite true. If you discovered the unlikely scenario that the
construction of a list from the generator expression was an efficiency
bottleneck, you might choose a list comprehension -- they're slightly
faster when you really do want a list:

$ python -m timeit "list(x*x for x in xrange(10))"
100000 loops, best of 3: 6.54 usec per loop

$ python -m timeit "[x*x for x in xrange(10)]"
100000 loops, best of 3: 5.08 usec per loop
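
For readers who want to repeat this comparison from a script rather than the shell, here is a minimal sketch using the standard `timeit` module. It assumes a Python 3 interpreter, where `range` plays the role of `xrange`, and the loop counts are arbitrary. The figures will not match the 2.4 numbers above, so the relative speed is something to measure, not to assume.

```python
import timeit

# Both statements build the same ten-element list of squares.
GENEXP_STMT = "list(x*x for x in range(10))"
LISTCOMP_STMT = "[x*x for x in range(10)]"

def best_time(stmt, number=20000, repeat=3):
    """Return the best total time in seconds over `repeat` runs of `number` loops."""
    return min(timeit.repeat(stmt, number=number, repeat=repeat))

genexp_seconds = best_time(GENEXP_STMT)
listcomp_seconds = best_time(LISTCOMP_STMT)

print("list(genexp): %.4f s" % genexp_seconds)
print("[listcomp]:   %.4f s" % listcomp_seconds)
```

Taking the minimum of several runs mirrors the "best of 3" that `python -m timeit` reports.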

STeVe
 
Robert Kern

Steven said:
Robert said:
Never. If you really need a list

list(x*x for x in xrange(10))

Not quite true. If you discovered the unlikely scenario that the
construction of a list from the generator expression was an efficiency
bottleneck, you might choose a list comprehension -- they're slightly
faster when you really do want a list:

$ python -m timeit "list(x*x for x in xrange(10))"
100000 loops, best of 3: 6.54 usec per loop

$ python -m timeit "[x*x for x in xrange(10)]"
100000 loops, best of 3: 5.08 usec per loop

Okay, okay, *almost* never.

However, I don't expect that speed relationship to hold past Python 2.4.

 
jfj

If you want a list right away you'd use a list comprehension.
import random
X = [i for i in something() if somethingelse()]
random.shuffle(X)
print X[23]

On the other hand, generator expressions should be used
only when the code can be written as a pipeline. For example, a filter
over what would otherwise be a very long list:

make_fractal_with_seed(x for x in xrange(100000000) if fibonacci_prime(x))
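
The pipeline idea can be sketched as follows. This is an illustration only: it assumes Python 3 (where `range` is lazy, like `xrange`), `is_multiple_of_seven` is a hypothetical stand-in for `fibonacci_prime`, and `itertools.islice` stands in for a consumer such as `make_fractal_with_seed` that reads only part of the stream.

```python
from itertools import islice

def is_multiple_of_seven(n):
    # Hypothetical stand-in for the post's fibonacci_prime() filter.
    return n % 7 == 0

# The generator expression filters lazily: nothing is computed until a
# consumer asks for the next item, so the huge range is never materialized.
candidates = (x for x in range(100000000) if is_multiple_of_seven(x))

# Hypothetical consumer: it pulls only the first five values, so only a
# handful of numbers are ever tested by the filter.
first_five = list(islice(candidates, 5))
print(first_five)  # [0, 7, 14, 21, 28]
```

With a list comprehension in place of the generator expression, all hundred million candidates would be tested and stored before the consumer saw the first one.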

Never. If you really need a list

list(x*x for x in xrange(10))

Sadly, we can't remove list comprehensions until 3.0.

Why???
Then we should also remove:
x=[] to x=list()
x=[1,2,3] to x=list(1,2,3)

I think "list" is useful only:
1) to subclass it
2) to convert a list/tuple/string to a list, which is
done extremely fast.

But for iterators I find the list comprehension syntax nicer.


jfj
 
jfj

Robert said:
Add "any iterable". Genexps are iterables.

The thing is that when you want to convert a tuple to a list
you already know its size, so you can avoid using append()
and expanding the list gradually. For iterables you can't avoid
appending items until StopIteration, so using list() doesn't have
any advantage.

The OP was about genexps vs. list comprehensions,
but this is about list() vs. list comprehensions.

Robert said:
Possibly. I find them too similar with little enough to choose between them, hence the OP's question.

One solution is to forget about list(). If you want a list use [].
Unless you want to convert a tuple...

I think a better question would be "What do *generator expressions* do
that list comprehensions don't?". And always use list comprehensions
unless you want the extra bit.


jfj
 
Robert Kern

jfj said:
If you want a list right away you'd use a list comprehension.
import random
X = [i for i in something() if somethingelse()]
random.shuffle(X)
print X[23]

On the other hand, generator expressions should be used
only when the code can be written as a pipeline. For example, a filter
over what would otherwise be a very long list:

make_fractal_with_seed(x for x in xrange(100000000) if fibonacci_prime(x))
Never. If you really need a list

list(x*x for x in xrange(10))

Sadly, we can't remove list comprehensions until 3.0.

Why???
Then we should also remove:
x=[] to x=list()
x=[1,2,3] to x=list(1,2,3)

Well, that last one doesn't work. Removing the empty list literal would
be inconsistent.
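
A short sketch of why that last spelling fails, assuming a Python 3 interpreter (the behaviour of `list()` with several positional arguments is the same in 2.x): `list()` takes at most one argument, and that argument must be iterable.

```python
# list() accepts at most one argument, so this raises TypeError.
try:
    list(1, 2, 3)
except TypeError as exc:
    error_message = str(exc)
else:
    error_message = None

# Passing a single iterable (here a tuple) is the working spelling.
from_tuple = list((1, 2, 3))

# The no-argument form does work and is equivalent to the [] literal.
empty = list()
```

So `x = []` has a direct `list()` counterpart, while `x = [1, 2, 3]` only has one if the elements are first wrapped in another container.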
I think "list" is useful only:
1) to subclass it
2) to convert a list/tuple/string to a list, which is
done extremely fast.

Add "any iterable". Genexps are iterables.
But for iterators I find the list comprehension syntax nicer.

Possibly. I find them too similar with little enough to choose between
them, hence the OP's question.

 
Robert Kern

jfj said:
The thing is that when you want to convert a tuple to a list
you know already the size of it and you can avoid using append()
and expanding the list gradually. For iterables you can't avoid
appending items until StopIteration so using list() doesn't have
any advantage.

But no real disadvantage either.
The OP was about genexps vs list comprehensions
but this is about list() vs. list comprehensions.

Yes, and list(genexp) replaces list comprehensions rather handily.
There's very little reason to prefer list comprehensions over the
list(genexp) construction. If anything, it's pretty much a wash. That
was the OP's question:

[mike]
> when would I use a list comp instead of a generator expression if
> I'm going to require 2.4 anyway?
Possibly. I find them too similar with little enough to choose between
them, hence the OP's question.

One solution is to forget about list(). If you want a list use [].
Unless you want to convert a tuple...

I think a better question would be "What do *generator expressions* do
that list comprehensions don't?". And always use list comprehensions
unless you want the extra bit.

<shrug> What we have now are two very similar constructs, one of which
is more general and subsumes nearly all the uses of the other. The
presence of both is confusing, as evidenced by the fact that the OP
asked his question. In the Great Breaking of Backwards Compatibility,
list comprehensions should go away.

 
jfj

Mike said:
As the OP, I can say why I didn't ask those questions.

Sorry. I was referring to the subject line:)
Generator expressions don't build the entire list in memory before you
have to deal with it. This makes it possible to deal with expressions
that are too long to fit in memory.

Which means that the real rule should be always use generator
expressions, unless you *know* the expression will always fit in
memory.

Consider this code, which I also included in my first reply:

x = [i for i in something()]
random.shuffle(x)
x.sort()

Shuffle and sort are two examples where you need *the entire list* to
work. Similarly for a dictionary where the values are small lists.
In this example using a generator buys you *nothing* because you
will immediately build a list.
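
A minimal sketch of the point about shuffling, assuming Python 3 and a hypothetical `something()` that just returns a small range. `random.shuffle()` rearranges its argument in place, so it needs `len()` and item assignment, neither of which a generator provides.

```python
import random

def something():
    # Hypothetical data source standing in for the post's something().
    return range(10)

lazy = (i * i for i in something())

# Shuffling a generator fails: it has no length and cannot be indexed.
try:
    random.shuffle(lazy)
    shuffle_worked = True
except TypeError:
    shuffle_worked = False

# Building the list first is unavoidable here, whichever spelling is used.
x = [i * i for i in something()]
random.shuffle(x)
x.sort()
```

The built-in `sorted()` does accept a generator expression directly, but it builds and returns a full list internally, so nothing is saved in that case either.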

So there are cases where we need the list as the product of an algorithm
and a generator is not good enough. In fact, in my experience with
python so far I'd say that those cases are the most common case.

That is the one question.
The other question is "why not list(generator) instead of [list
comprehension]?"

I guess that lists are *so important* that having a primary language
feature for building them is worth it. On the other hand "list()" is
not a primary operator of the python language. It is merely a builtin
function.


jfj
 
Mike Meyer

jfj said:
I think a better question would be "What do *generator expressions* do
that list comprehensions don't?". And always use list comprehensions
unless you want the extra bit.

As the OP, I can say why I didn't ask those questions.

Generator expressions don't build the entire list in memory before you
have to deal with it. This makes it possible to deal with expressions
that are too long to fit in memory.

Which means that the real rule should be always use generator
expressions, unless you *know* the expression will always fit in
memory.
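
The memory difference can be made visible with a rough sketch like the one below. It assumes Python 3, and `N` is an arbitrary size picked for illustration. `sys.getsizeof` reports only the container object itself, not the integers it refers to, which is still enough to show that the list grows with `N` while the generator object does not.

```python
import sys

N = 100000  # arbitrary size chosen for illustration

squares_list = [x * x for x in range(N)]   # all N results exist at once
squares_gen = (x * x for x in range(N))    # results are produced on demand

# Size of the container objects themselves, in bytes.
list_bytes = sys.getsizeof(squares_list)
gen_bytes = sys.getsizeof(squares_gen)

# A consumer such as sum() can drain the generator one item at a time.
total = sum(squares_gen)

print(list_bytes, gen_bytes)
```

Exact byte counts depend on the interpreter and platform, so only the relative sizes are meaningful.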

Which leads to the obvious question of why the exception.

<mike
 
Jeremy Bowers

Never. If you really need a list

list(x*x for x in xrange(10))

Sadly, we can't remove list comprehensions until 3.0.

Why "remove" them? Instead, we have these things called "comprehensions"
(which, now that I say that, seems a rather odd name), and you can control
whether they result in a list or a generator with () or [].

I don't see why they need to be "removed". Lists are already a special
case of the "only one way to do it" principle ([] vs. list()), and
pragmatically I don't see any reason to remove them here; it doesn't add
comprehensibility, leaving them in doesn't significantly affect the mental
size of the code (the *comprehension* is the hard part, the final form
should be relatively simple), it's not worth breaking that code.
 
Robert Kern

Jeremy said:
Never. If you really need a list

list(x*x for x in xrange(10))

Sadly, we can't remove list comprehensions until 3.0.


Why "remove" them? Instead, we have these things called "comprehensions"
(which, now that I say that, seems a rather odd name), and you can control
whether they result in a list or a generator with () or [].

They are *not* the same thing. They have completely different
implementations although they have very similar syntax and the former
subsumes just about *every* use of the latter. That's pointless except
for maintaining backwards compatibility. The difference in
implementation is one reason why generator expressions were not called
generator comprehensions.

What's more, the list comprehension implementation is warty. It leaks
variables.
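
The leak can be checked with a small sketch like this one. The helper names are hypothetical. Note that the result depends on the interpreter: in Python 2 the loop variable of a list comprehension stays bound in the enclosing scope afterwards, whereas Python 3 gave list comprehensions their own scope, so the leak no longer occurs there. Generator expressions do not leak in either version.

```python
import sys

def listcomp_leaks():
    [item for item in range(3)]
    try:
        item  # bound here only if the comprehension leaked its loop variable
    except NameError:
        return False
    return True

def genexp_leaks():
    list(item for item in range(3))
    try:
        item  # a generator expression keeps its loop variable to itself
    except NameError:
        return False
    return True

print(sys.version_info[0], listcomp_leaks(), genexp_leaks())
```

On a 2.x interpreter the first function is expected to return True; on 3.x both return False.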
I don't see why they need to be "removed". Lists are already a special
case of the "only one way to do it" principle ([] vs. list()), and
pragmatically I don't see any reason to remove them here; it doesn't add
comprehensibility,

Yes it does. The OP was confused as to which to use now that both exist.
This has come up here before.
leaving them in doesn't significantly affect the mental
size of the code (the *comprehension* is the hard part, the final form
should be relatively simple), it's not worth breaking that code.

Like I said, this would only happen at 3.0 when all of your code will
break anyways.

 
Ville Vainio

Jeremy> Why "remove" them? Instead, we have these things called
Jeremy> "comprehensions" (which, now that I say that, seems a
Jeremy> rather odd name), and you can control whether they result
Jeremy> in a list or a generator with () or [].

Still, list comprehensions should be implemented in terms of genexps
to get rid of the LC variable that is visible outside the scope of the
LC.

Jeremy> should be relatively simple), it's not worth breaking that
Jeremy> code.

Well, the code that relies on the dangling variable deserves to break.
 
Bill Mill

Jeremy> Why "remove" them? Instead, we have these things called
Jeremy> "comprehensions" (which, now that I say that, seems a
Jeremy> rather odd name), and you can control whether they result
Jeremy> in a list or a generator with () or [].

Still, list comprehensions should be implemented in terms of genexps
to get rid of the LC variable that is visible outside the scope of the
LC.

+1 . I think that we should still have the form [genexp] , but without
the dangling variable, and implemented with generator expressions. It
seems to me that it is inconsistent if I can do list(genexp) but not
[genexp] , since they are so similar. Once that happens, we can tell
people who ask the OP's question that [genexp] is just another way to
spell list(genexp), and he should use it if he wants the entire list
constructed in memory.
Jeremy> should be relatively simple), it's not worth breaking that
Jeremy> code.

Well, the code that relies on the dangling variable deserves to break.

Agreed.

Peace
Bill Mill
bill.mill at gmail.com
 
Jeremy Bowers

Still, list comprehensions should be implemented in terms of genexps to
get rid of the LC variable that is visible outside the scope of the LC.
+1 . I think that we should still have the form [genexp] , but without the
dangling variable, and implemented with generator expressions. It seems to
me that it is inconsistent if I can do list(genexp) but not [genexp] ,
since they are so similar. Once that happens, we can tell people who ask
the OP's question that [genexp] is just another way to spell list(genexp),
and he should use it if he wants the entire list constructed in memory.

This is what I meant.

Robert Kern says the implementations really differ, but I submit that
is an accident of the order they were created in, not a fundamental
constraint. list(genexp) works today and does almost exactly the same
thing, minus an optimization or two that people are working on
generalizing anyways, which is the right approach. I doubt that Python 3.0
would have two radically different implementations; they'll just have the
genexp implementation, and an optimization for list creation if the list
creation can know the size in advance, regardless of where it came from.

This is not what I meant and I agree with it too. Breaking code depending
on variable leaking is one thing; breaking all code that uses list
comprehensions is quite another. Just because 3.0 is going to break a lot
of code anyhow doesn't mean we should be *gratuitous* about it!
 
Mike Meyer

Jeremy Bowers said:
On Mon, 25 Apr 2005 16:48:46 -0400, Bill Mill wrote:
generalizing anyways, which is the right approach. I doubt that Python 3.0
would have two radically different implementations; they'll just have the
genexp implementation, and an optimization for list creation if the list
creation can know the size in advance, regardless of where it came from.

Why do we have to wait for Python 3.0 for this? Couldn't list
comprehensions and generator expression be unified without breaking
existing code that didn't deserve to be broken?

<mike
 
Jeremy Bowers

Why do we have to wait for Python 3.0 for this? Couldn't list
comprehensions and generator expression be unified without breaking
existing code that didn't deserve to be broken?

We don't; my mentioning 3.0 was just in reference to a previous comment.
In fact it'll probably happen sooner than then, I just have no direct
knowledge. (I know some people were talking about creating iterators that
know how many things they contain, which is the critical optimization, but
I don't know if that's gotten anywhere; I don't track the dev list as it
is too much traffic for my already strained email client :) )
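
As a present-day illustration of "iterators that know how many things they contain": later Python versions standardized a `__length_hint__` protocol, exposed as `operator.length_hint` since Python 3.4. This was not available to the thread's participants, so the sketch below only shows what such a facility looks like, assuming Python 3.4 or newer.

```python
from operator import length_hint  # available since Python 3.4

items = ['a', 'b', 'c']

# A list iterator can report how many items remain...
known = length_hint(iter(items))

# ...while a generator expression cannot, so the default (0) comes back.
unknown = length_hint(c for c in items)

print(known, unknown)  # 3 0
```

A consumer that wants to preallocate can use the hint when it is available and fall back to growing gradually when it is not.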
 
Bengt Richter

"Jeremy" == Jeremy Bowers writes:

Never. If you really need a list

list(x*x for x in xrange(10))

Sadly, we can't remove list comprehensions until 3.0.

Jeremy> Why "remove" them? Instead, we have these things called
Jeremy> "comprehensions" (which, now that I say that, seems a
Jeremy> rather odd name), and you can control whether they result
Jeremy> in a list or a generator with () or [].

Still, list comprehensions should be implemented in terms of genexps
to get rid of the LC variable that is visible outside the scope of the
LC.

+1 . I think that we should still have the form [genexp] , but without
the dangling variable, and implemented with generator expressions. It
seems to me that it is inconsistent if I can do list(genexp) but not
[genexp] , since they are so similar. Once that happens, we can tell
people who ask the OP's question that [genexp] is just another way to
spell list(genexp), and he should use it if he wants the entire list
constructed in memory.
ISTM you have to account for

>>> def foo(g): return g
...
>>> foo(123)
123
>>> foo(c for c in 'abc')
<generator object at 0x...>
>>> [(c for c in 'abc')]
[<generator object at 0x...>]
>>> [c for c in 'abc']
['a', 'b', 'c']

Probably ;-)

Regards,
Bengt Richter
 
Mike Meyer

+1 . I think that we should still have the form [genexp] , but without
the dangling variable, and implemented with generator expressions. It
seems to me that it is inconsistent if I can do list(genexp) but not
[genexp] , since they are so similar. Once that happens, we can tell
people who ask the OP's question that [genexp] is just another way to
spell list(genexp), and he should use it if he wants the entire list
constructed in memory.
ISTM you have to account for

>>> def foo(g): return g
...
>>> foo(123)
123
>>> foo(c for c in 'abc')
<generator object at 0x...>
>>> [(c for c in 'abc')]
[<generator object at 0x...>]
>>> [c for c in 'abc']
['a', 'b', 'c']

Right. But that shouldn't be hard to do. Let genexp stand for a
generator expression/list comprehension without any brackets on it at
all. Then [genexp] is the syntax to expand the list. [(genexp)] is the
syntax to create a list of one element - a generator
object. foo(genexp) will do the right thing.

The question under these circumstances is then: do you want bare
genexp to mean something? Right now, it's a syntax error. But there's
no reason you couldn't have:

y = x for x in stuff

assign a generator object to y.
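
The three bracketed forms discussed here already behave this way, and the bare form is still rejected. A small sketch, assuming Python 3 and using `compile()` only to test the syntax without executing anything:

```python
import types

def foo(g):
    return g

# Brackets directly around the comprehension expand it into a list.
expanded = [c for c in 'abc']

# An extra pair of parentheses makes a one-element list holding a generator.
wrapped = [(c for c in 'abc')]

# As the sole call argument, a generator expression needs no extra parentheses.
passed = foo(c for c in 'abc')

# The bare, unparenthesized form is still a syntax error.
try:
    compile("y = x for x in stuff", "<sketch>", "exec")
    bare_form_compiles = True
except SyntaxError:
    bare_form_compiles = False
```

Writing `y = (x for x in stuff)` with explicit parentheses is the accepted spelling for assigning a generator object to a name.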

<mike
 
