Misuse of list comprehensions?

T

Thomas Bellman

Diez B. Roggisch said:
(e-mail address removed) wrote:
No, it doesn't. It uses append because it refers to itself in the
if-expression. So the append(c) is needed - and thus the assignment
possible but essentially useless.

Yes it does. A list comprehension *always* creates a list. In
this case it will be a list of None, since that is what list.append()
returns. See this:
>>> new=[]
>>> s="This is a foolish use of list comprehensions"
>>> [ new.append(c) for c in s if c not in new ]
[None, None, None, None, None, None, None, None, None, None, None,
None, None, None, None, None, None]

Yes, you do get a correct result in 'new', but you *also* create
a 17 long list with all elements set to None, that is immediately
thrown away.
 
J

John Salerno

I posted this code last night in response to another thread, and after I
posted it I got to wondering if I had misused the list comprehension. Here's
the two examples:

Example 1:
--------------------
def compress(s):
new = []

for c in s:
if c not in new:
new.append(c)
return ''.join(new)
----------------------

Example 2:
------------------------
def compress(s):
new = []
[new.append(c) for c in s if c not in new]
return ''.join(new)
--------------------------

In example 1, the intention to make an in-place change is explicit, and it's
being used as everyone expects it to be used. In example 2, however, I began
to think this might be an abuse of list comprehensions, because I'm not
assigning the result to anything (nor am I even using the result in any
way).

What does everyone think about this? Should list comprehensions be used this
way, or should they only be used to actually create a new list that will
then be assigned to a variable/returned/etc.?
 
D

Diez B. Roggisch

John said:
I posted this code last night in response to another thread, and after I
posted it I got to wondering if I had misused the list comprehension.
Here's the two examples:

Example 1:
--------------------
def compress(s):
new = []

for c in s:
if c not in new:
new.append(c)
return ''.join(new)
----------------------

Example 2:
------------------------
def compress(s):
new = []
[new.append(c) for c in s if c not in new]
return ''.join(new)
--------------------------

In example 1, the intention to make an in-place change is explicit, and
it's being used as everyone expects it to be used. In example 2, however,
I began to think this might be an abuse of list comprehensions, because
I'm not assigning the result to anything (nor am I even using the result
in any way).
What does everyone think about this? Should list comprehensions be used
this way, or should they only be used to actually create a new list that
will then be assigned to a variable/returned/etc.?

the above code is pretty much of a no-no because it has quadratic runtime
behavior.

That being said, I use that idiom myself. But I don't see anything wrong
with using a list-comp as loop-abbreviation. because that is it's actual
purpose. And also it is common in non-functional languages that l-values
aren't always assigned, if the aren't needed. It's the consequence of
having side-effects in a language - and python has them.

Diez
 
B

bearophileHUGS

John Salerno:
What does everyone think about this?

The Example 2 builds a list, that is then thrown away. It's just a
waste of memory (and time).

Bye,
bearophile
 
D

Diez B. Roggisch

John Salerno:

The Example 2 builds a list, that is then thrown away. It's just a
waste of memory (and time).

No, it doesn't. It uses append because it refers to itself in the
if-expression. So the append(c) is needed - and thus the assignment
possible but essentially useless.

Diez
 
B

Bruno Desthuilliers

John Salerno a écrit :
I posted this code last night in response to another thread, and after I
posted it I got to wondering if I had misused the list comprehension.
(snip)
def compress(s):
new = []
[new.append(c) for c in s if c not in new]
return ''.join(new)

As far as I'm concerned (and I'm a big fan of list-comps, generator
expressions etc), this is definitively an abuse. And a waste of
resources since it builds a useless list of 'None'.
 
H

Hrvoje Niksic

Diez B. Roggisch said:
No, it doesn't.

This line:

[new.append(c) for c in s if c not in new]

....throws away the list built by the comprehension itself, composed of
None values (returned by append).
It uses append because it refers to itself in the if-expression. So
the append(c) is needed

The append is not in question here.
 
T

Thomas Bellman

What's that mean, exactly?

It means that running time will increase with the square of the
problem size. 2.000.000 items will take 4 times as long as
1.000.000 items, and 5.000.000 items will take 25 times as long
as 1.000.000 items.
Are you referring to both examples, or just the
second one?

The problem is that the lookup you did to see if an item had
already been seen, is done by a linear search of a list. The
search time is linearly proportional to the number of items on
the list 'new', but since the number of times you do that search
increases with the length of the input, the total runtime for
your function increases even more.

This thus applies to both your examples, since they really do the
same thing.

However, it actually depends on what your input is. For the
runtime to increase with the square of the input length in your
function, the number of *unique* items on the input must be a
constant fraction of the input length. By that I mean that, for
example, 5% of the input items are unique, so that if your input
is 100 items, then the list 'new' will be 5 items at the end of
the function, and if your input is 5.000.000 items, then 'new'
will become 250.000 items long.

This might not be the case in your use. If you for example know
that the input only consists of integers between 0 and 10, then
'new' will never become longer than 11 items, regardless of
whether the input is 20 or 20.000.000.000 items. The runtime of
your function then suddenly becomes linear in the problem size
instead of quadratic.

Even so, maintaining the set of already seen items as a set is
likely to be faster than keeping them as a list, even for a
fairly small number of unique items.

One small exception from this, though. If only maintaining one
data structure (the ordered list 'new' in your code) makes your
working set fit within the processor cache, while maintaining two
data structures (a list for keeping the order, and a set for
searching) will make it *not* fit within the cache, then it
*might* actually be faster to just maintain the list. Unlikely,
but not entirely impossible, and just a small change of the
problem size can change the balance again.
 
D

Diez B. Roggisch

Hrvoje said:
Diez B. Roggisch said:
No, it doesn't.

This line:

[new.append(c) for c in s if c not in new]

...throws away the list built by the comprehension itself, composed of
None values (returned by append).

Ah. You are right of course.

Diez
 
D

Diez B. Roggisch

That being said, I use that idiom myself. But I don't see anything wrong
with using a list-comp as loop-abbreviation. because that is it's actual
purpose. And also it is common in non-functional languages that l-values
aren't always assigned, if the aren't needed. It's the consequence of
having side-effects in a language - and python has them.

After being corrected about missing the construction of a None-containing
list, one needs of course to think about the waste of resources, as a
possible result-list is created in any case.

I personally still don't mind that (if we talk about a few hundred objects a
top) - but it makes a strong point against a general usage.

Diez
 
D

Diez B. Roggisch

Thomas said:
Yes it does. A list comprehension *always* creates a list. In
this case it will be a list of None, since that is what list.append()
returns. See this:

Yep - no idea how that slipped me. I still don't mind the occasional waste
of a list-creation over a more concise looping-construct, but I totally
admit that one has to be aware of this. more than I was....

Diez
 
P

Paul McGuire

I posted this code last night in response to another thread, and after I
posted it I got to wondering if I had misused the list comprehension. Here's
the two examples:

Example 1:
--------------------
def compress(s):
    new = []

    for c in s:
        if c not in new:
            new.append(c)
    return ''.join(new)
----------------------

Example 2:
------------------------
def compress(s):
    new = []
    [new.append(c) for c in s if c not in new]
    return ''.join(new)
--------------------------

In example 1, the intention to make an in-place change is explicit, and it's
being used as everyone expects it to be used. In example 2, however, I began
to think this might be an abuse of list comprehensions, because I'm not
assigning the result to anything (nor am I even using the result in any
way).

What does everyone think about this? Should list comprehensions be used this
way, or should they only be used to actually create a new list that will
then be assigned to a variable/returned/etc.?

Why not make the list comp the actual list you are trying to build?

def compress(s):
seen = set()
new = [c for c in s if c not in seen and (seen.add(c) or True)]
return ''.join(new)

or just:

def compress(s):
seen = set()
return ''.join(c for c in s if c not in seen and (seen.add(c) or
True))

Using the set also gets rid of that nasty quadratic performance
thingy.

-- Paul
 
A

Arnaud Delobelle

Paul McGuire said:
I posted this code last night in response to another thread, and after I
posted it I got to wondering if I had misused the list comprehension. Here's
the two examples:

Example 1:
--------------------
def compress(s):
    new = []

    for c in s:
        if c not in new:
            new.append(c)
    return ''.join(new)
----------------------

Example 2:
------------------------
def compress(s):
    new = []
    [new.append(c) for c in s if c not in new]
    return ''.join(new)
--------------------------

In example 1, the intention to make an in-place change is explicit, and it's
being used as everyone expects it to be used. In example 2, however, I began
to think this might be an abuse of list comprehensions, because I'm not
assigning the result to anything (nor am I even using the result in any
way).

What does everyone think about this? Should list comprehensions be used this
way, or should they only be used to actually create a new list that will
then be assigned to a variable/returned/etc.?

Why not make the list comp the actual list you are trying to build?

def compress(s):
seen = set()
new = [c for c in s if c not in seen and (seen.add(c) or True)]

<split hairs>
Isn't

c not in seen and (seen.add(c) or True)

the same as

seen.add(c) or c not in seen

?
return ''.join(new)

(notice I haven't closed the tag!)
 
S

s0suk3

I posted this code last night in response to another thread, and after I
posted it I got to wondering if I had misused the list comprehension. Here's
the two examples:
Example 1:
for c in s:
if c not in new:
new.append(c)
return ''.join(new)
----------------------
Example 2:
------------------------
def compress(s):
new = []
[new.append(c) for c in s if c not in new]
return ''.join(new)
--------------------------
In example 1, the intention to make an in-place change is explicit, and it's
being used as everyone expects it to be used. In example 2, however, I began
to think this might be an abuse of list comprehensions, because I'm not
assigning the result to anything (nor am I even using the result in any
way).
What does everyone think about this? Should list comprehensions be used this
way, or should they only be used to actually create a new list that will
then be assigned to a variable/returned/etc.?

Why not make the list comp the actual list you are trying to build?

def compress(s):
seen = set()
new = [c for c in s if c not in seen and (seen.add(c) or True)]
return ''.join(new)

or just:

def compress(s):
seen = set()
return ''.join(c for c in s if c not in seen and (seen.add(c) or
True))

Using the set also gets rid of that nasty quadratic performance
thingy.

You don't need all those conditionals. A set differs from a list
precisely in the fact that each element is unique. And since the
function is expecting "s" to be an iterable object, it can be
constructed even without a for loop:

def compress(s):
return list(set(s))

That does the trick.
 
P

Paul McGuire

Paul McGuire said:
I posted this code last night in response to another thread, and after I
posted it I got to wondering if I had misused the list comprehension. Here's
the two examples:
Example 1:
--------------------
def compress(s):
    new = []
    for c in s:
        if c not in new:
            new.append(c)
    return ''.join(new)
----------------------
Example 2:
------------------------
def compress(s):
    new = []
    [new.append(c) for c in s if c not in new]
    return ''.join(new)
--------------------------
In example 1, the intention to make an in-place change is explicit, and it's
being used as everyone expects it to be used. In example 2, however, I began
to think this might be an abuse of list comprehensions, because I'm not
assigning the result to anything (nor am I even using the result in any
way).
What does everyone think about this? Should list comprehensions be used this
way, or should they only be used to actually create a new list that will
then be assigned to a variable/returned/etc.?
Why not make the list comp the actual list you are trying to build?
def compress(s):
    seen = set()
    new = [c for c in s if c not in seen and (seen.add(c) or True)]

<split hairs>
Isn't

    c not in seen and (seen.add(c) or True)

the same as

    seen.add(c) or c not in seen

?
    return ''.join(new)

(notice I haven't closed the tag!)

Unfortunately, no. "seen.add(c) or c not in seen" will never return
true, since c gets added to seen before testing if c in seen.

-- Paul
 
P

Paul McGuire

You don't need all those conditionals. A set differs from a list
precisely in the fact that each element is unique. And since the
function is expecting "s" to be an iterable object, it can be
constructed even without a for loop:

def compress(s):
    return list(set(s))

That does the trick.

Only if order does not need to be maintained. list(set(s)) will not
necessarily keep the unique characters in the order they are seen.
We'll have to check with the OP to see if this is important (I just
assumed that it was because of the use of list comps).

-- Paul
 
P

Paul Hankin

def compress(s):
seen = set()
return ''.join(c for c in s if c not in seen and (seen.add(c) or
True))

Slightly nicer is to move the set add out of the conditional...

def compress(s):
seen = set()
return ''.join(seen.add(c) or c for c in s if c not in seen)

I wouldn't write it this way though :)
 
J

John Salerno

Diez B. Roggisch said:
the above code is pretty much of a no-no because it has quadratic runtime
behavior.


What's that mean, exactly? Are you referring to both examples, or just the
second one?
 
A

Arnaud Delobelle

Paul McGuire said:
Unfortunately, no. "seen.add(c) or c not in seen" will never return
true, since c gets added to seen before testing if c in seen.

-- Paul

Ha you're right of course. But I haven't closed the tag yet, so:

c not in seen and (seen.add(c) or True)

is definitely the same as

c not in seen and not seen.add(c)

which is

not (c in seen or seen.add(c))

:)

</split hairs>
 
J

John Salerno

Diez B. Roggisch said:
After being corrected about missing the construction of a None-containing
list, one needs of course to think about the waste of resources, as a
possible result-list is created in any case.

Yeah, I was already aware of the list of Nones, which is why I asked. Simply
typing the list comprehension without an assignment just *looked* wrong,
too.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,262
Messages
2,571,045
Members
48,769
Latest member
Clifft

Latest Threads

Top