convert loop to list comprehension

bvdp

Please help my poor brain :) Every time I try to do a list
comprehension I find I just don't comprehend ...

Anyway, I have the following bit of code:

seq = [2, 3, 1, 9]
tmp = []
for a in range(len(seq)):
    tmp.extend([a]*seq[a])

which correctly returns:

[0, 0, 1, 1, 1, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3]

Question is, can I do this as a list comprehension?

Thanks!
 
Paul Rubin

seq = [2, 3, 1, 9]
tmp = []
for a in range(len(seq)):
    tmp.extend([a]*seq[a])

which correctly returns:

[0, 0, 1, 1, 1, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3]

Question is, can I do this as a list comprehension?

import operator
x = reduce(operator.add, ([i]*a for i,a in enumerate(seq)), [])
 
Paul Rubin

Paul Rubin said:
Question is, can I do this as a list comprehension?

import operator
x = reduce(operator.add, ([i]*a for i,a in enumerate(seq)), [])


Maybe more in the iterative spirit:

import itertools
seq = [2, 3, 1, 9]
x = itertools.chain(*([i]*a for i,a in enumerate(seq)))
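
Note that itertools.chain gives back a lazy iterator rather than a list, so it needs a list() call before it can be compared with tmp. A minimal sketch, assuming Python 2.6 or later (where chain.from_iterable is available); the names chunks and expanded are only illustrative:

```python
import itertools

seq = [2, 3, 1, 9]

# Repeat each index i exactly seq[i] times, then flatten lazily.
chunks = ([i] * count for i, count in enumerate(seq))
expanded = list(itertools.chain.from_iterable(chunks))

print(expanded)
```

Using chain.from_iterable avoids unpacking the generator with * up front, so the pieces are only built as they are consumed.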
 
Rob Williscroft

bvdp wrote in comp.lang.python:
Please help my poor brain :) Every time I try to do a list
comprehension I find I just don't comprehend ...

Anyway, I have the following bit of code:

seq = [2, 3, 1, 9]
tmp = []
for a in range(len(seq)):
    tmp.extend([a]*seq[a])

which correctly returns:

[0, 0, 1, 1, 1, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3]

Question is, can I do this as a list comprehension?

Well apparently I can, though it did take me 2 goes:

>>> seq = [2, 3, 1, 9]
>>> sum( [ [a]*seq[a] for a in range(len(seq)) ] )

Traceback (most recent call last):
  File "<pyshell#1>", line 1, in -toplevel-
    sum( [ [a]*seq[a] for a in range(len(seq)) ] )
TypeError: unsupported operand type(s) for +: 'int' and 'list'
>>> sum( [ [a]*seq[a] for a in range(len(seq)) ], [] )
[0, 0, 1, 1, 1, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3]

But that's just me showing off what a filthy programmer I am.

My third attempt went like:
>>> [ x for a in a in range(len(seq)) for x in [a] * seq[a] ]

Traceback (most recent call last):
  File "<pyshell#3>", line 1, in -toplevel-
    [ x for a in a in range(len(seq)) for x in [a] * seq[a] ]
TypeError: iteration over non-sequence

Ah, the perils of cut-n-paste (and my, what a strange error).
But my fourth attempt yielded (if that's a pun I do apologise)
this:
>>> [ x for a in range(len(seq)) for x in [a] * seq[a] ]
[0, 0, 1, 1, 1, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3]

Which is possibly something you might have expected.

Note the unrolled version goes:
>>> tmp = []
>>> for a in range(len(seq)):
...     for x in [a] * seq[a]:
...         tmp.append( x )
...
>>> tmp
[0, 0, 1, 1, 1, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3]

IOW a comprehension appends rather than extends. Other than
that, you move the value you're appending from the inside of
the loop(s) to the beginning of the comprehension, then remove
the newlines and colons (inside some brackets, of course).

Rob.
 
bearophileHUGS

Two possible solutions:

seq = [2, 3, 1, 9]

print sum( ([i]*n for i,n in enumerate(seq)), [])

print [i for i, x in enumerate(seq) for _ in xrange(x)]

The second one is probably quite a bit faster.

Bye,
bearophile
 
bvdp

Rob Williscroft wrote:

But my fourth attempt yielded (if that's a pun I do apologise)
this:
[ x for a in range(len(seq)) for x in [a] * seq[a] ]

Ahh, that's the magic ... I didn't understand that one could have
multiple "statements" in this single line. Now, you can't have a Python
line "for a in ... for b in ..." can you? I probably shouldn't even try
to figure out how/why it works in a list comprehension.

Rob Williscroft wrote:

[0, 0, 1, 1, 1, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3]
Which is possibly something you might have expected.

Note the unrolled version goes:
tmp = []
for a in range(len(seq)):
    for x in [a] * seq[a]:
        tmp.append( x )
tmp
[0, 0, 1, 1, 1, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3]

IOW a comprehension appends rather than extends. Other than
that, you move the value you're appending from the inside of
the loop(s) to the beginning of the comprehension, then remove
the newlines and colons (inside some brackets, of course).

Rob.

Cool ... and damn, but you guys are fast with the answers. This appears
to work fine, but in a quick and dirty test it appears that the list
comprehension version takes about 2x as long to run as the original
loop. Is this normal?
 
Paul Rubin

print sum( ([i]*n for i,n in enumerate(seq)), [])


Wow, I had no idea you could do that. After all the discussion about
summing strings, I'm astonished.
 
bvdp

Two possible solutions:

seq = [2, 3, 1, 9]

print sum( ([i]*n for i,n in enumerate(seq)), [])

print [i for i, x in enumerate(seq) for _ in xrange(x)]


Cool as well. So much to learn :)

1. Using an _ is an interesting way to use a throw-away variable. Never
would I think of that ... but, then, I don't do Perl either :)

2. Any reason for xrange() instead of range()?
The second one is probably quite faster.

This seems to be a bit faster than the other suggestion, but is still
quite a bit slower than the original loop. Guess that I'll stick with
the loop for now.

Thanks again!
 
bvdp

Paul said:
print sum( ([i]*n for i,n in enumerate(seq)), [])


Wow, I had no idea you could do that. After all the discussion about
summing strings, I'm astonished.


Me too ... despite what bearophile said, this is faster than the 2nd
example. Nearly as fast as the original loop :)
 
Felipe Almeida Lessa

On 8 Sep 2006 17:37:02 -0700, bvdp said:
1. Using an _ is an interesting way to use a throw-away variable. Never
would I think of that ... but, then, I don't do Perl either :)

It's a kind of convention. For example, Pylint complains about any
variable you set and don't use unless its name is "_".

2. Any reason for xrange() instead of range()

It's faster.
 
Felipe Almeida Lessa

On 08 Sep 2006 17:33:20 -0700, Paul Rubin said:
print sum( ([i]*n for i,n in enumerate(seq)), [])


Wow, I had no idea you could do that. After all the discussion about
summing strings, I'm astonished.


Why? You already had the answer: summing *strings*. Everything but
strings can be summed by sum(). E.g.:

Python 2.4.3 (#2, Apr 27 2006, 14:43:58)
[GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> class x(object):
...     def __add__(self, other):
...         return x(self.a + other.a)
...     def __init__(self, a):
...         self.a = a
...
>>> t = x(10)
>>> sum([t, t])
Traceback (most recent call last):
  ...
TypeError: unsupported operand type(s) for +: 'int' and 'x'
>>> sum([t, t], t)
<__main__.x object at 0x...>
>>> _.a
30
>>> sum([t, t], x(0)).a
20
>>> sum([t, t]*1000, t).a
20010
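
The same point can be reproduced as a small script. This is only a sketch in Python 3 syntax (the thread itself uses Python 2.4), and the class name Amount is a made-up stand-in for the class x in the session above:

```python
class Amount:
    """Hypothetical stand-in for the class x used in the session above."""

    def __init__(self, a):
        self.a = a

    def __add__(self, other):
        return Amount(self.a + other.a)


t = Amount(10)

# Without a start value, sum() begins from the int 0, and 0 + t fails.
try:
    sum([t, t])
    default_start_ok = True
except TypeError:
    default_start_ok = False

# With a compatible start value, sum() simply applies + repeatedly.
with_start = sum([t, t], Amount(0)).a

# Strings are the special case: sum() refuses a str start value outright.
try:
    sum(["a", "b"], "")
    str_start_ok = True
except TypeError:
    str_start_ok = False

print(default_start_ok, with_start, str_start_ok)
```

So the start argument is what makes sum() usable with anything that defines __add__, while strings are rejected deliberately.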
 
MonkeeSage

Cool ... and damn, but you guys are fast with the answers. This appears
to work fine, but in a quick and dirty test it appears that the list
comprehension version takes about 2x as long to run as the original
loop. Is this normal?

You could also do it 'functionally' with map(), but it's ugly and would
fail (in this case) for non-unique list values, since seq.index(x)
returns the position of the first match; it is fast though.

map(lambda x: tmp.extend([seq.index(x)]*x), seq)

Ps. I don't know if xrange is faster ... I thought the difference was
that range created a temporary variable holding a range object and
xrange created an iterator?

Regards,
Jordan
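
To make the caveat about seq.index() concrete, here is a small sketch. The input with a repeated count is hypothetical (the seq in the thread has no duplicates), and plain loops are used so that it behaves the same on Python 2 and Python 3:

```python
seq = [2, 3, 2]  # hypothetical input in which the count 2 appears twice

# Approach from the post: look the position up by value.
by_value = []
for x in seq:
    by_value.extend([seq.index(x)] * x)

# Position taken from enumerate() instead.
by_position = []
for i, x in enumerate(seq):
    by_position.extend([i] * x)

print(by_value)
print(by_position)
```

The lookup by value reports position 0 for the second 2 as well, so the last two entries come out as 0 instead of 2.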
 
Paul Rubin

MonkeeSage said:
Ps. I don't know if xrange is faster...I thought the difference was
that range created a temporary variable holding a range object and
xrange created an iterator?

There's no such thing as a "range object"; range creates a list, which
consumes O(n) memory where n is the number of elements. xrange
creates an xrange object, which is a reusable iterator of sorts.
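
For readers on Python 3, where range behaves the way xrange did in Python 2, the difference can be sketched as follows. The exact sizes printed depend on the interpreter build, so none are claimed here:

```python
import sys

n = 1000000

lazy = range(n)      # Python 3 range: values are computed on demand
eager = list(lazy)   # materialised list: holds all n references

print(sys.getsizeof(lazy), sys.getsizeof(eager))

# Unlike a plain iterator, the range object can be traversed repeatedly.
first_pass = sum(1 for _ in lazy)
second_pass = sum(1 for _ in lazy)
print(first_pass == second_pass == n)

# An iterator obtained from it, by contrast, is used up after one pass.
it = iter(lazy)
consumed = sum(1 for _ in it)
leftover = sum(1 for _ in it)
print(consumed, leftover)
```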
 
MonkeeSage

Paul said:
There's no such thing as a "range object"; range creates a list, which
consumes O(n) memory where n is the number of elements. xrange
creates an xrange object, which is a reusable iterator of sorts.

Aha, thanks for explaining. :)

Regards,
Jordan
 
Carl Banks

Paul said:
print sum( ([i]*n for i,n in enumerate(seq)), [])


Wow, I had no idea you could do that. After all the discussion about
summing strings, I'm astonished.


Me too ... despite what bearophile said, this is faster than the 2nd
example. Nearly as fast as the original loop :)


See if it's still faster if the original sequence is length 1000.
(Hint: it won't be.) Adding lists with sum has the same performance
drawback that adding strings does, so it should be avoided.

sum is for adding numbers; please stick to using it that way.

FWIW, the original loop looked perfectly fine and readable and I'd
suggest going with that over these hacked-up listcomp solutions. Don't
use a listcomp just for the sake of using a listcomp.
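
A rough way to check the scaling claim above, written as a sketch in Python 3 syntax rather than the thread's Python 2. The function names and the test data are made up for illustration, and the timings will vary by machine, so none are quoted:

```python
import timeit


def with_sum(seq):
    # Each + builds a brand-new list, copying everything accumulated so far.
    return sum(([i] * n for i, n in enumerate(seq)), [])


def with_extend(seq):
    # extend() grows a single list in place.
    out = []
    for i, n in enumerate(seq):
        out.extend([i] * n)
    return out


for length in (4, 1000):
    seq = [3] * length  # hypothetical test data
    t_sum = timeit.timeit(lambda: with_sum(seq), number=20)
    t_ext = timeit.timeit(lambda: with_extend(seq), number=20)
    print(length, round(t_sum, 5), round(t_ext, 5))
```

Both functions return the same list; only the amount of copying differs, which is why the gap is expected to widen as the input gets longer.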


Carl Banks
 
bvdp

Carl said:
Paul said:
(e-mail address removed) writes:
print sum( ([i]*n for i,n in enumerate(seq)), [])

Wow, I had no idea you could do that. After all the discussion about
summing strings, I'm astonished.


Me too ... despite what bearophile said, this is faster than the 2nd
example. Nearly as fast as the original loop :)


See if it's still faster if the original sequence is length 1000.
(Hint: it won't be.) Adding lists with sum has the same performance
drawback that adding strings does, so it should be avoided.

sum is for adding numbers; please stick to using it that way.

FWIW, the original loop looked perfectly fine and readable and I'd
suggest going with that over these hacked-up listcomp solutions. Don't
use a listcomp just for the sake of using a listcomp.


Thanks for that, Carl. I think that using the loop is probably what
I'll end up doing. I had no idea that the listcomp thing would be quite
as complicated as it is appearing. I had it in my mind that I was
missing some obvious thing which would create a simple solution :)

Mind you, there are some interesting bits and pieces of code in this
thread!
 
Paul Rubin

Thanks for that, Carl. I think that using the loop is probably what
I'll end up doing. I had no idea that the listcomp thing would be quite
as complicated as it is appearing. I had it in my mind that I was
missing some obvious thing which would create a simple solution :)

I think even if you code a loop, it's still cleaner to use enumerate:


seq = [2, 3, 1, 9]
tmp = []
for i,a in enumerate(seq):
    tmp.extend([i]*a)
 
Simon Forman

Thanks for that, Carl. I think that using the loop is probably what
I'll end up doing. I had no idea that the listcomp thing would be quite
as complicated as it is appearing. I had it in my mind that I was
missing some obvious thing which would create a simple solution :)

Mind you, there are some interesting bits and pieces of code in this
thread!

List (and generator) comprehensions are not as bad as all that,
although it took me a little while to figure them out too. ;-)

They are basically normal for loops with optional if statements:

res = [expression1 for var in some_iter if expression2]

is just like:

res = []
for var in some_iter:
    if expression2:
        res.append(expression1)


More complex comprehensions can be broken down the same way (like Rob
Williscroft did for his fourth attempt):

res = [i for i, x in enumerate(seq) for _ in xrange(x)]

becomes:

res = []
for i, x in enumerate(seq):
    for _ in xrange(x):
        res.append(i)

Doing this can help you puzzle out variables and if statements in
complex list comps, as the following, admittedly contrived, examples
indicate:

R = range(10)

res = []
for n in R:
    if n % 2:
        for m in range(n):
            res.append(m)

print res == [m for n in R if n % 2 for m in range(n)]

res2 = []
for n in R:
    for m in range(n):
        if n % 2:
            res2.append(m)

print res2 == [m for n in R for m in range(n) if n % 2]

res3 = []
for n in R:
    for m in range(n):
        if m % 2:
            res3.append(m)

print res3 == [m for n in R for m in range(n) if m % 2]


# The above prints True three times.

Of course, if your loops get much more complicated than this you should
probably "spell them out" anyway.

HTH,
~Simon
 
Steve Holden

Cool ... and damn, but you guys are fast with the answers. This appears
to work fine, but in a quick and dirty test it appears that the list
comprehension version takes about 2x as long to run as the original
loop. Is this normal?

No hard and fast information, but as a general rule of thumb, replacing
calls to a C function like list.extend() with iteration inside a list
comprehension is very likely to be slower.

regards
Steve
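
One way to test that rule of thumb on your own machine is a quick timeit comparison. This is a sketch in Python 3 syntax with a made-up, longer input; which version wins and by how much depends on the interpreter and on the counts in seq, so no numbers are claimed:

```python
import timeit

seq = [2, 3, 1, 9] * 250  # hypothetical, longer input


def loop_extend():
    tmp = []
    for i, n in enumerate(seq):
        tmp.extend([i] * n)  # one C-level call per element of seq
    return tmp


def nested_comprehension():
    # One interpreted step per element of the result.
    return [i for i, n in enumerate(seq) for _ in range(n)]


print(timeit.timeit(loop_extend, number=200))
print(timeit.timeit(nested_comprehension, number=200))
```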
 
Steve Holden

Carl said:
Paul Rubin wrote:

(e-mail address removed) writes:

print sum( ([i]*n for i,n in enumerate(seq)), [])

Wow, I had no idea you could do that. After all the discussion about
summing strings, I'm astonished.

Me too ... despite what bearophile said, this is faster than the 2nd
example. Nearly as fast as the original loop :)


See if it's still faster if the original sequence is length 1000.
(Hint: it won't be.) Adding lists with sum has the same performance
drawback that adding strings does, so it should be avoided.

sum is for adding numbers; please stick to using it that way.

FWIW, the original loop looked perfectly fine and readable and I'd
suggest going with that over these hacked-up listcomp solutions. Don't
use a listcomp just for the sake of using a listcomp.



Thanks for that, Carl. I think that using the loop is probably what
I'll end up doing. I had no idea that the listcomp thing would be quite
as complicated as it is appearing. I had it in my mind that I was
missing some obvious thing which would create a simple solution :)

Your original solution was the simplest, in that it's easy to understand
and maintain.

Mind you, there are some interesting bits and pieces of code in this
thread!

Right. But Python in general is designed to encourage straightforward
expression of straightforward ideas, while retaining the ability to
solve complex problems in innovative ways.

regards
Steve
 
