Do this as a list comprehension?

John Salerno

Is it possible to write a list comprehension for this so as to produce a
list of two-item tuples?

base_scores = range(8, 19)
score_costs = [0, 1, 1, 1, 1, 1, 1, 2, 2, 3, 3]
print zip(base_scores, score_costs)

I can't think of how the structure of the list comprehension would work
in this case, because it seems to require iteration over two separate
sequences to produce each item in the tuple.

zip seems to work fine anyway, but my immediate instinct was to try a
list comprehension (until I couldn't figure out how!). And I wasn't sure
if list comps were capable of doing everything a zip could do.

Thanks.
 
Mensanator

base_scores = range(8, 19)
score_costs = [0, 1, 1, 1, 1, 1, 1, 2, 2, 3, 3]
print zip(base_scores, score_costs)

s = [(i+8,j) for i,j in enumerate( [0, 1, 1, 1, 1, 1, 1, 2, 2, 3, 3])]
print s

##>>>
##[(8, 0), (9, 1), (10, 1), (11, 1), (12, 1), (13, 1), (14, 1), (15, 2), (16, 2), (17, 3), (18, 3)]
##[(8, 0), (9, 1), (10, 1), (11, 1), (12, 1), (13, 1), (14, 1), (15, 2), (16, 2), (17, 3), (18, 3)]
##>>>
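
For readers on Python 3, where print is a function and zip returns an iterator, the two approaches above can be compared side by side. This is a sketch added for illustration, not part of the original post. The `start=8` form of `enumerate` is an extra that needs Python 2.6 or later.

```python
base_scores = range(8, 19)
score_costs = [0, 1, 1, 1, 1, 1, 1, 2, 2, 3, 3]

# zip pairs items by position; list() materialises the iterator in Python 3.
zipped = list(zip(base_scores, score_costs))

# The enumerate-based list comprehension from the post above.
shifted = [(i + 8, cost) for i, cost in enumerate(score_costs)]

# enumerate accepts a start value, which avoids the manual i + 8.
started = list(enumerate(score_costs, start=8))

print(zipped == shifted == started)
```

All three build the same list of two-item tuples, so the choice is mostly about which one states the intent most clearly.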
 
Terry Reedy

| > Is it possible to write a list comprehension for this so as to produce a
| > list of two-item tuples?
| >
| > base_scores = range(8, 19)
| > score_costs = [0, 1, 1, 1, 1, 1, 1, 2, 2, 3, 3]
| > print zip(base_scores, score_costs)
| >
| > I can't think of how the structure of the list comprehension would work
| > in this case, because it seems to require iteration over two separate
| > sequences to produce each item in the tuple.

Which is exactly the purpose of zip, or its specialization enumerate!

| > zip seems to work fine anyway, but my immediate instinct was to try a
| > list comprehension (until I couldn't figure out how!). And I wasn't sure
| > if list comps were capable of doing everything a zip could do.
|
| base_scores = range(8, 19)
| score_costs = [0, 1, 1, 1, 1, 1, 1, 2, 2, 3, 3]
| print zip(base_scores, score_costs)
|
| s = [(i+8,j) for i,j in enumerate( [0, 1, 1, 1, 1, 1, 1, 2, 2, 3, 3])]
| print s
|
| ##>>>
| ##[(8, 0), (9, 1), (10, 1), (11, 1), (12, 1), (13, 1), (14, 1), (15, 2), (16, 2), (17, 3), (18, 3)]
| ##[(8, 0), (9, 1), (10, 1), (11, 1), (12, 1), (13, 1), (14, 1), (15, 2), (16, 2), (17, 3), (18, 3)]
| ##>>>

Of course, enumerate(iterable) is just a facade over zip(itertools.count(),
iterable)
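
The equivalence claimed here can be checked directly. The snippet below is a small illustrative sketch in Python 3 syntax, not code from the thread:

```python
import itertools

score_costs = [0, 1, 1, 1, 1, 1, 1, 2, 2, 3, 3]

# enumerate pairs each item with a count starting at 0 ...
via_enumerate = list(enumerate(score_costs))

# ... which is what zipping with an unbounded counter produces as well.
# zip stops as soon as score_costs is exhausted, so count() never runs away.
via_zip = list(zip(itertools.count(), score_costs))

print(via_enumerate == via_zip)
```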
 
dwahli

Of course, enumerate(iterable) is just a facade over zip(itertools.count(),
iterable)

So you could write:
import itertools
gen = (x for x in itertools.izip(itertools.count(8),
                                 [0, 1, 1, 1, 1, 1, 1, 2, 2, 3, 3]))
print list(gen)

Using zip like your own example is the best option.

If you have a huge amount of data and only want to iterate over the
result, using a generator is probably better:
gen = (x for x in itertools.izip(itertools.count(8),
                                 [0, 1, 1, 1, 1, 1, 1, 2, 2, 3, 3]))
for i, j in gen:
    ... your code here ...
 
Marc 'BlackJack' Rintsch

Of course, enumerate(iterable) is just a facade over zip(itertools.count(),
iterable)

So you could write:
gen = (x for x in itertools.izip(itertools.count(8), [0, 1, 1, 1, 1,
1, 1, 2, 2, 3, 3]))
print list(gen)

Useless use of a generator expression. This:

gen = itertools.izip(itertools.count(8), [0, 1, 1, 1, 1, 1, 1, 2, 2, 3, 3])
print list(gen)

has the same effect without the intermediate generator, which does nothing
but pass the items along.
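
A quick way to confirm the point, sketched in Python 3 where the built-in zip is already lazy in the way itertools.izip was in Python 2 (this snippet is an illustration, not part of the original post):

```python
import itertools

score_costs = [0, 1, 1, 1, 1, 1, 1, 2, 2, 3, 3]

# The generator expression only forwards each tuple unchanged.
wrapped = (x for x in zip(itertools.count(8), score_costs))

# The zip iterator on its own yields exactly the same tuples.
direct = zip(itertools.count(8), score_costs)

wrapped_items = list(wrapped)
direct_items = list(direct)
print(wrapped_items == direct_items)
```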

Ciao,
Marc 'BlackJack' Rintsch
 
Mensanator

| > Is it possible to write a list comprehension for this so as to produce a
| > list of two-item tuples?
| >
| > base_scores = range(8, 19)
| > score_costs = [0, 1, 1, 1, 1, 1, 1, 2, 2, 3, 3]
| > print zip(base_scores, score_costs)
| >
| > I can't think of how the structure of the list comprehension would work
| > in this case, because it seems to require iteration over two separate
| > sequences to produce each item in the tuple.

Which is exactly the purpose of zip, or its specialization enumerate!

Aren't you overlooking the fact that zip() truncates the output
to the shorter length iterable? And since the OP foolishly
hardcoded his range bounds, zip(base_scores, score_costs) will
silently return the wrong answer if the score_costs list grows.

Surely enumerate() wasn't added to Python with no intention of
ever being used.

| > zip seems to work fine anyway, but my immediate instinct was to try a
| > list comprehension (until I couldn't figure out how!). And I wasn't sure
| > if list comps were capable of doing everything a zip could do.
|
| base_scores = range(8, 19)
| score_costs = [0, 1, 1, 1, 1, 1, 1, 2, 2, 3, 3]
| print zip(base_scores, score_costs)
|
| s = [(i+8,j) for i,j in enumerate( [0, 1, 1, 1, 1, 1, 1, 2, 2, 3, 3])]
| print s
|
| ##>>>
| ##[(8, 0), (9, 1), (10, 1), (11, 1), (12, 1), (13, 1), (14, 1), (15, 2), (16, 2), (17, 3), (18, 3)]
| ##[(8, 0), (9, 1), (10, 1), (11, 1), (12, 1), (13, 1), (14, 1), (15, 2), (16, 2), (17, 3), (18, 3)]
| ##>>>

Of course, enumerate(iterable) is just a facade over zip(itertools.count(),
iterable)

But if all I'm using itertools for is the count() function, why would
I go to the trouble of importing it when I can simply use enumerate()?

Is it a couple orders of magnitude faster?
 
John Salerno

"Mensanator" wrote in message

And since the OP foolishly
hardcoded his range bounds

Hmm, I just love the arrogance of some people. I actually posted a response
to my own thread that asked about this situation of how best to make the
range, but it doesn't seem to have posted.
 
Mensanator

And since the OP foolishly
hardcoded his range bounds

Hmm, I just love the arrogance of some people. I actually posted a response
to my own thread that asked about this situation of how best to make the
range, but it doesn't seem to have posted.

It wasn't meant to be arrogant. Just that you must be careful
with zip() because it will not throw an exception if the two
iterables are of different length (this behaviour is by design)
but simply return tuples for the shorter of the iterables.

Hardcoding the range bounds instead of setting them dynamically
is a classic cause of this type of error. Obviously, you want the
range to start with 8, but what should be the upper bound?
The start plus the length of the other iterable keeping in mind
that if length is 11, last index is 8+10 since counting starts at 0.

So you want

range(8,8+len(score_costs))

Using enumerate() means you don't have to figure this out and
you'll never get an error or bad results that don't make an
error.
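
The silent truncation described above is easy to reproduce. In this Python 3 sketch the appended twelfth cost is a made-up value, used only to make the list grow:

```python
score_costs = [0, 1, 1, 1, 1, 1, 1, 2, 2, 3, 3]
score_costs.append(4)  # hypothetical extra entry, so the list now has 12 items

# Hardcoded bounds: range(8, 19) has only 11 values, so zip drops the new item
# without raising anything.
hardcoded = list(zip(range(8, 19), score_costs))

# Bounds derived from the data: the range always matches the list length.
dynamic = list(zip(range(8, 8 + len(score_costs)), score_costs))

print(len(hardcoded))  # 11
print(len(dynamic))    # 12
print(dynamic[-1])     # (19, 4)
```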
 
Mensanator

Is it possible to write a list comprehension for this so as to produce a
list of two-item tuples?
base_scores = range(8, 19)
score_costs = [0, 1, 1, 1, 1, 1, 1, 2, 2, 3, 3]
print zip(base_scores, score_costs)

score_costs = [(base_scores[i], score_costs[i]) for i in range(len(base_scores))]


What happens if your iterables aren't the same length?
But, I'd rather just use zip. :)

And with zip() you won't get an error, but it won't be correct,
either.
 
Terry Reedy

| > Is it possible to write a list comprehension for this so as to produce a
| > list of two-item tuples?
| >
| > base_scores = range(8, 19)
| > score_costs = [0, 1, 1, 1, 1, 1, 1, 2, 2, 3, 3]
| > print zip(base_scores, score_costs)
| >
| > I can't think of how the structure of the list comprehension would work
| > in this case, because it seems to require iteration over two separate
| > sequences to produce each item in the tuple.

Which is exactly the purpose of zip, or its specialization enumerate!

Aren't you overlooking the fact that zip() truncates the output
to the shorter length iterable?
=========================
<message does not quote correctly>
<me> No.
=========================
And since the OP foolishly
hardcoded his range bounds, zip(base_scores, score_costs) will
silently return the wrong answer if the score_costs list grows.
============
<me> So, to future-proof his code he would do better to use
zip(itertools.count(8), score_costs). I consider this better than using
enumerate to make the wrong pairing (with itertools.count(0)) and then
correcting the mistake.
====================
Surely enumerate() wasn't added to Python with no intention of
ever being used.
========================
<me> Of course not, so why suggest that it was?
However, it was intended for the most common case when one wants to pair
items with counts beginning with 0.
=================================
Of course, enumerate(iterable) is just a facade over zip(itertools.count(),
iterable)

But if all I'm using itertools for is the count() function, why would
I go to the trouble of importing it when I can simply use enumerate()?
====================================
<me>I have no idea. The purpose of enumerate is to be easy.
But it is not so easy when it gives the wrong pairings.
===================================
Is it a couple orders of magnitude faster?
=================================
<me> Perhaps you do not understand 'facade' - the front part or face of
something that you see. I was saying that enumerate is a face on a room
containing zip and itertools.count, or the equivalent code thereof.
Therefore, enumerate is an easy way to do a particular zip, not an
alternative to zip. And there should be no significant performance
difference, certainly for long sequences which make the additional lookups
irrelevant.
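
A sketch of the zip(itertools.count(8), score_costs) suggestion, written for Python 3. The helper name `pair_with_scores` and its `first_score` parameter are invented here for illustration:

```python
import itertools

def pair_with_scores(score_costs, first_score=8):
    """Pair each cost with consecutive scores starting at first_score.

    count() has no upper bound, so the pairing follows the list however
    long it becomes; zip stops when score_costs runs out.
    """
    return list(zip(itertools.count(first_score), score_costs))

print(pair_with_scores([0, 1, 1, 1, 1, 1, 1, 2, 2, 3, 3])[-1])  # (18, 3)
```

Because no stop value is written down anywhere, there is nothing to update when the list of costs grows.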

tjr
 
Mensanator

"Mensanator" wrote in message
| > Is it possible to write a list comprehension for this so as to produce a
| > list of two-item tuples?
| >
| > base_scores = range(8, 19)
| > score_costs = [0, 1, 1, 1, 1, 1, 1, 2, 2, 3, 3]
| > print zip(base_scores, score_costs)
| >
| > I can't think of how the structure of the list comprehension would work
| > in this case, because it seems to require iteration over two separate
| > sequences to produce each item in the tuple.
Which is exactly the purpose of zip, or its specialization enumerate!

Aren't you overlooking the fact that zip() truncates the output
to the shorter length iterable?
=========================
<message does not quote correctly>
<me> No.
=========================
And since the OP foolishly
hardcoded his range bounds, zip(base_scores, score_costs) will
silently return the wrong answer if the score_costs list grows.
============
<me> So, to future proof his code he should better use
zip(itertools.count(8), score_costs).  I consider this better than using
enumerate to make the wrong pairing (with itertools.count(0)) and then
correcting the mistake.

Mistake? How is starting at 0 a mistake? Because .count() can
start at 8, eliminating the i+8 construction? But what if
I wanted to count by twos, or wanted a sequence of cubes that starts
at 8? Can itertools do that? I would say knowing how to
manipulate a 0-based iterable will pay off more in the long
run. If you don't know how to get the index numbers you want
from enumerate(), itertools isn't going to help.
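
For what it's worth, both cases raised here can be written either way. The sketch below uses Python 3 syntax; the `step` argument to itertools.count needs Python 2.7 or 3.1 and later, and the four-item slice is just sample data:

```python
import itertools

costs = [0, 1, 1, 1]  # first four score_costs, used as sample data

# Counting by twos from 8, using the step argument of count().
by_twos = list(zip(itertools.count(8, 2), costs))

# Cubes starting at 8 (2**3): wrap count() in a generator expression.
cubes = (n ** 3 for n in itertools.count(2))
with_cubes = list(zip(cubes, costs))

# The same pairings built from a 0-based enumerate index.
by_twos_enum = [(8 + 2 * i, c) for i, c in enumerate(costs)]
with_cubes_enum = [((i + 2) ** 3, c) for i, c in enumerate(costs)]

print(by_twos)     # [(8, 0), (10, 1), (12, 1), (14, 1)]
print(with_cubes)  # [(8, 0), (27, 1), (64, 1), (125, 1)]
```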
====================
Surely enumerate() wasn't added to Python with no intention of
ever being used.
========================
<me> Of course not, so why suggest that it was?
However, it was intended for the most common case when one wants to pair
items with counts beginning with 0.
=================================


But if all I'm using itertools for is the count() function, why would
I go to the trouble of importing it when I can simply use enumerate()?
====================================
<me>I have no idea.  The purpose of enumerate is to be easy.
But it is not so easy when it gives the wrong pairings.

The same can be said of zip() if you're not careful
about the size of the iterables. There is no substitute
for understanding how things work.
===================================
Is it a couple orders of magnitude faster?
=================================
<me> Perhaps you do not understand 'facade' - the front part or face of
something that you see.  

Well, I also understand the secondary meaning:
2. An artificial or deceptive front
I was saying that enumerate is a face on a room
containing zip and itertools.count, or the equivalent code thereof.

I thought you were trying to imply that to use enumerate()
is to be un-Pythonic, that _real_ programmers always use
itertools.
Therefore, enumerate is an easy way to do a particular zip, not an
alternative to zip.  

Ok, I mistook your use of 'facade' for arrogance,
just as the OP mistook my use of 'foolishly' for
arrogance. Ain't English wonderful?
 
Mensanator

I chose not to consider that case,

That's a bad habit to teach a newbie, isn't it?
since they were the same length in the
original post.

The issue I'm stressing is HOW they got to be
the same size. If I was teaching programming
and the OP turned in that example, I would mark
him down. Not because he got the answer wrong.
Not because he used zip. Not because he failed
to use enumerate or itertools. But because he
hardcoded the array bounds and THAT'S the lesson
the OP should take away from this.
Based on the variable names, it seemed reasonable that
there would always be a 1-to-1 correspondence between
elements of each list.

Yes, reasonable at the time. But this is Python,
not C. Lists can change size under program control,
a feature that's extremely useful. But there's a
price for that usefulness. Practices that made
sense in C (define a constant to set array bounds)
aren't transportable over to systems that simply
don't have the restrictions that require those
practices.
However, if you do

score_costs = [(base_scores[i], score_costs[i])
               for i in range(min(len(base_scores), len(score_costs)))]

then you get exactly what you would get using zip.


And if the iterables change size dynamically, you
could get either a crash due to index out of range
or else silently a wrong answer.
That's one heck of a
long line, though, hence my earlier comment:

I have no problem telling the OP this method.
But I think you should warn him about potential
problems.

Surely you didn't intend to leave him to twist
in the wind so that he learns a thing or two
when his programs crash?
If doing what zip() does makes sense, then just use zip(). Otherwise,
check for the case where the iterables are of different length, and do
the appropriate thing (raise an error, pad the shorter one, whatever).

That's all I'm suggesting. Along with pointing out
that enumerate or itertools can do this for him.
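
The two "appropriate things" mentioned above (raise an error, or pad the shorter one) might look like the following in Python 3. The helper names `zip_checked` and `zip_padded` are made up for this sketch. On Python 3.10 and later, zip(first, second, strict=True) raises on a mismatch without any helper.

```python
import itertools

def zip_checked(first, second):
    """Pair two sequences, raising instead of silently truncating."""
    if len(first) != len(second):
        raise ValueError(
            "length mismatch: %d != %d" % (len(first), len(second)))
    return list(zip(first, second))

def zip_padded(first, second, fill=None):
    """Pair two iterables, padding the shorter one with fill."""
    return list(itertools.zip_longest(first, second, fillvalue=fill))

print(zip_padded(range(8, 11), [0]))  # [(8, 0), (9, None), (10, None)]
```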
 
John Salerno

Mensanator said:
Surely enumerate() wasn't added to Python with no intention of
ever being used.

I see your reasons for preferring enumerate over zip, but I'm wondering
if using enumerate this way isn't a little hackish or artificial. Isn't
the point of enumerate to get the index of a specific item, as well as
that item itself? It seems like arbitrarily altering the index (i+8) is
an abuse of enumerate's true purpose.

Does this make any sense, or is it generally acceptable to use enumerate
like this?
 
Mensanator

I see your reasons for preferring enumerate over zip,

It's not that I prefer it, it's that you specifically
asked for a list comprehension and I gave you one. I use
zip when I need to, but I would never use it with a
range function unless I'm deliberately creating a
subset (where you certainly don't want to use enumerate).
but I'm wondering if using enumerate this way isn't
a little hackish or artificial.

I wouldn't say so. An index can be part of the
data structure - a part that doesn't have to be
stored, it's implied. And just like everything
else, this entails a certain responsibility on
your part that the data you are storing is indexed
correctly.
Isn't the point of enumerate to get the index
of a specific item, as well as that item itself?

I would say that's the main point but not the
only point. It's handy when you have multiple
iterables that are related by their position.

But even in a single iterable the position can
be part of the data structure.
It seems like arbitrarily altering the index (i+8) is
an abuse of enumerate's true purpose.

It's not an abuse because we aren't using (i+8) as
an index, it's simply data. Your point would be
more valid if we turned around and used (i+8) as
an index into another iterable. Not that we couldn't
do it, but we would have to be concerned about
"out of range" errors.
Does this make any sense,

It makes as much sense as range(8,19).
or is it generally acceptable to use enumerate
like this?

I use enumerate frequently, and often the index
isn't used as an index. Let me give you an example:

Say I have a list sv=[1,2,3,4].

What does this list represent? It represents
the contiguous n/2 operations in a sequence from
the Collatz Conjecture. If you know that, you should
immediately ask why I'm not counting the 3n+1
operations? Ah, but I AM counting them. In the CC,
every block of contiguous n/2 operations is separated
by EXACTLY one 3n+1 operation. So the list structure
implies that every number is preceded by a single
3n+1 operation. Thus, I don't have to waste memory
counting the 3n+1 blocks because the count is ALWAYS
one and ALWAYS precedes each number in the list.

Now, there's a real nice function you can derive from
any given list. The function is always the same but
it contains three constants (X,Y,Z) that have to be
derived from the list. X is simply 2**sum(sv). Y is
simply 3**len(sv).

But Z is much trickier. For every term in sv, there
is a term that's a power of 3 times a power of 2,
all of which must be summed to get Z.
sv =[1,2,3,4]
b = [2**i for i in sv]
b
[2, 4, 8, 16]

Here I'm simply using the elements of sv to compute
the powers of 2. Nothing special here.
c = [3**i for i,j in enumerate(sv)]
c
[1, 3, 9, 27]

Here, I'm using the index to find the powers of 3.
Note that j isn't being used. But, I can then
combine them thusly:
d = [3**i*2**j for i,j in enumerate(sv)]
d
[2, 12, 72, 432]

Here the index isn't being used as an index at all,
it's an integral part of the data structure that
can be extracted along with the physical contents
of the list by using enumerate.

Actually, that formula doesn't give the right
answer, it was just a demo. The true formula
would be this:
f = [3**i*2**(sum(sv[:len(sv)-i-1])) for i,j in enumerate(sv)]
f
[64, 24, 18, 27]
Z = sum(f)
Z
133

Although in my actual collatz_functions library I use

for i in xrange(svll,-1,-1):
    z = z + (TWE**i * TWO**sum(itertools.islice(sv,0,svll-i)))

Which I can use to check the above answer.
import collatz_functions as cf
xyz = cf.calc_xyz(sv)
xyz[2]
mpz(133)
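
Putting the three constants together, the computation walked through above can be collected into one function. This is only a sketch in Python 3 using the formulas as given in this post. The name `collatz_constants` is invented here and is not the actual calc_xyz from the collatz_functions library.

```python
def collatz_constants(sv):
    """Return (X, Y, Z) for a list of contiguous n/2 run lengths."""
    x = 2 ** sum(sv)   # X is 2**sum(sv)
    y = 3 ** len(sv)   # Y is 3**len(sv)
    # Z sums a power of 3 (from the position) times a power of 2
    # (from the partial sum of the run lengths before that position).
    z = sum(3 ** i * 2 ** sum(sv[:len(sv) - i - 1])
            for i in range(len(sv)))
    return x, y, z

print(collatz_constants([1, 2, 3, 4]))  # (1024, 81, 133)
```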

Maybe I've gone beyond enumerate's true purpose,
but I find it hard to imagine the designers not
anticipating such usage.
 
Terry Reedy

| Mensanator wrote:
|
| > Surely enumerate() wasn't added to Python with no intention of
| > ever being used.
|
| I see your reasons for preferring enumerate over zip, but I'm wondering
| if using enumerate this way isn't a little hackish or artificial.

It seems to be a difference of personal preference. I see no reason to
write a for loop (statement or expression) when a simple usage of basic
builtins does the same. Mensanator apparently does. So it goes.

Because zip stops when the first iterator is exhausted, the original zip
with range can be pretty well future proofed with a high stop value.

zip(range(9,2000000000), iterable)

Of course, a non-1 step can be added to the range.

tjr
 
Mensanator

Mensanator wrote:

|
| > Surely enumerate() wasn't added to Python with no intention of
| > ever being used.
|
| I see your reasons for preferring enumerate over zip, but I'm wondering
| if using enumerate this way isn't a little hackish or artificial.

It seems to be a difference of personal preference. I see no reason to
write a for loop (statement or expression) when a simple usage of basic
builtins does the same. Mensanator apparently does.

*sigh* I never said a for..loop was preferable.
What I said was the answer to "Can I do this with
a list comprehension?"

I never said you shouldn't use the builtins.

What I DID say was that how the builtins actually
work should be understood and it APPEARED that the
OP didn't understand that. Maybe he understood that
all along but his example betrayed no evidence of
that understanding.
So it goes.

So I was trying to help the guy out. So sue me.
Because zip stops when the first iterator is exhausted, the original zip
with range can be pretty well future proofed with a high stop value.

zip(range(9,2000000000), iterable)

Oh, dear. You didn't actually try this, did you?
Of course, a non-1 step can be added to the range.

I would hope so, otherwise you're going to have
a problem creating a list with two billion integers.
You might want to try xrange() in your production code.
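
The memory concern applies to Python 2, where range() builds a full list. In Python 3, range is a lazy sequence much like the old xrange, so the high-stop-value trick costs nothing there. A sketch in Python 3 (not from the thread):

```python
score_costs = [0, 1, 1, 1, 1, 1, 1, 2, 2, 3, 3]

# A Python 3 range object stores only start, stop and step,
# so no list of two billion integers is ever created.
far_bound = range(8, 2000000000)
print(len(far_bound))  # 1999999992

# zip still stops at the end of the shorter input.
pairs = list(zip(far_bound, score_costs))
print(pairs[-1])  # (18, 3)
```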
 
John Salerno

Mensanator said:
What I DID say was that how the builtins actually
work should be understood and it APPEARED that the
OP didn't understand that. Maybe he understood that
all along but his example betrayed no evidence of
that understanding.

Well, the truth is that I know zip truncates to the shorter of the two
arguments, and also in my case the two arguments would always be the
same length. But it is still helpful for other people to point out to me
potential problems like this, so I can be aware of it the next time I
might want to use zip with unequal length arguments.
 
Paul Hankin

I chose not to consider that case, since they were the same length in the
original post.  Based on the variable names, it seemed reasonable that
there would always be a 1-to-1 correspondence between elements of each
list.

If you think something is true but want to guarantee it, use assert.
Then when your assumption is wrong at least you find out.

assert len(base_scores) == len(score_costs)
score_costs = zip(base_scores, score_costs)
 
Mensanator

Well, the truth is that I know zip truncates to the shorter of the two
arguments,

Ok, sorry I thought otherwise.
and also in my case the two arguments would always be the
same length.

Yes, because you're controlling the source code.
But since lists are mutable, source code literals
don't always control the length of the list.
But it is still helpful for other people to point out to me
potential problems like this,

My intentions were always to be helpful, not arrogant.
Otherwise, I'd just tell you to go RTFM. I do seem to have
a gift for rubbing people the wrong way.

What's important at the end of the day is that you
have a working program, right?
so I can be aware of it the next time I
might want to use zip with unequal length arguments.

There are, of course, situations where that's
desirable, that's why I said such behaviour is
by design. It's much easier to debug a program
that crashes than one that works but gives you
the wrong answer. You'll find the "out of range"
errors readily enough; it's the cases where the list is longer
than the range and the answer comes up short that
are hard to spot.

That previous example I gave you? I've run that
with a list of over 800,000 numbers, far too many
to manually verify. I must have absolute confidence
that the algorithm works correctly. That's what
you should strive for - confidence that things will
work even when you can't manually verify them.
 
