Incorrect scope of list comprehension variables

  • Thread starter Alain Ketterlin

Alain Ketterlin

Hi all,

I've just spent a few hours debugging code similar to this:

d = dict()
for r in [1,2,3]:
    d[r] = [r for r in [4,5,6]]
print d

The problem is that the "r" in d[r] somehow captures the value of the
"r" in the list comprehension, and somehow kills the loop iterator. The
(unexpected) result is {6: [4, 5, 6]}. Changing r to s inside the list
leads to the correct (imo) result.

Is this expected? Is this a known problem? Is it solved in newer
versions?

This is python 2.6.4, on a stock ubuntu 9.10 x86-64 linux box. Let me
know if more detail is needed. Thanks in advance.

-- Alain.
 

Chris Rebert

I've just spent a few hours debugging code similar to this:

d = dict()
for r in [1,2,3]:
   d[r] = [r for r in [4,5,6]]
print d

The problem is that the "r" in d[r] somehow captures the value of the
"r" in the list comprehension, and somehow kills the loop iterator. The
(unexpected) result is {6: [4, 5, 6]}. Changing r to s inside the list
leads to the correct (imo) result.

Is this expected? Is this a known problem? Is it solved in newer
versions?

Quoting http://docs.python.org/reference/expressions.html#id19 :

Footnotes
[1] In Python 2.3 and later releases, a list comprehension "leaks" the
control variables of each 'for' it contains into the containing scope.
However, this behavior is deprecated, and relying on it will not work
in Python 3.0

Cheers,
Chris
 

Steven D'Aprano

Hi all,

I've just spent a few hours debugging code similar to this:

d = dict()
for r in [1,2,3]:
    d[r] = [r for r in [4,5,6]]
print d

This isn't directly relevant to your problem, but why use a list
comprehension in the first place? [r for r in [4,5,6]] is just [4,5,6],
only slower.

I presume that is just a stand-in for a more useful list comp, but I
mention it because I have seen people do exactly that, in real code,
without knowing any better. (I may have even done so myself, once or
twice.)

The problem is that the "r" in d[r] somehow captures the value of the
"r" in the list comprehension, and somehow kills the loop iterator. The
(unexpected) result is {6: [4, 5, 6]}.

Actually, no it doesn't kill the loop at all. You have misinterpreted
what you have seen:
>>> d = dict()
>>> for r in [1,2,3]:
...     print r
...     d[r] = [r for r in [4,5,6]]
...     print d
...
1
{6: [4, 5, 6]}
2
{6: [4, 5, 6]}
3
{6: [4, 5, 6]}


Changing r to s inside the list
leads to the correct (imo) result.

Is this expected? Is this a known problem? Is it solved in newer
versions?

Yes, yes and yes.

It is expected, because list comprehensions leak the variable into the
enclosing scope. Yes, it is a problem, as you have found, although
frankly it is easy enough to make sure your list comp variable has a
unique name. And yes, it is fixed in Python 3.1.
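
For anyone reading along on Python 3, here is a minimal sketch of the original snippet. The only change assumed is `print` as a function call. The comprehension's `r` stays private there, so the loop's `r` is untouched and all three keys survive:

```python
d = dict()
for r in [1, 2, 3]:
    # In Python 3 the comprehension's r lives in its own scope,
    # so the loop's r is still 1, 2, 3 when it is used as the key.
    d[r] = [r for r in [4, 5, 6]]
print(d)  # {1: [4, 5, 6], 2: [4, 5, 6], 3: [4, 5, 6]}
```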
 

Steve Howell

Alain Ketterlin said:
I've just spent a few hours debugging code similar to this:
d = dict()
for r in [1,2,3]:
   d[r] = [r for r in [4,5,6]]
print d

Yes, this has been fixed in later revisions, but I'm curious to know what
led you to believe that a list comprehension created a new scope.  I don't
think that was ever promised.

Common sense about how programming languages should work? As
confirmed by later revisions?
 

Ethan Furman

Steve said:
I've just spent a few hours debugging code similar to this:
d = dict()
for r in [1,2,3]:
    d[r] = [r for r in [4,5,6]]
print d

Yes, this has been fixed in later revisions, but I'm curious to know what
led you to believe that a list comprehension created a new scope. I don't
think that was ever promised.


Common sense about how programming languages should work? As
confirmed by later revisions?

Common sense? About *somebody else's* idea of how a programming
language should work?

Please. Experiment and read the manual.

~Ethan~
 

Alf P. Steinbach

* Ethan Furman:
Steve said:
I've just spent a few hours debugging code similar to this:

d = dict()
for r in [1,2,3]:
    d[r] = [r for r in [4,5,6]]
print d

Yes, this has been fixed in later revisions, but I'm curious to know what
led you to believe that a list comprehension created a new scope. I don't
think that was ever promised.


Common sense about how programming languages should work? As
confirmed by later revisions?

Common sense? About *somebody else's* idea of how a programming
language should work?

Common sense is about practical solutions.

Since there is no practical gain from a list comprehension affecting the
bindings of outside variables, and there correspondingly is a practical pay-off
from list comprehensions not affecting the bindings of outside variables, common
sense is to expect the latter.

It's in the nature of common sense that those who possess this ability often
tend to make the same tentative assumptions when presented with the same
problem. It doesn't mean that they're consulting each other, like your "somebody
else's": it just means that they're applying similar common sense reasoning. So,
there's no great conspiracy.

Please. Experiment and read the manual.

Common sense is applied first, as a heuristic. You really wouldn't want to drill
down into the architect's drawings in order to get office 215 in a building.
First you apply common sense.



Cheers & hth.,

- Alf
 

D'Arcy J.M. Cain

Common sense is applied first, as a heuristic. You really wouldn't want to drill
down into the architect's drawings in order to get office 215 in a building.
First you apply common sense.

Oh goodie, bad analogies. Can I play too?

Getting to office 215 is not analogous to writing a program. It is
analogous to using the program. Writing the program is like building
the office tower. You need to know about the tools and materials that
you are working with. You don't use "common sense" to decide what
materials to use. You study the literature and the specs.
 

Paul Rubin

Alain Ketterlin said:
d[r] = [r for r in [4,5,6]]
The problem is that the "r" in d[r] somehow captures the value of the
"r" in the list comprehension, and somehow kills the loop iterator.

Yes, this is a well known design error in Python 2.x. The 3.x series
fixes this error but introduces other errors of its own. It is evil
enough that I almost always use this syntax instead:

d[r] = list(r for r in [4,5,6])

that works in 3.x and the later releases of 2.x. In early 2.x (maybe up
to 2.2) it throws an error at compile time rather than at run time.
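
A small sketch of the two spellings side by side, written for Python 3, where neither form touches an outer `r`. The names `from_genexp` and `from_listcomp` are illustrative only. On 2.x, the second check is the one that would fail, since the list comprehension rebinds `r` there:

```python
r = "outer"

# A generator expression passed to list() keeps its loop variable private.
from_genexp = list(r for r in [4, 5, 6])
assert r == "outer"

# In Python 3 the list comprehension behaves the same way.
# (In Python 2.x, r would be 6 after this line.)
from_listcomp = [r for r in [4, 5, 6]]
assert r == "outer"

print(from_genexp == from_listcomp)  # True
```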
 

Steven D'Aprano

Common sense? About *somebody else's* idea of how a programming
language should work?

Nevertheless, it is a common intuition that the list comp variable should
*not* be exposed outside of the list comp, and that the for-loop variable
should. Perhaps it makes no sense, but it is very common -- I've never
heard of anyone being surprised that the for-loop variable is exposed,
but I've seen many people surprised by the fact that list-comps do expose
their loop variable.
 

Steven D'Aprano

Alain Ketterlin said:
d[r] = [r for r in [4,5,6]]
The problem is that the "r" in d[r] somehow captures the value of the
"r" in the list comprehension, and somehow kills the loop iterator.

Yes, this is a well known design error in Python 2.x. The 3.x series
fixes this error but introduces other errors of its own.


Oh, do tell?
 

Alain Ketterlin

Alain Ketterlin said:
d = dict()
for r in [1,2,3]:
    d[r] = [r for r in [4,5,6]]
print d

Thanks to Chris and Paul for the details (the list comp. r actually
leaks). I should have found this by myself.

My background is more on functional programming languages, that's why I
thought the list comprehension iterator should be purely local. And yes,
I think a "classical" for-loop iterator should also be local to the
loop, but I understand this may be too counter-intuitive to many :)

-- Alain.
 

Alain Ketterlin

Steven D'Aprano said:
d = dict()
for r in [1,2,3]:
    d[r] = [r for r in [4,5,6]]
print d

This isn't directly relevant to your problem, but why use a list
comprehension in the first place? [r for r in [4,5,6]] is just [4,5,6],
only slower.

Sure. But I've actually spent some time reducing the real code to a
simple illustration of the problem.
The problem is that the "r" in d[r] somehow captures the value of the
"r" in the list comprehension, and somehow kills the loop iterator. The
(unexpected) result is {6: [4, 5, 6]}.

Actually, no it doesn't kill the loop at all. You have misinterpreted
what you have seen:

It kills the iterator, not the loop. Sorry, I used 'kill' with the
meaning it has in compiler textbooks: to assign a new value to a
variable.
It is expected, because list comprehensions leak the variable into the
enclosing scope.

Thanks.

-- Alain.
 

Lie Ryan

Alain Ketterlin said:
d = dict()
for r in [1,2,3]:
    d[r] = [r for r in [4,5,6]]
print d

Thanks to Chris and Paul for the details (the list comp. r actually
leaks). I should have found this by myself.

My background is more on functional programming languages, that's why I
thought the list comprehension iterator should be purely local. And yes,
I think a "classical" for-loop iterator should also be local to the
loop, but I understand this may be too counter-intuitive to many :)

Actually, in other programming languages the loop counter is usually local:

for (int i = 0; i < something; i++) {
....
}
foo(i); // illegal

Python's loop counter leaks for the sake of implementation simplicity:
otherwise Python would have to deal with a multi-layered local namespace.
Currently in Python, the local namespace is just sugar for an array access
(a bit of hand-waving here). In other languages, a {} block is a namespace
and a nested {} block means a nested namespace, even within a single
function; in Python there is only a flat local namespace, and the name
resolver becomes a thousand times simpler (and faster).
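
To make the flat-namespace point concrete, here is a rough sketch in Python 3. The function `last_index` is a made-up name for illustration. The loop variable is an ordinary function-level local, so it is still bound after the loop ends:

```python
def last_index(items):
    for i, _ in enumerate(items):
        pass
    # i is still bound here: the loop body is not a separate namespace.
    # (With an empty `items`, i would never be bound and this would raise.)
    return i

print(last_index("abc"))  # 2

# CPython-specific peek: the loop variable sits alongside the argument
# in the function's single table of local names.
print(last_index.__code__.co_varnames)
```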
 

Gregory Ewing

Lie said:
in python there is only a flat
local namespace and the names resolver becomes a thousand times simpler

No, it doesn't. The compiler already has to deal with multiple
scopes for nested functions. There may be some simplification,
but not a lot.

The main reason is linguistic. Having nested blocks create new
scopes does not fit well with lack of variable declarations.
 

Rolando Espinoza La Fuente

   d[r] = list(r for r in [4,5,6])

This has a slight performance difference. I think it is mainly the
generator's next() call.

In [1]: %timeit list(r for r in range(10000))
100 loops, best of 3: 2.78 ms per loop

In [2]: %timeit [r for r in range(10000)]
100 loops, best of 3: 1.93 ms per loop
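
Outside IPython, roughly the same comparison can be sketched with the standard `timeit` module on Python 3. The variable names and the choice of `number=100` are assumptions for illustration, and the absolute figures will differ by machine and interpreter version:

```python
import timeit

N = 10000

# Time 100 runs of each form; N is passed in through the globals mapping.
genexp_time = timeit.timeit(
    "list(r for r in range(N))", globals={"N": N}, number=100
)
listcomp_time = timeit.timeit(
    "[r for r in range(N)]", globals={"N": N}, number=100
)

print("genexp:   %.4f s" % genexp_time)
print("listcomp: %.4f s" % listcomp_time)
```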

~Rolando
 

Aahz

Nevertheless, it is a common intuition that the list comp variable should
*not* be exposed outside of the list comp, and that the for-loop variable
should. Perhaps it makes no sense, but it is very common -- I've never
heard of anyone being surprised that the for-loop variable is exposed,
but I've seen many people surprised by the fact that list-comps do expose
their loop variable.

I've definitely seen people surprised by the for-loop behavior.
 

Raymond Hettinger

Hi all,

I've just spent a few hours debugging code similar to this:

d = dict()
for r in [1,2,3]:
    d[r] = [r for r in [4,5,6]]
print d

The problem is that the "r" in d[r] somehow captures the value of the
"r" in the list comprehension, and somehow kills the loop iterator. The
(unexpected) result is {6: [4, 5, 6]}. Changing r to s inside the list
leads to the correct (imo) result.

Is this expected? Is this a known problem? Is it solved in newer
versions?

It is the intended behavior in 2.x. The theory was that a list
comprehension would have the same effect as if it had been unrolled
into a regular for-loop.
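
As a rough illustration of that unrolling, here is the original snippet with the comprehension written out as an explicit inner loop. The name `result` is made up for the sketch. A plain for-loop rebinds `r` in every Python version, so this reproduces the 2.x result even on Python 3:

```python
d = {}
for r in [1, 2, 3]:
    result = []
    for r in [4, 5, 6]:  # rebinds the outer r, as the 2.x comprehension did
        result.append(r)
    d[r] = result  # r is 6 by now, on every pass of the outer loop
print(d)  # {6: [4, 5, 6]}
```

The outer loop still runs three times, because its iterator is unaffected; only the name `r` is overwritten before it is used as the key.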

In 3.x, Guido changed his mind and the induction variable is hidden.
The theory is that some folks (like you) expect the variable to be
private and that is what we already do with generator expressions.

There's no RightAnswer(tm), just our best guess as to what is the most
useful behavior for the greatest number of people.

Raymond
 

Steven D'Aprano

I've definitely seen people surprised by the for-loop behavior.

What programming languages were they used to (if any)?

I don't know of any language that creates a new scope for loop variables,
but perhaps that's just my ignorance...
 

Chris Rebert

What programming languages were they used to (if any)?

I don't know of any language that creates a new scope for loop variables,
but perhaps that's just my ignorance...

Well, technically it's the idiomatic placement of the loop variable
declaration rather than the loop construct itself, but:

//Written in Java
//Can trivially be changed to C99 or C++
for (int i = 0; i < array.length; i++)
{
    // code
}
// variable 'i' no longer accessible

//Using a for-each loop specific to Java
for (ItemType item : array)
{
    // code
}
// variable 'item' no longer accessible

Cheers,
Chris
 
