scope of generators, class variables, resulting in global na

Nomen Nescio

Hello,

Can someone help me understand what is wrong with this example?

class T:
    A = range(2)
    B = range(4)
    s = sum(i*j for i in A for j in B)

It produces the exception:

<type 'exceptions.NameError'>: global name 'j' is not defined

The exception above is especially confusing since the following similar example (I just replaced the generator by an explicit array) works:

class T:
    A = range(2)
    B = range(4)
    s = sum([(i*j) for i in A for j in B])

(BTW, the class scope declarations are intentional).

Thanks, Leo.
 
Alf P. Steinbach

* Nomen Nescio:
Hello,

Can someone help me understand what is wrong with this example?

class T:
    A = range(2)
    B = range(4)
    s = sum(i*j for i in A for j in B)

It produces the exception:

<type 'exceptions.NameError'>: global name 'j' is not defined

Which Python implementation are you using?

I can't reproduce the error message that you cite.


<example>
C:\test> py2
Python 2.6.4 (r264:75708, Oct 26 2009, 08:23:19) [MSC v.1500 32 bit (Intel)] on
win32
Type "help", "copyright", "credits" or "license" for more information.
>>> class T:
...   A = range(2)
...   B = range(4)
...   s = sum(i*j for i in A for j in B)
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in T
  File "<stdin>", line 4, in <genexpr>
NameError: global name 'B' is not defined
>>> exit()

C:\test> py3
Python 3.1.1 (r311:74483, Aug 17 2009, 17:02:12) [MSC v.1500 32 bit (Intel)] on
win32
Type "help", "copyright", "credits" or "license" for more information.
>>> class T:
...   A = range(2)
...   B = range(4)
...   s = sum(i*j for i in A for j in B)
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in T
  File "<stdin>", line 4, in <genexpr>
NameError: global name 'B' is not defined
>>> exit()

C:\test> _
</example>

Reason for the NameError:

The above is a /generator expression/, evaluated in a class definition.

The docs have a similar example but I'm sorry, I'm unable to find it! Anyway the
generator expression is evaluated as if its code was put in function. And from
within the scope of that function you can't access the class scope implicitly,
hence, no access to 'B'.
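
A minimal sketch of the point above, assuming a Python 3 interpreter (the class names `Broken` and `Fine` are made up for illustration). Only the outermost iterable of a generator expression is evaluated in the class body; everything else runs inside the implicit function, where class-level names are not visible:

```python
# Assumes Python 3; `Broken` and `Fine` are hypothetical names.
try:
    class Broken:
        A = range(2)
        B = range(4)
        # `A` is evaluated here, in the class body, but `B` is looked up
        # from inside the implicit generator function.
        s = sum(i * j for i in A for j in B)
except NameError as exc:
    error_message = str(exc)

class Fine:
    A = range(2)
    # A single `for` clause only needs the outermost iterable, so it works.
    s = sum(i for i in A)

print(error_message)  # the message names B, not A
print(Fine.s)
```

That is why the traceback complains about 'B' and never about 'A'.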


The exception above is especially confusing since the following similar example
(I just replaced the generator by an explicit array) works:

class T:
    A = range(2)
    B = range(4)
    s = sum([(i*j) for i in A for j in B])

(BTW, the class scope declarations are intentional).

Thanks, Leo.

This is a /list comprehension/, not a generator expression (although
syntactically it's almost the same).

It obeys different rules.

Essentially the generator expression produces a generator object that you may
name or pass around as you wish, while the comprehension is just a syntactical
device for writing more concisely some equivalent code that's generated inline.

However, apparently the rules changed between Python 2.x and Python 3.x.

In Python 3.x also the list comprehension fails in a class definition:


<example>
C:\test> py2
Python 2.6.4 (r264:75708, Oct 26 2009, 08:23:19) [MSC v.1500 32 bit (Intel)] on
win32
Type "help", "copyright", "credits" or "license" for more information.
>>> class T:
...   A = range(2)
...   B = range(4)
...   s = sum([(i*j) for i in A for j in B])
...
>>> exit()

C:\test> py3
Python 3.1.1 (r311:74483, Aug 17 2009, 17:02:12) [MSC v.1500 32 bit (Intel)] on
win32
Type "help", "copyright", "credits" or "license" for more information.
>>> class T:
...   A = range(2)
...   B = range(4)
...   s = sum([(i*j) for i in A for j in B])
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in T
  File "<stdin>", line 4, in <listcomp>
NameError: global name 'B' is not defined
>>> exit()

C:\test> _
</example>
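
The Python 3 half of that transcript can be checked with a small script. This is only a sketch, assuming a Python 3 interpreter; the helper name `in_function` is invented here. The same list comprehension fails in a class body and works in a function body, where the enclosing scope is visible:

```python
# Assumes Python 3; `in_function` is a hypothetical helper name.
try:
    class T:
        A = range(2)
        B = range(4)
        s = sum([i * j for i in A for j in B])
except NameError as exc:
    class_error = exc  # the comprehension cannot see the class-level B

def in_function():
    # Function locals are visible to the nested comprehension.
    A = range(2)
    B = range(4)
    return sum([i * j for i in A for j in B])

print(repr(class_error))
print(in_function())
```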


From one point of view it's good that Py3 provides about the same behavior for
generator expressions and list comprehensions.

But I'd really like the above examples to Just Work. :)


Cheers & hth.,

- Alf
 
b3ng0

Hello,

Can someone help me understand what is wrong with this example?

class T:
  A = range(2)
  B = range(4)
  s = sum(i*j for i in A for j in B)

It produces the exception:

<type 'exceptions.NameError'>: global name 'j' is not defined

The exception above is especially confusing since the following similar example (I just replaced the generator by an explicit array) works:

class T:
  A = range(2)
  B = range(4)
  s = sum([(i*j) for i in A for j in B])

(BTW, the class scope declarations are intentional).

Thanks, Leo.

The best way to mimic the same behavior, while getting around the
scoping issues, would be as follows:

class T:
    A = range(2)
    B = range(4)

    @property
    def s(self):
        return sum(i*j for i in self.A for j in self.B)

T().s will now return 6
 
Jon Clements

* Nomen Nescio:
Can someone help me understand what is wrong with this example?
class T:
  A = range(2)
  B = range(4)
  s = sum(i*j for i in A for j in B)
It produces the exception:
<type 'exceptions.NameError'>: global name 'j' is not defined

Which Python implementation are you using?

I can't reproduce the error message that you cite.

<example>
C:\test> py2
Python 2.6.4 (r264:75708, Oct 26 2009, 08:23:19) [MSC v.1500 32 bit (Intel)] on
win32
Type "help", "copyright", "credits" or "license" for more information.
 >>> class T:
...   A = range(2)
...   B = range(4)
...   s = sum(i*j for i in A for j in B)
...
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "<stdin>", line 4, in T
   File "<stdin>", line 4, in <genexpr>
NameError: global name 'B' is not defined
 >>> exit()

C:\test> py3
Python 3.1.1 (r311:74483, Aug 17 2009, 17:02:12) [MSC v.1500 32 bit (Intel)] on
win32
Type "help", "copyright", "credits" or "license" for more information.
 >>> class T:
...   A = range(2)
...   B = range(4)
...   s = sum(i*j for i in A for j in B)
...
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "<stdin>", line 4, in T
   File "<stdin>", line 4, in <genexpr>
NameError: global name 'B' is not defined
 >>> exit()

C:\test> _
</example>

Reason for the NameError:

The above is a /generator expression/, evaluated in a class definition.

The docs have a similar example but I'm sorry, I'm unable to find it! Anyway the
generator expression is evaluated as if its code was put in function. And from
within the scope of that function you can't access the class scope implicitly,
hence, no access to 'B'.
The exception above is especially confusing since the following similar example
(I just replaced the generator by an explicit array) works:
class T:
  A = range(2)
  B = range(4)
  s = sum([(i*j) for i in A for j in B])
(BTW, the class scope declarations are intentional).
Thanks, Leo.

This is a /list comprehension/, not a generator expression (although
syntactically it's almost the same).

It obeys different rules.

Essentially the generator expression produces a generator object that you may
name or pass around as you wish, while the comprehension is just a syntactical
device for writing more concisely some equivalent code that's generated inline.

However, apparently the rules changed between Python 2.x and Python 3.x.

In Python 3.x also the list comprehension fails in a class definition:

<example>
C:\test> py2
Python 2.6.4 (r264:75708, Oct 26 2009, 08:23:19) [MSC v.1500 32 bit (Intel)] on
win32
Type "help", "copyright", "credits" or "license" for more information.
 >>> class T:
...   A = range(2)
...   B = range(4)
...   s = sum([(i*j) for i in A for j in B])
...
 >>> exit()

C:\test> py3
Python 3.1.1 (r311:74483, Aug 17 2009, 17:02:12) [MSC v.1500 32 bit (Intel)] on
win32
Type "help", "copyright", "credits" or "license" for more information.
 >>> class T:
...   A = range(2)
...   B = range(4)
...   s = sum([(i*j) for i in A for j in B])
...
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "<stdin>", line 4, in T
   File "<stdin>", line 4, in <listcomp>
NameError: global name 'B' is not defined
 >>> exit()

C:\test> _
</example>

 From one point of view it's good that Py3 provides about the same behavior for
generator expressions and list comprehensions.

But I'd really like the above examples to Just Work. :)

Cheers & hth.,

- Alf


s = sum( i*j for i,j in itertools.product(A, B) ) is a possible
workaround.

Which could generalise to (something like):

s = sum( reduce(op.mul, seq) for seq in product(A, B, B, A) )
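
Spelled out with its imports, the workaround might look like the sketch below. It assumes that `op` above stands for the standard `operator` module and that `reduce` comes from `functools` (as it must on Python 3); the attribute name `s4` is made up here. It works in a class body because the `product(...)` call is the outermost iterable, and that is evaluated in the class scope before the generator starts running:

```python
import itertools
import operator
from functools import reduce

class T:
    A = range(2)
    B = range(4)
    # product(A, B) is evaluated in the class body, so A and B are visible.
    s = sum(i * j for i, j in itertools.product(A, B))
    # Generalised form: multiply together every element of each tuple.
    s4 = sum(reduce(operator.mul, seq) for seq in itertools.product(A, B, B, A))

print(T.s)
print(T.s4)
```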

Cheers,

Jon.
 
Arnaud Delobelle

Nomen said:
Hello,

Can someone help me understand what is wrong with this example?

class T:
    A = range(2)
    B = range(4)
    s = sum(i*j for i in A for j in B)

It produces the exception:

<type 'exceptions.NameError'>: global name 'j' is not defined

It's due to scoping rules for classes and/or how generator expressions
are compiled. When a function definition is executed from within the
body of a class, the body of the class doesn't act as an outer scope
for it. E.g.

class A:
    x = 2
    def f(self): return x

When f is defined (technically, when the closure is made), the name
'x' is not bound to 2 but is considered to be in the global namespace
(unless the class is defined within a function for example). Now
generator expressions are defined as generator functions so your
example is akin to something like:

class T:
    A = range(2)
    B = range(4)
    def _gen(L):
        for i in L:
            for j in B:
                yield i*j
    s = sum(_gen(A))

(From memory, I might be wrong on some details)

Now looking at the above, when _gen is defined, B is considered to be
belonging to the global namespace, therefore if it is not defined
there a NameError will be thrown.

Now a safe workaround to this would be:

class T:
    A = range(2)
    B = range(4)
    s = (lambda A=A, B=B: sum(i*j for i in A for j in B))()

The lambda form is evaluated when the body of the class is executed,
binding the names A and B to the objects you want in the generator
expression.
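
As a rough check of that reasoning (a sketch only; the names `x` and `f` in the second half are invented for illustration): default argument values are evaluated once, at the moment the lambda is created, which in the class case is while the class body is still executing.

```python
class T:
    A = range(2)
    B = range(4)
    # The defaults A=A and B=B are evaluated in the class body; inside the
    # lambda, A and B are ordinary local names visible to the generator.
    s = (lambda A=A, B=B: sum(i * j for i in A for j in B))()

# Same mechanism outside a class: the default captures the value of x
# at definition time, so later rebinding has no effect.
x = 1
f = lambda x=x: x
x = 2

print(T.s)
print(f())
```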

I remember suggesting a while ago that all generator expressions be
implicitly wrapped like the one above in order to avoid such
surprises. I can't quite remember what the arguments against were,
but you can probably find them in the archives!
 
Guest

Hi Folks,

Thanks everyone for the great contributions! I understand this better
now. The distinction between a shorthand for a function definition and
a shorthand for a loop iteration is crucial.

Also: thanks for pointing out that even the list comprehension doesn't
work in py3. That was incredibly useful! I was about to build a
package using Python and now (unfortunately) I will have to find
another language. Saved me a big headache!

More details...
I can't reproduce the error message that you cite.

Sorry, I made a cut and paste error in my first post. The error was
exactly the one in your post.
Reason for the NameError:

The above is a /generator expression/, evaluated in a class definition.

The docs have a similar example but I'm sorry, I'm unable to find it! Anyway the
generator expression is evaluated as if its code was put in function. And from
within the scope of that function you can't access the class scope implicitly,
hence, no access to 'B'.
This is a /list comprehension/, not a generator expression (although
syntactically it's almost the same).

It obeys different rules.

Essentially the generator expression produces a generator object that you may
name or pass around as you wish, while the comprehension is just a syntactical
device for writing more concisely some equivalent code that's generated inline.

However, apparently the rules changed between Python 2.x and Python 3.x.

In Python 3.x also the list comprehension fails in a class definition:
C:\test> py3
Python 3.1.1 (r311:74483, Aug 17 2009, 17:02:12) [MSC v.1500 32 bit (Intel)] on
win32
Type "help", "copyright", "credits" or "license" for more information.
 >>> class T:
...   A = range(2)
...   B = range(4)
...   s = sum([(i*j) for i in A for j in B])
...
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "<stdin>", line 4, in T
   File "<stdin>", line 4, in <listcomp>
NameError: global name 'B' is not defined
 >>> exit()

Yuck! Why should the list comprehension fail in Py3? The scope rules
that explain why the generator expression would fail don't apply in
that case. Plus Guido's comment on generators being consumed quickly
also doesn't apply.
 From one point of view it's good that Py3 provides about the same behavior for
generator expressions and list comprehensions.

But I'd really like the above examples to Just Work. :)

Absolutely agreed. Especially with the list comprehension.

I read PEP 289 and the 2004 exchanges in the mailing list regarding
scoping and binding issues in generator expressions, when this feature
was added to the language. Forgive my blasphemy, but Guido got it
wrong on this one, by suggesting that an obscure use case should drive
design considerations when a far more common use case exists. The
obscure use case involves confusing sequences of exceptions, while the
common use case is consistent scope rules at different levels. For
details search the mailing list for Guido's quote in PEP 289.

Here is another example that fails...
In [7]: class T:
   ...:     A = range(2)
   ...:     B = range(2,4)
   ...:     g = (i*j for i in A for j in B)
   ...:     s = sum(g)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
C:\Python26\<ipython console> in <module>()
C:\Python26\<ipython console> in T()
C:\Python26\<ipython console> in <genexpr>((i,))
NameError: global name 'B' is not defined

...at sum(g)!

These two examples work (doing the same thing at a global scope and a
local scope):

In [1]: class T:
   ...:     def local(self):
   ...:         A = range(2)
   ...:         B = range(2,4)
   ...:         g = (i*j for i in A for j in B)
   ...:         s = sum(g)
   ...:
In [2]: t = T()
In [3]: t.local()
In [4]: A = range(2)
In [5]: B = range(2,4)
In [6]: g = (i*j for i in A for j in B)
In [7]: s = sum(g)

Thanks to everyone who suggested workarounds. They are very helpful.
At the same time, they are -- forgive my language -- so perlish (as in
clever, and requiring deep understanding of the language). In Python,
simple elegant constructs should work consistently across all scopes.
The very fact that people chose to use the word 'workaround' indicates
how quirky this aspect of the language is.

It really doesn't need to be so quirky. Defining the semantics of
generator expressions to mimic Arnaud's lambda workaround below would
be just as justified as the current definition, and more pythonic (in
the consistent, elegant sense). @Arnaud: I tried to find your earlier
post -- googled "Arnaud lambda" -- but couldn't.

Does anyone know of a more recent PEP that addresses these issues?

Thanks,
Leo.
 
Arnaud Delobelle

dontspamleo said:
@Arnaud: I tried to find your earlier post -- googled "Arnaud lambda"
-- but couldn't.

I remembered after posting that I sent this to python-ideas. Here is the
first message in the thread:

http://mail.python.org/pipermail/python-ideas/2007-December/001260.html

In this thread, someone claims that this issue was discussed when
genexps were first added to the language, but doesn't remember the
rationale for taking this decision. Perhaps someone knows what
discussion they refer to...
 
Guest

Hi Arnaud et al,

Here is the link to the bug report from which the discussion in PEP
289 was extracted:

http://bugs.python.org/issue872326

It looks like they were fixing a bunch of bugs and that this
discussion was one of the many in that thread.

Here is another post which points to the core of the issue: early/late
binding. It is also pointed to in PEP 289.

http://mail.python.org/pipermail/python-dev/2004-April/044555.html

Here is Guido's rationale (his text is the items (a) and (b) below). My
comments follow.

(a) I find it hard to defend automatic variable capture given Python's
general late-binding semantics

MY COMMENTS: That is a good point. There are however exceptions to the
"general late-binding semantics" (GLBS) in Python. The outer loop does
bind early in this case in fact. A point to consider along with GLBS
is the unintuitive early bind of the external iterator vs late bind of
the internal iterator. That is a function not of early vs. late
binding, but of the current translation of generator expressions to
generator functions. Using Arnaud's approach would fix that and make
the code IMO so much more pythonic.
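
The asymmetry described in that comment can be seen outside any class. This is a small sketch with made-up values: the outer iterable is captured when the generator expression is created, while the inner one is looked up by name only when the generator runs.

```python
A = [1, 2]
B = [10, 20]
g = (i * j for i in A for j in B)

# Rebinding A has no effect: the iterator over the old list was already
# taken when the generator expression was created (early binding).
A = [100]
# Rebinding B does take effect: the name B is only looked up once the
# generator actually runs (late binding).
B = [1]

result = list(g)
print(result)  # [1, 2]
```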

(b) I believe that use cases needing early binding are uncommon and
strained: they all involve creating a list of generator
expressions, which IMO is a pretty unusual thing to do

MY COMMENTS: This is actually a pretty common use case. Any iteration
over objects of arity 2 or greater (A matrix, a table, a tree searched
breadthwise under some data structures, etc.) can conveniently and
cleanly be expressed using a generator expression.

I'd also add that perhaps the title of Guido's post: "Generator
Expressions - Let's Move Forward" may have unintentionally discouraged
people from reexamining the issue.

I would like to see this decision revisited. Obviously before spending
any time on this I'd like to gauge if there is further interest. Am I
off the mark? Maybe there is a technical misunderstanding on my part.
If there are more people affected (Since the last week I found some
other postings here and on other lists and blogs) who can make a clear
case, then what is the next step?

Cheers,
Leo.
 
Arnaud Delobelle

dontspamleo said:
Hi Arnaud et al,

Here is the link to the bug report from which the discussion in PEP
289 was extracted:

http://bugs.python.org/issue872326

It looks like they were fixing a bunch of bugs and that this
discussion was one of the many in that thread.

Here is another post which points to the core of the issue: early/late
binding. It is also pointed to in PEP 289.

http://mail.python.org/pipermail/python-dev/2004-April/044555.html

Thanks for digging those out!
Here is Guido's rationale (his text is the items (a) and (b) below). My
comments follow.

(a) I find it hard to defend automatic variable capture given Python's
general late-binding semantics

MY COMMENTS: That is a good point. There are however exceptions to the
"general late-binding semantics" (GLBS) in Python. The outer loop does
bind early in this case in fact. A point to consider along with GLBS
is the unintuitive early bind of the external iterator vs late bind of
the internal iterator. That is a function not of early vs. late
binding, but of the current translation of generator expressions to
generator functions. Using Arnaud's approach would fix that and make
the code IMO so much more pythonic.

But see this message from Guido:

http://bugs.python.org/issue872326#msg45190

It looks like he's thinking of early binding of free variables. Later
on he seems to contradict this.

Below it is mentioned that two patches were made, one with the current
behaviour and one with the one you (and I) would prefer.
(b) I believe that use cases needing early binding are uncommon and
strained: they all involve creating a list of generator
expressions, which IMO is a pretty unusual thing to do

MY COMMENTS: This is actually a pretty common use case. Any iteration
over objects of arity 2 or greater (A matrix, a table, a tree searched
breadthwise under some data structures, etc.) can conveniently and
cleanly be expressed using a generator expression.

And also your example when defining a generator expression in a class
scope becomes confusing with the current semantics.
I'd also add that perhaps the title of Guido's post: "Generator
Expressions - Let's Move Forward" may have unintentionally discouraged
people from reexamining the issue.

I would like to see this decision revisited. Obviously before spending
any time on this I'd like to gauge if there is further interest. Am I
off the mark? Maybe there is a technical misunderstanding on my part.
If there are more people affected (Since the last week I found some
other postings here and on other lists and blogs) who can make a clear
case, then what is the next step?

I suppose it would be interesting to see how this can be implemented,
maybe look at the gexp.diff.capture file on the bug tracker. I was
intending to look into it a couple of years ago but I could never find
the time :(
 
Guest

I think a big part of the problem is that the scoping rules in Python
are inconsistent because classes are a different kind of object. An
example helps:

This works:

x = 1
def f(y): return y + x

This works:

def f():
    x = 1
    def g(y): return x + y
    return g(2)


But this doesn't work...
class C:
    x = 1
    def f(self,y): return x + y

...although what was probably meant was this, which does work...
class C:
    x = 1
    def f(self,y): return self.x + y

...and really means this...
class C:
    x = 1
    def f(self,y): return T.x + y

...which creates other quirkiness when x is mutable as illustrated
nicely here:

http://bioscreencastwiki.com/Python_Variable_scope_gymnastics
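
The linked page is not reproduced here. As a hedged illustration of the kind of quirk that can arise when a class attribute is mutable (the class, method, and attribute names below are invented and are not taken from that page):

```python
class Basket:
    items = []  # a single list object, stored on the class

    def add(self, item):
        # Attribute lookup finds the class-level list and mutates it in
        # place, so the change is seen through every instance.
        self.items.append(item)

    def reset(self):
        # Assignment creates a new *instance* attribute that shadows the
        # class attribute for this instance only.
        self.items = []

a, b = Basket(), Basket()
a.add("apple")
shared = list(b.items)  # b sees the item that was added through a

a.reset()
a.add("pear")           # now goes into a's own list
print(shared, a.items, Basket.items)
```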

One reasonable answer to this quirkiness is RTFM. Classes are well
documented as is everything else in python. Mostly late-binding
semantics are also well documented.

I argue that it is more consistent to have the scope for classes be
consistent with the scope of everything else, which makes the early/
late binding point moot.

I know this is a major change, that it would break existing code, etc.
It would have been better to talk about these things before py3k.
Still:

1. Has this been discussed before?

1. What would this suggestion break?

2. What are the advantages of making the scope of class variables
different? Maybe is it just a historical trait?

Cheers,
Leo.
 
Steven D'Aprano

I think a big part of the problem is that the scoping rules in Python
are inconsistent because classes are a different kind of object. An
example helps: [...]
But this doesn't work...
class C:
    x = 1
    def f(self,y): return x + y

...although what was probably meant was this, which does work...
class C:
    x = 1
    def f(self,y): return self.x + y

Yes, this is a deliberate design choice. See below.

...and really means this...
class C:
    x = 1
    def f(self,y): return T.x + y

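I don't understand what T is. Did you mean C?

If so, you are wrong. self.x is not the same as <class>.x due to
inheritance rules. Consider one example:

[...]
...     x = 1
...     def __init__(self, x=None):
...         if x is not None:
...             self.x = x
...
[...]
...     def f(self, y): return self.x + y
...
[...]
...     def f(self, y): return C.x + y
...
[...]
11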


I have no doubt that there will be other examples involving multiple
inheritance, but I trust I've made my point.
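
Since parts of the interpreter session above did not survive, here is a separate sketch of the same point. It is not a reconstruction of the original session: the subclass names `ViaSelf`, `ViaClass`, and `Child` are assumptions made for illustration. `self.x` follows the normal instance-then-class lookup, whereas `C.x` always reads the attribute stored on `C`:

```python
class C:
    x = 1

    def __init__(self, x=None):
        if x is not None:
            self.x = x  # instance attribute shadows the class attribute

class ViaSelf(C):
    def f(self, y):
        return self.x + y  # instance first, then the class hierarchy

class ViaClass(C):
    def f(self, y):
        return C.x + y     # always the attribute defined on C

class Child(ViaSelf):
    x = 100                # a subclass overriding the class attribute

print(ViaSelf(10).f(1))    # uses the instance's x
print(ViaClass(10).f(1))   # ignores the instance's x
print(Child().f(1))        # picks up the subclass override
```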

I argue that it is more consistent to have the scope for classes be
consistent with the scope of everything else, which makes the early/
late binding point moot.

Yes, it would be more consistent, but less useful. Apart from methods
themselves, using class attributes is a relatively rare thing to do, but
using global level names inside methods is very common.

I know this is a major change, that it would break existing code, etc.
It would have been better to talk about these things before py3k. Still:

1. Has this been discussed before?

Yes.
1. What would this suggestion break?

Nearly all existing code using classes, which is nearly everything.
2. What are the advantages of making the scope of class variables
different? Maybe is it just a historical trait?

See the discussion in the PEP for introducing nested scopes in the first
place:

http://www.python.org/dev/peps/pep-0227/
 
Guest

...and really means this...
I don't understand what T is. Did you mean C?

Yes, I meant C. Thanks.
If so, you are wrong. self.x is not the same as <class>.x due to
inheritance rules. Consider one example:
<example snipped see thread/>

Thanks for the nice example. Sorry for my loose language. By "really
means", what I really meant was that the most appropriate construct
should be the one referring to the class variable explicitly. I would
consider it inelegant (at least) to use an instance variable with the
same name as a class variable.
Nearly all existing code using classes, which is nearly everything.

Is it that common to have code containing a class variable with the
same name as a global variable? Are there other use cases that would
break?
See the discussion in the PEP for introducing nested scopes in the first
place:

http://www.python.org/dev/peps/pep-0227/

Thanks. That is really useful. I just read the PEP. I find this
paragraph very helpful:

"An alternative would have been to allow name binding in class
scope to behave exactly like name binding in function scope. This
rule would allow class attributes to be referenced either via
attribute reference or simple name. This option was ruled out
because it would have been inconsistent with all other forms of
class and instance attribute access, which always use attribute
references. Code that used simple names would have been obscure."

The point about all other access use cases requiring attribute
references is a really good one. If a language requires self.f() to
access the member f of class C, which happens to be a function, then
self.x or C.x should also be required to access attribute x. And no
one would be crazy enough to ask to have that fixed.

@steven: Are there other snippets from the PEP you were pointing to
specifically?

Cheers,
Leo.
 
