Nested scopes and class variables

Dave Benjamin

I ran into an odd little edge case while experimenting with functions that
create classes on the fly (don't ask me why):
>>> def f(x):
...     class C(object):
...         x = x
...     print C.x
...
>>> f(5)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 2, in f
  File "<stdin>", line 3, in C
NameError: name 'x' is not defined

"x" clearly is defined, but apparently Python is not looking at the nested
variable scope to find it. What's stranger is that if I rename the parameter
"x" to "y", the error goes away:
>>> def f(y):
...     class C(object):
...         x = y
...     print C.x
...
>>> f(5)
5

So, it's not like nested scopes aren't supported in the class block. Rather,
when it sees "x = x", it seems like Python is determining at that point that
"x" is a class variable, and refuses to search any further.

At the top-level, it works as expected:
>>> x = 5
>>> class C(object):
...     x = x
...
>>> C.x
5

Any implementation gurus have some insight into what's going on here?
 
wittempj

To me it seems you should do it something like this:
def f(x):
    class C(object):
        def __init__(self, x):
            self.x = x # here you set the attribute for class C
    c = C(x) # instantiate a C object
    print c.x

f(5)
 
Nick Coghlan

Dave said:
I ran into an odd little edge case while experimenting with functions that
create classes on the fly (don't ask me why):

It gets even kookier:


Py> x = 5
Py> def f(y):
....     class C(object):
....         x = x
....     print C.x
....
Py> f(5) # OK with x bound at global
5

Py> def f(x):
....     class C(object):
....         x = x
....     print C.x
....
Py> f(6) # Oops, ignores the argument!
5

Py> def f(y):
....     class C(object):
....         x = y
....     print C.x
....
Py> f(6) # OK with a different name
6

Py> y = 5
Py> def f(y):
....     class C(object):
....         x = y
....     print C.x
....
Py> f(6) # Correctly use the nearest scope
6

That second case is the disturbing one - the class definition has silently
picked up the *global* binding of x, whereas the programmer presumably meant the
function argument.
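
The surprising case can be reproduced as a standalone script. This is a minimal sketch, assuming a current Python 3 interpreter (where the same lookup rule still applies) and a module-level x; the function names f and g are just placeholders:

```python
x = 5  # module-level (global) binding


def f(x):
    class C(object):
        # 'x' is assigned in the class body, so the right-hand side is looked
        # up in the class namespace, then globals, then builtins. The
        # enclosing function's argument is never consulted.
        x = x
    return C.x


def g(y):
    class C(object):
        # 'y' is not assigned in the class body, so it is resolved through
        # the enclosing function's scope as usual.
        x = y
    return C.x


print(f(6))  # 5: the global binding, not the argument
print(g(6))  # 6: the argument
```

Deleting the module-level x from this sketch turns the call f(6) into the NameError from the original post.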

With a nested function definition (instead of a class definition), notice that
*both* of the first two cases will generate an UnboundLocalError.
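
For comparison, here is a sketch of the nested-function variant of those first two cases, assuming Python 3; outer_y, outer_x and error_name are hypothetical names. Because the assignment makes x local to the inner function, neither the global nor the enclosing argument is visible there:

```python
x = 5  # global binding, ignored by the inner functions below


def outer_y(y):
    def inner():
        x = x  # x is local to inner() and not yet bound
        return x
    return inner()


def outer_x(x):
    def inner():
        x = x  # same problem, even though outer_x has its own x
        return x
    return inner()


def error_name(func, arg):
    """Return the name of the exception raised by func(arg), or None."""
    try:
        func(arg)
    except Exception as exc:
        return type(exc).__name__
    return None


print(error_name(outer_y, 5))  # UnboundLocalError
print(error_name(outer_x, 6))  # UnboundLocalError
```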

Anyway, the Python builtin disassembler is very handy when looking at behaviour
like this (I've truncated the dis output below after the interesting section):

Py> import dis
Py> def f1(x):
....     class C(object):
....         x = x
....     print C.x
....
Py> def f2(y):
....     class C(object):
....         x = y
....     print C.x
....

Py> dis.dis(f1)
  2           0 LOAD_CONST               1 ('C')
              3 LOAD_GLOBAL              0 (object)
              6 BUILD_TUPLE              1
              9 LOAD_CONST               2 (<code object C at 00B3E3E0, file "<stdin>", line 2>)
[...]

Py> dis.dis(f2)
  2           0 LOAD_CONST               1 ('C')
              3 LOAD_GLOBAL              0 (object)
              6 BUILD_TUPLE              1
              9 LOAD_CLOSURE             0 (y)
             12 LOAD_CONST               2 (<code object C at 00B3E020, file "<stdin>", line 2>)
[...]

Notice the extra LOAD_CLOSURE instruction in the second version of the code.
What if we define a function instead of a class?

Py> def f3(x):
....     def f():
....         x = x
....         print x
....     f()
....

Py> def f4(y):
....     def f():
....         x = y
....         print x
....     f()
....

Py> dis.dis(f3)
  2           0 LOAD_CONST               1 (<code object f at 00B3EA60, file "<stdin>", line 2>)
[...]

Py> dis.dis(f4)
  2           0 LOAD_CLOSURE             0 (y)
              3 LOAD_CONST               1 (<code object f at 00B3EC60, file "<stdin>", line 2>)
[...]

Again, we have the extra LOAD_CLOSURE instruction. So why does the function
version give us an UnboundLocalError, while the class version doesn't? Again,
we look at the bytecode, this time of the corresponding internal code objects:

Py> dis.dis(f1.func_code.co_consts[2])
  2           0 LOAD_GLOBAL              0 (__name__)
              3 STORE_NAME               1 (__module__)

  3           6 LOAD_NAME                2 (x)
              9 STORE_NAME               2 (x)
             12 LOAD_LOCALS
             13 RETURN_VALUE

Py> dis.dis(f3.func_code.co_consts[1])
  3           0 LOAD_FAST                0 (x)
              3 STORE_FAST               0 (x)

  4           6 LOAD_FAST                0 (x)
              9 PRINT_ITEM
             10 PRINT_NEWLINE
             11 LOAD_CONST               0 (None)
             14 RETURN_VALUE

In the function, it's the LOAD_FAST opcode that blows up, because it only
consults the function's own locals, while the LOAD_NAME in the class body falls
back on the globals and then the builtins. Looking at the class-based version
that works also lets us see why:

Py> dis.dis(f2.func_code.co_consts[2])
  2           0 LOAD_GLOBAL              0 (__name__)
              3 STORE_NAME               1 (__module__)

  3           6 LOAD_DEREF               0 (y)
              9 STORE_NAME               3 (x)
             12 LOAD_LOCALS
             13 RETURN_VALUE

Here we can see the "LOAD_DEREF" instead of the "LOAD_NAME" that was present in
the version where the same name is reused. The dereference presumably picks up
the closure noted earlier.
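
The same comparison can be made on a current interpreter, with the caveat that opcode names have changed since this session was recorded. The sketch below assumes Python 3.4 or later (for dis.get_instructions) and relies mainly on co_freevars, which lists the closure variables a code object uses; class_body_code and load_ops are helper names invented here:

```python
import dis
import types


def f1(x):
    class C(object):
        x = x
    return C


def f2(y):
    class C(object):
        x = y
    return C


def class_body_code(func):
    """Return the first code object nested in func (here, the body of C)."""
    for const in func.__code__.co_consts:
        if isinstance(const, types.CodeType):
            return const
    raise LookupError("no nested code object found")


def load_ops(code, name):
    """Opcode names of the LOAD_* instructions in `code` that refer to `name`."""
    return [ins.opname for ins in dis.get_instructions(code)
            if ins.opname.startswith("LOAD") and ins.argval == name]


body1 = class_body_code(f1)
body2 = class_body_code(f2)

# f1: no closure variables, and x is fetched with LOAD_NAME.
print(body1.co_freevars, load_ops(body1, "x"))
# f2: y is a closure variable; the opcode used to fetch it depends on the
# interpreter version, so it is printed rather than assumed.
print(body2.co_freevars, load_ops(body2, "y"))
```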

I vote bug. If the assignment is going to be allowed, it should respect the
nested scopes.

Cheers,
Nick.
 
Nick Coghlan

wittempj said:
To me it seems you should do it something like this:
def f(x):
    class C(object):
        def __init__(self, x):
            self.x = x # here you set the attribute for class C
    c = C(x) # instantiate a C object
    print c.x

f(5)

That does something different - in this case, x is an instance variable, not a
class variable.

You do raise an interesting question though:

Py> def f(x):
....     class C(object):
....         x = None
....         def __init__(self):
....             if C.x is None:
....                 C.x = x
....     C()
....     print C.x
....
Py> x = 5
Py> f(6)
6

So, the real problem here is the interaction between two features: the ability
to write "<name> = <name>" in a class definition, with the reference on the RHS
being resolved in the global namespace, and nested scopes (since that first
lookup does NOT respect nested scopes).

Functions don't have the problem, since they don't allow that initial lookup to
be made from the outer scope.
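
Given that diagnosis, the clash can be avoided either by using a different name on the right-hand side or by keeping the assignment out of the class body altogether. A sketch of three options, assuming Python 3 and using made-up function names; only the renaming idea comes from the thread itself:

```python
x = 5  # a global that should NOT leak into the class


def make_class_renamed(x):
    value = x  # a different name, so the class body resolves it via the closure
    class C(object):
        x = value
    return C


def make_class_setattr(x):
    class C(object):
        pass
    C.x = x  # runs in the function's scope, where x is the argument
    return C


def make_class_type(x):
    # Build the class namespace explicitly instead of using a class body.
    return type("C", (object,), {"x": x})


for factory in (make_class_renamed, make_class_setattr, make_class_type):
    print(factory.__name__, factory(6).x)  # each prints 6, not the global 5
```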

Cheers,
Nick.
 
Alex Martelli

Dave Benjamin said:
I ran into an odd little edge case while experimenting with functions that
create classes on the fly (don't ask me why):

"Why not?". But classes have little to do with it, in my view.

...     class C(object):
...         x = x

You bind x, so x is local (to the class), not free. Videat, classless:
>>> def f(x):
...     def g():
...         x=x
...         return x
...     return g
...
>>> f(5)()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 3, in g
UnboundLocalError: local variable 'x' referenced before assignment
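
To see at which call each error surfaces, here is a minimal sketch, assuming a Python 3 interpreter. The names f_class, f_func and outcome are made up for illustration, and the parameter is called payload rather than x so that no global of the same name can interfere:

```python
def f_class(payload):
    class C(object):
        # The class body runs as part of the class statement, i.e. while
        # f_class is executing. No global 'payload' exists, so this raises.
        payload = payload
    return C


def f_func(payload):
    def g():
        # This body only runs when g itself is called.
        payload = payload
        return payload
    return g


def outcome(call):
    """Call `call()` and report 'ok' or the name of the exception raised."""
    try:
        call()
    except NameError as exc:  # UnboundLocalError is a subclass of NameError
        return type(exc).__name__
    return "ok"


print(outcome(lambda: f_class(5)))  # NameError, raised while f_class runs
g = f_func(5)                       # no error yet: g has not been called
print(outcome(g))                   # UnboundLocalError, raised inside g
```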

In this example the error is discovered when the body of g executes; in
your case, the body of C executes as part of the class statement, i.e.
when f is called, so the error is discovered earlier.

Dave said:
"x" clearly is defined, but apparently Python is not looking at the nested
variable scope to find it. What's stranger is that if I rename the parameter
"x" to "y", the error goes away:

Why is this strange? There's no name conflict then.

Dave said:
So, it's not like nested scopes aren't supported in the class block. Rather,
when it sees "x = x", it seems like Python is determining at that point that
"x" is a class variable, and refuses to search any further.

That's like saying that nested scopes aren't supported in a function...
when Python sees "x = x", etc etc.

Dave said:
At the top-level, it works as expected:

>>> x = 5
>>> class C(object):
...     x = x
...
>>> C.x
5

Any implementation gurus have some insight into what's going on here?

Class bodies and function bodies are compiled slightly differently,
leading to a "handy specialcasing" of globals in the latter example
which is probably what's confusing you. OK, let's try digging into more
detail:
>>> def f(x):
...     class C:
...         x = x
...     return C
...
>>> dis.dis(f)
  2           0 LOAD_CONST               1 ('C')
              3 BUILD_TUPLE              0
              6 LOAD_CONST               2 (<code object C at 0x389860, file "<stdin>", line 2>)
              9 MAKE_FUNCTION            0
             12 CALL_FUNCTION            0
             15 BUILD_CLASS
             16 STORE_FAST               1 (C)

  4          19 LOAD_FAST                1 (C)
             22 RETURN_VALUE

this shows you where the codeobject for C's body is kept -- constant
number two among f's constants. OK then:
>>> dis.dis(f.func_code.co_consts[2])
  2           0 LOAD_GLOBAL              0 (__name__)
              3 STORE_NAME               1 (__module__)

  3           6 LOAD_NAME                2 (x)
              9 STORE_NAME               2 (x)
             12 LOAD_LOCALS
             13 RETURN_VALUE

Compare with:
>>> def f(x):
...     def g():
...         x = x
...         return x
...     return g
...
>>> dis.dis(f)
  2           0 LOAD_CONST               1 (<code object g at 0x389620, file "<stdin>", line 2>)
              3 MAKE_FUNCTION            0
              6 STORE_FAST               1 (g)

  5           9 LOAD_FAST                1 (g)
             12 RETURN_VALUE

and:

>>> dis.dis(f.func_code.co_consts[1])
  3           0 LOAD_FAST                0 (x)
              3 STORE_FAST               0 (x)

  4           6 LOAD_FAST                0 (x)
              9 RETURN_VALUE


See the difference? In a function, the 'x = x' compiles into LOAD_FAST,
STORE_FAST, which only looks at locals and nowhere else. In a
classbody, it compiles to LOAD_NAME, STORE_NAME, which looks at locals
AND globals -- but still not at closure cells...
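
As a rough mental model only (this is not how the interpreter is implemented), the two lookups can be written out in plain Python. The functions load_fast and load_name are hypothetical stand-ins named after the opcodes, the namespaces are ordinary dicts, and the builtins module name assumes Python 3:

```python
import builtins


def load_fast(name, function_locals):
    """Model of LOAD_FAST: consult the function's own locals and nothing else."""
    if name in function_locals:
        return function_locals[name]
    raise UnboundLocalError(
        "local variable %r referenced before assignment" % name)


def load_name(name, class_locals, module_globals):
    """Model of LOAD_NAME: class locals, then globals, then builtins."""
    for namespace in (class_locals, module_globals, vars(builtins)):
        if name in namespace:
            return namespace[name]
    raise NameError("name %r is not defined" % name)


enclosing_scope = {"x": 6}  # the function argument; neither model looks here
module_globals = {"x": 5}

print(load_name("x", {}, module_globals))  # 5
print(load_name("len", {}, {}) is len)     # True, found in builtins
```

Neither function takes the enclosing scope as a parameter, which is the point: in this model the closure cell holding the argument is simply never part of the search.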


Alex
 
Steven Bethard

Alex said:
>>> def f(x):
...     class C:
...         x = x
...     return C
...
[snip]
>>> def f(x):
...     def g():
...         x = x
...         return x
...     return g
...
[snip]

See the difference? In a function, the 'x = x' compiles into LOAD_FAST,
STORE_FAST, which only looks at locals and nowhere else. In a
classbody, it compiles to LOAD_NAME, STORE_NAME, which looks at locals
AND globals -- but still not at closure cells...

Is there a reason why the class body doesn't look at closure cells?
That is, are there cases where this lookup scheme is preferred to one
that checks locals, closure cells and globals?

Steve

P.S. Question disclaimer:
My questions here are founded in a curiosity about language design, and
are not intended as an attack on Python. =)
 
Dave Benjamin

Thanks, Nick and Alex, for the nice, detailed explanations. My understanding
of Python bytecode is not deep enough to comment at this time. ;)
 
Steven Bethard

I said:
Is there a reason why the class body doesn't look at closure cells? That
is, are there cases where this lookup scheme is preferred to one that
checks locals, closure cells and globals?

For anyone who's interested I found a long thread about this here:

http://mail.python.org/pipermail/python-dev/2002-April/023427.html

And a bug report here:

http://sourceforge.net/tracker/?func=detail&aid=532860&group_id=5470&atid=105470

Steve

[1] http://docs.python.org/ref/naming.html
 
