How do functions get access to builtins?

Steven D'Aprano

I've been playing around with ChainMap in Python 3.3, and have run into
something that perplexes me. Let's start with an ordinary function that
accesses one global and one builtin.


x = 42
def f():
    print(x)


If you call f(), it works as expected. But let's make a version with no
access to builtins, and watch it break:

from types import FunctionType
g = FunctionType(f.__code__, {'x': 23})


If you call g(), you get an exception:

py> g()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in f
NameError: global name 'print' is not defined


(Don't be fooled by the traceback referring to "f" rather than g.
That's because g reuses f's code object, which still carries the name f.)

We can add support for builtins:

import builtins # use __builtin__ with no "s" in Python 2
g.__globals__['__builtins__'] = builtins # Note the "s" in the key.


and now calling g() prints 23, as expected.
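
For anyone who wants to try this as a script rather than at the prompt, here is a minimal sketch. The traceback above is from a Python 3.3 session, and how a function picks up its builtins may differ on other CPython versions, so this sketch sidesteps the question by putting `__builtins__` into the globals dict when g is created. Capturing stdout in `buf` is an addition of mine, only there so the result can be checked.

```python
import builtins
import contextlib
import io
from types import FunctionType

x = 42

def f():
    print(x)

# Same code object as f, but with its own globals: a different x, plus
# an explicit '__builtins__' entry so that print can be resolved.
g = FunctionType(f.__code__, {'x': 23, '__builtins__': builtins})

# Capture what the two functions print so the result can be inspected.
buf = io.StringIO()
with contextlib.redirect_stdout(buf):
    f()   # looks up x in the module globals
    g()   # looks up x in the custom globals dict
output = buf.getvalue()
```

Here `output` should contain 42 on the first line and 23 on the second.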

Now let me try the same thing using Python 3.3's ChainMap. Unfortunately,
functions insist that their __globals__ is a dict, so we fool it into
accepting a ChainMap with some multiple inheritance trickery:


from collections import ChainMap
class ChainedDict(ChainMap, dict):
    pass

d = ChainedDict({}, {'x': 23}, {'x': 42})
assert d['x'] == 23
g = FunctionType(f.__code__, d)


As before, calling g() raises NameError, "global name 'print' is not
defined". So I expected to be able to fix it just as I did before:

g.__globals__['__builtins__'] = builtins


But it doesn't work -- I still get the same NameError. Why does this not
work here, when it works for a regular dict?


I can fix it by adding the builtins into the ChainMap:

g.__globals__.maps.append(builtins.__dict__)


And now calling g() prints 23 as expected.
 
Rouslan Korneychuk


I found the answer in Python's source code. When you execute a code
object, PyFrame_New is called, and it fetches 'builtins' from 'globals'.
However, inside PyFrame_New (defined on line 596 of Objects/frameobject.c)
is the following (line 613):

builtins = PyDict_GetItem(globals, builtin_object);

Unlike PyObject_GetItem, PyDict_GetItem is specialized for dict objects.
Your ChainedDict class uses ChainMap's storage and leaves dict's
storage empty, so PyDict_GetItem doesn't find anything.
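
The empty dict storage can be seen from Python itself. The sketch below is only an illustration: calling the unbound `dict` methods on the instance is my stand-in for what a dict-specific C function would see, not the actual code path in PyFrame_New.

```python
import builtins
from collections import ChainMap

class ChainedDict(ChainMap, dict):
    pass

d = ChainedDict({}, {'x': 23}, {'x': 42})
d['__builtins__'] = builtins   # handled by ChainMap.__setitem__

# The assignment landed in the first map of the chain...
stored_in_first_map = d.maps[0].get('__builtins__') is builtins

# ...while the storage inherited from dict was never touched.
raw_dict_size = dict.__len__(d)
raw_dict_keys = list(dict.keys(d))
```

So the key is visible through the mapping interface, but anything that reads the underlying dict directly sees an empty dict.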

Steven said:
I can fix it by adding the builtins into the ChainMap:

g.__globals__.maps.append(builtins.__dict__)


And now calling g() prints 23 as expected.

The reason this works is that, unlike PyFrame_New, the LOAD_GLOBAL opcode
first checks whether the type of globals is exactly dict (and not a
subclass), and falls back to using PyObject_GetItem if it's anything else.
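
A small script along these lines shows the fallback at work. It is a sketch under the assumptions of this thread: builtins.__dict__ is placed in the chain up front instead of being appended afterwards, and the stdout capture is only there so the result can be checked.

```python
import builtins
import contextlib
import io
from collections import ChainMap
from types import FunctionType

class ChainedDict(ChainMap, dict):
    pass

def f():
    print(x)

# The last map is builtins.__dict__, so both x and print are reachable
# through the ChainMap lookup, even though dict's own storage is empty.
d = ChainedDict({}, {'x': 23}, {'x': 42}, vars(builtins))
g = FunctionType(f.__code__, d)

buf = io.StringIO()
with contextlib.redirect_stdout(buf):
    g()
printed = buf.getvalue()
raw_size = dict.__len__(d)   # size of the storage inherited from dict
```

Both names are resolved through the chain, which is why g() prints 23 while `raw_size` stays at 0.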


Interestingly, it looks like it could be fixed easily enough. Unless
there are other places where globals is assumed to be a dict object, it
would just be a matter of doing the same check and fallback in
PyFrame_New that is done in LOAD_GLOBAL (technically, you could just use
PyObject_GetItem everywhere; the exact-dict check is only an
optimization).
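
To make the idea concrete, here is a rough Python model of the two lookup strategies. The function names are made up for illustration, and this is not the C code in frameobject.c: `dict.get` called on the instance stands in for PyDict_GetItem, and ordinary subscripting stands in for PyObject_GetItem.

```python
import builtins
from collections import ChainMap

class ChainedDict(ChainMap, dict):
    pass

def builtins_dict_only(globals_):
    # Stand-in for the current behaviour: read dict's own storage only.
    return dict.get(globals_, '__builtins__')

def builtins_with_fallback(globals_):
    # Stand-in for the suggested fix: fast path for exact dicts,
    # generic mapping lookup for everything else.
    if type(globals_) is dict:
        return dict.get(globals_, '__builtins__')
    try:
        return globals_['__builtins__']
    except KeyError:
        return None

plain = {'x': 23, '__builtins__': builtins}
chained = ChainedDict({}, {'x': 23}, {'x': 42})
chained['__builtins__'] = builtins
```

With `plain` both functions return the builtins module. With `chained` only `builtins_with_fallback` finds it; `builtins_dict_only` returns None.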
 
Steven D'Aprano


Thanks for the reply, Rouslan.

Perhaps I should report this as a bug.
 
