Assigning to __class__ attribute

kj

I have a couple of questions regarding assigning to an instance's
__class__ attribute.

The first is illustrated by the following interaction. First I
define an empty class:
....

Now I define an instance of Spam and an instance of Spam's superclass:
x = Spam()
y = Spam.__mro__[1]() # (btw, is there a less uncouth way to do this???)
[z.__class__.__name__ for z in x, y]
['Spam', 'object']

Now I define a second empty class:....

Next, I attempt to assign the value Ham to x.__class__:
x.__class__ = Ham
[isinstance(x, z) for z in Spam, Ham]
[False, True]

This was the first surprise for me: assigning to the __class__
attribute not only isn't vetoed, but in fact changes the instance's
class:

Oh-kaaaay...

First question: how kosher is this sort of class transmutation
through assignment to __class__? I've never seen it done. Is this
because it is considered something to do only as a last resort, or is
it simply because the technique is not needed often, but it is
otherwise perfectly ok?
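
For reference, the interaction described above can be sketched in Python 3 roughly as follows. The class bodies were elided in the post, so this assumes Spam and Ham are both empty classes, as the text says.

```python
# Sketch in Python 3; Spam and Ham are assumed to be empty, as the post describes.
class Spam:
    pass


class Ham:
    pass


x = Spam()
before = type(x).__name__

# Rebinding __class__ changes which class the existing instance belongs to.
x.__class__ = Ham
after = type(x).__name__

checks = [isinstance(x, cls) for cls in (Spam, Ham)]
print(before, after, checks)
```

The instance itself is the same object throughout; only its class pointer is rebound.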

The second, and much bigger, surprise comes when I attempt to do
the same class-switching with y:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: __class__ assignment: only for heap types

(If you recall, y's class is object, the superclass of x.) Apparently
Spam is a "heap type" (whatever that is) but its superclass, object,
isn't. This definitely rattles my notions of inheritance: since
the definition of Spam was empty, I didn't expect it to have any
significant properties that are not already present in its superclass.
What's going on here? Is this a bug, or a feature? I can see no
logical justification for allowing such class switching for only
some classes and not others.

One last question: as the questions above make clear, I have a
colossal muddle inside my head regarding Python's model of classes
and inheritance. This is not for lack of trying to understand it,
but, rather, for exactly the opposite reason: in my zeal to gain
the fullest understanding of this topic, I think I have read too
much that is incorrect, or obsolete, or incomplete...

What is the most complete, definitive, excruciatingly detailed
exposition of Python's class and inheritance model? I'm expressly
avoiding Google to answer this question, and instead asking it
here, because I suspect that there's some connection between my
state of utter confusion on this topic and the ease with which the
complete/definitive/highest-quality information can get lost among
a huge number of Google hits to popular-but-only-partially-correct
sources of information. (In fact, I *may* have already read the
source I'm seeking, but subsequent readings of incorrect stuff may
have overwritten the correct information in my brain.)

TIA!

~kj
 
Robert Kern

I have a couple of questions regarding assigning to an instance's
__class__ attribute.

The first is illustrated by the following interaction. First I
define an empty class:
...

Now I define an instance of Spam and an instance of Spam's superclass:
x = Spam()
y = Spam.__mro__[1]() # (btw, is there a less uncouth way to do this???)
[z.__class__.__name__ for z in x, y]
['Spam', 'object']

Now I define a second empty class:...

Next, I attempt to assign the value Ham to x.__class__:
x.__class__ = Ham
[isinstance(x, z) for z in Spam, Ham]
[False, True]

This was the first surprise for me: assigning to the __class__
attribute not only isn't vetoed, but in fact changes the instance's
class:

Oh-kaaaay...

First question: how kosher is this sort of class transmutation
through assignment to __class__? I've never seen it done. Is this
because it is considered something to do only as a last resort, or is
it simply because the technique is not needed often, but it is
otherwise perfectly ok?

Last resort for very special purposes, and only extremely rarely in production.

The second, and much bigger, surprise comes when I attempt to do
the same class-switching with y:

Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: __class__ assignment: only for heap types

(If you recall, y's class is object, the superclass of x.) Apparently
Spam is a "heap type" (whatever that is) but its superclass, object,
isn't. This definitely rattles my notions of inheritance: since
the definition of Spam was empty, I didn't expect it to have any
significant properties that are not already present in its superclass.
What's going on here? Is this a bug, or a feature? I can see no
logical justification for allowing such class switching for only
some classes and not others.

Feature. When you have C-implemented types, you cannot safely swap out their
instances' __class__. There are memory issues involved. Only subclasses of
object made by the class statement (or the equivalent type(...) call), i.e.
"heap types", permit this modification. object is a C-implemented type.
Importantly, as for most other C-implemented types, plain object instances have
no __dict__.
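
A minimal sketch of that distinction in Python 3. Ham is a hypothetical empty class used only for illustration, and the exact wording of the TypeError differs between Python versions, so the sketch only checks the exception type.

```python
# Sketch in Python 3. Ham is a hypothetical empty class for illustration.
class Ham:
    pass


# A class built with the three-argument type(...) call is equivalent to one
# written with a class statement, so it is a heap type too.
Spam = type("Spam", (object,), {})

x = Spam()
x.__class__ = Ham            # allowed: both classes are defined in Python
swapped = type(x) is Ham

y = object()
try:
    y.__class__ = Ham        # refused: object is implemented in C
    refused = False
except TypeError as exc:
    refused = True
    print("TypeError:", exc)  # message text varies by Python version
```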

Types and classes were unified a long time ago in Python 2.2, but there are
still relevant distinctions between C-implemented types and Python-implemented
types.

One last question: as the questions above make clear, I have a
colossal muddle inside my head regarding Python's model of classes
and inheritance. This is not for lack of trying to understand it,
but, rather, for exactly the opposite reason: in my zeal to gain
the fullest understanding of this topic, I think I have read too
much that is incorrect, or obsolete, or incomplete...

What is the most complete, definitive, excruciatingly detailed
exposition of Python's class and inheritance model? I'm expressly
avoiding Google to answer this question, and instead asking it
here, because I suspect that there's some connection between my
state of utter confusion on this topic and the ease with which the
complete/definitive/highest-quality information can get lost among
a huge number of Google hits to popular-but-only-partially-correct
sources of information. (In fact, I *may* have already read the
source I'm seeking, but subsequent readings of incorrect stuff may
have overwritten the correct information in my brain.)

http://www.python.org/download/releases/2.2.3/descrintro/

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
 
Arnaud Delobelle

kj said:
I have a couple of questions regarding assigning to an instance's
__class__ attribute.

The first is illustrated by the following interaction. First I
define an empty class:
...

Now I define an instance of Spam and an instance of Spam's superclass:
x = Spam()
y = Spam.__mro__[1]() # (btw, is there a less uncouth way to do this???)

Yes:

y = object()

[z.__class__.__name__ for z in x, y]
['Spam', 'object']

Now I define a second empty class:...

Next, I attempt to assign the value Ham to x.__class__:
x.__class__ = Ham
[isinstance(x, z) for z in Spam, Ham]
[False, True]

This was the first surprise for me: assigning to the __class__
attribute not only isn't vetoed, but in fact changes the instance's
class:

Oh-kaaaay...

First question: how kosher is this sort of class transmutation
through assignment to __class__? I've never seen it done. Is this
because it is considered something to do only as a last resort, or is
it simply because the technique is not needed often, but it is
otherwise perfectly ok?

It's OK as long as the slots defined in the classes are the same (using
Python 3 below so no need for specifying that classes derive from object):
Traceback (most recent call last):
...

kj said:
The second, and much bigger, surprise comes when I attempt to do
the same class-switching with y:

Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: __class__ assignment: only for heap types

y is of type object, which is a builtin type. You can only switch the
__class__ of an instance of a user-defined class.

(If you recall, y's class is object, the superclass of x.) Apparently
Spam is a "heap type" (whatever that is) but its superclass, object,
isn't. This definitely rattles my notions of inheritance: since
the definition of Spam was empty, I didn't expect it to have any
significant properties that are not already present in its superclass.
What's going on here? Is this a bug, or a feature? I can see no
logical justification for allowing such class switching for only
some classes and not others.

There is a big difference:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'object' object has no attribute '__dict__'

This means that you can have instance attributes for x but not for y:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'object' object has no attribute 'myattr'

This reflects the fact that x and y are a different kind of
object, with a different layout so you can't hotswap their types.
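
The difference described above can be reproduced with a short sketch in Python 3. Spam is assumed to be an empty class, as in the original post; the attribute name myattr is taken from the traceback above.

```python
# Sketch in Python 3; Spam is assumed to be an empty class.
class Spam:
    pass


x = Spam()
y = object()

# Instances of a Python-defined class carry a per-instance __dict__ ...
x.myattr = 1
x_has_dict = hasattr(x, "__dict__")

# ... while plain object instances do not, so they cannot take new attributes.
y_has_dict = hasattr(y, "__dict__")
try:
    y.myattr = 1
    y_accepts_attrs = True
except AttributeError:
    y_accepts_attrs = False

print(x_has_dict, y_has_dict, y_accepts_attrs)
```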

I imagine that the term "heap type" is used for types which are objects
that live on the heap (in practice they are types defined in Python,
by using a "class" statement or calling type(name, bases, attrs)) as
opposed to builtin types such as object, int, str, etc... which don't
live on the heap.

One last question: as the questions above make clear, I have a
colossal muddle inside my head regarding Python's model of classes
and inheritance. This is not for lack of trying to understand it,
but, rather, for exactly the opposite reason: in my zeal to gain
the fullest understanding of this topic, I think I have read too
much that is incorrect, or obsolete, or incomplete...
What is the most complete, definitive, excruciatingly detailed
exposition of Python's class and inheritance model?

I learnt by reading Guido's "Unifying types and classes in Python 2.2",
available here:

http://www.python.org/download/releases/2.2.3/descrintro/
 
Mark Wooding

kj said:
Now I define an instance of Spam and an instance of Spam's superclass:
x = Spam()
y = Spam.__mro__[1]() # (btw, is there a less uncouth way to do this???)

There's the `__bases__' attribute, which is simply a tuple of the
class's direct superclasses in order. Spam.__bases__[0] will always be
equal to Spam.__mro__[1] because of the way the linearization works.

There's also the `__base__' attribute, which seems to correspond to a
PyTypeObject's `tp_base' slot; this /isn't/ always the first direct
superclass; I'm not quite sure what the rules are, and it doesn't seem
to be documented anywhere.
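
A small sketch of the three attributes in Python 3. Mixin and Tagged are hypothetical classes made up for illustration; the `__base__' result is simply what CPython reports for this case, not a statement of the general rule.

```python
# Sketch in Python 3. Mixin and Tagged are hypothetical example classes.
class Spam:
    pass


class Mixin:
    pass


class Tagged(Mixin, int):
    pass


# For a simple class, all three spellings name the same superclass.
spam_bases = Spam.__bases__
spam_first = Spam.__mro__[1]

# With several bases, __bases__[0] still matches __mro__[1] ...
tagged_first = Tagged.__bases__[0]
tagged_mro_second = Tagged.__mro__[1]

# ... but __base__ can be a different class: CPython reports int here,
# the base that determines the instance layout.
tagged_base = Tagged.__base__

print(spam_bases, tagged_first.__name__, tagged_base.__name__)
```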
[z.__class__.__name__ for z in x, y]
['Spam', 'object']
...
x.__class__ = Ham
[isinstance(x, z) for z in Spam, Ham]
[False, True]

First question: how kosher is this sort of class transmutation
through assignment to __class__?

Yep. That's allowed, and useful. Consider something like a red/black
tree: red and black nodes behave differently from one another, and it
would be convenient to make use of method dispatch rather than writing a
bunch of conditional code; unfortunately, nodes change between being red
and black occasionally. Swizzling classes lets you do this.
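
A rough sketch of the idea in Python 3. This is not a working red/black tree; Node, RedNode, BlackNode and recolour are hypothetical names, and the only point is that behaviour is selected by method dispatch on the current class.

```python
# Sketch in Python 3; all class and method names here are hypothetical.
class Node:
    def __init__(self, key):
        self.key = key


class RedNode(Node):
    colour = "red"

    def recolour(self):
        # Switch this node to the other class in place.
        self.__class__ = BlackNode


class BlackNode(Node):
    colour = "black"

    def recolour(self):
        self.__class__ = RedNode


n = RedNode(42)
first = n.colour
n.recolour()
second = n.colour
print(first, second, n.key)
```

Both classes share the same instance attributes, so nothing else has to be adjusted when a node changes colour.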

Various other languages have similar features. The scariest is probably
Smalltalk's `become: anObject' method, which actually swaps two objects.
Python `__class__' assignment is similarly low-level: all it does is
tweak a pointer, and it's entirely up to the program to make sure that
the object's attributes are valid according to the rules for the new
class. The Common Lisp Object System has CHANGE-CLASS, which is a
rather more heavyweight and hairy procedure which tries to understand
and cope with the differences between the two classes.

I've never seen it done. Is this because it is considered something to
do only as a last resort, or is it simply because the technique is not
needed often, but it is otherwise perfectly ok?

It's not needed very often, and can be surprising to readers who aren't
familiar with other highly dynamic object systems, so I don't think it's
particularly encouraged; but it seems like it might be the best approach
in some cases. I think it's one of those things where you'll just
/know/ when it's the right answer, and if you don't know that it's the
right answer, it isn't.

Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: __class__ assignment: only for heap types

`Heap types' are types which are allocated dynamically. Non-heap types
are ones which are dreamt up by C extensions or built into Python.

I suspect that the logic here is that non-heap types in general are
magical and weird, and their instances might have special C stuff
attached to them, so changing their classes is a Bad Idea, though
there's an extra check which catches most problems:

In [1]: class foo (str): pass
In [2]: class bar (object): pass
In [3]: x = foo()
In [4]: x.__class__ = bar
TypeError: __class__ assignment: 'foo' object layout differs from 'bar'

Anyway, I'd guess it's just a bug that `object' is caught this way, but
it doesn't seem an especially important one.

This definitely rattles my notions of inheritance: since the
definition of Spam was empty, I didn't expect it to have any
significant properties that are not already present in its superclass.

Yes, sorry. I think this is a bit poor, really, but it's hard to do a
much better job.

Pure Python objects are pretty simple things, really: they have a
pointer to their class, and a bunch of attributes stored in a
dictionary. What other languages call instance variables or methods are
found using attribute lookup, which just searches the object, and then
its class and its superclasses in the method resolution order. If you
fiddle with `__class__', then attribute lookup searches a different
bunch of classes. Nothing else needs to change -- or, at least, if it
does, then you'd better do it yourself.
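
A minimal sketch of that lookup behaviour in Python 3. Quiet and Loud are hypothetical classes; the instance dictionary is left untouched by the assignment, and only the classes searched for methods change.

```python
# Sketch in Python 3; Quiet and Loud are hypothetical example classes.
class Quiet:
    def speak(self):
        return self.name.lower()


class Loud:
    def speak(self):
        return self.name.upper() + "!"


obj = Quiet()
obj.name = "Spam"            # stored in the instance's own dictionary

before = obj.speak()
state_before = dict(vars(obj))

obj.__class__ = Loud         # only the class pointer changes

after = obj.speak()          # now found on Loud instead of Quiet
state_after = dict(vars(obj))

print(before, after, state_before == state_after)
```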

Types implemented in C work by extending the underlying Python object
structure. The magical type-specific stuff is stored in the extra
space. Unfortunately, changing classes is now hard, because Python code
can't even see the magic C stuff -- and, besides, a different special C
type may require a different amount of space, and the CPython
implementation can't cope with the idea that objects might move around
in memory.

Similar complications occur if one of the classes has a `__slots__'
attribute: in this case, both the original and new class must have a
`__slots__' attribute and they must be the same length and have the same
names.
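
A short sketch of the `__slots__' case in Python 3, as observed in CPython. A, B and C are hypothetical classes; the sketch only checks that a TypeError is raised for the mismatched case, since the message text may vary.

```python
# Sketch in Python 3 (CPython behaviour); A, B and C are hypothetical.
class A:
    __slots__ = ("a",)


class B:
    __slots__ = ("a",)


class C:
    __slots__ = ("a", "b")


x = A()
x.a = 1

x.__class__ = B              # same slot names: accepted
same_slots_ok = type(x) is B and x.a == 1

try:
    x.__class__ = C          # different slots: layouts differ
    mismatch_refused = False
except TypeError as exc:
    mismatch_refused = True
    print("TypeError:", exc)
```
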
What is the most complete, definitive, excruciatingly detailed
exposition of Python's class and inheritance model?

It's kind of scattered. The language reference sections 3.3 and 3.4
contain some useful information, but it's rather detailed and it's a
(mostly) comprehensive list of what a bunch of strangely shaped things
do rather than a presentation of a coherent model.

Guido's essay `Unifying types and classes in Python 2.2' is pretty good
(http://www.python.org/download/releases/2.2.3/descrintro/) and provides
some of the background.

Unsurprisingly, Python's object system takes (mostly good) ideas from
other languages, particularly dynamic ones. Python's object system is
/very/ different from what you might expect from C++, Ada and Eiffel,
for example; but coming from Smalltalk, Flavors or Dylan, you might not
be particularly surprised. Some knowledge of these other languages will
help fill in the gaps.

I'm expressly avoiding Google to answer this question,

I'd say this was sensible, except for the fact that you seem to expect a
more reliable answer from Usenet. ;-)

-- [mdw]
 
Steven D'Aprano

This was the first surprise for me: assigning to the __class__ attribute
not only isn't vetoed, but in fact changes the instance's class:

Oh-kaaaay...

First question: how kosher is this sort of class transmutation through
assignment to __class__? I've never seen it done. Is this because it is
considered something to do only as a last resort, or is it simply
because the technique is not needed often, but it is otherwise perfectly
ok?

While not exactly encouraged, it is a deliberate feature. It's not a good
idea to go around randomly monkey-patching instances with a new class,
and in fact Python makes some attempts to protect you from the most
egregious errors, but the technique has its uses. Especially if you're
swapping between two classes which you have designed to be swapped
between.

For instance, if you have an object that needs to display two distinct
behaviours during different times of its life, then a good technique is
to write two classes, one for each set of behaviours, and automatically
switch between them as needed. A ring buffer is a common example:

http://code.activestate.com/recipes/68429-ring-buffer/

This made it into the Python Cookbook, so it comes with the approval of
Alex Martelli, which is nearly as good as the approval of Guido :)
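
A rough sketch of the class-switching idea in Python 3. It is written from scratch rather than copied from the recipe, and the names RingBuffer, FullRingBuffer, append and get are assumptions for illustration.

```python
# Sketch in Python 3; names are hypothetical, not taken from the recipe.
class RingBuffer:
    """Buffer that has not yet reached its maximum size."""

    def __init__(self, size_max):
        self.max = size_max
        self.data = []

    def append(self, item):
        self.data.append(item)
        if len(self.data) == self.max:
            # From now on the buffer behaves as a full one.
            self.cur = 0
            self.__class__ = FullRingBuffer

    def get(self):
        return list(self.data)


class FullRingBuffer(RingBuffer):
    """Buffer that is full: new items overwrite the oldest one."""

    def append(self, item):
        self.data[self.cur] = item
        self.cur = (self.cur + 1) % self.max

    def get(self):
        # Oldest item first.
        return self.data[self.cur:] + self.data[:self.cur]


buf = RingBuffer(3)
for i in range(5):
    buf.append(i)
contents = buf.get()
print(type(buf).__name__, contents)
```

Neither append method needs an "is the buffer full yet?" test after the switch, which is the attraction of the technique.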


The second, and much bigger, surprise comes when I attempt to do the
same class-switching with y:

Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: __class__ assignment: only for heap types

(If you recall, y's class is object, the superclass of x.) Apparently
Spam is a "heap type" (whatever that is) but its superclass, object,
isn't. This definitely rattles my notions of inheritance: since the
definition of Spam was empty, I didn't expect it to have any significant
properties that are not already present in its superclass. What's going
on here? Is this a bug, or a feature? I can see no logical justification
for allowing such class switching for only some classes and not others.

That is more of a design compromise than a feature... it is unsafe to
change the class of types implemented in C, for various implementation-
related reasons, and so Python prohibits it.

Heap types are dynamically allocated types that you create using the
class statement (or the type function). They're so-called because they
live in the heap. The builtin types defined in C presumably aren't
dynamically allocated, but statically. Presumably they have a fixed
layout in memory and don't live in the heap. Not being a C programmer,
I've never cared about the implementation, only that you can't sensibly
turn a str instance into an int instance by assigning to __class__ and so
Python prohibits it.

One last question: as the questions above make clear, I have a colossal
muddle inside my head regarding Python's model of classes and
inheritance. This is not for lack of trying to understand it, but,
rather, for exactly the opposite reason: in my zeal to gain the fullest
understanding of this topic, I think I have read too much that is
incorrect, or obsolete, or incomplete...

I suspect you're trying to make this more complicated than it actually
is. You keep finding little corner cases that expose implementation
details (such as the heap-types issue above) and leaping to the erroneous
conclusion that because you didn't understand this tiny little corner of
Python's class model, you didn't understand any of it. Python's object
model is relatively simple, but it does occasionally expose a few messy
corners.

What is the most complete, definitive, excruciatingly detailed
exposition of Python's class and inheritance model?

That would be the source code.
 
