Cyclic GC rules for subtyped objects with tp_dictoffset

BChess

Hi,

I'm writing a new PyTypeObject that is a base type, supports cyclic GC,
and has a tp_dictoffset. If my type is subtyped by a Python class,
what exactly are the rules for how I'm supposed to treat my PyDict
object with regard to cyclic GC? Do I still visit it in my traverse()
function if I'm subtyped? Do I decrement the refcount upon
dealloc? Going by the documentation, I'm assuming I should always be
using _PyObject_GetDictPtr() to access the dictionary, which I do.
But visiting the dictionary in traverse() when the type is subtyped
results in a crash in weakrefobject.c. I'm using Python 2.5.

Thanks for any help anyone has!
Ben
 
Hrvoje Niksic

[ Questions such as this might be better suited for the capi-sig list,
http://mail.python.org/mailman/listinfo/capi-sig ]

First off, if your class is intended only as a base class, are you
aware that simply inheriting from a dictless class adds a dict
automatically? For example, the base "object" type has no dict, but
inheriting from it automatically adds one (unless you override that
using __slots__). Having said that, I'll assume that the base class
is usable on its own and its direct instances need to have a dict as
well.
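
A minimal sketch of the point above, seen from the Python side rather than from the C API. The class names Plain and Slotted and the helper has_instance_dict are made up for illustration. It only checks whether instances expose a __dict__, and it reads the type's __dictoffset__ attribute, which mirrors tp_dictoffset.

```python
class Plain(object):
    """Ordinary subclass of object: instances get a __dict__ automatically."""


class Slotted(object):
    """Declaring __slots__ suppresses the automatic per-instance __dict__."""
    __slots__ = ("value",)


def has_instance_dict(obj):
    # True when the instance exposes a __dict__ attribute.
    return hasattr(obj, "__dict__")


base_has_dict = has_instance_dict(object())      # plain object: no dict
plain_has_dict = has_instance_dict(Plain())      # subclass: dict was added
slotted_has_dict = has_instance_dict(Slotted())  # __slots__: no dict

print(base_has_dict, plain_has_dict, slotted_has_dict)

# A dictoffset of 0 means "instances of this type have no dict slot".
# The value reported for Plain is nonzero but varies between Python versions.
print(object.__dictoffset__, Slotted.__dictoffset__, Plain.__dictoffset__)
```

The exact nonzero offset reported for Plain is an implementation detail, so the sketch prints it without relying on a particular value.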

I'm not sure if this kind of detail is explicitly documented, but as
far as the implementation goes, the answer to your question is in
Objects/typeobject.c:subtype_traverse. That function gets called to
traverse instances of heap types (Python subclasses of built-in
classes such as yours). It contains code like this:

    if (type->tp_dictoffset != base->tp_dictoffset) {
        PyObject **dictptr = _PyObject_GetDictPtr(self);
        if (dictptr && *dictptr)
            Py_VISIT(*dictptr);
    }

According to this, the base class is responsible for visiting its dict
in its tp_traverse, and the subtype only visits the dict it added
(which is why its location differs). Note that visiting an object
twice still shouldn't cause a crash; objects may be and are visited an
arbitrary number of times, and it's up to the GC to ignore those it
has already seen. So it's possible that you have a bug elsewhere in
the code.
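
To see why the dict has to be reachable from traverse at all, here is a small Python-level sketch (not the C code under discussion). It assumes a made-up class Node and a made-up helper make_cycle_and_collect. The instance refers to itself through one of its own attributes, so reference counting alone cannot free it, and only the cyclic GC, which finds the cycle by traversing the instance's attribute storage, can reclaim it.

```python
import gc
import weakref


class Node(object):
    """Python-level class; its instances get attribute storage automatically."""


def make_cycle_and_collect():
    node = Node()
    node.me = node             # cycle: the instance refers to itself via an attribute
    probe = weakref.ref(node)  # lets us observe whether the object was freed
    del node                   # the cycle keeps the refcount above zero
    gc.collect()               # the cyclic GC has to traverse the attributes to find it
    return probe() is None     # True once the instance has been reclaimed


collected = make_cycle_and_collect()
print(collected)
```

If a C base type owned the dict but never visited it in its tp_traverse, a cycle like this one would go undetected and the object would leak.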

As far as the decrementing goes, the rule of thumb is: if you created
it, you get to decref it. subtype_dealloc contains very similar
logic:

    /* If we added a dict, DECREF it */
    if (type->tp_dictoffset && !base->tp_dictoffset) {
        PyObject **dictptr = _PyObject_GetDictPtr(self);
        if (dictptr != NULL) {
            PyObject *dict = *dictptr;
            if (dict != NULL) {
                Py_DECREF(dict);
                *dictptr = NULL;
            }
        }
    }

So, if the subtype added a dict, it was responsible for creating it
and it will decref it. If the dict was created by you, it's up to you
to dispose of it.
 
BChess

My confusion stemmed from the fact that I wasn't, technically,
allocating a PyDict in this space myself.
PyObject_GenericSetAttr() does that automatically when it finds that
the slot is NULL. But that seems to be the same as if I had created it
myself, so I'm to decref it in dealloc either way.

Thank you for the very in-depth answer, though. You're right: the
problem was elsewhere. The crash stemmed from using a negative number
for tp_dictoffset. This doesn't seem to do the right thing when
subtyping -- the tp_dictoffset was pointing to the same memory as the
offset specified in the subtype's tp_weaklistoffset. I missed the
sentence in the documentation that negative tp_dictoffsets should only
be used for variable-length objects. Using a positive offset instead
worked like a charm.

Thanks again,
Ben
 