Recursive loading trouble for immutables

rekkufa

I am currently building a system for serializing Python objects to a readable file format, as well as creating Python objects by parsing the same format. It is more or less complete except for a single issue I just cannot figure out by myself: how to load data that specifies immutables that recursively reference themselves.

There are only a few solutions I can think of.

One: While loading recursive objects, I always create empty versions of objects (lists, dicts, classes, etc.) and fill them in afterwards. This works fine for loading recursive lists and such, but as immutables are, well, immutable, this gets me nowhere with important datatypes like tuples.

Two: Global replacement. If I remember correctly, PyPy has a function for globally replacing all references to a given object with another. This would make the loading code a piece of cake, although I assume this functionality doesn't exist in CPython? This is the second time I've had good use for it.

Three: Create transparent proxies everywhere. Just kidding.

Four: Disallow immutable recursiveness. This is bad for two reasons. Firstly, it requires me to greatly increase the complexity of the loading code, as I have to topsort all references to avoid recursiveness in immutables while at the same time allowing mutables to be recursive. I can't imagine how inelegant the code will be. Secondly, there is nothing wrong with recursive tuples. To disallow them and work miles around them just because they can't be properly expressed in Python

In any case I am stumped. It's the last piece of a module I am otherwise very pleased with. There must be a way. I'm sure most people on this list can get around Python much better than I can, so, any ideas?
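
A minimal sketch of the create-then-fill approach described under "One", for illustration only. The flat `SPEC` format, the `("ref", name)` convention and the `load` helper are all hypothetical and are not taken from the module discussed in this thread.

```python
# Hypothetical flat format: each name maps to a list of items, and a
# ("ref", name) pair stands for a reference to another named object.
SPEC = {
    "a": [1, 2, ("ref", "b")],
    "b": [3, ("ref", "a")],
}

def load(spec):
    # Pass 1: create an empty shell for every object, so that each one
    # has an identity before anything refers to it.
    shells = dict((name, []) for name in spec)
    # Pass 2: fill the shells in place, resolving references to shells.
    for name, items in spec.items():
        for item in items:
            if isinstance(item, tuple) and len(item) == 2 and item[0] == "ref":
                shells[name].append(shells[item[1]])
            else:
                shells[name].append(item)
    return shells

objects = load(SPEC)
a, b = objects["a"], objects["b"]
```

The second pass depends on mutating each shell in place. A tuple offers no such step, which is exactly where this approach stops working for immutables.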
 
Steven D'Aprano

> I am currently building a system for serializing python objects to a
> readable file-format, as well as creating python objects by parsing the
> same format.

You mean like pickle? (Pardon me for telling you something you may
already know, but then you may not already know it...)

>>> import pickle
>>> # Create a recursive tuple.
... alist = [1, 2, 3]
>>> atuple = (4, 5, alist)
>>> alist.append(atuple)
>>> atuple
(4, 5, [1, 2, 3, (4, 5, [...])])
>>> pickle.dumps(atuple)
'(I4\nI5\n(lp0\nI1\naI2\naI3\na(I4\nI5\ng0\ntp1\na0000g1\n.'


pickle can dump to files, using either text or binary protocols, and load
objects back from either files or strings. I won't pretend the text
protocol is exactly human readable, but perhaps the way forward is to
write a post-processor to convert the output of pickle to something more
human-readable, rather than writing a completely new serializer.
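
As a small sketch of the round trip, the same tuple can be dumped and loaded again to check that the self-reference survives. The names `restored` and `inner_list` are only for illustration, and `protocol=0` selects the text protocol mentioned above.

```python
import pickle

alist = [1, 2, 3]
atuple = (4, 5, alist)
alist.append(atuple)   # the tuple now contains itself via the list

# Dump with the text protocol and load the result back.
restored = pickle.loads(pickle.dumps(atuple, protocol=0))
inner_list = restored[2]
```

After loading, `inner_list[3]` is the very same object as `restored`, so the cycle is rebuilt rather than copied. Note that this tuple is recursive only through a mutable list, not directly.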
 
Mel

rekkufa said:
> I am currently building a system for serializing python objects
> to a readable file-format, as well as creating python objects by
> parsing the same format. It is more or less complete except for
> a single issue I just cannot figure out by myself: How to load
> data that specifies immutables that recursively reference
> themselves.
>
> There are only a few solutions I can think of.
>
> One: While loading recursive objects, I always create empty versions
> of objects (lists, dicts, classes) etc, and fill them in afterwards.
> This works fine for loading recursive lists and such, but as
> immutables are, well, immutable, this gets me nowhere with important
> datatypes like tuples.
[ ... ]

I can imagine a C function that might do it.
If it were a list, of course, a Python function would be easy:

def IdioList (contents, marker):
    t = []
    t[:] = [x if x is not marker else t for x in contents]
    return t

With tuples, by the time we have the tuple's identity it's too late,
but I suspect that in C we could bend the rules enough.
The marker would be a caller-supplied object with the property that
the caller would never want to put it in a self-referencing sequence.
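
A short usage sketch for the list version, assuming a fresh `object()` as the marker. The name `MARKER` is hypothetical, and the function is repeated here only so that the snippet runs on its own.

```python
def IdioList(contents, marker):
    # Create the list first so that it has an identity, then fill it,
    # substituting the list itself wherever the marker appears.
    t = []
    t[:] = [x if x is not marker else t for x in contents]
    return t

# A unique sentinel that no caller data can be identical to.
MARKER = object()
t = IdioList([1, MARKER, 3], MARKER)
```

Here `t[1] is t` holds, because the list exists before its contents are assigned.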

I'll have to check this out today to see.

Mel.
 
Mel

Mel said:
> rekkufa wrote: [ ... ]
>> How to load
>> data that specifies immutables that recursively reference
>> themselves.
> I can imagine a C function that might do it.
> [ ... ]

Here's something that works, in the sense of creating a tuple
containing a self-reference. I don't know how dangerous it really is
-- haven't tested for memory leaks or any other large-scale trouble.
Also, I've only tested on Python 2.5.1 under Ubuntu Linux.

Called as

idiotuple.idiotuple (a_sequence, a_marker)

it returns a tuple containing the items of a_sequence, except that
instances of a_marker are replaced by references to the returned
tuple. Eg.:

import idiotuple
class IdioMarker: "An object that I will never insert into a tuple."
def showid (x):
    print id(x)
    for y in x:
        print ' ', id(y)

showid (idiotuple.idiotuple ((1,2,3), IdioMarker))
showid (idiotuple.idiotuple ((1, IdioMarker, 3), IdioMarker))



The C code is:


/* $Id$ */
#include <Python.h>

/*=======================================================*/

static PyObject *idiotuple_idiotuple (PyObject *self, PyObject *args)
// In Python, call with
//      sequence contents
//      object marker
// returns
//      tuple containing contents, with instances of marker
//      replaced by borrowed self-references
{
    PyObject *t = NULL;
    PyObject *contents, *marker, *x;
    Py_ssize_t i, n, z;
    if (!PyArg_ParseTuple (args, "OO", &contents, &marker))
        return NULL;
    n = PySequence_Size (contents);
    if (n < 0)
        return NULL;
    t = PyTuple_New (n);    // new tuple
    if (t == NULL)
        return NULL;
    for (i=0; i < n; ++i) {
        x = PySequence_GetItem (contents, i);   // new reference
        if (x == NULL)      // lookup failed, exception already set
            goto fail;
        if (x != marker) {
            z = PyTuple_SetItem (t, i, x);  // steals the new reference to x
            if (z == -1) {
                goto fail;
            }
        }
        if (x == marker) {
            z = PyTuple_SetItem (t, i, t);  // stolen reference to t
            // Dereference the marker.
            // The internal reference to the tuple is effectively 'borrowed'.
            // Only external references to the tuple are reflected in
            // its reference count.
            Py_DECREF (x);  // dereference the marker
            if (z == -1) {
                goto fail;
            }
        }
    }
    return t;
fail:
    Py_DECREF (t);  // arrange for the tuple to go away
    return NULL;
} /* idiotuple_idiotuple */


/*=======================================================*/

static PyMethodDef IdioTupleMethods[] = {
    {"idiotuple", idiotuple_idiotuple, METH_VARARGS,
     "Create a possibly self-referential tuple."},
    {NULL, NULL, 0, NULL}
};

PyMODINIT_FUNC initidiotuple (void)
{
    PyObject *module;
    module = Py_InitModule ("idiotuple", IdioTupleMethods);
}





Setup.py is


# for use by distutils
from distutils.core import setup, Extension

module1 = Extension('idiotuple',
                    sources = ['idiotuple.c'])

setup (name = 'idiotuple',
       version = '1.0',
       description = 'Create a possibly self-referential tuple.',
       author = 'Mel Wilson',
       author_email = '(e-mail address removed)',
       ext_modules = [module1])
 
