Fixed keys() mapping


George Sakkis

I wrote an 'fkdict' dict-like class for mappings with a fixed set of
keys but I'm wondering if there's a simpler way to go about it.

First off, the main motivation for it is to save memory in case of many
dicts with the same keys, for example when reading from a
csv.DictReader or constructing dicts out of rows fetched from a
database. For example, test_mem(dict) takes up around 246 MB according
to the Windows task manager while test_mem(fkdict) takes around 49 MB:

def test_mem(maptype):
    d = [(i, str(i)) for i in range(1000)]
    ds = [maptype(d) for i in xrange(10000)]
    raw_input('finished')

An additional benefit is predictable ordering (e.g.
fkdict.fromkeys('abcd').keys() == list('abcd')), like several
ordered-dict recipes.

The implementation I came up with goes like this: each fkdict instance
stores only the values as a list in self._values. The keys and the
mapping of keys to indices are stored in a dynamically generated
subclass of fkdict, so that self._keys and self._key2index are also
accessible from the instance. The dynamically generated subclasses are
cached so that the second time an fkdict with the same keys is created,
the cached class is called.

Since the keys are determined in fkdict.__init__(), this scheme
requires changing self.__class__ to the dynamically generated subclass.
As much as I appreciate Python's dynamic nature, I am not particularly
comfortable with objects that change their class and the implications
this may have in the future (e.g. how well does this play with
inheritance). Is this a valid use case for type-changing behavior or is
there a better, more "mainstream" OO design pattern for this? I can
post the relevant code if necessary.

George
 

Neil Cerutti

I wrote an 'fkdict' dict-like class for mappings with a fixed
set of keys but I'm wondering if there's a simpler way to go
about it.

First off, the main motivation for it is to save memory in case
of many dicts with the same keys, for example when reading from
a csv.DictReader or constructing dicts out of rows fetched from
a database. For example, test_mem(dict) takes up around 246 MB
according to the Windows task manager while test_mem(fkdict)
takes around 49 MB:

It occurs to me that you could create custom classes using
__slots__ to get something similar.

class XYDict(object):
    __slots__ = ['x', 'y']
    def __getitem__(self, item):
        return self.__getattribute__(item)
    def __setitem__(self, key, item):
        return self.__setattr__(key, item)

This isn't terribly convenient because you have to create a new
class for every new set of keys. It isn't obvious to me how to
program a metaclass to automate the process. A lot more
boilerplate is necessary to act like a dict.
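One way the process could be automated without a metaclass is a small caching factory function; the names below (make_record_class and its cache) are illustrative, not from the original post. Note that __slots__ entries must be valid identifiers, so unlike a dict this only works for string keys that look like attribute names.

```python
# A hypothetical factory that builds and caches one __slots__-based
# class per distinct key set, giving dict-style item access.
_slot_cache = {}

def make_record_class(fields):
    fields = tuple(fields)
    cls = _slot_cache.get(fields)
    if cls is None:
        def __init__(self, pairs):
            for k, v in pairs:
                setattr(self, k, v)
        def __getitem__(self, key):
            return getattr(self, key)
        def __setitem__(self, key, value):
            setattr(self, key, value)
        cls = type('Record', (object,), {
            '__slots__': fields,
            '__init__': __init__,
            '__getitem__': __getitem__,
            '__setitem__': __setitem__,
        })
        _slot_cache[fields] = cls
    return cls
```

Because the instances have no per-instance __dict__, the memory profile should be similar in spirit to the fkdict approach.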
def test_mem(maptype):
    d = [(i, str(i)) for i in range(1000)]
    ds = [maptype(d) for i in xrange(10000)]
    raw_input('finished')

An additional benefit is predictable ordering (e.g.
fkdict.fromkeys('abcd').keys() == list('abcd')), like several
ordered-dict recipes.

The implementation I came up with goes like this: each fkdict
instance stores only the values as a list in self._values. The
keys and the mapping of keys to indices are stored in a
dynamically generated subclass of fkdict, so that self._keys
and self._key2index are also accessible from the instance. The
dynamically generated subclasses are cached so that the second
time an fkdict with the same keys is created, the cached class
is called.

Since the keys are determined in fkdict.__init__(), this scheme
requires changing self.__class__ to the dynamically generated
subclass. As much as I appreciate Python's dynamic nature, I am
not particularly comfortable with objects that change their
class and the implications this may have in the future (e.g.
how well does this play with inheritance). Is this a valid use
case for type-changing behavior or is there a better, more
"mainstream" OO design pattern for this? I can post the
relevant code if necessary.

Since the type gets changed before __init__ finishes, I don't see
any problem with it. It sounds cool.
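For what it's worth, the type-changing move itself is unremarkable in isolation; a toy demonstration (class names made up for illustration) shows that an instance which reassigns __class__ during __init__ ends up as an ordinary instance of the new class:

```python
# Minimal demonstration that reassigning __class__ inside __init__
# leaves a perfectly ordinary instance of the new class behind.
class Base(object):
    def __init__(self, flavor):
        # Switch to a subclass before __init__ finishes.
        self.__class__ = Special if flavor == 'special' else Base
        self.flavor = flavor

class Special(Base):
    def describe(self):
        return 'special: ' + self.flavor
```

After construction the object behaves exactly as if it had been created as the subclass, which matches Neil's point that the switch happens before __init__ finishes.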
 

Gabriel Genellina

The implementation I came up with goes like this: each fkdict instance
stores only the values as a list in self._values. The keys and the
mapping of keys to indices are stored in a dynamically generated
subclass of fkdict, so that self._keys and self._key2index are also
accessible from the instance. The dynamically generated subclasses are
cached so that the second time an fkdict with the same keys is created,
the cached class is called.

Since the keys are determined in fkdict.__init__(), this scheme
requires changing self.__class__ to the dynamically generated subclass.
As much as I appreciate Python's dynamic nature, I am not particularly
comfortable with objects that change their class and the implications
this may have in the future (e.g. how well does this play with
inheritance). Is this a valid use case for type-changing behavior or is
there a better, more "mainstream" OO design pattern for this? I can
post the relevant code if necessary.

I think a better place would be __new__ instead. This is where you
can determine the right class to use and construct the new instance.
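A sketch of that suggestion, with illustrative names only: when the subclass is chosen in __new__, the instance is the right type from the start and __class__ never needs to change afterwards.

```python
# Hypothetical variant of the scheme using __new__ to pick the cached
# subclass, rather than reassigning __class__ in __init__.
_cache = {}

class fkdict(object):
    def __new__(cls, pairs):
        pairs = list(pairs)
        keys = tuple(k for k, _ in pairs)
        subcls = _cache.get(keys)
        if subcls is None:
            subcls = type('fkdict_sub', (fkdict,), {
                '_keys': keys,
                '_key2index': dict((k, i) for i, k in enumerate(keys)),
            })
            _cache[keys] = subcls
        # Construct the instance directly as the cached subclass.
        self = object.__new__(subcls)
        self._values = [v for _, v in pairs]
        return self

    def __getitem__(self, key):
        return self._values[self._key2index[key]]
```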


--
Gabriel Genellina
Softlab SRL
 
