Using dict as object

Pierre Tardy

One thing that is cooler in JavaScript than in Python is that dictionaries and objects are the same thing. This makes browsing complex hierarchical data syntactically easy.

For manipulating complex jsonable data, one will always prefer writing:
buildrequest.properties.myprop
rather than
brdict['properties']['myprop']

This ability in JS is well known for its flaws (e.g. http://drupal.org/node/172169#forin ), and I understand why this is not a feature that we want in Python by default. I have worked on a class that adds this feature, which I would like to use for manipulating my JSON data.

The following GitHub pull request to buildbot contains tests that define the specification of such a class, and several commits, each giving a different implementation of the same thing.
https://github.com/buildbot/buildbot/pull/525

All the implementations I tried are much slower than plain native dict access.
Each implementation has benchmark results in its commit comment. All of them are more than 20x slower than a plain dict!
I would like advice from Python people on how one could optimize this.

I'd like to eventually post this to python-dev; please tell me if this is really not a good idea.

Regards,
Pierre
 
Dave Angel

One thing that is cooler with java-script than in python is that dictionaries and objects are the same thing. It allows browsing of complex hierarchical data syntactically easy.

You probably need some different terminology, since a dictionary is
already an object. So's an int, or a list, or anything else visible in
python. You're trying to blur the distinction between attribute access
and access by key (square brackets).
For manipulating complex jsonable data, one will always prefer writing:
buildrequest.properties.myprop
rather than
brdict['properties']['myprop']

So what you want is to provide a dict-like class which has both a
__getitem__ and a __getattribute__, which produce mostly the same
results, provided the parameters happen to be reasonable and do not
conflict with other methods. (Similarly for *set*, *del*, and
__contains__, and maybe others.) This has been proposed and discussed
and even implemented many times on this list and others.
This ability in JS is well known for its flaws (e.g. http://drupal.org/node/172169#forin ), and I understand why this is not a feature that we want in python by default. I did work on class that adds this feature, and that I wish to use for manipulating my json data.

There are many more flaws than just the hiding of certain items because
of existing attributes. Additionally, this would only work for items
whose keys are strings, and are strings that happen to be legal symbol
names and not keywords. If you also support __setitem__ or __delitem__
you run the risk of arbitrary code trashing the code that makes the
class work.
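
A minimal sketch of these limitations, assuming a hypothetical AttrDict class (it is not one of the implementations from the pull request) that falls back to key lookup when normal attribute lookup fails:

```python
import keyword


class AttrDict(dict):
    """Hypothetical wrapper: attribute reads fall back to dict keys."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup has already failed.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)


d = AttrDict({"my-prop": 1, "class": 2, "keys": 3, 42: 4, "ok": 5})


def reachable_as_attribute(key):
    """True if d.<key> can be written and actually returns d[key]."""
    return (
        isinstance(key, str)
        and key.isidentifier()            # "my-prop" is not a legal name
        and not keyword.iskeyword(key)    # d.class is a syntax error
        and not hasattr(dict, key)        # d.keys is the dict method
    )


reachable = [k for k in d if reachable_as_attribute(k)]
print(reachable)

# The existing method hides the item stored under the same name.
method_wins = callable(d.keys) and d["keys"] == 3
```

Only the "ok" key can be read with dotted syntax here; the other items remain reachable through d[...] (or getattr() for the string keys that are not method names).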
The following github pull request to buildbot has tests that defines specification of such a class, and has several commits, which gives several implementation of the same thing.
https://github.com/buildbot/buildbot/pull/525

All implementation I tried are much slower than a pure native dict access.
Each implementation have bench results in commit comment. All of them are 20+x slower than plain dict!

Assuming you're talking about CPython benchmarks, the dict is highly
optimized C code. And when you provide your own __getitem__
implementation in pure Python, there are many attribute accesses, just
to make the code work.
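
A rough way to see the gap for yourself, as a sketch only: the AttrDict class below is a hypothetical stand-in, not one of the implementations from the pull request, and the numbers depend heavily on the interpreter and machine.

```python
import timeit


class AttrDict(dict):
    """Hypothetical pure-Python wrapper used only for this measurement."""

    def __getattr__(self, name):
        # Runs Python-level code on every attribute read that misses.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)


# Names made visible to the timed statements.
namespace = {"plain": {"myprop": 1}, "wrapped": AttrDict(myprop=1)}
NUMBER = 200000

plain_time = timeit.timeit("plain['myprop']", globals=namespace, number=NUMBER)
attr_time = timeit.timeit("wrapped.myprop", globals=namespace, number=NUMBER)

print("plain dict lookup: %.4f s" % plain_time)
print("attribute lookup:  %.4f s" % attr_time)
print("ratio: %.1fx" % (attr_time / plain_time))
```

The plain lookup stays entirely in C, while the wrapped lookup first fails the normal attribute search and then executes the Python-level __getattr__.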
I would like to have python guys advices on how one could optimize this.

Use C code and slots.
I'd like to eventually post this to python-dev, please tell if this is really not a good idea.

Regards,
Pierre

If you're proposing a new module for the stdlib, one of the (unstated?)
requirements is that it be in regular use by a fairly large audience for
a while.
 
Thomas Rachel

On 19.09.2012 12:24, Pierre Tardy wrote:
One thing that is cooler with java-script than in python is that dictionaries and objects are the same thing. It allows browsing of complex hierarchical data syntactically easy.

For manipulating complex jsonable data, one will always prefer writing:
buildrequest.properties.myprop
rather than
brdict['properties']['myprop']

This is quite easy to achieve (but not so easy to understand):

class JsObject(dict):
    def __init__(self, *args, **kwargs):
        super(JsObject, self).__init__(*args, **kwargs)
        self.__dict__ = self

(Google for JSObject; the credit for this is not mine.)

What does it do? Well, an object's attributes are stored in a dict. If I
subclass dict, the resulting class can be used for this as well.

In this case, a subclass of a dict gets itself as its __dict__. What
happens now is

d = JsObject()

d.a = 1
print(d['a'])

# This results in d.__dict__['a'] = 1.
# As d.__dict__ is d, this is equivalent to d['a'] = 1.

# Now the other way:

d['b'] = 42
print(d.b)

# here as well: d.b reads d.__dict__['b'], which is essentially d['b'].
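
For JSON data specifically, one possible way to get this behaviour at every nesting level is to pass the class as the object_hook of json.loads, which is called for each decoded JSON object. This is only a sketch; the field names in the sample document are made up for illustration.

```python
import json


class JsObject(dict):
    def __init__(self, *args, **kwargs):
        super(JsObject, self).__init__(*args, **kwargs)
        # The instance serves as its own attribute dictionary.
        self.__dict__ = self


# Hypothetical payload; the keys are assumptions, not buildbot's schema.
raw = '{"properties": {"myprop": "value"}, "builderid": 3}'

# object_hook receives every decoded JSON object (a dict) and its
# return value replaces that dict, so nested objects are converted too.
buildrequest = json.loads(raw, object_hook=JsObject)

print(buildrequest.properties.myprop)
print(buildrequest['properties']['myprop'])
```

Both spellings read the same stored value, because the nested object was also created as a JsObject.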


Thomas
 
Oscar Benjamin

Assuming you're talking about CPython benchmarks, the dict is highly
optimized, C code. And when you provide your own __getitem__
implementation in pure python, there are many attribute accesses, just to
make the code work.

I agree with all of Dave's objections to this idea. It is possible, however,
to make a more efficient implementation than the one that you have:

class Namespace(dict):
    def __init__(self, *args, **kwargs):
        dict.__init__(self, *args, **kwargs)
        self.__dict__ = self

This implementation is not really sane, though, as it doesn't hide any of the
dict methods as attributes. It does, however, demonstrate something that could
be a potentially simple way of making an alternate type object in C.
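
A short sketch of what "not really sane" means in practice, using the same Namespace class. The behaviour shown relies on CPython's normal lookup order, in which the instance dictionary is consulted before ordinary (non-data descriptor) methods of the class.

```python
class Namespace(dict):
    def __init__(self, *args, **kwargs):
        dict.__init__(self, *args, **kwargs)
        self.__dict__ = self


ns = Namespace(a=1)

# The dict methods are visible as attributes even though no such key exists.
method_visible = "items" not in ns and callable(ns.items)

# Storing a key with a method's name replaces that method on the instance.
ns["items"] = "oops"
shadowed = ns.items

# The method is still reachable through the class itself.
pairs = sorted(dict.items(ns))
print(method_visible, shadowed, pairs)
```

After the assignment, ns.items() fails because ns.items is now the stored string rather than the method.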

Oscar
 
Oscar Benjamin

I can find this question on SO
http://stackoverflow.com/questions/4984647/accessing-dict-keys-like-an-attribute-in-python
which is basically answered with this solution

class AttributeDict(dict):
    __getattr__ = dict.__getitem__
    __setattr__ = dict.__setitem__


but this does not allow recursive access, you would need to first convert
all nested dictionaries to AttributeDict.
a.b.c.d = 2 # fail
a.b = dict(c=3)
a.b.c=4 # fail

There is no way to control "recursive access" in Python. The statement

a.b.c = 2

is equivalent to the statements

o = a.b # o = a.__getattr__('b')
o.c = 2 # o.__setattr__('c', 2)

The way that the o.c assignment is handled is determined by the type of o
regardless of the type of a. If you're looking for a way to change only the
type of a and make a custom __(set|get)attr__ work for all dicts that are
indirectly referred to then there is no solution to your problem.
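
What remains possible is to change what a.b returns, which is the conversion of nested dictionaries mentioned in the quoted text. A minimal sketch, assuming a variant of AttributeDict that converts a nested plain dict the first time it is read and stores the converted copy back:

```python
class AttributeDict(dict):
    """Sketch: converts nested plain dicts lazily, on first attribute read."""

    def __getattr__(self, name):
        try:
            value = self[name]
        except KeyError:
            raise AttributeError(name)
        if isinstance(value, dict) and not isinstance(value, AttributeDict):
            # Store the converted copy so that later writes are kept.
            # The original plain dict object is no longer referenced here.
            value = AttributeDict(value)
            self[name] = value
        return value

    __setattr__ = dict.__setitem__


a = AttributeDict()
a.b = dict(c=3)
a.b.c = 4          # a.b is now an AttributeDict, so this assignment sticks
print(a['b']['c'])

try:
    a.x.y = 1      # still fails: there is no 'x' to assign into
    missing_ok = True
except AttributeError:
    missing_ok = False
```

This does not contradict the point above: the assignment to c is still handled by the type of the object that a.b returns; the sketch only arranges for that object to be an AttributeDict.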

Oscar
 
