Conceptual flaw in pxdom?

Emanuele D'Arrigo · May 17, 2009

Hi everybody,

I'm looking at pxdom and in particular at its foundation class
DOMObject (source code at the end of the message). In it, the author
attempts to allow the establishment of readonly and read&write
attributes through the special methods __getattr__ and __setattr__. In
so doing, it is possible to create subclasses such as:

class MyClass(DOMObject):

    def __init__(self):
        DOMObject.__init__(self)
        self._anAttribute = "im_a_readonly_attribute"

    ## The presence of the following method allows
    ## read-only access to the attribute without the
    ## underscore, i.e.: aVar = myClassInstance.anAttribute
    def _get_anAttribute(self): return self._anAttribute

    ## Uncommenting the following line allows the setting of "anAttribute".
    ## Commented, the same action would raise an exception.
    ## def _set_anAttribute(self, value): self._anAttribute = value

This is all good and dandy and it works, mostly. However, if you look
at the code below for the method __getattr__, it appears to be
attempting to prevent direct access to -any- variable starting with an
underscore.

def __getattr__(self, key):
    if key[:1]=='_':
        raise AttributeError, key

But access isn't actually prevented because __getattr__ is invoked
-only- if an attribute is not found by normal means. So, is it just me,
or does that little snippet of code either have another purpose or
simply not do the intended job?

Manu

-----
class DOMObject:
    """Base class for objects implementing DOM interfaces

    Provide properties in a way compatible with old versions of Python:
    subclass should provide method _get_propertyName to make a read-only
    property, and also _set_propertyName for a writable. If the readonly
    property is set, all other properties become immutable.
    """
    def __init__(self, readonly= False):
        self._readonly= readonly

    def _get_readonly(self):
        return self._readonly

    def _set_readonly(self, value):
        self._readonly= value

    def __getattr__(self, key):
        if key[:1]=='_':
            raise AttributeError, key
        try:
            getter= getattr(self, '_get_'+key)
        except AttributeError:
            raise AttributeError, key
        return getter()

    def __setattr__(self, key, value):
        if key[:1]=='_':
            self.__dict__[key]= value
            return

        # When an object is readonly, there are a few attributes that can be set
        # regardless. Readonly is one (obviously), but due to a wart in the DOM
        # spec it must also be possible to set nodeValue and textContent to
        # anything on nodes where these properties are defined to be null (with no
        # effect). Check specifically for these property names as a nasty hack
        # to conform exactly to the spec.
        #
        if self._readonly and key not in ('readonly', 'nodeValue', 'textContent'):
            raise NoModificationAllowedErr(self, key)
        try:
            setter= getattr(self, '_set_'+key)
        except AttributeError:
            if hasattr(self, '_get_'+key):
                raise NoModificationAllowedErr(self, key)
            raise AttributeError, key
        setter(value)
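
For readers on a current interpreter, here is a rough Python 3 adaptation of the class above together with a small subclass, as a sketch rather than the author's code. The raise statements use call syntax, NoModificationAllowedErr is a stand-in whose shape is assumed (it is not shown in the excerpt), and the "counter" property is hypothetical.

```python
class NoModificationAllowedErr(Exception):
    """Stand-in for the pxdom exception of the same name (shape assumed)."""
    def __init__(self, obj, key):
        Exception.__init__(self, "cannot modify %r" % (key,))


class DOMObject:
    def __init__(self, readonly=False):
        self._readonly = readonly

    def _get_readonly(self):
        return self._readonly

    def _set_readonly(self, value):
        self._readonly = value

    def __getattr__(self, key):
        if key[:1] == '_':
            raise AttributeError(key)
        try:
            getter = getattr(self, '_get_' + key)
        except AttributeError:
            raise AttributeError(key)
        return getter()

    def __setattr__(self, key, value):
        # Underscore names bypass the property machinery entirely.
        if key[:1] == '_':
            self.__dict__[key] = value
            return
        if self._readonly and key not in ('readonly', 'nodeValue', 'textContent'):
            raise NoModificationAllowedErr(self, key)
        try:
            setter = getattr(self, '_set_' + key)
        except AttributeError:
            if hasattr(self, '_get_' + key):
                raise NoModificationAllowedErr(self, key)
            raise AttributeError(key)
        setter(value)


class MyClass(DOMObject):
    def __init__(self):
        DOMObject.__init__(self)
        self._anAttribute = "im_a_readonly_attribute"
        self._counter = 0  # hypothetical writable property

    def _get_anAttribute(self):
        return self._anAttribute

    def _get_counter(self):
        return self._counter

    def _set_counter(self, value):
        self._counter = value


def try_set(obj, key, value):
    """Return None on success, else the name of the exception raised."""
    try:
        setattr(obj, key, value)
    except (NoModificationAllowedErr, AttributeError) as exc:
        return type(exc).__name__
    return None
```

With this sketch, try_set(MyClass(), "anAttribute", "x") reports NoModificationAllowedErr because only a getter exists, while setting "counter" succeeds until the object's readonly property is switched on.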

Stefan Behnel · May 18, 2009

Emanuele said:
I'm looking at pxdom and in particular at its foundation class
DOMObject

I didn't know pxdom, but looking at it now I can see that it hasn't been
updated since 2006. Not sure if that means that it is complete or that it
has been abandoned.

Anyway, seeing that it only provides DOM compliance, without anything
further like XPath or whatever, and that it doesn't focus on performance in
any way, you might still be better off with ElementTree, which is in the
stdlib since Python 2.5 (and available for Py2.2+).

Stefan

Paul Boddie · May 18, 2009

I didn't know pxdom, but looking at it now I can see that it hasn't been
updated since 2006. Not sure if that means that it is complete or that it
has been abandoned.

Maybe the developer is mostly satisfied with it.

Anyway, seeing that it only provides DOM compliance, without anything
further like XPath or whatever, and that it doesn't focus on performance in
any way, you might still be better off with ElementTree, which is in the
stdlib since Python 2.5 (and available for Py2.2+).

To put the inquirer's remarks in context, I suggested that he look at
pxdom specifically as a replacement for minidom and in response to the
following statement: "I've used etree and lxml successfully before but
I wanted to understand how close I can get to the W3C DOM standards."
Maybe you missed that thread, but here's a link to it:

http://groups.google.com/group/comp.lang.python/browse_frm/thread/d445363b99001ad6

Paul

Stefan Behnel · May 19, 2009

Paul said:
Maybe the developer is mostly satisfied with it.

Well, hard to tell without asking the author. I'm far from saying that a
"complete" piece of software is a bad thing, but bit-rot is still the death
of all now-working software.

To put the inquirer's remarks in context, I suggested that he look at
pxdom specifically as a replacement for minidom and in response to the
following statement: "I've used etree and lxml successfully before but
I wanted to understand how close I can get to the W3C DOM standards."
Maybe you missed that thread, but here's a link to it:

http://groups.google.com/group/comp.lang.python/browse_frm/thread/d445363b99001ad6

Ah, yes, I missed that thread. I always wonder why people want "DOM
compliance". I don't consider that a value by itself, and I don't see the
advantage that the DOM API has over other XML APIs (which, most of the
time, were designed to make life simpler for DOM suffering developers). I
find it a lot more important to get the stuff working quickly and in a
(somewhat) robust way, which is hard to achieve in DOM code. It's pretty
easy to write unmaintainable code that uses the DOM API, though.

Stefan

Emanuele D'Arrigo · May 19, 2009

It's pretty easy to write unmaintainable code that uses the DOM API, though.

I'm finding that out at my own expense...

Why would anybody want to use the DOM? I suppose the main reason is
that it is one of the most reliable standards around. It might be more
complicated, but that's probably because lots of very smart people
thought about it very carefully and it couldn't be made any simpler.
And hopefully will stand the test of time.

Anyway. Any taker on the original (very python-related rather than DOM
related) problem at hand?

Manu

Peter Otten · May 19, 2009

Emanuele said:
This is all good and dandy and it works, mostly. However, if you look
at the code below for the method __getattr__, it appears to be
attempting to prevent direct access to -any- variable starting with an
underscore.

def __getattr__(self, key):
    if key[:1]=='_':
        raise AttributeError, key

But access isn't actually prevented because __getattr__ is invoked
-only- if an attribute is not found by normal means. So, is it just me,
or does that little snippet of code either have another purpose or
simply not do the intended job?

It doesn't do what you think it does; it is there to prevent "infinite"
recursion. Have a look at the complete method:

def __getattr__(self, key):
    if key[:1]=='_':
        raise AttributeError, key
    try:
        getter= getattr(self, '_get_'+key)
    except AttributeError:
        raise AttributeError, key
    return getter()

If you instantiate the object and try to access the non-existent
attribute yadda, __getattr__() will be called with key="yadda", which
doesn't start with an underscore, so getattr(self, "_get_yadda")
triggers another __getattr__() call, as _get_yadda doesn't exist either.
If the check weren't there, yet another getattr(self, "_get__get_yadda")
call would follow, and so on until the recursion limit is reached.
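
A small Python 3 sketch of that effect, using two hypothetical classes (not pxdom code) that differ only in the underscore check. It assumes Python 3.5 or later, where hitting the recursion limit raises RecursionError.

```python
class WithGuard:
    def __getattr__(self, key):
        if key[:1] == '_':
            # Stops the chain: '_get_yadda' is rejected immediately.
            raise AttributeError(key)
        try:
            getter = getattr(self, '_get_' + key)
        except AttributeError:
            raise AttributeError(key)
        return getter()


class WithoutGuard:
    def __getattr__(self, key):
        # No underscore check: looking up '_get_' + key re-enters
        # __getattr__ with an ever longer key.
        try:
            getter = getattr(self, '_get_' + key)
        except AttributeError:
            raise AttributeError(key)
        return getter()


def lookup_outcome(obj, name):
    """Report how an attribute lookup on obj ends."""
    try:
        getattr(obj, name)
    except AttributeError:
        return "AttributeError"
    except RecursionError:
        return "RecursionError"
    return "found"
```

lookup_outcome(WithGuard(), "yadda") ends in a plain AttributeError after two __getattr__ calls, whereas lookup_outcome(WithoutGuard(), "yadda") keeps recursing until the interpreter gives up.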

Peter

Diez B. Roggisch · May 19, 2009

Emanuele said:
I'm finding that out at my own expense...

Why would anybody want to use the DOM? I suppose the main reason is
that it is one of the most reliable standards around. It might be more
complicated, but that's probably because lots of very smart people
thought about it very carefully and it couldn't be made any simpler.
And hopefully will stand the test of time.

Sorry to say so, but that's nonsense. DOM is not complicated because it
contains anything superior - the reason (if any) is that it is formulated
as language-agnostic as possible, with the unfortunate result it is rather
clumsy to use in all languages.

Additionally, it *attempts* to unify APIs in browsers, unfortunately it is
only moderately successful.

APIs such as ElementTree don't try to burden themselves with the
language-agnosticism, and thus are much more powerful.

And no, I don't think being language-agnostic is a virtue in an API. Simply
because this can never really be reached, and then is questionable anyway.

Diez

Paul Boddie · May 19, 2009

Sorry to say so, but that's nonsense. DOM is not complicated because it
contains anything superior - the reason (if any) is that it is formulated
as language-agnostic as possible, with the unfortunate result it is rather
clumsy to use in all languages.

Although I presume that people really mean the core standards when
they talk about "the DOM", not all the other ones related to those
core standards, the API is not to everyone's taste because, amongst
other things, it uses functions and methods when some people would
rather use properties (which actually appear in various places in the
standards, so it isn't as if the W3C haven't heard of such things),
and for lots of other subjective reasons: some I can agree with, some
I put at the same level as a lot of the API-posturing in numerous
domains where Python code gets written, where such code jostles above
all other concerns for the coveted "Pythonic" label.

However, when people are actually choosing to use DOM-related
technologies, and when those technologies do not necessarily have
equivalents in whatever other technology stack that could be
suggested, can we not just take it as read that they actually know
that the DOM isn't very nice (or that other people don't think that
it's very nice) and that there are alternatives to the core stuff
(especially when the inquirer has actually indicated his familiarity
with those alternatives) and that reminding everyone for the nth time
about how bad the DOM is (for whatever tangential purpose only
partially related to the topic under discussion) adds very little if
anything in the way of advice? It's like someone saying that they're
going to fly the Atlantic in a 747 only to be told that they should
drive a Lexus because "Boeing make terrible cars".

Feel free to replace "DOM" in the above with whatever else fits,
because this kind of thing comes up all the time.

Paul

Damien Neil · May 20, 2009

Diez B. Roggisch said:
APIs such as ElementTree don't try to burden themselves with the
language-agnosticism, and thus are much more powerful.

Having used both ElementTree and xml.dom, I don't see that ET is any
more powerful. Both APIs let you manipulate an XML tree in pretty much
any way possible. (The ET bundled with Python is rather less powerful,
since it lacks support for processing instructions and other XML
features, but lxml corrects that.)

Personally, I find ElementTree's handling of text nodes to be very
clumsy compared to that in the DOM. For me, DOM with XPath is much
nicer than ET. Tastes differ, of course.

- Damien

Diez B. Roggisch · May 20, 2009

Paul said:
Although I presume that people really mean the core standards when
they talk about "the DOM", not all the other ones related to those
core standards, the API is not to everyone's taste because, amongst
other things, it uses functions and methods when some people would
rather use properties (which actually appear in various places in the
standards, so it isn't as if the W3C haven't heard of such things),
and for lots of other subjective reasons: some I can agree with, some
I put at the same level as a lot of the API-posturing in numerous
domains where Python code gets written, where such code jostles above
all other concerns for the coveted "Pythonic" label.

However, when people are actually choosing to use DOM-related
technologies, and when those technologies do not necessarily have
equivalents in whatever other technology stack that could be
suggested, can we not just take it as read that they actually know
that the DOM isn't very nice (or that other people don't think that
it's very nice) and that there are alternatives to the core stuff
(especially when the inquirer has actually indicated his familiarity
with those alternatives) and that reminding everyone for the nth time
about how bad the DOM is (for whatever tangential purpose only
partially related to the topic under discussion) adds very little if
anything in the way of advice? It's like someone saying that they're
going to fly the Atlantic in a 747 only to be told that they should
drive a Lexus because "Boeing make terrible cars".

Feel free to replace "DOM" in the above with whatever else fits,
because this kind of thing comes up all the time.

You could have written that same reply when the OP stated that

"""
It [DOM] might be more
complicated, but that's probably because lots of very smart people
thought about it very carefully and it couldn't be made any simpler.
"""

Which is another way of saying "I know X, but Y is better."

Also, not trying to convince people that there are better alternatives to
what and how they do something (admittedly, better is subjective, thus
ensues discussion), or gathering arguments on why they do believe their way
is preferable is the very essence of fora like this - if you'd really want
that to go away, we're down to answering the notorious
mutable-default-argument-question.

Diez

Paul Boddie · May 20, 2009

Also, not trying to convince people that there are better alternatives to
what and how they do something (admittedly, better is subjective, thus
ensues discussion), or gathering arguments on why they do believe their way
is preferable is the very essence of fora like this - if you'd really want
that to go away, we're down to answering the notorious
mutable-default-argument-question.

Yes, but the inquirer wasn't asking for general advice on what to use.
Again, if you're intending to use (or are constrained to using) one
thing, having people tell you to use something else has limited
benefits. I don't necessarily agree with the statements about the DOM
being particularly well thought out, but those were made in response to
the already-issued tangential advice, which seems to be the norm on
python-list/comp.lang.python.

It's nice to see that one person responded to the inquirer's original
message, anyway, rather than attempt to "educate" him under the
assumption that he doesn't at all know what he's doing. In Google
Groups, you can see the "profile" of people and work out what they
seem to know fairly easily - a useful tool for anyone who doesn't
remember who asked what. Maybe some liberal usage of tools like that
would have saved us eight or so messages in this thread already.

Paul
