Issues with XMLTreeBuilder in cElementTree and ElementTree

Michael Becker

I had some XML files being output by an application whose formatting did
not allow for easy editing by humans, so I was trying to write a short
Python app to pretty-print XML files. Most of the data in these XML
files is in the attributes, so I wanted each attribute on its own line.
I wrote a short app using xml.etree.ElementTree.XMLTreeBuilder(). To
my dismay, the attributes were getting reordered. I found that the
implementation of XMLTreeBuilder did not make proper use of the
ordered_attributes attribute of the expat parser (which it enables by
default). The constructor sets ordered_attributes = 1, but then the
_start_list method iterates through the ordered list of attributes and
stores them in a dictionary, which discards the order! This is
incredibly unintuitive and seems to me to be a bug. I would recommend
the following changes to ElementTree.py:

class XMLTreeBuilder:
    ....
    def _start_list(self, tag, attrib_in):
        fixname = self._fixname
        tag = fixname(tag)
        attrib = []
        if attrib_in:
            # attrib_in is a flat list: [name0, value0, name1, value1, ...]
            for i in range(0, len(attrib_in), 2):
                attrib.append((fixname(attrib_in[i]),
                               self._fixtext(attrib_in[i+1])))
        return self._target.start(tag, attrib)

class _ElementInterface:
    ....

    def items(self):
        try:
            return self.attrib.items()
        except AttributeError:
            # attrib is already a list of (name, value) pairs
            return self.attrib

These changes would allow the user to take advantage of the
ordered_attributes attribute in the expat parser to use either ordered
or unordered attributes as desired. For backwards compatibility it might
be desirable to change XMLTreeBuilder to default to ordered_attributes
= 0. I've never submitted a bug fix to a Python library, so if this
seems like a real bug, please let me know how to proceed.
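
To show what the ordered list looks like before it gets turned into a dictionary, here is a minimal sketch that talks to expat directly through xml.parsers.expat rather than through XMLTreeBuilder. The function name ordered_attribute_pairs and the sample element are made up for illustration; the only library features relied on are ParserCreate(), the ordered_attributes flag, StartElementHandler and Parse().

```python
import xml.parsers.expat


def ordered_attribute_pairs(xml_text):
    """Return [(tag, [(name, value), ...]), ...] with attributes in document order.

    Hypothetical helper for illustration only; not part of ElementTree.
    """
    seen = []

    def start(tag, attrs):
        # With ordered_attributes enabled, expat passes a flat list
        # [name0, value0, name1, value1, ...] instead of a dictionary.
        pairs = [(attrs[i], attrs[i + 1]) for i in range(0, len(attrs), 2)]
        seen.append((tag, pairs))

    parser = xml.parsers.expat.ParserCreate()
    parser.ordered_attributes = True
    parser.StartElementHandler = start
    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return seen


# Sample input (made up): attributes deliberately not in alphabetical order.
result = ordered_attribute_pairs('<item zeta="1" alpha="2" mid="3"/>')

# Print each attribute on its own line, in the order it appeared.
for tag, pairs in result:
    print(tag)
    for name, value in pairs:
        print("    %s=%r" % (name, value))
```

Keeping the pairs in a list, as in the proposed _start_list above, is what preserves the order; converting them to a plain dictionary is the step where it could be lost.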

Secondly, I found a potential issue with the cElementTree module. My
understanding (which could be incorrect) of Python C modules is that
they should work the same as the Python versions but be more
efficient. The XMLTreeBuilder class in cElementTree doesn't seem to be
using the same parser as the one in ElementTree. The following code
illustrates this issue:
Traceback (most recent call last):

In case it is relevant, here is the version and environment
information:
tpadmin@osswlg1{/tpdata/ossgw/config} $ python -V
Python 2.5.1
tpadmin@osswlg1{/tpdata/ossgw/config} $ uname -a
SunOS localhost 5.10 Generic_118833-33 sun4u sparc SUNW,Netra-240
 
Stefan Behnel

Michael said:
Secondly, I found a potential issue with the cElementTree module. My
understanding (which could be incorrect) of python C modules is that
they should work the same as the python versions but be more
efficient. The XMLTreeBuilder class in cElementTree doesn't seem to be
using the same parser as that in ElementTree. The following code
illustrates this issue:

Traceback (most recent call last):

Mind the underscore. You are using a non-public interface here. Don't expect
other implementations to support that.

Stefan
 
