minidom questions

xtian

Hi -

I'm doing some data conversion with minidom (turning a csv file into a
specific xml format), and I've hit a couple of small problems.

1: The output format has a header with some xml that looks something
like this:
<item xmlns="" xmlns:thing="http://www.blah.com">
  <thing:child name="smith"/>
</item>

As I understand it, this is a valid use of namespaces.
If I add this to the start of the document, when I do a .toxml(), I
get an exception. Here's a small example:
xmlns:thing="http://www.blah.com"><thing:child

Traceback (most recent call last):
  File "<pyshell#26>", line 1, in -toplevel-
    print doc.toxml()
  File "C:\PYTHON23\lib\xml\dom\minidom.py", line 47, in toxml
    return self.toprettyxml("", "", encoding)
  File "C:\PYTHON23\lib\xml\dom\minidom.py", line 59, in toprettyxml
    self.writexml(writer, "", indent, newl, encoding)
  File "C:\PYTHON23\lib\xml\dom\minidom.py", line 1746, in writexml
    node.writexml(writer, indent, addindent, newl)
  File "C:\PYTHON23\lib\xml\dom\minidom.py", line 811, in writexml
    _write_data(writer, attrs[a_name].value)
  File "C:\PYTHON23\lib\xml\dom\minidom.py", line 301, in _write_data
    data = data.replace("&", "&amp;").replace("<", "&lt;")
AttributeError: 'NoneType' object has no attribute 'replace'

Doing some debugging, the xmlns attribute (is it really an attribute?)
has a value of None, rather than "".
I can work around this by replacing the implementation of
Element.writexml with one including:

value = attrs[a_name].value
if value is None:
    value = ""

Is this a bug? Am I doing something wrong?

2: Formatting - I'd like the output xml not to put extra line breaks
inside elements that contain only text nodes (which is what
.toprettyxml does by default) - the tool that uses the xml treats the
line breaks as significant. The .toxml method works, but I'd like to
have the output be prettier than this (while not being as pretty as
the output of .toprettyxml :). I can see how to get what I want by
replacing Element.writexml with one that checks to see whether all the
childNodes are text. Is there a better way to do this?

Thanks,
xtian
 
Martin v. Löwis

> Is this a bug?

It's a bug. The attribute value should be an empty string.
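
Until the parser is fixed, one way to avoid touching Element.writexml is to clean up the tree before serializing it. The following is only a sketch under assumptions: it targets a current Python 3, and fill_missing_attribute_values is a hypothetical helper name, not part of minidom. Whether the parser still yields None for xmlns="" depends on the Python version; the helper does nothing if every value is already a string.

```python
from xml.dom import Node
from xml.dom.minidom import parseString


def fill_missing_attribute_values(node):
    """Replace attribute values of None with "" throughout the tree.

    Hypothetical helper for illustration; not part of minidom.
    """
    if node.nodeType == Node.ELEMENT_NODE:
        attrs = node.attributes
        for i in range(attrs.length):
            attr = attrs.item(i)
            if attr.value is None:
                attr.value = ""
    # Text nodes and other leaf nodes have an empty childNodes list.
    for child in node.childNodes:
        fill_missing_attribute_values(child)


doc = parseString(
    '<item xmlns="" xmlns:thing="http://www.blah.com">'
    '<thing:child name="smith"/></item>'
)
fill_missing_attribute_values(doc)
print(doc.toxml())
```

This keeps the workaround in your own code, so nothing in the library is replaced.
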

> I can see how to get what I want by
> replacing Element.writexml with one that checks to see whether all the
> childNodes are text. Is there a better way to do this?

Replacing methods in classes is certainly not a good thing.

What you *really* should do is to write your own traversal function
which does the saving of the document. For example, you could use the
PyXML xml.dom.ext.Printer module, and subclass the PrinterVisitor
(if you don't want to write the traversal from scratch).
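
A from-scratch traversal along those lines might look like the sketch below. It uses only the standard library rather than PyXML, assumes a current Python 3, and write_element is a hypothetical name. It handles only elements and text nodes; comments, CDATA sections and processing instructions are skipped to keep it short.

```python
import io
from xml.dom import Node
from xml.dom.minidom import parseString
from xml.sax.saxutils import escape, quoteattr


def write_element(element, out, indent="", addindent="  ", newl="\n"):
    """Write element to out, keeping text-only elements on one line."""
    out.write(indent + "<" + element.tagName)
    for name, value in element.attributes.items():
        # Treat a missing (None) value as an empty string.
        out.write(" %s=%s" % (name, quoteattr(value or "")))

    children = element.childNodes
    if not children:
        out.write("/>" + newl)
    elif all(child.nodeType == Node.TEXT_NODE for child in children):
        # Only text inside: no line breaks are added around it.
        text = "".join(child.data for child in children)
        out.write(">" + escape(text) + "</" + element.tagName + ">" + newl)
    else:
        out.write(">" + newl)
        for child in children:
            if child.nodeType == Node.ELEMENT_NODE:
                write_element(child, out, indent + addindent, addindent, newl)
            elif child.nodeType == Node.TEXT_NODE and child.data.strip():
                # Text mixed with elements goes on its own indented line.
                out.write(indent + addindent + escape(child.data.strip()) + newl)
        out.write(indent + "</" + element.tagName + ">" + newl)


doc = parseString('<rows><row id="1"><name>smith</name><note/></row></rows>')
buffer = io.StringIO()
write_element(doc.documentElement, buffer)
print(buffer.getvalue())
```

Nested elements are indented as with .toprettyxml, while an element such as <name>smith</name> stays on a single line.
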

If you really think you should only replace a single toxml
implementation on elements, you could use extended elements. To do so,
inherit from the Element class, redefining toxml, from the Document
class, redefining createElement, and from the DOMImplementation
class, redefining createDocument.
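
As a rough sketch of the extended-elements route, assuming a current Python 3: the class names below are hypothetical, and the element overrides writexml rather than toxml, because the traceback above shows toxml and toprettyxml both ending up in writexml. The createDocument here is deliberately simplified and ignores namespaces and doctypes.

```python
from xml.dom import Node, minidom
from xml.sax.saxutils import escape, quoteattr


class InlineTextElement(minidom.Element):
    """Element that writes itself on one line when it holds only text."""

    def writexml(self, writer, indent="", addindent="", newl=""):
        children = self.childNodes
        if children and all(c.nodeType == Node.TEXT_NODE for c in children):
            writer.write(indent + "<" + self.tagName)
            for name, value in self.attributes.items():
                writer.write(" %s=%s" % (name, quoteattr(value or "")))
            text = "".join(c.data for c in children)
            writer.write(">%s</%s>%s" % (escape(text), self.tagName, newl))
        else:
            # Anything else is left to the standard implementation.
            super().writexml(writer, indent, addindent, newl)


class InlineTextDocument(minidom.Document):
    def createElement(self, tagName):
        element = InlineTextElement(tagName)
        element.ownerDocument = self
        return element


class InlineTextImplementation(minidom.DOMImplementation):
    def createDocument(self, namespaceURI, qualifiedName, doctype):
        # Simplified: namespaceURI and doctype are ignored in this sketch.
        doc = InlineTextDocument()
        if qualifiedName is not None:
            doc.appendChild(doc.createElement(qualifiedName))
        return doc


doc = InlineTextImplementation().createDocument(None, "rows", None)
name = doc.createElement("name")
name.appendChild(doc.createTextNode("sm"))
name.appendChild(doc.createTextNode("ith"))
doc.documentElement.appendChild(name)
print(doc.toprettyxml(indent="  "))
```

Only elements created through this document's createElement get the new behaviour; a document built by the parser still uses the stock Element class.
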

HTH,
Martin
 
