Module for converting XML to Python object(s)?

Robert Oschler

Has anybody seen a Python module that will take an XML document (not a
colossal one), and convert it to a Python nested class object? I'm
basically looking for something that would allow me to parse an XML document
(not tokenize it like SAX or make it into an XPath accessible DOM object
like others), directly into a nested Python object so I could access
everything through Python class attribute references.

Thanks.
 
François Pinard

[Robert Oschler]
Has anybody seen a Python module that will take an XML document (not a
colossal one), and convert it to a Python nested class object?

You might want to check this announcement:

From: "Fredrik Lundh" <[email protected]>
Date: Fri, 18 Jun 2004 17:07:43 +0200
Subject: ANN: ElementTree 1.2 final (june 18, 2004)
To: (e-mail address removed)
Newsgroups: comp.lang.python.announce

The Element type is a simple but flexible container object,
designed to store hierarchical data structures, such as
simplified XML infosets, in memory. The ElementTree package
provides a Python implementation of this type, plus code to
serialize element trees to and from XML files.

The 1.2 release adds limited support for XPath and XInclude, and
also fixes a number of serialization bugs, mostly related to
extensive use of namespaces and unicode in tags and attribute
names. For a complete list of changes, see the CHANGES document
in the source kit.

You can get the ElementTree toolkit from:

http://effbot.org/downloads

Documentation, articles, and some code samples (including an
XML-RPC unmarshaller in 16 lines) are available from:

http://effbot.org/zone/element.htm

enjoy /F
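For readers who want to see what the Element container looks like in practice, here is a minimal sketch. It assumes a modern Python, where the ElementTree package ships in the standard library as xml.etree.ElementTree (the 1.2 release announced above was a separate download). The sample document and variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# A small, made-up document used only for this illustration.
doc = (
    "<html><head><title>Hello</title></head>"
    "<body><p class='intro'>Hi there</p></body></html>"
)

# Parse the XML text into a tree of Element objects.
root = ET.fromstring(doc)

# Children are reached with path expressions, not attribute references.
title = root.findtext("head/title")
para = root.find("body/p")
para_class = para.get("class")

# Serialize the tree back to XML text.
serialized = ET.tostring(root, encoding="unicode")

print(title, para_class)
```

Note that access goes through find() and findtext() with path strings, so this is a container API and not a set of classes generated from the XML vocabulary.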

 
Uche Ogbuji

Robert Oschler said:
Has anybody seen a Python module that will take an XML document (not a
colossal one), and convert it to a Python nested class object? I'm
basically looking for something that would allow me to parse an XML document
(not tokenize it like SAX or make it into an XPath accessible DOM object
like others), directly into a nested Python object so I could access
everything through Python class attribute references.

There is a metric ton of these "data bindings". See:

http://www.xml.com/pub/a/2003/06/11/py-xml.html
http://www.xml.com/pub/a/2003/07/02/py-xml.html
http://www.xml.com/pub/a/2003/08/13/py-xml.html
http://www.xml.com/pub/a/2003/12/17/py-xml.html

Based on your description, your best bets are generateDS,
gnosis.xml.objectify, xmltramp and Anobind.

Good luck, and do report back on your experiences.


--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com
Perspective on XML: Steady steps spell success with Google -
http://www.adtmag.com/article.asp?id=9663
Use XML namespaces with care -
http://www-106.ibm.com/developerworks/xml/library/x-namcar.html
Managing XML libraries - http://www.adtmag.com/article.asp?id=9160
Commentary on "Objects. Encapsulation. XML?" -
http://www.adtmag.com/article.asp?id=9090
Harold's Effective XML -
http://www.ibm.com/developerworks/xml/library/x-think25.html
A survey of XML standards -
http://www-106.ibm.com/developerworks/xml/library/x-stand4/
 
Uche Ogbuji

François Pinard said:
[Robert Oschler]
Has anybody seen a Python module that will take an XML document (not a
colossal one), and convert it to a Python nested class object?

You might want to check this announcement:

From: "Fredrik Lundh" <[email protected]>
Date: Fri, 18 Jun 2004 17:07:43 +0200
Subject: ANN: ElementTree 1.2 final (june 18, 2004)
To: (e-mail address removed)
Newsgroups: comp.lang.python.announce

The Element type is a simple but flexible container object,
designed to store hierarchical data structures, such as
simplified XML infosets, in memory. The ElementTree package
provides a Python implementation of this type, plus code to
serialize element trees to and from XML files.

ElementTree rocks, but is not really what I understood the OP's
request to be. It seems he wants specialized classes matching the XML
vocabulary, e.g.

print html.head.title

You can't really do this with ElementTree. This is the preserve of a
different class of XML libraries, called data bindings. I discussed
data bindings I know of in my previous response to the OP. Oluyede
reminded me of XMLObject. Also, both Eric van der Vlist and Guido van
Rossum himself have done recent work in data bindings for XML/Python.
Neither's work is packaged for public release yet, though.
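To make the distinction concrete, here is a rough sketch of the attribute-style access a data binding gives you, built as a thin wrapper over the standard library's xml.etree.ElementTree. The Bound class is hypothetical and written only for this illustration; it is not the API of any of the libraries named above, and it ignores namespaces, repeated child elements, and tag names that are not valid Python identifiers.

```python
import xml.etree.ElementTree as ET


class Bound:
    """Hypothetical wrapper exposing child elements as Python attributes."""

    def __init__(self, element):
        self._element = element

    def __getattr__(self, name):
        # __getattr__ runs only when normal attribute lookup fails.
        # Refuse private names so copy/pickle probing cannot recurse.
        if name.startswith("_"):
            raise AttributeError(name)
        child = self._element.find(name)  # first child with that tag
        if child is None:
            raise AttributeError(name)
        return Bound(child)

    def __getitem__(self, key):
        # XML attributes are read with subscript syntax: node["class"].
        return self._element.attrib[key]

    def __str__(self):
        # Printing a node shows its text content.
        return (self._element.text or "").strip()


html = Bound(ET.fromstring(
    "<html><head><title>Hello</title></head>"
    "<body><p class='intro'>Hi</p></body></html>"
))

print(html.head.title)      # attribute-style navigation
print(html.body.p["class"])  # XML attribute lookup
```

A real data binding does considerably more than this (type conversion, lists of repeated elements, round-tripping back to XML), which is why the dedicated libraries are worth a look.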


 
Uche Ogbuji

Colin Brown said:

PyRXPU (the only useful part of PyRXP) is not really what the OP was
asking for. It creates lists and dicts rather than actual element
structures that can be accessed using the natural vocabulary from the
XML (e.g. "print html.head.title").


 
Peter Dobcsanyi

Robert Oschler said:
Has anybody seen a Python module that will take an XML document (not a
colossal one), and convert it to a Python nested class object? I'm

I have something similar implemented in the "ext-rep" module of the
"pydesign" package. Here is an excerpt from the documentation:

XTree is a class to create and manipulate a labeled rooted tree data
structure whose underlying raw data structure is a tree whose nodes are
tuples of the structure:

(node_name, {dictionary of attributes}, [list of children])

The dictionary of attributes represents the label associated with the
node. This raw tree is not accessed directly, however, but through the
XTree object's interface. XTree, in fact, is a lazy recursive wrapper
around the raw structure and hides the implementation details. From the
user's point of view, the internal nodes of the tree presented this way
are of the XTree type. The leaves of the tree are either childless XTree
objects or some particular Python data types. Currently, the following
Python data types can be used for leaves: integer, floating point,
string, list of integers.
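The idea of a lazy wrapper over raw tuples can be sketched roughly as follows. This is not the actual XTree code from pydesign; the class name LazyTree, its methods, and the sample data are all hypothetical and only follow the description above: raw nodes are (node_name, attributes, children) tuples, and child wrappers are created only when a child is accessed.

```python
class LazyTree:
    """Hypothetical lazy wrapper over (name, attributes, children) tuples."""

    def __init__(self, raw):
        self._raw = raw  # the raw tuple is kept as-is, never copied

    @property
    def name(self):
        return self._raw[0]

    @property
    def attributes(self):
        return self._raw[1]

    def __len__(self):
        return len(self._raw[2])

    def __getitem__(self, index):
        child = self._raw[2][index]
        # Internal nodes are tuples and get wrapped on demand;
        # leaves (int, float, str, list of ints) are returned unchanged.
        if isinstance(child, tuple):
            return LazyTree(child)
        return child

    def __iter__(self):
        for index in range(len(self)):
            yield self[index]


# Made-up raw structure for illustration only.
raw = (
    "design", {"id": "d1"},
    [
        ("block", {}, [[0, 1, 2]]),  # leaf: list of integers
        ("weight", {}, [0.5]),       # leaf: floating point
        "label",                     # leaf: string
        7,                           # leaf: integer
    ],
)

tree = LazyTree(raw)
print(tree.name, tree.attributes["id"], tree[0].name, tree[0][0])
```

Because wrappers are built only in __getitem__, walking one branch of a large tree never touches the others.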

The module is specialized to a particular highly structured XML
document, but it should not be difficult to modify it for your needs.
The user can read the whole XML document in one step or, alternatively,
as a "stream" of sub-XTrees.

In fact, I have found this streaming Reader (parser + converter)
solution so useful that I am planning to make the module into a general
parameterizable module.

You can find documentation and the package at:

http://designtheory.org/software/pydesign/

--
Peter Dobcsanyi
 
