Jython lacks working XML processing modules?

Jane Austine

I'm trying to parse an XML file with Jython (not through Java parsers
like Xerces).

I tried minidom in Jython 2.1 and 2.2a, but both failed.

What can I do? The last resort would be using Java parsers. Then how
can I use them like the Python XML parsers? It seems that javadom and
javasax have something to do with it, but I don't know how.

Jane
 
Alan Kennedy

[Jane Austine]
> I'm trying to parse an xml file with jython (not through java parsers
> like xerces).
>
> I tried minidom in jython 2.1 and 2.2a but all failed.

It's quite likely that your documents contained namespaces. The only
parser supported in jython is "xmlproc", because it is pure python.
However, "xmlproc" has some significant bugs in relation to namespace
processing, IIRC from the last time I looked at it.

> What can I do?

1. Use a Java SAX2 parser, write a jython ContentHandler for it, build
a Minidom from the events.
2. Use a Java DOM processor (DOM4J, JDOM, etc), and let it build a DOM
for you.

It would probably be easier if you could give an outline of what you
are trying to achieve. For example, do you really need to build an
object model? Do you need to use xpath? Do you need to validate
structures? Etc, etc.

> The last resort would be using java parsers. Then how
> can I use them like python xml parsers? It seems like javadom and
> javasax has something to do, but I don't know how.

If you want to know about using SAX events to build object models,
check this old thread on c.l.py.

http://groups.google.com/[email protected]
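
As a rough illustration of building an object model from SAX events (a sketch only, not taken from the linked thread): the handler below keeps a stack of open elements and attaches each new element to the one on top. `Node`, `TreeBuilder` and `build_tree` are made-up names. It uses Python's standard xml.sax API, so it assumes an interpreter whose xml.sax can actually find a working parser. With a Java SAX2 parser the handler methods would look different, since the Java ContentHandler also receives the namespace URI and local name.

```python
import xml.sax


class Node(object):
    """Hypothetical tree node for this sketch; not part of any library."""

    def __init__(self, name, attrs):
        self.name = name
        self.attrs = dict(attrs)
        self.children = []
        self.text = ""


class TreeBuilder(xml.sax.ContentHandler):
    """Collects SAX events into a tree of Node objects."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.root = None
        self.stack = []  # elements that are currently open

    def startElement(self, name, attrs):
        node = Node(name, attrs.items())
        if self.stack:
            # Attach to the innermost open element.
            self.stack[-1].children.append(node)
        else:
            self.root = node
        self.stack.append(node)

    def endElement(self, name):
        self.stack.pop()

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self.stack:
            self.stack[-1].text += content


def build_tree(xml_text):
    """Parse a string and return the root Node."""
    handler = TreeBuilder()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.root
```

Calling `build_tree('<doc><a href="http://www.python.org"/></doc>')` should give a `doc` node whose single child carries the `href` attribute in `attrs`.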

If you have any specific questions or face any specific problems, post
some details.

regards,
 
Diez B. Roggisch

Jane said:
> I'm trying to parse an xml file with jython (not through java parsers
> like xerces).
>
> I tried minidom in jython 2.1 and 2.2a but all failed.
>
> What can I do? The last resort would be using java parsers. Then how
> can I use them like python xml parsers? It seems like javadom and
> javasax has something to do, but I don't know how.

There is a really good xml toolkit which even covers xslt and some other
fancy stuff. It's called 4suite, and you can get it here:

http://www.4suite.org/

One of the authors, Uche Ogbuji, has some tutorials on working with it
on developerworks.

Diez
 
Jane Austine

Alan Kennedy said:
>> [Jane Austine]
>> I'm trying to parse an xml file with jython (not through java parsers
>> like xerces).
>>
>> I tried minidom in jython 2.1 and 2.2a but all failed.
>
> It's quite likely that your documents contained namespaces.

No.

Jython 2.1 on java1.4.2-beta (JIT: null)
Type "copyright", "credits" or "license" for more information.
Traceback (innermost last):
  File "<console>", line 1, in ?
  File "C:\work\jython-2.1\Lib\xml\dom\minidom.py", line 913, in parseString
  File "C:\work\jython-2.1\Lib\xml\dom\minidom.py", line 900, in _doparse
  File "C:\work\jython-2.1\Lib\xml\dom\pulldom.py", line 251, in getEvent
AttributeError: feed

> The only parser supported in jython is "xmlproc", because it is pure python.

I can't make it work.
>>> import xml.sax,xml.dom.minidom
>>> p=xml.sax.make_parser(["xml.sax.drivers2.drv_xmlproc"])
>>> xml.dom.minidom.parseString('<tag>foobar</tag>',parser=p)
Traceback (innermost last):
  File "<console>", line 1, in ?
  File "C:\work\jython-2.1\Lib\xml\dom\minidom.py", line 913, in parseString
  File "C:\work\jython-2.1\Lib\xml\dom\minidom.py", line 900, in _doparse
  File "C:\work\jython-2.1\Lib\xml\dom\pulldom.py", line 251, in getEvent
AttributeError: feed

> However, "xmlproc" has some significant bugs in relation to namespace
> processing, IIRC from the last time I looked at it.
>
> 1. Use a Java SAX2 parser, write a jython ContentHandler for it, build
> a Minidom from the events.
> 2. Use a Java DOM processor (DOM4J, JDOM, etc), and let it build a DOM
> for you.
>
> It would probably be easier if you could give an outline of what you
> are trying to achieve. For example, do you really need to build an
> object model? Do you need to use xpath? Do you need to validate
> structures? Etc, etc.

I need the tree model of the xml document. So I'm trying to use DOM parsers.
 
Alan Kennedy

[Alan Kennedy]
[Jane Austine]
> I can't make it work.
> >>> import xml.sax,xml.dom.minidom
> >>> p=xml.sax.make_parser(["xml.sax.drivers2.drv_xmlproc"])
> >>> xml.dom.minidom.parseString('<tag>foobar</tag>',parser=p)
> Traceback (innermost last):
>   File "<console>", line 1, in ?
>   File "C:\work\jython-2.1\Lib\xml\dom\minidom.py", line 913, in parseString
>   File "C:\work\jython-2.1\Lib\xml\dom\minidom.py", line 900, in _doparse
>   File "C:\work\jython-2.1\Lib\xml\dom\pulldom.py", line 251, in getEvent
> AttributeError: feed

Hmm, pity about that. To be honest, I'm not going to try and make it
work. Any time I need XML processing in jython, I use native Java xml
parsers, such as Xerces. I think most jython users work the same way.

[Alan Kennedy]
[Jane Austine]
> I need the tree model of the xml document. So I'm trying to use DOM parsers.

I kind of figured that. My focus with the question was "Why do you
need a tree model?" Do you need to do data extraction, structure
validation, xpath/xslt, etc?

If this helps, here is a jython snippet that creates a DOM using
Apache Xerces.

#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

from java.io import StringReader

import org.xml.sax as sax
import org.apache.xerces.parsers.DOMParser as domparser

if __name__ == "__main__":
    # Create the Xerces DOM parser (Xerces must be on the CLASSPATH).
    parser = domparser()
    document = """<doc><a href="http://www.python.org"/></doc>"""
    parser.reset()
    # Wrap the string in a SAX InputSource so the parser can read it.
    documentIS = sax.InputSource(StringReader(document))
    parser.parse(documentIS)
    domtree = parser.getDocument()
    # Print the href attribute of every <a> element.
    results = domtree.getElementsByTagName('a')
    for ix in range(results.getLength()):
        print "Link found: uri=%s" % results.item(ix).getAttribute('href')

#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

HTH,
 
Uche Ogbuji

Diez B. Roggisch said:
> There is a really good xml toolkit which even covers xslt and some other
> fancy stuff. It's called 4suite, and you can get it here:
>
> http://www.4suite.org/
>
> One of the authors, Uche Ogbuji, has some tutorials on working with it
> on developerworks.

Thanks for the plug, but 4Suite has a lot of C code in it and is
probably not an option for use in Java without some porting work.
 
Paul Boddie

> I'm trying to parse an xml file with jython (not through java parsers
> like xerces).
>
> I tried minidom in jython 2.1 and 2.2a but all failed.
>
> What can I do? The last resort would be using java parsers. Then how
> can I use them like python xml parsers? It seems like javadom and
> javasax has something to do, but I don't know how.

The Java parsers seem to work quite well with the xml.dom.javadom
package. First, update your CLASSPATH with references to the necessary
.jar files - this can be a frustrating process, but I found that
xercesImpl-2.5.0.jar and xml-apis.jar were a good combination:

export CLASSPATH=.../xercesImpl-2.5.0.jar:.../xml-apis.jar

Then, start jython and try the following:

import xml.dom.javadom
impl = xml.dom.javadom.XercesDomImplementation()
# Use your own filename below!
doc = impl.buildDocumentFile("example.xml")
# Now, try some PyXML-style DOM properties and methods.
doc.childNodes
doc.childNodes[0].getAttribute("some-attr")

I'd seen javadom lurking in PyXML before now, but it's a nice surprise
to see that it works rather well, especially since I've never seen any
evidence of anyone using it (as far as I remember).

Paul
 
gaodexiaozheng

On Monday, 24 November 2003 at 7:42:31 PM UTC+8, Paul Boddie wrote:
> (e-mail address removed) (Jane Austine) wrote in message
>> I'm trying to parse an xml file with jython (not through java parsers
>> like xerces).
>>
>> I tried minidom in jython 2.1 and 2.2a but all failed.
>>
>> What can I do? The last resort would be using java parsers. Then how
>> can I use them like python xml parsers? It seems like javadom and
>> javasax has something to do, but I don't know how.
>
> The Java parsers seem to work quite well with the xml.dom.javadom
> package. First, update your CLASSPATH with references to the necessary
> .jar files - this can be a frustrating process, but I found that
> xercesImpl-2.5.0.jar and xml-apis.jar were a good combination:
>
> export CLASSPATH=.../xercesImpl-2.5.0.jar:.../xml-apis.jar
>
> Then, start jython and try the following:
>
> import xml.dom.javadom
> impl = xml.dom.javadom.XercesDomImplementation()
> # Use your own filename below!
> doc = impl.buildDocumentFile("example.xml")
> # Now, try some PyXML-style DOM properties and methods.
> doc.childNodes
> doc.childNodes[0].getAttribute("some-attr")
>
> I'd seen javadom lurking in PyXML before now, but it's a nice surprise
> to see that it works rather well, especially since I've never seen any
> evidence of anyone using it (as far as I remember).
>
> Paul

Hi, do you know whether PyXML is supported by Jython?
Looking forward to your reply!
Thanks
 
Stefan Behnel

(e-mail address removed), 17.07.2012 10:35:
> hi,do you know the PyXML whether can be supported by Jython ?

PyXML is a dead project, don't use it.

You can use ElementTree in Jython, just as in Python.
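
For illustration, a minimal ElementTree sketch along those lines, reusing the small sample document from Alan's Xerces snippet earlier in the thread. It assumes `xml.etree.ElementTree` is importable in your interpreter; the variable names are just for this example.

```python
import xml.etree.ElementTree as ET

# Same sample document as in the Xerces example earlier in the thread.
document = '<doc><a href="http://www.python.org"/></doc>'

# fromstring() parses the text and returns the root element.
root = ET.fromstring(document)

# findall("a") returns the direct <a> children of the root.
links = [a.get("href") for a in root.findall("a")]

for uri in links:
    print("Link found: uri=%s" % uri)
```

The whole document is held as a tree of elements, so this covers the "I need the tree model" case without going through a DOM API.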

Stefan
 
gaodexiaozheng

On Tuesday, 17 July 2012 at 6:02:31 PM UTC+8, Stefan Behnel wrote:
> Matej Cepl, 17.07.2012 11:39:
>> On 17/07/12 10:35, (e-mail address removed) wrote:
>>>> > I'm trying to parse an xml file with jython (not through java
>>>> > parsers like xerces).
>>
>> https://code.google.com/p/jython-elementtree/ ???
>
> Note that this ships with Jython 2.5.
>
> Stefan

Got that, thanks!
However, my Jython project has to depend on an existing Python project that uses PyXML. If Jython doesn't support PyXML, my project can't depend on that Python project, and I may not be able to move forward unless I find another project to replace it.
 
Matej Cepl

> However,there is one project implemented by Python used PyXML and now
> my Jython project has to depend on it ,so I am afraid that if Jython
> doesn't support PyXML,then my jython project can not depend on the
> original Python project ,then my jython project maybe can not move on
> unless I find another project to take place of the original Python
> project.

I think, if possible, such a project should switch away from PyXML anyway.
If you send them a nice patch that ports it to the standard ElementTree
(and, as a side effect, makes the project work with Jython), they will
like you. I guess.
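
To give a feel for what such a port might involve, here is a small side-by-side sketch. The sample document and both function names are invented for illustration, and the standard library's minidom stands in for PyXML-style DOM calls; the real project's code will of course look different.

```python
import xml.dom.minidom
import xml.etree.ElementTree as ET

# Invented sample data for this sketch.
SAMPLE = '<config some-attr="1"><item name="a"/><item name="b"/></config>'


def names_with_dom(text):
    """DOM style: getElementsByTagName() plus getAttribute()."""
    doc = xml.dom.minidom.parseString(text)
    return [node.getAttribute("name")
            for node in doc.getElementsByTagName("item")]


def names_with_elementtree(text):
    """ElementTree style: findall() with a path, plus get()."""
    root = ET.fromstring(text)
    return [item.get("name") for item in root.findall(".//item")]
```

Both functions should return the same list for the sample, which makes a handy check while converting code one call site at a time.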

Matěj
 
