Ugly SAX

Roedy Green

I wrote a bit of code using SAX to extract data from an XML
configuration file whose structure I composed myself. I thought to
myself, "This can't be right. Nobody in their right mind would invent
something so clumsy to extract the data." I used as a model various
bits of code I found on the net. I am hoping this information was
obsolete.

Is there a more streamlined way to do this?

In this case the XML file is quite small, so perhaps a DOM approach
might be more appropriate.

Here is the XML file I want to extract data from:

https://wush.net/websvn/mindprod/fi...&path=/com/mindprod/htmlreflow/htmlreflow.xml

Here is the XSD schema for the file

https://wush.net/websvn/mindprod/fi...&path=/com/mindprod/htmlreflow/htmlreflow.xsd

Here is my parsing code
https://wush.net/websvn/mindprod/fi...&path=/com/mindprod/htmlreflow/Configure.java

What bothers me is I explained in the XSD considerable detail about
the structure of the document, but none of this knowledge is
automatically used in extracting data.
 
Mike Schilling

Roedy said:
Here is my parsing code
https://wush.net/websvn/mindprod/fi...&path=/com/mindprod/htmlreflow/Configure.java

Is this the code you meant to post? There's no SAX in it.

What bothers me is I explained in the XSD considerable detail about
the structure of the document, but none of this knowledge is
automatically used in extracting data.

To use schema information when parsing, use JAXB to generate Java classes that
correspond to your schema types, which will also have the logic to
deserialize themselves from XML.
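
JAXB is the route suggested here. As a smaller step that needs nothing beyond the JDK, the XSD can at least be enforced when the file is read, via javax.xml.validation. This is a different technique from JAXB (it checks the document against the schema but does not generate classes), and the sketch below is only illustrative: the schema and the config and width names are made up, not taken from htmlreflow.xsd.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class SchemaCheck {
    // Hypothetical schema: a <config> root holding exactly one integer <width>.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='config'>"
      + "<xs:complexType><xs:sequence>"
      + "<xs:element name='width' type='xs:int'/>"
      + "</xs:sequence></xs:complexType>"
      + "</xs:element>"
      + "</xs:schema>";

    /** Returns true if xml conforms to XSD, false if the validator rejects it. */
    static boolean isValid(String xml) throws SAXException, IOException {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        // A SAXException here would mean the schema itself is broken, so it propagates.
        Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        Validator validator = schema.newValidator();
        try {
            // With no ErrorHandler set, a validation error is thrown as a SAXException.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid("<config><width>80</width></config>"));   // prints true
        System.out.println(isValid("<config><width>wide</width></config>")); // prints false
    }
}
```

The second document is rejected because "wide" is not a valid xs:int, so the type information written into the schema is doing some work even before any data is extracted.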
 
Roedy Green

Is this the code you meant to post? There's no SAX in it.

Right. There is
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

which led me to believe I was using SAX, but actually it is a DOM
parse.

I would rephrase my question. Is the code obsolete? Is there a better
way to do this that takes advantage of field type information in the
XSD schema? Is there a terser way to extract data?
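
On the "terser" part of the question, one option that stays inside the JDK is XPath over the DOM tree. The sketch below is a minimal illustration on a made-up document (the config, width and mode names are hypothetical, not taken from htmlreflow.xml). It also shows where the org.xml.sax imports come from in DOM code: DocumentBuilder.parse is declared to throw SAXException. Note that XPath used this way returns strings, so it does not take advantage of the type information in the XSD.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class XPathExtract {
    /** Parses xml into a DOM tree and returns the string value of the XPath expression. */
    static String extract(String xml, String expression)
            throws ParserConfigurationException, SAXException,
                   IOException, XPathExpressionException {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // A DOM parse, yet it reports malformed input as a SAXException.
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        // One expression replaces a hand-written walk over child nodes.
        return xpath.evaluate(expression, doc);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical configuration document, for illustration only.
        String xml = "<config><width>80</width><mode name='compact'/></config>";
        System.out.println(extract(xml, "/config/width"));      // prints 80
        System.out.println(extract(xml, "/config/mode/@name")); // prints compact
    }
}
```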
 
Mike Schilling

Roedy said:
Right. There is
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

which led me to believe I was using SAX, but actually it is a DOM
parse.

I would rephrase my question. Is the code obsolete? Is there a better
way to do this that takes advantage of field type information in the
XSD schema? Is there a terser way to extract data?

As I said, use JAXB to generate a class from the schema. Many web service
toolkits can do this too (e.g. Axis), but as far as I know, they need a WSDL
to start from.
 
Daniel Pitts

What bothers me is I explained in the XSD considerable detail about
the structure of the document, but none of this knowledge is
automatically used in extracting data.

It should bother you. Time to break out meta-programming skills. You
can generate either the XSD from code, code from the XSD, or code+XSD
from some other authoritative source!

There may already be tools that do one or another of these for you. I
think I saw someone mention JAXB (never used it, so don't know about it).
 
Roedy Green

As I said, use JAXB to generate a class from the schema. Many web service
toolkits can do this too (e.g. Axis), but as far as I know, they need a WSDL
to start from.

Thanks for the tip. It turned out to be much easier than the tutorial
led me to expect. If you ignore the generated code, the code you write
yourself is much simpler than with DOM. It gets the types right,
including enums.

I am now using it to extract data about posters that I put on my home
page.

I wrote up an overview on JAXB at
http://mindprod.com/jgloss/jaxb.html

The tool that made this feasible was Stylus Studio, which generated a
first cut at an XSD given a sample XML file. I then polished it a bit
and tweaked it to get JAXB to generate better Java.
 
