XML parsing best practices

stas

I have a general XML parsing related question.

Suppose I have a big Java project with some information stored in a
Java source file. Then I decide to move this information to an XML file
for easier editing and to avoid recompiling the project every time
I add a new piece of data. The XML file is rather small and the syntax
is quite simple: a few possible tag names, some of which can carry an
attribute. The Java class that formerly contained and processed the
data now has to parse this XML file.

Here comes the question! Should I separate the code that actually
retrieves the data from the XML file from the code that processes this
data (which lives in the former Java class)? And if I should make
that separation, what should the framework that retrieves the
data from XML look like? Should it be a Java class for every XML tag,
with methods for every subelement and attribute (seems like
overkill to me), or should it be just some kind of "improved" parser
that simply knows the syntax of my XML file? Is it acceptable
to write a framework that will need refactoring each time the
syntax of the XML file changes slightly?

Thanks.
 
Stefan Poehn

stas said:
I have a general XML parsing related question.
[...]
Here comes the question! Should I separate the code that actually
retrieves the data from the XML file from the code that processes this
data (which lives in the former Java class)? And if I should make
that separation, what should the framework that retrieves the
data from XML look like? Should it be a Java class for every XML tag,
with methods for every subelement and attribute (seems like
overkill to me), or should it be just some kind of "improved" parser
that simply knows the syntax of my XML file? Is it acceptable
to write a framework that will need refactoring each time the
syntax of the XML file changes slightly?

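JAXB generates a class for each XML element declared in a schema and is
very easy to use. (JAXP,
http://java.sun.com/xml/jaxp/faq.html
is the underlying SAX/DOM parsing API and does not generate classes.)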
 
Chris Smith

stas said:
Here comes the question! Should I separate the code that actually
retrieves the data from the XML file from the code that processes this
data (which lives in the former Java class)? And if I should make
that separation, what should the framework that retrieves the
data from XML look like?

This depends heavily on the project and the data, and what you mean by
retrieving and processing. Some questions to ask might include:

1. Is the processing obvious, or does it involve core logic of your
application?

For example, simple processing such as converting from "total meetings"
and "meetings attended" to "percent attendance" would probably fit fine
into your code to retrieve data from the XML document; you'd generally
consider this just a translation from one data form to another, and
quite appropriate for a data retrieval module.

On the other hand, if by "processing" you mean something non-trivial
(such as finding an optimal solution to a system of linear constraints
using the simplex method) or something dependent on the logic of your
application (such as applying your local tax code to some accounting
data), then separating it from the parsing step could be very desirable,
in order to ease maintenance of that step of your code and abstract it
from the data source.
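
As a rough illustration of the "trivial translation" case, here is a minimal sketch of a retrieval helper that does the percent-attendance conversion itself. The class name AttendanceReader, the <member> element, and the totalMeetings / meetingsAttended attributes are all made up for the example; only the standard JAXP DOM calls are real.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical retrieval helper: the element and attribute names are
// assumptions for this sketch, not part of any real format.
final class AttendanceReader {

    // Reads one <member totalMeetings=".." meetingsAttended=".."/> element
    // and returns the derived percentage. This is only a change of data
    // form, so it arguably belongs in the retrieval code.
    static double percentAttendance(String xml) {
        Element member;
        try {
            member = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse attendance XML", e);
        }
        int total = Integer.parseInt(member.getAttribute("totalMeetings"));
        int attended = Integer.parseInt(member.getAttribute("meetingsAttended"));
        if (total == 0) {
            return 0.0; // avoid dividing by zero when no meetings were held
        }
        return 100.0 * attended / total;
    }
}
```

Calling AttendanceReader.percentAttendance with a member that attended 15 of 20 meetings yields 75.0. Anything heavier than this (the simplex or tax-code kind of work) would take the already-extracted numbers as plain arguments and live in a separate class.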

2. How complex is the data? If the data is trivial, then it would
probably make very little sense to declare a class per element and store
data into that. If the data is non-trivial, then that *might* make more
sense. In general, my advice here is to find a representation of the
data that makes sense *without* thinking of it as XML. After all, the
XML piece is an implementation detail; a few years from now, you might
decide that a relational database is better, or some other system.
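
One way to picture "a representation that makes sense without thinking of it as XML" is sketched below. Everything here (Setting, SettingSource, InMemorySettingSource, SettingProcessor) is a hypothetical name invented for the example; the point is only that the processing code sees an interface and plain objects, never tags or attributes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical domain type: nothing in it mentions XML.
final class Setting {
    private final String name;
    private final String value;

    Setting(String name, String value) {
        this.name = name;
        this.value = value;
    }

    String name() { return name; }
    String value() { return value; }
}

// Hypothetical source abstraction: an XML file today, perhaps a
// relational database or something else a few years from now.
interface SettingSource {
    List<Setting> load();
}

// Simple in-memory implementation, useful for tests of the processing code.
final class InMemorySettingSource implements SettingSource {
    private final List<Setting> settings = new ArrayList<>();

    InMemorySettingSource add(String name, String value) {
        settings.add(new Setting(name, value));
        return this;
    }

    @Override
    public List<Setting> load() {
        // Hand out a read-only copy so callers cannot alter the source.
        return Collections.unmodifiableList(new ArrayList<>(settings));
    }
}

// The processing side depends only on the interface.
final class SettingProcessor {
    // Returns the value of the first setting with the given name, or null.
    static String lookup(SettingSource source, String name) {
        for (Setting s : source.load()) {
            if (s.name().equals(name)) {
                return s.value();
            }
        }
        return null;
    }
}
```

With this shape, a change in the XML syntax should only touch whichever class implements SettingSource for XML, and the processing class stays as it is.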

Once you've got your data representation (not XML) down, then you work
out how to write a SAX or DOM based parser to fill in that data
structure. There are "XML data binding" products that claim to do this
work for you; you may find mixed results on whether they are helpful at
all.
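
For a small file with a few tag names, a SAX-based loader that fills a plain data structure can be quite short. The following is a sketch under assumptions: the class name EntrySaxLoader and the <entries>/<entry key="..."> layout are invented for illustration, and a Map stands in for whatever data structure the application actually uses.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical loader for documents shaped like:
//   <entries><entry key="a">1</entry><entry key="b">2</entry></entries>
final class EntrySaxLoader extends DefaultHandler {
    private final Map<String, String> entries = new LinkedHashMap<>();
    private final StringBuilder text = new StringBuilder();
    private String currentKey; // non-null only while inside a keyed <entry>

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("entry".equals(qName)) {
            currentKey = atts.getValue("key");
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // SAX may deliver text in several chunks, so accumulate it.
        if (currentKey != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("entry".equals(qName) && currentKey != null) {
            entries.put(currentKey, text.toString().trim());
            currentKey = null;
        }
    }

    // Parses the XML text and returns the entries in document order.
    static Map<String, String> load(String xml) {
        EntrySaxLoader handler = new EntrySaxLoader();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse entries XML", e);
        }
        return handler.entries;
    }
}
```

The handler is the only piece that knows the tag and attribute names, so a small syntax change means editing a few string constants here rather than refactoring the processing code.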

--
www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation
 
