SAX callback method question

steve_marjoribanks · Feb 23, 2006

If I have an XML document with some elements like this:

<line>
<point>0 2</point>
<point>1 4</point>
<point>3 5</point>
etc.

</line>

i.e. a collection of points whose coordinates I want to extract
from the XML file and draw using Java.
I was thinking I can obviously use the startElement method with a
test to see if it's a <point> element and then use the characters
method to extract the coordinates as strings, parse them to integers
and store them in an array or similar. This might sound like a silly
question but will the parser always traverse the XML document
in order, parsing as it goes? i.e., if using the method just described,
will the coordinates of the points be stored in the correct order in
the array?

Also, if my XML document was like:

<line>
<point>5 1</point>
<point>4 8</point>
etc
</line>
<line>
<point>3 4</point>
<point>4 1</point>
etc
</line>
etc

how would I go about making sure that the point coordinates for each
line remain separate from each other and do not get mixed up?
I'm starting to think that DOM might have been a better idea than SAX!!
:-(

Steve

Robert Klemme · Feb 23, 2006

If I have an XML document with some elements like this:

<line>
<point>0 2</point>
<point>1 4</point>
<point>3 5</point>
etc.

</line>

i.e. a collection of points whose coordinates I want to extract
from the XML file and draw using Java.
I was thinking I can obviously use the startElement method with a
test to see if it's a <point> element and then use the characters
method to extract the coordinates as strings, parse them to
integers and store them in an array or similar. This might sound like a
silly question but will the parser always traverse the XML
document in order, parsing as it goes? i.e., if using the method just
described, will the coordinates of the points be stored in the
correct order in the array?

Yes. AFAIK element order is significant in XML, and a SAX parser
reports its events in document order.
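
A minimal sketch of the approach from the question, assuming the simple <line>/<point> layout shown above. The class name PointCollector and its parse helper are made up for illustration. Because the callbacks arrive in document order, the list fills up in the order the points appear. Two details are worth noting: characters may be called more than once for a single element, so the text is buffered until endElement, and the strings are converted with Integer.parseInt rather than cast.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects every <point> in document order.
class PointCollector extends DefaultHandler {
    private final List<int[]> points = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private boolean inPoint = false;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("point".equals(qName)) {
            inPoint = true;
            text.setLength(0); // start a fresh buffer for this point
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one element, so append.
        if (inPoint) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("point".equals(qName)) {
            String[] xy = text.toString().trim().split("\\s+");
            points.add(new int[] { Integer.parseInt(xy[0]), Integer.parseInt(xy[1]) });
            inPoint = false;
        }
    }

    // Convenience helper for this sketch: parse a string, return the points.
    static List<int[]> parse(String xml) {
        PointCollector handler = new PointCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return handler.points;
    }

    public static void main(String[] args) {
        String xml = "<line><point>0 2</point><point>1 4</point><point>3 5</point></line>";
        for (int[] p : parse(xml)) {
            System.out.println(p[0] + "," + p[1]);
        }
    }
}
```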

Also, if my XML document was like:

<line>
<point>5 1</point>
<point>4 8</point>
etc
</line>
<line>
<point>3 4</point>
<point>4 1</point>
etc
</line>
etc

how would I go about making sure that the point coordinates for each
line remain separate from each other and do not get mixed up?
I'm starting to think that DOM might have been a better idea than
SAX!! :-(

I prefer SAX as it's less resource intensive and you can easily skip
things you want to ignore without wasting mem or CPU cycles.

The way I usually do it is this: create a proxy that implements the
callback interface(s) I need. Internally, when it sees an opening element,
it creates a delegate instance based on the name of the element and
pushes it onto a stack, giving it a reference to the current element.
Then the proxy delegates each method call to the topmost delegate on the
stack. Delegates store state as they see fit, and model instances are
updated when the closing tag is detected.
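
A rough sketch of that idea, not the exact code I use. All names here (DelegatingHandler, Delegate, LineDelegate, PointDelegate) are invented for illustration, and it assumes the simple <line>/<point> layout wrapped in some root element. The parser only ever sees the proxy; the proxy pushes one delegate per opening tag, forwards character data to the topmost delegate, and pops it on the closing tag, at which point the delegate updates the model. Each <line> gets its own delegate, so its points cannot mix with those of another line.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical proxy: the only handler the parser knows about.
class DelegatingHandler extends DefaultHandler {

    // One delegate per open element; this default one ignores everything.
    static class Delegate {
        void characters(char[] ch, int start, int length) { }
        void end() { }
    }

    // Collects the points of one <line>; publishes them on the closing tag.
    class LineDelegate extends Delegate {
        final List<int[]> points = new ArrayList<>();
        @Override void end() { lines.add(points); }
    }

    // Buffers the text of one <point> and hands the result to its parent line.
    class PointDelegate extends Delegate {
        private final Delegate parent;
        private final StringBuilder text = new StringBuilder();
        PointDelegate(Delegate parent) { this.parent = parent; }
        @Override void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
        @Override void end() {
            if (parent instanceof LineDelegate) {
                String[] xy = text.toString().trim().split("\\s+");
                ((LineDelegate) parent).points.add(
                        new int[] { Integer.parseInt(xy[0]), Integer.parseInt(xy[1]) });
            }
        }
    }

    final List<List<int[]>> lines = new ArrayList<>(); // the "model"
    private final Deque<Delegate> stack = new ArrayDeque<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        Delegate parent = stack.peek(); // null for the root element
        Delegate delegate;
        switch (qName) {
            case "line":  delegate = new LineDelegate(); break;
            case "point": delegate = new PointDelegate(parent); break;
            default:      delegate = new Delegate(); // element we do not care about
        }
        stack.push(delegate);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        Delegate top = stack.peek();
        if (top != null) {
            top.characters(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        stack.pop().end();
    }

    // Convenience helper for this sketch: parse a string, return the lines.
    static List<List<int[]>> parse(String xml) {
        DelegatingHandler handler = new DelegatingHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return handler.lines;
    }
}
```

Supporting another element then means adding one delegate class and one case in startElement; the proxy itself stays the same.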

Hope that was clear enough.

Btw, your points are really structured elements. I'd rather do something
like:

<line>
<point>
<x>0</x>
<y>2</y>
</point>
</line>

(With better names probably.)

Kind regards

robert

steve_marjoribanks · Feb 23, 2006

Thanks for the reply. With regard to the naming, I just made up an
example; a sample of the real XML I am using is shown below. The
problem is that my schema is an extension of other schemas and as such
contains elements and complex types whose naming is out of my control.

<geotechml:layers>
<geotechml:Layer materialID="1">
<geotechml:layerTop>
<geotechml:Curve>
<gml:LineString>
<gml:pos>0 10</gml:pos>
<gml:pos>30 10</gml:pos>
<gml:pos>60 40</gml:pos>
</gml:LineString>
</geotechml:Curve>
</geotechml:layerTop>
</geotechml:Layer>
<geotechml:Layer materialID="2">
<geotechml:layerTop>
<geotechml:Curve>
<gml:LineString>
<gml:pos>0 30</gml:pos>
<gml:pos>20 40</gml:pos>
<gml:pos>60 40</gml:pos>
</gml:LineString>
</geotechml:Curve>
</geotechml:layerTop>
</geotechml:Layer>
<geotechml:Layer materialID="3">
<geotechml:layerTop>
<geotechml:Curve>
<gml:LineString>
<gml:pos>0 60</gml:pos>
<gml:pos>20 65</gml:pos>
<gml:pos>50 70</gml:pos>
<gml:pos>70 70</gml:pos>
<gml:pos>100 80</gml:pos>
</gml:LineString>
</geotechml:Curve>
</geotechml:layerTop>
</geotechml:Layer>
</geotechml:layers>

In the example above I need to extract the values of the 3 coordinate
points given for each lineString and then draw them in my Java
application.
Sorry, but being a newbie I have no idea what you were talking about when
you gave your solution using a proxy. Any chance you could explain
further please? (sorry!).
Do you think in this instance it would be easier to use DOM? I say this
because although I don't need to extract data from every element (as
shown above) there are a number of elements which I need to get the
data from and they're not all named the same as in the example above
either.

Steve

Robert Klemme · Feb 23, 2006

Thanks for the reply. With regard to the naming, I just made up an
example; a sample of the real XML I am using is shown below. The
problem is that my schema is an extension of other schemas and as such
contains elements and complex types whose naming is out of my control.

Well, bad. :-}

In the example above I need to extract the values of the 3 coordinate
points given for each lineString and then draw them in my Java
application.
Sorry, but being a newbie I have no idea what you were talking about
when you gave your solution using a proxy. Any chance you could
explain further please? (sorry!).

You create an object that does just part of the job (finding the one that
should do the real work) and then delegates the method invocation to that
object.

Do you think in this instance it would be easier to use DOM? I say
this because although I don't need to extract data from every element
(as shown above) there are a number of elements which I need to get
the data from and they're not all named the same as in the example
above either.

Can't really tell as I don't see the whole picture. If you use DOM,
you'll have to do the traversal or work with an XSLT processor. If those
documents can be large I'd favour the other approach but YMMV (especially
if you need a lot of the data from the tree).
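
For comparison, a rough sketch of what the DOM traversal could look like for the layer sample. The class name DomLayerReader and its read method are invented for illustration. It assumes a namespace-aware parser and uses the "*" namespace wildcard because the namespace URIs are not shown in the sample. Note that the whole document is loaded into memory before anything is extracted.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical DOM-based reader: maps each Layer's materialID to its points.
class DomLayerReader {

    static Map<String, List<int[]>> read(String xml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for the *NS lookups below
            doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse XML", e);
        }

        Map<String, List<int[]>> result = new LinkedHashMap<>();
        // "*" matches any namespace; "Layer" is the local name without prefix.
        NodeList layers = doc.getElementsByTagNameNS("*", "Layer");
        for (int i = 0; i < layers.getLength(); i++) {
            Element layer = (Element) layers.item(i);
            List<int[]> points = new ArrayList<>();
            // Searching from the Layer element only finds its own descendants.
            NodeList positions = layer.getElementsByTagNameNS("*", "pos");
            for (int j = 0; j < positions.getLength(); j++) {
                String[] xy = positions.item(j).getTextContent().trim().split("\\s+");
                points.add(new int[] { Integer.parseInt(xy[0]), Integer.parseInt(xy[1]) });
            }
            result.put(layer.getAttribute("materialID"), points);
        }
        return result;
    }
}
```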

Kind regards

robert

steve_marjoribanks · Feb 24, 2006

You create an object that does just part of the job (finding the one that
should do the real work) and then delegates the method invocation to that
object.

Do you mean kind of 'nesting' callback methods? So would I have one
handler that finds a certain node and then delegates the handling of the
children of that node to another handler and so on until I get the data I
need? Sorry for all the questions!

Can't really tell as I don't see the whole picture. If you use DOM,
you'll have to do the traversal or work with an XSLT processor. If those
documents can be large I'd favour the other approach but YMMV (especially
if you need a lot of the data from the tree).

Hmm, it's a tricky one. I originally chose SAX because the documents I'm
working with have the potential to become fairly large, not
massive but not particularly small either. Also, I have no need to
write or change the XML so I thought I'd use SAX. Having thought about
it now though, I do need to extract a fair amount of data from the tree
but as shown above I'll need to traverse down through a fairly large
tree structure to get the information I need because there is a
reasonably 'deep' tree structure and the information needed is at the
bottom of the tree.

Robert Klemme · Feb 24, 2006

Do you mean kind of 'nesting' callback methods? So would I have one
handler that finds a certain node and then delegates the handling of
the children of that node to another handler and so on until I get the
data I need? Sorry for all the questions!

Yes. I think you get the hang of it.

Hmm, it's a tricky one. I originally chose SAX because the documents
I'm working with have the potential to become fairly large, not
massive but not particularly small either. Also, I have no need to
write or change the XML so I thought I'd use SAX. Having thought about
it now though, I do need to extract a fair amount of data from the
tree but as shown above I'll need to traverse down through a fairly
large tree structure to get the information I need because there is a
reasonably 'deep' tree structure and the information needed is at the
bottom of the tree.

But if you just need info from some top level nodes and leaf nodes and
there's a lot of stuff in between that you want to ignore, then that
sounds as if you only extract maybe 20% of the data. In that case I'd go
for SAX.
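
To illustrate the "ignore what is in between" point, here is a small sketch for the layer sample. The names LayerPointHandler and pointsByMaterial are invented, and it assumes the parser is namespace-aware so that localName is filled in. Only the Layer and pos elements trigger any work; layerTop, Curve and LineString pass by without anything being stored, however deep the tree gets.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: keeps only Layer/@materialID and the pos values.
class LayerPointHandler extends DefaultHandler {
    final Map<String, List<int[]>> pointsByMaterial = new LinkedHashMap<>();
    private List<int[]> currentLayer; // non-null only inside a Layer element
    private StringBuilder text;       // non-null only inside a pos element

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("Layer".equals(localName)) {
            currentLayer = new ArrayList<>();
            // The attribute has no prefix, so its namespace URI is empty.
            pointsByMaterial.put(atts.getValue("", "materialID"), currentLayer);
        } else if ("pos".equals(localName) && currentLayer != null) {
            text = new StringBuilder();
        }
        // Every other element is skipped without storing anything.
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (text != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("pos".equals(localName) && text != null) {
            String[] xy = text.toString().trim().split("\\s+");
            currentLayer.add(new int[] { Integer.parseInt(xy[0]), Integer.parseInt(xy[1]) });
            text = null;
        } else if ("Layer".equals(localName)) {
            currentLayer = null;
        }
    }

    // Convenience helper for this sketch: parse a string, return the map.
    static Map<String, List<int[]>> parse(String xml) {
        LayerPointHandler handler = new LayerPointHandler();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // needed so localName is reported
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return handler.pointsByMaterial;
    }
}
```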

Kind regards

robert

steve_marjoribanks · Feb 26, 2006

Having had a think about it, I'm struggling to get my head around how
this would actually be implemented. I've read up on the DefaultHandler
and as far as I can work out you can only assign one per reader. How
can I use multiple handlers on just one input?

Robert Klemme · Feb 27, 2006

Having had a think about it, I'm struggling to get my head around how
this would actually be implemented. I've read up on the DefaultHandler
and as far as I can work out you can only assign one per reader. How
can I use multiple handlers on just one input?

This is a basic pattern called "delegation" (also "strategy pattern" and
"state pattern"). Information about this abounds on the web but you might
be better off by first reading a book about OO design and / or software
design in general.

Kind regards

robert
