Xerces parsing losing data...

trent ohannessian · Sep 8, 2003

Hello -

I'm using the Xerces parser to parse a small XML file. I've noticed
that, seemingly at random, the parser chops off the first N
characters of a given element's value. It's always the same element,
though not on every occurrence, and it always loses the same number of
characters.

For example, I have a <shipping_address2></shipping_address2>
element. Nine times out of ten it parses correctly, but the
tenth time the first half of the value is cut off. This
particular XML file has 132 transactions in it, so around 13 have
this problem.

Has anyone seen this before? Any ideas?

Thanks,
Trent

Steve Jasper · Sep 8, 2003

This is a common gotcha....

I assume you're using the following method:

characters(char[] ch, int start, int length)

to get the characters of an element. Xerces will sometimes make
multiple calls to this method for a particular element, effectively
breaking the element data into chunks. What you are experiencing is
that sometimes the element is broken up into only one chunk (so it
looks like it works). In other cases, it's broken up into multiple
chunks, and I bet you're only getting the first chunk.

You need to keep a buffer (a member variable of your handler, so it
survives between callbacks) and append the data from each call to this
method within an element. Keep appending until the endElement callback
is called. Only then do you know you have the entire element.
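
A minimal sketch of that approach, in case it helps. The class name AddressHandler and the helper getAddresses() are made up for illustration; only the DefaultHandler callbacks come from the SAX API. It assumes the elements you care about hold plain text with no child elements, since the buffer is reset at every startElement.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler name; collects the text of every <shipping_address2>.
class AddressHandler extends DefaultHandler {
    // A field rather than a method-local variable: it must persist across
    // the separate characters() calls the parser makes for one element.
    private final StringBuilder text = new StringBuilder();
    private final List<String> addresses = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // New element: discard anything buffered so far (assumes leaf elements).
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times per element; append each chunk,
        // honoring start and length rather than using the whole array.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Only at this point is the element's text known to be complete.
        if ("shipping_address2".equals(qName)) {
            addresses.add(text.toString());
        }
        text.setLength(0);
    }

    // Hypothetical accessor for the collected values.
    public List<String> getAddresses() {
        return addresses;
    }
}
```

You would pass an instance of this to the parser as usual and read getAddresses() after the parse finishes. If namespace processing is turned on in your parser, you may need to compare against localName instead of qName.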

Good luck.

trent ohannessian · Sep 9, 2003

That worked perfectly. Thank you very much!

Trent

