Escaped XML writing/reading/parsing queries

Piper707

Hi,

My <operator> tag in XML can take any of the following values:

"<", ">", and "<>"

I enter the data this way in the xml:

<operator>&lt;</operator>
<operator>&gt;</operator>

This works fine for the first two.

1) How do I enter "<>" as data in the xml document? I tried
<operator>&lt;&gt;</operator>, but that only reads the first value,
i.e. "<".

2) After parsing the field, I need to match the data to the contents of
a List, and return the matched index:

XML tag:
--------
<OPERATOR>&lt;</OPERATOR>


This is the method
------------------

List operatorList = Arrays.asList("<>", "&lt;&gt;", "&lt;", "<", "&gt;", ">");

public void setOperator(String operator){
    System.out.println("operator sent" + " ----> " + operator);
    System.out.println("operator index" + " ----> " +
        operatorList.indexOf(operator));
}


Method Output:
---------------

operator sent ----> <
operator index ----> -1


It is unable to find either "<" or ">" in my list, and I can't
understand why.

3) However when I try this:

if(operator.equals("<"))
System.out.println("received <");

It prints out "received <" correctly.

Can someone please clarify?

Thank you,
Rohit.
 
Oliver Wong

Can you post an SSCCE?

http://www.physci.org/codes/sscce.jsp

I'm not saying this just to be a pain, but there are so many places
where the error could be occurring, especially with respect to (1). Are
you escaping too many times? Not enough times? Etc.

Try to post a small program and XML file such that the problem in (1) is
clearly demonstrated. E.g. the XML file should be something like
"<op>&lt;&gt;</op>" and the program should output "<" to the console. From
there we can see exactly what you're doing wrong.
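
As a rough idea of what such an SSCCE could look like, here is a minimal sketch that feeds the XML in from a string instead of a file and prints every characters event it receives. The class name OperatorSscce and the helper characterChunks are made up for illustration; how many events the text is split into depends on the parser implementation, so no particular output is promised.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

class OperatorSscce {

    // Hypothetical helper: returns the data of every characters event, in document order.
    static List<String> characterChunks(String xml) throws XMLStreamException {
        XMLEventReader reader =
                XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        List<String> chunks = new ArrayList<>();
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isCharacters()) {
                chunks.add(event.asCharacters().getData());
            }
        }
        reader.close();
        return chunks;
    }

    public static void main(String[] args) throws XMLStreamException {
        // Inline XML, so nobody needs an external file to reproduce the problem.
        for (String chunk : characterChunks("<op>&lt;&gt;</op>")) {
            System.out.println("characters event ----> [" + chunk + "]");
        }
    }
}
```

If the output shows more than one characters event for the single element, that already points at the text being delivered in pieces rather than at an escaping mistake.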

- Oliver
 
Piper707

Hi Oliver,

Strangely, part of the problem seems to have fixed itself. It works
okay for "<" and ">", but I'm still having trouble with the "<>" bit.

Try the following program with this XML:

<TEST>
<OPERATOR>
&lt;&gt;
</OPERATOR>
</TEST>

I'm expecting &lt;&gt; to match up with index 4, but instead it matches
index 2, corresponding to "<".

-----------------------------------------------------------------------
import javax.xml.stream.*;
import javax.xml.stream.events.*;
import java.io.*;
import java.util.Arrays;
import java.util.List;

public class MyExample{

    public static void main(String[] args){

        List<String> operatorList = Arrays.asList(">", "&gt;", "<", "&lt;", "<>",
            "&lt;&gt;");

        try{
            XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
            XMLEventReader xmlEventReader =
                xmlInputFactory.createXMLEventReader(new
                    FileInputStream("d:\\testing\\myexample.xml"));

            while(xmlEventReader.hasNext())
            {
                XMLEvent xmlEvent = xmlEventReader.nextEvent();
                if(xmlEvent.isStartElement())
                {
                    StartElement startElement = (StartElement)xmlEvent;

                    if(startElement.getName().toString().equals("OPERATOR"))
                    {
                        Characters charactersEvent = (Characters)
                            xmlEventReader.nextEvent();
                        String operator = charactersEvent.getData().trim();
                        System.out.println("operator sent ----> " +
                            operator);
                        System.out.println("index found ----> " +
                            operatorList.indexOf(operator));
                    }
                }//if
            }//while
        }//try
        catch(Exception e)
        {
            e.printStackTrace();
        }
    }//main
}//class

-----------------------------------------------------------------------

1) The parsing method is StAX; I wonder if the parser implementation
could be the problem?

2) Probably a silly question: I run programs using Eclipse. Is it
possible for the JVM to maintain some sort of "cache" that could
fudge my results? I ran this as part of my program about 10 times
yesterday and it wouldn't recognize "<" or ">" either. But today, I'm
only having trouble with "<>"!

Thanks
Rohit.
 
Jaakko Kangasharju

> I'm expecting &lt;&gt; to match up with index 4, rather it matches up
> with 2, corresponding to "<"

I've seen this happen with a SAX parser, and I would expect that a StAX
parser is allowed to do the same.

> List operatorList = Arrays.asList(">", "&gt;", "<", "&lt;", "<>",
> "&lt;&gt;");

By the way, I don't think you can match against the entities; first,
these entities should be handled by the parser, and second, there is
probably a separate event for entity references. So I'd say you
should just drop the entities from this list.

> if(startElement.getName().toString().equals("OPERATOR"))
> {
> Characters charactersEvent = (Characters)
> xmlEventReader.nextEvent();

I suspect this is where your problem lies. At least with SAX it is not
guaranteed that you get the whole text in a single characters event.
Since StAX is also streaming, it makes sense that it doesn't provide
the whole text in a single event either. So you'll need to loop,
collecting the text, until you no longer get characters events.

If you check the next event after this, you should find that with <>
there is still a characters event left for the > character.
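
The loop described above could be sketched roughly as follows. This is only an illustration, not a drop-in fix: the class name OperatorReader and the method readOperator are invented here, the XML comes from a string rather than the original file, and it assumes OPERATOR contains nothing but text (no child elements or comments).

```java
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

class OperatorReader {

    // Hypothetical helper: returns the trimmed text of the first OPERATOR element, or null.
    static String readOperator(String xml) throws XMLStreamException {
        XMLEventReader reader =
                XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()
                        && event.asStartElement().getName().getLocalPart().equals("OPERATOR")) {
                    StringBuilder text = new StringBuilder();
                    // peek() inspects the next event without consuming it, so the
                    // loop stops at the first event that is not character data.
                    while (reader.hasNext() && reader.peek().isCharacters()) {
                        text.append(reader.nextEvent().asCharacters().getData());
                    }
                    return text.toString().trim();
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<TEST>\n<OPERATOR>\n&lt;&gt;\n</OPERATOR>\n</TEST>";
        System.out.println("operator sent ----> " + readOperator(xml));
    }
}
```

Because the pieces are appended before trimming, it does not matter how the parser chooses to split the text.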
 
Piper707

> By the way, I don't think you can match against the entities; first,
> these entities should be handled by the parser, and second, there is
> probably a separate event for entity references. So I'd say you
> should just drop the entities from this list.

Yes, that is correct. There is a property,
javax.xml.stream.isReplacingEntityReferences, which is "true" by
default. This makes the parser replace entity references with their
replacement text and report them as characters. If explicitly set to
false, it would report them as separate entity reference events.

> I suspect here is your problem. At least with SAX it is not
> guaranteed that you get the whole text with a single characters event.
> Since StAX is also streaming, it makes sense that it doesn't provide
> the whole text in a single event either. So you'll need to loop,
> collecting the text, until you no longer get characters events.
>
> If you check the next event after this, you should find that with <>
> there is still a characters event left for the > character.

That's it. I just ran a check for all characters() callbacks. It gets
the ">" in the invocation after "<". That's a pity though. I'll have to
factor that into my parsing logic.

I guess I've been lucky so far - even tag content as long and
ridiculous as: "mary had a little lamb and i'm writing a story between
the two tags and guess what i'm still typing...thats looks long enough"
gets invoked in just one event callback!
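
For anyone who would rather not hand-roll the loop, two alternatives from the standard StAX API may be worth trying: the XMLInputFactory.IS_COALESCING property, which asks the parser to merge adjacent character data, and XMLEventReader.getElementText(), which reads the content of a text-only element. The sketch below is untested against the original file; the class and method names are invented for illustration, and it assumes OPERATOR is non-empty and holds only text.

```java
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

class CoalescingExample {

    private static boolean isOperatorStart(XMLEvent event) {
        return event.isStartElement()
                && event.asStartElement().getName().getLocalPart().equals("OPERATOR");
    }

    // Option 1: let the factory coalesce adjacent character data into one event.
    static String withCoalescing(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        XMLEventReader reader = factory.createXMLEventReader(new StringReader(xml));
        while (reader.hasNext()) {
            if (isOperatorStart(reader.nextEvent())) {
                // Assumption: the element is not empty, so the next event is character data.
                return reader.nextEvent().asCharacters().getData().trim();
            }
        }
        return null;
    }

    // Option 2: getElementText() gathers all text up to the matching end tag.
    static String withGetElementText(String xml) throws XMLStreamException {
        XMLEventReader reader =
                XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        while (reader.hasNext()) {
            if (isOperatorStart(reader.nextEvent())) {
                return reader.getElementText().trim();
            }
        }
        return null;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<TEST>\n<OPERATOR>\n&lt;&gt;\n</OPERATOR>\n</TEST>";
        System.out.println("coalescing     ----> " + withCoalescing(xml));
        System.out.println("getElementText ----> " + withGetElementText(xml));
    }
}
```

getElementText() throws an XMLStreamException if it meets a child element, so it only suits elements that are known to be text-only.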

Thanks!
Rohit.
 
Oliver Wong

> That's it. I just ran a check for all characters() callbacks. It gets
> the ">" in the invocation after "<". That's a pity though. I'll have to
> factor that into my parsing logic.
>
> I guess I've been lucky so far - even tag content as long and
> ridiculous as: "mary had a little lamb and i'm writing a story between
> the two tags and guess what i'm still typing...thats looks long enough"
> gets invoked in just one event callback!
>
> Thanks!

This is obviously implementation dependent, but when I use SAX locally,
it always seems to split the text at newlines, regardless of how long
the content between two tags is. Perhaps my underlying implementation is
using readLine() rather than, say, read256BytesAtATime().

- Oliver
 
