Can't Process InputSource More Than Once?

Rick Brandt

I am attempting to troubleshoot an XML parsing error that occurs when XML
is submitted to my servlet. The error does not occur when I run the
servlet locally, so to debug the problem I tried to log the contents of
the InputSource to a file just before attempting the parse. To that end I
added the lines of code below between the asterisk rows.

It appears that the getByteStream() that I run against my InputSource
(xmlIn) renders it useless for passing to the parser. Can an InputSource
only be read through once? Is there no way to reset it before I pass it to
the SAX parser?


public String ProcessHTTPPost(InputSource xmlIn) {
    try {
        XMLReader parser =
            XMLReaderFactory.createXMLReader("org.apache.xerces.parsers.SAXParser");
        MBOSaxParser MBOSaxParserInstance = new MBOSaxParser();
        parser.setContentHandler(MBOSaxParserInstance);

        //*********************************************
        BufferedReader br = new BufferedReader(
            new InputStreamReader(xmlIn.getByteStream()));
        String str = null;
        while ((str = br.readLine()) != null)
        {
            System.out.println(str);
            log = new RegMgrLogger(str, false);
        }
        //*********************************************

        parser.parse(xmlIn);
        response = MBOSaxParserInstance.mboInstance.returnVal;
    }
    catch(Exception ex) {
        ex.printStackTrace();
        response = "ERROR~" + ex.toString();
    }
    finally {
        return response;
    }
}
 
John C. Bollinger

Followups directed to comp.lang.java.programmer.

Rick said:
I am attempting to troubleshoot an XML parsing error that occurs when XML
is submitted to my servlet. The error does not occur when I run the
servlet locally so to debug the problem I was attempting to log the
contents of the InputSource to a file just before attempting the parse. To
that end I added the lines of code below between the asterisk rows.

Just curious, but why didn't you just get the source's character stream
instead of getting a byte stream and converting it to a character stream
(with use of the system's default character encoding)? The former is
more likely to get the encoding right, and it's easier too.

It appears that the getByteStream() that I run against my InputSource
(xmlIn) renders it useless for passing to the parser. Can an InputSource
only be read through once? Is there no way to reset it before I pass it to
the SAX parser?

You would probably need to write a custom InputSource to enable that.
It might be easier and more flexible to wrap the existing InputSource in
a custom InputSource implementation that taps into the stream and logs
it _while_ the parser is reading it. I don't have time to write an
example at the moment, I'm afraid.
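
For later readers, here is a minimal sketch of the "tap" idea described above. It is only one possible shape, not code from this thread: TeeInputStream and TapDemo are hypothetical names, and the sketch assumes the original InputSource really carries a byte stream. Every byte the parser pulls through the wrapper is also written to a second stream, so the input is logged while it is parsed and is still read only once.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import org.xml.sax.InputSource;

/** Hypothetical helper: copies every byte read through it into a side stream. */
class TeeInputStream extends FilterInputStream {
    private final OutputStream copy;

    TeeInputStream(InputStream in, OutputStream copy) {
        super(in);
        this.copy = copy;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            copy.write(b);
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            copy.write(buf, off, n);
        }
        return n;
    }

    // Rewinding would make the copy differ from what was parsed, so refuse it.
    @Override
    public boolean markSupported() {
        return false;
    }

    @Override
    public void mark(int readlimit) {
        // intentionally a no-op
    }

    @Override
    public void reset() throws IOException {
        throw new IOException("mark/reset not supported");
    }
}

class TapDemo {
    /** Returns a new InputSource whose byte stream also feeds 'copy'. */
    static InputSource tap(InputSource original, OutputStream copy) {
        InputStream raw = original.getByteStream();
        if (raw == null) {
            // Assumption of this sketch: only byte-stream sources are handled.
            throw new IllegalArgumentException("InputSource has no byte stream");
        }
        InputSource tapped = new InputSource(new TeeInputStream(raw, copy));
        tapped.setSystemId(original.getSystemId());
        tapped.setPublicId(original.getPublicId());
        tapped.setEncoding(original.getEncoding());
        return tapped;
    }
}
```

With something like this in place, the call in the original method would become parser.parse(TapDemo.tap(xmlIn, someOutputStream)), where someOutputStream is whatever the log should go to, for example a FileOutputStream.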


John Bollinger
 
Andy Fish

It appears that the getByteStream() that I run against my InputSource
(xmlIn) renders it useless for passing to the parser. Can an InputSource
only be read through once? Is there no way to reset it before I pass it to
the SAX parser?

That seems sensible to me. An input source might be a stream and a stream
might be coming in off the network so in general you wouldn't expect to be
able to rewind it.

I guess it's possible to provide extra methods on certain types of input
sources to allow more advanced source-specific features but I don't know of
any - you'd need to look in the javadoc.

Andy
 
Rick Brandt

John C. Bollinger said:
Followups directed to comp.lang.java.programmer.



Just curious, but why didn't you just get the source's character stream
instead of getting a byte stream and converting it to a character stream
(with use of the system's default character encoding)? The former is
more likely to get the encoding right, and it's easier too.

Because I don't know what I'm doing : )

I'm predominantly a database guy who develops in Access/VBA and learned
some Java because I needed to be able to send data to a database over the
internet from a client application. Servlets and XML seemed to best fill
the bill so I hack and probe and look for examples until I get something
that works. I am not very familiar with the API.

Based on your response I tried...

BufferedReader br = new BufferedReader(xmlIn.getCharacterStream());

...but it throws a NullPointerException.
 
Rick Brandt

Andy Fish said:

That seems sensible to me. An input source might be a stream and a stream
might be coming in off the network so in general you wouldn't expect to be
able to rewind it.

I guess it's possible to provide extra methods on certain types of input
sources to allow more advanced source-specific features but I don't know of
any - you'd need to look in the javadoc.

Once I have the Character Stream in a Buffered Reader I can go through that
as many times as I want using mark() and reset(). Is there a way to
generate another InputSource instance from the Buffered Reader?
 
John C. Bollinger

Rick said:
Once I have the Character Stream in a Buffered Reader I can go through that
as many times as I want using mark() and reset(). Is there a way to
generate another InputSource instance from the Buffered Reader?

You could try

InputSource input = new InputSource(myReader);

I.e. InputSource has a constructor that accepts a single Reader argument.

I submit, however, that if you can't get a valid Reader directly from
the InputSource (as you indicated elsewhere) then you are better off
doing the mark()ing, reset()ing and new InputSource creating on a
BufferedInputStream (byte stream) so as to not inject potentially false
information about the character encoding. You can put one between the
InputSource's InputStream and your InputStreamReader. Indeed, if it's
possible then I would recommend that you avoid character streams
altogether unless you can reliably determine the correct character
encoding to use. And if you can reliably determine the encoding then
you should specify it to your InputStreamReader's constructor.
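
A rough sketch of the byte-stream variant described above, under some assumptions of its own: ReplayableSource, Peeked and MAX_LOG_BYTES are hypothetical names, and the cap of 64 KB is arbitrary. The idea is to mark a BufferedInputStream, read at most that many bytes for the log, reset, and hand the same stream to a new InputSource. Because no more than the marked limit is ever read before reset(), the mark stays valid even for larger documents; the log then simply holds a prefix.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import org.xml.sax.InputSource;

class ReplayableSource {
    /** Hypothetical cap on how many bytes are captured for logging. */
    static final int MAX_LOG_BYTES = 64 * 1024;

    /** The captured bytes plus an InputSource rewound to the start. */
    static final class Peeked {
        final byte[] prefix;
        final InputSource source;

        Peeked(byte[] prefix, InputSource source) {
            this.prefix = prefix;
            this.source = source;
        }
    }

    static Peeked peek(InputSource original) throws IOException {
        InputStream raw = original.getByteStream();
        if (raw == null) {
            // Assumption of this sketch: only byte-stream sources are handled.
            throw new IllegalArgumentException("InputSource has no byte stream");
        }
        BufferedInputStream in = new BufferedInputStream(raw);
        in.mark(MAX_LOG_BYTES);

        // Read up to the marked limit, never beyond it.
        byte[] buf = new byte[MAX_LOG_BYTES];
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n == -1) {
                break;
            }
            total += n;
        }
        in.reset(); // back to the marked position

        // The parser gets raw bytes and does its own encoding detection.
        InputSource fresh = new InputSource(in);
        fresh.setSystemId(original.getSystemId());
        fresh.setPublicId(original.getPublicId());
        fresh.setEncoding(original.getEncoding());
        return new Peeked(Arrays.copyOf(buf, total), fresh);
    }
}
```

The prefix can be written to the log as raw bytes, which avoids guessing a character encoding at all; the returned source is what would be passed to parser.parse().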


John Bollinger
 
John C. Bollinger

Rick said:
Based on your response I tried...

BufferedReader br = new BufferedReader(xmlIn.getCharacterStream());

...but it throws a NullPointerException.

Rats. On closer inspection I see that InputSource is not as smart as I
thought it was. However, I also see that it's even dumber than you
think it is. To wit: it is also possible for the InputSource's
getByteStream() method to return null. It all depends on how the
InputSource was created and configured. If that's under your control
then you're fine, but if it isn't then you may have to support either a
byte stream or a character stream, depending on the InputSource.
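
To make that concrete, here is a small sketch that reports which input an InputSource actually carries. The class and enum names are hypothetical. The order of the checks follows the lookup order documented in the InputSource Javadoc: a parser uses the character stream if one is set, otherwise the byte stream, otherwise it opens the system identifier. An InputSource built from a servlet's request input stream would presumably report BYTE_STREAM, which would explain the null from getCharacterStream().

```java
import org.xml.sax.InputSource;

class InputSourceKinds {
    /** Hypothetical labels for the inputs an InputSource can hold. */
    enum Kind { CHARACTER_STREAM, BYTE_STREAM, SYSTEM_ID, EMPTY }

    /** Mirrors the documented parser lookup order: Reader, InputStream, system ID. */
    static Kind kindOf(InputSource src) {
        if (src.getCharacterStream() != null) {
            return Kind.CHARACTER_STREAM;
        }
        if (src.getByteStream() != null) {
            return Kind.BYTE_STREAM;
        }
        if (src.getSystemId() != null) {
            return Kind.SYSTEM_ID;
        }
        return Kind.EMPTY;
    }
}
```

Logging code can switch on the result and read from whichever stream is present, instead of assuming one getter will return something usable.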


John Bollinger
 
Rick Brandt

John C. Bollinger said:
You could try

InputSource input = new InputSource(myReader);

I.e. InputSource has a constructor that accepts a single Reader argument.

Thanks; that seems to have resolved the issue. I am now creating a
BufferedReader from the InputSource, using that to log the source,
resetting it and then using it to create a new InputSource that I pass to
the parser.
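
The final code was not posted, so the following is only a guess at the general shape of that approach, with hypothetical names (LogThenParse, READ_AHEAD_LIMIT) and an arbitrary limit. It assumes a Reader for the input is already available. One detail worth showing: BufferedReader.mark() takes a read-ahead limit, and reading past it can invalidate the mark, so the sketch never reads more than that many characters before calling reset().

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import org.xml.sax.InputSource;

class LogThenParse {
    /** Hypothetical cap, in characters, on how much is read before rewinding. */
    static final int READ_AHEAD_LIMIT = 64 * 1024;

    /**
     * Appends up to READ_AHEAD_LIMIT characters of the input to 'log',
     * rewinds, and returns a new InputSource over the same reader.
     */
    static InputSource logAndRewind(Reader reader, StringBuilder log) throws IOException {
        BufferedReader br = new BufferedReader(reader);
        br.mark(READ_AHEAD_LIMIT);

        char[] buf = new char[4096];
        int remaining = READ_AHEAD_LIMIT;
        while (remaining > 0) {
            int n = br.read(buf, 0, Math.min(buf.length, remaining));
            if (n == -1) {
                break; // end of input reached before the limit
            }
            log.append(buf, 0, n);
            remaining -= n;
        }

        br.reset(); // back to the marked position
        return new InputSource(br);
    }
}
```

Note the caveat raised earlier in the thread: a parser given a character stream does not do its own encoding detection, so this is only as correct as the decoding used to build the Reader.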
 
