Encoding problem with SAX parser

Martin Schlatter

I'm parsing an XML document with a SAX parser.
I initialise it in the following way:

javax.xml.parsers.DocumentBuilderFactory docBuilderFactory =
javax.xml.parsers.DocumentBuilderFactory.newInstance();
docBuilder = docBuilderFactory.newDocumentBuilder();
doc = docBuilder.parse(new File(fname));

But while parsing, I get an exception because there are characters
which are not valid UTF-8 characters. I cannot change the input file.
Is there any way to skip over the invalid characters? Could I use
docBuilder.parse(InputStream) and skip the invalid characters that way?

Jens Martin Schlatter
 
Mike Schilling

Martin Schlatter said:
I'm parsing an XML document with a SAX parser.
I initialise it in the following way:

javax.xml.parsers.DocumentBuilderFactory docBuilderFactory =
javax.xml.parsers.DocumentBuilderFactory.newInstance();
docBuilder = docBuilderFactory.newDocumentBuilder();
doc = docBuilder.parse(new File(fname));

But while parsing, I get an exception because there are characters
which are not valid UTF-8 characters. I cannot change the input file.

Is the file in UTF-8? If not, is it in any valid encoding? If so, try
replacing your last line with

org.xml.sax.InputSource src = new org.xml.sax.InputSource(
        new java.io.FileInputStream(fname));
src.setEncoding(YourEncodingNameGoesHere); // a String such as "ISO-8859-1"
doc = docBuilder.parse(src);

If not, you'll have to create a FilterInputStream that removes the bad
characters and replace your last line with:

doc = docBuilder.parse(new YourFilterStream(new FileInputStream(fname)));
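
As an alternative to a hand-written FilterInputStream, one possible approach is to let the JDK's UTF-8 decoder drop the malformed bytes and hand the parser a Reader instead of a byte stream. This is only a sketch under assumptions: the class name LenientXmlLoader and the method parseLenient are hypothetical, and it assumes the file really is UTF-8 apart from a few stray bytes.

```java
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
class LenientXmlLoader {

    // Parses XML from a byte stream, silently skipping byte sequences
    // that are not valid UTF-8 (assumption: the rest of the file is UTF-8).
    static Document parseLenient(InputStream in) throws Exception {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.IGNORE)
                .onUnmappableCharacter(CodingErrorAction.IGNORE);

        // The parser now receives already-decoded characters, so it does
        // not perform its own byte-level UTF-8 validation.
        Reader reader = new InputStreamReader(in, decoder);

        DocumentBuilder docBuilder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return docBuilder.parse(new InputSource(reader));
    }
}
```

With this in place, the last line of the original snippet would become doc = LenientXmlLoader.parseLenient(new FileInputStream(fname)). Using CodingErrorAction.REPLACE instead of IGNORE would substitute U+FFFD for each bad sequence rather than dropping it. Note that this only handles malformed byte sequences; characters that decode correctly but are not allowed in XML (most control characters, for example) would probably still be rejected by the parser.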
 
Martin Schlatter

Is the file in UTF-8?

Yes, it's UTF-8, but some characters are invalid.
If not, you'll have to create a FilterInputStream that removes the bad
characters and replace your last line with:

doc = docBuilder.parse(new YourFilterStream(new FileInputStream(fname)));

Ok, I see. Thanks! I'll try that!

Jens Martin Schlatter
 
