Modifying encoding type of a Document

roy.sebastien

Hello everyone,

I have created a document in memory using the following code

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.newDocument();
// Note: getDocumentElement() returns null on a freshly created document,
// so the root element is created and appended explicitly.
document.setXmlVersion("1.0");
Node rootNode = document.appendChild(document.createElement("ExportRH"));
(I add all the fields here)

Then, once the document is created, I pass it to the following method
which transforms it using a stylesheet:

private void transform(InputStream xsltStream, Object xmlStream,
        OutputStream outputFile) throws TransformerConfigurationException,
        TransformerException, ParserConfigurationException, SAXException,
        IOException {
    DocumentBuilderFactory builderfactory =
            DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = builderfactory.newDocumentBuilder();
    Document document = null;

    if (xmlStream instanceof String)
        document = builder.parse((String) xmlStream);
    else
        document = (Document) xmlStream;
    //builder.parse((InputStream)xmlStream);

    Source xsltSource = new StreamSource(xsltStream);
    Source source = new DOMSource(document);
    Result result = new StreamResult(outputFile);
    TransformerFactory factory = TransformerFactory.newInstance();
    Templates transformation = factory.newTemplates(xsltSource);
    Transformer transformer = transformation.newTransformer();

    transformer.transform(source, result);
}

It works fine until the document contains non-US (non-ASCII)
characters. Then I receive a TransformerException. I could serialize
the document and transform it on disk, but I was looking for a
solution that would do the whole thing in memory.

Can someone help me? I know this is trivial, but I'm not familiar with
XML.

Thanks

Sebastien
 
Steve W. Jackson

I've done relatively little with transformers, but I have one class that
saves preferences using a transformer. After obtaining a new
transformer, I call transformer.setOutputProperty("encoding", "UTF-8")
on it. (The constant OutputKeys.ENCODING has the same value as the
"encoding" string.)
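
A minimal sketch of that call, kept entirely in memory. The class name EncodingSketch, the sample text, and the use of an identity transformer (no stylesheet) are assumptions for illustration only; a Transformer obtained from Templates accepts the same output property.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical class name; not part of the original code.
public class EncodingSketch {

    // Serializes a small in-memory DOM to UTF-8 bytes without touching disk.
    public static byte[] serialize() throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element root = document.createElement("ExportRH");
        root.setTextContent("S\u00e9bastien"); // sample non-ASCII text
        document.appendChild(root);

        // Identity transformer: copies the source to the result unchanged.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        // Ask the serializer to emit UTF-8 and declare it in the XML prolog.
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        transformer.transform(new DOMSource(document), new StreamResult(out));
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(new String(serialize(), StandardCharsets.UTF_8));
    }
}
```

Passing an OutputStream to StreamResult, as here, lets the transformer do the character encoding itself, so the bytes and the declared encoding agree.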

My Source is a StreamSource which I create by serializing a document
element to string, then calling getBytes("UTF-8") on that and passing
the result to the constructor of a ByteArrayInputStream, which then goes
to the constructor of the StreamSource. I'm not positive we needed
this, and you may not either.
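
Roughly, the Source side looks like the sketch below. The class and method names are made up for illustration, the XML string stands in for whatever the serialization step produces, and the identity transformer is only there to show that the text survives the round trip.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical class name; not part of the original code.
public class StreamSourceSketch {

    // Feeds already-serialized XML to a transformer as UTF-8 bytes.
    public static String roundTrip(String xml) throws Exception {
        // Encode explicitly so the bytes do not depend on the platform default charset.
        byte[] bytes = xml.getBytes("UTF-8");
        StreamSource source = new StreamSource(new ByteArrayInputStream(bytes));

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        transformer.transform(source, new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<name>S\u00e9bastien</name>";
        System.out.println(roundTrip(xml));
    }
}
```

If the Document is already in memory, a DOMSource avoids this serialize-and-reparse step altogether.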

My Result is a StreamResult. Its constructor gets an OutputStreamWriter
created around a FileOutputStream wrapping a File. I selected the
OutputStreamWriter because it offered a constructor that accepts an
OutputStream and a String naming the charsetName so that I could use
"UTF-8".
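
The Result side, sketched under the same caveats: the class name, the temporary file, and the sample document are assumptions, not the actual preferences code.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical class name; not part of the original code.
public class Utf8FileResultSketch {

    // Writes a small document to a temp file as UTF-8 and reads it back.
    public static String writeAndReadBack() throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element root = document.createElement("prefs");
        root.setTextContent("S\u00e9bastien"); // sample non-ASCII text
        document.appendChild(root);

        File file = File.createTempFile("prefs", ".xml");
        file.deleteOnExit();

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        // Keep the declared encoding in step with the writer's charset.
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");

        // The writer, not the platform default, decides how characters become bytes.
        try (Writer writer = new OutputStreamWriter(new FileOutputStream(file), "UTF-8")) {
            transformer.transform(new DOMSource(document), new StreamResult(writer));
        }
        return new String(Files.readAllBytes(file.toPath()), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(writeAndReadBack());
    }
}
```

When a Writer is used, the writer's charset and the "encoding" output property have to be set to the same value by hand; nothing enforces that they match.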

It's been my experience that reading XML back in from a file when
certain characters are present doesn't always work well unless both the
XML document and the stream to which it was written were encoded
correctly, which is what led to doing this.

HTH,
= Steve =
 
