ERROR when using xalan

Bekkali Hicham · Jun 24, 2003

hi,
I have already used Xalan several times with success, but I now get an error
message that I don't understand. Thanks for your help.

(Location of error unknown) XSLT Error
(javax.xml.transform.TransformerConfigurationException):
javax.xml.transform.TransformerException:
java.io.UTFDataFormatException: Invalid byte 2 of 3-byte UTF-8 sequence.

thanks

Derek Harmon · Jun 25, 2003

Bekkali Hicham said:
I have already used Xalan several times with success, but I now get an error
message that I don't understand:
javax.xml.transform.TransformerException:
java.io.UTFDataFormatException: Invalid byte 2 of 3-byte UTF-8 sequence.

This looks like a Java I/O exception that Xalan is just passing along. UTF-8
is a variable-width encoding: each Unicode character takes one to four bytes.
A first byte in the range 0x00-0x7F is a complete one-byte (ASCII) character.
A first byte in 0xC2-0xDF, 0xE0-0xEF, or 0xF0-0xF4 is the lead byte of a
two-, three-, or four-byte sequence respectively, and every following byte of
that sequence must be a continuation byte in the range 0x80-0xBF. This allows
many commonly occurring characters to be encoded with one byte while less
frequent characters are encoded with multiple bytes.

The error message, "Invalid byte 2 of 3-byte UTF-8 sequence" means that
a Java I/O streaming object expected, from the first byte, that this was a 3
byte sequence and when it examined the second byte, it determined that the
second byte was an illegal value (for instance, a value contradicting the first
byte).
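
To make the check concrete, here is a minimal Java sketch of the byte rules as
defined for UTF-8 in RFC 3629. The class and method names are made up for
illustration; this is not the code the parser actually runs.

```java
public class Utf8Bytes {
    /**
     * Length of the sequence announced by a first byte:
     * 1-4 for a lead byte, 0 for a continuation byte (0x80-0xBF),
     * -1 for a byte that can never appear in UTF-8.
     */
    public static int sequenceLength(int b) {
        b &= 0xFF;
        if (b <= 0x7F) return 1;   // plain ASCII
        if (b <= 0xBF) return 0;   // continuation byte, not a lead byte
        if (b <= 0xC1) return -1;  // would encode an overlong form
        if (b <= 0xDF) return 2;
        if (b <= 0xEF) return 3;
        if (b <= 0xF4) return 4;
        return -1;                 // beyond the Unicode range
    }

    /** Whether 'second' may follow 'lead' as byte 2 of a 3-byte sequence. */
    public static boolean validSecondOfThree(int lead, int second) {
        lead &= 0xFF;
        second &= 0xFF;
        int lo = (lead == 0xE0) ? 0xA0 : 0x80; // 0xE0 excludes overlong forms
        int hi = (lead == 0xED) ? 0x9F : 0xBF; // 0xED excludes surrogates
        return second >= lo && second <= hi;
    }

    public static void main(String[] args) {
        // 0xE9 is "é" in ISO-8859-1; read as UTF-8 it announces 3 bytes.
        System.out.println(sequenceLength(0xE9));          // prints 3
        // An ASCII letter right after it is not a legal second byte.
        System.out.println(validSecondOfThree(0xE9, 't')); // prints false
    }
}
```

An accented Latin-1 letter followed by an ordinary ASCII character is one
plausible way to end up with "Invalid byte 2 of 3-byte UTF-8 sequence".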

What does this mean for you, the programmer?

Two possibilities:

1. There is no encoding attribute in the document's XML declaration, and
Xalan is assuming it is UTF-8 when the document is not UTF-8.
2. The document may have been UTF-8 and was corrupted in transmission
(was it sent over the network?)
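
To tell whether possibility 1 applies, you can test the raw bytes outside of
Xalan. The sketch below uses the JDK's strict UTF-8 decoder; the class name,
method name, and sample text are invented for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class StrictUtf8 {
    /** Returns true only if the whole array is well-formed UTF-8. */
    public static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false; // some byte sequence is not legal UTF-8
        }
    }

    public static void main(String[] args) {
        String xml = "<titre>caf\u00e9 cr\u00e8me</titre>";
        // Same text, saved in two different encodings.
        byte[] latin1 = xml.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = xml.getBytes(StandardCharsets.UTF_8);
        System.out.println(isValidUtf8(latin1)); // prints false
        System.out.println(isValidUtf8(utf8));   // prints true
    }
}
```

If the file fails this check, it is either not UTF-8 at all or damaged
UTF-8; the check alone cannot say which.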

If there is no encoding attribute in the document's XML declaration, put one
there. For example, if there are Traditional Chinese (Taiwanese) characters
in the XML document, you might try:

<?xml version="1.0" encoding="Big5" ?>

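If they are Simplified Chinese, then try GB2312; if it's Japanese, try
Shift_JIS or EUC-JP; for accented Western European text (French, for
instance), ISO-8859-1 is a common candidate; and so on. When Xalan reads one
of these encodings, I think Xerces will transcode them to Unicode, or at least
use a non-UTF-8 streaming source.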
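
If you cannot edit the document, another option is to decode the bytes
yourself and hand the parser a Reader instead of a byte stream. This is only
a sketch using the standard JAXP classes: the class name, the helper method,
and the choice of ISO-8859-1 are assumptions for the example, and an identity
transform stands in for your real stylesheet.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class ExplicitEncodingTransform {
    /** Runs an identity transform over 'in', decoding its bytes with 'charset'. */
    public static String copyWithCharset(InputStream in, Charset charset) {
        try {
            // The Reader fixes the encoding before the parser sees the data.
            Reader reader = new InputStreamReader(in, charset);
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new StreamSource(reader), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transform failed", e);
        }
    }

    public static void main(String[] args) {
        // A Latin-1 document with no XML declaration (hypothetical sample).
        byte[] doc = "<titre>caf\u00e9</titre>".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(
                copyWithCharset(new ByteArrayInputStream(doc), StandardCharsets.ISO_8859_1));
    }
}
```

The catch is that you have to know the real encoding yourself; with a Reader
the parser no longer gets to work it out from the XML declaration.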

If one or more bytes of the document were corrupted, you may be able to
simply open the document in an editor and look for any glyphs that look
out of place at the point in the document where the error occurred.
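
Since the error message does not say where the bad byte is, a small helper
can report its offset so you know where to look. This is a sketch built on
the JDK's CharsetDecoder; the class and method names are made up.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

public class Utf8Locator {
    /**
     * Returns the byte offset where the first malformed UTF-8 sequence
     * starts, or -1 if the whole array decodes cleanly.
     */
    public static int firstMalformedOffset(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        ByteBuffer in = ByteBuffer.wrap(bytes);
        // UTF-8 never produces more chars than it consumes bytes.
        CharBuffer out = CharBuffer.allocate(Math.max(1, bytes.length));
        CoderResult result = decoder.decode(in, out, true);
        // On error, the input position is left at the start of the bad bytes.
        return result.isError() ? in.position() : -1;
    }

    public static void main(String[] args) {
        // "<a>", then a stray 0xE9 followed by ASCII (hypothetical sample).
        byte[] doc = {'<', 'a', '>', (byte) 0xE9, 't', '<', '/', 'a', '>'};
        System.out.println("first bad byte at offset " + firstMalformedOffset(doc));
    }
}
```

Open the file in a hex viewer at that offset to see which bytes are involved.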

HTH,

Derek Harmon
