ERROR when using xalan

Discussion in 'XML' started by Bekkali Hicham, Jun 24, 2003.

  1. Hi,
    I have already used Xalan several times with success, but now I get an error
    message that I don't understand. Thanks for your help.

    (Location of error unknown) XSLT Error
    (javax.xml.transform.TransformerConfigurationException):
    javax.xml.transform.TransformerException:
    java.io.UTFDataFormatException: Invalid byte 2 of 3-byte UTF-8 sequence.

    thanks
     
    Bekkali Hicham, Jun 24, 2003

  2. Derek Harmon Guest

    "Bekkali Hicham" <> wrote in message news:bdafo9$or8$...
    > I have already used Xalan several times with success, but now I get an error
    > message that I don't understand

    : :
    > javax.xml.transform.TransformerException:
    > java.io.UTFDataFormatException: Invalid byte 2 of 3-byte UTF-8 sequence.


    This looks like a Java I/O exception that Xalan is just passing along. UTF-8
    is a variable-width encoding: each Unicode character takes from one to four
    bytes. The first byte of a sequence tells the decoder how long the sequence
    is: 0x00-0x7F is a complete one-byte (ASCII) character, 0xC2-0xDF starts a
    2-byte sequence, 0xE0-0xEF a 3-byte sequence, and 0xF0-0xF4 a 4-byte
    sequence. Every following byte of a sequence must fall in 0x80-0xBF. This
    allows many commonly occurring characters to be encoded with one byte while
    less frequent characters are encoded with multiple bytes.

    The error message, "Invalid byte 2 of 3-byte UTF-8 sequence", means that
    a Java I/O streaming object decided from the first byte that this was a
    3-byte sequence, and when it examined the second byte it found an illegal
    value there (for instance, a plain ASCII character where a continuation
    byte in the range 0x80-0xBF was required).
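To make that check concrete, here is a minimal sketch of the kind of test a UTF-8 decoder performs. It is not Xalan's or the JDK's actual code: the class and method names are made up for illustration, and it deliberately skips the finer rules (overlong forms, surrogates, code points above U+10FFFF). The sample text assumes a Latin-1 document containing an accented letter, which is one common way to run into this error.

```java
import java.nio.charset.StandardCharsets;

public class Utf8SequenceCheck {

    // Length of the sequence announced by its first byte, or 0 if the byte
    // can never start a sequence (0x80-0xBF, 0xC0, 0xC1, 0xF5-0xFF).
    public static int expectedLength(int b) {
        if (b <= 0x7F) return 1;               // ASCII
        if (b >= 0xC2 && b <= 0xDF) return 2;
        if (b >= 0xE0 && b <= 0xEF) return 3;
        if (b >= 0xF0 && b <= 0xF4) return 4;
        return 0;
    }

    // Returns null if every sequence is well-formed under this simplified
    // check, otherwise a message naming the first offending byte.
    public static String check(byte[] data) {
        int i = 0;
        while (i < data.length) {
            int len = expectedLength(data[i] & 0xFF);
            if (len == 0) {
                return "Invalid byte 1 of 1-byte UTF-8 sequence at offset " + i;
            }
            for (int k = 1; k < len; k++) {
                // A continuation byte has the bit pattern 10xxxxxx (0x80-0xBF).
                // A sequence cut off by the end of the data is reported the same way.
                if (i + k >= data.length || (data[i + k] & 0xC0) != 0x80) {
                    return "Invalid byte " + (k + 1) + " of " + len
                            + "-byte UTF-8 sequence at offset " + (i + k);
                }
            }
            i += len;
        }
        return null;
    }

    public static void main(String[] args) {
        String text = "caf\u00e9 au lait"; // the e-acute is byte 0xE9 in ISO-8859-1
        System.out.println(check(text.getBytes(StandardCharsets.ISO_8859_1)));
        System.out.println(check(text.getBytes(StandardCharsets.UTF_8)));
    }
}
```

In ISO-8859-1 the accented letter is the single byte 0xE9, which a UTF-8 decoder reads as the start of a 3-byte sequence; the space that follows is not a continuation byte, so byte 2 is rejected.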

    What does this mean for you, the programmer?

    Two possibilities:

    1. There is no encoding attribute in the document's XML declaration, and
    Xalan is assuming it is UTF-8 when the document is not UTF-8.
    2. The document may have been UTF-8 and was corrupted in transmission
    (was it sent over the network?)

    If there is no encoding attribute in the document's XML declaration, put one
    there. For example, if there are Traditional Chinese (Taiwanese) characters
    in the XML document, you might try:

    <?xml version="1.0" encoding="Big5" ?>

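    If they are Simplified Chinese, try GB2312; if it's Japanese, try Shift_JIS
    or EUC-JP; for accented Western European text (French, for example), try
    ISO-8859-1; and so on. The declared name has to match the encoding the file
    was actually saved in. When Xalan reads one of these encodings, I think
    Xerces will transcode them to Unicode, or at least use a non-UTF-8
    streaming source.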
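A small sketch of the difference the declaration makes, using the JAXP parser that ships with the JDK rather than Xalan itself. The class name, the helper method, and the sample document are invented for the example, and ISO-8859-1 is used only because every Java runtime supports it; substitute whichever encoding your file really uses.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;

public class EncodingDeclarationDemo {

    // Parses the given bytes as XML and returns the text of the root element.
    public static String parseRootText(byte[] xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml))
                .getDocumentElement()
                .getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String body = "<name>Caf\u00e9 au lait</name>";

        // Same ISO-8859-1 bytes for the content; only the declaration differs.
        byte[] undeclared = ("<?xml version=\"1.0\"?>" + body)
                .getBytes(StandardCharsets.ISO_8859_1);
        byte[] declared = ("<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>" + body)
                .getBytes(StandardCharsets.ISO_8859_1);

        try {
            // No encoding attribute: the parser falls back to UTF-8 and
            // should reject the 0xE9 byte.
            parseRootText(undeclared);
            System.out.println("Undeclared: parsed without error");
        } catch (Exception e) {
            System.out.println("Undeclared: " + e);
        }

        System.out.println("Declared: " + parseRootText(declared));
    }
}
```

The exact exception class and wording for the undeclared case depend on the parser implementation in use, so the sketch just prints whatever is thrown.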

    If one or more bytes of the document were corrupted, you may be able to
    simply edit the document and look for any glyphs that look out of place
    at the point in the document where the error occurred.
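Since the Xalan message here gives no location, it can help to find the offending byte yourself before opening an editor. The following is a rough sketch using the standard java.nio.charset.CharsetDecoder; the class name, method names, and sample bytes are hypothetical, and in practice you would load the real file (for instance with Files.readAllBytes) instead of the inline sample.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class FindBadUtf8 {

    // Byte offset where the first malformed UTF-8 sequence starts, or -1 if
    // the whole array decodes cleanly.
    public static int firstBadOffset(byte[] data) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer in = ByteBuffer.wrap(data);
        // UTF-8 never produces more chars than it consumes bytes.
        CharBuffer out = CharBuffer.allocate(data.length + 1);
        CoderResult result = decoder.decode(in, out, true);
        // On an error result the input position is left at the bad sequence.
        return result.isError() ? in.position() : -1;
    }

    // 1-based line number containing the given byte offset.
    public static int lineOf(byte[] data, int offset) {
        int line = 1;
        for (int i = 0; i < offset && i < data.length; i++) {
            if (data[i] == '\n') line++;
        }
        return line;
    }

    public static void main(String[] args) {
        // Sample input: Latin-1 bytes with an accented letter on line 2.
        byte[] data = "line one\ncaf\u00e9 two".getBytes(StandardCharsets.ISO_8859_1);
        int offset = firstBadOffset(data);
        if (offset < 0) {
            System.out.println("No malformed UTF-8 found");
        } else {
            System.out.println("First bad sequence at byte " + offset
                    + ", line " + lineOf(data, offset));
        }
    }
}
```

If the reported spot turns out to be an ordinary accented or CJK character rather than garbage, the file is more likely in a different encoding than corrupted.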


    HTH,

    Derek Harmon
     
    Derek Harmon, Jun 25, 2003
