newbie question on UTFDataFormatException

Discussion in 'XML' started by Sriv Chakravarthy, Jun 24, 2003.

  1. I receive a UTFDataFormatException while parsing a huge XML file. What
    does this exception mean, and what are the possible causes?
    Could this be a problem with the application, or is it purely a data
    problem?
    I am using the xerces-c SAX parser.
    Thanks for your help.
    Sriv Chakravarthy, Jun 24, 2003
    #1

  2. Thanks for your response.
    What is the difference between UTF-8 and LATIN1? If my XML document
    contains only ASCII characters, should the encoding be LATIN1?
    And how do I set the encoding: as a parameter to the handler object,
    or in the first line of the XML file itself (in the <?xml ...> line)?

    (Richard Tobin) wrote in message news:<bda09m$2gm5$>...
    > In article <>,
    > Sriv Chakravarthy <> wrote:
    >
    > >I receive a UTFDataFormatException while parsing a huge xml file. What
    > >is the meaning of this Exception and what are the possible causes ?

    >
    > It means that it is reading the file as UTF-8 and there is a sequence
    > of bytes in the file that is not legal UTF-8. Possibly your file is
    > corrupted, but more likely it's actually in some other encoding such
    > as Latin-1, and just needs a declaration specifying this (UTF-8 is
    > the default).
    >
    > -- Richard
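
    To make "a sequence of bytes in the file that is not legal UTF-8"
    concrete, here is a minimal C++ sketch that scans a buffer and reports
    the offset of the first ill-formed sequence. The name firstInvalidUtf8
    is hypothetical, and this is not a claim about what xerces-c does
    internally; it only follows the well-formed byte ranges defined by the
    Unicode standard. Something like it, run over the raw bytes of the
    file, may help locate the offending data.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of xerces-c): returns the byte offset of
// the first sequence that is not well-formed UTF-8, or std::string::npos
// if the whole buffer is valid.
std::size_t firstInvalidUtf8(const std::string& data) {
    const std::size_t n = data.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b0 = static_cast<unsigned char>(data[i]);
        if (b0 <= 0x7F) { ++i; continue; }  // plain ASCII byte

        std::size_t len = 0;                // total bytes in this sequence
        unsigned char lo = 0x80, hi = 0xBF; // allowed range of the 2nd byte
        if (b0 >= 0xC2 && b0 <= 0xDF) { len = 2; }
        else if (b0 == 0xE0) { len = 3; lo = 0xA0; }  // rejects overlong forms
        else if (b0 >= 0xE1 && b0 <= 0xEF) {
            len = 3;
            if (b0 == 0xED) hi = 0x9F;                // rejects surrogates
        }
        else if (b0 == 0xF0) { len = 4; lo = 0x90; }  // rejects overlong forms
        else if (b0 >= 0xF1 && b0 <= 0xF4) {
            len = 4;
            if (b0 == 0xF4) hi = 0x8F;                // stays below U+110000
        }
        else { return i; }  // 0x80..0xC1 and 0xF5..0xFF never start a sequence

        if (i + len > n) return i;  // sequence is cut off by end of data
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(data[i + k]);
            const unsigned char min = (k == 1) ? lo : 0x80;
            const unsigned char max = (k == 1) ? hi : 0xBF;
            if (b < min || b > max) return i;
        }
        i += len;
    }
    return std::string::npos;
}
```

    For example, a Latin-1 "é" is the single byte 0xE9, which UTF-8
    treats as the start of a three-byte sequence, so the buffer
    "caf\xE9" is reported as invalid at offset 3.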
    Sriv Chakravarthy, Jun 30, 2003
    #2

  3. Sriv Chakravarthy

    Bob Foster Guest

    "Sriv Chakravarthy" <> wrote in message
    news:...
    > thanks for your response.
    > What is the difference between UTF-8 and LATIN1 ? If my xml document
    > contains all ascii chars then should the encoding be LATIN1 ?
    > And how will I set the encoding - as a parameter to the handler object
    > or in the first line of the xml file itself ( in <?xml ...> line )


    If it's all ASCII, the encoding can be ASCII. UTF-8 and LATIN1 are OK, too,
    because ASCII is a subset of both.
    The best choice depends on what application is going to read the documents.
    Every parser is required to support UTF-8; most support the others, too.
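
    As a rough illustration of the difference, here is a small C++ sketch.
    The function names isAllAscii and latin1ToUtf8 are made up for this
    example and are not part of any parser. Bytes below 0x80 mean the same
    thing in ASCII, LATIN1 and UTF-8; a LATIN1 byte of 0x80 or above is one
    character in one byte, while UTF-8 encodes that same character as two
    bytes.

```cpp
#include <string>

// Hypothetical helper: true when every byte is below 0x80, i.e. the data
// is plain ASCII and reads the same as ASCII, LATIN1 or UTF-8.
bool isAllAscii(const std::string& data) {
    for (unsigned char b : data) {
        if (b >= 0x80) return false;
    }
    return true;
}

// Hypothetical helper: re-encodes LATIN1 (ISO-8859-1) bytes as UTF-8.
// Each LATIN1 byte is the code point U+0000..U+00FF of the same value.
std::string latin1ToUtf8(const std::string& latin1) {
    std::string out;
    out.reserve(latin1.size());
    for (unsigned char b : latin1) {
        if (b < 0x80) {
            out.push_back(static_cast<char>(b));  // ASCII is unchanged
        } else {
            // Code points U+0080..U+00FF take two bytes in UTF-8.
            out.push_back(static_cast<char>(0xC0 | (b >> 6)));
            out.push_back(static_cast<char>(0x80 | (b & 0x3F)));
        }
    }
    return out;
}
```

    So "é" is the single byte 0xE9 in LATIN1 but the two bytes 0xC3 0xA9
    in UTF-8, which is why a LATIN1 file read as UTF-8 can fail to parse.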

    Bob Foster
    http://www.xmlbuddy.com/
    Bob Foster, Jul 2, 2003
    #3
  4. In the xerces-c SAX parser, how do you set the encoding? Is it set in the
    first line (<?xml ...>) of the XML document, or is it set via a member
    function?


    "Bob Foster" <> wrote in message news:<FMyMa.17056$926.572@sccrnsc03>...
    > "Sriv Chakravarthy" <> wrote in message
    > news:...
    > > thanks for your response.
    > > What is the difference between UTF-8 and LATIN1 ? If my xml document
    > > contains all ascii chars then should the encoding be LATIN1 ?
    > > And how will I set the encoding - as a parameter to the handler object
    > > or in the first line of the xml file itself ( in <?xml ...> line )

    >
    > If it's all ascii, the encoding can be ASCII. UTF-8 or LATIN1 are ok, too.
    > The best choice depends on what application is going to read the documents.
    > Every parser is required to support UTF-8; most support the others, too.
    >
    > Bob Foster
    > http://www.xmlbuddy.com/
    Sriv Chakravarthy, Jul 3, 2003
    #4