Error while parsing local languages using SAX/DOM parser.

Discussion in 'XML' started by Sidhartha, Sep 15, 2008.

  1. Sidhartha

    Sidhartha Guest

    Hi,
    I am facing a problem while parsing local-language characters with a
    SAX parser. We use DOM to parse and SAX to read the source. When our
    application parses strings in a local language, especially Czech,
    Polish, or Turkish, something else appears in place of the
    local-language character.

    E.g.:
    Input string: ahoj, jak se máš
    Output string: ahoj, jak se m&aacute;š
    OS: Solaris.

    We persist this XML in the database. The issue did not occur when
    the parser was IBM's and the OS was Windows NT. The local-language
    character is being replaced by "&aacute", which causes a problem when
    we translate it back. Can anyone please help me?

    Stack Trace

    class org.xml.sax.SAXException message = Parser reported fatal error while parsing : Input Source/DTD
    Stack Trace:
    org.xml.sax.SAXParseException: The entity "aacute" was referenced, but not declared.
        at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
        at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
        at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
        at org.apache.xerces.impl.XMLScanner.reportFatalError(Unknown Source)
        at org.apache.xerces.impl.XMLScanner.scanAttributeValue(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanAttribute(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanStartElement(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentScannerImpl$ContentDispatcher.scanRootElementHook(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)

    Thanks,
    Sidhartha
     
    Sidhartha, Sep 15, 2008
    #1

  2. Sidhartha wrote:
    > Hi,
    > I am facing a problem while parsing local-language characters with a
    > SAX parser. We use DOM to parse and SAX to read the source. When our
    > application parses strings in a local language, especially Czech,
    > Polish, or Turkish, something else appears in place of the
    > local-language character.
    >
    > E.g.:
    > Input string: ahoj, jak se máš
    > Output string: ahoj, jak se m&aacute;š
    > OS: Solaris.
    >
    > We persist this XML in the database. The issue did not occur when
    > the parser was IBM's and the OS was Windows NT. The local-language
    > character is being replaced by "&aacute", which causes a problem when
    > we translate it back. Can anyone please help me?


    It is rather odd that you get an XHTML entity reference '&aacute;' in
    your XML. I am not sure why that happens. Are you using XSLT, for
    instance, to serialize the XML?
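
The fatal error in the stack trace can be reproduced in isolation. The sketch below is an illustration under assumptions, not the poster's code: the class name `EntityDemo`, the method `readTextAttribute`, and the `<msg text="...">` document are hypothetical, and it uses the JDK's built-in JAXP parser rather than a standalone Xerces build. It shows that `&aacute;` in an attribute value is rejected when no DTD declares it, while the numeric character reference `&#225;` is accepted.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class EntityDemo {

    /**
     * Parses the given XML and returns the root element's "text" attribute,
     * or null if the parser reports a fatal error (for example, an
     * undeclared entity reference).
     */
    public static String readTextAttribute(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors instead of printing them.
        builder.setErrorHandler(new DefaultHandler());
        try {
            return builder.parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement()
                    .getAttribute("text");
        } catch (SAXParseException e) {
            return null;
        }
    }

    public static void main(String[] args) throws Exception {
        // Named XHTML entity with no DTD declaring it: not well-formed XML.
        String named = "<msg text=\"ahoj, jak se m&aacute;\u0161\"/>";
        // Numeric character reference for U+00E1: valid in any XML document.
        String numeric = "<msg text=\"ahoj, jak se m&#225;\u0161\"/>";

        System.out.println("named entity parsed:   " + (readTextAttribute(named) != null));
        System.out.println("numeric reference parsed: " + (readTextAttribute(numeric) != null));
    }
}
```

If whatever component writes the XML can be configured to emit numeric character references, or the literal character in a declared encoding such as UTF-8, the undeclared-entity error should not arise. Which component emits `&aacute;` in this case is not known from the thread.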

    --

    Martin Honnen
    http://JavaScript.FAQTs.com/
     
    Martin Honnen, Sep 15, 2008
    #2