Handling (retain) special characters when parsing XML?

Discussion in 'Java' started by Piper707@hotmail.com, Apr 5, 2007.

  1. Guest

    Hi,

    We need help with handling special characters when processing XML in
    two stages: first parsing it with SAX, then converting that output into
    a DOM document.

    This is what we do:

    The input XML has all special characters, such as the ampersand,
    replaced with the correct entity references: &amp;

    SAXParserFactory factory = SAXParserFactory.newInstance();
    SAXParser parser = factory.newSAXParser();
    parser.parse( new File( FileWithXml ), handler );

    The handler saves all the parsed XML into a string in a particular
    format. In the parsed XML, the &amp; gets converted into a bare &
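
To illustrate the step above: SAX passes character data to the handler already decoded, so a handler that appends it verbatim writes a bare & into its output. The sketch below is only an assumption about what such a handler could look like, not the poster's actual code; `EscapingHandlerSketch`, `escape`, and `rebuild` are hypothetical names. It re-escapes text and attribute values as they are appended, so the resulting string should stay well-formed.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: rebuilds the XML as a string, escaping as it goes.
public class EscapingHandlerSketch extends DefaultHandler {
    private final StringBuilder parsedXml = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        parsedXml.append('<').append(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            // Attribute values also arrive decoded, so escape them too.
            parsedXml.append(' ').append(atts.getQName(i))
                     .append("=\"").append(escape(atts.getValue(i))).append('"');
        }
        parsedXml.append('>');
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        parsedXml.append("</").append(qName).append('>');
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // "&amp;" in the input reaches this callback as "&"; escape it again.
        parsedXml.append(escape(new String(ch, start, length)));
    }

    // Replace the characters that are not allowed to appear bare in XML output.
    // The ampersand must be handled first so later replacements are not re-escaped.
    public static String escape(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;");
    }

    // Parse an XML string with SAX and return the rebuilt, re-escaped string.
    public static String rebuild(String xml) {
        try {
            EscapingHandlerSketch handler = new EscapingHandlerSketch();
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.parsedXml.toString();
        } catch (Exception e) {
            throw new IllegalStateException("SAX parsing failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rebuild("<a>x &amp; y</a>"));
    }
}
```

Escaping inside the callbacks avoids a separate search-and-replace pass over the finished string, which could not tell an intentional entity reference from a stray ampersand.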

    String parsedString = parsedXml.toString();

    parsedString needs to be converted into a document:

    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    document = dbFactory.newDocumentBuilder().parse(
        new InputSource(new StringReader(parsedString)));

    But because of the bare &, we cannot convert the string to a document
    unless & is again replaced with &amp;

    Is there a way to retain special characters the first time around, so
    we don't have to replace all occurrences again before converting to a
    document? Can a custom entity reference handler be used for anything
    like this?

    Thanks for any help
    Rohit
    Piper707@hotmail.com, Apr 5, 2007
    #1

  2. Piper707@hotmail.com wrote in
    news:1175728257.299882.68170@p77g2000hsh.googlegroups.com:

    > Is there a way to retain special characters the first time around, so
    > we don't have to replace all occurrences again before converting to a
    > document? Can a custom entity reference handler be used for anything
    > like this?


    I'd recommend not attempting to re-use the parsed XML (parsed by the SAX
    parser) as input to the DOM parser. Instead, just create a new InputSource
    from the input file and use that to feed the DOM parser.
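
The recommendation above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not code from the thread: `DirectDomParseSketch`, `parseFile`, and `parse` are hypothetical names, and the file is assumed to be the same one that was given to the SAX parser. Because the DOM parser reads the original, still-escaped XML, no ampersand replacement is needed.

```java
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: build the DOM from the original input, not from SAX output.
public class DirectDomParseSketch {

    // Create an InputSource from the input file's URI and hand it to the DOM parser.
    public static Document parseFile(File xmlFile) {
        return parse(new InputSource(xmlFile.toURI().toASCIIString()));
    }

    // Parse any InputSource into a DOM Document, wrapping checked exceptions.
    public static Document parse(InputSource source) {
        try {
            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = dbFactory.newDocumentBuilder();
            return builder.parse(source);
        } catch (Exception e) {
            throw new IllegalStateException("DOM parsing failed", e);
        }
    }

    public static void main(String[] args) {
        // args[0] is assumed to be the path of the original XML file.
        Document document = parseFile(new File(args[0]));
        System.out.println(document.getDocumentElement().getNodeName());
    }
}
```

The file is read twice this way, once per parser, which trades some I/O for not having to produce well-formed XML from the SAX handler.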

    Cheers
    GRB

    --
    ---------------------------------------------------------------------
    Greg R. Broderick

    A. Top posters.
    Q. What is the most annoying thing on Usenet?
    ---------------------------------------------------------------------
    Greg R. Broderick, Apr 5, 2007
    #2
