How to retrieve XML CDATA text contents with org.xml.sax.ext.DefaultHandler2?

Discussion in 'XML' started by RC, Apr 30, 2009.

  1. RC

    RC Guest

    For example, I have an XML element

    <script>
    <![CDATA[
    My script is here
    ]]>
    </script>

    I am using org.xml.sax.ext.DefaultHandler2 to parse my XML
    file. How do I retrieve my script contents?




    What shall I do in these two methods?
    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes)
            throws SAXException
    {
        if (qName.equals("script"))
        {
            // How to retrieve my script contents?
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException
    {
        if (qName.equals("script"))
        {
            // How to retrieve my script contents?
        }
    }



    The two methods below print nothing at all:
    @Override
    public void endCDATA()
    {
        System.out.println("End of CDATA");
    }

    @Override
    public void startCDATA()
    {
        System.out.println("Start of CDATA");
    }

    Thank you very much in advance!
     
    RC, Apr 30, 2009
    #1

  2. Lew

    Lew Guest

    RC wrote:
    > For example I have a XML tag
    >
    > <script>
    > <![CDATA[
    > My script is here
    > ]]>
    > </script>
    >
    > I am using org.xml.sax.ext.DefaultHandler2 to parse my XML
    > file. How do I retrieve my script contents?


    Via the 'characters()' method.

    > What shall I do in these two methods?


    Mark the beginning and end of each element so that your parser knows
    where it is in the parse process.

    > @Override
    > public void startElement(String uri, String localName, String qName,
    > Attributes attributes)
    > throws SAXException
    > {
    >         if (qName.equals("script"))
    >         {
    >                 // How to retrieve my script contents?


    Not here. What do the Javadocs tell you about the purpose of this
    method and the event it handles?

    >         }
    > }
    >
    > @Override
    > public void endElement(String uri, String localName, String qName)
    > throws SAXException
    > {
    >         if (qName.equals("script"))
    >         {
    >                 // How to retrieve my script contents?


    Not here. What do the Javadocs tell you about the purpose of this
    method and the event it handles?

    >         }
    >
    > }
    >
    > Below two methods have no print out at all


    Did you read the Javadocs?

    > @Override
    > public void endCDATA()
    > {
    >         System.out.println("End of CDATA");
    > }
    >
    > @Override
    > public void startCDATA()
    > {
    >         System.out.println("Start of CDATA");
    > }


    The Javadocs will tell you:
    > The contents of the CDATA section will be reported through the regular
    > characters event; this event is intended only to report the boundary.


    While not always enough, the API Javadocs are always a good place to
    start, and often will completely answer your questions.

    --
    Lew
     
    Lew, Apr 30, 2009
    #2

  3. Thu, 30 Apr 2009 12:02:36 -0400, /RC/:

    > For example I have a XML tag
    >
    > <script>
    > <![CDATA[
    > My script is here
    > ]]>
    > </script>
    >
    > I am using org.xml.sax.ext.DefaultHandler2 to parse my XML
    > file. How do I retrieve my script contents?


    You retrieve it as ordinary text content delivered through
    'characters' events to your ContentHandler. Whether the text is
    written as a CDATA section (or not) in the source is purely a
    syntactic detail which shouldn't bother you.
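
    A minimal sketch of that approach, assuming the JDK's built-in SAX
    parser. The class name ScriptTextHandler and the helper extract() are
    made up for illustration; only startElement, characters and endElement
    come from the SAX API. The handler sets a flag when <script> opens,
    appends whatever 'characters' delivers while the flag is set, and has
    the complete text once <script> closes:

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical handler: keeps the text of the most recent <script> element.
class ScriptTextHandler extends DefaultHandler2 {
    private final StringBuilder text = new StringBuilder();
    private boolean inScript = false;

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes attributes) {
        if (qName.equals("script")) {
            inScript = true;     // start collecting
            text.setLength(0);   // discard text from any earlier <script>
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // CDATA content arrives here like any other text. The parser may
        // call this several times per element, so append rather than assign.
        if (inScript) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("script")) {
            inScript = false;    // the collected text is now complete
        }
    }

    String scriptText() {
        return text.toString();
    }

    // Convenience helper for this example: parse a string, return the text.
    static String extract(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            ScriptTextHandler handler = new ScriptTextHandler();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.scriptText();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }
}
```

    The collected text includes the line breaks around the CDATA section,
    so trim() it if you only want the script body.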

    > Below two methods have no print out at all
    > @Override
    > public void endCDATA()
    > {
    > System.out.println("End of CDATA");
    > }
    >
    > @Override
    > public void startCDATA()
    > {
    > System.out.println("Start of CDATA");
    > }
    >
    > Thank you very much in advance!


    You need to set the "lexical-handler" [1] property of the parser
    with a reference to your handler in addition to setting it as a
    'contentHandler':

    XMLReader parser;
    DefaultHandler2 myHandler;
    ...
    parser.setContentHandler(myHandler);
    parser.setProperty("http://xml.org/sax/properties/"
            + "lexical-handler", myHandler);

    [1] SAX2 Standard Handler and Property IDs
    <http://www.saxproject.org/apidoc/org/xml/sax/package-summary.html>
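
    To illustrate, here is a small sketch, assuming the JDK's bundled
    parser (which does recognize the property). CdataBoundaryDemo,
    RecordingHandler and parse() are hypothetical names; the handler
    only records which callbacks fire, with and without the
    "lexical-handler" property set:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

class CdataBoundaryDemo {

    // Hypothetical handler that logs the names of the callbacks it receives.
    static class RecordingHandler extends DefaultHandler2 {
        final List<String> events = new ArrayList<>();

        @Override
        public void startCDATA() {
            events.add("startCDATA");
        }

        @Override
        public void endCDATA() {
            events.add("endCDATA");
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            events.add("characters");
        }
    }

    // Parses the XML and returns the recorded callback names in order.
    static List<String> parse(String xml, boolean registerLexicalHandler) {
        try {
            XMLReader reader =
                    SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            RecordingHandler handler = new RecordingHandler();
            reader.setContentHandler(handler);
            if (registerLexicalHandler) {
                // Without this call the parser never reports CDATA boundaries.
                reader.setProperty(
                        "http://xml.org/sax/properties/lexical-handler", handler);
            }
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.events;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }
}
```

    With the property set, the recorded events include startCDATA and
    endCDATA; without it only the 'characters' events show up, which
    matches the silent endCDATA()/startCDATA() in the question.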

    --
    Stanimir
     
    Stanimir Stamenkov, May 4, 2009
    #3
  4. Lew

    Lew Guest

    Stanimir Stamenkov wrote:
    > You need to set the "lexical-handler" [1] property of the parser with
    > the reference to your handler in addition to setting it as a
    > 'contentHandler':


    Are you sure about that?

    --
    Lew
     
    Lew, May 4, 2009
    #4
  5. In article <gtmr70$e6j$>, Lew <>
    wrote:

    > Stanimir Stamenkov wrote:
    > > You need to set the "lexical-handler" [1] property of the parser
    > > with the reference to your handler in addition to setting it as a
    > > 'contentHandler':

    >
    > Are you sure about that?


    I was surprised to see that the default value of lexical-handler is
    unspecified [1]. On closer reading, I see that the LexicalHandler
    interface is optional [2]. The API suggests setting the property and
    handling any SAXNotRecognizedException to determine if the feature is
    implemented.

    [1]<http://www.saxproject.org/apidoc/org/xml/sax/package-summary.html>
    [2]<http://www.saxproject.org/apidoc/org/xml/sax/ext/LexicalHandler.html>
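
    That check could look something like the sketch below.
    LexicalHandlerSupport, tryRegister() and defaultReader() are names
    invented for this example; the property URI and the two exception
    types are from the SAX API. It simply attempts the registration and
    reports whether the reader accepted it:

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.LexicalHandler;

class LexicalHandlerSupport {

    static final String LEXICAL_HANDLER =
            "http://xml.org/sax/properties/lexical-handler";

    // Returns true if the reader accepted the handler, false if the
    // property is unknown to it or cannot be set.
    static boolean tryRegister(XMLReader reader, LexicalHandler handler) {
        try {
            reader.setProperty(LEXICAL_HANDLER, handler);
            return true;
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            // This reader does not offer LexicalHandler events.
            return false;
        }
    }

    // Helper for the example: the platform's default XMLReader.
    static XMLReader defaultReader() {
        try {
            return SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        } catch (Exception e) {
            throw new IllegalStateException("no SAX parser available", e);
        }
    }
}
```

    If tryRegister() returns false, the text of CDATA sections still
    arrives through 'characters'; only the boundary events are missing.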

    --
    John B. Matthews
    trashgod at gmail dot com
    <http://sites.google.com/site/drjohnbmatthews>
     
    John B. Matthews, May 4, 2009
    #5
  6. Mon, 04 May 2009 09:39:44 -0400, /Lew/:
    > Stanimir Stamenkov wrote:
    >
    >> You need to set the "lexical-handler" [1] property of the parser with
    >> the reference to your handler in addition to setting it as a
    >> 'contentHandler':

    >
    > Are you sure about that?


    Yes. As you suggested, you may consult the API docs, a reference
    to which I supplied. If you perform a simple test you'll see for
    yourself, too. Note that I meant one needs to set a
    "lexical-handler" only to detect CDATA section boundaries, i.e. to
    receive 'startCDATA' and 'endCDATA' events, not as a requirement to
    read the content of CDATA sections (if that wasn't clear).

    --
    Stanimir
     
    Stanimir Stamenkov, May 4, 2009
    #6
  7. Lew

    Lew Guest

    Stanimir Stamenkov wrote:
    > Note I've meant one needs to set a "lexical-handler"
    > only to detect CDATA section boundaries, i.e. to receive 'startCDATA'
    > and 'endCDATA' events, not as requirement to read the content of CDATA
    > sections (if that wasn't clear).


    Thanks, that wasn't.

    --
    Lew
     
    Lew, May 5, 2009
    #7
