How do you get a class attribute from a span tag

Discussion in 'Java' started by Daryn, Feb 2, 2011.

  1. Daryn

    Daryn Guest

    Hi,

    I am using org.w3c.dom to extract values from some HTML.

    Some html with a span tag like this: <SPAN class='item c123'>-</SPAN>

    My code is like this:

    StringReader reader = new StringReader(html());
    InputSource inputSource = new InputSource(reader);
    SAX2DOM sax2dom = new SAX2DOM();

    Parser tagSoupParser = new Parser();
    tagSoupParser.setContentHandler(sax2dom);
    tagSoupParser.setFeature(Parser.namespacesFeature, false);
    tagSoupParser.parse(inputSource);

    Document document = (Document) sax2dom.getDOM();
    NodeList trElements = document.getElementsByTagName("span");

    Node node = trElements.item(0);


    I would like to do something like this:
    ((Element)node).getAttributes().getNamedItem("class")

    But that throws an "com.sun.org.apache.xerces.internal.dom.TextImpl
    cannot be cast to org.w3c.dom.Element" exception.

    How can I get the value of the class attribute in that span tag?

    Thanks in advance!
    Daryn, Feb 2, 2011
    #1

  2. On 02/02/2011 19:12, Daryn allegedly wrote:
    > Hi,
    >
    > I am using org.w3c.dom to extract values from some HTML.
    >
    > Some html with a span tag like this:<SPAN class='item c123'>-</SPAN>
    >
    > My code is like this:
    >
    > StringReader reader = new StringReader(html());
    > InputSource inputSource = new InputSource(reader);
    > SAX2DOM sax2dom = new SAX2DOM();
    >
    > Parser tagSoupParser = new Parser();
    > tagSoupParser.setContentHandler(sax2dom);
    > tagSoupParser.setFeature(Parser.namespacesFeature, false);
    > tagSoupParser.parse(inputSource);
    >
    > Document document = (Document) sax2dom.getDOM();
    > NodeList trElements = document.getElementsByTagName("span");
    >
    > Node node = trElements.item(0);
    >
    >
    > I would like to do something like this:
    > ((Element)node).getAttributes().getNamedItem("class")
    >
    > But that throws an "com.sun.org.apache.xerces.internal.dom.TextImpl
    > cannot be cast to org.w3c.dom.Element" exception.
    >
    > How can I get the value of the class attribute in that span tag?
    >
    > Thanks in advance!
    >


    Sounds fishy. Make sure that the code you're running is the same as what
    you've posted; make sure the exception occurs where you suggest it
    occurs. Furthermore I'd suggest printing out the elements of the
    returned list. You can also check the type of a node by comparing the
    value returned by its getNodeType() method against the constants defined
    in the org.w3c.dom.Node interface (Node.ELEMENT_NODE, Node.TEXT_NODE, ...).
    Also, why aren't you using a DocumentBuilder if what you need is a DOM?
    Daniele Futtorovic, Feb 2, 2011
    #2
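
The node-type check suggested above can be sketched as follows. This is only an illustration, not the original poster's code: it uses the JDK's own XML parser on a small well-formed sample instead of TagSoup, and the class name NodeTypeCheck and the helpers describe, classOf and parse are hypothetical names made up for this example. It walks a node list, reports what kind of node each entry is, and only casts to Element when the node really is one.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeTypeCheck {

    // Hypothetical helper: names the node kinds that matter for this problem.
    static String describe(Node node) {
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE: return "ELEMENT <" + node.getNodeName() + ">";
            case Node.TEXT_NODE:    return "TEXT \"" + node.getNodeValue() + "\"";
            default:                return "OTHER (type " + node.getNodeType() + ")";
        }
    }

    // Returns the class attribute, or null when the node is not an element
    // (a Text node has no attributes, and casting it to Element would fail).
    static String classOf(Node node) {
        if (node instanceof Element) {
            return ((Element) node).getAttribute("class");
        }
        return null;
    }

    // Parses well-formed markup with the JDK parser; wraps checked exceptions.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse input", e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample input: a text node followed by a span element.
        Document doc = parse("<p>before <span class='item c123'>-</span></p>");
        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            System.out.println(describe(child) + " -> class = " + classOf(child));
        }
    }
}
```

If a loop like this reports a TEXT entry at the index being cast, the node in hand is not the span itself, which would match the TextImpl in the exception message.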

  3. Arne Vajhøj

    Arne Vajhøj Guest

    On 02-02-2011 13:12, Daryn wrote:
    > I am using org.w3c.dom to extract values from some HTML.
    >
    > Some html with a span tag like this:<SPAN class='item c123'>-</SPAN>
    >
    > My code is like this:
    >
    > StringReader reader = new StringReader(html());
    > InputSource inputSource = new InputSource(reader);
    > SAX2DOM sax2dom = new SAX2DOM();
    >
    > Parser tagSoupParser = new Parser();
    > tagSoupParser.setContentHandler(sax2dom);
    > tagSoupParser.setFeature(Parser.namespacesFeature, false);
    > tagSoupParser.parse(inputSource);
    >
    > Document document = (Document) sax2dom.getDOM();
    > NodeList trElements = document.getElementsByTagName("span");
    >
    > Node node = trElements.item(0);
    >
    >
    > I would like to do something like this:
    > ((Element)node).getAttributes().getNamedItem("class")
    >
    > But that throws an "com.sun.org.apache.xerces.internal.dom.TextImpl
    > cannot be cast to org.w3c.dom.Element" exception.
    >
    > How can I get the value of the class attribute in that span tag?


    This:

    String xml = "<SPAN class='item c123'>bla bla</SPAN>";
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder db = dbf.newDocumentBuilder();
    Document doc = db.parse(new InputSource(new StringReader(xml)));
    Element elm = (Element) doc.getElementsByTagName("SPAN").item(0);
    System.out.println("content = " + elm.getFirstChild().getNodeValue());
    System.out.println("class = " + elm.getAttribute("class"));

    works here.

    Arne
    Arne Vajhøj, Feb 2, 2011
    #3
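
Building on the snippet in post #3, here is a rough, self-contained sketch that reads the class attribute of every matching element and splits a value such as 'item c123' into its individual names. The class SpanClasses and the helpers classNames and classNamesByTag are hypothetical names introduced for illustration, and the input is assumed to be well-formed enough for the JDK's DocumentBuilder. Note that this parser matches tag names case-sensitively, which is why the lookup uses "SPAN" for this input.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SpanClasses {

    // Hypothetical helper: splits a class attribute such as "item c123" into names.
    static List<String> classNames(Element element) {
        // getAttribute returns "" (not null) when the attribute is absent.
        String value = element.getAttribute("class").trim();
        if (value.isEmpty()) {
            return new ArrayList<>();
        }
        return Arrays.asList(value.split("\\s+"));
    }

    // Collects the class names of every element with the given tag name.
    // getElementsByTagName only returns elements, so the cast is safe here.
    static List<List<String>> classNamesByTag(String xml, String tagName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList matches = doc.getElementsByTagName(tagName);
            List<List<String>> result = new ArrayList<>();
            for (int i = 0; i < matches.getLength(); i++) {
                result.add(classNames((Element) matches.item(i)));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse input", e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample input: one span with two class names, one with none.
        String xml = "<div><SPAN class='item c123'>-</SPAN><SPAN>x</SPAN></div>";
        System.out.println(classNamesByTag(xml, "SPAN"));
    }
}
```

Asking the same document for "span" in lower case returns an empty list with this parser, so it is worth checking which case the parser in use reports for element names.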
