Daryn
Hi,
I am using org.w3c.dom to extract values from some HTML.
The HTML contains a span tag like this: <SPAN class='item c123'>-</SPAN>
My code is like this:
StringReader reader = new StringReader(html());
InputSource inputSource = new InputSource(reader);
SAX2DOM sax2dom = new SAX2DOM();
Parser tagSoupParser = new Parser();
tagSoupParser.setContentHandler(sax2dom);
tagSoupParser.setFeature(Parser.namespacesFeature, false);
tagSoupParser.parse(inputSource);
Document document = (Document) sax2dom.getDOM();
NodeList spanElements = document.getElementsByTagName("span");
Node node = spanElements.item(0);
I would like to do something like this:
((Element)node).getAttributes().getNamedItem("class")
But that throws a ClassCastException: "com.sun.org.apache.xerces.internal.dom.TextImpl
cannot be cast to org.w3c.dom.Element".
How can I get the value of the class attribute in that span tag?
Thanks in advance!
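For reference, here is a minimal sketch of the lookup I am after, written with the JDK's built-in DocumentBuilder instead of TagSoup and SAX2DOM so that it runs on its own. That swap is an assumption made only for illustration, and the class name SpanClassReader and the helper firstSpanClass are hypothetical. It checks the node type before casting, so a text node would be skipped rather than cast, and it reads the value with Element.getAttribute.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SpanClassReader {

    // Hypothetical helper: returns the class attribute of the first <span>
    // element in the markup, or null if no span element is found.
    // Assumes well-formed input, since the JDK parser is used here in place
    // of TagSoup.
    static String firstSpanClass(String markup) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document document = builder.parse(new InputSource(new StringReader(markup)));

        NodeList spans = document.getElementsByTagName("span");
        for (int i = 0; i < spans.getLength(); i++) {
            Node node = spans.item(i);
            // Only cast once the node is known to be an element.
            if (node.getNodeType() == Node.ELEMENT_NODE) {
                // getAttribute returns "" when the attribute is absent.
                return ((Element) node).getAttribute("class");
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        String markup = "<div><span class='item c123'>-</span></div>";
        System.out.println(firstSpanClass(markup));
    }
}
```

Whether the same check is enough with the SAX2DOM-built document is something I have not confirmed; the sketch only shows the shape of the call I would like to make.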