Getting HTML title using HTMLEditorKit.ParserCallback

Bill Tschumy

I am parsing an HTML file using ParserDelegator and a ParserCallback. I am
trying to get the document title and the HREF links. The ParserCallback is
successfully getting the HREF, so I know it is basically working. However,
when I try to get the title, I always get back null. Here is the relevant
code of the ParserCallback subclass. Does anyone have any clue as to what
I'm doing wrong?

public void handleStartTag(HTML.Tag tag,
        MutableAttributeSet attrSet, int pos)
{
    if (tag == HTML.Tag.TITLE)
    {
        urlTitle = (String)attrSet.getAttribute(HTML.Attribute.TITLE);
        System.out.println("attrSet: " + attrSet); // prints ""
        System.out.println("found title: " + urlTitle); // prints null
    }
    if (tag == HTML.Tag.A)
    {
        // This successfully gets the target URL
        String targetURLStr =
                (String)attrSet.getAttribute(HTML.Attribute.HREF);
    }
}
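
One likely explanation for the null: the title text is the content of the TITLE element rather than an attribute on the tag, so the attribute set passed to handleStartTag is empty. The text between the start and end tags is normally delivered through handleText instead. Below is a minimal sketch of that approach, not a definitive fix. The class name TitleAndLinkCallback, the inTitle flag, and the parse helper are all made up for illustration and are not from the code above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical callback: collects the <title> text and all <a href> values.
public class TitleAndLinkCallback extends HTMLEditorKit.ParserCallback {
    private boolean inTitle = false;                    // true between <title> and </title>
    private final StringBuilder title = new StringBuilder();
    private final List<String> links = new ArrayList<>();

    @Override
    public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrSet, int pos) {
        if (tag == HTML.Tag.TITLE) {
            // The title is element content, so just remember that we are inside it.
            inTitle = true;
        } else if (tag == HTML.Tag.A) {
            // HREF really is an attribute, so it can be read from the attribute set.
            Object href = attrSet.getAttribute(HTML.Attribute.HREF);
            if (href != null) {
                links.add(href.toString());
            }
        }
    }

    @Override
    public void handleText(char[] data, int pos) {
        // Text that arrives while inside the TITLE element is the title itself.
        if (inTitle) {
            title.append(data);
        }
    }

    @Override
    public void handleEndTag(HTML.Tag tag, int pos) {
        if (tag == HTML.Tag.TITLE) {
            inTitle = false;
        }
    }

    public String getTitle() {
        return title.toString();
    }

    public List<String> getLinks() {
        return links;
    }

    // Hypothetical convenience helper: parse an HTML string with this callback.
    public static TitleAndLinkCallback parse(String html) throws IOException {
        TitleAndLinkCallback callback = new TitleAndLinkCallback();
        // The third argument asks the parser to ignore any charset declaration.
        new ParserDelegator().parse(new StringReader(html), callback, true);
        return callback;
    }
}
```

With something like this, calling TitleAndLinkCallback.parse(html) and then getTitle() should return the text between the TITLE tags, while getLinks() returns the HREF values collected the same way as in the original code.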
 
