Parsing HTML with HTML::Tree

Ninja Li

Hi,

I am trying to parse the following HTML content:

-- first part
<td class="storyTitle"> @0.1.7.4.0.0.5.0.0.11.1
<a href="/GeneralContent/MySearch.aspx?PagePrefix=IN&amp;
target="_new"> @0.1.7.4.0.0.5.0.0.11.1.0
"Chicago"

-- second part
<td class="storyTitle"> @0.1.7.4.0.0.5.0.0.17.1
<b> @0.1.7.4.0.0.5.0.0.17.1.0
"Something here"

I am using HTML::Tree to parse the HTML. Whenever there is no
<a href=.....> segment, as in the second part of the HTML, I would
like to print something else, such as "Error occurred". Notice that
both the first and second parts of the HTML share the text
"<td class="storyTitle">", which I use as the search criterion.

My problem is that I don't know what the following code returns
when <a href=...> is not found. I tried testing against "" and
undef, but that doesn't seem to work.

The following is some of my code and it doesn't work as I wish.

use strict;
use LWP::Simple;
use HTML::Tree;

if ($td->attr('class') eq 'storyTitle')
{
    if (my $sym = $td->find('a'))
    {
        if ($sym->as_text() ne '')
        {
            print $sym->as_text() . "\n";
        }
        else
        {
            print "Error Occurred" . "\n";
        }
    }
}
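
A note on the snippet above: the else branch belongs to the inner if (the as_text test), not to the find('a') test. In scalar context find('a') returns undef when no <a> element exists, so for the second part of the HTML neither branch runs and nothing is printed. The following is only a minimal sketch of one possible restructuring, not a confirmed fix from the thread. The sample markup, the "/example" href, and the helper name story_title_text are assumptions made up for illustration.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical sample markup modelled on the two cases in the question.
# The href value is a placeholder, not the real URL.
my $html = <<'HTML';
<table><tr>
<td class="storyTitle"><a href="/example" target="_new">Chicago</a></td>
<td class="storyTitle"><b>Something here</b></td>
</tr></table>
HTML

# Hypothetical helper: return the link text of a cell, or a fallback
# message when the cell has no <a> element or the link text is empty.
sub story_title_text {
    my ($td) = @_;
    my $link = $td->find('a');    # undef in scalar context when not found
    return 'Error Occurred' unless defined $link;
    my $text = $link->as_text;
    return $text ne '' ? $text : 'Error Occurred';
}

my $tree = HTML::TreeBuilder->new_from_content($html);

# look_down matches on tag and attribute together, so cells without a
# class attribute are skipped without an "uninitialized value" warning.
for my $td ($tree->look_down(_tag => 'td', class => 'storyTitle')) {
    print story_title_text($td), "\n";
}

$tree->delete;
```

Putting the fallback in a single helper keeps both failure cases (no link at all, or a link with empty text) on the same code path, so the caller only has to print the result.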
 
Ninja Li

Tad,

Thanks for your advice. You hit the nail on the head and it works
well now.

Nick
 
