HTML::TokeParser; __DATA__ as a filehandle

Jonathan

This is an embarrassingly simple question, but I'm trying to get
HTML::TokeParser to run an example even simpler than the one given
in the docs.

I expect to get output of "3" from the program. Instead, I get "0."
What is the trivial reason for this? Is there a problem with my use
of __DATA__, or my reference to main::DATA? Is the central loop not
working?

I hope that I have done enough of my homework on this to warrant a meaningful
response. If I have to go back to perlopen or perlreftut, please let me know.

Here is my sample program:

#!/usr/bin/perl -w

use warnings;
use strict;
use diagnostics;

use HTML::TokeParser;
my $fh = \<main::DATA>;
my $p = HTML::TokeParser->new($fh) || die "Bad open: $! \n";
my $heading3s = 0;

while (my $token=$p->get_tag("<h3>")){
$heading3s++;
}


print "Number of Level 3 Headings: $heading3s\.\n";

__DATA__
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
<head>
<title>Test Page</title>
</head>
<body>
<h3>Alpha</h3>
<h3>Beta</h3>
<h3>Gamma</h3>
</body>
</html>
 
Paul Lalli

I expect to get output of "3" from the program. Instead, I get "0."
What is the trivial reason for this? Is there a problem with my use
of __DATA__, or my reference to main::DATA? Is the central loop not
working?

No, yes, and yes: __DATA__ itself is fine, but both the reference to main::DATA and the loop are wrong.

use HTML::TokeParser;
my $fh = \<main::DATA>;

This is attempting to read a line from the DATA filehandle, and then
take a reference to it. Change to:
my $fh = \*main::DATA;
my $p = HTML::TokeParser->new($fh) || die "Bad open: $! \n";
my $heading3s = 0;

while (my $token=$p->get_tag("<h3>")){

Read the docs for HTML::TokeParser. get_tag() takes the name of the tag, not the tag itself: get_tag("h3"), not get_tag("<h3>").
$heading3s++;
}


print "Number of Level 3 Headings: $heading3s\.\n";
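
Putting both corrections together, here is a minimal sketch of the whole program. It assumes HTML::TokeParser is installed; count_tags() is a hypothetical helper written for this example, not part of the module. The source can be a filehandle (a glob reference such as \*DATA) or a reference to a string holding the document, since HTML::TokeParser->new treats a scalar reference as literal document text.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser;

# count_tags() is a hypothetical helper, not part of HTML::TokeParser.
# $source is either a filehandle (glob reference) or a reference to a
# string that holds the document.
sub count_tags {
    my ($source, $name) = @_;
    my $p = HTML::TokeParser->new($source)
        or die "Cannot create parser: $!\n";
    my $count = 0;
    # get_tag() wants the bare tag name ("h3"), not the markup ("<h3>").
    $count++ while $p->get_tag($name);
    return $count;
}

# \*DATA is a reference to the DATA glob, so the parser reads from the handle.
my $heading3s = count_tags(\*DATA, 'h3');
print "Number of Level 3 Headings: $heading3s.\n";

__DATA__
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
<head>
<title>Test Page</title>
</head>
<body>
<h3>Alpha</h3>
<h3>Beta</h3>
<h3>Gamma</h3>
</body>
</html>
```

With the sample data from the original post, this should report 3 headings. Closing tags are not counted, because get_tag("h3") matches only start tags; end tags would be requested as "/h3".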

Paul Lalli
 
Brian Wilkins

Follow what Paul said. I know this is slightly off what you originally
asked for, but if you want to get what's in between the tags, use this
code. Note that the parser takes the filehandle reference itself, not a
reference to it:

my $fh = \*main::DATA;

my $tp = HTML::TokeParser->new($fh) or die "Can't open: $!";

my $result = '';
while (my $tag = $tp->get_tag) {
    if ($tag->[0] eq 'h3') {
        $result .= $tp->get_text("/h3") . "\n";
    }
}

This code collects the text between each <h3> and </h3> pair and
appends it to $result, followed by a newline. This has worked for me in
the past when I wanted what's between the tags as well.
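
A variation on the same idea, as a rough sketch: ask get_tag() for the h3 tags directly and keep each heading as a separate list element instead of one long string. heading_texts() is a hypothetical helper name, and the sample string is made up for illustration. It is passed as a scalar reference, which HTML::TokeParser treats as the document itself; a filehandle reference such as \*DATA would work in the same place.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# heading_texts() is a hypothetical helper, not part of HTML::TokeParser.
sub heading_texts {
    my ($source, $name) = @_;
    my $p = HTML::TokeParser->new($source)
        or die "Cannot create parser: $!\n";
    my @texts;
    # Skip straight to each start tag with the given name.
    while ($p->get_tag($name)) {
        # get_trimmed_text() returns the text up to the named end tag,
        # with leading/trailing whitespace removed.
        push @texts, $p->get_trimmed_text("/$name");
    }
    return @texts;
}

# Made-up sample document, passed by reference as literal HTML.
my $html = '<body><h3>Alpha</h3><p>x</p><h3>Beta</h3><h3>Gamma</h3></body>';
print "$_\n" for heading_texts(\$html, 'h3');
```

Returning a list leaves the formatting to the caller, who can join with newlines or anything else.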
 
