How to know the input line number when using parse_file of HTML::Parser

Himanshu Garg

Hello,

I am using HTML::Parser and want to know the line number where a
particular event occurs. Given below is an illustrative example:
-----------------------------------------------------------
use strict;
use HTML::Parser;

# Create parser object
my $p = HTML::Parser->new( api_version => 3,
    start_h => [\&start, "tagname"],
);
$p->report_tags("s"); # report only <s> tags
$p->parse_file(*STDIN);

sub start
{
    warn "<s> found in $."; # HOW TO PRINT THE LINE NUMBER
}

----------------------------------------------------------------

Although warn normally appends the input file line number on its own,
it doesn't do that here, and when I print it explicitly (using $.) I get
the following warning:

Use of uninitialized value in concatenation (.) or string at
../parsehtml.pl line

Could you suggest the right way of getting the line number, please?
The input I am parsing is erroneous and I want to know the location of
errors. Hence the above code.

Thank You
Himanshu.
 
ko

Himanshu said:
Hello,

I am using HTML::Parser and want to know the line number where a
particular event occurs.

[snip code/warning message]
Could you suggest the right ways of getting the line number, please.
The input I am parsing is erroneous and I want to know the location of
errors. Hence the above code.

Thank You
Himanshu.

Start off by re-reading the documentation. The part you're looking for
is in the 'Argspec' section, specifically the 'line' argspec
identifier. Use something like this:

#!/usr/bin/perl -w
use strict;
use HTML::Parser;

undef $/;            # slurp mode: read all of DATA in one go
my $html = <DATA>;
my $p = HTML::Parser->new( api_version => 3,
    # the 'line' argspec passes the line number where the event started
    start_h => [sub { print shift, "\n"; }, 'line'],
);
$p->report_tags('s');
$p->parse($html);
$p->eof;

__DATA__
<html>
<body>
<s>paragraph</s>
<pre>

</pre>
<s>paragraph</s>
<body>
</html>
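
The same argspec should also work with parse_file, which is what the original code used. As far as I can tell, parse_file reads its input in fixed-size chunks with read() rather than line by line, so $. is never set, and that is where the "uninitialized value" warning comes from. Below is a minimal sketch, not a definitive version: the helper name s_tag_lines is made up for illustration, and an in-memory filehandle stands in for STDIN so the script runs on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical helper (not part of HTML::Parser): returns the line
# number of every <s> start tag read from the given filehandle.
sub s_tag_lines {
    my ($fh) = @_;
    my @found;
    my $p = HTML::Parser->new(
        api_version => 3,
        # "tagname,line" passes the tag name and the line where the tag began
        start_h => [ sub { my ($tag, $line) = @_; push @found, $line },
                     "tagname,line" ],
    );
    $p->report_tags("s");    # only <s> elements trigger the handler
    $p->parse_file($fh);     # parse_file signals end of input itself
    return @found;
}

# Assumption: an in-memory handle replaces STDIN to keep this self-contained.
my $html = "<html>\n<body>\n<s>paragraph</s>\n<pre>\n\n</pre>\n"
         . "<s>paragraph</s>\n</body>\n</html>\n";
open my $fh, '<', \$html or die "cannot open in-memory handle: $!";
my @lines = s_tag_lines($fh);
close $fh;

warn "<s> found at input line $_\n" for @lines;
```

To read from standard input as in the original script, pass \*STDIN to the helper instead of the in-memory handle. With the sample document above, the <s> tags are reported at lines 3 and 7.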

There are a lot of good examples here:

http://search.cpan.org/src/GAAS/HTML-Parser-3.34/eg/

HTH - keith
 
Himanshu Garg

Thanks a lot.

Thank You
++imanshu.
 
