Maqo
Is it possible to use HTML::TokeParser to return the raw HTML between
two <A> tags, as opposed to just the text? My source file contains
several blocks of code--containing anchor links for each--that I'm
trying to extract by section while maintaining formatting.
My code:
my $p = HTML::TokeParser->new("file.txt") || die "Can't open file: $!";
while (my $t = $p->get_tag("a")) {
    my $name = $t->[1]{name};
    next unless $name && ($name =~ /^anchor/);
    print "$name : " . $p->get_text("a");
}
Example HTML source:
<A NAME='anchor1'></a><p>Some text and HTML formatting</p><BR>
<A NAME='anchor2'></a><p>Some text and HTML formatting</p><BR>
....
<A NAME='anchor10'></a><p>Some text and HTML formatting</p><BR>
The above code returns the "text and formatting" portions nicely,
albeit only as text. Is there an easy way to do this using
HTML::Parser to return the desired portion, with HTML markup included?
Many thanks.
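
One possible approach, offered only as a sketch: instead of get_text, walk the document with get_token and concatenate the original source text that each token carries. The helper names raw_text and sections_by_anchor below are hypothetical, the /^anchor/ match assumes names like those in the example source, and the sample string stands in for the contents of file.txt.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical helper: return the original source text of a get_token() result.
# The position of the raw text depends on the token type.
sub raw_text {
    my ($tok) = @_;
    my $type = $tok->[0];
    return $tok->[4] if $type eq 'S';                    # start tag
    return $tok->[2] if $type eq 'E' || $type eq 'PI';   # end tag, processing instruction
    return $tok->[1];                                    # text, comment, declaration
}

# Hypothetical helper: collect the raw HTML that follows each
# <a name="anchor..."> up to the next such anchor or the end of input.
sub sections_by_anchor {
    my ($html) = @_;
    my $p = HTML::TokeParser->new(\$html) or die "Can't parse document";
    my (%section, @order, $current);
    while (my $tok = $p->get_token) {
        # Tag and attribute names arrive lowercased.
        if ($tok->[0] eq 'S' && $tok->[1] eq 'a'
            && defined $tok->[2]{name} && $tok->[2]{name} =~ /^anchor/) {
            $current = $tok->[2]{name};
            push @order, $current;
            $section{$current} = '';
            next;
        }
        next unless defined $current;
        # Skip the closing tag of the empty anchor itself.
        next if $tok->[0] eq 'E' && $tok->[1] eq 'a' && $section{$current} eq '';
        $section{$current} .= raw_text($tok);
    }
    return (\@order, \%section);
}

# Sample input modeled on the example source above.
my $html = "<A NAME='anchor1'></a><p>Some text and HTML formatting</p><BR>\n"
         . "<A NAME='anchor2'></a><p>Some text and HTML formatting</p><BR>\n";
my ($order, $section) = sections_by_anchor($html);
print "$_ : $section->{$_}" for @$order;
```

To read from a file instead, the file name can be passed to HTML::TokeParser->new directly, as in the original code, in place of the string reference.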