strip given HTML tags


Dan Jacobson

This solution posted two months ago was supposed to strip all the HTML
tags given on the command line, but it turns out it strips all the
tags whatsoever. Help.

#!/usr/bin/perl -w
# striptag -- strip given tags out of HTML
#Usage example: striptag font div < file.html
use strict;
use HTML::Parser;
my $parser = HTML::Parser->new( text_h => [ sub { print shift; },"dtext" ]);
$parser->parse_file(*STDIN);
 

ko

Dan Jacobson said:
This solution posted two months ago was supposed to strip all the HTML
tags given on the command line, but it turns out it strips all the
tags whatsoever. Help.

#!/usr/bin/perl -w
# striptag -- strip given tags out of HTML
#Usage example: striptag font div < file.html
use strict;
use HTML::Parser;
my $parser = HTML::Parser->new( text_h => [ sub { print shift; },"dtext" ]);
$parser->parse_file(*STDIN);

From the HTML::Parser docs, right above the 'BUGS' section:

'More examples are found in the ``eg/'' directory of the HTML-Parser
distribution;' ... 'the program hstrip shows how you can strip out
certain tags/elements and/or attributes;'

Unfortunately, those example files are usually not on your system unless
you built the module by hand from the CPAN tarball; a copy that came
bundled with your Perl installation or from a package manager normally
installs only the module itself. So you need to download the
distribution from CPAN. *All* of the example scripts are good starting
points for learning how to use the module.

Anyway, here's a quick fix:

#!/usr/bin/perl -w
use strict;
use HTML::Parser;

# Assign @ignore_tags and @files here by whatever method you like.
my @ignore_tags = qw[font script];
my @files = qw[1.html 2.html];

strip_tags($_) foreach (@files);

sub strip_tags {
    my $file = shift;
    HTML::Parser->new(
        # Print the original text of every event unchanged...
        default_h   => [ sub { print shift }, 'text' ],
        # ...except start and end tags named in @ignore_tags, which are suppressed.
        ignore_tags => \@ignore_tags,
    )->parse_file($file);
}
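
If you want the tags to come from the command line, as in the original usage line (striptag font div < file.html), something along these lines might do. It is only a sketch: strip_given_tags is a made-up helper name, not part of HTML::Parser, and I'm assuming the tag names are given in lowercase and that the whole input fits in memory.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser;

# strip_given_tags is a hypothetical helper, not part of HTML::Parser.
# It returns $html with the start and end tags named in @tags removed;
# the text between those tags is kept.
sub strip_given_tags {
    my ($html, @tags) = @_;
    my $out = '';
    my $parser = HTML::Parser->new(
        api_version => 3,
        # Append every event that is not suppressed, exactly as it
        # appeared in the source.
        default_h   => [ sub { $out .= shift }, 'text' ],
    );
    $parser->ignore_tags(@tags);    # suppress only these tags
    $parser->parse($html);
    $parser->eof;                   # flush anything still buffered
    return $out;
}

# Command-line use: striptag font div < file.html
# Runs only when at least one tag name is given.
if (@ARGV) {
    my $html = do { local $/; <STDIN> };    # slurp all of STDIN
    print strip_given_tags(defined $html ? $html : '', @ARGV);
}
```

Note that ignore_tags drops only the tags themselves, so for something like script the script text stays in the output. If the content should go too, ignore_elements is probably the method to look at.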

HTH - keith
 
