How to clear all HTML tags in a document?

max

How can I remove all HTML tags from a line (or a whole document)?
All tags start with "<" and end with ">", e.g. <table>, </table>,
<div align="left">, <td align="left" bgcolor="#000099"> ...
I wrote a program that works character by character and checks for the
starting "<" and the ending ">".
Please help me. I know that Perl programmers do this in an easier way. How?

Thanks
 
Jürgen Exner

max said:
How can I remove all HTML tags from a line (or a whole document)?

Is there anything wrong with the answer in the FAQ, 'perldoc -q "remove
HTML"' ("How do I remove HTML from a string?")?

max said:
Please help me. I know that Perl programmers do this in an easier way.
How?

Trivial. They just follow the suggestions in the FAQ.

jue
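
For comparison, the character-by-character idea from the question can be written as a single substitution in core Perl. This is only a minimal sketch, not the FAQ's recommended method: the helper name strip_tags_naive is hypothetical, and the pattern assumes well-formed tags. It does not handle HTML comments, <script> or <style> content, or entities such as &amp;, and a bare "<" followed later by ">" in ordinary text would be removed as if it were a tag. Those gaps are the usual reason to prefer a parser-based module.

```perl
use strict;
use warnings;

# Hypothetical helper (not from any module): delete anything shaped like <...>.
sub strip_tags_naive {
    my ($html) = @_;
    # A tag is '<', then any mix of plain characters and quoted attribute
    # values, then '>'. Quoted values may themselves contain '>'.
    # Possessive quantifiers (Perl 5.10+) prevent runaway backtracking
    # on input with an unclosed '<'.
    (my $text = $html) =~ s/<(?:[^>'"]++|"[^"]*+"|'[^']*+')*+>//g;
    return $text;
}

print strip_tags_naive('<div align="left"><b>foo</b> bar</div> baz'), "\n";
```

Because the negated character class also matches newlines, a tag that spans several lines is removed as long as the whole document is held in one string.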
 
Fabian Pilkowski

max said:
How can I remove all HTML tags from a line (or a whole document)?
All tags start with "<" and end with ">", e.g. <table>, </table>,
<div align="left">, <td align="left" bgcolor="#000099"> ...
I wrote a program that works character by character and checks for the
starting "<" and the ending ">".
Please help me. I know that Perl programmers do this in an easier way. How?

I suggest using the module HTML::Strip; it does exactly what you
want. Have a look at

http://search.cpan.org/~kilinrax/HTML-Strip-1.04/Strip.pm

With that, all you have to do is something like

use strict;
use warnings;
use HTML::Strip;    # CPAN module, must be loaded before calling new

my $html = "<div><b>foo</b> bar</div> baz";
my $text = HTML::Strip->new->parse( $html );
print $text;
__END__
foo bar baz

regards,
fabian
 
