How to use HTML::Parser to remove HTML tags and print result

Mitchua · Jul 14, 2003

I am trying to use HTML::Parser to parse an HTML file, remove all HTML tags
(including comments, etc.), replace all entities (e.g. &amp;), and put the
result into a variable as a string. I figure HTML::Parser itself can
somehow perform the filtering, but how do I get the result back as a string?
I'd appreciate some sample code if anyone has any. Sorry if this is a real
n00b question.

Thanks a lot,
Mitchua

Ice Demon · Jul 15, 2003

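For a sample of parsing a web page, try this:
http://www.wdvl.com/Authoring/Languages/Perl/PerlfortheWeb/summarizer.html
If you are just trying to remove all the HTML tags, you could just do this:
$webpage =~ s/<.*?>//gs;
The /s modifier lets the pattern match tags that span more than one line.
Note that this only strips tags. It does not decode entities, and it can
misfire on comments or attribute values that contain a ">".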
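
If you want HTML::Parser to do the whole job (drop tags and comments, decode
entities, and hand the result back as a string), here is a minimal sketch.
The name strip_html is just made up for illustration, and skipping the
contents of script and style elements is my assumption about what you want.
It only registers a text handler, so tags and comments are never reported,
and the 'dtext' argspec asks for text with entities already decoded.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical helper: returns the plain text of an HTML string.
sub strip_html {
    my ($html) = @_;
    my $text = '';

    my $parser = HTML::Parser->new(
        api_version => 3,
        # Only text events get a handler; start tags, end tags, and
        # comments have no handler and are silently dropped.
        # 'dtext' passes the text with entities decoded.
        text_h => [ sub { $text .= shift }, 'dtext' ],
    );

    # Deliver text in whole pieces rather than arbitrary fragments.
    $parser->unbroken_text(1);

    # Assumption: script and style contents are not wanted in the output.
    $parser->ignore_elements(qw(script style));

    $parser->parse($html);
    $parser->eof;    # flush anything still buffered

    return $text;
}

print strip_html('<p>Fish &amp; chips<!-- note --></p>'), "\n";
# prints "Fish & chips"
```

To read from a file instead of a string, you should be able to replace the
parse/eof pair with $parser->parse_file($filename), where $filename is
whatever file you are working on.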

Ice Demon
