Removing HTML from text


Bill H

I am looking for a Perl routine that will strip HTML from a text file
and allow me to set up exceptions. For example, remove all HTML except
<B> <I> <U> <P> and their close tags, and optionally (preferably)
clean up any <P> tags so that they contain just <P ALIGN="LEFT"> (or right
or center).

Is there such a beast out there before I write my own code?

Bill H
 

Ben Bullock

Bill H said:
I am looking for a Perl routine that will strip HTML from a text file
and allow me to set up exceptions. For example, remove all HTML except
<B> <I> <U> <P> and their close tags, and optionally (preferably) clean
up any <P> tags so that they contain just <P ALIGN="LEFT"> (or right or
center).

Is there such a beast out there before I write my own code?

The place to look is http://search.cpan.org/.
 

Jürgen Exner

Bill H said:
I am looking for a Perl routine that will strip HTML from a text file
and allow me to set up exceptions. For example, remove all HTML except
<B> <I> <U> <P> and their close tags, and optionally (preferably)
clean up any <P> tags so that they contain just <P ALIGN="LEFT"> (or right
or center).

Your Question is Asked Frequently: perldoc -q "remove HTML"

You can define any custom action (remove or retain or whatever you like)
for any HTML element.

jue
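
For anyone landing here later: the thread does not name a module, so the following is only a sketch that assumes the event-driven HTML::Parser from CPAN is what "define any custom action for any HTML element" refers to. The sub name strip_html, the %keep whitelist, and the choice to default a missing or invalid ALIGN to LEFT are illustrative assumptions, not something stated above.

```perl
use strict;
use warnings;
use HTML::Parser;    # CPAN module, not part of core Perl

# Hypothetical whitelist: tags to retain (HTML::Parser lowercases tag names).
my %keep = map { $_ => 1 } qw(b i u p);

# Return $html with every tag removed except the whitelisted ones.
# Text inside removed tags is kept; <P> is rewritten to carry only ALIGN.
sub strip_html {
    my ($html) = @_;
    my $out = '';

    my $parser = HTML::Parser->new(
        api_version => 3,
        start_h => [ sub {
            my ($tag, $attr) = @_;
            return unless $keep{$tag};
            if ($tag eq 'p') {
                # Assumption: fall back to LEFT when ALIGN is absent or odd.
                my $align = uc($attr->{align} // 'left');
                $align = 'LEFT' unless $align =~ /^(?:LEFT|RIGHT|CENTER)$/;
                $out .= qq{<P ALIGN="$align">};
            }
            else {
                $out .= '<' . uc($tag) . '>';
            }
        }, 'tagname, attr' ],
        end_h => [ sub {
            my ($tag) = @_;
            $out .= '</' . uc($tag) . '>' if $keep{$tag};
        }, 'tagname' ],
        # Pass text through unchanged (entities are left as written).
        text_h => [ sub { $out .= $_[0] }, 'text' ],
    );

    # Drop <script> and <style> together with their contents.
    $parser->ignore_elements(qw(script style));

    $parser->parse($html);
    $parser->eof;
    return $out;
}

print strip_html('<div><p class="x" align="right">Hi <b>there</b> <a href="#">link</a></p></div>'), "\n";
```

Comments and declarations get no handler here, so they are silently dropped. The sketch does not balance tags: a stray </B> in the input is passed through as is.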
 
