An example using htmllib?

Dfenestr8

Hi.

I want a routine that strips a line of HTML of all its tags. E.g., I want
it to turn ....

"<p><b>This is an <h1><blink>IRRITATING</blink></h1> line of </b>text</p>"

.... into ......

"This is an IRRITATING line of text"

I've been told I should use htmllib. I've tried reading the htmllib docs
in the Library Reference, but I have to say, they just confuse me.

Does anyone know of a page that shows some simple examples of the sort of
thing I want to do?

Or, is it possible to use the example provided in the docs to achieve
this? Here's the example below ...


from HTMLParser import HTMLParser

class MyHTMLParser(HTMLParser):

    def handle_starttag(self, tag, attrs):
        print "Encountered the beginning of a %s tag" % tag

    def handle_endtag(self, tag):
        print "Encountered the end of a %s tag" % tag
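
For what it's worth, the same HTMLParser class from the docs example could probably be bent to this purpose by overriding handle_data instead of the tag handlers. The following is only a rough sketch under that assumption, not a tested recipe: the names TagStripper and strip_tags are made up for illustration, and the fallback import is there only so the snippet also runs on Python 3, where the module lives at html.parser.

```python
try:
    from HTMLParser import HTMLParser      # Python 2 module name
except ImportError:
    from html.parser import HTMLParser     # Python 3 module name


class TagStripper(HTMLParser):
    """Hypothetical helper: keeps the text between tags, ignores the tags."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.chunks = []                   # text fragments, in document order

    def handle_data(self, data):
        # Called for each run of plain text found between tags.
        self.chunks.append(data)


def strip_tags(html):
    """Return the text content of `html` with all tags removed."""
    parser = TagStripper()
    parser.feed(html)
    parser.close()                         # flush any text still buffered
    return "".join(parser.chunks)


line = "<p><b>This is an <h1><blink>IRRITATING</blink></h1> line of </b>text</p>"
print(strip_tags(line))                    # This is an IRRITATING line of text
```

One caveat: on the Python 2 HTMLParser module, entities such as &amp; are routed to handle_entityref rather than handle_data, so this sketch would silently drop them unless that method is overridden too.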
 
