Remove HTML tags (except anchor tag) from a string using regular expressions

Nico Grubert · Feb 1, 2005

Hello,

I want to remove all HTML tags from a string "content" except
<a ...>xxx</a>.

My script reads like this:

###
import re
content = re.sub('<([^!>]([^>]|\n)*)>', '', content)
###

It works fine. It removes all HTML tags from "content".
Unfortunately, this also removes <a ...>xxx</a> occurrences.
Any idea how to modify this to remove all HTML tags except <a ...>xxx</a>?

Thanks in advance,
Nico

Anand · Feb 1, 2005

How about...

import re
content = re.sub('<([^!(a>)]([^(/a>)]|\n)*)>', '', content)
Seems to work for me.

HTH

-Anand

Anand · Feb 1, 2005

I meant
content = re.sub ('<[^!(a>)]([^>]|\n)*[^!(/a)]>', '', content)

Sorry for the mistake.
However, this also seems to print tags like <b>, <p>, etc.

-Anand
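For comparison, here is a minimal sketch of one way to say "any tag except <a ...> and </a>" with a negative lookahead rather than a character class. A class like [^!(a>)] only excludes single characters, so it cannot express "not the tag named a". The names NON_ANCHOR_TAG and strip_with_regex are made up for illustration, and the pattern assumes that no attribute value contains a literal ">".

```python
import re

# Hypothetical helper names; not from the original posts.
# <            opening angle bracket
# (?!!)        skip comments/doctypes (<!...), as the original pattern did
# (?!/?a[\s>/]) skip <a ...>, <a>, <a/> and </a>, but not <abbr>, <area>, ...
# [^>]+        the rest of the tag (a negated class also matches newlines)
NON_ANCHOR_TAG = re.compile(r'<(?!!)(?!/?a[\s>/])[^>]+>', re.IGNORECASE)

def strip_with_regex(content):
    """Remove every tag except anchor tags from content."""
    return NON_ANCHOR_TAG.sub('', content)

sample = '<p>Hello <b>world</b>, see <a href="x.html">this</a> and <abbr>etc</abbr></p>'
print(strip_with_regex(sample))
```

This keeps the anchor's opening and closing tags and drops the rest. It is still a regex working on markup, so malformed input or a ">" inside an attribute value can confuse it.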

Max M · Feb 1, 2005

If it's not to learn, and you simply want it to work, try out this library:

http://zope.org/Members/chrisw/StripOGram/readme
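
If installing a library is not an option, a parser from the standard library can do a similar job. The sketch below is not StripOGram and does not use its API. It assumes Python 3's html.parser module, and the names AnchorOnlyStripper and strip_with_parser are made up for illustration.

```python
from html.parser import HTMLParser

class AnchorOnlyStripper(HTMLParser):
    """Hypothetical helper: keeps text and <a> tags, drops all other tags."""

    def __init__(self):
        # convert_charrefs=False so entities such as &lt; are passed through
        # unchanged instead of being decoded into bare "<" characters.
        super().__init__(convert_charrefs=False)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            # Re-emit the anchor's start tag exactly as it appeared.
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == 'a':
            self.parts.append('</a>')

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append('&%s;' % name)

    def handle_charref(self, name):
        self.parts.append('&#%s;' % name)

def strip_with_parser(content):
    """Return content with every tag removed except anchor tags."""
    parser = AnchorOnlyStripper()
    parser.feed(content)
    parser.close()
    return ''.join(parser.parts)

print(strip_with_parser('<p>Hello <b>world</b> &amp; <a href="x.html">link</a></p>'))
```

Because the parser tracks where tags begin and end, it copes with attribute values that contain ">" and similar cases that trip up a single regular expression.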

--

hilsen/regards Max M, Denmark

http://www.mxm.dk/
IT's Mad Science

Gabriel Cooper · Feb 2, 2005

Max said:
If it's not to learn, and you simply want it to work, try out this
library:

http://zope.org/Members/chrisw/StripOGram/readme

'first first '

keeping in mind that bare ">" and "<" are invalid HTML (should be &gt;
and &lt;), why'd it leave the greater than, and why are there two "first"s?
