Regex to remove some HTML tags

Spondishy

Hi,

Does anyone have a good regex for removing some HTML tags that
would be efficient in .NET? Basically, I want to keep anchors, bold tags, and
a few others, so an expression that says "remove all tags except these
few" would be best.

Thanks.
 
Ben Dewey

Just whipped this together; it may need some work.

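// Matches any tag except opening or closing <a>, <b>, <i>, and <u>.
string expres = @"<(?![!/]?[ABIU][>\s])[^>]*>";

// IgnoreCase lets [ABIU] match lowercase tag names as well.
string output = Regex.Replace(inputStr, expres, "",
    RegexOptions.IgnoreCase | RegexOptions.Multiline);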
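
If the list of tags to keep needs to change, the same negative-lookahead idea can be built from a list of names. Here is a rough sketch, assuming a hypothetical helper called StripTagsExcept (my own name, not part of .NET). Like the pattern above, it treats everything between < and the next > as one tag, so a > inside an attribute value or comment will trip it up.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TagStripper
{
    // Hypothetical helper: removes every tag whose name is not in allowedTags.
    public static string StripTagsExcept(string html, params string[] allowedTags)
    {
        // With nothing to keep, simply remove every tag.
        if (allowedTags == null || allowedTags.Length == 0)
        {
            return Regex.Replace(html, @"<[^>]*>", "");
        }

        // Escape each name and join into an alternation, e.g. "a|b|i|u".
        string names = string.Join("|", allowedTags.Select(t => Regex.Escape(t)));

        // Negative lookahead: do not match opening or closing tags whose name
        // is allowed and is followed by whitespace, '/', or '>'.
        string pattern = @"<(?!/?(?:" + names + @")[\s/>])[^>]*>";

        return Regex.Replace(html, pattern, "", RegexOptions.IgnoreCase);
    }
}
```

For example, TagStripper.StripTagsExcept("<p><b>My Website</b><br>text</p>", "a", "b", "i", "u") returns "<b>My Website</b>text". The <br> is removed even though it starts with "b", because the name must be followed by whitespace, '/', or '>'.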
 
Ben Dewey

It was tested using:

<html>
<body>
<a name="top">
<b>My Website</b><br><br>
Here is the text for my website.
<table border="0" cellpadding="0">
<tr>
<td>Cell 1</td>
</tr>
<tr>
<td>Cell 2</td>
</tr>
</body>
</html>

You will still have to go through and replace the leftover \r\n line breaks.
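
One way to do that clean-up, assuming the goal is to collapse the runs of empty lines left where tags were removed into single line breaks. CollapseBlankLines is a hypothetical helper name, and this is only a sketch. If you would rather drop the line breaks entirely, swap the replacement string for "" or " ".

```csharp
using System.Text.RegularExpressions;

public static class LineBreakCleaner
{
    // Hypothetical helper: collapses consecutive line breaks into one.
    public static string CollapseBlankLines(string text)
    {
        // One or more line endings (\r\n or \n), each optionally preceded by
        // spaces or tabs, become a single \r\n. Trim() then removes any
        // leading and trailing whitespace from the whole string.
        return Regex.Replace(text, @"(?:[ \t]*\r?\n)+", "\r\n").Trim();
    }
}
```

For example, LineBreakCleaner.CollapseBlankLines("a\r\n\r\n\r\nb") returns "a\r\nb".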
 
