Regexp: Matching unquoted attributes

DrewM

I'm attempting to clean up HTML in a database by quoting all unquoted
attributes.

So far, I have this:

oRegExp.Pattern = "<([^>]+)=([^>""]+)>"
sHtml = oRegExp.Replace(sHtml, "<$1=""$2"">")

which I can use to replace single attributes:
<p class=foo> becomes <p class="foo">

Now I'm trying to deal with multiple attributes and am getting myself
into a pickle converting:

<p class=foo name=bar> into <p class="foo" name="bar">

The best I've come up with so far is:

oRegExp.Pattern = "<(\w*\s)(([^=>]+=)([^>""\s]+))+>"
sHtml = oRegExp.Replace(sHtml, "<$1 $3""$4"">")

which obviously isn't going to work! :)

How can I match multiple unquoted attributes and replace them with quotes?

Thanks

Drew
 
Chris Hohmann

You are going to have to do a two-pass capture. First capture each tag
(<something>), then capture the attribute/value pairs inside that tag and
quote-delimit the unquoted values. When regular expression tasks reach
this level of complexity, I like to drop into JScript, as its native
support for regular expressions is more robust. Here's an example:

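<script language="JavaScript" runat="SERVER">
var s = "<p BadAttribute=unquoted GoodAttribute='<hello>'>Here is some " +
        "text</p><p BadAttribute=NoQuotes>Here's another paragraph</p>";
// Pass 1 matches one tag at a time, skipping any ">" inside a quoted value.
// Pass 2 wraps only the unquoted attribute values in double quotes.
Response.Write(s.replace(/<(?:[^>"']|"[^"]*"|'[^']*')*>/g, function(m){
  return m.replace(/([\w-]+=)("[^"]*"|'[^']*'|[^\s"'>]+)/g, function(a, name, value){
    return /^["']/.test(value) ? a : name + '"' + value + '"';
  });
}));
</script>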
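
The same two-pass idea can also be wrapped in a function that runs in any JavaScript engine, which makes it easier to try outside ASP. This is only a sketch: the name quoteAttributes is made up for illustration. It assumes well-formed tags, and it treats an unquoted value as a run of characters with no whitespace, quotes, or ">". Values that are already quoted are left alone.

```javascript
// Hypothetical helper: quote every unquoted attribute value in an HTML string.
function quoteAttributes(html) {
  // Pass 1: match one tag at a time; quoted values may contain ">".
  var tagPattern = /<(?:[^>"']|"[^"]*"|'[^']*')*>/g;
  // Pass 2: match name=value pairs; the value is double-quoted,
  // single-quoted, or a run of characters with no whitespace or quotes.
  var attrPattern = /([\w-]+=)("[^"]*"|'[^']*'|[^\s"'>]+)/g;
  return html.replace(tagPattern, function (tag) {
    return tag.replace(attrPattern, function (pair, name, value) {
      // Leave values that are already quoted untouched.
      if (value.charAt(0) === '"' || value.charAt(0) === "'") {
        return pair;
      }
      return name + '"' + value + '"';
    });
  });
}

var result = quoteAttributes("<p class=foo name=bar title='a=b'>x=y</p>");
// result is: <p class="foo" name="bar" title='a=b'>x=y</p>
```

Because the attribute pattern only runs on text that the tag pattern matched, something like x=y in the page text is never touched. Matching quoted values explicitly in the second pass likewise stops an "=" inside a quoted string from being mistaken for a new attribute.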

HTH
-Chris Hohmann
 
