Regexp: Matching unquoted attributes

Discussion in 'ASP General' started by DrewM, Oct 13, 2003.

  1. DrewM

    DrewM Guest

    I'm attempting to clean up HTML in a database by quoting all unquoted
    attributes.

    So far, I have this:

    oRegExp.Pattern = "<([^>]+)=([^>""]+)>"
    sHtml = oRegExp.Replace(sHtml, "<$1=""$2"">")

    which I can use to replace single attributes:
    <p class=foo> becomes <p class="foo">

    Now I'm trying to deal with multiple attributes and am getting myself
    into a pickle converting:

    <p class=foo name=bar> into <p class="foo" name="bar">

    The best I've come up with so far is:

    oRegExp.Pattern = "<(\w*\s)(([^=>]+=)([^>""\s]+))+>"
    sHtml = oRegExp.Replace(sHtml, "<$1 $3""$4"">")

    which obviously isn't going to work! :)
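
    The reason it fails, as far as I can tell, is that a capture group inside
    a repeated group only keeps the text from its last repetition, so $3 and
    $4 only ever see the final attribute. A minimal sketch in plain JavaScript
    (the same pattern should behave the same way in the VBScript RegExp
    object; the variable names here are just for illustration):

```javascript
// Same pattern as above, written as a JavaScript regex literal.
var pattern = /<(\w*\s)(([^=>]+=)([^>"\s]+))+>/;
var input = "<p class=foo name=bar>";

// The repeated group runs twice, but groups 3 and 4 only
// remember the second (last) repetition.
var match = pattern.exec(input);
var lastName = match[3];   // " name="
var lastValue = match[4];  // "bar"

// The replacement therefore rebuilds the tag from the last
// attribute only, and class=foo is gone from the result.
var result = input.replace(pattern, "<$1 $3\"$4\">");
```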

    How can I match multiple unquoted attributes and replace them with quotes?

    Thanks

    Drew
     
    DrewM, Oct 13, 2003
    #1

  2. Chris Hohmann

    You are going to have to do this in two passes. First match each tag
    (<something>), then match the attribute/value pairs inside that tag and
    quote-delimit the unquoted values. When a regular expression task reaches
    this level of complexity, I like to drop into JScript, as its native
    support for regular expressions is more robust. Here's an example:

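    <script language="JavaScript" runat="SERVER">
    var s = "<p BadAttribute=unquoted GoodAttribute='<hello>'>Here is some text</p>" +
            "<p BadAttribute=NoQuotes>Here's another paragraph</p>";
    // Pass 1: match each tag lazily, so one match never spans several tags.
    // Pass 2: wrap every bare-word value (name=value) in double quotes.
    Response.Write(s.replace(/<.*?>/g, function (tag) {
        return tag.replace(/(\w+=)(\w+)/g, "$1\"$2\"");
    }));
    </script>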
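
    One caveat with a simple inner pattern like (\w+=)(\w+): it can also
    fire on text inside a value that is already quoted (title="a=b", say).
    Below is a rough, more defensive sketch of the same two-pass idea. The
    function name quoteAttributes and both patterns are my own assumptions,
    not anything built in, and this is not a full HTML parser:

```javascript
// Hypothetical helper: quotes bare attribute values, inside tags only.
function quoteAttributes(html) {
    // Pass 1: one opening tag per match. Quoted values are consumed
    // whole, so a ">" inside quotes does not end the tag early.
    var tagPattern = /<[a-zA-Z](?:[^>"']|"[^"]*"|'[^']*')*>/g;
    // Pass 2: name=value, where value is double-quoted, single-quoted,
    // or a bare run of characters up to whitespace, a quote, or ">".
    var attrPattern = /([\w:-]+)\s*=\s*("[^"]*"|'[^']*'|[^\s"'>]+)/g;

    return html.replace(tagPattern, function (tag) {
        return tag.replace(attrPattern, function (whole, name, value) {
            var first = value.charAt(0);
            if (first === '"' || first === "'") {
                return whole; // already quoted, leave it untouched
            }
            return name + '="' + value + '"';
        });
    });
}
```

    For example, quoteAttributes("<p class=foo name=bar>") should give back
    <p class="foo" name="bar">, and text outside of tags is left alone. In a
    classic ASP page the result could be passed to Response.Write as above.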

    HTH
    -Chris Hohmann
     
    Chris Hohmann, Oct 13, 2003
    #2