regex: how to loop through individual matches

Discussion in 'ASP .Net' started by darrel, Dec 29, 2004.

  1. darrel

    darrel Guest

    I have some vb.net code that is running a regex, matching groups, and
    replacing them. I'm trying to come up with a simple script that will strip
    all attributes from all HTML tags.

    This is what I have:

    =============================================================

    function stripAllAttributes(ByVal textToParse as String, _
    ByVal tagToFind as String) as String
    dim s as String
    dim r2 as new regex( _
    "(?<theTag>(<" & tagToFind & "))" & _
    "(?<everythingUpToEndTag>(([^/>].|\n)*))" _
    , RegexOptions.IgnoreCase)
    dim m2 as Match = r2.Match(textToParse)
    dim strTheTag as String = m2.Groups("theTag").Value.toString
    s = r2.Replace(textToParse, strTheTag)
    return s
    end function

    =============================================================

    This works, but, as you can see, I have to call it once for each tag whose
    attributes I want to strip. The reason is that if I just use a regex like
    this to grab the opening part of the tag:

    (<)([^/>\s\n])*

    it WILL grab the opening part of the first tag it sees, but it will then use
    the first matched text to replace ALL matches it finds in the rest of the
    text it is parsing. I imagine this is due more to my vb code than regex.

    For example, if my markup is this:

    <table width="100">
    <tr width="100">
    <td width="100">

    And if I run the function (using the generic 'find all tags' regex) against
    that, I get this returned:

    <table>
    <table>
    <table>

    When I want this:

    <table>
    <tr>
    <td>
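
    A minimal sketch of why the output above repeats <table>, assuming a
    generic tag pattern. The pattern, the module name MatchVersusMatches, and
    the one-line markup are illustrative and not taken from the original
    function. Regex.Match returns only the first match, and passing its text
    to Regex.Replace as a plain string applies that same string to every
    match. Regex.Matches, by contrast, yields one Match per tag, each with its
    own group values.

    ```vbnet
    Imports System
    Imports System.Text.RegularExpressions

    Module MatchVersusMatches
        Sub Main()
            Dim markup As String = "<table width=""100""><tr width=""100""><td width=""100"">"

            ' Illustrative generic pattern: "<" followed by a tag name.
            Dim r As New Regex("(?<theTag><[a-z][a-z0-9]*)", RegexOptions.IgnoreCase)

            ' Match() stops at the first hit, so this is always "<table".
            Dim firstTag As String = r.Match(markup).Groups("theTag").Value

            ' The fixed string "<table" now replaces every matched tag opening.
            Console.WriteLine(r.Replace(markup, firstTag))

            ' Matches() returns every hit; each Match carries its own value.
            For Each m As Match In r.Matches(markup)
                Console.WriteLine(m.Groups("theTag").Value)
            Next
        End Sub
    End Module
    ```

    The Replace call prints three <table ...> openings, while the loop prints
    <table, <tr and <td in turn, which suggests the regex itself is fine and
    the fixed replacement string is the cause.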

    Off the top of my head, I can only think of doing it this way:

    Function find first HTML tag to strip (ie, find the first tag that has at
    least one attribute)
      if there's a match
        then pass that onto my current function (shown above) to replace all
        instances of that tag.
        then recursively call this same function so that it finds the next tag
      else
        assume it has stripped all attributes from all tags
      end if

    Or is there a way in my original script to do the same without the recursive
    part?

    -Darrel
     
    darrel, Dec 29, 2004
    #1

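  2. I'd try something like the following:

    function stripAllAttributes(ByVal textToParse as String, _
    ByVal tagToFind as String) as String
    dim s as String
    ' Match the opening "<tag", then everything up to and including the
    ' closing ">", so the whole opening tag is replaced.
    dim r2 as new regex( _
    "(?<theTag>(<" & tagToFind & "))" & _
    "(?<everythingUpToEndTag>[^>]*)>" _
    , RegexOptions.IgnoreCase)
    s = r2.Replace(textToParse, "$1>")
    return s
    end function

    That uses a backreference to the first captured group ($1) in the
    replacement string. $1 is expanded separately for each match, so every tag
    is replaced with its own opening text rather than the text captured from
    the first match. That also means tagToFind can itself be a pattern such as
    [a-z][a-z0-9]* if you want to cover every tag in one call. For more info
    on the backreference, check out
    http://www.devarticles.com/c/a/VB.Net/Regular-Expressions-in-.NET/1/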
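
    As a rough sketch of the same idea without the tagToFind parameter, the
    version below captures whatever tag name it finds and puts it back with a
    named substitution. The group name tagName, the module name
    StripWithSubstitution, and the pattern are assumptions made for
    illustration. It is not a full HTML parser.

    ```vbnet
    Imports System
    Imports System.Text.RegularExpressions

    Module StripWithSubstitution
        Function StripAllAttributes(ByVal textToParse As String) As String
            ' "<", a tag name, then everything up to and including ">".
            ' Closing tags such as </td> are not matched and stay untouched.
            ' Caveats: a ">" inside an attribute value ends the match early,
            ' and a self-closing tag like <br /> comes back as <br>.
            Dim r As New Regex("<(?<tagName>[a-z][a-z0-9]*)[^>]*>", _
                RegexOptions.IgnoreCase)

            ' ${tagName} is expanded separately for every match.
            Return r.Replace(textToParse, "<${tagName}>")
        End Function

        Sub Main()
            Dim markup As String = "<table width=""100""><tr width=""100""><td width=""100"">"
            Console.WriteLine(StripAllAttributes(markup))
        End Sub
    End Module
    ```

    With the sample markup this prints <table><tr><td>. If the replacement
    needs logic per match, Regex.Replace also has an overload that takes a
    MatchEvaluator delegate, which is called once for each match.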

    Blair
     
    Blair Bonnett, Jan 3, 2005
    #2