Replacing Regex with part of itself

Discussion in 'Perl Misc' started by Hal Vaughan, Jul 14, 2005.

  1. Hal Vaughan

    Hal Vaughan Guest

    I know there's a way to do this, and I know it involves special uses of a
    regex, but I can't remember the terms that apply, so I'm having trouble
    searching for it. I want to take a line in a malformed HTML page like:

    <OPTION Value = '1' >Book_Title_1
    <OPTION Value = '2' >Book_Title_2

    so it'll look like:

    <OPTION Value = '1' >Book_Title_1</OPTION>
    <OPTION Value = '2' >Book_Title_2</OPTION>

    I know I can find the pattern by looking for something like:

    $htmlpage =~ /<OPTION.*?>.*?$/

    I THINK I remember that I can capture the wildcard part of the regex like:

    $htmlpage =~ /<OPTION(.*?)>(.*?)$/

    But when I try a substitution:

    $htmlpage =~ s/<OPTION(.*?)>(.*?)$/<OPTION.*?>.*?</OPTION>$/g;

    how do I get the selected sections from the search part to be included in
    the replace part?

    Thanks for any help on this. I'm not even sure what the name is for the
    type of search/replace I'm trying to do!

    Hal
    Hal Vaughan, Jul 14, 2005
    #1

  2. Anno Siegel

    Anno Siegel Guest

    Hal Vaughan wrote in comp.lang.perl.misc:
    > I know there's a way to do this, and I know it involves special uses of a
    > regex, but I can't remember the terms that apply, so I'm having trouble
    > searching for it. I want to take a line in a malformed HTML page like:


    "Capture" is the word you're looking for.

    >
    > <OPTION Value = '1' >Book_Title_1
    > <OPTION Value = '2' >Book_Title_2
    >
    > so it'll look like:
    >
    > <OPTION Value = '1' >Book_Title_1</OPTION>
    > <OPTION Value = '2' >Book_Title_2</OPTION>
    >
    > I know I can find the pattern by looking for something like:
    >
    > $htmlpage =~ /<OPTION.*?>.*?$/
    >
    > I THINK I remember that I can capture the wildcard part of the regex like:
    >
    > $htmlpage =~ /<OPTION(.*?)>(.*?)$/


    The () are capturing parentheses. The "?" after each ".*" makes the
    match non-greedy. Why do you think you need that? The final "$" is
    also unnecessary once the second ".*" is greedy, because "." does not
    match a newline.
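
    To make the greedy/non-greedy point concrete, here is a small sketch
    run against one of the sample lines from the question. The variable
    names are made up for illustration. It shows why a non-greedy second
    group only works when something (such as "$") forces it to extend.

```perl
use strict;
use warnings;

my $line = "<OPTION Value = '1' >Book_Title_1";

# Non-greedy second group with no "$" anchor: (.*?) is satisfied by
# matching nothing, so the title is not captured.
my ($attrs_lazy, $title_lazy) = $line =~ /<OPTION(.*?)>(.*?)/;

# Greedy second group: (.*) runs to the end of the line, because "."
# does not match a newline by default.
my ($attrs, $title) = $line =~ /<OPTION(.*?)>(.*)/;

print "lazy:   [$title_lazy]\n";   # prints "lazy:   []"
print "greedy: [$title]\n";        # prints "greedy: [Book_Title_1]"
```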

    > But when I try a substitution:
    >
    > $htmlpage =~ s/<OPTION(.*?)>(.*?)$/<OPTION.*?>.*?</OPTION>$/g;

    The "/" in "</OPTION>" is also your delimiter, so it ends the
    replacement early. You must escape it (as "\/") or use an alternative
    delimiter.
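
    Both ways of handling the slash can be sketched as follows. The
    variable names and the simplified pattern are assumptions chosen for
    illustration, and $1 refers to whatever the parenthesized group
    matched.

```perl
use strict;
use warnings;

my $escaped   = "<OPTION Value = '1' >Book_Title_1";
my $alternate = $escaped;

# Option 1: keep "/" as the delimiter and escape the literal slash.
$escaped =~ s/(<OPTION.*)/$1<\/OPTION>/;

# Option 2: use braces as delimiters, so "/" needs no escaping.
$alternate =~ s{(<OPTION.*)}{$1</OPTION>};

print "$escaped\n";
print "$alternate\n";
```

    Both statements should produce the same string. The brace form is
    usually easier to read when the text contains slashes, as HTML does.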

    > how do I get the selected sections from the search part to be included in
    > the replace part?


    Use $1, $2, etc. This is explained in "perldoc perlre".

    $htmlpage =~ s{<OPTION(.*?)>(.*)} {<OPTION$1>$2</OPTION>}g;
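
    For anyone who wants to try this out, here is a minimal
    self-contained sketch, assuming the page is held in $htmlpage as in
    the question. The sample data is made up, and /g is included so that
    every matching line in the string is rewritten, not just the first.

```perl
use strict;
use warnings;

# Hypothetical sample input standing in for the malformed page.
my $htmlpage = <<'END_HTML';
<OPTION Value = '1' >Book_Title_1
<OPTION Value = '2' >Book_Title_2
END_HTML

# $1 is the attribute text inside the tag, $2 is the rest of the line.
# /g repeats the substitution for every <OPTION ...> line in the string.
$htmlpage =~ s{<OPTION(.*?)>(.*)}{<OPTION$1>$2</OPTION>}g;

print $htmlpage;
# <OPTION Value = '1' >Book_Title_1</OPTION>
# <OPTION Value = '2' >Book_Title_2</OPTION>
```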

    Anno
    --
    If you want to post a followup via groups.google.com, don't use
    the broken "Reply" link at the bottom of the article. Click on
    "show options" at the top of the article, then click on the
    "Reply" at the bottom of the article headers.
    Anno Siegel, Jul 14, 2005
    #2