Resetting //g

Discussion in 'Perl Misc' started by Roy Johnson, Oct 29, 2003.

  1. Roy Johnson

    Roy Johnson Guest

    If you short-circuit out of a global pattern match like so:
    for (1..$n) {
        $str =~ /($pat)/g;
        $NthMatch = $1;
    }
    where there are more than $n matches, the next time you do
    $str =~ /($pat)/g;
    even if it's in a completely different block of code, the matching is
    going to pick up where it left off. Is there a way to reset it, short
    of whiling away the rest of the matches? (I tried several arguments
    for the reset function.)

    Incidentally, the best way to get the $nth match of $pat in $str is
    $str =~ /(?:.*?($pat)){$n}/;
    but I'm still curious about short-circuited global matches.
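
A minimal sketch of the behavior described in this post, assuming a made-up sample string and pattern (neither is from the original question). It runs the short-circuited loop, then does one more scalar-context //g match to show that the match resumes after match $n rather than at the start of the string.

```perl
use strict;
use warnings;

my $str = 'ab abb abbb abbbb';   # hypothetical sample input
my $pat = qr/ab+/;               # hypothetical pattern
my $n   = 2;

my $nth;
for (1 .. $n) {
    $str =~ /($pat)/g;           # non-list-context //g: one match per call
    $nth = $1;
}

# A later, unrelated match on the same string resumes after match $n.
$str =~ /($pat)/g;
my $later = $1;

print "nth=$nth later=$later\n"; # prints "nth=abb later=abbb"
```

The position is stored with $str itself, not with the pattern or the enclosing block, which is why the carry-over crosses scopes.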
    Roy Johnson, Oct 29, 2003
    #1

  2. Ben Morrow

    Ben Morrow Guest

    (Roy Johnson) wrote:
    > If you short-circuit out of a global pattern match like so:
    > for (1..$n) {
    > $str =~ /($pat)/g;
    > $NthMatch = $1;
    > }
    > where there are more than $n matches, the next time you do
    > $str =~ /($pat)/g;
    > even if it's in a completely different block of code, the matching is
    > going to pick up where it left off. Is there a way to reset it, short
    > of whiling away the rest of the matches? (I tried several arguments
    > for the reset function.)


    From perldoc perlop:

    | The position after the last match can be read or set using the pos()
    | function; see "pos" in perlfunc.

    /Nota bene/ that you call pos on $str, not $pat.
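
As a rough illustration of the perlop excerpt, the sketch below reads pos() after one match and then assigns to it to rewind. The sample string and pattern are assumptions made for the example; only pos() itself comes from the documentation quoted above.

```perl
use strict;
use warnings;

my $str = 'ab abb abbb';         # hypothetical sample input
my $pat = qr/ab+/;               # hypothetical pattern

$str =~ /($pat)/g;               # first match: 'ab'
my $after_first = pos($str);     # offset just past the first match

pos($str) = 0;                   # rewind: note pos() takes $str, not $pat
$str =~ /($pat)/g;
my $again = $1;                  # first match again, not the second one

print "pos=$after_first again=$again\n";  # prints "pos=2 again=ab"
```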

    > Incidentally, the best way to get the $nth match of $pat in $str is
    > $str =~ /(?:.*?($pat)){$n}/;
    > but I'm still curious about short-circuited global matches.


    I would have said a better way would be
    $nthmatch = ($str =~ /($pat)/g)[$n];
    , not least because it actually works, but maybe that's just me... :)

    Ben

    --
    "The Earth is degenerating these days. Bribery and corruption abound.
    Children no longer mind their parents, every man wants to write a book,
    and it is evident that the end of the world is fast approaching."
    -Assyrian stone tablet, c.2800 BC
    Ben Morrow, Oct 29, 2003
    #2

  3. Jeff 'japhy' Pinyan

    [posted & mailed]

    On 29 Oct 2003, Roy Johnson wrote:

    >If you short-circuit out of a global pattern match like so:
    > for (1..$n) {
    > $str =~ /($pat)/g;
    > $NthMatch = $1;
    > }
    >where there are more than $n matches, the next time you do
    > $str =~ /($pat)/g;
    >even if it's in a completely different block of code, the matching is
    >going to pick up where it left off. Is there a way to reset it, short
    >of whiling away the rest of the matches? (I tried several arguments
    >for the reset function.)


    You can set pos($str) to undef.

    for (1 .. $n) {
        $str =~ /($pat)/g;
        $last = $1;
    }
    undef pos($str);

    The /g flag makes the regex start looking at pos($str) next time; setting
    it to undef makes it start looking at the beginning of the string again.
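
A small sketch of the reset, using an assumed sample string and pattern. It uses the assignment form `pos($str) = undef`, and also shows that a //g match that fails (without the /c modifier) clears the position as well.

```perl
use strict;
use warnings;

my $str = 'ab abb abbb';        # hypothetical sample input
my $pat = qr/ab+/;              # hypothetical pattern

$str =~ /($pat)/g for 1 .. 2;   # consume two matches
my $mid = pos($str);            # just past 'abb'

pos($str) = undef;              # explicit reset
my $cleared = defined pos($str) ? 'no' : 'yes';

$str =~ /($pat)/g;
my $first = $1;                 # matching restarted at offset 0

# Running off the end of the matches resets pos() too (no /c here).
1 while $str =~ /$pat/g;
my $after_fail = defined pos($str) ? 'set' : 'undef';

# prints "mid=6 cleared=yes first=ab after_fail=undef"
print "mid=$mid cleared=$cleared first=$first after_fail=$after_fail\n";
```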

    --
    Jeff Pinyan RPI Acacia Brother #734 2003 Rush Chairman
    "And I vos head of Gestapo for ten | Michael Palin (as Heinrich Bimmler)
    years. Ah! Five years! Nein! No! | in: The North Minehead Bye-Election
    Oh. Was NOT head of Gestapo AT ALL!" | (Monty Python's Flying Circus)
    Jeff 'japhy' Pinyan, Oct 29, 2003
    #3
  4. Roy Johnson

    Roy Johnson Guest

    Ben Morrow <> wrote in message news:<bnoo6h$hr4$>...
    > From perldoc perlop:
    >
    > | The position after the last match can be read or set using the pos()
    > | function; see "pos" in perlfunc.


    I shoulda thought of that. Thanks.

    > /Nota bene/ that you call pos on $str, not $pat.
    >
    > > Incidentally, the best way to get the $nth match of $pat in $str is
    > > $str =~ /(?:.*?($pat)){$n}/;
    > > but I'm still curious about short-circuited global matches.

    >
    > I would have said a better way would be
    > $nthmatch = ($str =~ /($pat)/g)[$n];
    > , not least because it actually works, but maybe that's just me... :)


    What makes you think that my pattern doesn't work? It does, while
    yours actually doesn't: you need to index $n-1 unless you've reset $[
    to 1. The difference is that yours does about twice the work, and so
    takes about twice as long. The aborting for loop is even slower.

    Some benchmark code for your amusement:

    #!perl

    use strict;
    use warnings;
    use Benchmark;

    my $str='abcabbcabbbbcabcabbcab';
    my $n = 3; ## Find the $nth occurrence
    my $pat = qr/ab+/; ## of this pattern

    sub pat_n;
    sub for_g;
    sub m_g;

    print "pat_n Match $n in $str is ", pat_n, "\n";
    print "for_g Match $n in $str is ", for_g, "\n";
    print "m_g Match $n in $str is ", m_g, "\n";

    timethese( 100_000, {
        '$pat{$n}' => \&pat_n,
        'for //g'  => \&for_g,
        'm_g'      => \&m_g,
    });

    sub pat_n {
        $str =~ /(?:.*?($pat)){$n}/;
    }

    sub for_g {
        my $NthMatch;
        for (1..$n) {
            $str =~ /($pat)/g;
            $NthMatch = $1;
        }
        pos($str) = 0;
        $NthMatch;
    }

    sub m_g {
        ($str =~ /($pat)/g)[$n-1];
    }
    Roy Johnson, Oct 29, 2003
    #4
  5. Ben Morrow

    Ben Morrow Guest

    (Roy Johnson) wrote:
    > Ben Morrow <> wrote in message
    > news:<bnoo6h$hr4$>...
    > > > Incidentally, the best way to get the $nth match of $pat in $str is
    > > > $str =~ /(?:.*?($pat)){$n}/;
    > > > but I'm still curious about short-circuited global matches.

    > >
    > > I would have said a better way would be
    > > $nthmatch = ($str =~ /($pat)/g)[$n];
    > > , not least because it actually works, but maybe that's just me... :)

    >
    > What makes you think that my pattern doesn't work?


    Sorry, I must have misread it... or something. I thought it would fail
    on inputs like
    ab ab ab abb
    and get the 'abb' instead of the third 'ab', but I was wrong.
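
The input above can be checked with a short sketch. The string 'ab ab ab abb' and the two expressions are from the thread; the variable names and the use of $-[1] to record where the capture started are choices made for this example.

```perl
use strict;
use warnings;

my $str = 'ab ab ab abb';
my $pat = qr/ab+/;
my $n   = 3;

# Quantified form: the capture holds the match from the $n-th repetition.
my ($by_quant) = $str =~ /(?:.*?($pat)){$n}/;
my $offset = $-[1];                            # where that capture begins

# List-slice form: list indexes start at 0, so match $n is at $n - 1.
my $by_slice = ($str =~ /($pat)/g)[$n - 1];

# Indexing with $n itself yields match $n + 1 instead.
my $off_by_one = ($str =~ /($pat)/g)[$n];

# prints "quant=ab offset=6 slice=ab off_by_one=abb"
print "quant=$by_quant offset=$offset slice=$by_slice off_by_one=$off_by_one\n";
```

The offset of 6 confirms that the quantified pattern captures the third 'ab' and not the 'ab' at the start of 'abb'.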

    > It does, while yours actually doesn't: you need to index $n-1 unless
    > you've reset $[ to 1.


    Yes, of course... :(

    > The difference is that yours does about twice the work, and so
    > takes about twice as long. The aborting for loop is even slower.
    >
    > Some benchmark code for your amusement:


    Actually, on my machine my code runs slowest of the three for that
    input... :)

    Ben

    --
    If I were a butterfly I'd live for a day, / I would be free, just blowing away.
    This cruel country has driven me down / Teased me and lied, teased me and lied.
    I've only sad stories to tell to this town: / My dreams have withered and died.
    <=>=<=>=<=>=<=>=<=>=<=>=<=>=<=>=<=>=<=>=<=> (Kate Rusby)
    Ben Morrow, Oct 29, 2003
    #5
