(?{ ... }) puzzlement

Discussion in 'Perl Misc' started by J Krugman, May 31, 2004.

  1. J Krugman

    J Krugman Guest

    In an attempt to find a single regexp that would succeed if three
    different sub-regexps matched in any order (see why in the thread
    called '"Commutative" regexps'), I started playing with (?{...})-type
    regexps. As warm-up, I tried this:

    1 use strict;
    2 use re 'eval';
    3
    4 my @re0 = qw(abc pqr xyz);
    5 my @seen = (undef) x @re0;
    6 my @re = map sprintf('%s(?{ $seen[%d] ||= "@-" })',
    7 $re0[$_], $_),
    8 0..$#re0;
    9 my $re = eval "qr/@{[join('|', @re)]}/";
    10
    11 #0         1         2
    12 #01234567890123456789012345
    13 '__pqr____xyz__pqr___abc___' =~ /(($re).*?)*/;
    14
    15 print "\$seen[$_] = $seen[$_]\n" for (0..$#seen);
    16
    17 __END__

    $seen[0] =
    $seen[1] =
    $seen[2] =

    If I change line 13 to

    13 '__pqr____xyz__pqr___abc___' =~ /(($re).*?)*(?!)/;

    The output I get changes to

    $seen[0] = 2 14 14
    $seen[1] = 2
    $seen[2] = 2 2 2


    I find both results completely puzzling. I realize that ?{ ... }
    is a highly experimental feature, but if anyone can explain to me
    what's going on I'd very much appreciate it.

    TIA,

    jill

    P.S. Unrelated regexp question: if I have a string or regexp in
    a variable $x, and I want to use this variable to write a regexp
    corresponding to 5 repeats of the contents of $x, how do I write
    it? If I wrote /$x{5}/, it would be interpreted by perl as attempting
    to access the value corresponding to key '5' in the hash %x.

    --
    To s&e^n]d me m~a}i]l r%e*m?o\v[e bit from my a|d)d:r{e:s]s.
    J Krugman, May 31, 2004
    #1

  2. J Krugman

    Ben Morrow Guest

    Quoth J Krugman <>:
    >
    > In an attempt to find a single regexp that would succeed if three
    > different sub-regexps matched in any order (see why in the thread
    > called '"Commutative" regexps'), I started playing with (?{...})-type
    > regexps. As warm-up, I tried this:
    >
    > 1 use strict;
    > 2 use re 'eval';
    > 3
    > 4 my @re0 = qw(abc pqr xyz);
    > 5 my @seen = (undef) x @re0;
    > 6 my @re = map sprintf('%s(?{ $seen[%d] ||= "@-" })',
    > 7 $re0[$_], $_),
    > 8 0..$#re0;


    @- only contains entries for () sub-expressions. You have none here
    (that have matched yet), so it won't work. (My guess is that the first
    entry isn't filled in until after the match has finished, but I don't
    rightly know...)

    > 9 my $re = eval "qr/@{[join('|', @re)]}/";
    > 10
    > 11 #0         1         2
    > 12 #01234567890123456789012345
    > 13 '__pqr____xyz__pqr___abc___' =~ /(($re).*?)*/;
    > 14
    > 15 print "\$seen[$_] = $seen[$_]\n" for (0..$#seen);
    > 16
    > 17 __END__
    >
    > $seen[0] =
    > $seen[1] =
    > $seen[2] =
    >
    > If I change line 13 to
    >
    > 13 '__pqr____xyz__pqr___abc___' =~ /(($re).*?)*(?!)/;
    >
    > The output I get changes to
    >
    > $seen[0] = 2 14 14
    > $seen[1] = 2
    > $seen[2] = 2 2 2


    Presumably the (?!) is causing it to carry on trying, so it re-runs
    those bits of code after @- has some entries.
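One way to see why the first version prints nothing: a starred group is allowed to match zero times, so the whole pattern can succeed with an empty match at offset 0 before any alternative (and its code block) is reached. The sketch below is an illustration only; it hard-codes the three literals and uses a counter ($runs, $runs_forced are names made up here) instead of the original @seen bookkeeping.

```perl
use strict;
use warnings;

my $str = '__pqr____xyz__pqr___abc___';

# Without (?!): the starred group may match zero times, so the overall
# match succeeds immediately at offset 0 with an empty string, and the
# code block (placed after the alternation) never gets a chance to run.
my $runs = 0;
my $ok   = $str =~ /((?:abc|pqr|xyz)(?{ $runs++ }).*?)*/ ? 1 : 0;
my ($start, $length) = ($-[0], $+[0] - $-[0]);
print "ok=$ok start=$start length=$length runs=$runs\n";
# prints: ok=1 start=0 length=0 runs=0

# With (?!): every candidate match is rejected, so the engine keeps
# backtracking and advancing, and the code block runs along the way.
my $runs_forced = 0;
my $ok_forced   = $str =~ /((?:abc|pqr|xyz)(?{ $runs_forced++ }).*?)*(?!)/ ? 1 : 0;
print "ok=$ok_forced runs=$runs_forced\n";
```

If the goal is to record where the engine currently is, pos() inside the code block reports the current match position, whereas "@-" only reflects offsets that have already been recorded.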

    Try something more like (completely untested):

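    my @pats = qw/abc pqr xyz/;
    my %seen;
    my $re = join '|', map {
        # $^N is the text of the group just closed; inside the code block
        # $_ would be the string being matched, not the map variable.
        qr/(\Q$_\E) (?{ $seen{$^N} = 1 }) (?!)/x
    } @pats;

    '__pqr__xyz__pqr__abc__' =~ /$re/;

    $, = $\ = "\n";
    print keys %seen;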

    > P.S. Unrelated regexp question: if I have a string or regexp in
    > a variable $x, and I want to use this variable to write a regexp
    > corresponding to 5 repeats of the contents of $x, how do I write
    > it? If I wrote /$x{5}/, it would be interpreted by perl as attempting
    > to access the value corresponding to key '5' in the hash %x.


    Try /(?:$x){5}/.
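A small sketch of why the non-capturing group matters: the quantifier binds to the single atom before it, so without grouping only the last character of the interpolated text is repeated. The value 'ab' and the variable names are made up for illustration.

```perl
use strict;
use warnings;

my $x = 'ab';

# Grouped: {5} applies to the whole interpolated pattern.
my $grouped = qr/^(?:$x){5}$/;
print "grouped matches ababababab\n" if ('ab' x 5) =~ $grouped;
print "grouped rejects abababab\n"   unless ('ab' x 4) =~ $grouped;

# Ungrouped (pattern text built by hand as 'ab{5}'): {5} applies to 'b' only.
my $text      = $x . '{5}';
my $ungrouped = qr/^$text$/;
print "ungrouped matches abbbbb\n" if 'abbbbb' =~ $ungrouped;
```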

    Ben

    --
    For the last month, a large number of PSNs in the Arpa[Inter-]net have been
    reporting symptoms of congestion ... These reports have been accompanied by an
    increasing number of user complaints ... As of June,... the Arpanet contained
    47 nodes and 63 links. [ftp://rtfm.mit.edu/pub/arpaprob.txt] *
    Ben Morrow, May 31, 2004
    #2

  3. J Krugman

    J Krugman Guest

    In <c9fabj$8bg$> Ben Morrow <> writes:


    >Quoth J Krugman <>:
    >>
    >> In an attempt to find a single regexp that would succeed if three
    >> different sub-regexps matched in any order (see why in the thread
    >> called '"Commutative" regexps'), I started playing with (?{...})-type
    >> regexps. As warm-up, I tried this:
    >>
    >> 1 use strict;
    >> 2 use re 'eval';
    >> 3
    >> 4 my @re0 = qw(abc pqr xyz);
    >> 5 my @seen = (undef) x @re0;
    >> 6 my @re = map sprintf('%s(?{ $seen[%d] ||= "@-" })',
    >> 7 $re0[$_], $_),
    >> 8 0..$#re0;


    >@- only contains entries for () sub-expressions. You have none here
    >(that have matched yet), so it won't work. (My guess is that the first
    >entry isn't filled in until after the match has finished, but I don't
    >rightly know...)


    I get the same results (i.e. @seen never gets populated) if I
    change line 6 to

    6 my @re = map sprintf('(%s)(?{ $seen[%d] ||= "@-" })',



    >Try something more like (completely untested):


    >my @pats = qw/abc pqr xyz/;
    >my %seen;
    >my $re = join '|', map {
    > qr/(\Q$_\E) (?{ $seen{$^N} = 1 }) (?!)/x
    >} @pats;


    >'__pqr__xyz__pqr__abc__' =~ /$re/;


    >$, = $\ = "\n";
    >print keys %seen;


    OK, tried it (and many variants); no luck. I now think that (?{ ... })
    is not the way to go; too complicated (and/or buggy) for me.
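For comparison, here is a minimal sketch of the same bookkeeping without embedded code, assuming the sub-patterns are plain literals as in the warm-up (for general sub-regexps $1 would hold the matched text rather than the pattern, so the hash keys would need a different scheme). The names %first, $alt and $any_order are invented for this example.

```perl
use strict;
use warnings;

my @pats = qw(abc pqr xyz);
my $str  = '__pqr____xyz__pqr___abc___';

# One alternation of the quoted literals; a global match walks the string
# left to right and $-[1] gives the start offset of each hit.
my $alt = join '|', map { quotemeta } @pats;

my %first;    # literal => offset of its first occurrence
while ($str =~ /($alt)/g) {
    $first{$1} = $-[1] unless exists $first{$1};
}

printf "%s first seen at %d\n", $_, $first{$_} for @pats;
# abc first seen at 20
# pqr first seen at 2
# xyz first seen at 9

# "All three present, in any order" can also be tested with lookaheads alone.
my $any_order = $str =~ /^(?=.*abc)(?=.*pqr)(?=.*xyz)/s ? 1 : 0;
print "any_order=$any_order\n";    # any_order=1
```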



    >> P.S. Unrelated regexp question: if I have a string or regexp in
    >> a variable $x, and I want to use this variable to write a regexp
    >> corresponding to 5 repeats of the contents of $x, how do I write
    >> it? If I wrote /$x{5}/, it would be interpreted by perl as attempting
    >> to access the value corresponding to key '5' in the hash %x.


    >Try /(?:$x){5}/.


    Thanks!

    jill



    --
    To s&e^n]d me m~a}i]l r%e*m?o\v[e bit from my a|d)d:r{e:s]s.
    J Krugman, May 31, 2004
    #3
  4. J Krugman

    Matt Garrish Guest

    "J Krugman" <> wrote in message
    news:c9f8on$no6$...
    >
    >
    >
    > In an attempt to find a single regexp that would succeed if three
    > different sub-regexps matched in any order (see why in the thread
    > called '"Commutative" regexps'), I started playing with (?{...})-type
    > regexps. As warm-up, I tried this:
    >
    > 1 use strict;
    > 2 use re 'eval';
    > 3
    > 4 my @re0 = qw(abc pqr xyz);
    > 5 my @seen = (undef) x @re0;
    > 6 my @re = map sprintf('%s(?{ $seen[%d] ||= "@-" })',


    I think you want ?? not ?

    my @re = map sprintf('%s(??{ $seen[%d] ||= "@-" })',


    > 7 $re0[$_], $_),
    > 8 0..$#re0;
    > 9 my $re = eval "qr/@{[join('|', @re)]}/";
    > 10
    > 11 #0         1         2
    > 12 #01234567890123456789012345
    > 13 '__pqr____xyz__pqr___abc___' =~ /(($re).*?)*/;
    > 14
    > 15 print "\$seen[$_] = $seen[$_]\n" for (0..$#seen);
    > 16
    > 17 __END__
    >


    The output then becomes

    $seen[0] = 20
    $seen[1] = 2
    $seen[2] = 9
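For readers unfamiliar with the difference: (?{ ... }) only runs its code, while (??{ ... }) runs the code and then matches the returned value as a pattern at that point. With the change to line 6, the string stored in $seen[...] is therefore also used as a pattern. The sketch below is a separate, made-up illustration of (??{ ... }) on its own, not a reconstruction of the run above.

```perl
use strict;
use warnings;

# The block's return value is compiled and matched as a pattern here:
# a leading digit decides how many x's must follow.
my $counted = qr/^(\d)(??{ "x{$1}" })$/;

for my $s (qw(3xxx 2xx 3xx)) {
    printf "%-4s %s\n", $s, ($s =~ $counted ? 'matches' : 'does not match');
}
```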

    Matt
    Matt Garrish, Jun 1, 2004
    #4
  5. J Krugman

    Matt Garrish Guest

    "Matt Garrish" <> wrote in message
    news:9OOuc.84348$...
    >
    > "J Krugman" <> wrote in message
    > news:c9f8on$no6$...
    > >
    > >
    > >
    > > In an attempt to find a single regexp that would succeed if three
    > > different sub-regexps matched in any order (see why in the thread
    > > called '"Commutative" regexps'), I started playing with (?{...})-type
    > > regexps. As warm-up, I tried this:
    > >
    > > 1 use strict;
    > > 2 use re 'eval';
    > > 3
    > > 4 my @re0 = qw(abc pqr xyz);
    > > 5 my @seen = (undef) x @re0;
    > > 6 my @re = map sprintf('%s(?{ $seen[%d] ||= "@-" })',

    >
    > I think you want ?? not ?
    >
    > my @re = map sprintf('%s(??{ $seen[%d] ||= "@-" })',
    >
    >
    > > 7 $re0[$_], $_),
    > > 8 0..$#re0;
    > > 9 my $re = eval "qr/@{[join('|', @re)]}/";
    > > 10
    > > 11 #0         1         2
    > > 12 #01234567890123456789012345
    > > 13 '__pqr____xyz__pqr___abc___' =~ /(($re).*?)*/;
    > > 14
    > > 15 print "\$seen[$_] = $seen[$_]\n" for (0..$#seen);
    > > 16
    > > 17 __END__
    > >

    >
    > The output then becomes
    >
    > $seen[0] = 20
    > $seen[1] = 2
    > $seen[2] = 9
    >


    Forgot to mention that I also changed line 13 to:

    '__pqr____xyz__pqr___abc___' =~ /(($re).*?)*(?!)/;

    as per your original post.

    Matt
    Matt Garrish, Jun 1, 2004
    #5