more elegant way to say ($1, $2, $3, $4, ...)?

Discussion in 'Perl Misc' started by Larry, Aug 9, 2007.

  1. Larry

    Larry Guest

    I'm using a /g regex in a while loop to capture parenthesized matches
    to meaningful variable names like this:

    while (/ (...) ... (...) ... (...)/g) {
    my ($foo, $bar, $baz) = ($1, $2, $3);
    ...
    }

    The ($1, $2, $3) part seems inelegant ... is there a more elegant way?

    BTW, don't suggest:

    while (my ($foo, $bar, $baz) = / (...) ... (...) ... (...)/g) {
    ...
    }

    That will cause the regex to evaluate in a list context, which changes
    the behavior of /g to parse all of $_ at once, only returning the
    first match and throwing away the rest.
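
    A minimal sketch of the difference between the two contexts, assuming a made-up sample string in $text (three space-separated 3-character fields per record):

    ```perl
    use strict;
    use warnings;

    # Hypothetical sample data: two records of three fields each.
    my $text = ' abc def ghi jkl mno pqr';

    # Scalar context: each evaluation finds the next match, resuming at pos($text).
    my @scalar_rows;
    while ($text =~ / (...) (...) (...)/g) {
        push @scalar_rows, "$1,$2,$3";
    }

    # List context: a single evaluation returns the captures of every match at once.
    my @all = $text =~ / (...) (...) (...)/g;
    my ($foo, $bar, $baz) = @all;   # only the first three values survive this assignment

    print "scalar context rows: @scalar_rows\n";
    print "list context values: @all\n";
    ```

    In the list-context form the captures from the second record are still returned, but a three-variable assignment has nowhere to put them.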
    Larry, Aug 9, 2007
    #1

  2. On Aug 9, 7:04 pm, Larry <> wrote:
    > I'm using a /g regex in a while loop to capture parenthesized matches
    > to meaningful variable names like this:
    >
    > while (/ (...) ... (...) ... (...)/g) {
    > my ($foo, $bar, $baz) = ($1, $2, $3);
    > ...
    >
    > }
    >
    > The ($1, $2, $3) part seems inelegant ... is there a more elegant way?


    Not that I know of in Perl5 - and believe me I've looked.

    You could write a function that returns them

    sub matches { no strict 'refs'; map $$_ , 1 .. $#- }

    while (/ (...) ... (...) ... (...)/g) {
    my ($foo, $bar, $baz) = matches;
    }

    But this is hardly more elegant.
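
    For anyone who wants to try it, here is a self-contained sketch of the same helper under "use strict", assuming a made-up sample string. The symbolic dereference is confined to the sub.

    ```perl
    use strict;
    use warnings;

    # Return the numbered captures of the caller's last successful match.
    # $#- is the index of the last group that matched, so this yields $1 .. $n.
    sub matches {
        no strict 'refs';        # $$_ with $_ == 1 is a symbolic reference to $1
        return map $$_, 1 .. $#-;
    }

    my $text = ' abc def ghi jkl mno pqr';   # hypothetical sample data
    my @rows;
    while ($text =~ / (...) (...) (...)/g) {
        my ($foo, $bar, $baz) = matches();
        push @rows, "$foo-$bar-$baz";
    }
    print "$_\n" for @rows;
    ```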
    Brian McCauley, Aug 9, 2007
    #2

  3. Larry

    Larry Guest

    On Aug 9, 2:29 pm, Brian McCauley <> wrote:
    > On Aug 9, 7:04 pm, Larry <> wrote:
    >
    > > I'm using a /g regex in a while loop to capture parenthesized matches
    > > to meaningful variable names like this:

    >
    > > while (/ (...) ... (...) ... (...)/g) {
    > > my ($foo, $bar, $baz) = ($1, $2, $3);
    > > ...

    >
    > > }

    >
    > > The ($1, $2, $3) part seems inelegant ... is there a more elegant way?

    >
    > Not that I know of in Perl5 - and believe me I've looked.
    >
    > You could write a function that returns them
    >
    > sub matches { no strict 'refs'; map $$_ , 1 .. $#- }
    >
    > while (/ (...) ... (...) ... (...)/g) {
    > my ($foo, $bar, $baz) = matches;
    >
    > }
    >
    > But this is hardly more elegant.


    Not elegant?! It's awesome! Thanks!

    BTW, I just learned 2 new things:

    -- map can take an expr as the first param, not just a block (had to
    look that up to see what was going on exactly!)

    -- that there is a variable called @- and what it does

    Thanks!
    Larry, Aug 9, 2007
    #3
  4. Uri Guttman

    Uri Guttman Guest

    >>>>> "L" == Larry <> writes:

    L> I'm using a /g regex in a while loop to capture parenthesized matches
    L> to meaningful variable names like this:

    L> while (/ (...) ... (...) ... (...)/g) {
    L> my ($foo, $bar, $baz) = ($1, $2, $3);
    L> ...
    L> }

    L> The ($1, $2, $3) part seems inelegant ... is there a more elegant way?

    you can look at using @+, @- to get the strings via substr. a map call
    on 1 .. $#+ will do it but it is ugly too

    perldoc perlvar says this:
    $1 is the same as "substr($var, $-[1], $+[1] - $-[1])"

    so this should work (untested):

    my ($foo, $bar, $baz) =
    map substr($var, $-[$_], $+[$_] - $-[$_]), 1 .. $#+ ;

    and that map stuff could be put into a sub to clean it up. just pass in
    $var and the @+ and @- globals should still be set. something like this:

    sub matches {
    map substr($_[0], $-[$_], $+[$_] - $-[$_]), 1 .. $#+ ;
    }

    my ($foo, $bar, $baz) = matches( $var ) ;

    but i would just stick with the assignment of $1, $2 ... as it is the
    cleanest.
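
    A runnable sketch of the substr idea, assuming a made-up sample string. Index 0 of @- and @+ describes the whole match, so the range runs from 1 to pick up only the parenthesized groups.

    ```perl
    use strict;
    use warnings;

    # Rebuild $1 .. $n from the match offsets; the caller passes the matched string.
    # Assumes every group took part in the match (no unmatched optional groups).
    sub matches {
        my ($str) = @_;
        return map substr($str, $-[$_], $+[$_] - $-[$_]), 1 .. $#+;
    }

    my $var = ' abc def ghi jkl mno pqr';   # hypothetical sample data
    my @rows;
    while ($var =~ / (...) (...) (...)/g) {
        my ($foo, $bar, $baz) = matches($var);
        push @rows, "$foo-$bar-$baz";
    }
    print "$_\n" for @rows;
    ```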

    uri

    --
    Uri Guttman ------ -------- http://www.stemsystems.com
    --Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
    Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
    Uri Guttman, Aug 9, 2007
    #4
  5. Uri Guttman

    Uri Guttman Guest

    >>>>> "L" == Larry <> writes:

    L> On Aug 9, 2:29 pm, Brian McCauley <> wrote:
    >> On Aug 9, 7:04 pm, Larry <> wrote:


    >> You could write a function that returns them
    >>
    >> sub matches { no strict 'refs'; map $$_ , 1 .. $#- }
    >>
    >> while (/ (...) ... (...) ... (...)/g) {
    >> my ($foo, $bar, $baz) = matches;
    >>
    >> }
    >>
    >> But this is hardly more elegant.


    L> Not elegant?! It's awesome! Thanks!

    it is not elegant as it uses symrefs which is evil. see my other post
    for a solution without symrefs.

    L> -- that there is a variable called @- and what it does

    and see @+ and how perlvar says to use them. my other post shows a full
    example without symrefs.

    uri

    --
    Uri Guttman ------ -------- http://www.stemsystems.com
    --Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
    Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
    Uri Guttman, Aug 9, 2007
    #5
  6. Mirco Wahab

    Mirco Wahab Guest

    Larry wrote:
    > I'm using a /g regex in a while loop to capture parenthesized matches
    > to meaningful variable names like this:
    >
    > while (/ (...) ... (...) ... (...)/g) {
    > my ($foo, $bar, $baz) = ($1, $2, $3);
    > ...
    > }
    > The ($1, $2, $3) part seems inelegant ... is there a more elegant way?


    The $n is an idiomatic expression which is
    not that bad in my opinion.

    You could fake 'named captures' like this:

    ...
    my ($foo, $bar, $baz);
    $_ = ' abc def' x 60;

    while(/ (...)(?{$foo=$^N}) ... (...)(?{$bar=$^N}) ... (...)(?{$baz=$^N}) /g) {
    print "$foo, $bar, $baz\n"
    }

    ...

    or even (watch your braces)

    ...
    while(/ (...) ... (...) ... (...)(?{($foo,$bar,$baz)=($1,$2,$3)})/g) {
    print "$foo, $bar, $baz\n";
    }
    ...


    Regards

    M.
    Mirco Wahab, Aug 9, 2007
    #6
  7. On Aug 9, 8:10 pm, Uri Guttman <> wrote:
    > >>>>> "L" == Larry <> writes:

    >
    > L> On Aug 9, 2:29 pm, Brian McCauley <> wrote:
    > >> On Aug 9, 7:04 pm, Larry <> wrote:

    >
    > >> You could write a function that returns them
    > >>
    > >> sub matches { no strict 'refs'; map $$_ , 1 .. $#- }
    > >>
    > >> while (/ (...) ... (...) ... (...)/g) {
    > >> my ($foo, $bar, $baz) = matches;
    > >>
    > >> }
    > >>
    > >> But this is hardly more elegant.

    >
    > L> Not elegant?! It's awesome! Thanks!
    >
    > it is not elegant as it uses symrefs which is evil.


    Using multiple named variables to implement what is logically a
    composite data structure (array or hash) is evil.

    The only way to access such an evil structure is to use symrefs or eval().
    (Of the two, symrefs are the lesser evil.)

    In this case $1... are already, in effect, such a structure.

    The 'evil' here is not in my code but in the underlying design
    decision in early versions of Perl.

    An alternative approach using substr($_...) would avoid symrefs but
    the evil is still there. The fact that we choose to avert our eyes
    does not reduce the evil.

    See also:

    http://groups.google.co.uk/group/co...read/thread/1ebb17826a236940/1a323f2e1968a83f

    > see my other post
    > for a solution without symrefs.


    Your post does not appear to have propagated, could you re-post it
    please.
    Brian McCauley, Aug 10, 2007
    #7
  8. Uri Guttman

    Uri Guttman Guest

    >>>>> "BM" == Brian McCauley <> writes:

    BM> On Aug 9, 8:10 pm, Uri Guttman <> wrote:

    >>
    >> it is not elegant as it uses symrefs which is evil.


    BM> Using multiple named variables to implement what is logically a
    BM> composite data structure (array or hash) is evil.

    i would rather put the blame on the text being parsed! :)
    the OP never showed any real text to parse. i have done scalar m//g
    loops too but rarely with more than a few grabs so i don't mind the $1
    style. if there are too many i would break up the text first into
    sections and then parse out the grabs and assign them to a list of
    scalars or a hash slice (which is the best way).
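
    The hash slice mentioned above might look like the following sketch; the field names and the sample text are made up for illustration.

    ```perl
    use strict;
    use warnings;

    my @fields = qw(foo bar baz);              # hypothetical field names
    my $text   = ' abc def ghi jkl mno pqr';   # hypothetical sample data

    my @records;
    while ($text =~ / (...) (...) (...)/g) {
        my %rec;
        @rec{@fields} = ($1, $2, $3);          # hash slice: one assignment fills every key
        push @records, \%rec;
    }
    print "$records[0]{foo} $records[1]{baz}\n";
    ```

    Each record keeps its captures under meaningful keys, so nothing further down has to remember which number belonged to which field.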

    BM> The 'evil' here is not in my code but in the underlying design
    BM> decision in early versions of Perl.

    perl6 solves this problem as usual by allowing m//g loops but only
    grabbing what is in the regex and allowing assignment to hash elements
    among many other things.

    BM> An alternative approach using substr($_...) would avoid symrefs but
    BM> the evil is still there. The fact that we choose to avert our eyes
    BM> does not reduce the evil.

    but it looks so much neater with substr. :)

    >> see my other post
    >> for a solution without symrefs.


    BM> Your post does not appear to have propagated, could you re-post it
    BM> please.

    not sure why as i saw it. let it rest as it was just a slight mod of
    what is in perlvar about using substr and @- and @+.

    uri

    --
    Uri Guttman ------ -------- http://www.stemsystems.com
    --Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
    Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
    Uri Guttman, Aug 10, 2007
    #8
  9. Larry

    Guest

    On Aug 9, 9:08 pm, Uri Guttman <> wrote:
    > >>>>> "L" == Larry <> writes:

    >
    > L> I'm using a /g regex in a while loop to capture parenthesized matches
    > L> to meaningful variable names like this:
    >
    > L> while (/ (...) ... (...) ... (...)/g) {
    > L> my ($foo, $bar, $baz) = ($1, $2, $3);
    > L> ...
    > L> }
    >
    > L> The ($1, $2, $3) part seems inelegant ... is there a more elegant way?
    >
    > you can look at using @+, @- to get the strings via substr. a map call
    > on 1 .. $#+ will do it but it is ugly too
    >
    > perldoc perlvar says this:
    > $1 is the same as "substr($var, $-[1], $+[1] - $-[1])"
    >
    > so this should work (untested):
    >
    > my ($foo, $bar, $baz) =
    > map substr($var, $-[$_], $+[$_] - $-[$_]), 1 .. $#+ ;
    >
    > and that map stuff could be put into a sub to clean it up. just pass in
    > $var and the @+ and @- globals should still be set. something like this:
    >
    > sub matches {
    > map substr($_[0], $-[$_], $+[$_] - $-[$_]), 1 .. $#+ ;
    > }
    >
    > my ($foo, $bar, $baz) = matches( $var ) ;
    >
    > but i would just stick with the assignment of $1, $2 ... as it is the
    > cleanest.


    The problem with this approach is that it requires you to know what
    string @- and @+ are operating on, which is actually somewhere between
    difficult and impossible in the case of s///.
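
    A small sketch of that failure mode, assuming a made-up sample string: after s/// the offsets in @- and @+ still describe the string as it was before the substitution, so cutting them out of the modified string gives the wrong text.

    ```perl
    use strict;
    use warnings;

    my $str = 'abc def';            # hypothetical sample data
    $str =~ s/(\w+) (\w+)/$2 $1/;   # $str is now 'def abc'

    my $from_capture = $1;                                   # still the original first group
    my $from_offsets = substr($str, $-[1], $+[1] - $-[1]);   # cut from the modified string

    print "capture: $from_capture, offsets: $from_offsets\n";
    ```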

    One solution that avoids this problem is the following, somewhat
    crufty code:

    sub matches {
    eval 'sub { \@_ }->(' . join(", ", map "\$$_", 1 .. $#+) . ')'
    }

    Now you can say

    my $array=matches();

    and have it do the right thing always*, even if we didn't make a copy
    of the original string before we used s///.

    *Of course the array returned for matches() is only "good" for the
    results of a given match.

    What we (perl5porters) really should do is provide a special magic
    variable that returns the entire string that $1 and friends reference,
    so then using @- and @+ would be safe. Unfortunately it's too late for
    that to make it into 5.10, although it's possible for 5.10.1 I guess.

    Yves
    , Oct 2, 2007
    #9
  10. Larry

    Guest

    On Aug 9, 10:11 pm, Mirco Wahab <> wrote:
    > Larry wrote:
    > > I'm using a /g regex in a while loop to capture parenthesized matches
    > > to meaningful variable names like this:

    >
    > > while (/ (...) ... (...) ... (...)/g) {
    > > my ($foo, $bar, $baz) = ($1, $2, $3);
    > > ...
    > > }
    > > The ($1, $2, $3) part seems inelegant ... is there a more elegant way?

    >
    > The $n is an idiomatic expression which is
    > not that bad in my opinion.
    >
    > You could fake 'named captures' like this:


    Or use 5.10 when it comes out and make use of its real named
    captures. :)
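
    Perl 5.10 has since shipped with named captures, written (?<name>...) and read back through the %+ hash. A minimal sketch, assuming a made-up sample string and the group names foo, bar and baz:

    ```perl
    use strict;
    use warnings;
    use 5.010;   # named captures and %+ need Perl 5.10 or later

    my $text = ' abc def ghi jkl mno pqr';   # hypothetical sample data
    my @rows;
    while ($text =~ / (?<foo>...) (?<bar>...) (?<baz>...)/g) {
        # %+ maps each group name to the text it captured in the last match
        push @rows, join '-', $+{foo}, $+{bar}, $+{baz};
    }
    print "$_\n" for @rows;
    ```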

    Yves
    , Oct 2, 2007
    #10
  11. On Aug 9, 1:04 pm, Larry <> wrote:
    > I'm using a /g regex in a while loop to capture parenthesized matches
    > to meaningful variable names like this:
    >
    > while (/ (...) ... (...) ... (...)/g) {
    > my ($foo, $bar, $baz) = ($1, $2, $3);
    > ...
    >
    > }
    >
    > The ($1, $2, $3) part seems inelegant ... is there a more elegant way?
    >
    > BTW, don't suggest:
    >
    > while (my ($foo, $bar, $baz) = / (...) ... (...) ... (...)/g) {
    > ...
    >
    > }
    >
    > That will cause the regex to evaluate in a list context, which changes
    > the behavior of /g to parse all of $_ at once, only returning the
    > first match and throwing away the rest.


    Ruby

    scan( / (...) ... (...) ... (...)/ ){ |foo, bar, baz|
    ...
    }
    William James, Oct 2, 2007
    #11
  12. szr

    szr Guest

    Larry wrote:
    > I'm using a /g regex in a while loop to capture parenthesized matches
    > to meaningful variable names like this:
    >
    > while (/ (...) ... (...) ... (...)/g) {
    > my ($foo, $bar, $baz) = ($1, $2, $3);
    > ...
    > }
    >
    > The ($1, $2, $3) part seems inelegant ... is there a more elegant way?
    >
    > BTW, don't suggest:
    >
    > while (my ($foo, $bar, $baz) = / (...) ... (...) ... (...)/g) {
    > ...
    > }
    >
    > That will cause the regex to evaluate in a list context, which changes
    > the behavior of /g to parse all of $_ at once, only returning the
    > first match and throwing away the rest.


    Why not just do something like the following?

    my $s = 'A1Z B2Y C3X D4W E5V';

    ### Inelegant - have to know amount of captures/loop-iteration
    while ($s =~ /(\w)(\d)(\w)/g) {
    my ($foo, $bar, $baz) = ($1, $2, $3);
    print "'$foo' '$bar' '$baz'\n";
    }

    print "\n";

    ### More elegant - all captures for each iteration go into an array
    while (my @matches = $s =~ /\G.*?(\w)(\d)(\w)/) {
    pos($s) = $+[0];
    print "'", join("' '", @matches), "'\n";
    }

    ___OUTPUT___
    'A' '1' 'Z'
    'B' '2' 'Y'
    'C' '3' 'X'
    'D' '4' 'W'
    'E' '5' 'V'

    'A' '1' 'Z'
    'B' '2' 'Y'
    'C' '3' 'X'
    'D' '4' 'W'
    'E' '5' 'V'


    All you have to do is add \G.*? to the beginning of the regex and
    remove g from the end of the regex (the modifier list). Other than that,
    you just need to have pos($s) = $+[0]; at the beginning of your loop
    (or at least before the end of the loop, though it seems safest to keep
    it at the beginning, especially if you do any tests on pos($s)).

    :)


    --
    szr
    szr, Oct 10, 2007
    #12
