more elegant way to say ($1, $2, $3, $4, ...)?

Larry · Aug 9, 2007

I'm using a /g regex in a while loop to capture parenthesized matches
to meaningful variable names like this:

while (/ (...) ... (...) ... (...)/g) {
    my ($foo, $bar, $baz) = ($1, $2, $3);
    ...
}

The ($1, $2, $3) part seems inelegant ... is there a more elegant way?

BTW, don't suggest:

while (my ($foo, $bar, $baz) = / (...) ... (...) ... (...)/g) {
    ...
}

That will cause the regex to evaluate in a list context, which changes
the behavior of /g to parse all of $_ at once, only returning the
first match and throwing away the rest.
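
To make the list-context behavior concrete, here is a minimal sketch. The sample string and variable names are made up for illustration and are not from the original post. In list context, //g returns the captures of every match as one flat list, so a three-variable assignment keeps only the first triple.

```perl
use strict;
use warnings;

# Hypothetical sample data, not from the original post.
my $text = 'A1Z B2Y C3X';

# List context: //g walks the whole string and returns every capture.
my @all = $text =~ /(\w)(\d)(\w)/g;

# Assigning to three scalars keeps the first triple and drops the rest.
my ($foo, $bar, $baz) = $text =~ /(\w)(\d)(\w)/g;

print scalar(@all), " captures in total\n";
print "kept: $foo $bar $baz\n";
```

This is why the scalar-context while loop is the form that steps through the matches one at a time.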

Brian McCauley · Aug 9, 2007

> I'm using a /g regex in a while loop to capture parenthesized matches
> to meaningful variable names like this:
>
> while (/ (...) ... (...) ... (...)/g) {
>     my ($foo, $bar, $baz) = ($1, $2, $3);
>     ...
> }
>
> The ($1, $2, $3) part seems inelegant ... is there a more elegant way?

Not that I know of in Perl5 - and believe me I've looked.

You could write a function that returns them

sub matches { no strict 'refs'; map $$_ , 1 .. $#- }

while (/ (...) ... (...) ... (...)/g) {
    my ($foo, $bar, $baz) = matches;
}

But this is hardly more elegant.
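
For anyone who wants to try the idea, a small runnable sketch follows. The sample string and the @rows array are assumptions added for illustration; the matches sub is the one from the post. It relies on $#- being the index of the last group that matched, and on symbolic references, which is why strict refs is switched off inside the sub.

```perl
use strict;
use warnings;

# Returns the capture groups of the last successful match as a list.
# $$_ is a symbolic reference to $1, $2, ... hence "no strict 'refs'".
sub matches { no strict 'refs'; map $$_, 1 .. $#- }

# Hypothetical sample data for illustration.
my $text = 'A1Z B2Y C3X';
my @rows;

while ($text =~ /(\w)(\d)(\w)/g) {
    my ($foo, $bar, $baz) = matches();
    push @rows, "$foo-$bar-$baz";
}

print "$_\n" for @rows;
```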

Larry · Aug 9, 2007

> Not that I know of in Perl5 - and believe me I've looked.
>
> You could write a function that returns them
>
> sub matches { no strict 'refs'; map $$_ , 1 .. $#- }
>
> while (/ (...) ... (...) ... (...)/g) {
>     my ($foo, $bar, $baz) = matches;
> }
>
> But this is hardly more elegant.

Not elegant?! It's awesome! Thanks!

BTW, I just learned 2 new things:

-- map can take an expr as the first param, not just a block (had to
look that up to see what was going on exactly!)

-- that there is a variable called @- and what it does

Thanks!

Uri Guttman · Aug 9, 2007

L> I'm using a /g regex in a while loop to capture parenthesized matches
L> to meaningful variable names like this:

L> while (/ (...) ... (...) ... (...)/g) {
L> my ($foo, $bar, $baz) = ($1, $2, $3);
L> ...
L> }

L> The ($1, $2, $3) part seems inelegant ... is there a more elegant way?

you can look at using @+, @- to get the strings via substr. a map call
on 1 .. $#+ will do it but it is ugly too

perldoc perlvar says this:
$1 is the same as "substr($var, $-[1], $+[1] - $-[1])"

so this should work (untested):

# start at 1: index 0 of @- and @+ is the whole match, not $1
my ($foo, $bar, $baz) =
    map substr($var, $-[$_], $+[$_] - $-[$_]), 1 .. $#+ ;

and that map stuff could be put into a sub to clean it up. just pass in
$var and the @+ and @- globals should still be set. something like this:

sub matches {
    map substr($_[0], $-[$_], $+[$_] - $-[$_]), 1 .. $#+ ;
}

my ($foo, $bar, $baz) = matches( $var ) ;

but i would just stick with the assignment of $1, $2 ... as it is the
cleanest.

uri
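
A self-contained sketch of the substr approach, for readers who want to run it. The sample string and the $key/$value names are assumptions for illustration. The range starts at 1 because element 0 of @- and @+ describes the whole match rather than the first group.

```perl
use strict;
use warnings;

# Rebuild $1, $2, ... from the offset arrays @- and @+.
# The caller must pass the same string the regex was matched against.
sub matches {
    my ($str) = @_;
    return map substr($str, $-[$_], $+[$_] - $-[$_]), 1 .. $#+;
}

# Hypothetical sample data for illustration.
my $var = 'width=640';
my ($key, $value);
if ($var =~ /(\w+)=(\d+)/) {
    ($key, $value) = matches($var);
}
print "$key -> $value\n";
```

Note that this version avoids symbolic references, but only works while the string is unchanged since the match.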

Uri Guttman · Aug 9, 2007

L> Not elegant?! It's awesome! Thanks!

it is not elegant as it uses symrefs which is evil. see my other post
for a solution without symrefs.

L> -- that there is a variable called @- and what it does

and see @+ and how perlvar says to use them. my other post shows a full
example without symrefs.

uri

Mirco Wahab · Aug 9, 2007

Larry said:
> I'm using a /g regex in a while loop to capture parenthesized matches
> to meaningful variable names like this:
>
> while (/ (...) ... (...) ... (...)/g) {
>     my ($foo, $bar, $baz) = ($1, $2, $3);
>     ...
> }
> The ($1, $2, $3) part seems inelegant ... is there a more elegant way?

The $n is an idiomatic expression which is
not that bad in my opinion.

You could fake 'named captures' like this:

...
my ($foo, $bar, $baz);
$_ = ' abc def' x 60;

while(/ (...)(?{$foo=$^N}) ... (...)(?{$bar=$^N}) ... (...)(?{$baz=$^N}) /g) {
    print "$foo, $bar, $baz\n";
}

...

or even (watch your braces)

...
while(/ (...) ... (...) ... (...)(?{($foo,$bar,$baz)=($1,$2,$3)})/g) {
    print "$foo, $bar, $baz\n";
}
...

Regards

M.

Brian McCauley · Aug 10, 2007

> L> Not elegant?! It's awesome! Thanks!
>
> it is not elegant as it uses symrefs which is evil.

Using multiple named variables to implement what is logically a
composite data structure (array or hash) is evil.

The only way to access such an evil structure is to use symrefs or eval().
(Of the two, symrefs are the lesser evil.)

In this case $1... are already, in effect, such a structure.

The 'evil' here is not in my code but in the underlying design
decision in early versions of Perl.

An alternative approach using substr($_...) would avoid symrefs but
the evil is still there. The fact that we choose to avert our eyes
does not reduce the evil.

See also:

http://groups.google.co.uk/group/co...read/thread/1ebb17826a236940/1a323f2e1968a83f

> see my other post
> for a solution without symrefs.

Your post does not appear to have propagated, could you re-post it
please.

Uri Guttman · Aug 10, 2007

BM> Using multiple named variables to implement what is logically a
BM> composite data structure (array or hash) is evil.

i would rather put the blame on the text being parsed!

the OP never showed any real text to parse. i have done scalar m//g
loops too but rarely with more than a few grabs so i don't mind the $1
style. if there are too many i would break up the text first into
sections and then parse out the grabs and assign them to a list of
scalars or a hash slice (which is the best way).
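
As a rough illustration of the hash-slice idea, the sketch below collects each match into a hash. The sample string, the field names foo/bar/baz and the @records array are all assumptions for the example.

```perl
use strict;
use warnings;

# Hypothetical sample data for illustration.
my $text = 'A1Z B2Y C3X';
my @records;

while ($text =~ /(\w)(\d)(\w)/g) {
    my %rec;
    @rec{qw(foo bar baz)} = ($1, $2, $3);   # hash slice assignment
    push @records, \%rec;
}

# Print each record's fields in a fixed order.
print join(',', @{$_}{qw(foo bar baz)}), "\n" for @records;
```

The field names live in one qw() list, so adding a capture means touching one place instead of declaring another scalar.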

BM> The 'evil' here is not in my code but in the underlying design
BM> decision in early versions of Perl.

perl6 solves this problem as usual by allowing m//g loops but only
grabbing what is in the regex and allowing assignment to hash elements
among many other things.

BM> An alternative approach using substr($_...) would avoid symrefs but
BM> the evil is still there. The fact that we choose to avert our eyes
BM> does not reduce the evil.

but it looks so much neater with substr.

BM> Your post does not appear to have propagated, could you re-post it
BM> please.

not sure why as i saw it. let it rest as it was just a slight mod of
what is in perlvar about using substr and @- and @+.

uri

demerphq · Oct 2, 2007

> L> I'm using a /g regex in a while loop to capture parenthesized matches
> L> to meaningful variable names like this:
>
> L> while (/ (...) ... (...) ... (...)/g) {
> L>     my ($foo, $bar, $baz) = ($1, $2, $3);
> L>     ...
> L> }
>
> L> The ($1, $2, $3) part seems inelegant ... is there a more elegant way?
>
> you can look at using @+, @- to get the strings via substr. a map call
> on 1 .. $#+ will do it but it is ugly too
>
> perldoc perlvar says this:
> $1 is the same as "substr($var, $-[1], $+[1] - $-[1])"
>
> so this should work (untested):
>
> my ($foo, $bar, $baz) =
>     map substr($var, $-[$_], $+[$_] - $-[$_]), 1 .. $#+ ;
>
> and that map stuff could be put into a sub to clean it up. just pass in
> $var and the @+ and @- globals should still be set. something like this:
>
> sub matches {
>     map substr($_[0], $-[$_], $+[$_] - $-[$_]), 1 .. $#+ ;
> }
>
> my ($foo, $bar, $baz) = matches( $var ) ;
>
> but i would just stick with the assignment of $1, $2 ... as it is the
> cleanest.

The problem with this approach is that it requires you to know what
string @- and @+ are operating on, which is actually somewhere between
difficult and impossible in the case of s///.
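
A small sketch of how this goes wrong with s///. The sample string and variable names are assumptions for illustration. After the substitution, $1 still holds the matched text, but @- and @+ are offsets into the string as it was before the change, so applying them to the modified string picks out something else.

```perl
use strict;
use warnings;

# Hypothetical sample data for illustration.
my $var = 'hello world';
$var =~ s/(world)/X/;           # $var is now 'hello X'

my $from_capture = $1;          # the text that was actually matched
# Offsets refer to the string before the substitution took place.
my $from_offsets = substr($var, $-[1], $+[1] - $-[1]);

print "capture: $from_capture\n";
print "offsets: $from_offsets\n";
```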

One solution that avoids this problem is the following, somewhat
crufty code:

sub matches { eval 'sub { \@_ }->(' . join(", ", map "\$$_", 1 .. $#+ ) . ')' }

Now you can say

my $array=matches();

and have it do the right thing always*, even if we didn't make a copy
of the original string before we used s///.

*Of course the array returned for matches() is only "good" for the
results of a given match.

What we (perl5porters) really should do is provide a special magic
variable that returns the entire string that $1 and friends reference,
so then using @- and @+ would be safe. Unfortunately it's too late for
that to make it into 5.10, although it's possible for 5.10.1 i guess.

Yves

demerphq · Oct 2, 2007

> The $n is an idiomatic expression which is
> not that bad in my opinion.
>
> You could fake 'named captures' like this:

Or use 5.10 when it comes out and make use of its real named
captures.

Yves
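
For reference, a minimal sketch of what the named-capture version can look like on Perl 5.10 or later. The sample string, the group names foo/bar/baz and the @seen array are assumptions for illustration. Named groups are written (?<name>...) and their values are read from the %+ hash, here with a hash slice.

```perl
use strict;
use warnings;
use 5.010;

# Hypothetical sample data for illustration.
my $text = 'A1Z B2Y C3X';
my @seen;

while ($text =~ /(?<foo>\w)(?<bar>\d)(?<baz>\w)/g) {
    # %+ maps each group name to the text it captured in this match.
    my ($foo, $bar, $baz) = @+{qw(foo bar baz)};
    push @seen, "$foo$bar$baz";
    say "$foo, $bar, $baz";
}
```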

William James · Oct 2, 2007

> I'm using a /g regex in a while loop to capture parenthesized matches
> to meaningful variable names like this:
>
> while (/ (...) ... (...) ... (...)/g) {
>     my ($foo, $bar, $baz) = ($1, $2, $3);
>     ...
> }
>
> The ($1, $2, $3) part seems inelegant ... is there a more elegant way?
>
> BTW, don't suggest:
>
> while (my ($foo, $bar, $baz) = / (...) ... (...) ... (...)/g) {
>     ...
> }
>
> That will cause the regex to evaluate in a list context, which changes
> the behavior of /g to parse all of $_ at once, only returning the
> first match and throwing away the rest.

Ruby

scan( / (...) ... (...) ... (...)/ ){ |foo, bar, baz|
    ...
}

szr · Oct 10, 2007

Larry said:
> I'm using a /g regex in a while loop to capture parenthesized matches
> to meaningful variable names like this:
>
> while (/ (...) ... (...) ... (...)/g) {
>     my ($foo, $bar, $baz) = ($1, $2, $3);
>     ...
> }
>
> The ($1, $2, $3) part seems inelegant ... is there a more elegant way?
>
> BTW, don't suggest:
>
> while (my ($foo, $bar, $baz) = / (...) ... (...) ... (...)/g) {
>     ...
> }
>
> That will cause the regex to evaluate in a list context, which changes
> the behavior of /g to parse all of $_ at once, only returning the
> first match and throwing away the rest.

Why not just do something like the following?

my $s = 'A1Z B2Y C3X D4W E5V';

### Inelegant - have to know amount of captures/loop-iteration
while ($s =~ /(\w)(\d)(\w)/g) {
    my ($foo, $bar, $baz) = ($1, $2, $3);
    print "'$foo' '$bar' '$baz'\n";
}

print "\n";

### More elegant - all matches for each iteration go into an array
while (my @matches = $s =~ /\G.*?(\w)(\d)(\w)/) {
    pos($s) = $+[0];    # advance past this match so \G resumes from here
    print "'", join("' '", @matches), "'\n";
}

___OUTPUT___
'A' '1' 'Z'
'B' '2' 'Y'
'C' '3' 'X'
'D' '4' 'W'
'E' '5' 'V'

'A' '1' 'Z'
'B' '2' 'Y'
'C' '3' 'X'
'D' '4' 'W'
'E' '5' 'V'

All you have to do is add \G.*? to the beginning of the regex and
remove g from the end of the regex (the modifier list). Other than that,
you just need to have pos($s) = $+[0]; at the beginning of your loop
(or at least before the end of the loop, though it seems safest to keep
it at the beginning, especially if you do any tests on pos($s)).
