more elegant way to say ($1, $2, $3, $4, ...)?

Larry

I'm using a /g regex in a while loop to capture parenthesized matches
to meaningful variable names like this:

while (/ (...) ... (...) ... (...)/g) {
    my ($foo, $bar, $baz) = ($1, $2, $3);
    ...
}

The ($1, $2, $3) part seems inelegant ... is there a more elegant way?

BTW, don't suggest:

while (my ($foo, $bar, $baz) = / (...) ... (...) ... (...)/g) {
    ...
}

That will cause the regex to evaluate in a list context, which changes
the behavior of /g to parse all of $_ at once, only returning the
first match and throwing away the rest.
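
To make the context difference concrete, here is a minimal sketch. The sample string and the simplified pattern are assumptions for illustration, not the original poster's data. In list context /g returns every capture from every match in one go, so a three-variable assignment keeps only the first three; in scalar context each evaluation picks up where the previous one stopped.

```perl
use strict;
use warnings;

# Hypothetical sample data: two records of three 3-character fields.
my $text = ' abc def ghi jkl mno pqr';

# List context: /g returns all captures of all matches at once, so the
# three-variable assignment below keeps only the first match's captures.
my @all = $text =~ / (...) (...) (...)/g;
my ($foo, $bar, $baz) = @all;
print scalar(@all), " captures; kept: $foo $bar $baz\n";

# Scalar context: each evaluation resumes at pos($text) and finds one match.
my $count = 0;
while ($text =~ / (...) (...) (...)/g) {
    $count++;
    print "match $count: $1 $2 $3\n";
}
```

With this input the list-context match yields six captures, while the scalar-context loop runs twice.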
 
Brian McCauley

L> I'm using a /g regex in a while loop to capture parenthesized matches
L> to meaningful variable names like this:

L> while (/ (...) ... (...) ... (...)/g) {
L>     my ($foo, $bar, $baz) = ($1, $2, $3);
L>     ...
L> }

L> The ($1, $2, $3) part seems inelegant ... is there a more elegant way?

Not that I know of in Perl5 - and believe me I've looked.

You could write a function that returns them

# $$_ is a symbolic reference: with $_ == 1 it reads $1, and so on
sub matches { no strict 'refs'; map $$_ , 1 .. $#- }

while (/ (...) ... (...) ... (...)/g) {
    my ($foo, $bar, $baz) = matches;
}

But this is hardly more elegant.
 
Larry

BM> Not that I know of in Perl5 - and believe me I've looked.

BM> You could write a function that returns them

BM> sub matches { no strict 'refs'; map $$_ , 1 .. $#- }

BM> while (/ (...) ... (...) ... (...)/g) {
BM>     my ($foo, $bar, $baz) = matches;
BM> }

BM> But this is hardly more elegant.

Not elegant?! It's awesome! Thanks!

BTW, I just learned 2 new things:

-- map can take an expr as the first param, not just a block (had to
look that up to see what was going on exactly!)

-- that there is a variable called @- and what it does

Thanks!
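
For anyone else meeting these two features for the first time, a small sketch follows. The input string is a made-up example. It shows what @- holds after a successful match, and that map accepts either an expression followed by a comma or a block.

```perl
use strict;
use warnings;

my $text = 'key=value';   # hypothetical sample input
my (@starts, @with_expr, @with_block);

if ($text =~ /(\w+)=(\w+)/) {
    # $-[0] is the offset where the whole match starts; $-[1], $-[2], ...
    # are the start offsets of the capture groups.
    @starts = @-;
    print "start offsets: ", join(' ', @starts), "\n";
    print "last group index: $#-\n";

    # map EXPR, LIST and map BLOCK LIST give the same result here.
    @with_expr  = map $_ * 2, 1 .. $#-;
    @with_block = map { $_ * 2 } 1 .. $#-;
    print "@with_expr | @with_block\n";
}
```

Here the start offsets come out as 0, 0 and 4, because the whole match and the first group both begin at the start of the string.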
 
Uri Guttman

L> I'm using a /g regex in a while loop to capture parenthesized matches
L> to meaningful variable names like this:

L> while (/ (...) ... (...) ... (...)/g) {
L> my ($foo, $bar, $baz) = ($1, $2, $3);
L> ...
L> }

L> The ($1, $2, $3) part seems inelegant ... is there a more elegant way?

you can look at using @+, @- to get the strings via substr. a map call
on 1 .. $#+ will do it but it is ugly too

perldoc perlvar says this:
$1 is the same as "substr($var, $-[1], $+[1] - $-[1])"

so this should work (untested):

my ($foo, $bar, $baz) =
    map substr($var, $-[$_], $+[$_] - $-[$_]), 1 .. $#+ ;

and that map stuff could be put into a sub to clean it up. just pass in
$var and the @+ and @- globals should still be set. something like this:

sub matches {
    # start at 1: index 0 of @- and @+ describes the whole match
    map substr($_[0], $-[$_], $+[$_] - $-[$_]), 1 .. $#+ ;
}

my ($foo, $bar, $baz) = matches( $var ) ;

but i would just stick with the assignment of $1, $2 ... as it is the
cleanest.

uri
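
A runnable sketch of the substr approach described above, under some assumptions: the sample string and pattern are invented for illustration, the range starts at 1 so that the whole match (index 0 of @- and @+) is skipped, and a group that did not take part in the match comes back as undef rather than triggering a substr warning.

```perl
use strict;
use warnings;

# Return the capture groups of the caller's last successful match,
# cut out of the string passed in as the first argument.
sub matches {
    my ($str) = @_;
    my @groups;
    for my $i (1 .. $#+) {
        push @groups, defined $-[$i]
            ? substr($str, $-[$i], $+[$i] - $-[$i])
            : undef;
    }
    return @groups;
}

my $var = 'A1Z B2Y C3X';   # hypothetical sample input
my @rows;
while ($var =~ /(\w)(\d)(\w)/g) {
    my ($foo, $bar, $baz) = matches($var);
    push @rows, "$foo-$bar-$baz";
}
print "$_\n" for @rows;
```

The sub relies on @- and @+ still describing the caller's match, so it must not run a regex of its own before reading them.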
 
Uri Guttman

L> Not elegant?! It's awesome! Thanks!

it is not elegant as it uses symrefs which is evil. see my other post
for a solution without symrefs.

L> -- that there is a variable called @- and what it does

and see @+ and how perlvar says to use them. my other post shows a full
example without symrefs.

uri
 
Mirco Wahab

Larry said:
L> I'm using a /g regex in a while loop to capture parenthesized matches
L> to meaningful variable names like this:

L> while (/ (...) ... (...) ... (...)/g) {
L>     my ($foo, $bar, $baz) = ($1, $2, $3);
L>     ...
L> }

L> The ($1, $2, $3) part seems inelegant ... is there a more elegant way?

The $n is an idiomatic expression which is
not that bad in my opinion.

You could fake 'named captures' like this:

...
my ($foo, $bar, $baz);
$_ = ' abc def' x 60;

while(/ (...)(?{$foo=$^N}) ... (...)(?{$bar=$^N}) ... (...)(?{$baz=$^N}) /g) {
    print "$foo, $bar, $baz\n";
}

...

or even (watch your braces)

...
while(/ (...) ... (...) ... (...)(?{($foo,$bar,$baz)=($1,$2,$3)})/g) {
    print "$foo, $bar, $baz\n";
}
...


Regards

M.
 
Brian McCauley

L> Not elegant?! It's awesome! Thanks!

UG> it is not elegant as it uses symrefs which is evil.

Using multiple named variables to implement what is logically a
composite data structure (array or hash) is evil.

The only way to access such an evil structure is to use symrefs or eval().
(Of the two, symrefs are the lesser evil.)

In this case $1... are already, in effect, such a structure.

The 'evil' here is not in my code but in the underlying design
decision in early versions of Perl.

An alternative approach using substr($_...) would avoid symrefs but
the evil is still there. The fact that we choose to avert our eyes
does not reduce the evil.

See also:

http://groups.google.co.uk/group/co...read/thread/1ebb17826a236940/1a323f2e1968a83f
UG> see my other post
UG> for a solution without symrefs.

Your post does not appear to have propagated, could you re-post it
please.
 
Uri Guttman

BM> Using multiple named variables to implement what is logically a
BM> composite data structure (array or hash) is evil.

i would rather put the blame on the text being parsed! :)
the OP never showed any real text to parse. i have done scalar m//g
loops too but rarely with more than a few grabs so i don't mind the $1
style. if there are too many i would break up the text first into
sections and then parse out the grabs and assign them to a list of
scalars or a hash slice (which is the best way).
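
A minimal sketch of the hash-slice idea, assuming invented field names and sample text (neither comes from the thread): the captures of each match are assigned to named keys in one statement.

```perl
use strict;
use warnings;

my @fields = qw(foo bar baz);   # hypothetical field names
my $text   = 'A1Z B2Y C3X';     # hypothetical sample input

my @records;
while ($text =~ /(\w)(\d)(\w)/g) {
    my %rec;
    @rec{@fields} = ($1, $2, $3);   # hash slice: one assignment, named keys
    push @records, \%rec;
}

print "$records[1]{foo} $records[1]{bar} $records[1]{baz}\n";
```

The field list lives in one place, so adding a capture means touching the regex and @fields rather than a row of scalar declarations.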

BM> The 'evil' here is not in my code but in the underlying design
BM> decision in early versions of Perl.

perl6 solves this problem as usual by allowing m//g loops but only
grabbing what is in the regex and allowing assignment to hash elements
among many other things.

BM> An alternative approach using substr($_...) would avoid symrefs but
BM> the evil is still there. The fact that we choose to avert our eyes
BM> does not reduce the evil.

but it looks so much neater with substr. :)

BM> Your post does not appear to have propagated, could you re-post it
BM> please.

not sure why as i saw it. let it rest as it was just a slight mod of
what is in perlvar about using substr and @- and @+.

uri
 
demerphq

L> I'm using a /g regex in a while loop to capture parenthesized matches
L> to meaningful variable names like this:

L> while (/ (...) ... (...) ... (...)/g) {
L> my ($foo, $bar, $baz) = ($1, $2, $3);
L> ...
L> }

L> The ($1, $2, $3) part seems inelegant ... is there a more elegant way?

UG> you can look at using @+, @- to get the strings via substr. a map call
UG> on 1 .. $#+ will do it but it is ugly too

UG> perldoc perlvar says this:
UG> $1 is the same as "substr($var, $-[1], $+[1] - $-[1])"

UG> so this should work (untested):

UG> my ($foo, $bar, $baz) =
UG>     map substr($var, $-[$_], $+[$_] - $-[$_]), 1 .. $#+ ;

UG> and that map stuff could be put into a sub to clean it up. just pass in
UG> $var and the @+ and @- globals should still be set. something like this:

UG> sub matches {
UG>     map substr($_[0], $-[$_], $+[$_] - $-[$_]), 1 .. $#+ ;
UG> }

UG> my ($foo, $bar, $baz) = matches( $var ) ;

UG> but i would just stick with the assignment of $1, $2 ... as it is the
UG> cleanest.

The problem with this approach is that it requires you to know what
string @- and @+ are operating on, which is actually somewhere between
difficult and impossible in the case of s///.
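
A small sketch of that pitfall, using a made-up string and substitution: after s///, $1 still holds the captured text, but @- and @+ are offsets into the string as it was before the substitution, so applying them to the modified variable cuts out the wrong characters.

```perl
use strict;
use warnings;

my $var  = 'hello world';        # hypothetical sample input
my $copy = $var;                 # saved before the substitution

$var =~ s/(wor)ld/planet $1/;    # $var is now 'hello planet wor'

my ($start, $len) = ($-[1], $+[1] - $-[1]);
my $from_modified = substr($var,  $start, $len);   # offsets applied to the new string
my $from_copy     = substr($copy, $start, $len);   # offsets applied to the original

print "\$1            = $1\n";
print "from modified = $from_modified\n";
print "from copy     = $from_copy\n";
```

Only the saved copy gives back the same text as $1; without such a copy the original string is no longer available to substr.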

One solution that avoids this problem is the following, somewhat
crufty code:

sub matches { eval 'sub { \@_ }->(' . join(", ", map "\$$_", 1 .. $#+ ) . ')' }

Now you can say

my $array=matches();

and have it do the right thing always*, even if we didn't make a copy
of the original string before we used s///.

*Of course the array returned for matches() is only "good" for the
results of a given match.

What we (perl5porters) really should do is provide a special magic
variable that returns the entire string that $1 and friends reference,
so then using @- and @+ would be safe. Unfortunately it's too late for
that to make it into 5.10, although it's possible for 5.10.1 i guess.

Yves
 
demerphq

MW> The $n is an idiomatic expression which is
MW> not that bad in my opinion.

MW> You could fake 'named captures' like this:

Or use 5.10 when it comes out and make use of its real named
captures. :)

Yves
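
For reference, a minimal sketch of what the 5.10 named captures look like in a /g loop. The sample text and the group names are assumptions for illustration; (?<name>...) and the %+ hash are the 5.10 features themselves.

```perl
use strict;
use warnings;
use 5.010;   # named captures and %+ need Perl 5.10 or later

my $text = 'A1Z B2Y C3X';   # hypothetical sample input
my @seen;
while ($text =~ /(?<foo>\w)(?<bar>\d)(?<baz>\w)/g) {
    # %+ maps each capture name to the text it matched;
    # a hash slice pulls all three out in one assignment.
    my ($foo, $bar, $baz) = @+{qw(foo bar baz)};
    push @seen, "$foo$bar$baz";
}
say for @seen;
```

The names sit next to the groups they describe, so reordering the groups in the regex does not silently shift which variable gets which value.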
 
William James

L> I'm using a /g regex in a while loop to capture parenthesized matches
L> to meaningful variable names like this:

L> while (/ (...) ... (...) ... (...)/g) {
L>     my ($foo, $bar, $baz) = ($1, $2, $3);
L>     ...
L> }

L> The ($1, $2, $3) part seems inelegant ... is there a more elegant way?

L> BTW, don't suggest:

L> while (my ($foo, $bar, $baz) = / (...) ... (...) ... (...)/g) {
L>     ...
L> }

L> That will cause the regex to evaluate in a list context, which changes
L> the behavior of /g to parse all of $_ at once, only returning the
L> first match and throwing away the rest.

Ruby

scan( / (...) ... (...) ... (...)/ ){ |foo, bar, baz|
...
}
 
szr

Larry said:
L> I'm using a /g regex in a while loop to capture parenthesized matches
L> to meaningful variable names like this:

L> while (/ (...) ... (...) ... (...)/g) {
L>     my ($foo, $bar, $baz) = ($1, $2, $3);
L>     ...
L> }

L> The ($1, $2, $3) part seems inelegant ... is there a more elegant way?

L> BTW, don't suggest:

L> while (my ($foo, $bar, $baz) = / (...) ... (...) ... (...)/g) {
L>     ...
L> }

L> That will cause the regex to evaluate in a list context, which changes
L> the behavior of /g to parse all of $_ at once, only returning the
L> first match and throwing away the rest.

Why not just do something like the following?

my $s = 'A1Z B2Y C3X D4W E5V';

### Inelegant - have to know the number of captures per loop iteration
while ($s =~ /(\w)(\d)(\w)/g) {
    my ($foo, $bar, $baz) = ($1, $2, $3);
    print "'$foo' '$bar' '$baz'\n";
}

print "\n";

### More elegant - all captures for each iteration go into an array
while (my @matches = $s =~ /\G.*?(\w)(\d)(\w)/) {
    pos($s) = $+[0];
    print "'", join("' '", @matches), "'\n";
}

___OUTPUT___
'A' '1' 'Z'
'B' '2' 'Y'
'C' '3' 'X'
'D' '4' 'W'
'E' '5' 'V'

'A' '1' 'Z'
'B' '2' 'Y'
'C' '3' 'X'
'D' '4' 'W'
'E' '5' 'V'


All you have to do is add \G.*? to the beginning of the regex, and
remove g from the end of the regex (the modifier list). Other than that,
you just need to have pos($s) = $+[0]; at the beginning of your loop
(or at least before the end of the loop, though it seems safest to keep
it at the beginning, especially if you do any tests on pos($s)).
:)
 
