Hi all,
In perlre, the documentation for (?{...}) says:
"Due to an unfortunate implementation issue, the Perl code contained
in these blocks is treated as a compile time closure that can have
seemingly bizarre consequences when used with lexically scoped
variables inside of subroutines or loops. There are various
workarounds for this, including simply using global variables instead.
If you are using this construct and strange results occur then check
for the use of lexically scoped variables."
I'm indeed seeing weird things, mainly variables getting undefined for
no reason.
Any ideas on what the "various workarounds" that TFM is speaking of
are? I'd rather stay as far away from global vars as I can.
Thanks,
--Ala
Probably the key phrase is "the Perl code contained in these blocks";
Imbedding code inside of a regular expression really has limited use
unless used in conjunction with a conditional or to immediatly store
the value of the last capture group.
It gets better, you can't nest another regular expression in the block
(since the engine is not reentrant).
Seemingly better results happen when you call a named subroutine from the
code block. Here, lexicals seem to work and m// seems to work, but not s///
(the latter causes a crash on my machine, so be carefull not to call a
Perl function that uses the regex engine).
It seems the lack of explanation and numerous caveats are meant as a warning
to stay clear.
Below are a few examples.
The first one trys a lexical within the code block (not too good).
The second calls a subroutine (that does lexicals) from withing the code block.
The third is an example of somebody's IP parser I cleaned up that
shows extended conditional and code embedding (using 5.10).
Anyway, its hit or miss with extended/experimental stuff.
-sln
## ex. 1
================
use strict;
use warnings;
my $string = "yes no yes no";
while ( $string =~ /yes(?{my $test = printmsg(); print "test = '$test'\n";})/g) {}
sub printmsg
{
print "found yes\n";
'';
}
__END__
Output:
found yes
test = ''
Use of uninitialized value $string in pattern match (m//) at dd.pl line 6.
## ex. 2
================
use strict;
use warnings;
my $string = "yes no yes no";
my $test = "this is test";
while ( $string =~ /yes(?{$test = printmsg($test);})/g) {}
print "test = '$test'\n";
sub printmsg
{
my $param = shift;
my $count = 2;
while ($count--) {
print "($count)found yes, was passed '$param'\n";
}
return '';
## Cannot do regex if being called from embedded code
}
__END__
Output:
(1)found yes, was passed 'this is test'
(0)found yes, was passed 'this is test'
(1)found yes, was passed ''
(0)found yes, was passed ''
test = ''
## ex. 3
================
## IpMatch_5_10.pl
## (To test new Perl 5.10 conditionals)
##
require 5.10.0; # 5.10 only, new extended regex
use strict;
use warnings;
my $Octlimit = 255;
my $OctetPat = qr/
\b (\d{1,3}) \b # capture a 3 digit number on boundries
(?(?{ # start conditional code block
# print "$^N\n"; # uncomment to print what matched last
$^N > $Octlimit # condition: is number > octet limit ?
}) # end code block
(*FAIL) # yes, condition is true, force pattern to fail for this number
)
/x;
my $dottedQuadPat = qr/ # Capture quad parts to named variables in the %+ hash
\s*
(?<O1>$OctetPat)
\.
(?<O2>$OctetPat)
\.
(?<O3>$OctetPat)
\.
(?<O4>$OctetPat)
\s*
/x;
my $DressedIPv4Pat = qr/ # Capture dressed quad parts to named variables in the %+ hash
\s* \[
$dottedQuadPat
\] \s*
/x;
while (my $ip = <DATA>)
{
chomp $ip;
next if !length($ip);
print "IP:\n'$ip'\n";
## Match all valid ip octets
my @match = $ip =~/$OctetPat/g;
if (@match)
{
print " ++ matched single octets\n";
for my $val (@match) {
print " $val\n";
}
} else {
print " -- no single octet match\n";
}
## Match dotted quad ip
if ($ip =~ /^$dottedQuadPat$/)
{
print " ++ matched quad #.#.#.#\n";
foreach my $key (sort keys %+) {
print " $key = $+{$key}\n";
}
} else {
print " -- no strict quad match\n";
}
## Match dressed dotted quad ip
if ($ip =~ /^$DressedIPv4Pat$/)
{
print " ++ matched dressed quad [#.#.#.#]\n";
foreach my $key (sort keys %+) {
print " $key = $+{$key}\n";
}
} else {
print " -- no strict dressed quad match\n";
}
}
__DATA__
1.12.123.254.255.256.4872
1.12.123.254
[123.254.255.255]