Ian
I have something strange happening with a recursive regexp compiled
with qr//x. It is a regular expression to match individual
single-quoted, double-quoted and unquoted strings, i.e. "string",
'string' and string.
It works fine when the sub-parts of it are global variables or
"local" variables, but if I change them to "my" variables, they
suddenly stop matching correctly (or at least start matching
differently). Does anybody have any idea why changing to "my"
variables would affect it this way?
I get the same behaviour using ActivePerl 5.6.1 and Perl 5.8.1 on
Knoppix.
Other things I'd like to know, if anybody has any ideas:
Is there a simpler way to write a regexp for this kind of thing?
Why does perl crash with some recursive regexps?
Is there any particular reason for the warning generated when this
script is run using perl -W?
I use test input something like:
aaa bbb "ccc"'ddd"ddd'"eee'eee" f\ \ \ ff
Here's the program. If you change the first two variables to "my"
variables it stops working; changing the others doesn't seem to
affect it.
#!perl
# Double-quoted-string data regexp
$dStringData = qr/
    ([^"\\]|\\.)+ (??{$dStringData})
    |
    "
/x;

# Single-quoted-string data regexp
$sStringData = qr/
    ([^'\\]|\\.)+ (??{$sStringData})
    |
    '
/x;

# Characters that are allowed in unquoted strings
$token = qr/([^\s\\'"]|\\.)/x;

# Unquoted-strings broken up by spaces regexp
$uStringData = qr/
    (??{$token})+ (??{$uStringData})
    |
    \B|\b
/x;

# Matches double-quoted, single-quoted or unquoted strings
$string = qr/
    (
        (??{$token}) (??{$uStringData})
        |
        " (??{$dStringData})
        |
        ' (??{$sStringData})
    )
/x;
# Test program to identify "STRING"s or 'STRING's or STRINGs in the
# input
while (<>) {
    my @strings;
    # remove them all one by one
    while (/$string/) {
        push @strings, $1;
        s/$string//;
    }
    # print out all of them one by one
    my $counter = 0;
    foreach (@strings) {
        print "$counter = [$_]\n";
        $counter++;
    }
}
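
A reduced sketch of the "my" change described above, in case it helps
anyone reproduce it. It rests on one documented Perl scoping rule: a
"my" variable only becomes visible in the statement after the one that
declares it, so inside my $x = qr/...(??{ $x }).../ the inner $x is
still the package variable $x. Whether that fully explains the
behaviour on the Perl versions mentioned above is an assumption, not
something verified on them. The names $gData, $lData and matched(),
and the <GLOBAL> sentinel, are made up for this illustration.

```perl
#!perl
# Deliberately no "use strict": like the program above, this sketch relies
# on undeclared names being package (global) variables.

# Hypothetical helper: return the text matched by $re in $text, or undef.
sub matched {
    my ($re, $text) = @_;
    return $text =~ $re ? $& : undef;
}

# Package variable. When the pattern runs, $gData already holds the pattern
# itself, so (??{ $gData }) recurses until it reaches the closing quote.
$gData = qr/ [^"\\]+ (??{ $gData }) | " /x;

# Put a recognisable sentinel pattern in the *package* variable $lData ...
$lData = qr/<GLOBAL>/;

# ... then declare a lexical of the same name. The lexical is not in scope
# inside its own initialiser, so the $lData in (??{ }) is the sentinel above.
my $lData = qr/ [^"\\]+ (??{ $lData }) | " /x;

# Collect the results so the difference is easy to inspect.
my %result = (
    global_quoted    => matched($gData, 'abc"'),
    lexical_quoted   => matched($lData, 'abc"'),
    lexical_sentinel => matched($lData, 'abc<GLOBAL>'),
);

foreach my $name (sort keys %result) {
    print "$name = [$result{$name}]\n";
}
```

If the scoping rule applies as described, the package-variable pattern
should consume the whole of abc" while the "my" pattern only recurses
into whatever the package variable of the same name happens to hold.
Declaring the variable first (my $lData; on its own line) and assigning
the qr// in a second statement would be the obvious thing to try next.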