Dr.Ruud said:
(e-mail address removed) wrote:
The 'Must occur ... no more than m times' is not accurate.
It is accurate, when you realize that it is talking about the
characters that were actually part of the matched string.
Characters outside the matched string are irrelevant when
the match succeeds.
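A minimal sketch of that point, not part of the original thread: the unanchored pattern succeeds on 100 'a' characters because only the matched substring has to respect the bounds. The anchored variant is my own addition for contrast; it fails because it forces the whole string to fit within the bounds.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $s = 'a' x 100;

# Unanchored: the match succeeds, and the matched substring ($&)
# contains no more than 50 characters.
my $unanchored_len = $s =~ /a{10,50}/ ? length($&) : -1;

# Anchored (added for contrast): the whole string would have to be
# 10 to 50 'a' characters long, so the match fails.
my $anchored_ok = $s =~ /^a{10,50}$/ ? 1 : 0;

print "unanchored match length: $unanchored_len\n";   # 50
print "anchored match succeeded: $anchored_ok\n";     # 0
```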
#!/usr/bin/perl -w
use strict;
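my $s = 'a' x 100;   # 100 characters, more than the upper bound of 50
sub run {
    local ($,, $\) = (' ', "\n");   # print fields space-separated, newline-terminated
    my $re; ($re, $_) = @_;
    s/$re/$1/;                      # replace the whole match with capture group 1
    print length, length($1);       # length of the result, then length of $1
}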
run 'a{10,50}?(.*)' , $s;
run 'a{10,50}?(.*?)a', $s;
run 'a{10,50}?(.*?)' , $s;
run 'a{10,50}(.*?)' , $s;
run 'a{10,50}(.*)' , $s;
output:
90 90
89 0
90 0
50 0
50 50
run 'a{10,50}?(.*)' , $s;
First part matches minimum; 'a'x10.
Second part matches rest of string; 'a'x90.
Replacing first+second with just second = 'a'x90.
Expected result of "90 90" = yes.
run 'a{10,50}?(.*?)a', $s;
First part matches minimum; 'a'x10.
Second part matches the null string.
Third part matches 11th a.
Replacing first+second+third with just second leaves the
89 characters that were not part of the overall match = 'a'x89.
Expected result of "89 0" = yes.
run 'a{10,50}?(.*?)' , $s;
First part matches minimum; 'a'x10.
Second part matches the null string.
Replacing first+second with just second leaves the
90 characters that were not part of the overall match = 'a'x90.
Expected result of "90 0" = yes.
run 'a{10,50}(.*?)' , $s;
First part matches maximum; 'a'x50.
Second part matches the null string.
Replacing first+second with just second leaves the
50 characters that were not part of the overall match = 'a'x50.
Expected result of "50 0" = yes.
run 'a{10,50}(.*)' , $s;
First part matches maximum; 'a'x50.
Second part matches rest of string; 'a'x50.
Replacing first+second with just second = 'a'x50.
Expected result of "50 50" = yes.
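The arithmetic behind each "Expected result" line can be checked mechanically: the substituted string has the original length, minus the length of the whole match, plus the length of $1. A minimal sketch, assuming a helper named check (hypothetical, not from the thread) and Perl's built-in @- and @+ offset arrays:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Returns (length after substitution, length of $1, predicted length).
sub check {
    my ($re, $str) = @_;
    my $before = length $str;
    (my $copy = $str) =~ s/$re/$1/ or return;
    my $matched  = $+[0] - $-[0];   # length of the whole match
    my $captured = $+[1] - $-[1];   # length of capture group 1
    return (length($copy), $captured, $before - $matched + $captured);
}

for my $re ('a{10,50}?(.*)', 'a{10,50}?(.*?)a', 'a{10,50}?(.*?)',
            'a{10,50}(.*?)', 'a{10,50}(.*)') {
    my ($after, $cap, $predicted) = check($re, 'a' x 100);
    printf "%-18s after=%d \$1=%d predicted=%d\n", $re, $after, $cap, $predicted;
}
```

For every pattern the "after" and "predicted" columns agree, which is the same bookkeeping the walkthrough does by hand.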
The s/$re/$1/ just confuses things. This is better:
#!/usr/bin/perl -w
use strict;
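my $s = 'a' x 100;   # 100 characters, more than the upper bound of 50
sub run {
    my $re; ($re, $_) = @_;
    /$re/;                          # match only; $_ is left unchanged
    # $1 and $2 are the two capture groups; $' is the text after the match
    print "\$1='$1' \$2='$2' rest=|$'|\n";
}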
run '(a{10,50}?)(.*)' , $s;
run '(a{10,50}?)(.*?)a', $s;
run '(a{10,50}?)(.*?)' , $s;
run '(a{10,50})(.*?)' , $s;
run '(a{10,50})(.*)' , $s;
output (long lines wrapped):
$1='aaaaaaaaaa'
$2='aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'
rest=||
$1='aaaaaaaaaa' $2=''
rest=|aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa|
$1='aaaaaaaaaa' $2=''
rest=|aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa|
$1='aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa' $2=''
rest=|aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa|
$1='aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'
$2='aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa' rest=||
This shows that /a{10,50}?/ matches the first 10 characters of the
string and /a{10,50}/ matches the first 50 characters of the string.
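The same comparison is easier to read when lengths are printed instead of the raw strings. A minimal sketch, assuming a helper named lengths (my own name, not from the thread); it derives the length of the unmatched tail from $+[0] rather than from $'.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Returns (length of $1, length of $2, length of the text after the match).
sub lengths {
    my ($re, $str) = @_;
    $str =~ /$re/ or return;
    return (length($1), length($2), length($str) - $+[0]);
}

for my $re ('(a{10,50}?)(.*)',    # 10 90 0
            '(a{10,50}?)(.*?)a',  # 10 0 89
            '(a{10,50}?)(.*?)',   # 10 0 90
            '(a{10,50})(.*?)',    # 50 0 50
            '(a{10,50})(.*)') {   # 50 50 0
    my ($g1, $g2, $rest) = lengths($re, 'a' x 100);
    print "$re: \$1=$g1 \$2=$g2 rest=$rest\n";
}
```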
-Joe