I read up on this on the www and I found ideas like
if ( /\b([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\b/ ) ...
which is pretty indecipherable at a glance and, in general, not
elegant in any sense.
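For what it's worth, the same alternation is easier to read when it is spelled out under /x with one branch per line. This is only a sketch, assuming the goal is to match a whole-word integer from 0 to 255. The name $octet is made up for illustration, and the group is non-capturing here, unlike the one-liner.

```perl
use strict;
use warnings;

# Hypothetical name: $octet matches a whole-word integer from 0 to 255.
my $octet = qr/
    \b
    (?:
        25[0-5]         # 250-255
      | 2[0-4][0-9]     # 200-249
      | 1[0-9][0-9]     # 100-199
      | [1-9][0-9]      # 10-99
      | [0-9]           # 0-9
    )
    \b
/x;

for my $text ('value 42 here', 'value 255 here', 'value 256 here') {
    print "'$text': ", ( $text =~ $octet ? "match" : "no match" ), "\n";
}
```

Because of the \b on both sides, the order of the branches does not matter here.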
I generally do something like
if ( /(\d+)/ && $1 > 256 && $1 < 1024 )
Which to me is a lot more readable at a glance but, like the example
above, not overly elegant.
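Wrapped in a small sub, the capture-and-compare idea might look like the sketch below. The name first_number_between is made up, the bounds are exclusive just as in the one-liner, and only the first run of digits in the string is tested.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the first digit run in $text is a number
# strictly between $lo and $hi (exclusive bounds, as in the one-liner).
sub first_number_between {
    my ($text, $lo, $hi) = @_;
    return unless $text =~ /(\d+)/;    # no digits at all
    return $1 > $lo && $1 < $hi;       # numeric comparison, not string
}

for my $text ('0001023 widgets', '256 widgets', 'port 8080') {
    my $verdict = first_number_between($text, 256, 1024) ? 'in range' : 'out of range';
    print "'$text': $verdict\n";
}
```

Leading zeros are harmless here because the comparison is numeric.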
But what I'd REALLY like to do is, similar to the trick for numeric
sort, a way to do it in the regex like
/[256-1024]/ # but force it to be numeric, not literal, perhaps with a switch
Thoughts, Masters?
A numeric range test in the spirit of /[256-1024]/ is generally possible.
It has limitations that affect the surrounding expressions, but they
can be worked around and the approach functionally generalized (again
within specific limitations).
-sln
-----------------------
use strict;
use warnings;
my $str = '0001023 widgets';
# Inline code in regexes is the way of the future and is definitely
# going to happen (see Perl 6 regexes).
# It allows parameter checking and is useful when the source
# has extended data to be analyzed by a regex in one expression.
if ($str =~ / \b (\d+) \b
              (?(?{$^N > 256 && $^N < 1024})   # is this number between 256-1024?
                                               # yes, continue processing
                |
                  (*FAIL)                      # no, fail outright
              )
              # more expressions here ..
              \s*
              (.+)
            /x )
{
    print "Number: '$1', Type: '$2'\n";
}
else {
    print "failed\n";
}
print "\n";
# This converts the \d+ in the source to a single UTF-8 character.
# That character can then be checked against a HEX code point range
# in a character class.
# Even though the source is decimal, '1023', when its digits are read
# as hex and converted to a character like "\x{1023}", its code point
# will be correctly matched within a regex character class range.
# (Strings of decimal digits sort the same whether read as decimal or hex.)
# Example: "\x{1023}" =~ /[\x{257}-\x{1023}]/ will match.
# And, for decimal-digit N, only "\x{N}" where N is between 257-1023 will match.
for (0 .. 4096)
{
    # Construct a fake string using the current counter.
    # In reality, you have to parse the source string and do the conversion
    # so that you end up doing something like this:
    #   $src =~ /^(.*?)\b(\d+)\b(.*?)$/
    #   eval "\$temp_src = \"$1\\x{$2}$3\" ";
    # Then use the $temp_src in place of the $str below.
    my $padded_string = "000$_";   # the extra '000' padding is just a test
    eval "\$str = \"\\x{$padded_string} widgets\" ";
    if ( $str =~ /^ ([\x{257}-\x{1023}])
                    \s*
                    (.+)
                  /x )
    {
        print "Number: '$padded_string', Type: '$2'\n";
    }
}
__END__
Output
------------
Number: '0001023', Type: 'widgets'
Number: '000257', Type: 'widgets'
Number: '000258', Type: 'widgets'
Number: '000259', Type: 'widgets'
Number: '000260', Type: 'widgets'
Number: '000261', Type: 'widgets'
Number: '000262', Type: 'widgets'
Number: '000263', Type: 'widgets'
Number: '000264', Type: 'widgets'
Number: '000265', Type: 'widgets'
Number: '000266', Type: 'widgets'
Number: '000267', Type: 'widgets'
...
...
Number: '0001012', Type: 'widgets'
Number: '0001013', Type: 'widgets'
Number: '0001014', Type: 'widgets'
Number: '0001015', Type: 'widgets'
Number: '0001016', Type: 'widgets'
Number: '0001017', Type: 'widgets'
Number: '0001018', Type: 'widgets'
Number: '0001019', Type: 'widgets'
Number: '0001020', Type: 'widgets'
Number: '0001021', Type: 'widgets'
Number: '0001022', Type: 'widgets'
Number: '0001023', Type: 'widgets'
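
A footnote to the second approach: the string eval can be avoided with chr(hex(...)), and the parsing step described in the comments can be folded into the same helper. This is a rough sketch only. match_in_range is a made-up name, the range is the hard-coded 257-1023 from above, and it assumes the number is the first thing in the string and short enough for hex() to handle without warnings.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the script above): returns (digits, rest)
# when the leading number of $src lies in 257..1023, else an empty list.
sub match_in_range {
    my ($src) = @_;

    # Find the first whole-word run of decimal digits.
    my ($digits) = $src =~ /\b(\d+)\b/ or return;

    # Read those digits as hex and swap in the character with that code
    # point, e.g. '0001023' becomes "\x{1023}". No string eval needed.
    (my $tmp = $src) =~ s/\b\d+\b/chr(hex($digits))/e;

    # Same range test as the character class in the script above.
    if ( $tmp =~ /^ ([\x{257}-\x{1023}]) \s* (.+) /x ) {
        return ($digits, $2);
    }
    return;
}

for my $src ('0001023 widgets', '256 widgets', '1024 widgets', '300 gears') {
    my ($num, $type) = match_in_range($src) or next;
    print "Number: '$num', Type: '$type'\n";
}
```

Of the four test strings, only '0001023 widgets' and '300 gears' get printed. The original digits are kept in a separate variable, so the converted character never has to be printed.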