Case-insensitive matching on compiled regex?

usenet

Kindly consider this sample code which illustrates my question:

#!/usr/bin/perl
use strict; use warnings;

my $rx = qr{[A-C]};

while (<DATA>) {
    # print if m{[A-C]}i;
    print if m{$rx}i;
}

__DATA__
ABC
abc
def

I expect both print statements to have the same functionality; the only
difference is that the second statement uses a compiled regex.
However, the second print does not match data 'abc', even though I am
using the 'i' regex modifier (whereas the first statement matches both
'ABC' and 'abc', as I expected).

How can I apply case-insensitive matching to a compiled regular
expression, without needing to do something awful like
my $rx = qr{[A-Ca-c]}?

Thanks!
 
Dr.Ruud

(e-mail address removed) wrote:
> #!/usr/bin/perl
> use strict; use warnings;
>
> my $rx = qr{[A-C]};
>
> while (<DATA>) {
>     # print if m{[A-C]}i;
>     print if m{$rx}i;
> }
>
> __DATA__
> ABC
> abc
> def
>
> I expect both print statements to have the same functionality; the
> only difference is that the second statement uses a compiled regex.
> However, the second print does not match data 'abc', even though I am
> using the 'i' regex modifier (whereas the first statement matches both
> 'ABC' and 'abc', as I expected).
>
> How can I apply case-insensitive matching to a compiled regular
> expression, without needing to do something awful like
> my $rx = qr{[A-Ca-c]}?

Can't you do

my $rx = qr/[A-C]/i ;

in the first place?

The modifiers are compiled-in, so you can't change them afterwards; you
need to compile a new expression:

#!/usr/bin/perl
use strict;
use warnings;

my $r_ = '[A-C]';
my $rx;

$rx = qr{$r_} ;
print "$rx\n" ;

$rx = qr{$r_}i ;
print "$rx\n" ;

$rx = qr{$rx} ;
print "$rx\n" ;

$rx = qr{$rx}s ;
print "$rx\n" ;

$rx = qr{$r_}i ;
print "$rx\n" ;
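
Applied to the loop from the original post, that advice might look like the sketch below. The names $src, $rx_cs and $rx_ci are only illustrative, and an in-memory list stands in for the DATA section; the idea is simply to keep the pattern source as a string and compile it once per set of modifiers.

```perl
use strict;
use warnings;

my $src   = '[A-C]';       # pattern source kept as a plain string
my $rx_cs = qr{$src};      # compiled case-sensitive
my $rx_ci = qr{$src}i;     # compiled again, this time with /i

# Stand-in for the DATA section of the original script.
my @lines = ('ABC', 'abc', 'def');

my @hits_cs = grep { $_ =~ $rx_cs } @lines;
my @hits_ci = grep { $_ =~ $rx_ci } @lines;

print "case-sensitive:   @hits_cs\n";
print "case-insensitive: @hits_ci\n";
```

Only 'ABC' should pass the case-sensitive version, while both 'ABC' and 'abc' should pass the /i version.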
 
it_says_BALLS_on_your forehead

Mirco said:
> Hi,
>
> [snip]
>> my $rx = qr{[A-C]};
>>
>> while (<DATA>) {
>>     # print if m{[A-C]}i;
>>     print if m{$rx}i;
>> I expect both print statements to have the same functionality; the only
>> difference is that the second statement uses a compiled regex.
>> However, the second print does not match data 'abc', even though I am
>> using the 'i' regex modifier (whereas the first statement matches both
>> 'ABC' and 'abc', as I expected).
>
> The i modifier is already set (to _none_) in the compiled
> regex; you can't modify this later. You can only say how to
> _use_ the regex (like /g etc.)
>
> Consider:
> my $r1 = qr/[A-C]/; print $r1, "\n";
>
> my $r2 = qr/[A-C]/i; print $r2, "\n";
>
> Prints:
> (?-xism:[A-C])
> (?i-xsm:[A-C])
>
> The first has all options (/ixsm) disabled,
> the second one has the /i set.
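
A small sketch of the distinction drawn above between compiled-in modifiers and use-time modifiers. The sample strings are made up for illustration: /g given at match time still takes effect on a compiled regex, while /i given at match time does not reach inside it.

```perl
use strict;
use warnings;

my $rx = qr/[A-C]/;            # compiled without /i

# /g only says how to use the regex, so it still applies here.
my @all = "A-b-C" =~ /$rx/g;   # collects every match in the string

# /i at use time has no effect on the already-compiled part.
my $hit = "abc" =~ /$rx/i ? 1 : 0;

print "matches with /g: @all\n";
print "abc matched with use-time /i: $hit\n";
```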

Surprisingly (to me anyway), a lexical modifier doesn't work either:
=====================

use strict; use warnings;

my $rx = qr{[A-C]};

while (<DATA>) {
    # print if m{[A-C]}i;
    print if m{(?i:$rx)};
}

__DATA__
ABC
abc
def

===================

__OUTPUT__
ABC
 
Dr.Ruud

it_says_BALLS_on_your forehead wrote:
> Surprisingly (to me anyway), a lexical modifier doesn't work either:

Print the resulting regex, and you'll see why:

$ perl -e 'print qr/(?-xism:[A-C])/i'
(?i-xsm:(?-xism:[A-C]))

A compiled regex stringifies with its own flag group, (?flags:...).
Wrapping it in another group doesn't change that "local" behaviour. The
benefit is that you can have a regex that matches case-insensitively in
only part of the pattern.

#!/usr/bin/perl
use strict;
use warnings;

while ( <DATA> )
{
    print /\A(?:[A-Z][.]?)+ [A-Z](?i:[a-z]+)$/m ? '+ ' : '- ' ;
    print
}

__DATA__
ABC McClean
A.B.C McClean
A.B.C. McClean
J Jones
J. Jones
uri guttman
A. Sinan Unur
Dr.Ruud
Ruud H.G. van Tol
DJ Stunks
brian d foy
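
For the original question (adding /i to a regex that has already been compiled), one possible approach is sketched below. It assumes Perl 5.10 or later, where the re module exports regexp_pattern; the variable names are illustrative. The sketch first shows that wrapping the compiled regex does not help, then recovers the pattern source and recompiles it with /i.

```perl
use strict;
use warnings;
use re qw(regexp_pattern);   # assumption: Perl 5.10 or later

my $rx = qr{[A-C]};          # compiled without /i

# Wrapping the compiled regex does not help: its own flags win.
my $wrapped = qr/(?i:$rx)/;

# Recover the pattern source and the modifiers it was compiled with,
# then build a fresh regex from that source with /i added.
my ($pat, $mods) = regexp_pattern($rx);
my $src  = "(?${mods}:${pat})";
my $rx_i = qr/$src/i;

for my $s (qw(ABC abc def)) {
    printf "%-3s  wrapped=%d  recompiled=%d\n",
        $s, ($s =~ $wrapped ? 1 : 0), ($s =~ $rx_i ? 1 : 0);
}
```

This is still a new compilation, as noted earlier in the thread; it just avoids having to keep the pattern string around separately.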
 
it_says_BALLS_on_your forehead

Dr.Ruud said:
> it_says_BALLS_on_your forehead wrote:
>> Surprisingly (to me anyway), a lexical modifier doesn't work either:
>
> Print the resulting regex, and you'll see why:
>
> $ perl -e 'print qr/(?-xism:[A-C])/i'
> (?i-xsm:(?-xism:[A-C]))

Understood, qr binds most tightly by virtue of the fact that the
'variable' is essentially atomic (sorta). It would be neat if qr could
be supplied with a string, and an optional set of flag parameters via a
hash.
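
Something along those lines can be approximated with a small wrapper. The sketch below is purely hypothetical: make_rx is not part of Perl or any module, and it only handles the i, m, s and x flags by turning them into an inline (?flags:...) group before compiling.

```perl
use strict;
use warnings;

# make_rx is a hypothetical helper, not part of Perl.
# It takes a pattern string plus a hash of flag => true/false pairs.
sub make_rx {
    my ($pattern, %opt) = @_;
    # Keep only the flags that were switched on, in a fixed order.
    my $flags = join '', grep { $opt{$_} } qw(i m s x);
    my $src   = "(?${flags}:${pattern})";
    return qr/$src/;
}

my $plain  = make_rx('[A-C]');
my $nocase = make_rx('[A-C]', i => 1);

print "plain matches abc:  ", ('abc' =~ $plain  ? 'yes' : 'no'), "\n";
print "nocase matches abc: ", ('abc' =~ $nocase ? 'yes' : 'no'), "\n";
```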
 
it_says_BALLS_on_your forehead

it_says_BALLS_on_your forehead said:
> Dr.Ruud said:
>> it_says_BALLS_on_your forehead wrote:
>>> Surprisingly (to me anyway), a lexical modifier doesn't work either:
>>
>> Print the resulting regex, and you'll see why:
>>
>> $ perl -e 'print qr/(?-xism:[A-C])/i'
>> (?i-xsm:(?-xism:[A-C]))
>
> Understood, qr binds most tightly by virtue of the fact that the
> 'variable' is essentially atomic (sorta). It would be neat if qr could
> be supplied with a string, and an optional set of flag parameters via a
> hash.

nvm...i'm on crack. just set the flags after the last string delimiter,
as was pointed out by Mirco. duh!
 
Dr.Ruud

it_says_BALLS_on_your forehead wrote:
> Dr.Ruud:
>> it_says_BALLS_on_your forehead:
>>> Surprisingly (to me anyway), a lexical modifier doesn't work either:
>>
>> Print the resulting regex, and you'll see why:
>>
>> $ perl -e 'print qr/(?-xism:[A-C])/i'
>> (?i-xsm:(?-xism:[A-C]))
>
> Understood, qr binds most tightly by virtue of the fact that the
> 'variable' is essentially atomic (sorta). It would be neat if qr could
> be supplied with a string, and an optional set of flag parameters via
> a hash.

I just installed and tested Regexp::Optimizer, but it did quite the
opposite of what I hoped for:

#!/usr/bin/perl
use strict ;
use warnings ;
# use lib "$ENV{HOME}/lib" ;

use Regexp::Optimizer ;

my $RE_cell = qr/ [A-Z]+ [0-9]+ /x ;
my $RE_range = qr/ $RE_cell (?: : $RE_cell )? /x ;
print "$RE_range\n" ;

my $o = Regexp::Optimizer->new ;
my $re = $o->optimize( $RE_range ) ;

print "$re\n" ;
__END__

(?x-ism: (?x-ism: [A-Z]+ [0-9]+ ) (?: : (?x-ism: [A-Z]+ [0-9]+ ) )? )

(?-xism:(?x-ism: (?x-ism: [A-Z]+ [0-9]+ ) (?: : (?x-ism: [A-Z]+ [0-9]+ ) )? ))
 
Dr.Ruud

it_says_BALLS_on_your forehead wrote:
> nvm...i'm on crack. just set the flags after the last string
> delimiter, as was pointed out by Mirco. duh!

Sorry, what do you mean?
 
it_says_BALLS_on_your forehead

Dr.Ruud said:
> Sorry, what do you mean?

So qr works as follows:
qr/STRING/

...I think I was unclear in referring to the slashes as string
delimiters. Perhaps I should have called them STRING boundaries?

I meant that it is easy to override the default flags set by qr
by doing what Mirco Wahab did earlier in the thread. I copy/pasted it
below.

from Mirco Wahab:

> Consider:
> my $r1 = qr/[A-C]/; print $r1, "\n";
>
> my $r2 = qr/[A-C]/i; print $r2, "\n";
>
> Prints:
> (?-xism:[A-C])
> (?i-xsm:[A-C])
>
> The first has all options (/ixsm) disabled,
> the second one has the /i set.
 
Dr.Ruud

it_says_BALLS_on_your forehead wrote:
> Consider:
> my $r1 = qr/[A-C]/; print $r1, "\n";
> my $r2 = qr/[A-C]/i; print $r2, "\n";
>
> Prints:
> (?-xism:[A-C])
> (?i-xsm:[A-C])
>
> The first has all options (/ixsm) disabled,
> the second one has the /i set.

You should have read that in perlop, a while ago.
:)

qr/STRING/imosx
This operator quotes (and possibly compiles) its STRING as a
regular expression. STRING is interpolated the same way as
PATTERN in "m/PATTERN/". If "'" is used as the delimiter, no
interpolation is done. Returns a Perl value which may be used
instead of the corresponding "/STRING/imosx" expression.

etc.
 
it_says_BALLS_on_your forehead

Dr.Ruud said:
> it_says_BALLS_on_your forehead wrote:
>> Consider:
>> my $r1 = qr/[A-C]/; print $r1, "\n";
>> my $r2 = qr/[A-C]/i; print $r2, "\n";
>>
>> Prints:
>> (?-xism:[A-C])
>> (?i-xsm:[A-C])
>>
>> The first has all options (/ixsm) disabled,
>> the second one has the /i set.
>
> You should have read that in perlop, a while ago.
> :)
>
> qr/STRING/imosx
> This operator quotes (and possibly compiles) its STRING as a
> regular expression. STRING is interpolated the same way as
> PATTERN in "m/PATTERN/". If "'" is used as the delimiter, no
> interpolation is done. Returns a Perl value which may be used
> instead of the corresponding "/STRING/imosx" expression.

yes. i'm a dope.
 
