Basic pattern matching - baffled

Xainin

Help! I don't understand why this script:

#!perl -w

$a = 'C:\WINDOWS';
$b = 'C:\WINDOWS';

if ( $a =~ /^$b$/i ) {
print "matched '$a' to '$b'\n";
}
else {
print "UNMATCHED '$a' vs. '$b'\n";
}

$ta = quotemeta "$a";
$tb = quotemeta "$b";
if ( $ta =~ /^$tb$/i ) {
print "(quoted) matched '$ta' to '$tb'\n";
}
else {
print "(quoted) UNMATCHED '$ta' vs. '$tb'\n";
}

__END__

Reports this:

UNMATCHED 'C:\WINDOWS' vs. 'C:\WINDOWS'
(quoted) UNMATCHED 'C\:\\WINDOWS' vs. 'C\:\\WINDOWS'
 
Tim Greer

When $b is interpolated into the pattern, its \W is interpreted as the
regex escape for a "non-word" character rather than as a backslash
followed by W. quotemeta automatically disables (backwhacks with \) every
character that could otherwise be seen as a metacharacter, so ;, \, etc.
are translated to \;, \\, etc.
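
As a minimal sketch of that behaviour (the variable names $path and $escaped are made up for illustration, not taken from the original script), this compares the raw and the quotemeta'd pattern against the same string:

```perl
use strict;
use warnings;

my $path    = 'C:\WINDOWS';      # single quotes: the backslash stays literal
my $escaped = quotemeta $path;   # backslash-escapes ':' and '\'

print "escaped: $escaped\n";

# Unescaped, the interpolated pattern contains \W (a non-word character),
# followed by INDOWS, so it cannot match the literal path.
print "raw pattern:     ", ($path =~ /^$path$/i    ? "match" : "no match"), "\n";

# Escaped, every character of the pattern is taken literally.
print "escaped pattern: ", ($path =~ /^$escaped$/i ? "match" : "no match"), "\n";
```

Note that only the pattern is escaped here; the string on the left of =~ is left untouched.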
 
Jürgen Exner

Xainin said:
Help! I don't understand why this script:

#!perl -w

Most people prefer
use warnings;
and
use strict;
$a = 'C:\WINDOWS';
$b = 'C:\WINDOWS';

if ( $a =~ /^$b$/i ) {

You've got a variation of 'perldoc -q "dos paths"'.

You are trying to match 'C:' followed by a non-word character, followed
by 'INDOWS' in the text 'C:\WINDOWS'.

See 'perldoc perlre'
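
To make that concrete, here is a small sketch showing which strings the interpolated pattern actually accepts. The test strings and the name $pattern are hypothetical, picked only for this illustration:

```perl
use strict;
use warnings;

my $pattern = 'C:\WINDOWS';   # inside a regex, \W means "one non-word character"

# Hypothetical test strings, chosen only to show what the pattern accepts.
for my $text ('C:\WINDOWS', 'C:\INDOWS', 'c:!indows') {
    my $result = $text =~ /^$pattern$/i ? 'matches' : 'does not match';
    print "'$text' $result\n";
}
```

The first string fails because, after the backslash is consumed by \W, the pattern expects an I where the text has a W. The other two have exactly one non-word character between 'C:' and 'INDOWS', so they match.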

jue
 
xhoster

Xainin said:
Help! I don't understand why this script:

#!perl -w

$a = 'C:\WINDOWS';
$b = 'C:\WINDOWS';

if ( $a =~ /^$b$/i ) {
print "matched '$a' to '$b'\n";
}
else {
print "UNMATCHED '$a' vs. '$b'\n";
}

\W is special in a regex.
$ta = quotemeta "$a";

$a is not used as a regex, it is treated as a literal string. Protecting
characters special to regexes in something not used that way is
counterproductive.
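
A rough sketch of quoting only the pattern side, using the \Q...\E escapes described in perlre. The names $text and $pattern are placeholders, not from the original script:

```perl
use strict;
use warnings;

my $text    = 'C:\WINDOWS';   # the string being tested: leave it alone
my $pattern = 'c:\windows';   # text that should be matched literally

# \Q...\E applies quotemeta to the interpolated value inside the regex only.
if ($text =~ /^\Q$pattern\E$/i) {
    print "regex: matched\n";
}
else {
    print "regex: UNMATCHED\n";
}

# For a plain case-insensitive equality test, no regex is needed at all.
print "lc/eq: ", (lc($text) eq lc($pattern) ? "matched" : "UNMATCHED"), "\n";
```

If the goal is only to check whether two paths are the same apart from case, the lc/eq comparison avoids the escaping question entirely.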

Xho

 
Xainin

xhoster said:
\W is special in a regex.


$a is not used as a regex, it is treated as a literal string. Protecting
characters special to regexes in something not used that way is
counterproductive.

Xho

Thanks to all - I added strict/warnings and declared with "my", but the
key per your last comment was to change "$ta" to "$a" in my last test and
it works.
 
sln

Xainin said:
Thanks to all - I added strict/warnings and declared with "my", but the
key per your last comment was to change "$ta" to "$a" in my last test and
it works.


I'm not sure if you are getting the point.

The regular expression is on the right; what you're testing is on the left:

$a = 'C:\WINDOWS';

if ($a =~ /do/i) {    # the i modifier means case-insensitive matching
    #      ^^ regexp
    print "matched $a to 'do'\n";
}

if ($a =~ /wi/i) {
    print "matched $a to 'wi'\n";
}

if ($a !~ /c:\windows/i) {
    #        ^^ escape sequence
    print "did NOT match $a to 'c:\\windows'\n";

    # In this case the regular expression has a \w in it, which is
    # shorthand for any single word character: a letter, a digit, or
    # an underscore. The regexp now looks for:
    #   c, :, then
    #   any word character (because of \w), then
    #   i, n, d, o, w, s
    # However, at that character position in $a, the string being
    # tested, there is a literal '\', so the match fails.
}

# To fix that, the backslash, which is the escape character, must itself be
# escaped in the regular expression. This gives '\\w' instead of '\w'. The
# regex parser has no shorthand named '\\', so it treats '\\' as a single
# literal '\'. Thus '\\w' matches the literal text "\w" within $a.

if ($a =~ /c:\\windows/i) {
    print "did match $a to 'c:\\windows'\n";
}

Be sure not to confuse yourself with the constructs you have listed.
It does not seem like you are distinguishing the string you want to test from
the regular expression you use to test it.

sln
 
