Basic pattern matching - baffled

Discussion in 'Perl Misc' started by Xainin, Oct 2, 2008.

  1. Xainin

    Xainin Guest

    Help! I don't understand why this script:

    #!perl -w

    $a = 'C:\WINDOWS';
    $b = 'C:\WINDOWS';

    if ( $a =~ /^$b$/i ) {
    print "matched '$a' to '$b'\n";
    }
    else {
    print "UNMATCHED '$a' vs. '$b'\n";
    }

    $ta = quotemeta "$a";
    $tb = quotemeta "$b";
    if ( $ta =~ /^$tb$/i ) {
    print "(quoted) matched '$ta' to '$tb'\n";
    }
    else {
    print "(quoted) UNMATCHED '$ta' vs. '$tb'\n";
    }

    __END__

    Reports this:

    UNMATCHED 'C:\WINDOWS' vs. 'C:\WINDOWS'
    (quoted) UNMATCHED 'C\:\\WINDOWS' vs. 'C\:\\WINDOWS'

    --
    Hot water heaters? Hot water needs heating?
     
    Xainin, Oct 2, 2008
    #1

  2. Tim Greer

    Tim Greer Guest

    Xainin wrote:

    > Help! I don't understand why this script:
    >
    > #!perl -w
    >
    > $a = 'C:\WINDOWS';
    > $b = 'C:\WINDOWS';
    >
    > if ( $a =~ /^$b$/i ) {
    > print "matched '$a' to '$b'\n";
    > }
    > else {
    > print "UNMATCHED '$a' vs. '$b'\n";
    > }
    >
    > $ta = quotemeta "$a";
    > $tb = quotemeta "$b";
    > if ( $ta =~ /^$tb$/i ) {
    > print "(quoted) matched '$ta' to '$tb'\n";
    > }
    > else {
    > print "(quoted) UNMATCHED '$ta' vs. '$tb'\n";
    > }
    >
    > __END__
    >
    > Reports this:
    >
    > UNMATCHED 'C:\WINDOWS' vs. 'C:\WINDOWS'
    > (quoted) UNMATCHED 'C\:\\WINDOWS' vs. 'C\:\\WINDOWS'
    >


    The \W is treated in the regular expression as a "non-word" character
    class, not as a literal backslash followed by W. quotemeta
    backslash-escapes every character that would otherwise be seen as a
    metacharacter, so things such as ;, \, etc. are translated to \;, \\, etc.
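
    A minimal sketch of the point above, assuming the same path as in the original post. The variable names ($path, $quoted, $raw_match, $quoted_match) are made up for illustration. It shows what quotemeta produces and how \Q...\E applies the same quoting to an interpolated value inside the pattern:

```perl
use strict;
use warnings;

my $path   = 'C:\WINDOWS';      # single quotes: the backslash is kept literally
my $quoted = quotemeta $path;   # every non-word character gets a backslash

print "$quoted\n";

# Interpolating the unquoted path: \W is parsed as "non-word character".
my $raw_match = $path =~ /^$path$/i ? 1 : 0;

# \Q...\E applies quotemeta to the interpolated value inside the pattern.
my $quoted_match = $path =~ /^\Q$path\E$/i ? 1 : 0;

print "raw: $raw_match, quoted: $quoted_match\n";
```

    Only the pattern side is quoted here; the string being tested is left as it is.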
    --
    Tim Greer, CEO/Founder/CTO, BurlyHost.com, Inc.
    Shared Hosting, Reseller Hosting, Dedicated & Semi-Dedicated servers
    and Custom Hosting. 24/7 support, 30 day guarantee, secure servers.
    Industry's most experienced staff! -- Web Hosting With Muscle!
     
    Tim Greer, Oct 2, 2008
    #2

  3. Jürgen Exner

    Jürgen Exner Guest

    Xainin <> wrote:
    >Help! I don't understand why this script:
    >
    >#!perl -w


    Most people prefer
    use warnings;
    and
    use strict;

    >$a = 'C:\WINDOWS';
    >$b = 'C:\WINDOWS';
    >
    >if ( $a =~ /^$b$/i ) {


    You've got a variation of 'perldoc -q "dos paths"'.

    You are trying to match 'C:' followed by a non-word character, followed
    by 'INDOWS' in the text 'C:\WINDOWS'.

    See 'perldoc perlre'
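
    To make that concrete, here is a small sketch. The test strings in @candidates are hypothetical, chosen so that only the character after 'C:' varies. The interpolated pattern accepts any single non-word character followed by 'INDOWS', and rejects the very path it was built from:

```perl
use strict;
use warnings;

my $pattern = 'C:\WINDOWS';   # interpolated below as the regex C:\WINDOWS

# Hypothetical test strings: only the character after "C:" varies.
my @candidates = ('C:\WINDOWS', 'C:\INDOWS', 'C:-INDOWS', 'C:WINDOWS');

my %result;
for my $text (@candidates) {
    # \W consumes exactly one non-word character, then INDOWS must follow.
    $result{$text} = $text =~ /^$pattern$/i ? 'match' : 'no match';
    printf "%-12s %s\n", $text, $result{$text};
}
```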

    jue
     
    Jürgen Exner, Oct 2, 2008
    #3
  4. Xainin

    Guest

    Xainin <> wrote:
    > Help! I don't understand why this script:
    >
    > #!perl -w
    >
    > $a = 'C:\WINDOWS';
    > $b = 'C:\WINDOWS';
    >
    > if ( $a =~ /^$b$/i ) {
    > print "matched '$a' to '$b'\n";
    > }
    > else {
    > print "UNMATCHED '$a' vs. '$b'\n";
    > }


    \W is special in a regex.

    >
    > $ta = quotemeta "$a";


    $a is not used as a regex, it is treated as a literal string. Protecting
    characters special to regexes in something not used that way is
    counterproductive.
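
    A short sketch of the difference, reusing $ta and $tb from the original script. The other names ($text, $pattern, $one_side, $both_sides) are made up for illustration:

```perl
use strict;
use warnings;

my $text    = 'C:\WINDOWS';   # the string being tested: leave it alone
my $pattern = 'C:\WINDOWS';   # the string used as a pattern: quote this one

my $tb = quotemeta $pattern;  # C\:\\WINDOWS

# Pattern quoted, text untouched: each escaped character matches itself.
my $one_side = $text =~ /^$tb$/i ? 1 : 0;

# Both quoted, as in the original script: the text now holds extra
# backslashes that the pattern does not account for.
my $ta = quotemeta $text;
my $both_sides = $ta =~ /^$tb$/i ? 1 : 0;

print "pattern only: $one_side, both: $both_sides\n";
```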

    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    The costs of publication of this article were defrayed in part by the
    payment of page charges. This article must therefore be hereby marked
    advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
    this fact.
     
    , Oct 2, 2008
    #4
  5. Xainin

    Xainin Guest

    wrote:

    >Xainin <> wrote:
    >> Help! I don't understand why this script:
    >>
    >> #!perl -w
    >>
    >> $a = 'C:\WINDOWS';
    >> $b = 'C:\WINDOWS';
    >>
    >> if ( $a =~ /^$b$/i ) {
    >> print "matched '$a' to '$b'\n";
    >> }
    >> else {
    >> print "UNMATCHED '$a' vs. '$b'\n";
    >> }

    >
    >\W is special in a regex.
    >
    >>
    >> $ta = quotemeta "$a";

    >
    >$a is not used as a regex, it is treated as a literal string. Protecting
    >characters special to regexes in something not used that way is
    >counterproductive.
    >
    >Xho


    Thanks to all - I added strict/warnings and declared with "my", but the
    key per your last comment was to change "$ta" to "$a" in my last test and
    it works.

    --
    A waist is a terrible thing to mind.
     
    Xainin, Oct 3, 2008
    #5
  6. Xainin

    Guest

    On Fri, 03 Oct 2008 01:54:11 -0700, Xainin <> wrote:

    > wrote:
    >
    >>Xainin <> wrote:
    >>> Help! I don't understand why this script:
    >>>
    >>> #!perl -w
    >>>
    >>> $a = 'C:\WINDOWS';
    >>> $b = 'C:\WINDOWS';
    >>>
    >>> if ( $a =~ /^$b$/i ) {
    >>> print "matched '$a' to '$b'\n";
    >>> }
    >>> else {
    >>> print "UNMATCHED '$a' vs. '$b'\n";
    >>> }

    >>
    >>\W is special in a regex.
    >>
    >>>
    >>> $ta = quotemeta "$a";

    >>
    >>$a is not used as a regex, it is treated as a literal string. Protecting
    >>characters special to regexes in something not used that way is
    >>counterproductive.
    >>
    >>Xho

    >
    >Thanks to all - I added strict/warnings and declared with "my", but the
    >key per your last comment was to change "$ta" to "$a" in my last test and
    >it works.



    I'm not sure if you are getting the point.

    The regular expression is on the right, what you're testing is on the left:

    $a = 'C:\WINDOWS';

    if ($a =~ /do/i) { # i modifier means case insensitive matching
    # regexp ^^
    print "matched $a to 'do'\n";
    }

    if ($a =~ /wi/i) {
    print "matched $a to 'wi'\n";
    }

    if ($a !~ /c:\windows/i) {
    # escape seq ^^
    print "did NOT match $a to 'c:\\windows'\n";

    # in this case the regular expression had a \w in it,
    # which is shorthand for any single word character (a letter,
    # a digit, or an underscore) in that character position.
    # the regexp now looks for:
    # c, :, then
    # any word character (because of \w), then
    # i, n, d, o, w, s
    # however, the character in that position in $a, the literal
    # string being tested, is '\', which is not a word character, so it fails
    }

    # to fix that, the regular expression needs to escape '\', the escape character.
    # this gives '\\w' instead of '\w'. '\\' is not a shorthand for anything
    # in the regex parser; it is treated as a single literal '\' when the regular
    # expression is parsed. thus '\\w' matches the literal text "\w" within $a
    # (here "\W", because of the i modifier).

    if ($a =~ /c:\\windows/i) {
    print "did match $a to 'c:\\windows'\n";
    }

    Be sure not to confuse yourself with the constructs you have listed.
    It does not seem like you are distinguishing the string you want to test from
    the regular expression you use to test it.
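
    As a rough illustration of where a backslash needs doubling, here is a sketch with made-up variable names ($single, $double, $same_string, $literal_match). It compares a single-quoted string, a double-quoted string, and a regex literal:

```perl
use strict;
use warnings;

my $single = 'C:\WINDOWS';    # single quotes keep the backslash as written
my $double = "C:\\WINDOWS";   # double quotes need it doubled

my $same_string = $single eq $double ? 1 : 0;

# In a regex literal the backslash must be doubled too, or \W / \w is
# read as a character class.
my $literal_match = $single =~ /^c:\\windows$/i ? 1 : 0;

print "same string: $same_string, literal match: $literal_match\n";
```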

    sln
     
    , Oct 4, 2008
    #6