Regex confusion...

Discussion in 'Perl Misc' started by guthrie, Sep 27, 2007.

  1. guthrie

    guthrie Guest

    sorry for the beginner question, but...

    With this code
    my $img = "0-12345-abc";
    print " Match.1 ", (defined $img);
    print " Match.2 ", ($img =~ /\S/);
    print " Matched::", $1;
    print " Match.3 ", ($img =~ /^(\d)-(\d+)-(\w)$/);
    print " Matched::", $1, ", ", $2, ", ", $3;
    I expected:
    true, true, "0-12345-abc", true,
    and then: 0, 12345, abc

    Instead I get:
    true true "" false
    "" "" ""

    Actually: img= (0-52557-wind)
    Match.1 1 Match.2 1 Matched::
    Match.3 Matched::, ,

    Seems simple enough, what am I missing!
    - why doesn't the full string first match against /\S/ return the
    string
    - why doesn't the second (extracting) match work.

    The actual code I'm trying for is:
    if(defined $img and $img =~ /\S/) {
    if ($img =~ /^(\d)-(\d+)-(\w)$/)
    { my ($t, $zip, $type) = ($1, $2, $3); }
    else { die "ERROR: invalid URL arguments :: ${img}\n";}
    print "Debug: (", $t, ", ", $zip, ", ", $type, ")\n"; ## Debug...

    Thanks for any hints. Sorry for the confusion!
    Greg
     
    guthrie, Sep 27, 2007
    #1

  2. guthrie

    guthrie Guest

    Oops, small correction;

    > With this code
    > my $img = "0-12345-abc";
    > print " Match.1 ", (defined $img);
    > print " Match.2 ", ($img =~ /\S/);
    > print " Matched::", $&;

    print " pre-Matched::", $`, "\n";
    print " post-Matched::", $', "\n";
    I get::
    Match.1 1 Match.2 1 Matched::0
    pre-Matched::
    post-Matched::-52557-wind

    I expected:
    pre="", match="0-12345-abc", post=""
     
    guthrie, Sep 27, 2007
    #2

  3. Narthring

    Narthring Guest

    On Sep 27, 3:42 pm, guthrie <> wrote:
    > sorry for the beginner question, but...
    >
    > With this code
    > my $img = "0-12345-abc";
    > print " Match.1 ", (defined $img);
    > print " Match.2 ", ($img =~ /\S/);
    > print " Matched::", $1;
    > print " Match.3 ", ($img =~ /^(\d)-(\d+)-(\w)$/);
    > print " Matched::", $1, ", ", $2, ", ", $3;
    > I expected:
    > true, true, "0-12345-abc", true,
    > and then: 0, 12345, abc
    >
    > Instead I get:
    > true true "" false
    > "" "" ""
    >
    > Actually: img= (0-52557-wind)
    > Match.1 1 Match.2 1 Matched::
    > Match.3 Matched::, ,
    >
    > Seems simple enough, what am I missing!
    > - why doesn't the full string first match against /\S/ return the
    > string
    > - why doesn't the second (extracting) match work.
    >
    > The actual code I'm trying for is:
    > if(defined $img and $img =~ /\S/) {
    > if ($img =~ /^(\d)-(\d+)-(\w)$/)
    > { my ($t, $zip, $type) = ($1, $2, $3); }
    > else { die "ERROR: invalid URL arguments :: ${img}\n";}
    > print "Debug: (", $t, ", ", $zip, ", ", $type, ")\n"; ## Debug...
    >
    > Thanks for any hints. Sorry for the confusion!
    > Greg


    $img =~ /^(\d)-(\d+)-(\w+)$/

    \w matches a single 'word' character, not an entire word.
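A minimal sketch of the difference, using the string from the question. The helper name parse_img is made up for illustration and is not from the original code:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): split "digit-digits-word"
# into its three fields, or return an empty list if the shape is wrong.
sub parse_img {
    my ($img) = @_;
    return $img =~ /^(\d)-(\d+)-(\w+)$/;
}

my $img = "0-52557-wind";

# (\w) allows exactly one word character before the end of the string,
# so the pattern fails on "wind".
my $single = ($img =~ /^(\d)-(\d+)-(\w)$/) ? "match" : "no match";
print "single \\w: $single\n";

# (\w+) allows one or more word characters, so all three fields are captured.
my @fields = parse_img($img);
print "with \\w+: @fields\n";
```

The first print reports "no match"; the second lists 0, 52557 and wind.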
     
    Narthring, Sep 27, 2007
    #3
  4. John W. Krahn

    John W. Krahn Guest

    guthrie wrote:
    > sorry for the beginner question, but...
    >
    > With this code
    > my $img = "0-12345-abc";
    > print " Match.1 ", (defined $img);
    > print " Match.2 ", ($img =~ /\S/);
    > print " Matched::", $1;
    > print " Match.3 ", ($img =~ /^(\d)-(\d+)-(\w)$/);
    > print " Matched::", $1, ", ", $2, ", ", $3;
    > I expected:
    > true, true, "0-12345-abc", true,
    > and then: 0, 12345, abc
    >
    > Instead I get:
    > true true "" false
    > "" "" ""
    >
    > Actually: img= (0-52557-wind)
    > Match.1 1 Match.2 1 Matched::
    > Match.3 Matched::, ,
    >
    > Seems simple enough, what am I missing!
    > - why doesn't the full string first match against /\S/ return the
    > string


    The character class \S matches a single non-whitespace character, so it
    can't match the full string. The expression ($img =~ /\S/) only returns
    "true" or "false" here because the pattern has no capturing parentheses
    and you don't use the /g global option. In list context a match returns
    the captured substrings or, with /g and no captures, every match.
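To make the return values concrete, here is a small sketch of the three list-context cases described above. The variable names are only for illustration:

```perl
use strict;
use warnings;

my $img = "0-12345-abc";

# No captures, no /g: list context yields (1) on success.
my @plain = ($img =~ /\S/);

# With capturing parentheses, list context yields the captured text.
# \S+ is needed to run across the whole (whitespace-free) string.
my @captured = ($img =~ /(\S+)/);

# With /g and no captures, list context yields every match.
my @digits = ($img =~ /\d+/g);

print "plain: @plain\n";        # plain: 1
print "captured: @captured\n";  # captured: 0-12345-abc
print "digits: @digits\n";      # digits: 0 12345
```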


    > - why doesn't the second (extracting) match work.


    Because the pattern /^(\d)-(\d+)-(\w)$/ doesn't match the string
    '0-52557-wind'. -(\w)$ will only match one character between a hyphen and
    the end of the line but your string has four characters (wind) between the
    hyphen and the end of the line.



    John
    --
    Perl isn't a toolbox, but a small machine shop where you
    can special-order certain sorts of tools at low cost and
    in short order. -- Larry Wall
     
    John W. Krahn, Sep 27, 2007
    #4
  5. guthrie

    guthrie Guest

    -- Many thanks, very silly of me.

    I thought these were word & space matches, not just a single
    character.
    I did (mis-) read the documentation several times! :)

    Thanks again.
    Greg
     
    guthrie, Sep 27, 2007
    #5
  6. Ben Morrow

    Ben Morrow Guest

    Quoth guthrie <>:
    > sorry for the beginner question, but...
    >
    > With this code
    > my $img = "0-12345-abc";
    > print " Match.1 ", (defined $img);
    > print " Match.2 ", ($img =~ /\S/);
    > print " Matched::", $1;


    You should never use the $N variables without checking that the match
    succeeded. In any case, your pattern has no capturing parens, so $1 will
    be undefined (it prints as an empty string).
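A short sketch of why the check matters: after a failed match, $1 keeps whatever the last successful match put there. The strings and the helper leading_digits are invented for this example:

```perl
use strict;
use warnings;

# Hypothetical helper: return the leading digits of a string, or undef
# when the string does not start with a digit. $1 is read only after
# the match is known to have succeeded.
sub leading_digits {
    my ($text) = @_;
    return $text =~ /^(\d+)/ ? $1 : undef;
}

my $first  = "42-first";
my $second = "no digits here";

my $ok_first  = $first  =~ /^(\d+)/;   # succeeds, sets $1 to "42"
my $ok_second = $second =~ /^(\d+)/;   # fails, leaves $1 untouched

# $1 still holds the capture from the earlier, successful match.
my $stale = $1;
print "stale \$1: $stale\n";
```

Reading $1 after the second match silently gives 42, which is why the helper tests the match result first.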

    Others have already noted that \S and \w only match single characters.

    > The actual code I'm trying for is:
    > if(defined $img and $img =~ /\S/) {
    > if ($img =~ /^(\d)-(\d+)-(\w)$/)
    > { my ($t, $zip, $type) = ($1, $2, $3); }


    This can be simplified to

    if (
    my ($t, $zip, $type) =
    $img =~ /^(\d)-(\d+)-(\w+)$/
    ) {

    which avoids the need to use the $N variables altogether.
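Filled out into something runnable, the idea might look like the sketch below. The wrapper name describe_img and the wording of the undefined-input message are assumptions; the pattern and the other error text come from the thread:

```perl
use strict;
use warnings;

# Hypothetical wrapper around the pattern from the thread.
sub describe_img {
    my ($img) = @_;
    die "ERROR: invalid URL arguments :: undefined\n" unless defined $img;

    # A list assignment in boolean context yields the number of values on
    # its right-hand side: 3 when the pattern matches, 0 when it does not.
    if (my ($t, $zip, $type) = $img =~ /^(\d)-(\d+)-(\w+)$/) {
        # The variables are only visible inside this if statement.
        return "Debug: ($t, $zip, $type)";
    }
    die "ERROR: invalid URL arguments :: ${img}\n";
}

print describe_img("0-52557-wind"), "\n";   # Debug: (0, 52557, wind)
```

Because the variables are declared in the condition, any code that uses them has to sit inside the if block, unlike the print in the original code, which came after the block that declared them.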

    Ben
     
    Ben Morrow, Sep 28, 2007
    #6
  7. comp.llang.perl.moderated

    comp.llang.perl.moderated Guest

    On Sep 27, 6:43 pm, wrote:
    > On Fri, 28 Sep 2007 00:10:42 +0100, Ben Morrow <> wrote:
    >
    > ...
    >
    > >This can be simplified to
    >
    > > if (
    > > my ($t, $zip, $type) =
    > > $img =~ /^(\d)-(\d+)-(\w+)$/
    > > ) {
    >
    > >which avoids the need to use the $N variables altogether.
    >
    > >Ben
    >
    > It might be quicker to check for success first, then do the assignment
    >
    > $_ = "......";
    > if ( /^(\d)-(\d+)-(\w+)$/ )
    > {
    > #use $1,2,3 or assign
    > ($t, $zip, $type) = ($1, $2,

    ^^^^^^^^^^^^^^^^^^^^^^^^^^

    Are you sure? Wouldn't your solution require an extra copy from the
    $N variables?
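One way to settle this is to measure it. The following is only a sketch using the core Benchmark module; the sub names and the test string are assumptions, and no particular result is claimed here since timings vary by Perl version and machine:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $img = "0-52557-wind";

# Assign the captures straight from the match in list context.
sub via_list {
    my @fields = $img =~ /^(\d)-(\d+)-(\w+)$/;
    return @fields;
}

# Check for success first, then copy from $1, $2, $3.
sub via_dollar_n {
    return unless $img =~ /^(\d)-(\d+)-(\w+)$/;
    my @fields = ($1, $2, $3);
    return @fields;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    list_assign     => \&via_list,
    check_then_copy => \&via_dollar_n,
});
```

Both subs return the same three fields, so whatever difference the table shows is purely about speed.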

    --
    Charles DeRykus
     
    comp.llang.perl.moderated, Sep 28, 2007
    #7