pattern matching

Discussion in 'Perl Misc' started by LiHui, Apr 20, 2004.

  1. LiHui

    LiHui Guest

    Can someone tell me what this line does?

    $line =~ m/^\|\s*\w\S*\s*(?:\|.+?){10,}\|$/o

    I know that it checks whether the line begins with "|" followed by
    whitespace, a word character, and non-whitespace, and then I'm lost.
    What is s*(?:\|.+?)

    Any help will be greatly appreciated. Thanks, LH
     
    LiHui, Apr 20, 2004
    #1

  2. Brad Baxter

    Brad Baxter Guest

    On Mon, 19 Apr 2004, LiHui wrote:

    > Can someone tell me what this line does?
    >
    > $line =~ m/^\|\s*\w\S*\s*(?:\|.+?){10,}\|$/o
    >
    > I know that it checks whether the line begins with "|" followed by
    > whitespace, a word character, and non-whitespace, and then I'm lost.
    > What is s*(?:\|.+?)


    Make that "|" followed by optional whitespace, a word character, optional
    nonwhitespace, optional whitespace, ten or more of "these": |x..., and
    ending in "|".

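    Also, it's not s*(?:\|.+?), it's \s*(?:\|.+?). The (?:...) are grouping (not
    capturing) parentheses. They're followed by {10,}, so you want 10 or more
    of those groups. Each group is "|" followed by at least one, possibly
    more, character(s) that match(es) /./. Because +? is non-greedy, each
    group takes as few characters as it can, so it normally stops just before
    the next "|"; it can still run across a "|" when that is the only way for
    the rest of the pattern to match.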
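
    A small check of how +? behaves around "|" may help here. This is only a
    sketch: the sample string and the variable names ($row, $lazy, $strict)
    are made up for illustration and are not part of the original pattern.

```perl
use strict;
use warnings;

# Made-up sample row with two fields.
my $row = '|a|b|';

# Lazy +? starts with one character, but the ^ and $ anchors force it to
# grow until the rest of the pattern can match, so it ends up crossing a "|".
my ($lazy) = $row =~ /^\|(.+?)\|$/;

# A negated character class can never cross a "|", so this attempt fails
# and $strict stays undefined.
my ($strict) = $row =~ /^\|([^|]+)\|$/;

print "lazy:   ", (defined $lazy   ? $lazy   : '(no match)'), "\n";
print "strict: ", (defined $strict ? $strict : '(no match)'), "\n";
```

    So +? means "as few as possible", not "never a bar"; if a field must not
    contain "|" at all, [^|]+ says so directly.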

    Below is an expanded (using /x) version:

    # example line that will match
    my $line = '| abc |0|1|2|3|4|5|6|7|8|9|x|y|z|';

    $line =~ m/^\|\s*\w\S*\s*(?:\|.+?){10,}\|$/o and print "yes\n";

    $line =~ m/
    ^ # begins with
    \| # 'or' bar
    \s* # optional whitespace
    \w # ONE word character
    \S* # optional nonwhitespace
    \s* # optional whitespace
    (?: # begin the group
    \| # 'or' bar
    .+ # at least one character
    ? # make the '+' non-greedy
    ) # end the group
    {10,} # give me 10 or more GROUPS
    \| # 'or' bar
    $ # at the end
    /ox and print "yes\n";
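
    To see where the {10,} threshold bites, here is a rough sketch that runs
    the same pattern over rows with 13, 10 and 9 groups after the first cell.
    The helper name looks_like_row and the two shorter sample rows are
    assumptions for illustration only; /o is left off since the pattern has
    no variables in it.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the pattern from the question.
sub looks_like_row {
    my ($line) = @_;
    return $line =~ m/^\|\s*\w\S*\s*(?:\|.+?){10,}\|$/ ? 1 : 0;
}

my $thirteen = '| abc |0|1|2|3|4|5|6|7|8|9|x|y|z|';  # 13 groups after the first cell
my $ten      = '| abc |0|1|2|3|4|5|6|7|8|9|';        # exactly 10 groups
my $nine     = '| abc |1|2|3|4|5|6|7|8|9|';          # only 9 groups

# Report which rows satisfy the pattern.
for my $line ($thirteen, $ten, $nine) {
    printf "%-36s %s\n", $line, looks_like_row($line) ? 'match' : 'no match';
}
```

    The first two rows match and the third does not, because every group
    needs its own "|" plus at least one more character.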


    I assume you've looked at perldoc perlre.

    Regards,

    Brad
     
    Brad Baxter, Apr 20, 2004
    #2

  3. John W. Krahn

    John W. Krahn Guest

    Brad Baxter wrote:
    >
    > On Mon, 19 Apr 2004, LiHui wrote:
    >
    > > Can someone tell me what this line does?
    > >
    > > $line =~ m/^\|\s*\w\S*\s*(?:\|.+?){10,}\|$/o
    > >
    > > I know that it checks whether the line begins with "|" followed by
    > > whitespace, a word character, and non-whitespace, and then I'm lost.
    > > What is s*(?:\|.+?)

    >
    > Make that "|" followed by optional whitespace, a word character, optional
    > nonwhitespace, optional whitespace, ten or more of "these": |x..., and
    > ending in "|".
    >
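    > Also, it's not s*(?:\|.+?), it's \s*(?:\|.+?). The (?:...) are grouping (not
    > capturing) parentheses. They're followed by {10,}, so you want 10 or more
    > of those groups. Each group is "|" followed by at least one, possibly
    > more, character(s) that match(es) /./. Because +? is non-greedy, each
    > group takes as few characters as it can, so it normally stops just before
    > the next "|"; it can still run across a "|" when that is the only way for
    > the rest of the pattern to match.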
    >
    > Below is an expanded (using /x) version:
    >
    > # example line that will match
    > my $line = '| abc |0|1|2|3|4|5|6|7|8|9|x|y|z|';
    >
    > $line =~ m/^\|\s*\w\S*\s*(?:\|.+?){10,}\|$/o and print "yes\n";
    >
    > $line =~ m/
    > ^ # begins with
    > \| # 'or' bar
    > \s* # optional whitespace
    > \w # ONE word character
    > \S* # optional nonwhitespace
    > \s* # optional whitespace
    > (?: # begin the group
    > \| # 'or' bar
    > .+ # at least one character
    > ? # make the '+' non-greedy
    > ) # end the group
    > {10,} # give me 10 or more GROUPS
    > \| # 'or' bar
    > $ # at the end
    > /ox and print "yes\n";
    >
    > I assume you've looked at perldoc perlre.


    Also, the /o option is not required as there are no variables in the
    regular expression.

    perldoc perlop
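
    For what /o does when a pattern does contain a variable, here is a
    minimal sketch. The helper names match_frozen and match_live and the test
    strings are invented for illustration; they do not come from the original
    script.

```perl
use strict;
use warnings;

# Hypothetical helper: with /o the pattern is built from $pat on the first
# call only, so later values of $pat are ignored.
sub match_frozen {
    my ($str, $pat) = @_;
    return $str =~ /$pat/o ? 1 : 0;
}

# Same helper without /o: the pattern follows $pat on every call.
sub match_live {
    my ($str, $pat) = @_;
    return $str =~ /$pat/ ? 1 : 0;
}

my $frozen_first  = match_frozen('abc', 'a');  # compiles /a/ and matches
my $frozen_second = match_frozen('abc', 'z');  # still uses /a/, so it matches
my $live_first    = match_live('abc', 'a');    # matches
my $live_second   = match_live('abc', 'z');    # really uses /z/, no match

print "with /o:    $frozen_first $frozen_second\n";
print "without /o: $live_first $live_second\n";
```

    With a constant pattern like the one in this thread there is nothing to
    freeze, which is why /o can simply be dropped.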


    John
    --
    use Perl;
    program
    fulfillment
     
    John W. Krahn, Apr 20, 2004
    #3
  4. LiHui

    LiHui Guest

    Thanks Brad & John. Got it now.

    LiHui
     
    LiHui, Apr 21, 2004
    #4
  5. Scott J

    Scott J Guest

    On 19 Apr 2004 18:58:30 -0700, LiHui wrote:

    >Can someone tell me what this line does?
    >
    >$line =~ m/^\|\s*\w\S*\s*(?:\|.+?){10,}\|$/o
    >
    >I know that it checks whether the line begins with "|" followed by
    >whitespace, a word character, and non-whitespace, and then I'm lost.
    >What is s*(?:\|.+?)
    >
    >Any help will be greatly appreciated. Thanks, LH


    I'll give it a go :) My regex abilities are a bit rusty.

    ^\|             line starts with |
    \s*             followed by 0 or more whitespace characters
    \w              followed by one word character (alphanumeric or underscore)
    \S*             followed by 0 or more non-whitespace characters
    \s*             followed by 0 or more whitespace characters
    (?:\|.+?){10,}  is a quantified extended regex sequence (see below)
    \|$             line ends with |

    The o switch tells Perl to compile the pattern only once.

    Quantified regex sequence:
    (?:...) is a cluster-only parenthesis, no capturing (thanks, Camel
    book), which I think means that the group takes part in the match but
    does not store the matched string in a variable such as $1. The
    remainder of this sequence is a regular regex:
    \|   matches |
    .+?  matches any one character (except newline), 1 or more times (minimally)

    {10,} tells the pattern inside the () to match at least 10 times
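
    The difference between capturing and cluster-only parentheses can be
    sketched like this. The sample string and the shortened patterns are
    assumptions chosen to keep the output small; they are not the pattern
    from the question.

```perl
use strict;
use warnings;

# Made-up sample with two bar-prefixed fields.
my $text = '|foo|bar';

# Capturing parentheses: in list context the match returns the substrings
# that each pair of parentheses captured.
my @captured = $text =~ /^(\|\w+)(\|\w+)$/;

# Clustering parentheses: they group for the {2} quantifier but capture
# nothing, so a successful match in list context returns just (1).
my @clustered = $text =~ /^(?:\|\w+){2}$/;

print "captured:  @captured\n";
print "clustered: @clustered\n";
```

    Both patterns accept the same string; only the first one hands the
    pieces back.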

    Does that help or hinder?

    Scott
     
    Scott J, Jul 7, 2004
    #5