order of evaluation

Discussion in 'Perl Misc' started by Flo, Mar 26, 2007.

  1. Flo

    Flo Guest

    Hello

    Consider these two regular expressions
    1) .*.*
    2) .*?.*?

    which are equivalent to
    1) (.*)(.*)
    2) (.*?)(.*?)

    Where the * is the greedy 0-or-more quantifier and *? the lazy 0-or-
    more quantifier. In the following \1 is a backreference to the match
    inside the first parenthesis, \2 likewise for the 2nd parenthesis.

    As I understand it, most flavours of regular expressions have the
    following behaviour for the two regexes above
    1) \1 returns the whole target string, \2 returns the empty string
    2) \1 returns the empty string, \2 returns the whole target string

    Is that statement correct at all, i.e. do indeed most flavours have
    that behaviour?

    Which rule dictates that behaviour? I would say
    "for regexes, order of evaluation is left to right"
    But I am not sure since I never saw such a statement.

    Remember that "order of evaluation" and "precedence" are *not* the
    same thing, at least not in general. See also
    http://groups.google.com/group/comp.lang.c/browse_thread/thread/5bc23...

    Flo
     
    Flo, Mar 26, 2007
    #1

  2. Flo

    Flo Guest


  3. Paul Lalli

    Paul Lalli Guest

    On Mar 26, 6:41 am, "Flo" <> wrote:
    > Consider these two regular expressions
    > 1) .*.*
    > 2) .*?.*?
    >
    > which are equivalent to
    > 1) (.*)(.*)
    > 2) (.*?)(.*?)
    >
    > Where the * is the greedy 0-or-more quantifier and *? the lazy 0-or-
    > more quantifier. In the following \1 is a backreference to the match
    > inside the first parenthesis, \2 likewise for the 2nd parenthesis.
    >
    > As I understand it, most flavours of regular expressions have the
    > following behaviour for the two regexes above
    > 1) \1 returns the whole target string, \2 returns the empty string


    Yes.

    > 2) \1 returns the empty string, \2 returns the whole target string


    No. They both return the empty string. They're both non-greedy.
    There is nothing forcing the second one to return anything more than 0
    characters.

    You can check this for yourself:
    $ perl -le'"Foo Bar" =~ /(.*)(.*)/; print qq{"$1"-"$2"}'
    "Foo Bar"-""
    $ perl -le'"Foo Bar" =~ /(.*?)(.*?)/; print qq{"$1"-"$2"}'
    ""-""

    Now, if you had anchored the patterns to the start/end of the string,
    then you would be correct:
    $ perl -le'"Foo Bar" =~ /^(.*)(.*)$/; print qq{"$1"-"$2"}'
    "Foo Bar"-""
    $ perl -le'"Foo Bar" =~ /^(.*?)(.*?)$/; print qq{"$1"-"$2"}'
    ""-"Foo Bar"
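One way to see why the unanchored lazy pattern leaves both groups empty is to look at where the overall match starts and ends. The following is only a rough sketch, not part of the original one-liners: it uses Perl's built-in @- and @+ arrays (offsets of the last successful match), and the variable names are made up for illustration.

```perl
use strict;
use warnings;

my $s = "Foo Bar";

# Unanchored, both groups lazy: the match succeeds at offset 0 without
# consuming anything, so the overall match is the empty string.
$s =~ /(.*?)(.*?)/ or die "no match";
my ($from, $to) = ($-[0], $+[0]);
print "unanchored: overall match spans $from..$to\n";    # 0..0

# Anchored with ^ and $: the match is only allowed to succeed once it
# reaches the end of the string, so the groups have to cover all of it.
$s =~ /^(.*?)(.*?)$/ or die "no match";
my ($afrom, $ato) = ($-[0], $+[0]);
print "anchored:   overall match spans $afrom..$ato\n";  # 0..7
```

In the unanchored case nothing requires the engine to go further than offset 0, which is why both $1 and $2 come back empty.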

    > Is that statement correct at all, i.e. do indeed most flavours have
    > that behaviour?


    I have no idea what you mean by "flavors". I'm talking about Perl
    regular expressions. If you're asking about regular expressions in
    some other language, you'd have to ask a group devoted to that
    language.

    > Which rule dictates that behaviour? I would say
    > "for regexes, order of evaluation is left to right"
    > But I am not sure since I never saw such a statement.


    perldoc perlre
    Alternatives are tried from left to right, so the first
    alternative found for which the entire expression matches,
    is the one that is chosen.

    Paul Lalli
     
    Paul Lalli, Mar 26, 2007
    #3
  4. Flo

    Flo Guest

    Can you also refer me to a rule in the Perl documentation that explains
    why the first example, where you answered yes, behaves that way? I'd like
    to know the name of the rule which states why the first and not the second
    star * greedily matches all. As I said, that fact is not determined by
    precedence. Precedence defines other things.
     
    Flo, Mar 26, 2007
    #4
  5. Flo

    Guest

    "Flo" <> wrote:
    > Can you also refer me to a rule in the Perl documentation that explains
    > why the first example, where you answered yes, behaves that way?


    "Combining pieces together" in perldoc perlre.

    Xho

     
    , Mar 26, 2007
    #5
  6. Brian McCauley

    On Mar 26, 3:33 pm, "Flo" <> wrote without context:
    > Can you also refer me to a rule in the Perl documentation that explains
    > why the first example, where you answered yes, behaves that way? I'd like
    > to know the name of the rule which states why the first and not the second
    > star * greedily matches all.


    I'm guessing that this is a reply to Paul's post.

    You are not quoting enough context for this to make sense. You stated
    _yourself_ that the behaviour would be explained by a left-to-right
    rule but said you couldn't find such a statement in perlre.

    Paul found the phrase "left to right" in perlre.

    What are you still having trouble with?
     
    Brian McCauley, Mar 26, 2007
    #6
  7. Flo

    Flo Guest

    > Paul found the phrase "left to right" in perlre.
    >
    > What are you still having trouble with?


    Because that phrase is about alternation, i.e. the alternation
    operator |. It is thus not applicable to my problem.

    Flo
     
    Flo, Mar 26, 2007
    #7
  8. Paul Lalli

    Paul Lalli Guest

    On Mar 26, 1:18 pm, "Flo" <> wrote:
    > > Paul found the phrase "left to right" in perlre.

    >
    > > What are you still having trouble with?

    >
    > Because that phrase is about alternation, i.e. the alternation
    > operator |. It is thus not applicable to my problem.


    You're right. My mistake. I quoted the wrong passage. Instead, how
    about this one, from `perldoc perlretut`?
    o Principle 0: Taken as a whole, any regexp will be
    matched at the earliest possible position in the string.

    o Principle 1: In an alternation "a|b|c...", the leftmost
    alternative that allows a match for the whole regexp
    will be the one used.

    o Principle 2: The maximal matching quantifiers "?", "*",
    "+" and "{n,m}" will in general match as much of the
    string as possible while still allowing the whole regexp
    to match.

    o Principle 3: If there are two or more elements in a
    regexp, the leftmost greedy quantifier, if any, will
    match as much of the string as possible while still
    allowing the whole regexp to match. The next leftmost
    greedy quantifier, if any, will try to match as much of
    the string remaining available to it as possible, while
    still allowing the whole regexp to match. And so on,
    until all the regexp elements are satisfied.


    Pay close attention to Principles 0 and 3.
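As a rough illustration of Principle 3 (a sketch only, with the test string borrowed from the earlier one-liners and array names invented here), the anchored patterns below show the leftmost quantifier getting first pick, whether it is greedy or lazy:

```perl
use strict;
use warnings;

my $s = "Foo Bar";

# Two greedy groups: the leftmost one takes everything it can.
my @both_greedy = $s =~ /^(.*)(.*)$/;

# Lazy group first: it takes as little as it can (nothing), so the
# greedy group after it is left with the whole string.
my @lazy_greedy = $s =~ /^(.*?)(.*)$/;

# Three greedy groups: only the leftmost gets any characters.
my @three_greedy = $s =~ /^(.*)(.*)(.*)$/;

printf qq{"%s"-"%s"\n}, @both_greedy;        # "Foo Bar"-""
printf qq{"%s"-"%s"\n}, @lazy_greedy;        # ""-"Foo Bar"
printf qq{"%s"-"%s"-"%s"\n}, @three_greedy;  # "Foo Bar"-""-""
```

In list context a successful match returns the captured groups, which is why the results can be assigned straight to arrays.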

    Paul Lalli
     
    Paul Lalli, Mar 26, 2007
    #8
  9. John W. Krahn

    Flo wrote:
    > Here the correct link
    > http://groups.google.com/group/comp... order of evaluation &rnum=5#78e3e006c5b99c42


    While that is a fascinating article on "precedence" and "order of evaluation",
    it is about the C programming language and has nothing to do with
    regular expressions. For almost everything you need to know about regular
    expressions, get Jeffrey Friedl's book "Mastering Regular Expressions".

    http://www.oreilly.com/catalog/regex3/index.html


    John
    --
    Perl isn't a toolbox, but a small machine shop where you can special-order
    certain sorts of tools at low cost and in short order. -- Larry Wall
     
    John W. Krahn, Mar 26, 2007
    #9
  10. Brian McCauley

    On Mar 26, 6:18 pm, "Flo" <> wrote:
    > > Paul found the phrase "left to right" in perlre.

    >
    > > What are you still having trouble with?

    >
    > Because that phrase is about alternation, i.e. the alternation
    > operator |. It is thus not applicable to my problem.


    Oops, my bad.
     
    Brian McCauley, Mar 27, 2007
    #10
  11. Xiong Changnian

    In article <>,
    "Flo" <> wrote:

    > I'd like to know the name of the rule which states why the first
    > and not the second star * greedily matches all.


    Um, I'm not sure it's directly stated. Others have posted sources saying
    that the regexp is evaluated left-to-right. By the time the second * comes
    up, there isn't anything "remaining" of the target.
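The "nothing remaining" effect can be made visible by requiring the second group to match something, which forces the first group to hand characters back. This is a small sketch under that assumption; the patterns and array names are illustrative and not taken from the thread.

```perl
use strict;
use warnings;

my $s = "Foo Bar";

# The second group needs at least one character, so the first group
# backtracks and gives up exactly one.
my @one = $s =~ /(.*)(.+)/;

# The second group needs exactly three characters, so the first group
# gives up three.
my @three = $s =~ /(.*)(.{3})/;

printf qq{"%s"-"%s"\n}, @one;    # "Foo Ba"-"r"
printf qq{"%s"-"%s"\n}, @three;  # "Foo "-"Bar"
```

With a plain second `.*` nothing is demanded of the second group, so the first one never has to give anything back.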
    --
    Xiong Changnian
    xiong102ATxuefangDOTcom

     
    Xiong Changnian, Apr 2, 2007
    #11