Why does qr// need the /o modifier, or is there a bug in the documentation?

Discussion in 'Perl Misc' started by ddtl, Sep 2, 2003.

  1. ddtl

    ddtl Guest

    Hello everybody,

    I have some difficulty understanding why the qr// operator needs the
    'o' modifier. There seems to be a disagreement between the perlop manpage
    and "Programming Perl" (3rd edition; I will use PP for short from now
    on).

    From PP (chapter 5.9.2.2), it is clear that qr// is needed for cases
    where it is impossible to use the usual /o modifier and the programmer
    wants to avoid recompilation every time the RE is evaluated. Here is a quote:

    -----------------------------------------------------------------------
    Variables that interpolate into patterns necessarily do so at run time,
    not compile time. This slows down execution because Perl has to check
    whether you've changed the contents of the variable; if so, it would
    have to recompile the regular expression. As mentioned in
    "Pattern-Matching Operators", if you promise never to change the pattern,
    you can use the /o option to interpolate and compile only once:

    print if /$pattern/o;

    Although that works fine in our pgrep program, in the general case,
    it doesn't. Imagine you have a slew of patterns, and you want to match
    each of them in a loop, perhaps like this:

    foreach $item (@data) {
        foreach $pat (@patterns) {
            if ($item =~ /$pat/) { ... }
        }
    }

    You couldn't write /$pat/o because the meaning of $pat varies each time
    through the inner loop.

    The solution to this is the qr/PATTERN/imosx operator. This operator
    quotes--and compiles--its PATTERN as a regular expression. PATTERN is
    interpolated the same way as in m/PATTERN/. If ' is used as the delimiter,
    no interpolation of variables (or the six translation escapes) is done.
    The operator returns a Perl value that may be used instead of the equivalent
    literal in a corresponding pattern match or substitute.
    -----------------------------------------------------------------------
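
    A minimal sketch of how the nested loop from the quote could look once the patterns are precompiled with qr//. The sample data and the @hits array are made up for illustration and are not from the book:

```perl
use strict;
use warnings;

# Hypothetical sample data, for illustration only.
my @data     = ('apple pie', 'banana split', 'cherry tart');

# Each pattern is compiled here, once, when qr// is executed.
my @patterns = map { qr/$_/ } ('an', 'pie$');

my @hits;
foreach my $item (@data) {
    foreach my $pat (@patterns) {
        # $pat already holds a compiled regex, so it can be used standalone.
        push @hits, $item if $item =~ $pat;
    }
}

print "$_\n" for @hits;
```

    The point of the sketch is only that the compilation happens in the map, not inside the inner loop.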

    But the perlop manpage says something different:


    -----------------------------------------------------------------------
    qr/STRING/imosx

    This operator quotes (and possibly compiles) its STRING as a regular
    expression. STRING is interpolated the same way as PATTERN in m/PATTERN/.
    If "'" is used as the delimiter, no interpolation is done. Returns a
    Perl value which may be used instead of the corresponding /STRING/imosx
    expression.
    -----------------------------------------------------------------------

    According to the manpage ("possibly compiles"), it is quite common that
    qr// does not compile the RE. Apart from contradicting the book, that
    doesn't really make sense: why would anybody need qr// otherwise?

    Also, the book doesn't even mention the existence of the /o modifier for
    qr// when it discusses modifiers (in the same section, though it
    does mention it in the previous quote):

    -----------------------------------------------------------------------
    ....
    ....
    The reason this works is because the qr// operator returns a special kind
    of object that has a stringification overload as described in Chapter 13,
    "Overloading". If you print out the return value, you'll see the equivalent
    string:

    $re = qr/my.STRING/is;
    print $re; # prints (?si-xm:my.STRING)

    The /s and /i modifiers were enabled in the pattern because they were
    supplied to qr//. The /x and /m, however, are disabled because they were not.
    -----------------------------------------------------------------------
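
    A small sketch of what the quoted passage describes: the value returned by qr// is an object that carries its own modifiers, whether it is used standalone or interpolated into a larger pattern. The test strings and variable names are made up for illustration, and the exact stringified form is deliberately not printed because it differs between Perl versions:

```perl
use strict;
use warnings;

my $re = qr/my.STRING/is;

my $kind     = ref $re;                    # qr// returns a blessed Regexp object
my $direct   = "MY\nstring" =~ $re;        # used standalone; /i and /s travel with it
my $embedded = "<my string>" =~ /<$re>/;   # interpolated into a larger pattern

print "ref: $kind\n";
print "direct match\n"   if $direct;
print "embedded match\n" if $embedded;
```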



    Additionally, using the 'use re "debug";' pragma, I checked the
    difference between adding the /o modifier to qr// and leaving it out.
    As I found out, there is no difference: the expression was compiled
    only once, when the compiler reached the qr// operator (whereas it was
    compiled every time the RE was evaluated when a plain double-quoted
    string was used). Here is my test case:

    ---------------------------------------
    #!/usr/bin/perl
    use strict;
    use re "debug";

    my $re = qr/world/;
    "hello world" =~ /$re/;
    "hello world" =~ /$re/;
    ---------------------------------------


    The output I get when running the program (which is the same regardless of
    the /o modifier):



    ---------------------------------------
    Compiling REx `world'
    size 4 Got 36 bytes for offset annotations.
    first at 1
    1: EXACT <world>(4)
    4: END(0)
    anchored `world' at 0 (checking anchored isall) minlen 5
    Offsets: [4]
    1[5] 0[0] 0[0] 6[0]
    Guessing start of match, REx `world' against `hello world'...
    Found anchored substr `world' at offset 6...
    Starting position does not contradict /^/m...
    Guessed: match at offset 6
    Guessing start of match, REx `world' against `hello world'...
    Found anchored substr `world' at offset 6...
    Starting position does not contradict /^/m...
    Guessed: match at offset 6
    Freeing REx: `"world"'
    ---------------------------------------




    Is there an error in the manpage? If not, how can the difference between
    the book and the manpage be explained, and especially:
    why is there a need for qr// at all according to the manpage's version?


    ddtl.
    ddtl, Sep 2, 2003
    #1

  2. ddtl

    Anno Siegel Guest

    ddtl <> wrote in comp.lang.perl.misc:
    >
    > Hello everybody,
    >
    > I have some difficulty to understand why does qr// operator needs
    > 'o' modifier. There seems to be a disagreement between perlop manpage
    > and "Programming Perl" (3rd edition - i will use PP for short from now
    > on).


    The qr// operator doesn't need the /o modifier, you must have
    misunderstood what the documentation is saying. The relation is that
    qr// can be used to achieve what /o would be needed for without it.

    [snip]

    Anno
    Anno Siegel, Sep 3, 2003
    #2

  3. ddtl

    ddtl Guest


    >The qr// operator doesn't need the /o modifier, you must have
    >misunderstood what the documentation is saying. The relation is that
    >qr// can be used to achieve what /o would be needed for without it.


    Maybe I chose the wrong wording. qr// certainly does not
    *need* the /o modifier in the sense that it is syntactically correct to
    write a statement containing the qr// operator without an /o
    modifier, but if you don't add the /o modifier, the RE will be compiled
    every time it is evaluated; that is what the manpage says.

    First of all, the operator's signature is (everything between double
    quotes is quoted from the perlop manpage):

    "qr/STRING/imosx"

    that is, obviously there *is* an /o modifier for qr//.

    "Options are:

    i Do case-insensitive pattern matching.
    m Treat string as multiple lines.
    o Compile pattern only once.
    s Treat string as single line.
    x Use extended regular expressions.
    "

    As the quote above says, if you use /o, the pattern is compiled only once,
    which obviously means that if you don't use /o, the pattern will be
    compiled more than once, which is exactly what happens when, instead of
    using a qr//'ed expression, you use a plain variable in m// or s//.


    So, obviously the qr// operator *does* need the /o modifier in order to
    be equivalent to the usual RE with the /o modifier, which means that the
    question is still valid.

    Maybe you have another explanation for the above quotes from the manpage?
    For the time being I don't see any other way to understand it...


    ddtl.
    ddtl, Sep 3, 2003
    #3
  4. On 3 Sep 2003, Anno Siegel wrote:

    >ddtl <> wrote in comp.lang.perl.misc:
    >>
    >> Hello everybody,
    >>
    >> I have some difficulty to understand why does qr// operator needs
    >> 'o' modifier. There seems to be a disagreement between perlop manpage
    >> and "Programming Perl" (3rd edition - i will use PP for short from now
    >> on).

    >
    >The qr// operator doesn't need the /o modifier, you must have
    >misunderstood what the documentation is saying. The relation is that
    >qr// can be used to achieve what /o would be needed for without it.


    But the qr// operator can take the /o modifier. Perhaps it's a super rare
    condition, but you can use it.

    --
    Jeff Pinyan RPI Acacia Brother #734 2003 Rush Chairman
    "And I vos head of Gestapo for ten | Michael Palin (as Heinrich Bimmler)
    years. Ah! Five years! Nein! No! | in: The North Minehead Bye-Election
    Oh. Was NOT head of Gestapo AT ALL!" | (Monty Python's Flying Circus)
    Jeff 'japhy' Pinyan, Sep 3, 2003
    #4
  5. ddtl

    Sam Holden Guest

    On Wed, 03 Sep 2003 18:52:18 +0400, ddtl <> wrote:
    >
    >>The qr// operator doesn't need the /o modifier, you must have
    >>misunderstood what the documentation is saying. The relation is that
    >>qr// can be used to achieve what /o would be needed for without it.

    >
    > Maybe i have chosen rather wrong wording. qr// certainly does not
    > *need* the /o operator in a sense that it is syntactically correct to
    > write a statement containing qr// operator without writing an /o
    > modifier, but if you don't add /o modifier, RE will be compiled every
    > time it is evaluated, that is what manpage says.


    The manpage does not say that. In fact the manpage says almost the opposite:

    Since Perl may compile the pattern at the moment of execution of qr()
    operator, using qr() may have speed advantages in some situations,
    notably if the result of qr() is used standalone:

    [snip example that doesn't use /o]

    Precompilation of the pattern into an internal representation at the
    moment of qr() avoids a need to recompile the pattern every time a
    match "/$pat/" is attempted.

    - perldoc perlop



    > First of all, operator's signature is (everything between double
    > quotes is quoted from perlop manpage):
    >
    > "qr/STRING/imosx"
    >
    > that is, obviously there *is* /o modifier for qr//.
    >
    > "Options are:
    >
    > i Do case-insensitive pattern matching.
    > m Treat string as multiple lines.
    > o Compile pattern only once.
    > s Treat string as single line.
    > x Use extended regular expressions.
    > "
    >
    > As the quote above says, if you use /o, pattern is compiled only once,
    > which obviously means that if you don't use /o - pattern would be
    > compiled more than once, which is exactly what happens when instead of
    > using 'qr//'ed expression you use a plain variable in m// or s//.


    Just because A -> B, does not mean that !A -> !B.

    Just because the pattern is compiled once with /o, does not mean that
    the pattern is not compiled once without /o.

    >
    > So, obviously qr// operator *does* need the /o modifier in order to
    > be equivalent to the usual RE with /o modifier, which means that the
    > question is still valid.
    >
    > Maybe you have another explanation to the above quotes from the manpage?
    > For the time being i don't see any other way to understand it...


    The o on qr//o doesn't do anything, since qr// precompiles already.

    For example:

    $needle = 'foo';
    $re = qr/$needle/;
    $reo = qr/$needle/o;

    sub check {
        print "\$needle = $needle\n";
        for (@_) {
            print ' /$needle/ matches ',"$_\n" if /$needle/;
            print ' /$needle/o matches ',"$_\n" if /$needle/o;
            print ' /$re/ matches ', "$_\n" if /$re/;
            print ' /$reo/ matches ', "$_\n" if /$reo/;
        }
    }


    check('barbaz');
    $needle = 'bar';
    check('barbaz');

    Obviously qr// *does not* need the /o modifier in order to be
    equivalent to the usual RE with the /o modifier, as evidenced
    by the fact that /$needle/o, /$re/, and /$reo/ all fail to match
    'barbaz' even though $needle is set to 'bar' in the above code.

    --
    Sam Holden
    Sam Holden, Sep 3, 2003
    #5
  6. ddtl

    Anno Siegel Guest

    ddtl <> wrote in comp.lang.perl.misc:
    >
    > >The qr// operator doesn't need the /o modifier, you must have
    > >misunderstood what the documentation is saying. The relation is that
    > >qr// can be used to achieve what /o would be needed for without it.

    >
    > Maybe i have chosen rather wrong wording. qr// certainly does not
    > *need* the /o operator in a sense that it is syntactically correct to
    > write a statement containing qr// operator without writing an /o
    > modifier, but if you don't add /o modifier, RE will be compiled every
    > time it is evaluated, that is what manpage says.
    >
    > First of all, operator's signature is (everything between double
    > quotes is quoted from perlop manpage):
    >
    > "qr/STRING/imosx"
    >
    > that is, obviously there *is* /o modifier for qr//.
    >
    > "Options are:
    >
    > i Do case-insensitive pattern matching.
    > m Treat string as multiple lines.
    > o Compile pattern only once.
    > s Treat string as single line.
    > x Use extended regular expressions.
    > "
    >
    > As the quote above says, if you use /o, pattern is compiled only once,
    > which obviously means that if you don't use /o - pattern would be
    > compiled more than once, which is exactly what happens when instead of
    > using 'qr//'ed expression you use a plain variable in m// or s//.
    >
    >
    > So, obviously qr// operator *does* need the /o modifier in order to
    > be equivalent to the usual RE with /o modifier, which means that the
    > question is still valid.


    Your question seems to be: If qr// is supposed to be used when /o can't
    be (because the pattern changes occasionally), why is it allowed to
    recompile each time (without /o).

    The answer is, you are not supposed to just replace /.../o by qr/.../
    literally. qr// allows you to compile a regex in one place, and apply
    it in another. So you recompile (using qr//) when needed, and replace
    the /.../o with a variable that holds the value where you want to apply
    the regex.

    Look again at the examples, which you quoted in your first post. They
    make pretty clear how qr// is supposed to solve the //o problem.

    > Maybe you have another explanation to the above quotes from the manpage?
    > For the time being i don't see any other way to understand it...


    You have made a wrong assumption: That qr// goes where the regex used to be.

    Anno
    Anno Siegel, Sep 3, 2003
    #6
  7. [posted & mailed]

    On 3 Sep 2003, Sam Holden wrote:

    > $needle = 'foo';
    > $re = qr/$needle/;
    > $reo = qr/$needle/o;


    This is a bad example. These lines are only RUN once.

    Compare:

    sub make_qr {
        my $pat = shift;
        return qr/$pat/o;
    }

    print make_qr('foo'), "\n";
    print make_qr('bar'), "\n";

    It prints the foo regex both times.
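
    To make the contrast explicit, here is a hedged sketch that puts the same factory side by side with and without /o. The sub names make_qr_once and make_qr_fresh are made up for illustration; the behaviour of the /o variant is the one described in the post above:

```perl
use strict;
use warnings;

# With /o the pattern is interpolated and compiled on the first call only.
sub make_qr_once  { my $pat = shift; return qr/$pat/o; }

# Without /o the pattern is compiled again on every call.
sub make_qr_fresh { my $pat = shift; return qr/$pat/; }

my @once  = map { make_qr_once($_) }  qw(foo bar);
my @fresh = map { make_qr_fresh($_) } qw(foo bar);

# The second regex from each factory is tried against 'bar'.
my $once_hit  = ('bar' =~ $once[1])  ? 'yes' : 'no';
my $fresh_hit = ('bar' =~ $fresh[1]) ? 'yes' : 'no';

print "qr//o factory, second regex matches 'bar': $once_hit\n";
print "qr//  factory, second regex matches 'bar': $fresh_hit\n";
```

    So inside a sub that is called with different patterns, /o on qr// is usually the opposite of what is wanted.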

    --
    Jeff Pinyan RPI Acacia Brother #734 2003 Rush Chairman
    "And I vos head of Gestapo for ten | Michael Palin (as Heinrich Bimmler)
    years. Ah! Five years! Nein! No! | in: The North Minehead Bye-Election
    Oh. Was NOT head of Gestapo AT ALL!" | (Monty Python's Flying Circus)
    Jeff 'japhy' Pinyan, Sep 4, 2003
    #7
  8. ddtl

    Sam Holden Guest

    On Wed, 3 Sep 2003 22:22:47 -0400, Jeff 'japhy' Pinyan <> wrote:
    > [posted & mailed]
    >
    > On 3 Sep 2003, Sam Holden wrote:
    >
    >> $needle = 'foo';
    >> $re = qr/$needle/;
    >> $reo = qr/$needle/o;

    >
    > This is a bad example. These lines are only RUN once.


    I thought the post I was replying to was referring to that case...

    Of course, I'm often wrong.

    --
    Sam Holden
    Sam Holden, Sep 4, 2003
    #8
  9. ddtl

    ddtl Guest


    >Your question seems to be: If qr// is supposed to be used when /o can't
    >be (because the pattern changes occasionally), why is it allowed to
    >recompile each time (without /o).
    >
    >The answer is, you are not supposed to just replace /.../o by qr/.../
    >literally. qr// allows you to compile a regex in one place, and apply
    >it in another. So you recompile (using qr//) when needed, and replace
    >the /.../o with a variable that holds the value where you want to apply
    >the regex.


    But that does not explain why /o is needed! Yes, qr// allows you to
    compile a regex in one place, and apply it in another, because if
    it wasn't possible your RE would be recompiled every time it is evaluated,
    and we want to be able to compile RE only once, so we use qr//. But
    if we use qr// with /o, RE would be compiled every time RE is evaluated,
    which defeats the whole reason for usage of qr//o.

    Maybe you could give an example where it makes a difference between
    using qr//o and a plain quoted string, that is, between:

    -----------------
    my $re = qr/hello/o;
    ....
    ....
    /$re/;
    ....
    ....
    /$re/;
    -----------------

    and:

    -----------------
    my $re = /hello/;
    ....
    ....
    /$re/;
    ....
    ....
    /$re/;
    -----------------

    According to the manpage, whenever "/$re/;" is evaluated, the RE would be
    recompiled in both examples, so why would I use qr//o at all? Exactly
    the same thing happens when qr//o is not used!

    And that is leaving aside the fact that in the first example the RE is
    actually compiled only once (when "my $re = qr/hello/o;" is evaluated, and
    it makes no difference whether you use /o or not, which means that /o does
    not have *any* effect on compilation of the RE, which is not what the
    documentation says), while in the second example the RE is compiled every
    time it is evaluated (that is, every time "/$re/;" is executed), though
    according to the manpage qr//o is supposed to be recompiled every time the
    RE is evaluated.


    >Look again at the examples, which you quoted in your first post. They
    >make pretty clear how qr// is supposed to solve the //o problem.


    It is clear to me what problem qr// is supposed to solve. It is not
    clear

    1) why somebody would use the /o modifier,

    and

    2) what the difference is between using qr// and qr//o. According
    to the debugging output there is none at all, and the /o modifier does
    not have any effect (try running the examples with 'use re "debug";').


    ddtl.
    ddtl, Sep 4, 2003
    #9
  10. ddtl

    ddtl Guest

    >Just because A -> B, does not mean that !A -> !B.
    >
    >Just because the pattern is compiled once with /o, does not mean that
    >the pattern is not compiled once without /o.


    So what does that mean? Do you mean that when it is said:

    "o Compile pattern only once."

    it means that when you *do not* use 'o', the pattern is also compiled only
    once?? If A -> B does not mean that !A -> !B (and what you want to say
    is that when !A there is still B), that means that A is not the only reason
    for B. If so, why do you need A at all? It is surely not because
    you want B, because B exists even without A. And that is just a rephrasing
    of my question: *why* do you need /o???

    ddtl.
    ddtl, Sep 4, 2003
    #10
  11. ddtl

    Anno Siegel Guest

    ddtl <> wrote in comp.lang.perl.misc:
    >
    > >Your question seems to be: If qr// is supposed to be used when /o can't
    > >be (because the pattern changes occasionally), why is it allowed to
    > >recompile each time (without /o).
    > >
    > >The answer is, you are not supposed to just replace /.../o by qr/.../
    > >literally. qr// allows you to compile a regex in one place, and apply
    > >it in another. So you recompile (using qr//) when needed, and replace
    > >the /.../o with a variable that holds the value where you want to apply
    > >the regex.

    >
    > But that does not explain why /o is needed! Yes, qr// allows you to
    > compile a regex in one place, and apply it in another, because if
    > it wasn't possible your RE would be recompiled every time it is evaluated,
    > and we want to be able to compile RE only once, so we use qr//. But
    > if we use qr// with /o, RE would be compiled every time RE is evaluated,
    > which defeats the whole reason for usage of qr//o.
    >
    > Maybe you could give an example when it makes difference between
    > using qr//o and plain quoted string, that is, between:
    >
    > -----------------
    > my $re = qr/hello/o;
    > ...
    > ...
    > /$re/;
    > ...
    > ...
    > /$re/;
    > -----------------


    Well, as the documentation assures us, no re-compilation happens in this
    case. Why are you assuming the opposite? The /o is irrelevant here,
    because the statement is executed only once anyway. To be sure that
    no re-compilation *can* happen, rewrite it as

    my $re = qr/hello/;
    # ...
    # ...
    $_ =~ $re;
    # ...
    # ...
    $_ =~ $re;

    Now we don't have a regex literal in the match, so no compilation happens.

    > and:
    >
    > -----------------
    > my $re = /hello/;


    You mean 'hello', not /hello/.

    > ...
    > ...
    > /$re/;
    > ...
    > ...
    > /$re/;
    > -----------------


    In fact, even this may not re-compile the regex, if $re hasn't been
    changed. The regex compiler has become rather clever about these things.
    But that's beside the point. Originally, the regex would be recompiled,
    and that's one of the reasons why qr// has been invented.

    It is perhaps unfortunate that the Camel doesn't show an example of a
    non-literal (bare-variable) pattern match, it could make things clearer.

    > According to the manpage, whenever "/$re/;" is evaluated, RE would be
    > recompiled in both examples, so why would i use qr//o at all - exactly
    > the same thing happens when qr//o is not used!


    According to what manpage? Quoting "perldoc perlop":

    Since Perl may compile the pattern at the moment
    of execution of qr() operator, using qr() may have
    speed advantages in some situations, notably if
    the result of qr() is used standalone:

    sub match {
        my $patterns = shift;
        my @compiled = map qr/$_/i, @$patterns;
        grep {
            my $success = 0;
            foreach my $pat (@compiled) {
                $success = 1, last if /$pat/;
            }
            $success;
        } @_;
    }

    Precompilation of the pattern into an internal
    representation at the moment of qr() avoids a need
    to recompile the pattern every time a match ...

    This clearly states that a qr//-pattern, used stand-alone in m// does
    *not* cause re-compilation.

    The use of /o with qr// is a red herring. It rarely makes sense.
    Either the qr// is run only once, then it doesn't matter. Or you
    run over it again, but then you usually do so because you want another
    regex compiled, and /o would defeat the purpose.

    [snip argument about /o]

    Anno
    Anno Siegel, Sep 4, 2003
    #11
  12. ddtl

    Sam Holden Guest

    On Thu, 04 Sep 2003 19:51:02 +0400, ddtl <> wrote:
    >>Just because A -> B, does not mean that !A -> !B.
    >>
    >>Just because the pattern is compiled once with /o, does not mean that
    >>the pattern is not compiled once without /o.

    >
    > So what that means? Do you mean that when it is said:
    >
    > "o Compile pattern only once."
    >
    > means that when you *do not* use 'o', pattern is also compiled only
    > once?? If A -> B does not mean that !A -> !B (and what you want to say
    > is that when !A there is still B), means that A is not the only reason
    > for B. If so, why do you need A at all - it is surely not because
    > you want B, because B exists even without A. And that is just rephrasing
    > of my question *why* do you need /o???


    In this example:

    $foo = "foo";
    $re = qr/$foo/;
    $reo = qr/$foo/o;
    for ("foo", "bar") {
        $foo = $_;
        print "$_ matches re\n" if $_=~/$re/;
        print "$_ matches reo\n" if $_=~/$reo/;
    }

    The regex is compiled once, as evidenced by the non-matching of "bar" when
    $foo is changed. This occurs both with and without /o.

    qr//o means that reevaluating the qr//o expression won't recompile the
    regex (there was a previous post pointing this out), for example:

    for ("foo", "bar") {
        $foo = $_;
        $re = qr/$foo/;
        $reo = qr/$foo/o;
        print "$_ matches re\n" if $_=~/$re/;
        print "$_ matches reo\n" if $_=~/$reo/;
    }

    If you need to do something like that, then you need qr//o.

    However, qr// is only compiled when qr// is executed, not when the resulting
    regex is used. I was interpreting "recompiled" to mean at the point of usage.
    I can't see why, if qr//o was wanted, the qr// wouldn't just be moved out of
    the loop or function where it was being evaluated and the resulting value
    used instead. It seems like more of a compatibility-with-normal-regexes
    type feature to me. Of course I am often wrong.
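
    A rough sketch of the alternative suggested above: rather than relying on qr//o inside a loop, hoist a plain qr// out of it. The search term and the sample lines are made up for illustration:

```perl
use strict;
use warnings;

my $word = 'foo';                 # hypothetical search term
my $re   = qr/\b\Q$word\E\b/;     # compiled once, before the loop; no /o needed

my @matches;
for my $line ('foo bar', 'food', 'a foo') {
    # The loop body only applies the already-compiled regex.
    push @matches, $line if $line =~ $re;
}

print "$_\n" for @matches;
```

    If the term ever has to change, running the qr// line again gives a fresh regex, which is exactly what /o would prevent.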

    --
    Sam Holden
    Sam Holden, Sep 4, 2003
    #12
