unexplained warning message in m{...} regexp

Discussion in 'Perl Misc' started by Klaus, Apr 24, 2009.

  1. Klaus

    Klaus Guest

    I am trying to match a literal string '{0,0}' using the syntax m{...}.
    I know that I have to escape both the '{' and '}' characters.

    Here is my program
    ========================
    use strict;
    use warnings;

    $_ = '{0,0}';
    if (m{\A\{0,0\}\z}) {
    print "yes\n";
    }
    else {
    print "no\n";
    }
    ========================

    The regexp works as intended and prints "yes", but there is an
    unexplained warning message:

    ========================
    Quantifier unexpected on zero-length expression in regex; marked by
    <-- HERE in m/\A{0,0}\z <-- HERE / at Testregexp.pl line 5.
    yes
    ========================

    The message does not appear if I use /\A\{0,0\}\z/.

    It seems to me that Perl is confused about using '{' and '}' inside a
    match of the form m{...}

    I am using Activestate Perl 5.10 on Windows XP.

    C:\>perl -v

    This is perl, v5.10.0 built for MSWin32-x86-multi-thread
    (with 5 registered patches, see perl -V for more detail)

    Copyright 1987-2007, Larry Wall

    Binary build 1004 [287188] provided by ActiveState http://www.ActiveState.com
    Built Sep 3 2008 13:16:37

    --
    Klaus
    Klaus, Apr 24, 2009
    #1

  2. Klaus

    Frank Seitz Guest

    Klaus wrote:
    >
    > It seems to me that Perl is confused about using '{' and '}' inside a
    > match of the form m{...}


    Perl is not confused. It's a syntax error, because { and } have a special
    meaning in regexes. See perldoc perlre (Section "Quantifiers").

    Frank
    --
    Dipl.-Inform. Frank Seitz; http://www.fseitz.de/
    Applications for your Internet and intranet
    Tel: 04103/180301; Fax: -02; Industriestr. 31, 22880 Wedel
    Frank Seitz, Apr 24, 2009
    #2

  3. Klaus

    Klaus Guest

    On Apr 24, 10:32 am, Frank Seitz <> wrote:
    > Klaus wrote:
    >
    > > It seems to me that Perl is confused about using '{' and '}' inside a
    > > match of the form m{...}

    >
    > Perl is not confused. It's a syntax error, because { and } have a special
    > meaning in regexes. See perldoc perlre (Section "Quantifiers").


    Please note that I have escaped '\{' and '\}' inside m{\A\{0,0\}\z}...

    ...and why does the message disappear if I use /\A\{0,0\}\z/ ?

    --
    Klaus
    Klaus, Apr 24, 2009
    #3
  4. Klaus

    Frank Seitz Guest

    Klaus wrote:
    > On Apr 24, 10:32 am, Frank Seitz <> wrote:
    >> Klaus wrote:
    >>>
    >>> It seems to me that Perl is confused about using '{' and '}' inside a
    >>> match of the form m{...}

    >> Perl is not confused. It's a syntax error, because { and } have a special
    >> meaning in regexes. See perldoc perlre (Section "Quantifiers").

    >
    > Please note that I have escaped '\{' and '\}' inside m{\A\{0,0\}\z}...


    Here, the \-escape eliminates the meaning as delimiter.
    The { } become metacharacters.

    > ...and why does the message disappear if I use /\A\{0,0\}\z/. ?


    Here, the \-escape eliminates the meaning as metacharacter.
    The { } become normal characters.

    Frank
    --
    Dipl.-Inform. Frank Seitz; http://www.fseitz.de/
    Applications for your Internet and intranet
    Tel: 04103/180301; Fax: -02; Industriestr. 31, 22880 Wedel
    Frank Seitz, Apr 24, 2009
    #4
  5. Klaus

    Teo Guest

    Dear Frank,

    On Apr 24, 10:32 am, Frank Seitz <> wrote:
    > Klaus wrote:
    >
    > > It seems to me that Perl is confused about using '{' and '}' inside a
    > > match of the form m{...}

    >
    > Perl is not confused. It's a syntax error, because { and } have a special
    > meaning in regexes. See perldoc perlre (Section "Quantifiers").


    No, it is not: the curly brackets are correctly escaped. The problem only
    occurs if {} are used as the delimiters: other bracketing delimiters
    (e.g., m(\A\{0,0\}\z) ) do not provoke the warning.

    I can reproduce the problem with both 5.8.9 and 5.10.0

    Matteo
    Teo, Apr 24, 2009
    #5
  6. Klaus

    Frank Seitz Guest

    Teo wrote:
    > On Apr 24, 10:32 am, Frank Seitz <> wrote:
    >> Klaus wrote:
    >>
    >>> It seems to me that Perl is confused about using '{' and '}' inside a
    >>> match of the form m{...}

    >> Perl is not confused. It's a syntax error, because { and } have a special
    >> meaning in regexes. See perldoc perlre (Section "Quantifiers").

    >
    > No, it is not: the curly brackets are correctly escaped. The problem only
    > occurs if {} are used as the delimiters: other bracketing delimiters
    > (e.g., m(\A\{0,0\}\z) ) do not provoke the warning.
    >
    > I can reproduce the problem with both 5.8.9 and 5.10.0


    See <>

    Frank
    --
    Dipl.-Inform. Frank Seitz; http://www.fseitz.de/
    Applications for your Internet and intranet
    Tel: 04103/180301; Fax: -02; Industriestr. 31, 22880 Wedel
    Frank Seitz, Apr 24, 2009
    #6
  7. Klaus wrote:
    > I am trying to match a literal string '{0,0}' using the syntax m{...}.
    > I know that I have to escape both the '{' and '}' characters.
    >
    > Here is my program
    > ========================
    > use strict;
    > use warnings;
    >
    > $_ = '{0,0}';
    > if (m{\A\{0,0\}\z}) {
    > print "yes\n";
    > }

    [...]
    > ========================
    > Quantifier unexpected on zero-length expression in regex; marked by
    > <-- HERE in m/\A{0,0}\z <-- HERE / at Testregexp.pl line 5.
    > yes
    > ========================


    Same here - Perl 5.19 on Debian/Linux.

    Seems to be a bug.

    Workarounds:

    m{\A[{]0,0[}]\z}
    m{\A\{0\,0\}\z}
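
A minimal sketch of the first workaround, plus one alternative that is not from this thread (building the literal with quotemeta and interpolating it). The variable names are made up for illustration. Both forms avoid relying on how backslash-escaped braces are treated inside m{...}.

```perl
use strict;
use warnings;

# Illustrative names only; the patterns are what matter.
my $target = '{0,0}';

# Workaround from the post: character classes. The braces stay balanced
# for the m{...} delimiter scan and are literal inside [...].
my $by_class = $target =~ m{\A[{]0,0[}]\z} ? 1 : 0;

# Alternative (an assumption, not from the thread): let quotemeta add the
# backslashes and interpolate the result, so no escaped brace appears in
# the pattern source itself.
my $literal  = quotemeta '{0,0}';
my $by_quote = $target =~ m{\A$literal\z} ? 1 : 0;

print "class: $by_class, quotemeta: $by_quote\n";
```

Under `use warnings`, neither pattern should trigger the "Quantifier unexpected" message, because no `{0,0}` quantifier ever reaches the regex engine.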

    Helmut Wollmersdorfer
    Helmut Wollmersdorfer, Apr 24, 2009
    #7
  8. Klaus

    Teo Guest

    On Apr 24, 10:58 am, Frank Seitz <> wrote:
    > Klaus wrote:
    > > On Apr 24, 10:32 am, Frank Seitz <> wrote:
    > >> Klaus wrote:

    >
    > >>> It seems to me that Perl is confused about using '{' and '}' inside a
    > >>> match of the form m{...}
    > >> Perl is not confused. It's a syntax error, because { and } have a special
    > >> meaning in regexes. See perldoc perlre (Section "Quantifiers").

    >
    > > Please note that I have escaped '\{' and '\}' inside m{\A\{0,0\}\z}...

    >
    > Here, the \-escape eliminates the meaning as delimiter.
    > The { } become metacharacters.
    >
    > > ...and why does the message disappear if I use /\A\{0,0\}\z/. ?

    >
    > Here, the \-escape eliminates the meaning as metacharacter.
    > The { } become normal characters.


    Ok I see but this is rather confusing:

    * in a // delimited regex the literal '/' has to be escaped
    * in a {} delimited regex the literal '{' has *not* to be escaped

    am I getting it right?

    But then when I look at perlop

    When searching for single-character delimiters, escaped delimiters
    and "\\" are skipped. For example, while searching for terminating
    "/", combinations of "\\" and "\/" are skipped. If the delimiters
    are bracketing, nested pairs are also skipped. For example, while
    searching for closing "]" paired with the opening "[", combinations
    of "\\", "\]", and "\[" are all skipped, and nested "[" and "]" are
    skipped as well. However, when backslashes are used as the
    delimiters (like "qq\\" and "tr\\\"), nothing is skipped. During the
    search for the end, backslashes that escape delimiters are removed
    (exactly speaking, they are not copied to the safe location).

    it gets more confusing. If I understand correctly the difference is
    only there if the { } are paired.
    In m{ aaa{bbb } the { is not escaped and it is understood as the
    beginning of a quantifier.

    In fact I get:

    Search pattern not terminated at ./test.pl line 6.

    So to have a literal '{' I should escape it if it is not paired and
    not escape it if it is paired.

    Did I get it wrong? (I sincerely hope so :)

    Matteo
    Teo, Apr 24, 2009
    #8
  9. Klaus

    Klaus Guest

    On Apr 24, 10:58 am, Frank Seitz <> wrote:
    > Klaus wrote:
    > > On Apr 24, 10:32 am, Frank Seitz <> wrote:
    > >> Klaus wrote:

    >
    > >>> It seems to me that Perl is confused about using '{' and '}' inside a
    > >>> match of the form m{...}
    > >> Perl is not confused. It's a syntax error, because { and } have a special
    > >> meaning in regexes. See perldoc perlre (Section "Quantifiers").

    >
    > > Please note that I have escaped '\{' and '\}' inside m{\A\{0,0\}\z}...

    >
    > Here, the \-escape eliminates the meaning as delimiter.


    I see, thanks for the explanation.

    > The { } become metacharacters.


    I find it unfortunate that they become metacharacters, particularly so
    because there is no reason to quote the metacharacters { } in the first
    place: they always come in pairs and are handled naturally by
    m{...{ }...} as a nested pair of curlies, for example m{a{1,2}}

    --
    Klaus
    Klaus, Apr 24, 2009
    #9
  10. Klaus

    Klaus Guest

    On Apr 24, 11:12 am, Helmut Wollmersdorfer <>
    wrote:
    > Klaus wrote:
    > > I am trying to match a literal string '{0,0}' using the syntax m{...}.
    > > I know that I have to escape both the '{' and '}' characters.

    >
    > > Here is my program
    > > ========================
    > > use strict;
    > > use warnings;

    >
    > > $_ = '{0,0}';
    > > if (m{\A\{0,0\}\z}) {
    > >     print "yes\n";
    > > }

    > [...]
    > > ========================
    > > Quantifier unexpected on zero-length expression in regex; marked by
    > > <-- HERE in m/\A{0,0}\z <-- HERE / at Testregexp.pl line 5.
    > > yes
    > > ========================

    >
    > Same here - Perl 5.19 on Debian/Linux.
    >
    > Seems to be a bug.
    >
    > Workarounds:
    >
    > m{\A[{]0,0[}]\z}
    > m{\A\{0\,0\}\z}


    Thanks for the workarounds, they work ok. But I agree that the
    original problem is a bug. Where can I post a bug report ?

    --
    Klaus
    Klaus, Apr 24, 2009
    #10
  11. Klaus <> wrote:

    > Where can I post a bug report ?



    perldoc -q bug

    Where do I send bug reports?

    perldoc perlbug


    --
    Tad McClellan
    email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
    Tad J McClellan, Apr 24, 2009
    #11
  12. Klaus <> wrote:
    >I am trying to match a literal string '{0,0}' using the syntax m{...}.


    If you don't mind me asking: why? I mean why do you use REs if you don't
    want their functionality ("match a literal string")?
    A simple index() is so much easier to use.

    jue
    Jürgen Exner, Apr 24, 2009
    #12
  13. Klaus

    Klaus Guest

    On Apr 24, 1:54 pm, Jürgen Exner <> wrote:
    > Klaus <> wrote:
    > >I am trying to match a literal string '{0,0}' using the syntax m{...}.

    >
    > If you don't mind me asking: why? I mean why do you use REs if you don't
    > want their functionality ("match a literal string")?
    > A simple index() is so much easier to use.


    You are right, in fact my regexp is anchored with \A and \z, so a
    simple $_ eq '{0,0}' should suffice, that's easier to read and
    probably much faster.
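
As a rough illustration of the two non-regex alternatives mentioned here (eq for a whole-string test, index for a substring search), with made-up variable names and sample strings:

```perl
use strict;
use warnings;

my $s = '{0,0}';

# Whole-string comparison: same intent as the \A...\z anchored pattern.
my $is_equal = $s eq '{0,0}' ? 1 : 0;

# Substring search: index returns the offset, or -1 when not found.
my $pos = index('x = {0,0};', '{0,0}');

print "equal: $is_equal, found at offset: $pos\n";
# prints: equal: 1, found at offset: 4
```

Neither form involves a pattern, so the question of escaping braces does not come up at all.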

    However, I maintain that a regexp m{\A\{0,0\}\z} should not emit a
    warning message and I have filed a bug report.

    --
    Klaus
    Klaus, Apr 24, 2009
    #13
  14. Klaus

    Willem Guest

    Klaus wrote:
    ) You are right, in fact my regexp is anchored with \A and \z, so a
    ) simple $_ eq '{0,0}' should suffice, that's easier to read and
    ) probably much faster.
    )
    ) However, I maintain that a regexp m{\A\{0,0\}\z} should not emit a
    ) warning message and I have filed a bugreport.

    Then how are you supposed to put a quantifier in an m{...} expression ?


    SaSW, Willem
    --
    Disclaimer: I am in no way responsible for any of the statements
    made in the above text. For all I know I might be
    drugged or something..
    No I'm not paranoid. You all think I'm paranoid, don't you !
    #EOT
    Willem, Apr 24, 2009
    #14
  15. Willem <> writes:

    > Then how are you supposed to put a quantifier in an m{...} expression ?


    By using another delimiter.

    //Makholm
    Peter Makholm, Apr 24, 2009
    #15
  16. Klaus

    Klaus Guest

    On Apr 24, 2:59 pm, Willem <> wrote:
    > Klaus wrote:
    >
    > ) You are right, in fact my regexp is anchored with \A and \z, so a
    > ) simple $_ eq '{0,0}' should suffice, that's easier to read and
    > ) probably much faster.
    > )
    > ) However, I maintain that a regexp m{\A\{0,0\}\z} should not emit a
    > ) warning message and I have filed a bugreport.
    >
    > Then how are you supposed to put a quantifier in an m{...} expression ?


    By not escaping the curlies { }, for example

    m{a{1,2}} matches 'a' or 'aa'
    m{a\{1,2\}} matches 'a{1,2}'
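
A small sketch of the first case only: unescaped, balanced braces inside m{...} acting as a quantifier. The anchors and the test strings are additions for illustration. The escaped form is left out on purpose, since its behaviour is the very point under dispute in this thread.

```perl
use strict;
use warnings;

# Unescaped, balanced braces nest inside m{...} and reach the regex
# engine as an ordinary {min,max} quantifier.
my @results;
for my $s ('a', 'aa', 'aaa') {
    push @results, ($s =~ m{\Aa{1,2}\z} ? 1 : 0);
}

print "@results\n";   # prints: 1 1 0
```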

    --
    Klaus
    Klaus, Apr 24, 2009
    #16
  17. On 2009-04-24, Teo <> wrote:
    >> > Please note that I have escaped '\{' and '\}' inside m{\A\{0,0\}\z}...

    >>
    >> Here, the \-escape eliminates the meaning as delimiter.
    >> The { } become metacharacters.
    >>
    >> > ...and why does the message disappear if I use /\A\{0,0\}\z/. ?

    >>
    >> Here, the \-escape eliminates the meaning as metacharacter.
    >> The { } become normal characters.

    >
    > Ok I see but this is rather confusing:
    >
    > * in a // delimited regex the literal '/' has to be escaped
    > * in a {} delimited regex the literal '{' has *not* to be escaped
    >
    > am I getting it right?


    No. Let me try (untested):

    * in a {}-delimited regex escaping '{' won't make it into a literal.
    (AND unescaped '{' should properly nest).

    [There are two different mechanisms of unescaping in the lifetime of
    a REx.

    a) First, the parser removes delimiters (and unescapes escaped
    delimiters) (it may also remove certain other escapes - do not
    remember details).

    b) The result is passed to REx engine. It processes all the
    remaining special-for-REx escapes.

    I did not have time to document it when I was working on Perl
    RExes. I doubt the docs improved from that time...]

    The difference is kinda subtle. E.g., variables interpolated in RExes
    are subject ONLY to "b"-unescaping. Also, one can see the result of
    "a" in debugging output of

    use re 'debugcolor';
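
A hedged sketch of the two steps, using q{...} to stand in for step "a" because its delimiter handling is documented in perlop. Whether m{...} strips the same backslashes depends on the Perl version (this thread reports it for 5.8.9 and 5.10.0; other releases may differ). The variable names are invented for the example.

```perl
use strict;
use warnings;

# Step "a" as seen with q{...}: the backslash before a delimiter
# character is removed while the closing brace is being located.
my $after_parse = q{\{0,0\}};    # holds the five characters {0,0}

# With a non-brace delimiter, the same backslashes survive.
my $kept = q/\{0,0\}/;           # holds \{0,0\}

# An interpolated variable skips step "a": its backslashes reach the
# regex engine intact, so the braces are literal here.
my $matched = '{0,0}' =~ m{\A$kept\z} ? 1 : 0;

print "$after_parse | $kept | $matched\n";
# prints: {0,0} | \{0,0\} | 1
```

Keeping the escaped text in a variable is one way to make the pattern independent of which delimiter the match operator uses.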

    Hope this helps,
    Ilya

    P.S. If one tries to use \ as a delimiter, one can get yet funnier
    quirks of this 2-step semantic... ;-)
    Ilya Zakharevich, Apr 25, 2009
    #17
  18. Klaus wrote:
    > I am trying to match a literal string '{0,0}' using the syntax m{...}.
    > I know that I have to escape both the '{' and '}' characters.
    >
    > Here is my program
    > ========================
    > use strict;
    > use warnings;
    >
    > $_ = '{0,0}';
    > if (m{\A\{0,0\}\z}) {
    > print "yes\n";
    > }
    > else {
    > print "no\n";
    > }
    > ========================
    >
    > The regexp works as intended and prints "yes",


    Just because it prints yes when you expect it to doesn't mean it works
    as intended. If you intend to do addition, then 2*2 gives the expected
    answer, yet doesn't work as intended.

    >
    > It seems to me that Perl is confused about using '{' and '}' inside a
    > match of the form m{...}


    Turn the m into a q and print the result:

    /home/user> perl -wle 'print q{\A\{0,0\}\Z}'
    \A{0,0}\Z
    /home/user> perl -wle 'print q/\A\{0,0\}\Z/'
    \A\{0,0\}\Z

    This is a generic property of quote-like operators, not peculiar to the
    regex variety of them.

    Xho
    Xho Jingleheimerschmidt, Apr 25, 2009
    #18
  19. Klaus wrote:
    > On Apr 24, 10:58 am, Frank Seitz <> wrote:
    >
    >> The { } become metacharacters.

    >
    > I find it unfortunate that they become metacharacters,


    I wouldn't say they become metacharacters, they are metacharacters.
    That is what they started as, and that is what they return to when their
    backwhacks get eaten.


    > particularly so
    > because there is no reason to quote metacharacters { } in the first
    > place,


    Of course there is. If they were not quoted, they would be either hash
    constructors or code blocks, rather than either literal characters or
    regex special characters.

    > as they always come in pairs and are handled naturally by
    > m{...{ }...} as a nested pair of curlies, for example m{a{1,2}}


    They don't always come in pairs. What if the literal string you wanted
    to match were '{0,0' ?

    Xho
    Xho Jingleheimerschmidt, Apr 25, 2009
    #19
  20. Klaus

    Klaus Guest

    On Apr 25, 3:13 am, Ilya Zakharevich <> wrote:
    >   [There are two different mechanisms of unescaping in the lifetime of
    >    a REx.
    >
    >    a) First, the parser removes delimiters


    Ok, so far I am with you.

    > (and unescapes escaped delimiters)
    > (it may also remove certain other escapes - do not remember details).


    Why unescape escaped delimiters?

    >    b) The result is passed to REx engine.  It processes all the
    >       remaining special-for-REx escapes.


    I don't know how the REx engine works, but I would be surprised if it
    could not handle an escaped delimiter (such as '\{' in my case),
    whereas at the same time it can handle an escaped non-delimiter (such
    as '\[' for example).

    >    I did not have time to document it when I was working on Perl
    >    RExes.  I doubt the docs improved from that time...]


    That's why I was confused when I came across this case.

    > The difference is kinda subtle.  E.g., variables interpolated in RExes
    > are subject ONLY to "b"-unescaping.


    This information would be useful in the documentation.

    > Also, one can see the result of
    > "a" in debugging output of
    >
    >    use re 'debugcolor';
    >
    > Hope this helps,
    > Ilya
    >
    > P.S.  If one tries to use \ as a delimiter, one can get yet funnier
    >       quirks of this 2-step semantic...  ;-)


    This information would also be useful in the documentation.
    Klaus, Apr 25, 2009
    #20
