Please give me a good "rule-of-thumb" for back-slashing in character classes

Discussion in 'Perl Misc' started by Mr P, May 9, 2007.

  1. Mr P

    Mr P Guest

    It *seems like* any character other than "]" or "\", or in some cases
    "-", in a character class would be interpreted "in situ", as that
    exact character. But I get confusing error messages so often that I
    usually just revert to a policy of "backslash everything" when in doubt.

    I would appreciate a (hopefully one-sentence, so I can remember it)
    guide/rule on what/when to backslash within character classes.

    [\$\#\@\*\=\-\)]

    or

    [$#@*=-)]

    ?

    Sort of along the lines of: A, E, I, O, U and sometimes Y and W (is
    there really a case where W is a vowel? I never did see one, but I
    digress).


    Gracias,
    MP
     
    Mr P, May 9, 2007
    #1

  2. Paul Lalli

    Paul Lalli Guest

    On May 9, 11:06 am, Mr P <> wrote:
    > It *seems like* any character other than "]" or "\", or in some cases
    > "-", in a character class would be interpreted "in situ", as that
    > exact character. But I get confusing error messages so often that I
    > usually just revert to a policy of "backslash everything" when in doubt.
    >
    > I would appreciate a (hopefully one-sentence, so I can remember it)
    > guide/rule on what/when to backslash within character classes.
    >
    > [\$\#\@\*\=\-\)]
    >
    > or
    >
    > [$#@*=-)]
    >
    > ?


    The first thing you need to understand is that regular expressions
    undergo interpolation just like double-quoted strings, before the
    regexp parser ever gets hold of the pattern. That means that Perl is
    looking for $, @ and \ in the pattern, even though $ and @ aren't
    special in a character class. So if your pattern contains those, they
    need to be backslashed.

    Second, the characters ], -, and ^ are special in a character class.
    (Technically, ^ is only special if it's the first character, however.)
    Those three therefore need to be backslashed as well.

    Third, whatever your regexp delimiter is needs to be backslashed.
    "Normally", that's the forward-slash, but you can choose any non-
    alphanumeric as your delimiter.

    I think that's about it. So your analogy is:
    $ @ \ ] - ^ and sometimes /
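
    As a rough sketch of that mnemonic (the variable name $class is only
    for illustration, and / is assumed to be the delimiter), a class that
    backslashes every character on the list could look like this:

```perl
use strict;
use warnings;

# All six characters from the mnemonic, backslashed inside one class.
# The / is backslashed too, because / is the delimiter of this qr//.
my $class = qr/[\$\@\\\]\-\^\/]/;

for my $char ('$', '@', '\\', ']', '-', '^', '/', 'a') {
    my $verdict = $char =~ /\A$class\z/ ? 'in class' : 'not in class';
    print "$char: $verdict\n";
}
```

    Only the final 'a' should be reported as not in the class; every
    backslashed character is taken literally.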

    Paul Lalli
     
    Paul Lalli, May 9, 2007
    #2

  3. Mr P

    Mr P Guest

    On May 9, 11:14 am, Paul Lalli <> wrote:
    > I think that's about it. So your analogy is:
    > $ @ \ ] - ^ and sometimes /
    >
    > Paul Lalli




    Thank you, Paul. I realize that ^ is only sometimes special (as you
    point out, at the beginning). But I have used $ with no \ successfully.
    So at least those two, I think, go on the "sometimes" side:


    ... and sometimes ^, $ and /

    perhaps?

    Good start! My only concern is that I'm afeared the "sometimes"
    part will be de facto backslashed by me to avoid potential errors.
    But maybe the sometimes parts can be easily categorized?
     
    Mr P, May 9, 2007
    #3
  4. Paul Lalli

    Paul Lalli Guest

    On May 9, 11:33 am, Mr P <> wrote:
    > Thank you, Paul. I realize that ^ is only sometimes special (as you
    > point out, at the beginning). But I have used $ with no \ successfully.
    > So at least those two, I think, go on the "sometimes" side:
    >
    > ... and sometimes ^, $ and /
    >
    > perhaps?
    >
    > Good start! My only concern is that I'm afeared the "sometimes"
    > part will be de facto backslashed by me to avoid potential errors.
    > But maybe the sometimes parts can be easily categorized?


    Sure.

    $ and @ need to be backslashed if they're followed by anything that
    Perl could consider to be a variable, whether it's a user-defined
    variable or a built-in variable. That means that [fo$,] will need the
    $ escaped, because $, is a valid variable.

    / needs to be backslashed if and only if it is used as the delimiter
    to the regexp. If it is not, it does not need to be backslashed, but
    whatever *is* the delimiter does.

    ^ needs to be backslashed if it is the first character of the
    character class.
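
    Here is a small sketch of those three cases. The variable $sep and
    the other names are made up for illustration; a user-defined variable
    stands in for $, so the effect is easy to see:

```perl
use strict;
use warnings;

my $sep = ',';    # hypothetical variable, only for this illustration

my $interpolated = qr/[fo$sep]/;     # $sep expands first, giving [fo,]
my $literal      = qr/[fo\$sep]/;    # literal $ plus the letters f o s e p

my $slash_escaped = qr/[\/]/;        # / is the delimiter, so it is escaped
my $slash_plain   = qr{[/]};         # {} is the delimiter, so / is left alone

my $caret_escaped = qr/[\^a]/;       # literal ^ or a
my $caret_negates = qr/[^a]/;        # any character except a

print "interpolated class matches ','\n" if ',' =~ $interpolated;
print "interpolated class misses '\$'\n" if '$' !~ $interpolated;
print "escaped class matches '\$'\n"     if '$' =~ $literal;
print "both slash forms match '/'\n"
    if '/' =~ $slash_escaped && '/' =~ $slash_plain;
print "escaped caret class matches 'a'\n"    if 'a' =~ $caret_escaped;
print "unescaped caret class rejects 'a'\n"  if 'a' !~ $caret_negates;
```

    Switching the delimiter, as in the qr{} line, is often tidier than
    backslashing every / in a pattern.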

    Paul Lalli
     
    Paul Lalli, May 9, 2007
    #4
  5. Joe Smith

    Joe Smith Guest

    Re: Please give me a good "rule-of-thumb" for back-slashing in character classes

    Paul Lalli wrote:

    > Second, the characters ], -, and ^ are special in a character class.
    > (Technically, ^ is only special if it's the first character, however.)


    Don't forget the other two exceptions:

    ] is not special if it's the first character.
    - is not special if it's the first character.
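
    A quick way to check both exceptions (the variable names are only
    for illustration):

```perl
use strict;
use warnings;

my $bracket_first = qr/[]a]/;   # ] right after [ is a literal ]
my $hyphen_first  = qr/[-a]/;   # - right after [ is a literal -

for my $char (']', '-', 'a', 'b') {
    printf "%s  bracket_first=%s  hyphen_first=%s\n",
        $char,
        ($char =~ $bracket_first ? 'yes' : 'no'),
        ($char =~ $hyphen_first  ? 'yes' : 'no');
}
```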

    -Joe
     
    Joe Smith, May 9, 2007
    #5
  6. Tad McClellan

    Paul Lalli <> wrote:

    > Second, the characters ], -, and ^ are special in a character class.
    > (Technically, ^ is only special if it's the first character, however.)



    And - is only special if it is not the first or the last character.

    And ] is only special if it is not first. ( /[][]/ looks good in code ;-)

    And \ is always special (unless doubled).
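
    For anyone who wants to try these, a minimal sketch (variable names
    are made up for the example):

```perl
use strict;
use warnings;

my $hyphen_last = qr/[a-]/;    # - just before the closing ] is a literal -
my $range       = qr/[a-c]/;   # - between two characters forms a range
my $brackets    = qr/[][]/;    # ] first, then [ : matches either bracket
my $backslash   = qr/[\\]/;    # a doubled backslash matches one backslash

print "[a-] matches '-'\n"         if '-' =~ $hyphen_last;
print "[a-] does not match 'b'\n"  if 'b' !~ $hyphen_last;
print "[a-c] matches 'b'\n"        if 'b' =~ $range;
print "[][] matches '[' and ']'\n" if '[' =~ $brackets && ']' =~ $brackets;
print "[\\\\] matches a backslash\n" if '\\' =~ $backslash;
```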


    > Those three therefore need to be backslashed as well.



    That's probably easier to remember. :)


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, May 10, 2007
    #6
  7. Tad McClellan

    Petr Vileta <> wrote:
    > "Paul Lalli" <> wrote in message
    > news:...
    >> On May 9, 11:06 am, Mr P <> wrote:
    >> Second, the characters ], -, and ^ are special in a character class.
    >> (Technically, ^ is only special if it's the first character, however.)

    >
    > Or ^ may be NOT in /abc[^abc]def/

      ^^
      ^^

    You must have meant "and" instead of "or" because that is
    how ^ is special when it is first in a character class,
    just like Paul said...


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, May 10, 2007
    #7
  8. Paul Lalli

    Paul Lalli Guest

    On May 9, 10:12 pm, "Petr Vileta" <> wrote:
    > "Paul Lalli" <> wrote in message news:...
    >
    > > On May 9, 11:06 am, Mr P <> wrote:
    > > Second, the characters ], -, and ^ are special in a character class.
    > > (Technically, ^ is only special if it's the first character, however.)

    >
    > Or ^ may be NOT in /abc[^abc]def/


    Uhm, yes, that's how it's "special"....

    Paul Lalli
     
    Paul Lalli, May 10, 2007
    #8
  9. Brad Baxter

    Brad Baxter Guest

    On May 9, 11:06 am, Mr P <> wrote:
    > Sort of along the lines of: A, E, I, O, U and sometimes Y and W (is
    > there really a case where W is a vowel? I never did see one, but I
    > digress).


    How now brown cow?

    --
    Brad
     
    Brad Baxter, May 10, 2007
    #9
  10. Mario D'Alessio

    "Mr P" <> wrote in message
    news:...
    ....
    > Sort of along the lines of: A, E, I, O, U and sometimes Y and W (is
    > there really a case where W is a vowel? I never did see one, but I
    > digress).


    The word where W is a vowel is cwm:

    http://www.m-w.com/dictionary/cwm


    Mario
     
    Mario D'Alessio, May 11, 2007
    #10
  11. Jim Ford

    Jim Ford Guest

    Re: Please give me a good "rule-of-thumb" for back-slashing in character classes

    Mario D'Alessio wrote:
    > "Mr P" <> wrote in message
    > news:...
    > ...
    >> Sort of along the lines of: A, E, I, O, U and sometimes Y and W (is
    >> there really a case where W is a vowel? I never did see one, but I
    >> digress).

    >
    > The word where W is a vowel is cwm:


    Forget it - it's Welsh! If you want to cater for the Welsh language,
    you'll meet all sorts of horrors!
    ;^)

    Jim Ford
     
    Jim Ford, May 12, 2007
    #11