Please give me a good "rule-of-thumb" for back-slashing in character classes

Mr P

It *seems like* any character other than "]" or "\", or in some cases
"-", in a character class, would be interpreted "in-situ", as that
exact character. But I get confusing error messages so often that I
usually just revert to a policy of "backslash everything" when in doubt.

I would appreciate a (hopefully one-sentence, so I can remember it)
guide/rule on what/when to backslash within character classes.

[\$\#\@\*\=\-\)]

or

[$#@*=-)]

?

Sort of along the lines of: AEIOU and sometimes Y and W (is there
really a case when W is a vowel? I never did see one, but I digress).


Gracias,
MP
 
Paul Lalli


First, you need to understand that regular expressions undergo
interpolation just like double-quoted strings, before the regexp
parser ever gets hold of the pattern. That means that Perl is
searching for $, @ and \, even though $ and @ aren't special in a
character class. So if your pattern contains those, they need to be
backslashed.
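
To see the interpolation step in action, here is a minimal sketch. The
variables $x and @list are made up for illustration; they are not from
the question.

```perl
use strict;
use warnings;

# Hypothetical variables, only here to show interpolation.
my $x    = 'abc';
my @list = ('p', 'q');

# Unescaped: Perl interpolates first, so the class becomes [abc].
my $plain_scalar   = ('b' =~ /[$x]/)  ? 1 : 0;
# Escaped: the class holds a literal '$' and a literal 'x'.
my $escaped_scalar = ('b' =~ /[\$x]/) ? 1 : 0;
my $dollar_literal = ('$' =~ /[\$x]/) ? 1 : 0;

# Arrays interpolate too: [@list] becomes [p q] (elements joined by a space).
my $plain_array    = ('q' =~ /[@list]/)  ? 1 : 0;
# Escaped: the class holds a literal '@' plus the letters l, i, s, t.
my $escaped_array  = ('@' =~ /[\@list]/) ? 1 : 0;

print "$plain_scalar $escaped_scalar $dollar_literal ",
      "$plain_array $escaped_array\n";
```

Run as written, this should print "1 0 1 1 1".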

Second, the characters ], -, and ^ are special in a character class.
(Technically, ^ is only special if it's the first character, however).
Those three therefore need to be backslashed as well.
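
A small sketch of what the backslash changes for each of those three.
The helper class_hit() is a made-up name for this example only.

```perl
use strict;
use warnings;

# Hypothetical helper: 1 if $str matches the compiled pattern $re, else 0.
sub class_hit { my ($str, $re) = @_; return $str =~ $re ? 1 : 0 }

my @results = (
    # '-' between two characters makes a range; backslashed it is literal.
    class_hit('m', qr/[a-z]/),     # 'm' falls inside the range a..z
    class_hit('m', qr/[a\-z]/),    # class is only 'a', '-' and 'z'
    class_hit('-', qr/[a\-z]/),

    # A leading '^' negates the class; backslashed it is literal.
    class_hit('a', qr/[^a]/),      # "anything except a"
    class_hit('a', qr/[\^a]/),     # class is '^' or 'a'
    class_hit('^', qr/[\^a]/),

    # A backslashed ']' does not close the class.
    class_hit(']', qr/[a\]b]/),
);

print join(' ', @results), "\n";
```

Run as written, this should print "1 0 1 0 1 1 1".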

Third, whatever your regexp delimiter is needs to be backslashed.
"Normally", that's the forward-slash, but you can choose any non-
alphanumeric as your delimiter.
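
A short sketch of the delimiter rule. The sample strings are
placeholders, and the string eval is only there so the broken pattern
can be caught at run time instead of stopping the script.

```perl
use strict;
use warnings;

my $path = 'usr/bin';

# With / as the delimiter, a / inside the class must be backslashed.
my $slash_delim = ($path =~ /[\/]/) ? 1 : 0;

# With another delimiter, / is just an ordinary character...
my $brace_delim = ($path =~ m{[/]}) ? 1 : 0;

# ...but now that other delimiter is the one to backslash.
my $hash_delim = ('a#b' =~ m#[\#]#) ? 1 : 0;

# Unescaped, the delimiter ends the pattern early, even inside [...],
# so this code fails to compile and the eval returns undef.
my $ok     = eval q{ 'usr/bin' =~ m/[/]/; 1 };
my $broken = $ok ? 0 : 1;

print "$slash_delim $brace_delim $hash_delim $broken\n";
```

Run as written, this should print "1 1 1 1".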

I think that's about it. So your analogy is:
$ @ \ ] - ^ and sometimes /

Paul Lalli
 
Mr P


Thank you, Paul. I realize that ^ is only sometimes special (as you point
out, at the beginning). But I have used $ with no \ successfully. So at
least those two, I think, go on the "sometimes" side:


... and sometimes ^,$ and /

perhaps?

Good start! My only concern is that I'm afeared the "sometimes"
part will be de facto backslashed by me to avoid potential errors.
But maybe the "sometimes" parts can be easily categorized?
 
Paul Lalli

Sure.

$ and @ need to be backslashed if they're followed by anything that
Perl could consider to be a variable - whether it's a user-defined
variable or a built-in variable. That means that [fo$,] will need the
$ escaped, because $, is a valid variable.

/ needs to be backslashed if and only if it is used as the delimiter
to the regexp. If it is not, it does not need to be backslashed, but
whatever *is* the delimiter does.

^ needs to be backslashed if it is the first character of the
character class.
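
A rough sketch of the $, case, assuming $, is interpolated inside a
pattern the way the paragraph above describes. The sub names are
hypothetical, and $, is given a value only locally so the effect is
visible.

```perl
use strict;
use warnings;

# Hypothetical helper: unescaped, so $, is treated as a variable.
sub unescaped_hit {
    my ($str) = @_;
    local $, = 'xyz';                   # output field separator, set locally
    return $str =~ /[fo$,]/ ? 1 : 0;    # class is effectively [foxyz]
}

# Hypothetical helper: escaped, so '$' and ',' are plain characters.
sub escaped_hit {
    my ($str) = @_;
    local $, = 'xyz';
    return $str =~ /[fo\$,]/ ? 1 : 0;   # class is f, o, '$' and ','
}

print join(' ',
    unescaped_hit(','), unescaped_hit('x'),
    escaped_hit(','),   escaped_hit('$'), escaped_hit('x'),
), "\n";
```

If the assumption holds, the unescaped class misses the comma but
matches 'x', while the escaped class matches ',' and '$' and not 'x'.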

Paul Lalli
 
Joe Smith

Paul said:
Second, the characters ], -, and ^ are special in a character class.
(Technically, ^ is only special if it's the first character, however).

Don't forget the other two exceptions:

] is not special if it's the first character.
- is not special if it's the first character.

-Joe
 
Tad McClellan

Paul Lalli said:
Second, the characters ], -, and ^ are special in a character class.
(Technically, ^ is only special if it's the first character, however).


And - is only special if it is not the first or the last character.

And ] is only special if it is not first. ( /[][]/ looks good in code ;-)

And \ is always special (unless doubled).

Those three therefore need to be backslashed as well.


That's probably easier to remember. :)
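
The position rules above can be checked with a few one-liners. This is
only a sketch, and pos_hit() is a made-up helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: 1 if $str matches the compiled pattern $re, else 0.
sub pos_hit { my ($str, $re) = @_; return $str =~ $re ? 1 : 0 }

my @checks = (
    # ']' first in the class is literal, so [][] means "']' or '['".
    pos_hit(']', qr/[][]/),
    pos_hit('[', qr/[][]/),

    # '-' first or last is literal, so no range is formed.
    pos_hit('-', qr/[-az]/),
    pos_hit('m', qr/[-az]/),    # 'm' is not in the class
    pos_hit('-', qr/[az-]/),

    # '\' has to be doubled to stand for itself.
    pos_hit('\\', qr/[a\\z]/),
);

print join(' ', @checks), "\n";
```

Run as written, this should print "1 1 1 0 1 1".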
 
Tad McClellan

Petr Vileta said:
Paul Lalli said:
Second, the characters ], -, and ^ are special in a character class.
(Technically, ^ is only special if it's the first character, however).

Or ^ may be NOT in /abc[^abc]def/
                       ^^

You must have meant "and" instead of "or" because that is
how ^ is special when it is first in a character class,
just like Paul said...
 
Paul Lalli

Paul Lalli said:
Second, the characters ], -, and ^ are special in a character class.
(Technically, ^ is only special if it's the first character, however).

Or ^ may be NOT in /abc[^abc]def/

Uhm, yes, that's how it's "special"....

Paul Lalli
 
Brad Baxter

Sort of along the lines of : AIEOU and sometimes Y and W (is there
really a case when W is a vowel I never did see one, but I digress)..

How now brown cow?
 
Jim Ford

Mario said:
...

The word where W is a vowel is cwm:

Forget it - it's Welsh! If you want to cater for the Welsh language,
you'll meet all sorts of horrors!
;^)

Jim Ford
 
