Where can I find a summary table of the various regular expression options?

Anonieko Ramos

This is something like a cheat sheet, if I may call it that.


\
Marks the next character as either a special character, a literal, a
backreference, or an octal escape. For example, 'n' matches the
character "n". '\n' matches a newline character. The sequence '\\'
matches "\" and "\(" matches "(".

^
Matches the position at the beginning of the input string. If the
RegExp object's Multiline property is set, ^ also matches the position
following '\n' or '\r'.

$
Matches the position at the end of the input string. If the RegExp
object's Multiline property is set, $ also matches the position
preceding '\n' or '\r'.
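
To make the ^ and $ entries concrete, here is a minimal sketch in modern JavaScript (an assumption on my part; the table describes the JScript/VBScript RegExp object, where the Multiline property corresponds to the m flag). The sample text and variable names are made up for illustration.

```javascript
// Sketch: effect of the multiline flag (m) on ^ and $.
const text = "first line\nsecond line";

// Without m, ^ matches only at the very start of the input.
const startsPlain = text.match(/^\w+/g);

// With m, ^ also matches right after each line break.
const startsMulti = text.match(/^\w+/gm);

// With m, $ also matches right before each line break.
const endsMulti = text.match(/\w+$/gm);

console.log(startsPlain); // [ 'first' ]
console.log(startsMulti); // [ 'first', 'second' ]
console.log(endsMulti);   // [ 'line', 'line' ]
```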

*
Matches the preceding subexpression zero or more times. For example,
zo* matches "z" and "zoo". * is equivalent to {0,}.

+
Matches the preceding subexpression one or more times. For example,
'zo+' matches "zo" and "zoo", but not "z". + is equivalent to {1,}.



?
Matches the preceding subexpression zero or one time. For example,
"do(es)?" matches the "do" in "do" or "does". ? is equivalent to {0,1}.

{n}
n is a nonnegative integer. Matches exactly n times. For example,
'o{2}' does not match the 'o' in "Bob," but matches the two o's in
"food".

{n,m}
m and n are nonnegative integers, where n <= m. Matches at least n
and at most m times. For example, "o{1,3}" matches the first three o's
in "fooooood". 'o{0,1}' is equivalent to 'o?'. Note that you cannot
put a space between the comma and the numbers.
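
The quantifier entries above (*, +, ?, {n}, {n,m}) can be tried out with a short sketch like the one below. It is written in modern JavaScript as an assumption; the test strings follow the examples in the table, and the variable names are only illustrative.

```javascript
// Sketch: the basic quantifiers, applied with the g flag so that every
// match in the input is returned.
const words = "z zo zoo";

const star = words.match(/zo*/g);             // zero or more o's
const plus = words.match(/zo+/g);             // one or more o's
const optional = "do does".match(/do(es)?/g); // optional "es"
const exact = "Bob food".match(/o{2}/g);      // exactly two o's

// {1,3} is greedy: it takes up to three o's at the first opportunity.
const sample = "f" + "o".repeat(6) + "d";
const bounded = sample.match(/o{1,3}/)[0];

console.log(star);     // [ 'z', 'zo', 'zoo' ]
console.log(plus);     // [ 'zo', 'zoo' ]
console.log(optional); // [ 'do', 'does' ]
console.log(exact);    // [ 'oo' ]
console.log(bounded);  // ooo
```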

?
When this character immediately follows any of the other quantifiers
(*, +, ?, {n}, {n,}, {n,m}), the matching pattern is non-greedy. A
non-greedy pattern matches as little of the searched string as
possible, whereas the default greedy pattern matches as much of the
searched string as possible. For example, in the string "oooo", 'o+?'
matches a single "o", while 'o+' matches all 'o's.
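
Here is a small sketch of the greedy versus non-greedy difference, assuming modern JavaScript. The first pair repeats the "oooo" example; the markup string is a made-up illustration of where the difference usually matters in practice.

```javascript
// Sketch: greedy (default) versus non-greedy (trailing ?) quantifiers.
const run = "oooo";
const greedy = run.match(/o+/)[0];  // takes as many o's as it can
const lazy = run.match(/o+?/)[0];   // stops after the first o

// Hypothetical markup string, used only for illustration.
const markup = "<b>bold</b> and <i>italic</i>";
const greedyTag = markup.match(/<.+>/)[0]; // runs to the last ">"
const lazyTag = markup.match(/<.+?>/)[0];  // stops at the first ">"

console.log(greedy);    // oooo
console.log(lazy);      // o
console.log(greedyTag); // <b>bold</b> and <i>italic</i>
console.log(lazyTag);   // <b>
```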

.
Matches any single character except "\n". To match any character
including the '\n', use a pattern such as '[\s\S]'. (Inside brackets
the dot is literal, so '[.\n]' matches only a period or a newline.)
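
One thing worth checking for yourself is how the dot behaves inside and outside brackets. The sketch below assumes modern JavaScript; '[\s\S]' is a commonly used way to match any character including a newline, and the test strings are made up.

```javascript
// Sketch: the dot does not match "\n", and inside brackets it is literal.
const twoLines = "a\nb";

const dotOnly = /a.b/.test(twoLines);      // . will not cross the newline
const anyChar = /a[\s\S]b/.test(twoLines); // [\s\S] matches any character

// [.\n] is a set containing a literal period and a newline, nothing else.
const setMatchesX = /^[.\n]$/.test("x");
const setMatchesDot = /^[.\n]$/.test(".");

console.log(dotOnly);       // false
console.log(anyChar);       // true
console.log(setMatchesX);   // false
console.log(setMatchesDot); // true
```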

(pattern)
Matches pattern and captures the match. The captured match can be
retrieved from the resulting Matches collection, using the SubMatches
collection in VBScript or the $0…$9 properties in JScript. To match
parentheses characters ( ), use '\(' or '\)'.

(?:pattern)
Matches pattern but does not capture the match, that is, it is a
non-capturing match that is not stored for possible later use. This is
useful for combining parts of a pattern with the "or" character (|).
For example, 'industr(?:y|ies)' is a more economical expression than
'industry|industries'.
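
To see the difference between capturing and non-capturing groups, here is a minimal sketch assuming modern JavaScript, where captures are read from the array returned by match() rather than from SubMatches or the $ properties. The date-like string is a made-up example.

```javascript
// Sketch: (pattern) stores its match, (?:pattern) does not.
const captured = "2024-05".match(/(\d+)-(\d+)/);
// captured[0] is the whole match, captured[1] and captured[2] the groups.

const grouped = "industries".match(/industr(?:y|ies)/);
// Only the whole match is stored, so the result array has one element.

console.log(captured[0], captured[1], captured[2]); // 2024-05 2024 05
console.log(grouped[0], grouped.length);            // industries 1
```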

(?=pattern)
Positive lookahead matches the search string at any point where a
string matching pattern begins. This is a non-capturing match, that
is, the match is not captured for possible later use. For example
'Windows (?=95|98|NT|2000)' matches "Windows" in "Windows 2000" but
not "Windows" in "Windows 3.1". Lookaheads do not consume characters,
that is, after a match occurs, the search for the next match begins
immediately following the last match, not after the characters that
comprised the lookahead.

(?!pattern)
Negative lookahead matches the search string at any point where a
string not matching pattern begins. This is a non-capturing match,
that is, the match is not captured for possible later use. For example
'Windows (?!95|98|NT|2000)' matches "Windows" in "Windows 3.1" but
does not match "Windows" in "Windows 2000". Lookaheads do not consume
characters, that is, after a match occurs, the search for the next
match begins immediately following the last match, not after the
characters that comprised the lookahead.
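
The two lookahead examples can be checked with a short sketch like this one, assuming modern JavaScript. Note that the patterns as written include the space after "Windows", so the matched text is "Windows " with a trailing space. The variable names are illustrative.

```javascript
// Sketch: positive and negative lookahead with the patterns from the table.
const positive = /Windows (?=95|98|NT|2000)/;
const negative = /Windows (?!95|98|NT|2000)/;

const posHit = positive.test("Windows 2000");  // version is on the list
const posMiss = positive.test("Windows 3.1");  // version is not on the list
const negHit = negative.test("Windows 3.1");
const negMiss = negative.test("Windows 2000");

// The lookahead is not part of the match, so "2000" survives a replace.
const matched = positive.exec("Windows 2000")[0];
const replaced = "Windows 2000".replace(positive, "Win ");

console.log(posHit, posMiss, negHit, negMiss); // true false true false
console.log(JSON.stringify(matched));          // "Windows "
console.log(replaced);                         // Win 2000
```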

x|y
Matches either x or y. For example, 'z|food' matches "z" or "food".
'(z|f)ood' matches "zood" or "food".

[xyz]
A character set. Matches any one of the enclosed characters. For
example, '[abc]' matches the 'a' in "plain".

[^xyz]
A negative character set. Matches any character not enclosed. For
example, '[^abc]' matches the 'p' in "plain".

[a-z]
A range of characters. Matches any character in the specified range.
For example, '[a-z]' matches any lowercase alphabetic character in the
range 'a' through 'z'.

[^a-z]
A negative range of characters. Matches any character not in the
specified range. For example, '[^a-z]' matches any character not in
the range 'a' through 'z'.
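
The four bracket forms above can be compared side by side. This is a minimal sketch assuming modern JavaScript; "plain" comes from the table, while "aB3" is a made-up test string.

```javascript
// Sketch: character sets, negated sets, ranges, and negated ranges.
const firstInSet = "plain".match(/[abc]/)[0];     // first of a, b, or c
const firstNotInSet = "plain".match(/[^abc]/)[0]; // first char not in the set

const lower = "aB3".match(/[a-z]/g);     // only lowercase letters
const notLower = "aB3".match(/[^a-z]/g); // everything else

console.log(firstInSet);    // a
console.log(firstNotInSet); // p
console.log(lower);         // [ 'a' ]
console.log(notLower);      // [ 'B', '3' ]
```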

\b
Matches a word boundary, that is, the position between a word
character and a nonword character such as a space (or the start or end
of the input). For example, 'er\b' matches the 'er' in "never" but not
the 'er' in "verb".

\B
Matches a nonword boundary. 'er\B' matches the 'er' in "verb" but not
the 'er' in "never".
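
A quick way to convince yourself how \b and \B behave is a sketch like the following, assuming modern JavaScript. The last check uses a made-up string to show that punctuation also counts as a boundary, not just a space.

```javascript
// Sketch: word boundary (\b) versus nonword boundary (\B).
const endNever = /er\b/.test("never"); // "er" sits at the end of the word
const endVerb = /er\b/.test("verb");   // "er" is followed by "b"
const midVerb = /er\B/.test("verb");   // "er" is inside the word
const midNever = /er\B/.test("never"); // no "er" inside the word

// A comma is a nonword character, so a boundary exists before it.
const beforeComma = /er\b/.test("never,");

console.log(endNever, endVerb);  // true false
console.log(midVerb, midNever);  // true false
console.log(beforeComma);        // true
```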

\cx
Matches the control character indicated by x. For example, \cM
matches a Control-M or carriage return character. The value of x must
be in the range of A-Z or a-z. If not, c is assumed to be a literal
'c' character.
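
As a small illustration of \cx, assuming modern JavaScript regular expressions (which also accept this escape), the control letters map onto the familiar escape characters:

```javascript
// Sketch: \cX matches the control character for the letter X.
const ctrlM = /\cM/.test("\r"); // Control-M is carriage return
const ctrlJ = /\cJ/.test("\n"); // Control-J is newline
const ctrlI = /\cI/.test("\t"); // Control-I is tab

console.log(ctrlM, ctrlJ, ctrlI); // true true true
```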

\d
Matches a digit character. Equivalent to [0-9].

\D
Matches a nondigit character. Equivalent to [^0-9].

\f
Matches a form-feed character. Equivalent to \x0c and \cL.

\n
Matches a newline character. Equivalent to \x0a and \cJ.

\r
Matches a carriage return character. Equivalent to \x0d and \cM.

\s
Matches any whitespace character including space, tab, form-feed,
etc. Equivalent to [ \f\n\r\t\v].

\S
Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].

\t
Matches a tab character. Equivalent to \x09 and \cI.

\v
Matches a vertical tab character. Equivalent to \x0b and \cK.

\w
Matches any word character including underscore. Equivalent to
'[A-Za-z0-9_]'.

\W
Matches any nonword character. Equivalent to '[^A-Za-z0-9_]'.
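
The shorthand classes (\d, \s, \w and their uppercase opposites) are easiest to remember by example. Below is a minimal sketch assuming modern JavaScript; the sample strings are made up for illustration.

```javascript
// Sketch: shorthand character classes on a made-up sample string.
const sample = "Order 66, aisle_7";

const digits = sample.match(/\d+/g);   // runs of digits
const wordRuns = sample.match(/\w+/g); // runs of letters, digits, underscore
const nonWord = sample.match(/\W+/g);  // runs of everything else

// \s+ matches any run of whitespace, including mixed tabs and spaces.
const pieces = "a \t b\nc".split(/\s+/);

console.log(digits);   // [ '66', '7' ]
console.log(wordRuns); // [ 'Order', '66', 'aisle_7' ]
console.log(nonWord);  // [ ' ', ', ' ]
console.log(pieces);   // [ 'a', 'b', 'c' ]
```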

\xn
Matches n, where n is a hexadecimal escape value. Hexadecimal escape
values must be exactly two digits long. For example, '\x41' matches
"A". '\x041' is equivalent to '\x04' & "1". Allows ASCII codes to be
used in regular expressions.
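
The two-digit rule for \xn is easy to trip over, so here is a short sketch assuming modern JavaScript. It shows that '\x041' is read as '\x04' followed by a literal "1".

```javascript
// Sketch: a hexadecimal escape always uses exactly two hex digits.
const matchesA = /\x41/.test("A"); // 0x41 is "A"

// \x041 is parsed as \x04 followed by the character "1".
const ctrlDThenOne = /^\x041$/.test("\x04" + "1");
const notA = /^\x041$/.test("A");

console.log(matchesA);     // true
console.log(ctrlDThenOne); // true
console.log(notA);         // false
```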

\num
Matches num, where num is a positive integer. A reference back to
captured matches. For example, '(.)\1' matches two consecutive
identical characters.
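
Backreferences are probably clearest with an example. The sketch below assumes modern JavaScript; the doubled-word pattern is a made-up illustration that goes slightly beyond the '(.)\1' example in the table.

```javascript
// Sketch: \1 refers back to whatever the first group captured.
const doubled = "hello".match(/(.)\1/)[0]; // two identical characters
const none = "abc".match(/(.)\1/);         // no doubled character

// Hypothetical use: find a word that is accidentally repeated.
const repeated = "it is is fine".match(/\b(\w+) \1\b/)[0];

console.log(doubled);  // ll
console.log(none);     // null
console.log(repeated); // is is
```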

\n
Identifies either an octal escape value or a backreference. If \n is
preceded by at least n captured subexpressions, n is a backreference.
Otherwise, n is an octal escape value if n is an octal digit (0-7).

\nm
Identifies either an octal escape value or a backreference. If \nm is
preceded by at least nm captured subexpressions, nm is a
backreference. If \nm is preceded by at least n captures, n is a
backreference followed by literal m. If neither of the preceding
conditions exists, \nm matches octal escape value nm when n and m are
octal digits (0-7).

\nml
Matches octal escape value nml when n is an octal digit (0-3) and m
and l are octal digits (0-7).

\un
Matches n, where n is a Unicode character expressed as four
hexadecimal digits. For example, \u00A9 matches the copyright symbol
(©).
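
For the Unicode escape, a minimal sketch assuming modern JavaScript looks like this. The sample string is made up, and the copyright symbol is written as an escape in the string too, so the snippet does not depend on file encoding.

```javascript
// Sketch: \u followed by four hex digits matches that code unit.
const notice = "\u00A9 2024 Example"; // starts with the copyright symbol

const hasCopyright = /\u00A9/.test(notice);
const codePoint = notice.charCodeAt(0); // 0xA9 is 169 in decimal

console.log(hasCopyright); // true
console.log(codePoint);    // 169
```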
 
