how to match whole word


Peng Yu

Hi,

The following code snippet is from /usr/bin/rpl. I would like it
to match a word, for example, "abc" in ":abc:". But the current one
would not match "abc" in ":abc:". I tried to modify it myself. Would
you please let me know what the correct way to do it is?

Thanks,
Peng

    if opts.whole_words:
        regex = re.compile(r"(?:(?<=\s)|^)" + re.escape(old_str) + r"(?=\s|$)",
                           opts.ignore_case and re.I or 0)
 

Gary Herron

The regular expression "\w+" will match (what might be your definition
of) a word, and in particular will match abc in :abc:. Regular
expressions have lots of other special \-sequences that might be worth
your while to read about: http://docs.python.org/lib/re-syntax.html

Gary Herron
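
For illustration, here is a minimal sketch of what "\w+" picks out. The sample strings are made up for this example and are not from rpl.

```python
import re

# \w+ matches a run of word characters (letters, digits, underscore),
# so the colons around "abc" are simply not part of the match.
words = re.findall(r"\w+", ":abc:")
print(words)

# Each run of word characters is returned as its own match.
more_words = re.findall(r"\w+", "foo :abc: bar_1")
print(more_words)
```

Note that "\w+" finds any word, not one particular word, so on its own it does not replace the whole-word test in the original snippet.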
 

Peng Yu

I didn't read the docs and tried the following code.

    regex = re.compile(r"\A" + re.escape(old_str) + r"\Z",
                       opts.ignore_case and re.I or 0)

But I'm not sure why it is not working.

Thanks,
Peng
 

Fredrik Lundh

As the documentation says, \A and \Z match at the beginning and end of a
*string*, not a word.

</F>
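
A short sketch of that behaviour, assuming old_str is "abc" as in the question (the variable names below are only for this example):

```python
import re

old_str = "abc"
pattern = re.compile(r"\A" + re.escape(old_str) + r"\Z")

# Matches only when the entire string is "abc".
whole_string = pattern.search("abc")

# Returns None: "abc" does not start at the beginning of ":abc:".
inside_colons = pattern.search(":abc:")

print(whole_string is not None, inside_colons is None)
```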
 

John S

Not sure why you picked \A and \Z -- they are only useful if you are
using the re.M flag.
What you want is \b -- match word boundary, on either side of your
word:

regex = re.compile(r"\b" + re.escape(old_str) + r"\b",re.I)

re.I is the same as re.IGNORECASE. More than one option may be OR'ed
together. There's no such thing as "re.O" in Python. I can understand
where you get the idea, as there is an 'o' modifier for REs in Perl.

To summarize, \A and \Z match the beginning and end of a STRING, while
\b matches the beginning or end of a WORD.

-- john
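
To make the \b suggestion concrete, here is a minimal sketch. The helper name whole_word_regex and its ignore_case parameter are hypothetical, standing in for the opts.whole_words branch of rpl.

```python
import re

def whole_word_regex(old_str, ignore_case=False):
    """Hypothetical helper: compile a pattern matching old_str as a whole word."""
    flags = re.I if ignore_case else 0
    return re.compile(r"\b" + re.escape(old_str) + r"\b", flags)

regex = whole_word_regex("abc", ignore_case=True)

# ':' is not a word character, so there is a word boundary on both sides.
in_colons = regex.search(":abc:")

# No boundary between 'x' and 'a', so this is not a whole-word match.
in_longer_word = regex.search("xabcx")

# Only the standalone occurrences are replaced; "abcd" is left alone.
replaced = regex.sub("XYZ", "abc :ABC: abcd")
print(replaced)
```

One caveat: \b is defined in terms of \w, so if old_str itself begins or ends with a non-word character (for example "c++"), the boundary test at that end may not behave the way a "whole word" match would suggest.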
 

Fredrik Lundh

John said:
> Not sure why you picked \A and \Z -- they are only useful if you are
> using the re.M flag.

Well, they're aliases for ^ and $ in "normal" mode, at least for strings
that don't end with a newline.
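
A small sketch of the difference, using made-up sample strings:

```python
import re

# Without re.M, $ also matches just before a trailing newline; \Z does not.
dollar = re.search(r"abc$", "abc\n")
big_z = re.search(r"abc\Z", "abc\n")

# With re.M, ^ and $ match at every line, while \A and \Z still refer
# to the start and end of the whole string.
per_line = re.findall(r"^abc$", "abc\nabc", re.M)
per_string = re.findall(r"\Aabc\Z", "abc\nabc", re.M)

print(dollar is not None, big_z is None, per_line, per_string)
```
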

> re.I is the same as re.IGNORECASE. More than one option may be OR'ed
> together. There's no such thing as "re.O" in Python. I can understand
> where you get the idea, as there is an 'o' modifier for REs in Perl.

His code did

opts.ignore_case and re.I or 0

which is the same as "re.I if opts.ignore_case else 0" in Python 2.5,
where 0 is a zero and not an O.

</F>
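
A quick sketch of that equivalence. The opts objects below are stand-ins built with SimpleNamespace, which is an assumption for this example and not how rpl parses its options.

```python
import re
from types import SimpleNamespace

# Stand-ins for the parsed options object (assumed for this sketch).
opts_on = SimpleNamespace(ignore_case=True)
opts_off = SimpleNamespace(ignore_case=False)

# The and/or idiom from the original snippet.
old_style_on = opts_on.ignore_case and re.I or 0
old_style_off = opts_off.ignore_case and re.I or 0

# The conditional expression available since Python 2.5.
new_style_on = re.I if opts_on.ignore_case else 0
new_style_off = re.I if opts_off.ignore_case else 0

print(old_style_on == new_style_on, old_style_off == new_style_off)
```

The and/or form works here only because re.I is a truthy value; if the middle operand could be falsy, the conditional expression would be the safer choice.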
 
