word boundaries for regular expressions

Adam Akhtar

Hi, I did a search for word boundaries but didn't quite find what I was
looking for.

If I have strings containing products and model numbers

e.g.
"JP-ATH Headphones JP"

and I want to remove the last JP but not the one in the model number, how
do I go about it?

I tried

string.gsub(/\bJP\b/, '')
but it removes both.
I guess the hyphen in the model number doesn't count as a word character, so
it gets knocked off.

Am I doing something wrong here?
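
For readers following along, here is a minimal sketch of why both occurrences match. \b matches between a word character (\w: letters, digits, underscore) and anything else, and the hyphen is not a word character. The variable names are only illustrative and do not come from the original post.

```ruby
# Hypothetical variable names, used only for illustration.
product = "JP-ATH Headphones JP"

# The hyphen is not matched by \w, so a word boundary sits between
# "JP" and "-", and the JP in the model number matches as well.
hyphen_is_word_char = !("-" =~ /\w/).nil?
match_count = product.scan(/\bJP\b/).size
stripped = product.gsub(/\bJP\b/, '')

p hyphen_is_word_char  # false
p match_count          # 2
p stripped             # "-ATH Headphones "
```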
 
Stefano Crocco

You can replace the first \b with \s, which matches only whitespace (spaces, tabs, newlines):

string.gsub(/\sJP\b/, '')
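
A quick sketch of how this pattern behaves on a few strings may help. The sample strings other than the first are made up for illustration. Note that the matched whitespace is removed along with JP, and that a JP at the very start of a string is not matched because no whitespace precedes it.

```ruby
# Sample strings: only the first comes from the thread, the rest are assumptions.
samples = [
  "JP-ATH Headphones JP",   # JP in the model number and as the last word
  "JP Headphones",          # JP as the first word
  "Headphones JP Edition"   # JP in the middle
]

results = samples.map { |s| s.gsub(/\sJP\b/, '') }

results.each { |r| p r }
# "JP-ATH Headphones"
# "JP Headphones"
# "Headphones Edition"
```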

I hope this helps

Stefano
 
Michael Morin

If the spaces are consistent, you can do something like this:

"JP-ATH Headphones JP".split(/\s+/)[0..-2].join(" ")

or this if they're not:

"JP-ATH Headphones JP".sub(/\s+\w+$/,'')

The advantage of the top one is that you can remove something from the
middle of the string if necessary. The bottom one is probably faster
and generally makes more sense.
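
One thing worth checking before using either snippet: both drop the last whitespace-separated word whatever it is, not just JP. The sketch below shows this on a made-up string with no trailing JP, plus one possible guard that names JP in the pattern. The guard is an assumption added for illustration, not part of the original suggestion.

```ruby
with_jp    = "JP-ATH Headphones JP"
without_jp = "JP-ATH Headphones"   # made-up example with no trailing JP

split_with    = with_jp.split(/\s+/)[0..-2].join(" ")
sub_with      = with_jp.sub(/\s+\w+$/, '')
split_without = without_jp.split(/\s+/)[0..-2].join(" ")
sub_without   = without_jp.sub(/\s+\w+$/, '')

# Possible guard (an assumption): only strip the last word when it is JP.
guarded = without_jp.sub(/\s+JP\z/, '')

p split_with     # "JP-ATH Headphones"
p sub_with       # "JP-ATH Headphones"
p split_without  # "JP-ATH"  (the last word is removed even though it is not JP)
p sub_without    # "JP-ATH"
p guarded        # "JP-ATH Headphones"
```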

--
Michael Morin
Guide to Ruby
http://ruby.about.com/
 
Robert Klemme

2008/8/26 Adam Akhtar said:
"I guess the hyphen in the model number doesn't count as a word character, so
it gets knocked off."

Yep, that sums it up pretty well.

"Am I doing something wrong here?"

Obviously, since your results do not match your expectations / requirements. :)

You could use a lookahead:

irb(main):002:0> "JP-ATH Headphones JP".gsub /\bJP\b(?=\s|$)/, 'XXX'
=> "JP-ATH Headphones XXX"

It all depends on what other occurrences you have and which of them
you want to match.
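
As a rough sketch of how this might extend to strings where JP appears in different positions, the pattern can treat the hyphen as part of a "word" on both sides by combining a negative lookbehind with a negative lookahead. The helper name strip_jp and the sample strings are hypothetical, and lookbehind needs Ruby 1.9 or later, so this would not have run on the Ruby 1.8 that was common when the thread was written.

```ruby
# Hypothetical helper; the name and the samples are illustrative only.
def strip_jp(text)
  # Match JP only when it is not next to a word character or a hyphen,
  # then collapse runs of spaces left behind and trim the ends.
  # Note: squeeze(' ') also collapses any double spaces already in the text.
  text.gsub(/(?<![\w\-])JP(?![\w\-])/, '').squeeze(' ').strip
end

p strip_jp("JP-ATH Headphones JP")   # "JP-ATH Headphones"
p strip_jp("JP Headphones ATH-JP")   # "Headphones ATH-JP"
p strip_jp("Headphones")             # "Headphones"
```

Whether a hyphen should count as part of the model number depends on the data, which is the point made above about knowing which occurrences need to match.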

Kind regards

robert
 
Adam Akhtar

Thanks everyone. Yes, the strings to match vary a lot in terms of
positioning. Some don't have model numbers, some do; some don't have JP, some
do. I've known of lookahead but never used it before. I'll give that a
shot.

Thanks!

adam
 
