Using underscores as well as word boundaries to demarcate a pattern

Laura

I am using regular expressions to preprocess text and add text around
certain terms. Because the text can contain HTML, I use a negative
look-ahead to skip any match that falls between a "<" and a ">". My
expression looks like this:

(\\b)(term)(\\b)(?![^<]*>)

where "term" is the term I'm looking for in the text. It works just
fine, but a new requirement means we now also need to find terms that
are delimited by underscores, not only by word boundaries.
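
For anyone following along, here is a minimal sketch of the behaviour described above. The thread does not say which language or regex engine is in use, so Python's re module is an assumption here, as are the wrap helper and the square-bracket markers; "term" is still just a placeholder.

```python
import re

# Same shape as the pattern in the post; "term" is a placeholder.
PATTERN = re.compile(r"(\b)(term)(\b)(?![^<]*>)")

def wrap(text):
    # Hypothetical helper: surround each match with square brackets.
    return PATTERN.sub(r"[\2]", text)

# The look-ahead skips the occurrence inside the tag.
print(wrap('a term in <a href="term">a link</a>'))
# -> a [term] in <a href="term">a link</a>

# "_" is a word character, so \b does not match next to it.
print(wrap("foo_term_bar"))
# -> foo_term_bar
```

The second call shows the problem: because an underscore counts as a word character, there is no word boundary between "_" and "t", so the term is never found.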

How can I modify the above pattern to find my terms when they're
delimited by either a word boundary or an underscore?

Any help gratefully accepted!
 
Laura

I figured it out! An underscore counts as a word character, so \b never
matches next to one. Just needed to say \b|_ to get it to work, thusly:

(\b|_)(term)(\b|_)(?![^<]*>)
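
One thing worth checking with this pattern is that the underscores become part of the match, so two terms that share a single underscore are not both found. The sketch below compares it with a look-around variant that does not consume the delimiters. This is an alternative technique, not something from the thread, and it assumes Python's re module (the thread does not name the engine); the names consuming, lookaround and matches are made up for illustration.

```python
import re

# Pattern from the post: the delimiter groups can consume an underscore.
consuming = re.compile(r"(\b|_)(term)(\b|_)(?![^<]*>)")

# Assumed alternative: [^\W_] means "a word character other than underscore",
# so the term must not be preceded or followed by a letter or digit.
lookaround = re.compile(r"(?<![^\W_])term(?![^\W_])(?![^<]*>)")

def matches(pattern, text):
    # Return the full text of every match, in order.
    return [m.group(0) for m in pattern.finditer(text)]

print(matches(consuming, "foo_term_bar"))   # -> ['_term_']
print(matches(lookaround, "foo_term_bar"))  # -> ['term']

# A shared underscore is consumed by the first match, hiding the second term.
print(matches(consuming, "term_term"))      # -> ['term_']
print(matches(lookaround, "term_term"))     # -> ['term', 'term']
```

With the pattern from the post, group 2 still holds the bare term, so a replacement can put groups 1 and 3 back around whatever is added to avoid dropping the underscores.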
 
