How do I identify word<html><html>other word?

Laura

I have a situation where I want to find two or more words next to each
other (e.g. "sure thing"). Between those words in the text I'm
searching there can be just white space OR any number of html tags.
How do I write a regexp that will identify this? I've tried the
following, but it doesn't work:

(\\b)(sure\\s+|<\\w*>thing)(\\b)

Thanks for any help!

Gunnar Hjalmarsson

Laura said:
I have a situation where I want to find two or more words next to
each other (e.g. "sure thing"). Between those words in the text
I'm searching there can be just white space OR any number of html
tags. How do I write a regexp that will identify this?

One approach is to simply remove the HTML first:

perldoc -q "remove HTML"
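
A minimal sketch of that approach, assuming the markup is simple enough that a naive tag pattern is good enough (no ">" inside attribute values, no comments or script blocks). The helper name contains_phrase_after_stripping is hypothetical. Each tag is replaced by a space rather than deleted, so that "sure<b>thing" still ends up as two separate words:

```perl
use strict;
use warnings;

# Hypothetical helper: true if "sure thing" appears once tags are blanked out.
# Assumption: tags are well-formed and contain no ">" inside attribute values.
sub contains_phrase_after_stripping {
    my ($html) = @_;
    (my $text = $html) =~ s/<[^>]*>/ /g;    # replace every tag with one space
    return $text =~ /\bsure\s+thing\b/ ? 1 : 0;
}

for my $sample ('a sure <b>thing</b>', 'sure<i></i>thing', 'surely nothing') {
    printf "%-22s %s\n", $sample,
        contains_phrase_after_stripping($sample) ? 'match' : 'no match';
}
```

For real-world HTML, a proper parser as described in that FAQ entry is the more robust choice.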

Laura said:
I've tried the following, but it doesn't work:

(\\b)(sure\\s+|<\\w*>thing)(\\b)

Err.. Which programming language is that? Since you posted here, I
assumed that you were about to use Perl. ;-)

You need to read up on regular expressions in Perl.

perldoc perlretut
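
If the HTML has to stay in place, the gap between the words can be matched directly. The quoted attempt uses alternation to match either "sure" plus whitespace or a tag followed by "thing", never both words together. The sketch below instead allows one or more whitespace characters or tags between consecutive words. The helper name phrase_regex is hypothetical, and the tag pattern is the same naive assumption (no ">" inside a tag):

```perl
use strict;
use warnings;

# Hypothetical helper: build a regex matching the given words in order,
# separated by any mix of whitespace and HTML tags (at least one of either).
# Assumption: the words start and end with word characters, so \b applies.
sub phrase_regex {
    my @words = @_;
    my $gap   = qr/(?:\s|<[^>]*>)+/;                 # whitespace or a tag, repeated
    my $body  = join $gap, map { quotemeta } @words; # escape any metacharacters
    return qr/\b${body}\b/;
}

my $re = phrase_regex('sure', 'thing');

for my $sample ('sure thing', 'sure</b> <i>thing', 'surething', 'sure, thing') {
    printf "%-20s %s\n", $sample, $sample =~ $re ? 'match' : 'no match';
}
```

Note that in a Perl regex literal the backslashes are single (\b, \s); the doubled form belongs to languages where the pattern is written inside a string literal.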