help: negative lookahead and backref in regex?

stenor

I'd appreciate some help from a regex guru with this:

I'm trying to match a pattern that looks more or less like this:

<string1><statictext><string2>

where string1 and string2 are not the same. I need to do this in a
single regex (rather than capturing $1 and $2 and comparing after the
fact).

I've tried this as a test (in this case <statictext> is just
whitespace):

/(\w+)\s+(?!\1)/

This will match "foo bar" as expected, but it also matches "foo foo"
which it ought not to.

Using positive lookahead had the expected result:
/(\w+)\s+(?=\1)/ matches "foo foo" but not "foo bar", which is the
intended behavior.

Any ideas regarding why using the negative lookahead assertion as I've
done doesn't do what I'd like it to do? And/or how to do it correctly?
An email cc would be appreciated.

Thanks,
Scott
 
Gunnar Hjalmarsson

stenor wrote:

> I'm trying to match a pattern that looks more or less like this:
>
> <string1><statictext><string2>
>
> where string1 and string2 are not the same. I need to do this in a
> single regex (rather than capturing $1 and $2 and comparing after the
> fact).
>
> I've tried this as a test (in this case <statictext> is just
> whitespace):
>
> /(\w+)\s+(?!\1)/
>
> This will match "foo bar" as expected, but it also matches "foo foo"
> which it ought not to.

No, it doesn't. It matches "oo foo", which it ought to. ;-) Did you try
to print the $1 variable?
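
To see this directly, here is a minimal Perl sketch that prints both the text the pattern consumed and the contents of $1 for the two test strings. The helper name probe is made up for illustration.

```perl
use strict;
use warnings;

# probe() is a hypothetical helper: it returns the text consumed by the
# whole pattern ($&) and the first capture ($1), or an empty list when
# the pattern does not match.
sub probe {
    my ($text) = @_;
    return $text =~ /(\w+)\s+(?!\1)/ ? ($&, $1) : ();
}

for my $text ('foo bar', 'foo foo') {
    my ($matched, $captured) = probe($text);
    printf "%s: matched [%s], \$1 = [%s]\n", $text, $matched, $captured;
}
```

For "foo foo" this reports a match of [oo ] with $1 = [oo]. The first attempt, with $1 = "foo", is rejected by the lookahead, so the engine retries one character later. There $1 is "oo", and the text after the blank ("foo") does not begin with "oo", so the negative lookahead succeeds.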

> Any ideas regarding why using the negative lookahead assertion as I've
> done doesn't do what I'd like it to do? And/or how to do it correctly?

Express the pattern more precisely, for instance by adding a word
boundary assertion. This matches "foo bar" but not "foo foo":

/\b(\w+)\s+(?!\1)/
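
A quick way to check that claim is a small Perl sketch like the one below. The helper name differs is an assumption for illustration, not something from the thread.

```perl
use strict;
use warnings;

# differs() is a hypothetical helper wrapping the pattern above.
# The leading \b pins the capture to the start of a word, so the engine
# can no longer retry from the middle of "foo" with $1 = "oo".
sub differs {
    my ($text) = @_;
    return $text =~ /\b(\w+)\s+(?!\1)/ ? 1 : 0;
}

for my $text ('foo bar', 'foo foo') {
    printf "[%s] %s\n", $text, differs($text) ? 'match' : 'no match';
}
```

With these two inputs, "foo bar" is reported as a match and "foo foo" as no match.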
 
Anno Siegel

stenor wrote:

> I'd appreciate some help from a regex guru with this:
>
> I'm trying to match a pattern that looks more or less like this:
>
> <string1><statictext><string2>
>
> where string1 and string2 are not the same. I need to do this in a
> single regex (rather than capturing $1 and $2 and comparing after the
> fact).
>
> I've tried this as a test (in this case <statictext> is just
> whitespace):
>
> /(\w+)\s+(?!\1)/
>
> This will match "foo bar" as expected, but it also matches "foo foo"
> which it ought not to.

It matches "oo", followed by a blank, followed by something that is not
"oo" (namely "foo"), as it ought to. You may want to anchor the pattern,
or use "\b" boundaries around parts.
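
One possible reading of "boundaries around parts" is sketched below. It is an illustration, not a pattern posted in this thread, and the helper name differing_pair is hypothetical. It puts \b in front of the first word and after the backreference inside the lookahead, and it also requires a second word. With a boundary only in front of the capture, "foo foobar" would be rejected (the second word merely starts with "foo"), and "foo  foo" with two blanks would be accepted (\s+ can give back one blank so that the lookahead sees a blank instead of "foo").

```perl
use strict;
use warnings;

# differing_pair() is a hypothetical helper: it returns the two words when
# they differ, or an empty list when they are equal or nothing matches.
#   \b(\w+)     first word, pinned to a word start
#   \s+         the static text (whitespace in this test)
#   (?!\1\b)    what follows must not be exactly the first word
#   (\w+)       second word; also stops \s+ from backtracking onto a blank
sub differing_pair {
    my ($text) = @_;
    return $text =~ /\b(\w+)\s+(?!\1\b)(\w+)/ ? ($1, $2) : ();
}

for my $text ('foo bar', 'foo foo', 'foo foobar', 'foo  foo') {
    my @pair = differing_pair($text);
    printf "[%s] %s\n", $text, @pair ? "match: $pair[0] / $pair[1]" : 'no match';
}
```

Under these assumptions, "foo bar" and "foo foobar" match, while "foo foo" and "foo  foo" do not.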

Anno
 
