Regexp help - Negative lookahead before across word boundaries


Phrogz

Given a string like this:
"this.position.x = foo.bar.whee * jim.jam - yow / this.jorgle"

I want to match all the global identifiers which are not 'this', and I
'need' to do so without consuming any other characters.

This regexp:
/[^.]\b(?!this)[a-zA-Z_]\w*\b/
works, but it consumes the preceding character.

I thought this regexp would work:
/(?!\.)\b(?!this)[a-zA-Z_]\w*\b/i
but now I realize why it doesn't. (Because the position right after the
period satisfies both the negative lookahead and the word boundary.)
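
To make the two failure modes concrete, here is a small sketch that runs both regexps from the post through String#scan on the sample string. The variable names are only illustrative; the patterns and the string are copied from the post.

```ruby
s = "this.position.x = foo.bar.whee * jim.jam - yow / this.jorgle"

# First attempt: [^.] has to match a real character, so every hit
# carries the character in front of the identifier along with it.
consuming = s.scan(/[^.]\b(?!this)[a-zA-Z_]\w*\b/)
p consuming       # => [" foo", " jim", " yow"]

# Second attempt: (?!\.) looks ahead, not behind. Right after a period
# the next character is a letter, so the check passes and the
# properties are matched as well.
lookahead_only = s.scan(/(?!\.)\b(?!this)[a-zA-Z_]\w*\b/i)
p lookahead_only  # => ["position", "x", "foo", "bar", "whee", "jim", "jam", "yow", "jorgle"]
```

Note that the first pattern would also miss an identifier at the very start of the string, since there is no preceding character for [^.] to match.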


Help?
 

Robert Klemme

Phrogz said:
Given a string like this:
"this.position.x = foo.bar.whee * jim.jam - yow / this.jorgle"

I want to match all the global identifiers which are not 'this', and I
'need' to do so without consuming any other characters.

This regexp:
/[^.]\b(?!this)[a-zA-Z_]\w*\b/
works, but it consumes the preceding character.

I thought this regexp would work:
/(?!\.)\b(?!this)[a-zA-Z_]\w*\b/i
but now I realize why it doesn't. (Because the position right after the
period satisfies both the negative lookahead and the word boundary.)


Help?

That's a tough one. I think you need negative lookbehind, something that
the standard Ruby regexp engine does not have. I think Oniguruma will suit
you better.
http://raa.ruby-lang.org/project/oniguruma/
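
For what it's worth, a rough sketch of the lookbehind variant, assuming a regexp engine that supports (?<!...) (Oniguruma, or the engine built into Ruby 1.9 and later). The pattern and the variable names are illustrative and not from the original post; the \b after "this" is an added assumption so that an identifier such as "thistle" is still matched.

```ruby
s = "this.position.x = foo.bar.whee * jim.jam - yow / this.jorgle"

# (?<!\.)     the identifier is not directly preceded by a period
# \b          it starts at a word boundary
# (?!this\b)  it is not exactly the word "this"
global_ident = /(?<!\.)\b(?!this\b)[a-zA-Z_]\w*/

globals = s.scan(global_ident)
p globals  # => ["foo", "jim", "yow"]
```

Both checks are zero-width, so nothing besides the identifier itself is consumed, which is what the original question asks for.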

However, you can make do with the standard engine if you allow for more
processing steps:
s.scan(/[\w.]+/).reject{|m| /^this(\.|$)/ =~ m}.map{|m| m.split('.')[0]}
=> ["foo", "jim", "yow"]
s.scan(/[\w.]+/).reject{|m| /^this(\.|$)/ =~ m}.map{|m| /^\w+/.match(m)[0]}
=> ["foo", "jim", "yow"]

Kind regards

robert
 

William James

Phrogz said:
Given a string like this:
"this.position.x = foo.bar.whee * jim.jam - yow / this.jorgle"

I want to match all the global identifiers which are not 'this', and I
'need' to do so without consuming any other characters.

This regexp:
/[^.]\b(?!this)[a-zA-Z_]\w*\b/
works, but it consumes the preceding character.

s="bar this.position.x = foo.bar.whee * jim.jam - yow / this.jorgle"
p s.scan( /(?:^|[^.])\b(?!this)([a-zA-Z_]\w*)\b/ ).flatten

produces

["bar", "foo", "jim", "yow"]
 
