regular expression NOT operator

  • Thread starter Phil Cooper-king

Phil Cooper-king

Regex is something I've managed to ignore for some time, only learning
bits when needed.

I'm trying to do code highlighting, so I've been using regexes to find
parts of the code. I've run into an issue with comments:

# A comment can contain control statements like if and Constants

Things like "if" and a Constant get picked up by my regexes. I'm trying
to put a NOT operator in my regex but can't seem to get it to work.

For constants I search with /([A-Z].*?\b)/
How can I add "when there's no # on the left"?

I've been trying to use ?! as the NOT operator (is that even a NOT
operator?). This is what I've been trying: /([A-Z].*?\b)(?!<regex>)/

Any help would be greatly appreciated.

Thanks,
Phil.
 

David A. Black

Hi --

Phil Cooper-king wrote:
[...]
so for constants I search /([A-Z].*?\b)/
how can I add "when theres no # on the left"?

I've been trying to use the ?! as the not operator. (is that even a not
operator)
this is what i've been trying /([A-Z].*?\b)(?!<regex>)/

There's negative look-behind in Oniguruma, but I suspect you're going to
run into some difficulties anyway, because a # does not always start a
comment. For example:

x = 3
puts "This has no comments, and x is #{x}" if x < 5
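To make the difficulty concrete, here is a minimal sketch of what happens if a highlighter assumes that everything from the first # onward is a comment. The variable names and the "cut at the first #" rule are assumptions made up for illustration, not something from the original code:

```ruby
# The example line from above, held as a plain string (single quotes,
# so Ruby does not try to interpolate #{x} here).
line = 'puts "This has no comments, and x is #{x}" if x < 5'

# Naive assumption: the first "#" starts a comment.
code, comment = line.split('#', 2)

puts code.inspect
puts comment.inspect
```

The "comment" half ends up holding the tail of the string literal plus the trailing if, which is exactly the kind of false positive a simple "# on the left" test would give.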

My advice would be to keep it (relatively) straightforward by doing
something like this as you scan the lines of text:

comment_re = /^\s*#/

if comment_re.match(line)
  # treat line as a comment
else
  # line is not a comment
end
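A slightly fuller sketch of the same idea, in case it helps: classify each line first, and only look for constants on lines that are not comments. The constant pattern, the classify lambda and the sample lines are assumptions for illustration, and trailing comments (code followed by a # on the same line) are deliberately not handled:

```ruby
comment_re  = /^\s*#/           # a line whose first non-blank character is "#"
constant_re = /\b[A-Z]\w*\b/    # assumed pattern for a constant name

# Returns [kind, constants] for one line of source text.
classify = lambda do |line|
  if comment_re.match(line)
    [:comment, []]                    # skip highlighting inside comments
  else
    [:code, line.scan(constant_re)]   # collect every constant on the line
  end
end

source = [
  '# A comment can contain control statements like if and Constants',
  'x = Foo::BAR if ready'
]

results = source.map { |line| classify.call(line) }
results.each { |kind, constants| puts "#{kind}: #{constants.inspect}" }
```

Doing the comment check per line keeps each regex simple, at the cost of treating only whole-line comments as comments.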


David

 

G_ F_

Phil Cooper-king wrote:
[...]
so for constants I search /([A-Z].*?\b)/

This regex is not going to reliably match a constant. It matches any
upper-case letter, followed by a non-greedy wildcard, followed by a word
boundary, so it also fires on an upper-case letter in the middle of a word.

A constant has to begin with an upper-case letter, possibly followed by
mixed-case letters, numbers and underscores ("_").

The following shows the problem. The first two lines use your regex, and
the second two show a fix.

/([A-Z].*?\b)/ =~ 'noT a constant' # => 2
/([A-Z].*?\b)/ =~ 'a Constant' # => 2

/\b([A-Z]\w*\b)/ =~ 'noT a constant.' # => nil
/\b([A-Z]\w*\b)/ =~ 'a Constant.' # => 2

The # => at the end of each line shows the index where the match occurred
(nil means no match). The first pair shows a non-constant producing a
false positive.

Regexes are extremely powerful, but you have to think through what can go
wrong with them. When you are searching, you can easily get false
positives. If you are searching and replacing, you can destroy content.

Also, "?!" is not a NOT operator; it's a negative look-ahead. A match
succeeds only if the preceding part of the pattern matches and the pattern
inside (?!...) does not match immediately after it.

http://www.ruby-doc.org/docs/ProgrammingRuby/html/language.html#UN
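A small sketch of the difference between look-ahead and look-behind, assuming a Ruby with the Oniguruma engine (1.9 or later). The patterns and test strings are made up for illustration:

```ruby
# (?!...) is negative look-ahead: it inspects the text to the RIGHT
# of the current position.
no_bar_after = /foo(?!bar)/
ahead_ok     = no_bar_after =~ 'foobaz'   # "foo" is not followed by "bar"
ahead_fail   = no_bar_after =~ 'foobar'   # "foo" is followed by "bar"

# (?<!...) is negative look-behind: it inspects the text to the LEFT.
# This one only checks the single character just before the constant.
no_hash_before = /(?<!#)\b[A-Z]\w*\b/
behind_fail  = no_hash_before =~ '#Constant'     # "#" directly before "C"
behind_space = no_hash_before =~ '# Constant'    # a space, not "#", before "C"
behind_code  = no_hash_before =~ 'x = Constant'

p ahead_ok, ahead_fail, behind_fail, behind_space, behind_code
```

Note that '# Constant' still matches, because the look-behind only sees the one character before the constant. As far as I know, Ruby does not accept an open-ended pattern such as .* inside a look-behind, so "no # anywhere to the left" cannot be written this way, which is another reason to test the line for a comment first.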
 

Phil Cooper-king

David A. Black wrote:
[...]
My advice would be to keep it (relatively) straightforward by doing
something like this as you scan the lines of text:

comment_re = /^\s*#/
[...]

Thanks David, I'll give Oniguruma a look, but I think I'll go with your
suggestion.

Thanks G_F_ for pointing out my mistake.
 
