Regexp to match strings that _don't_ begin with a string

Wes Gamble

I would like to write a regexp that will match a string that does NOT
start with a specified set of characters.

For example,

Given:

xyz123
asldfhsl
xyk2345

and assume that I want to see only strings that don't start with "xyz"
(so in this case, the last 2 in the list).

I tried /^[^(xyz)]/ but I don't trust it. I don't think the grouping
will take inside the character class.

Do I need a negative lookahead assertion?

Thanks,
Wes
 
Andrew Johnson

Wes said:
I tried /^[^(xyz)]/ but I don't trust it. I don't think the grouping
will take inside the character class.

The [^(xyz)] creates a negated character class, so your regex would
match any string that starts with a character not in the given set
(the parentheses are just two more literal characters in that set).
That is not what you really want.

Wes said:
Do I need a negative lookahead assertion?

That would be a simple solution: /^(?!xyz)/ which will match when
the beginning of the line/string is not followed by 'xyz'.
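
To make the difference concrete, here is a minimal Ruby sketch run against the three sample strings from the original question. The variable names (samples, char_class, lookahead) are illustrative only and do not come from the thread.

```ruby
# Sample strings from the original question.
samples = %w[xyz123 asldfhsl xyk2345]

# Negated character class: rejects any string whose FIRST character
# is one of ( x y z ), regardless of what follows.
char_class = /^[^(xyz)]/

# Negative lookahead: rejects only strings that start with the
# literal sequence "xyz".
lookahead = /^(?!xyz)/

by_char_class = samples.select { |s| s =~ char_class }
by_lookahead  = samples.select { |s| s =~ lookahead }

p by_char_class  # "xyk2345" is dropped too, because it starts with "x"
p by_lookahead   # only "xyz123" is dropped
```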

andrew
 
Chris Alfeld

What about:

!(s =~ /^xyz/)

There is a good discussion of doing exactly this with negative
look-aheads in 'man perlre'. It's... ugly.
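
As a rough sketch of the negate-the-match approach, assuming the caller is free to loop over the strings itself, either spelling below keeps the strings that do not start with "xyz". The names samples, kept and kept_explicit are made up for this example.

```ruby
samples = %w[xyz123 asldfhsl xyk2345]

# !~ is the negated form of =~ and returns true or false.
kept = samples.select { |s| s !~ /^xyz/ }

# Same test with an explicit negation. The parentheses matter:
# unary ! binds more tightly than =~ in Ruby.
kept_explicit = samples.select { |s| !(s =~ /^xyz/) }

p kept
p kept_explicit
```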
 
Wes Gamble

In my example, won't /^(?!xyz)/ also match

29384723xyz02342

which is a little more than I want?

WG

 
Andrew Johnson

Wes said:
In my example, won't /^(?!xyz)/ also match

29384723xyz02342

which is a little more than I want?

Uhm, maybe I've misunderstood (wouldn't be the first time) -- I thought you
wanted to match strings that did not begin with 'xyz' ... and as far as I
can tell, "29384723xyz02342" does not start with 'xyz'.

# Print every line after __END__ that does not start with 'xyz'.
DATA.each_line do |line|
  print line if line =~ /^(?!xyz)/
end
__END__
xyefoo
xyzpfoo
asdfsdf
1230xyzasdf

produces:

xyefoo
asdfsdf
1230xyzasdf


puzzled,
andrew
 
Wes Gamble

Andrew,

That works fine.

In actuality, I need to pass a single regex that does the whole job into
another utility, which will apply it to an array of strings.

So although !~ is cool, I really didn't want to have to iterate through
the strings myself.
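
The utility in question is not named in the thread, so the following is only a sketch using Array#grep as a stand-in: it takes one pattern and applies it to every element. The wrapper name filter_strings is hypothetical.

```ruby
# Hypothetical stand-in for the "other utility": a method that receives
# one regex and applies it to every string. Array#grep does the work.
def filter_strings(strings, pattern)
  strings.grep(pattern)
end

# A single self-contained regex: "does not start with xyz".
not_xyz = /^(?!xyz)/

result = filter_strings(%w[xyz123 asldfhsl xyk2345 29384723xyz02342], not_xyz)
p result
```

Because the negation lives inside the pattern, the caller never has to invert the result of the match itself.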

Thanks,
Wes
 
Wes Gamble

I have a new wrinkle.

Now I want to match any line that doesn't have "xyz" or "abc" at the
beginning of the line.

Is there a way to "AND" together the input to the negative lookahead
assertion?

Wes
 
Joel VanderWerf

Wes said:
I have a new wrinkle.

Now I want to match any line that doesn't have "xyz" or "abc" at the
beginning of the line.

Is there a way to "AND" together the input to the negative lookahead
assertion?

For lookaheads, you can get AND by concatenating:


irb(main):001:0> /^(?!abc)(?!xyz)/ =~ "abc"
=> nil
irb(main):002:0> /^(?!abc)(?!xyz)/ =~ " abc"
=> 0
irb(main):003:0> /^(?!abc)(?!xyz)/ =~ "xyz"
=> nil
irb(main):004:0> /^(?!abc)(?!xyz)/ =~ " xyz"
=> 0
 
Robert Klemme

Wes said:
Thanks, that makes sense since the lookahead doesn't "consume" right?

Right. In this case an alternation works, too:

p %w{abcd xyz as}.map {|s| /^(?!abc|xyz)/=~s}
=> [nil, nil, 0]

De Morgan's Law comes to mind: not a and not b <=> not(a or b) :)
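
A quick way to see the equivalence in practice is to run both forms over the same inputs. This is a spot check on a handful of made-up sample strings, not a proof; the variable names are illustrative.

```ruby
concatenated = /^(?!abc)(?!xyz)/  # not abc AND not xyz
alternated   = /^(?!abc|xyz)/     # not (abc OR xyz)

samples = ["abcd", "xyz", "as", " abc", " xyz"]

# Both patterns should accept or reject every sample identically.
agree = samples.all? { |s| concatenated.match?(s) == alternated.match?(s) }

# Strings that start with neither "abc" nor "xyz".
matched = samples.select { |s| alternated.match?(s) }

p agree
p matched
```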

Kind regards

robert
 
