Greedy and non-greedy quantifiers

Dan Kelly

I'm just after a bit of clarification on quantifiers in regular
expressions.
I just want to be sure of the differences between the quantifiers. So
for these various regular expressions:

[a-z]* - this will match any number of lowercase letters
[a-z]+ - this will match any number of lowercase letters
(what's the difference between + and * in this case?)
[a-z]+? -

or

\d* - this will match any number of digits
\d*? - this will only match none or one digit

Please can someone offer some clarification? I'm still unsure of myself
with these expressions as I'm new to Ruby.
Thanks,
Dan
 
Andrew Timberlake

Dan

* matches 0 or more of the preceding pattern
+ matches 1 or more of the preceding pattern
? matches 0 or 1 of the preceding pattern
{n,m} matches n to m of the preceding pattern
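
A minimal sketch of the four quantifiers listed above, which also shows where + and * differ. It assumes Ruby 2.4 or later for `match?`, and the variable names are made up for illustration. Each pattern is anchored so that the whole string has to match.

```ruby
# \A and \z anchor the pattern to the start and end of the string.
star     = /\A[a-z]*\z/      # zero or more lowercase letters
plus     = /\A[a-z]+\z/      # one or more lowercase letters
optional = /\A[a-z]?\z/      # zero or one lowercase letter
bounded  = /\A[a-z]{2,3}\z/  # two to three lowercase letters

p star.match?("")         # => true  (zero letters is fine for *)
p plus.match?("")         # => false (+ needs at least one letter)
p plus.match?("abc")      # => true
p optional.match?("ab")   # => false (more than one letter)
p bounded.match?("abc")   # => true
p bounded.match?("abcd")  # => false (four letters is over the upper bound)
```

So the only difference between `[a-z]*` and `[a-z]+` is whether an empty run of letters counts as a match.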

I stand to be corrected but I don't believe that *? or +? are valid at all.

Andrew Timberlake
 
Dan Kelly

yeah that helps thanks, I think I was just making things up..
 
Radosław Bułat

puts "aaaabaaaab".match(/.*?b/) # matches "aaab"

It of course matches "aaaab" (I dropped one 'a' in the comment).
 
Robert Klemme

There are 4 (I hope I didn't miss anything) operators to specify repetitions.

*, +, ? and {m,n}
* - matches 0 or more
+ - matches 1 or more
? - matches 0 or 1 (same as {0,1} )
{m,n} matches m..n

All these operators are greedy (I'll show you examples). To make an
operator non-greedy, add '?' after it. So the non-greedy
operators are:
*?, +?, ??, {m,n}?
*? - matches 0 or more
+? - matches 1 or more
?? - matches 0 or 1 (same as {0,1}?)
{m,n}? - matches m..n

Maybe '??' looks strange, but all of these are absolutely correct.
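
A quick way to check that these forms are accepted, and that a lazy operator allows the same repetition counts as its greedy twin, is to anchor both and compare them on a few strings. This is only a sketch: the `pairs` and `samples` names are made up for illustration, and `match?` assumes Ruby 2.4 or later.

```ruby
# Greedy pattern on the left, its non-greedy counterpart on the right.
pairs = [
  [/\Aa*\z/,     /\Aa*?\z/],
  [/\Aa+\z/,     /\Aa+?\z/],
  [/\Aa?\z/,     /\Aa??\z/],
  [/\Aa{2,3}\z/, /\Aa{2,3}?\z/]
]
samples = ["", "a", "aa", "aaa", "aaaa"]

pairs.each do |greedy, lazy|
  # With both ends anchored, only the allowed repetition count matters.
  same = samples.all? { |s| greedy.match?(s) == lazy.match?(s) }
  puts "#{greedy.source} and #{lazy.source} agree on every sample: #{same}"
end
```

When the whole string has to match, greediness has no visible effect. It only shows up when the pattern is free to stop early.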

So far no differences.

Why do we have greedy and non-greedy operators? Because they work in
different ways. A greedy operator tries to match as many characters
(precisely: as many repetitions of the expression to its left) as it can,
and if the rest of the pattern then fails it moves back and tries to
match fewer characters.

For example:
puts "aaaa".match(/a*/) # matches whole string
puts "aaaa" =~ /a*?/ # prints 0: matches no characters at index 0 (but the match still succeeds)

But it can match more if it needs to in order to make the match succeed:

irb(main):005:0> /a*?/.match("aaab").to_a
=> [""]
irb(main):006:0> /a*?b/.match("aaab").to_a
=> ["aaab"]
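
Capture groups make both behaviours visible: how much a greedy operator hands back, and how far a non-greedy one is forced to grow. A small sketch, with variable names chosen only for illustration:

```ruby
# Greedy: a* first grabs all four 'a's, then gives one back so (a) can match.
greedy_match = "aaaa".match(/(a*)(a)/)
p greedy_match[1]  # => "aaa"
p greedy_match[2]  # => "a"

# Non-greedy: a*? starts with nothing and grows one 'a' at a time until (b) fits.
lazy_match = "aaab".match(/(a*?)(b)/)
p lazy_match[1]    # => "aaa"
p lazy_match[2]    # => "b"
```
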
Another example:
puts "aaaabaaaab".match(/.*b/) # matches whole string
puts "aaaabaaaab".match(/.*?b/) # matches "aaaab"
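
The same string with `String#scan` shows the practical consequence: the greedy pattern swallows everything up to the last 'b', while the non-greedy one stops at each 'b' in turn. Again just a sketch, with illustrative variable names:

```ruby
text = "aaaabaaaab"

greedy = text.scan(/.*b/)   # runs to the last 'b' in the string
lazy   = text.scan(/.*?b/)  # stops at the nearest 'b' every time

p greedy  # => ["aaaabaaaab"]
p lazy    # => ["aaaab", "aaaab"]
```
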

I always recommend Jeffrey Friedl's book "Mastering Regular
Expressions" (http://regex.info/). After reading this book you will be
a master of regexps ;-).

Definitely worth reading!

Cheers

robert
 
