Problem in 2 code

Amir Ebrahimifard · Jul 31, 2010

Hi,
I don't understand the results of these 2 pieces of code.
Could you please explain what each one does?

1 -
x = "This is a test".match( /(\w+)(\w+)/ )
puts x[0]
puts x[1]
puts x[2]

2 -
x = "This is a test".match( /(\w+) (\w+)/ )
puts x[0]
puts x[1]
puts x[2]

Kc Co · Aug 1, 2010

/(\w+)(\w+)/ is a Regexp. If you want to know more about it, you should
probably look it up to help clear up the confusion. \w+ means one or more
word characters, so /(\w+)(\w+)/ matches two or more consecutive word
characters. The match method returns the first place in the string that
matches, which is "This". As for what x[1] and x[2] print, I could be
wrong about it, but I believe they are the parts of the match captured by
each parenthesized group of the Regexp. It'd probably be best to have
someone else confirm that.

In the second code, it's the same thing except there's a space between
the two \w+. This means at least one word character followed by a space
followed by at least one word character. That is why it returns the
match "This is" instead of just "This".
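To make the difference concrete, here is a small sketch that runs both patterns on the same string. The variable names (text, adjacent, spaced) are only illustrative and are not from the original post.

```ruby
text = "This is a test"

# No space between the groups: both groups must fit inside one word.
adjacent = text.match(/(\w+)(\w+)/)
puts adjacent[0]  # This
puts adjacent[1]  # Thi
puts adjacent[2]  # s

# A literal space between the groups: each group gets its own word.
spaced = text.match(/(\w+) (\w+)/)
puts spaced[0]  # This is
puts spaced[1]  # This
puts spaced[2]  # is
```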

I hope this was helpful.

Amir Ebrahimifard · Aug 1, 2010

Thanks for the answer, but I still have a problem:
What does the first code do?
Why, when I write "puts x[0]", does Ruby print "This", while "puts x[1]"
prints "Thi" and "puts x[2]" prints "s"?

Markus Fischer · Aug 1, 2010

Hello Amir,

why when I write "puts x[0]" ruby returns "This" and for "puts x[1]"
returns "Thi" and for "puts x[2]" returns "s" ?

The first element (x[0]) is always the complete match, i.e. the text
matched by the whole regular expression. The remaining elements are the
individual sub-matches, one per pair of parentheses, if there are any.

One also has to know that, by default, in most implementations the
quantifiers of a regular expression are "greedy", which means each one
tries to match as many characters as possible.
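As a rough illustration of how the returned MatchData object is laid out, the sketch below uses the second pattern from the question. The variable name m is just a placeholder.

```ruby
m = "This is a test".match(/(\w+) (\w+)/)

p m[0]          # "This is"   (the complete match)
p m[1]          # "This"      (first pair of parentheses)
p m[2]          # "is"        (second pair of parentheses)
p m[3]          # nil         (there is no third group)
p m.captures    # ["This", "is"]
p m.to_a        # ["This is", "This", "is"]
p m.post_match  # " a test"   (what follows the match)

# When nothing matches at all, match returns nil instead of a MatchData.
p "!!!".match(/(\w+)/)  # nil
```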

So, given your first example:

"This is a test".match( /(\w+)(\w+)/ )

\w - match a single "word" character

\w+ - match one or more "word" characters

Now since by default everything is greedy, the first \w+ tries to match
as much as possible and initially takes the whole word "This". The second
\w+ needs at least one character to fulfill its task too, so the regex
engine backtracks: the first \w+ gives back just the last character and
keeps everything before it ("Thi"), leaving "s" for the second \w+ .
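A quick way to see this behaviour is to try the same pattern on words of different lengths. The helper greedy_split below is a hypothetical name made up for this sketch, not something from Ruby itself.

```ruby
# Hypothetical helper: returns the two captured groups, or nil if no match.
def greedy_split(word)
  m = word.match(/(\w+)(\w+)/)
  m && m.captures
end

p greedy_split("This")    # ["Thi", "s"]
p greedy_split("regexp")  # ["regex", "p"]
p greedy_split("ab")      # ["a", "b"]

# A single character cannot satisfy both groups, so there is no match.
p greedy_split("a")       # nil
```

In every case the first group keeps all but the last character, which is the minimum the second group needs.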

There's a special character ? which, placed directly after a quantifier
such as +, tells it to be non-greedy. Try this example:

"This is a test".match( /(\w+?)(\w+)/ )

irb(main):006:0> "1234".match(/(\d+?)(\d+)/)
=> #<MatchData "1234" 1:"1" 2:"234">

The +? means "match as few as possible", so in the irb example \d+? only
matches the first "1" and leaves all the rest to the second \d+ .
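Applied to the string from the original question, the non-greedy version would behave roughly as sketched here. The variable names lazy_first and lazy_both are only illustrative.

```ruby
text = "This is a test"

# First group non-greedy, second group greedy.
lazy_first = text.match(/(\w+?)(\w+)/)
p lazy_first[0]  # "This"
p lazy_first[1]  # "T"
p lazy_first[2]  # "his"

# Both groups non-greedy: each takes the single character it needs.
lazy_both = text.match(/(\w+?)(\w+?)/)
p lazy_both[0]   # "Th"
p lazy_both[1]   # "T"
p lazy_both[2]   # "h"
```

Note that the complete match x[0] is still "This" in the first case; only the split between the two groups changes.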

In your case it's debatable whether this regex really makes sense,
though; at first glance it doesn't look like a generally useful pattern
and seems very specific.

HTH
