String#split(/\s+/) vs. String#split(/(\s+)/)


Sam Kong

Hello Rubyists,

I'm reading Ruby Cookbook.
The first chapter is about String.
One of the examples shows the difference between String#split(/\s+/) and
String#split(/(\s+)/) without much explanation.
I understand what sub-grouping is in regex,
but I don't understand what role it plays in String#split.

s = "one two three"

p s.split(/\s+/) #=> ["one", "two", "three"]
p s.split(/(\s+)/) #=> ["one", " ", "two", " ", "three"]


Could anybody explain it, please?

Thanks,
Sam
 

dblack

Hi --

When you use (), you get the delimiter (the thing you're splitting on)
back in the array, along with the items between the delimiters. An
example without spaces might make it clearer:

"aaaXXXbbbXXXccc".split(/XXX/) => ["aaa","bbb","ccc"]
"aaaXXXbbbXXXccc".split(/(XXX)/) => ["aaa","XXX","bbb","XXX","ccc"]

In your example, the delimiter is \s+, which is of variable length;
that's why the whitespace strings returned in the final array can
differ in length.
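
To make the variable-length point concrete, here is a minimal sketch. The input string with one space and then three spaces is a made-up example, not the one from the book:

```ruby
# Hypothetical input: one space after "one", three spaces after "two".
s = "one two   three"

# Without a group, the delimiters are discarded.
without_group = s.split(/\s+/)
p without_group  #=> ["one", "two", "three"]

# With a capturing group, each matched delimiter is returned as well,
# at whatever length it had in the original string.
with_group = s.split(/(\s+)/)
p with_group     #=> ["one", " ", "two", "   ", "three"]

# Because nothing is thrown away, joining the pieces rebuilds the input.
p with_group.join == s  #=> true
```

One practical use of the grouped form is exactly that last line: you can process the words and still put the string back together with its original spacing.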


David

--
http://www.rubypowerandlight.com => Ruby/Rails training & consultancy
----> SEE SPECIAL DEAL FOR RUBY/RAILS USERS GROUPS! <-----
http://dablog.rubypal.com => D[avid ]A[. ]B[lack's][ Web]log
http://www.manning.com/black => book, Ruby for Rails
http://www.rubycentral.org => Ruby Central, Inc.
 

Ken & Deb Allen

Why does using the parentheses cause the separator string/character
to be placed into the resulting array?

-ken

 

Eero Saynatkari

# Try this one
p s.split(/((((\s+))))/)
 

Jan Svitok

Seems like all groups in the separator regex are output to the result array.

I wonder where it is documented, apart from the source itself
(string.c, rb_str_split_m())?
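
A quick sketch of that observation, for anyone who wants to check it in irb. The second input string is a hypothetical example of mine, and the outputs are what I would expect from a current Ruby:

```ruby
s = "one two three"

# Four nested groups all capture the same space, so each delimiter
# shows up four times in the result, once per group.
nested = s.split(/((((\s+))))/)
p nested
#=> ["one", " ", " ", " ", " ", "two", " ", " ", " ", " ", "three"]

# Two side-by-side groups: each one contributes its own capture,
# in group order, after every piece. (Hypothetical input.)
two_groups = "a, b, c".split(/(,)(\s)/)
p two_groups
#=> ["a", ",", " ", "b", ",", " ", "c"]
```

So the rule seems to be "one extra element per capturing group per match", not "one extra element per delimiter".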
 

Rick DeNatale

Well the pickaxe (2nd ed.) says so:

"If pattern is a Regexp, str is divided where the pattern matches.
Whenever the pattern matches a zero-length string, str is split into
individual characters. If pattern includes groups, these groups will
be included in the returned values."

Ruby-doc.org doesn't have that last sentence in either the 1.8 or
the 1.9 documentation.
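
The quoted sentences can be tried one at a time. This is only a small sketch with made-up inputs; the last line goes slightly beyond the quote and shows a non-capturing group, (?:...), which as far as I know groups without adding anything to the result:

```ruby
# "Whenever the pattern matches a zero-length string, str is split
# into individual characters."
chars = "abc".split(//)
p chars    #=> ["a", "b", "c"]

# "If pattern includes groups, these groups will be included in the
# returned values."
kept = "a1b22c".split(/(\d+)/)
p kept     #=> ["a", "1", "b", "22", "c"]

# A non-capturing group is not a capture, so the delimiters are
# dropped just as they would be with no parentheses at all.
dropped = "a1b22c".split(/(?:\d+)/)
p dropped  #=> ["a", "b", "c"]
```

The non-capturing form is handy when you need parentheses for alternation or repetition but don't want the delimiters back.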
 
