regex \s == \n???

Tom Cloyd

I'm trying to remove extra spaces from a long string which has some
EOLs, using regex. It's not working. Here's a simple demo:

irb(main):004:0> a="\n  abc\n  a  a  a"
=> "\n  abc\n  a  a  a"
irb(main):005:0> a.gsub(/\s+/,' ')
=> " abc a a a"

I've dug around in my regex references, and all I can say is that it
hasn't been the least bit helpful. I'm probably not looking for the
right thing.

Can someone more knowledgeable tell me if there's a way to do this -
remove extra spaces without removing the EOLs?

Thanks!

t.

--

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tom Cloyd, MS MA, LMHC - Private practice Psychotherapist
Bellingham, Washington, U.S.A: (360) 920-1226
<< (e-mail address removed) >> (email)
<< TomCloyd.com >> (website)
<< sleightmind.wordpress.com >> (mental health weblog)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
Stefano Crocco

On Friday 06 February 2009, Tom Cloyd wrote:
I'm trying to remove extra spaces from a long string which has some
EOLs, using regex. It's not working. Here's a simple demo:

irb(main):004:0> a="\n  abc\n  a  a  a"
=> "\n  abc\n  a  a  a"
irb(main):005:0> a.gsub(/\s+/,' ')
=> " abc a a a"

I've dug around in my regex references, and all I can say is that it
hasn't been the least bit helpful. I'm probably not looking for the
right thing.

Can someone more knowledgeable tell me if there's a way to do this -
remove extra spaces without removing the EOLs?

Thanks!

t.

According to "The Ruby Programming Language", \s is equivalent to " \t\n\r\f".
So, if you want to avoid removing newlines, you'll need to replace \s with
[ \t\r\f], or with a plain space if that's the only character you're
interested in:

a = "\n  abc\n  a  a  a"
a.gsub(/ +/, ' ')
=> "\n abc\n a a a"
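
For anyone who wants to see the two patterns side by side, here is a minimal sketch run against the string from the original question. The variable names are only illustrative.

```ruby
# Sample string from the question: newlines plus runs of two spaces.
a = "\n  abc\n  a  a  a"

# \s matches newlines too, so each "\n  " run collapses into one space.
collapsed_all = a.gsub(/\s+/, ' ')

# A literal space only matches spaces, so the newlines are left alone.
collapsed_spaces = a.gsub(/ +/, ' ')

p collapsed_all     # => " abc a a a"
p collapsed_spaces  # => "\n abc\n a a a"
```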

I hope this helps

Stefano
 
joe chesak

Tom,

If you're just speaking of the space character and you want to replace
double-spaces (or triple-spaces or more) with just a single space, you can
do this.

puts a.gsub(/ +/," ")

Joe
 
Tom Cloyd

Stefano, Joe - thank you! I'm only just getting into regex, so I get
easily lost. You solved my problem - each in different ways. A lot of
bang for the buck, indeed!

t.

 
David A. Black

Hi --

Stefano, Joe - thank you! I'm only just getting into regex, so I get easily
lost. You solved my problem - each in different ways. A lot of bang for the
buck, indeed!

Another variant:

a.gsub(/[^\S\n]+/, " ")

That character class means "any character that is neither a non-space
nor \n", in other words any whitespace except a newline. (The ^ is the
"not" part, and \S is "non-space".)
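
A small sketch of that double negative in action, using a made-up input that mixes spaces and tabs (the string and variable names are assumptions for illustration only):

```ruby
# Hypothetical input mixing spaces, tabs, and newlines.
text = "one \t two\n\tthree  four\n"

# [^\S\n] is "whitespace other than \n": runs of spaces and tabs are
# replaced by one space, while each newline stays where it is.
result = text.gsub(/[^\S\n]+/, " ")

p result  # => "one two\n three four\n"
```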

You might also be able to use squeeze:

p "abc  def  \n  ghi\n".squeeze # "abc def \n ghi\n"

though that's going to be less versatile if you're dealing, say, with
tabs.


David

--
David A. Black / Ruby Power and Light, LLC
Ruby/Rails consulting & training: http://www.rubypal.com
Coming in 2009: The Well-Grounded Rubyist (http://manning.com/black2)

http://www.wishsight.com => Independent, social wishlist management!
 
Mark Thomas

I'm trying to remove extra spaces from a long string which has some
EOLs, using regex. It's not working. Here's a simple demo:

irb(main):004:0> a="\n  abc\n  a  a  a"
=> "\n  abc\n  a  a  a"
irb(main):005:0> a.gsub(/\s+/,' ')
=> " abc a a a"

I've dug around in my regex references, and all I can say is that it
hasn't been the least bit helpful. I'm probably not looking for the
right thing.

A newline is a whitespace char. \s is the same as [ \t\r\n\f]. If you
don't want to match newlines, leave them out of the character class. Try
a.gsub(/[ \t]+/,' ')
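
One place where the choice of character class seems to matter is text with Windows-style "\r\n" line endings. The sketch below uses a made-up input to compare [ \t] with the [^\S\n] class suggested earlier in the thread; the variable names are illustrative only.

```ruby
# Hypothetical input with a Windows-style "\r\n" line ending.
text = "a \t b\r\nc  d"

# [ \t] touches only spaces and tabs, so "\r\n" is left intact.
tabs_and_spaces = text.gsub(/[ \t]+/, ' ')

# [^\S\n] also matches "\r", so the carriage return becomes a space.
all_but_newline = text.gsub(/[^\S\n]+/, ' ')

p tabs_and_spaces  # => "a b\r\nc d"
p all_but_newline  # => "a b \nc d"
```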

--Mark
 
Mark Thomas

Can't you use squeeze?

Best idea yet. Might as well use a built-in, rather than reinventing
one.

a.squeeze(" ")

Thanks for the reminder. I should review String.instance_methods every
once in a while. There's some good stuff there.
 
Siep Korteling

David said:
(squeeze defaults to " " as its argument, so you don't have to provide
an argument unless it's something different.)


David

Is that a 1.9.1 change? In 1.8.6 String#squeeze squeezes everything if
no arguments are given.

"abc aabbcc ".squeeze
#=>"abc abc "
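
To make the difference concrete, here is a short sketch on a made-up string. With no argument, squeeze collapses every run of a repeated character; given " ", it collapses only runs of spaces.

```ruby
# Hypothetical input with repeated letters, spaces, and newlines.
s = "aaa  bbb\n\n  ccc"

squeezed_all    = s.squeeze       # every run of identical characters
squeezed_spaces = s.squeeze(" ")  # only runs of spaces

p squeezed_all     # => "a b\n c"
p squeezed_spaces  # => "aaa bbb\n\n ccc"
```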

Siep
 
Tom Cloyd

Thanks, David. I continue to be amazed by the depth of your knowledge,
and outright cleverness. In pursuing this simple problem I'm
inadvertently learning a lot. I'm grateful. Thanks for your contribution
to that process!

t.
