regex \s == \n???

Discussion in 'Ruby' started by Tom Cloyd, Feb 6, 2009.

  1. Tom Cloyd

    Tom Cloyd Guest

    I'm trying to remove extra spaces from a long string which has some
    EOLs, using regex. It's not working. Here's a simple demo:

    irb(main):004:0> a="\n  abc\n  a  a  a"
    => "\n  abc\n  a  a  a"
    irb(main):005:0> a.gsub(/\s+/,' ')
    => " abc a a a"

    I've dug around in my regex references, and all I can say is that it
    hasn't been the least bit helpful. I'm probably not looking for the
    right thing.

    Can someone more knowledgeable tell me if there's a way to do this -
    remove extra spaces without removing the EOLs?

    Thanks!

    t.

    --

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Tom Cloyd, MS MA, LMHC - Private practice Psychotherapist
    Bellingham, Washington, U.S.A: (360) 920-1226
    << >> (email)
    << TomCloyd.com >> (website)
    << sleightmind.wordpress.com >> (mental health weblog)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     
    Tom Cloyd, Feb 6, 2009
    #1

  2. On Friday 06 February 2009, Tom Cloyd wrote:
    > I'm trying to remove extra spaces from a long string which has some
    > EOLs, using regex. It's not working. Here's a simple demo:
    >
    > irb(main):004:0> a="\n  abc\n  a  a  a"
    > => "\n  abc\n  a  a  a"
    > irb(main):005:0> a.gsub(/\s+/,' ')
    > => " abc a a a"
    >
    > I've dug around in my regex references, and all I can say is that it
    > hasn't been the least bit helpful. I'm probably not looking for the
    > right thing.
    >
    > Can someone more knowledgeable tell me if there's a way to do this -
    > remove extra spaces without removing the EOLs?
    >
    > Thanks!
    >
    > t.


    According to "The Ruby Programming Language", \s is equivalent to [ \t\n\r\f].
    So, if you want to avoid removing newlines, you'll need to replace \s with
    [ \t\r\f], or with a plain space if that's all you're interested in:

    a="\n  abc\n  a  a  a"
    a.gsub(/ +/, ' ')
    => "\n abc\n a a a"
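
    To see the two patterns side by side, here is a minimal sketch. The helper name squeeze_blanks is made up for illustration, and the sample string assumes doubled spaces as in the original question:

```ruby
# Collapse runs of horizontal whitespace but leave newlines alone.
# squeeze_blanks is a hypothetical helper name, not something from the thread.
def squeeze_blanks(text)
  text.gsub(/[ \t\r\f]+/, ' ')
end

a = "\n  abc\n  a  a  a"

p a.gsub(/\s+/, ' ')   # => " abc a a a"       (\s also matches \n, so the EOLs go)
p squeeze_blanks(a)    # => "\n abc\n a a a"   (EOLs survive)
```

    Note that \r is still in the class, so a Windows-style "\r\n" line ending would come out as " \n". Drop \r from the class if that matters for your data.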

    I hope this helps

    Stefano
     
    Stefano Crocco, Feb 6, 2009
    #2

  3. joe chesak

    joe chesak Guest

    [Note: parts of this message were removed to make it a legal post.]

    Tom,

    If you're just speaking of the space character and you want to replace
    double-spaces (or triple-spaces or more) with just a single space, you can
    do this.

    puts a.gsub(/ +/," ")

    Joe

     
    joe chesak, Feb 6, 2009
    #3
  4. Tom Cloyd

    Tom Cloyd Guest

    Stefano, Joe - thank you! I'm only just getting into regex, so I get
    easily lost. You solved my problem - each in different ways. A lot of
    bang for the buck, indeed!

    t.

     
    Tom Cloyd, Feb 6, 2009
    #4
  5. Hi --

    On Fri, 6 Feb 2009, Tom Cloyd wrote:

    > Stefano, Joe - thank you! I'm only just getting into regex, so I get easily
    > lost. You solved my problem - each in different ways. A lot of bang for the
    > buck, indeed!


    Another variant:

    a.gsub(/[^\S\n]+/, " ")

    That character class means "any character that is neither a
    non-whitespace character nor \n", i.e. any whitespace except a newline.
    (The ^ is the "not" part.)
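
    As a rough illustration of that negated class (the method name collapse_keep_newlines and the sample string are made up for this sketch):

```ruby
# [^\S\n] matches any whitespace character except "\n":
# \S is "non-whitespace", and the leading ^ negates the whole class.
# collapse_keep_newlines is a hypothetical name used only for this example.
def collapse_keep_newlines(text)
  text.gsub(/[^\S\n]+/, ' ')
end

p collapse_keep_newlines("a \t b\n\t\tc  d\n")   # => "a b\n c d\n"
```

    Unlike / +/, this also folds tabs and mixed space/tab runs into a single space.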

    You might also be able to use squeeze:

    p "abc def \n ghi\n".squeeze # "abc def \n ghi\n"

    though that's going to be less versatile if you're dealing, say, with
    tabs.


    David

    --
    David A. Black / Ruby Power and Light, LLC
    Ruby/Rails consulting & training: http://www.rubypal.com
    Coming in 2009: The Well-Grounded Rubyist (http://manning.com/black2)

    http://www.wishsight.com => Independent, social wishlist management!
     
    David A. Black, Feb 6, 2009
    #5
  6. Mark Thomas

    Mark Thomas Guest

    On Feb 6, 7:10 am, Tom Cloyd <> wrote:
    > I'm trying to remove extra spaces from a long string which has some
    > EOLs, using regex. It's not working. Here's a simple demo:
    >
    > irb(main):004:0> a="\n  abc\n  a  a  a"
    > => "\n  abc\n  a  a  a"
    > irb(main):005:0> a.gsub(/\s+/,' ')
    > => " abc a a a"
    >
    > I've dug around in my regex references, and all I can say is that it
    > hasn't been the least bit helpful. I'm probably not looking for the
    > right thing.


    A newline is a whitespace char. \s is the same as [ \t\r\n\f]. If you
    don't want to match newlines, leave them out of the character class. Try
    a.gsub(/[ \t]+/,' ')

    --Mark
     
    Mark Thomas, Feb 6, 2009
    #6
  7. Can't you use squeeze?

    Blog: http://random8.zenunit.com/
    Learn rails: http://sensei.zenunit.com/

     
    Julian Leviston, Feb 6, 2009
    #7
  8. Mark Thomas

    Mark Thomas Guest

    On Feb 6, 9:37 am, Julian Leviston <> wrote:
    > Can't you use squeeze?


    Best idea yet. Might as well use a built-in, rather than reinventing
    one.

    a.squeeze(" ")
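
    A small sketch of what squeeze(" ") does and does not touch. The wrapper name squeeze_spaces and the sample strings are only for illustration:

```ruby
# String#squeeze with an argument collapses runs of the listed characters only.
# squeeze_spaces is a hypothetical wrapper name for this example.
def squeeze_spaces(text)
  text.squeeze(' ')
end

p squeeze_spaces("\n  abc\n  a  a  a")   # => "\n abc\n a a a"
p squeeze_spaces("a\t\tb  c")            # => "a\t\tb c"   (tabs are left alone)

# squeeze only collapses runs of the SAME character, so a mixed
# space/tab/space run is not reduced even when both are listed.
p "a \t b".squeeze(" \t")                # => "a \t b"
```

    So squeeze is the simplest fit when only plain spaces are involved. For mixed spaces and tabs, one of the gsub patterns is the safer choice.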

    Thanks for the reminder. I should review String.instance_methods every
    once in a while. There's some good stuff there.
     
    Mark Thomas, Feb 6, 2009
    #8
  9. David A. Black wrote:
    > Hi --


    > (squeeze defaults to " " as its argument, so you don't have to provide
    > an argument unless it's something different.)
    >
    >
    > David


    Is that a 1.9.1 change? In 1.8.6 String#squeeze squeezes everything if
    no arguments are given.

    "abc aabbcc ".squeeze
    #=>"abc abc "

    Siep
    --
    Posted via http://www.ruby-forum.com/.
     
    Siep Korteling, Feb 6, 2009
    #9
  10. Hi --

    On Sat, 7 Feb 2009, Siep Korteling wrote:

    > David A. Black wrote:
    >> Hi --

    >
    >> (squeeze defaults to " " as its argument, so you don't have to provide
    >> an argument unless it's something different.)
    >>
    >>
    >> David

    >
    > Is that a 1.9.1 change? In 1.8.6 String#squeeze squeezes everything if
    > no arguments are given.
    >
    > "abc aabbcc ".squeeze
    > #=>"abc abc "


    Sorry, my mistake. It does squeeze everything.
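
    A quick check of both forms, as a sketch (the sample string is made up):

```ruby
s = "aaa  bbb\n\nccc"

# No argument: every run of a repeated character is collapsed,
# letters and newlines included.
p s.squeeze        # => "a b\nc"

# With " " as the argument: only runs of spaces are collapsed.
p s.squeeze(' ')   # => "aaa bbb\n\nccc"
```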


    David

     
    David A. Black, Feb 6, 2009
    #10
  11. Tom Cloyd

    Tom Cloyd Guest

    David A. Black wrote:
    > Hi --
    >
    > On Fri, 6 Feb 2009, Tom Cloyd wrote:
    >
    >> Stefano, Joe - thank you! I'm only just getting into regex, so I get
    >> easily lost. You solved my problem - each in different ways. A lot of
    >> bang for the buck, indeed!

    >
    > Another variant:
    >
    > a.gsub(/[^\S\n]+/, " ")
    >
    > That character class means "all characters that are not a non-space or
    > \n." (The ^ is the "not" part.)
    >
    > You might also be able to use squeeze:
    >
    > p "abc def \n ghi\n".squeeze # "abc def \n ghi\n"
    >
    > though that's going to be less versatile if you're dealing, say, with
    > tabs.
    >
    >
    > David
    >

    Thanks, David. I continue to be amazed by the depth of your knowledge,
    and outright cleverness. In pursuing this simple problem I'm
    inadvertently learning a lot. I'm grateful. Thanks for your contribution
    to that process!

    t.

     
    Tom Cloyd, Feb 6, 2009
    #11
  12. Sorry, forgot the " " argument

     
    Julian Leviston, Feb 7, 2009
    #12