The easiest way to separate substrings from a line string

Discussion in 'Ruby' started by Sarawut Poaitwinyu, Jul 9, 2009.

  1. I am working on a tag system, and there is information like this:

    "important urgent project 2009"

    I want to separate it into words; one word at a time is okay.

    What is the easiest way to separate that string word by word? There
    might be more than one space between words.

    My idea is to loop over the string character by character, check whether
    each one is a space, and cut out the parts in between, but I thought
    there might be an easier way that I don't know.

    Thank you in advance
    --
    Posted via http://www.ruby-forum.com/.
    Sarawut Poaitwinyu, Jul 9, 2009
    #1

  2. Hi --

    On Thu, 9 Jul 2009, Sarawut Poaitwinyu wrote:

    > I am working on a tag system, and there is information like this:
    >
    > "important urgent project 2009"
    >
    > I want to separate it into words; one word at a time is okay.
    >
    > What is the easiest way to separate that string word by word? There
    > might be more than one space between words.
    >
    > My idea is to loop over the string character by character, check whether
    > each one is a space, and cut out the parts in between, but I thought
    > there might be an easier way that I don't know.


    words = string.split

    When you call split with no argument, it splits on whitespace
    (including runs of more than one whitespace character).
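
A minimal sketch of the behaviour described above, assuming the default field separator (`$;` is nil). The sample string and variable names are illustrative; extra spaces, a tab and a newline have been added to the example from the original post.

```ruby
# Sample tag line with runs of spaces, a tab and a trailing newline.
tags = "important   urgent  project\t2009\n"

# With no argument, split breaks on runs of whitespace and ignores
# leading and trailing whitespace.
words = tags.split
p words  # ["important", "urgent", "project", "2009"]

# Splitting on a single-space pattern instead keeps an empty string
# for every extra space, which is usually not what a tag parser wants.
p "a  b".split(/ /)  # ["a", "", "b"]

# Handling one word at a time:
words.each { |word| puts word }
```

The no-argument form is the one that tolerates irregular spacing, which is the case the original question asks about.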


    David

    --
    David A. Black / Ruby Power and Light, LLC
    Ruby/Rails consulting & training: http://www.rubypal.com
    Now available: The Well-Grounded Rubyist (http://manning.com/black2)
    Training! Intro to Ruby, with Black & Kastner, September 14-17
    (More info: http://rubyurl.com/vmzN)
    David A. Black, Jul 9, 2009
    #2

  3. Sarawut Poaitwinyu

    Thriving K. Guest

    Thank you, I will try.
    Thriving K., Jul 9, 2009
    #3
  4. 2009/7/9 David A. Black:
    > words = string.split
    >
    > When you call split with no argument, it splits on whitespace
    > (including more than one character).


    I am more the "positive" kind of guy, meaning I explicitly define what I
    want returned. I would do

    words = string.scan(/\w+/)

    That way dots, question marks and other punctuation won't hurt. It may not
    make a difference here, but it's probably good to see different approaches.
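
To make the difference between the two approaches concrete, here is a small comparison on input that contains punctuation. The sample line is made up for illustration and is not taken from the original data.

```ruby
line = "important, urgent project (2009)!"

by_separator = line.split        # state what separates the words
by_pattern   = line.scan(/\w+/)  # state what a word looks like

p by_separator  # ["important,", "urgent", "project", "(2009)!"]
p by_pattern    # ["important", "urgent", "project", "2009"]

# The flip side: \w+ also breaks up words that contain other characters.
p "follow-up".scan(/\w+/)  # ["follow", "up"]
```

On the plain example from the first post, both calls return the same four words; they only diverge once the input contains something other than word characters and whitespace.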

    Kind regards

    robert


    --
    remember.guy do |as, often| as.you_can - without end
    http://blog.rubybestpractices.com/
    Robert Klemme, Jul 9, 2009
    #4
  5. 2009/7/9 David A. Black:
    > On Fri, 10 Jul 2009, Robert Klemme wrote:
    >> I am more like the "positive" guy - meaning explicitly defining what I
    >> want returned. I would do
    >>
    >> words = string.scan /\w+/
    >>
    >> That way dot, question mark and other signs won't hurt. It may not
    >> make a difference but it's probably good to see different approaches.
    >
    > string.split does explicitly define what I want back; it's just
    > something different from what you want back :)


    That's true. I just wanted to make the point that there are these two
    major approaches: define positively what you want in your result, or
    define it ex negativo, i.e. state what you want to use as separator.

    The whole point is that both approaches may behave identically with the
    original set of test data but will exhibit different behavior as soon
    as the input changes. If you use #split, you might get something you
    did not want in the first place. With #scan, anything that does not
    match is silently dropped, so you won't notice - which could be bad as
    well.

    The super safe variant would be to first do a match on the whole
    string to ensure it contains only expected data, and fail if not.
    After that it does not matter any more which extraction method one
    uses.
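
A sketch of that validate-first idea, assuming a tag line may contain only word characters separated by whitespace. The constant `TAG_LINE` and the method `extract_tags` are hypothetical names chosen for this example, and the exact pattern would depend on how "word" is defined.

```ruby
# Whole-string pattern: one or more \w+ words separated by whitespace,
# with optional whitespace at either end. \A and \z anchor the match
# to the entire string.
TAG_LINE = /\A\s*\w+(?:\s+\w+)*\s*\z/

# Hypothetical helper: fail loudly on unexpected input, then extract.
def extract_tags(line)
  unless line =~ TAG_LINE
    raise ArgumentError, "unexpected characters in #{line.inspect}"
  end
  line.split
end

p extract_tags("important  urgent project 2009")
# ["important", "urgent", "project", "2009"]

begin
  extract_tags("urgent, project")
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

Once the whole-string check has passed, `line.split` and `line.scan(/\w+/)` return the same array, so the choice between them becomes a matter of taste.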

    > It depends exactly how
    > you define "word". I was assuming it was /\S+/ but it may indeed be
    > /\w+/ (or maybe /[^\W\d_]+/ or something).


    Absolutely.

    Kind regards

    robert

    Robert Klemme, Jul 9, 2009
    #5
  6. Sarawut Poaitwinyu

    Thriving K. Guest

    Thank you, everyone, again. It seems that you discussed some sort of
    regular expression that I didn't understand, but thank you for it anyway;
    I will try to research it later.
    Thriving K., Jul 13, 2009
    #6
  7. Sarawut Poaitwinyu

    Dave Burt Guest

    Moving off-topic from the thread, but David Black wrote:
    > ... /\w+/ (or maybe /[^\W\d_]+/ or something).


    Do people actually write /[^\W\d_]/ instead of /[a-z]/i? The latter is
    much easier for me to read. Does the former include non-Latin
    word characters?
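
One way to explore that question is to try the patterns on a word containing a non-ASCII letter. This sketch assumes a current Ruby (1.9 or later) and a UTF-8 string, where `\w` and `\d` match ASCII characters only; the Ruby 1.8 interpreters common in 2009 could behave differently depending on `$KCODE`. The sample string is made up, and `[[:alpha:]]` is shown as one possible alternative, not something proposed in the thread.

```ruby
sample = "caf\u00e9 99 x_y"  # the word "cafe" with an accented e, a number, and x_y

# Word characters minus digits and underscore: ASCII letters only here.
p sample.scan(/[^\W\d_]+/)     # ["caf", "x", "y"]

# On this sample, the shorter spelling gives the same result.
p sample.scan(/[a-z]+/i)       # ["caf", "x", "y"]

# POSIX bracket expressions are Unicode-aware for UTF-8 strings,
# so this one keeps the accented letter.
p sample.scan(/[[:alpha:]]+/)  # ["caf\u00e9", "x", "y"]
```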

    Cheers,
    Dave Burt
    Dave Burt, Jul 13, 2009
    #7
