split on '' (and another for split -1)

Discussion in 'Ruby' started by trans. (T. Onoma), Dec 27, 2004.

  1. Here's a generic routine I'm working on:

    class String
      def last=(str, separator=$/)
        separator = '' unless separator
        raise "separator must be a String" unless String === separator
        s = self.split(separator, -1)
        s[-1] = str
        self.replace(s.join(separator))
      end
    end

    $/ = "\n"
    s = "ab\nc"
    s.last = "123"
    p s
    # => "ab\n123"

    Now try:

    $/ = ""
    s = "abc"
    s.last = "123"
    p s

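    Unfortunately this does not give a congruent result: split('', -1) leaves
    a trailing "" in the array, so the new string is appended and s becomes
    "abc123" rather than "ab123".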

    As an aside, this relates to the split -1 since any generic routine will not
    know what the split separator is and therefore will require the -1. Though my
    analysis is not complete, I continue to find that in most cases the -1
    parameter is either required, or at worst, inconsequential.

    T.
     
    trans. (T. Onoma), Dec 27, 2004
    #1
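
The effect of the limit argument described in this post can be reproduced in a few lines. This is a minimal sketch with made-up sample strings, not code from the thread. It shows that a negative limit keeps trailing empty fields, and that with an empty separator it adds one trailing "" that makes the replace-and-join step append instead of replace.

```ruby
# Without a limit, String#split drops trailing empty fields.
without_limit = "a,b,,".split(",")
# A negative limit keeps them.
with_limit = "a,b,,".split(",", -1)

p without_limit  # => ["a", "b"]
p with_limit     # => ["a", "b", "", ""]

# With an empty separator, the negative limit adds one trailing "".
chars_plain = "abc".split("")
chars_limit = "abc".split("", -1)

p chars_plain  # => ["a", "b", "c"]
p chars_limit  # => ["a", "b", "c", ""]

# Replacing the last element and joining then appends to the string.
parts = "abc".split("", -1)
parts[-1] = "123"
result = parts.join("")

p result  # => "abc123"
```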

  2. Carlos

    ["trans. (T. Onoma)" <>, 2004-12-27 20.31 CET]
    > Here's a generic routine I'm working on:
    >
    > class String
    > def last=(str, separator=$/)
    > separator = '' unless separator
    > raise "separator must be a String" unless String === separator
    > s = self.split(separator, -1)
    > s[-1] = str
    > self.replace(s.join(separator))
    > end
    > end
    >
    > $/ = "\n"
    > s = "ab\nc"
    > s.last = "123"
    > p s
    > # => "ab\n123"
    >
    > Now try:
    >
    > $/ = ""
    > s = "abc"
    > s.last = "123"
    > p s
    >
    > Unfortunately this does not give a congruent result.


    $/="" means paragraph mode, and it is acknowledged(?) by IO#gets,
    #readlines, String#each, #to_a, etc. Maybe you should do the same?

    And that solves your problems with split('') ;)))).
     
    Carlos, Dec 27, 2004
    #2

  3. On Monday 27 December 2004 03:33 pm, Carlos wrote:
    | ["trans. (T. Onoma)" <>, 2004-12-27 20.31 CET]
    |
    | > Here's a generic routine I'm working on:
    | >
    | > class String
    | > def last=(str, separator=$/)
    | > separator = '' unless separator
    | > raise "separator must be a String" unless String === separator
    | > s = self.split(separator, -1)
    | > s[-1] = str
    | > self.replace(s.join(separator))
    | > end
    | > end
    | >
    | > $/ = "\n"
    | > s = "ab\nc"
    | > s.last = "123"
    | > p s
    | > # => "ab\n123"
    | >
    | > Now try:
    | >
    | > $/ = ""
    | > s = "abc"
    | > s.last = "123"
    | > p s
    | >
    | > Unfortunately this does not give a congruent result.
    |
    | $/="" means paragraph mode, and it is acknowledged(?) by IO#gets,
    | #readlines, String#each, #to_a, etc. Maybe you should do the same?
    |
    | And that solves your problems with split('') ;)))).

    Boy, that's a real side-splitter! Watch me slap my knee! ;)))).

    But seriously, you can call it anything you wish. It does not change the
    behavior.

    "Moreover you have a peculiar definition of paragraph. ".split('')
    => ["M", "o", "r", "e", "o", "v", "e", "r", " ", "y", "o", "u", " ", "h", "a",
    "v", "e", " ", "a", " ", "p", "e", "c", "u", "l", "i", "a", "r", " ", "d",
    "e", "f", "i", "n", "i", "t", "i", "o", "n", " ", "o", "f", " ", "p", "a",
    "r", "a", "g", "r", "a", "p", "h", ".", " "]

    A more informative explanation of this "paragraph mode" might actually be
    helpful.

    T.
     
    trans. (T. Onoma), Dec 27, 2004
    #3
  4. Carlos

    Paragraph mode (was: split on '' (and another for split -1))

    ["trans. (T. Onoma)" <>, 2004-12-27 22.25 CET]
    > A more informative explanation of this "paragraph mode" might actually be
    > helpful.


    require 'pp'

    $/=""
    pp <<'EOT'.to_a

    A line feed separates lines. For example this one ->
    <- divides these two lines. Two or more line feeds
    separate paragraphs (that is, the regular expression
    /\n\n+/). Here ends the first paragraph:

    And here begins the second.

    Third.


    Fourth.
    EOT

    =>
    ["\nA line feed separates lines. For example this one ->\n<- divides these
    two lines. Two or more line feeds\nseparate paragraphs (that is, the regular
    expression\n/\\n\\n+/). Here ends the first paragraph:\n\n",
    "And here begins the second.\n\n",
    "Third.\n\n\n",
    "Fourth.\n"]

    Good luck.
     
    Carlos, Dec 27, 2004
    #4
  5. On Monday 27 December 2004 04:25 pm, trans. (T. Onoma) wrote:
    | A more informative explanation of this "paragraph mode" might actually be
    | helpful.

    Okay, I looked up what you were trying to explain to me. Sigh, it makes it
    even more complex.

    Firstly, the issue with -1 is still the actual problem I was pointing out:
    i.e. it appends an "" to the end of the array when split on "".

    And now thanks to "paragraph mode" I have another problem. While I was trying
    to make #first= and #last= work congruently with #each, which uses $/, it
    seems that paragraph mode actually subverts the potential for splitting on ''
    as a character mode. To do so one must use // instead, but...

    $/ = //
    TypeError: value of $/ must be String

    Why not just have #each_paragraph for a "paragraph mode"?

    T.
     
    trans. (T. Onoma), Dec 27, 2004
    #5
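
One possible way around the trailing "" is to special-case the empty separator. The sketch below is an assumption for illustration, not the poster's code: `replace_last` is a hypothetical standalone helper, and it deliberately treats "" as character mode rather than paragraph mode.

```ruby
# Hypothetical helper: returns a copy of +string+ with its last field replaced.
def replace_last(string, replacement, separator = "\n")
  raise ArgumentError, "separator must be a String" unless separator.is_a?(String)

  # An empty separator means "split into characters"; String#chars never
  # produces the trailing "" that split("", -1) does.
  parts = separator.empty? ? string.chars : string.split(separator, -1)

  # An empty input has no last field, so the replacement is the whole result.
  return replacement.dup if parts.empty?

  parts[-1] = replacement
  parts.join(separator)
end

p replace_last("ab\nc", "123")    # => "ab\n123"
p replace_last("abc", "123", "")  # => "ab123"
```

Because it is a regular method and not a setter, the separator can be passed with ordinary call syntax.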
  6. Re: Paragraph mode (was: split on '' (and another for split -1))

    On Monday 27 December 2004 04:48 pm, Carlos wrote:
    | ["trans. (T. Onoma)" <>, 2004-12-27 22.25 CET]
    |
    | > A more informative explanation of this "paragraph mode" might actually be
    | > helpful.
    |
    | require 'pp'
    |
    | $/=""
    | pp <<'EOT'.to_a
    |
    | A line feed separates lines. For example this one ->
    | <- divides these two lines. Two or more line feeds
    | separate paragraphs (that is, the regular expression
    | /\n\n+/). Here ends the first paragraph:
    |
    | And here begins the second.
    |
    | Third.
    |
    |
    | Fourth.
    | EOT
    |
    | =>
    | ["\nA line feed separates lines. For example this one ->\n<- divides these
    | two lines. Two or more line feeds\nseparate paragraphs (that is, the
    | regular expression\n/\\n\\n+/). Here ends the first paragraph:\n\n",
    | "And here begins the second.\n\n",
    | "Third.\n\n\n",
    | "Fourth.\n"]
    |
    | Good luck.

    Thanks, Carlos.

    T.
     
    trans. (T. Onoma), Dec 27, 2004
    #6
  7. trans. (T. Onoma) wrote:

    > Here's a generic routine I'm working on:
    >
    > class String
    > def last=(str, separator=$/)
    > [...]
    > end
    > end


    AFAIK there is no way of supplying the separator in that case except
    using .send(). I have done an RCR for this, but it was not commonly
    wanted at that time.
     
    Florian Gross, Dec 27, 2004
    #7
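
The limitation described in this post can be seen with a small stand-in class. `Label` and its attributes are hypothetical names chosen for the sketch. Assignment syntax passes only one argument to the setter, so the optional second parameter is reachable only through send.

```ruby
class Label
  attr_reader :text, :separator

  # A setter that declares an optional second parameter.
  def text=(value, separator = "\n")
    @text = value
    @separator = separator
  end
end

label = Label.new

# Assignment syntax: the second parameter always takes its default.
label.text = "abc"
default_sep = label.separator

# send calls the setter like an ordinary method, so both arguments arrive.
label.send(:text=, "abc", ",")
explicit_sep = label.separator

p default_sep   # => "\n"
p explicit_sep  # => ","
```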
  8. On Monday 27 December 2004 04:56 pm, trans. (T. Onoma) wrote:
    | Why not just have #each_paragraph for a "paragraph mode"?

    Another thought:

    $/ = :paragraph

    T.
     
    trans. (T. Onoma), Dec 27, 2004
    #8
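
A separate paragraph iterator along the lines suggested here could be sketched as follows. `each_paragraph` is a hypothetical helper, not a core method. It assumes a current Ruby, where String#each_line accepts the empty separator as an argument and String#each_char covers the character-by-character case.

```ruby
# Hypothetical helper: yields each paragraph of +text+, or returns an
# enumerator when no block is given.
def each_paragraph(text, &block)
  return text.each_line("") unless block

  text.each_line("", &block)
end

paragraphs = each_paragraph("one\n\ntwo\n").to_a
chars = "abc".each_char.to_a

p paragraphs  # => ["one\n\n", "two\n"]
p chars       # => ["a", "b", "c"]
```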
  9. On Monday 27 December 2004 05:21 pm, Florian Gross wrote:
    | trans. (T. Onoma) wrote:
    | > Here's a generic routine I'm working on:
    | >
    | > class String
    | > def last=(str, separator=$/)
    | > [...]
    | > end
    | > end
    |
    | AFAIK there is no way of supplying the separator in that case except
    | using .send(). I have done an RCR for this, but it was not commonly
    | wanted at that time.

    At least one can set $/ beforehand, I guess.

    T.
     
    trans. (T. Onoma), Dec 28, 2004
    #9
  10. On Monday 27 December 2004 05:21 pm, Florian Gross wrote:
    | trans. (T. Onoma) wrote:
    | > Here's a generic routine I'm working on:
    | >
    | > class String
    | > def last=(str, separator=$/)
    | > [...]
    | > end
    | > end
    |
    | AFAIK there is no way of supplying the separator in that case except
    | using .send(). I have done an RCR for this, but it was not commonly
    | wanted at that time.

    I didn't see it offhand. Which RCR # is it?

    In doing so would there be a conflict with parallel assignment?

    T.
     
    trans. (T. Onoma), Dec 28, 2004
    #10
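
For context on the parallel-assignment question, this is how Ruby currently treats several values on the right of an attribute assignment: they are packed into one array and passed as a single argument. `Box` is a hypothetical class used only for this sketch.

```ruby
class Box
  attr_accessor :value
end

box = Box.new

# Multiple right-hand values are collected into one array argument.
box.value = 1, 2
packed = box.value

# Ordinary parallel assignment spreads the values across the targets.
first, second = 1, 2

p packed           # => [1, 2]
p [first, second]  # => [1, 2]
```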
  11. trans. (T. Onoma) wrote:

    > | AFAIK there is no way of supplying the separator in that case except
    > | using .send(). I have done an RCR for this, but it was not commonly
    > | wanted at that time.
    >
    > I didn't see it off hand. Which RCR # is it?


    http://www.rcrchive.net/rcr/show/157

    At the time I submitted it there was not much request for it. If that
    changes I could resubmit it under the new format.
     
    Florian Gross, Dec 28, 2004
    #11