Getting the last N bytes of a string:

Discussion in 'Ruby' started by Brian Candler, Jun 1, 2007.

  1. What's the simplest way to get only the last 10 bytes of a string?

    buf[-10,10]

    doesn't work if buf is smaller than 10 bytes, as it returns nil. Is there
    anything simpler than

    buf[-10,10] || buf

    ?

    Thanks,

    Brian.

    P.S. To get the *first* 10 bytes of a string is easy: buf[0,10] always
    works, whether or not buf is smaller than 10 bytes.
     
    Brian Candler, Jun 1, 2007
    #1
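
    A minimal sketch of the behavior described in this post. The helper name
    last_n and the sample strings are assumptions made for illustration; they
    are not part of the original question.

    ```ruby
    # last_n is a hypothetical helper; it only wraps the idiom from the post.
    def last_n(buf, n)
      # buf[-n, n] is nil when buf is shorter than n, so fall back to buf.
      buf[-n, n] || buf
    end

    p "abcdefghijklmnop"[-10, 10]   # long enough: the slice works
    p "abc"[-10, 10]                # too short: nil
    p last_n("abc", 10)             # falls back to the whole string
    p "abc"[0, 10]                  # taking the first bytes needs no fallback
    ```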

  2. buf.reverse[0,10]?
     
    Leonard Chin, Jun 1, 2007
    #2
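
    A hedged note on this suggestion: a single reverse returns the tail in
    reversed order, so a second reverse is needed to restore it. The helper
    name last_n_by_reverse and the sample strings below are assumptions.

    ```ruby
    # Hypothetical helper: reverse, take the first n, reverse back.
    def last_n_by_reverse(buf, n)
      buf.reverse[0, n].reverse
    end

    p "abcdefghijklmnop".reverse[0, 10]           # tail, but back to front
    p last_n_by_reverse("abcdefghijklmnop", 10)   # tail in original order
    p last_n_by_reverse("abc", 10)                # short strings come back whole
    ```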

  3. I don't know if this is simple but you can try something like this.

    str =~ /(.{1,10}$)/

    Harry
     
    Harry Kakueki, Jun 1, 2007
    #3
  4. There are several ways. One option is to pad the string with spaces,
    extract your slice, then strip the leading and trailing whitespace:

    buf = "abcdef"

    buf.rjust(10, ' ')[-10, 10].strip

    => "abcdef"


    David

    http://rubyonwindows.blogspot.com
     
    David Mullet, Jun 1, 2007
    #4
  5. This works nicely if you do this:
    str[/(.{1,10}$)/]

    Though I guess it depends on what you consider "simple".
     
    Leonard Chin, Jun 1, 2007
    #5
  6. Ok. Here is my first attempt at answering a coding question. /deep
    breath

    'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.reverse.slice(0..9).reverse
     
    Lloyd Linklater, Jun 1, 2007
    #6
  7. Hi,

    On Friday, 01 Jun 2007, 22:10:51 +0900, Leonard Chin wrote:
    str[/(.{1,10}\z)/]

    is what the OP meant.

    Bertram
     
    Bertram Scharpf, Jun 1, 2007
    #7
  8. That is a great answer too :)
    I guess the OP didn't mention newlines, so \z works better.
     
    Leonard Chin, Jun 1, 2007
    #8
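
    A sketch of how the two anchors differ once the string contains a
    newline. The sample input and the helper name tail_regex are assumptions.
    Note that "." does not match a newline unless the /m flag is given, and
    that these patterns count characters, which equals bytes only for
    single-byte data.

    ```ruby
    # Hypothetical helper: /m lets "." match newlines, \z anchors at the very
    # end of the string, and {0,10} also copes with an empty string.
    def tail_regex(buf)
      buf[/.{0,10}\z/m]
    end

    multi = "first line\nxyz"   # made-up sample input

    p multi[/.{1,10}$/]    # "$" also matches at the end of the first line
    p multi[/.{1,10}\z/]   # "." stops at the newline, so only the last line
    p tail_regex(multi)    # last 10 characters, newline included
    ```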
  9. I'd use:

    buf[-10..-1] || buf

    but this is only a variation of what you have.

    There is buf.reverse[0..9].reverse as well, but this looks even uglier
    to me than the first.

    Stefan
     
    Stefan Mahlitz, Jun 1, 2007
    #9
  10. Well, I guess the parentheses are not necessary.
    This looks a little less cluttered.

    str[/.{1,10}$/]

    Harry
     
    Harry Kakueki, Jun 2, 2007
    #10
  11. Hi,

    On Friday, 01 Jun 2007, 20:15:54 +0900, Brian Candler wrote:
    From time to time I find myself trying to do something like
    that. Maybe the best thing would be if the String#slice method
    were defined to accept negative lengths, but that's probably
    too late.

    For my personal use I wrote a little extension doing exactly
    that. You can have it if you want.

    http://www.bertram-scharpf.de/tmp/bs-ruby.tar.gz

    In spite of its shortness it's surely buggy. Parameter
    definition will always be a matter of taste. Don't try to
    convince the core developers there's a necessity to offer
    a solution for this. No chance.

    Bertram
     
    Bertram Scharpf, Jun 2, 2007
    #11
  12. Thanks everyone for your answers.

    I think I'll stick with buf[-10,10] || buf. I was just wondering if I'd
    missed something obvious like buf.last(10), after staring at 'ri String' too
    long.

    Finally I found a use case where Perl has the edge :)

    $ perl -e '$a="abc";print substr($a,-10),"\n"'
    abc
    $ perl -e '$a="abcdefghijklmnop";print substr($a,-10),"\n"'
    ghijklmnop

    Regards,

    Brian.
     
    Brian Candler, Jun 2, 2007
    #12
  13. OK, I have a question now. In the original post, he wanted to cover
    errors caused by undersized strings. I have been trying to grok the
    "zen of ruby" from posts in here and I thought that writing Ruby-ish
    code involves two things: 1. making it read like plain English, and
    2. letting Ruby do the work for you.

    I could do it easily enough in Pascal, for example:

    s := @s[length(s) - 10];

    but then we need to check for length errors as mentioned before.

    Here comes the question: why was my approach wrong? I am assuming that
    it was horrible because no one even bothered to tell me that I was being
    a total goober by saying this:

    'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.reverse.slice(0..9).reverse

    I know that just having working code is not enough or we could use C#
    and have done with it. How do we make it readable? How do we let Ruby
    do the work?

    It seems to me that approaches like str[/(.{1,10}\z)/] are neither plain
    language readable nor letting Ruby do the work as it seems that the work
    is in the coding.

    Assuming that all the variations work equally well, how does one choose
    which is the more "ruby like" in its approach? How can I tell which
    approach is ugly and which is elegant?
     
    Lloyd Linklater, Jun 2, 2007
    #13
  14. Well, this is how Rails solves this in ActiveSupport

    vendor/rails/activesupport/lib/active_support/core_ext/string/access.rb

    module ActiveSupport #:nodoc:
      module CoreExtensions #:nodoc:
        module String #:nodoc:
          # Makes it easier to access parts of a string, such as
          # specific characters and substrings.
          module Access
            #snip...
            # Returns the last character of the string or the last
            # +limit+ characters.
            #
            # Examples:
            #   "hello".last     # => "o"
            #   "hello".last(2)  # => "lo"
            #   "hello".last(10) # => "hello"
            def last(limit = 1)
              (chars[(-limit)..-1] || self).to_s
            end
          end
        end
      end
    end

    Which is mixed into String by:
    vendor/rails/activesupport/lib/active_support/core_ext/string.rb

    require File.dirname(__FILE__) + '/string/inflections'
    require File.dirname(__FILE__) + '/string/conversions'
    require File.dirname(__FILE__) + '/string/access'
    require File.dirname(__FILE__) + '/string/starts_ends_with'
    require File.dirname(__FILE__) + '/string/iterators'
    require File.dirname(__FILE__) + '/string/unicode'

    class String #:nodoc:
      include ActiveSupport::CoreExtensions::String::Access
      include ActiveSupport::CoreExtensions::String::Conversions
      include ActiveSupport::CoreExtensions::String::Inflections
      include ActiveSupport::CoreExtensions::String::StartsEndsWith
      include ActiveSupport::CoreExtensions::String::Iterators
      include ActiveSupport::CoreExtensions::String::Unicode
    end

    So I'd venture to say that the Ruby Way is to simply open up the
    String class and give it a new method. Since there is already an
    Array#first that gives [1,2,3,4].first(2) => [1,2], it just seems
    right to match that with .last and move the pair onto String treating
    the individual characters like the elements of the Array.

    Although the longer-term fix would seem to be to reconcile the []
    behavior with Ranges:
    => "abcd"
    => nil

    If only that were also "abcd".

    -Rob

    Rob Biedenharn http://agileconsultingllc.com
     
    Rob Biedenharn, Jun 2, 2007
    #14
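
    A stand-alone sketch of the "open up String" idea from this post, written
    without Rails. It is not the ActiveSupport source: it uses plain String#[]
    instead of the chars proxy, and the zero-limit guard is an added
    assumption. The two range expressions at the end are assumed stand-ins
    for the inputs missing from the post, chosen to show the nil result the
    post complains about.

    ```ruby
    class String
      # Sketch only: the last +limit+ characters, or the whole string when it
      # is shorter than +limit+.
      def last(limit = 1)
        return "" if limit.zero?   # self[-0..-1] would be the whole string
        self[-limit..-1] || dup
      end
    end

    p "hello".last
    p "hello".last(2)
    p "hello".last(10)

    # Assumed examples of the Range behavior:
    p "abcd"[-4..-1]   # start is inside the string
    p "abcd"[-5..-1]   # start is before the beginning: nil
    ```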
  15. You could do that too:
    => "blah"

    or do what you were looking for:
    => "blah"
     
    Ryan Davis, Jun 4, 2007
    #15
  16. "blah blah blah".to_a.last(4).join as well.

    It seems like a lot of Array methods would be better served as
    equivalents from Enumerable. #last is one of these.

    (It won't be nearly as fast as a pure Array implementation, but it
    would be extremely useful.)
     
    Erik Hollensbe, Jun 12, 2007
    #16
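
    A sketch of the same idea for current Ruby versions, where String#to_a no
    longer exists (in Ruby 1.8 it split a string into lines, not characters).
    String#chars returns an array of characters, and Array#last(n) simply
    returns the whole array when n exceeds its size. The helper name
    last_chars is an assumption.

    ```ruby
    # Hypothetical helper: go through an Array of characters and back.
    def last_chars(str, n)
      str.chars.last(n).join
    end

    p last_chars("blah blah blah", 4)
    p last_chars("abc", 10)   # shorter than n: whole string, no nil
    ```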
  17. What would be the chances of getting something added to the core? It
    seems to me that a new kind of slice would be in order. I know that
    there are lTrim() and rTrim() (left and right) in other languages. What
    about l_slice and r_slice? Then, it would be:

    p 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.r_slice(0..9)
     
    Lloyd Linklater, Jun 12, 2007
    #17
  18. Hi,

    On Tuesday, 12 Jun 2007, 23:50:36 +0900, Lloyd Linklater wrote:
    This is what I would have liked to propose.

    If anyone understood the `notempty?' proposal.

    Bertram
     
    Bertram Scharpf, Jun 12, 2007
    #18
  19. p 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.r_slice(0..9)

    What about this:

    def last_bytes(s, number_of_bytes)
      if s.size > number_of_bytes
        return s.slice(s.size - number_of_bytes..s.size)
      end
      s
    end

    p last_bytes('ABCDEFGHIJKLMNOPQRSTUVWXYZ', 10)
    p last_bytes('WXYZ', 10)

    this is the result:

    "QRSTUVWXYZ"
    "WXYZ"
     
    Lloyd Linklater, Jun 12, 2007
    #19
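
    Since Ruby 1.9, String#size and String#slice count characters rather than
    bytes, so the function above returns the last N characters on multibyte
    input. A hedged variant that counts bytes, using String#bytesize and
    String#byteslice; the name tail_bytes is an assumption. Cutting at a byte
    offset can split a multibyte character, so this suits binary buffers
    better than text.

    ```ruby
    # Hypothetical byte-oriented variant of last_bytes.
    def tail_bytes(s, number_of_bytes)
      return s if s.bytesize <= number_of_bytes
      s.byteslice(-number_of_bytes, number_of_bytes)
    end

    p tail_bytes('ABCDEFGHIJKLMNOPQRSTUVWXYZ', 10)
    p tail_bytes('WXYZ', 10)
    p "h\u00e9llo".bytesize   # the accented letter takes two bytes in UTF-8
    ```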
  20. Sorry to come so late in the thread.

    This seems to work fine for me.

    irb(main):001:0> str = "now is the time for all good men to come"
    => "now is the time for all good men to come"
    irb(main):002:0> str[-10..-1]
    => "en to come"

    irb(main):001:0> str = "now is the"
    => "now is the"
    irb(main):002:0> str[-10..-1]
    => "now is the"

    So I really see no need for a special method to obtain the last x
    bytes of a string.
     
    bbiker, Jun 13, 2007
    #20
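
    A hedged check of the session above: the second test string is exactly 10
    characters long, so it does not exercise the short-string case from the
    original question. The shorter sample below is an assumption added for
    illustration.

    ```ruby
    exactly_ten = "now is the"
    shorter     = "now is"       # made-up input, shorter than 10

    p exactly_ten.size              # the string in the post has 10 characters
    p exactly_ten[-10..-1]          # so the range still starts inside it
    p shorter[-10..-1]              # start is out of range: nil
    p shorter[-10..-1] || shorter   # the fallback from the first post
    ```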
