Efficient processing of binary data streams in Ruby?

Discussion in 'Ruby' started by theosib@gmail.com, Mar 9, 2007.

  1. Guest

    I'm writing a Ruby program that has to process binary data from files
    and sockets. Data items are in bytes, 16-bit words, or 32-bit words,
    and I cannot predict in advance whether the data will be msb-first or
    lsb-first, so I end up writing things like this:

    def unpack_16(x)
      @msb_first ? ((x[0]<<8)|x[1]) : ((x[1]<<8)|x[0])
    end

    def pack_16(x)
      y = "xx"
      if @msb_first
        y[0] = x>>8
        y[1] = x&255
      else
        y[0] = x&255
        y[1] = x>>8
      end
      y  # return the packed string rather than the last byte assigned
    end
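
    The indexing in the snippet above relies on Ruby 1.8, where String#[]
    with an integer index returns a byte value. From Ruby 1.9 on it returns
    a one-character string, so the same idea needs getbyte and setbyte.
    A minimal sketch, assuming Ruby 2.0 or later; the msb_first keyword
    argument is a hypothetical stand-in for the @msb_first flag:

```ruby
# Byte-level 16-bit helpers for Ruby 2.0+.
# msb_first is a hypothetical keyword replacing the @msb_first instance variable.

def unpack_16(str, msb_first: true)
  hi, lo = msb_first ? [0, 1] : [1, 0]   # positions of the high and low byte
  (str.getbyte(hi) << 8) | str.getbyte(lo)
end

def pack_16(num, msb_first: true)
  out = "\0\0".b                         # two-byte binary (ASCII-8BIT) buffer
  hi, lo = msb_first ? [0, 1] : [1, 0]
  out.setbyte(hi, (num >> 8) & 0xFF)
  out.setbyte(lo, num & 0xFF)
  out
end
```

    This is still one method call per value, so it only addresses
    correctness, not the speed concern.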

    I expect, however, that this will be painfully slow, and I can't
    imagine that this hasn't been thought of before. Is there a better way
    to do this that will result in much better performance?

    Thanks!
     
    , Mar 9, 2007
    #1

  2. Tim Pease Guest

    On 3/8/07, <> wrote:
    > I'm writing a Ruby program that has to process binary data from files
    > and sockets. Data items are in bytes, 16-bit words, or 32-bit words,
    > and I cannot predict in advance whether the data will be msb-first or
    > lsb-first, so I end up writing things like this:
    >
    > def unpack_16(x)
    > @msb_first ? ((x[0]<<8)|x[1]) : ((x[1]<<8)|x[0])
    > end
    >
    > def pack_16(x)
    > y = "xx"
    > if (@msb_first)
    > y[0] = x>>8
    > y[1] = x&255
    > else
    > y[0] = x&255
    > y[1] = x>>8
    > end
    > end
    >
    > I expect, however, that this will be painfully slow, and I can't
    > imagine that this hasn't been thought of before. Is there a better way
    > to do this that will result in much better performance?
    >


    def unpack_16( str )
      # unpack returns an array, so take the first element
      @msb_first ? str.unpack('n').first : str.unpack('S').first
    end

    def pack_16( num )
      @msb_first ? [num].pack('n') : [num].pack('S')
    end


    That will work on little-endian processors (Intel) but not on
    big-endian processors (PowerPC, Sparc), because 'S' uses the host's
    native byte order. For these methods to work on the latter you'll
    have to do something like this ...

    def unpack_16( str )
      str = str.reverse unless @msb_first
      str.unpack('n').first
    end

    def pack_16( num )
      str = [num].pack('n')
      # always return a string; a trailing "unless" would return nil for msb-first
      @msb_first ? str : str.reverse
    end


    Just define the desired method based on the processor type -- which
    can be figured out by doing this ...

    # true when the first byte of a native integer holds its low-order bits
    LITTLE_ENDIAN = [42].pack('I').unpack('C').first == 42

    if LITTLE_ENDIAN
      # define little endian methods here
    else
      # define big endian methods here
    end
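
    The host check may be avoidable altogether. Besides 'n' and 'N'
    (big-endian), pack and unpack also accept 'v' and 'V', which are
    little-endian 16-bit and 32-bit unsigned regardless of the CPU.
    A minimal sketch along those lines; the Endian module and its method
    names are made up for illustration:

```ruby
# 'n'/'N' are big-endian (network order) and 'v'/'V' are little-endian,
# independent of the host CPU. All four directives are unsigned.
# Endian is a hypothetical helper module, not part of any library.
module Endian
  FORMATS = {
    16 => { true => 'n', false => 'v' },
    32 => { true => 'N', false => 'V' }
  }.freeze

  # msb_first must be exactly true or false; anything else raises KeyError.
  def self.unpack(str, bits, msb_first)
    str.unpack(FORMATS.fetch(bits).fetch(msb_first)).first
  end

  def self.pack(num, bits, msb_first)
    [num].pack(FORMATS.fetch(bits).fetch(msb_first))
  end
end

Endian.pack(0x1234, 16, true)    # "\x12\x34"
Endian.pack(0x1234, 16, false)   # "\x34\x12"
```

    Because the byte order lives in the format string, the same code
    should behave identically on Intel, PowerPC and Sparc.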

    Hope that helps

    Blessings,
    TwP
     
    Tim Pease, Mar 9, 2007
    #2

  3. Guest

    On Fri, 9 Mar 2007, wrote:

    > I'm writing a Ruby program that has to process binary data from files and
    > sockets. Data items are in bytes, 16-bit words, or 32-bit words, and I
    > cannot predict in advance whether the data will be msb-first or lsb-first,
    > so I end up writing things like this:
    >
    > def unpack_16(x)
    > @msb_first ? ((x[0]<<8)|x[1]) : ((x[1]<<8)|x[0])
    > end
    >
    > def pack_16(x)
    > y = "xx"
    > if (@msb_first)
    > y[0] = x>>8
    > y[1] = x&255
    > else
    > y[0] = x&255
    > y[1] = x>>8
    > end
    > end
    >
    > I expect, however, that this will be painfully slow, and I can't imagine
    > that this hasn't been thought of before. Is there a better way to do this
    > that will result in much better performance?


    this will be __extremely__ fast for even huge buffers of data


    harp:~ > ruby a.rb
    huge(100000) LSB(8) in 0.00117683410644531s
    huge(100000) LSB(16) in 0.00181722640991211s
    huge(100000) LSB(32) in 0.00884389877319336s
    huge(100000) MSB(8) in 0.00245118141174316s
    huge(100000) MSB(16) in 0.0045168399810791s
    huge(100000) MSB(32) in 0.0078279972076416s


    harp:~ > cat a.rb
    require 'rubygems'
    require 'narray'

    module Intification
      LSB = :LSB
      MSB = :MSB
      # byte order of the machine running this script
      HOST = [42].pack('i').unpack('c').first == 42 ? LSB : MSB

      def ints bits = 8, order = LSB
        words = bits / 8

        type =
          case bits.to_i
          when 8
            NArray::BYTE
          when 16
            NArray::SINT
          when 32
            NArray::INT
          else
            raise ArgumentError, bits.inspect
          end

        na = NArray.to_na to_s, type, size/words
        # swap bytes only when the data order differs from the host order
        order == HOST ? na : na.swap_byte
      end
    end

    class String
      include Intification
    end

    def bm label
      a = Time.now
      yield
      b = Time.now
      puts "#{ label } in #{ b.to_f - a.to_f }s"
    end

    n = 100_000

    huge = { :LSB => {}, :MSB => {} }

    # 'v'/'V' are always little-endian; 's'/'i' would follow the host order
    huge[:LSB][8] = [39,40,41,42].pack('c*') * n
    huge[:LSB][16] = [39,40,41,42].pack('v*') * n
    huge[:LSB][32] = [39,40,41,42].pack('V*') * n

    # 'n'/'N' are always big-endian
    huge[:MSB][8] = [39,40,41,42].pack('c*') * n
    huge[:MSB][16] = [39,40,41,42].pack('n*') * n
    huge[:MSB][32] = [39,40,41,42].pack('N*') * n

    [:LSB, :MSB].each do |order|
      [8,16,32].each do |bits|
        bm "huge(#{ n }) #{ order.to_s}(#{ bits })" do
          string = huge[order][bits]
          ints = string.ints(bits, order)
          last = ints[-4..-1]
          # compare with ==; a single = would assign and never raise
          raise unless last[0] == 39
          raise unless last[1] == 40
          raise unless last[2] == 41
          raise unless last[3] == 42
        end
      end
    end
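
    Where NArray is not installed, a similar whole-buffer decode can be
    sketched with String#unpack and a '*' count, which converts every
    value in one call. This assumes Ruby 1.9.3 or later for the '<' and
    '>' byte-order modifiers; ints_from is a hypothetical name, and the
    result is a plain Ruby Array, so it will likely use more memory than
    an NArray for very large buffers:

```ruby
# Decode an entire binary string at once with explicit byte order.
# ints_from is a hypothetical helper; bits is 8, 16 or 32.
# 8-bit values come back unsigned, 16- and 32-bit values signed.
def ints_from(data, bits, msb_first)
  fmt =
    case bits
    when 8  then 'C*'                          # byte order is irrelevant
    when 16 then msb_first ? 's>*' : 's<*'
    when 32 then msb_first ? 'l>*' : 'l<*'
    else raise ArgumentError, bits.inspect
    end
  data.unpack(fmt)
end

msb_words = [39, 40, 41, 42].pack('n*')
ints_from(msb_words, 16, true)   # [39, 40, 41, 42]
```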


    regards.

    if you're on windows i have an narray install

    -a
    --
    be kind whenever possible... it is always possible.
    - the dalai lama
     
    , Mar 9, 2007
    #3
