Streams of bits

Discussion in 'Ruby' started by Richard Fairhurst, May 17, 2007.

  1. Hi,

    I'm writing a bit of Ruby to output SWF files.

    SWF's opcodes and arguments are variable-length streams of bits. They're
    packed in direct succession - i.e. not usually padded to byte
    boundaries.

    So, for example, you might have

    00111 5-bit record
    0110101 7-bit record
    0000110 7-bit record
    0001100 7-bit record
    1111101 7-bit record

    which would be packed as

    0b00111011,0b01010000,0b11000011,0b00111110,0b10000000

    (the final byte here is null-padded)

    I'm trying to write these opcode by opcode, and get a bytestream out the
    end of it. Currently I'm just appending each opcode to a long string
    (m+='00111'), and when it comes to writing it out, splitting this every
    eight characters and converting back to a single character. But this is
    awfully slow.

    Can anyone suggest a faster way?

    (Apologies if this shows up twice, I've been arguing with Google Groups
    today. ;) )

    cheers
    Richard


    --
    Posted via http://www.ruby-forum.com/.
     
    Richard Fairhurst, May 17, 2007
    #1

  2. Richard Fairhurst writes:

    > I'm trying to write these opcode by opcode, and get a bytestream out the
    > end of it. Currently I'm just appending each opcode to a long string
    > (m+='00111'), and when it comes to writing it out, splitting this every
    > eight characters and converting back to a single character. But this is
    > awfully slow.
    >
    > Can anyone suggest a faster way?


    How are you going from your string of 0s and 1s to bytes?

    Might I suggest that the fastest way to do that part of the job is
    this?

    outstring = [m].pack("B*")

    That'll pack things the way you seem to want them, and it'll
    appropriately null-pad the last byte.
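
    As a quick check, the five records from the original post can be run
    through pack("B*"). This is only a minimal sketch; the variable name
    records is illustrative and not from the thread.

```ruby
# Records from the original post, as strings of "0"/"1" characters.
records = %w[00111 0110101 0000110 0001100 1111101]

# Concatenate into one bit string (33 bits here).
m = records.join

# "B*" packs most-significant bit first and zero-pads the final byte.
outstring = [m].pack("B*")

# Show each output byte as 8 binary digits.
puts outstring.bytes.map { |b| format("%08b", b) }.join(",")
# prints 00111011,01010000,11000011,00111110,10000000
```

    The printed bytes match the packed sequence given in the question.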

    If you want to crunch 0s and 1s into bytes as you're building up your
    opcode string, one possibility is to add this into whatever loop it is
    that is appending opcodes:

    while m.length > 40 do
      outstring += [m.slice!(0...40)].pack("B*")
    end

    That pulls bytes off m five bytes at a time. The goal is to try to
    strike a balance between letting m get too large (which makes
    manipulating it in memory slightly slower) and letting pack - written
    in C - do its job efficiently. (pack is going to be more efficient
    the larger the input)

    At the very end you'll still need to do
    outstring += [m].pack("B*")
    to pick up the remaining bits.

    You can experiment with how much to pull off of m at a time to see
    what value makes your particular program fastest. For your purposes,
    you may well find that the fastest solution is to not do any
    0s-and-1s-to-bytes operations in your loop, and simply use pack at the
    end.
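
    One way the flush-as-you-go idea might be wrapped up is sketched
    below. The BitWriter class, its method names and the chunk_bits
    parameter are all hypothetical, not something from the thread or from
    any library; it assumes every record arrives as a string of "0" and
    "1" characters.

```ruby
# Hypothetical helper (not from the thread): buffers bits as a "0"/"1"
# string and packs whole chunks once the buffer is large enough.
class BitWriter
  def initialize(chunk_bits = 40)
    unless chunk_bits > 0 && (chunk_bits % 8).zero?
      raise ArgumentError, "chunk_bits must be a positive multiple of 8"
    end
    @chunk_bits = chunk_bits
    @pending = String.new   # bits not yet packed
    @out = String.new       # packed bytes (binary string)
  end

  # Append one record and flush any complete chunks.
  def write(bits)
    @pending << bits
    while @pending.length >= @chunk_bits
      @out << [@pending.slice!(0, @chunk_bits)].pack("B*")
    end
    self
  end

  # Pack the leftover bits (zero-padded to a byte) and return the bytes.
  def finish
    @out << [@pending].pack("B*") unless @pending.empty?
    @pending.clear
    @out
  end
end

writer = BitWriter.new
%w[00111 0110101 0000110 0001100 1111101].each { |r| writer.write(r) }
bytes = writer.finish
```

    Because the chunk size is a multiple of 8, padding can only ever
    happen in finish, so the result should be the same as one big pack at
    the end.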

    --
    s=%q( Daniel Martin --
    puts "s=%q(#{s})",s.to_a.last )
    puts "s=%q(#{s})",s.to_a.last
     
    Daniel Martin, May 17, 2007
    #2

  3. On Thu, May 17, 2007 at 09:52:51PM +0900, Richard Fairhurst wrote:
    > I'm writing a bit of Ruby to output SWF files.
    >
    > SWF's opcodes and arguments are variable-length streams of bits. They're
    > packed in direct succession - i.e. not usually padded to byte
    > boundaries.
    >
    > So, for example, you might have
    >
    > 00111 5-bit record
    > 0110101 7-bit record
    > 0000110 7-bit record
    > 0001100 7-bit record
    > 1111101 7-bit record
    >
    > which would be packed as
    >
    > 0b00111011,0b01010000,0b11000011,0b00111110,0b10000000
    >
    > (the final byte here is null-padded)
    >
    > I'm trying to write these opcode by opcode, and get a bytestream out the
    > end of it. Currently I'm just appending each opcode to a long string
    > (m+='00111'), and when it comes to writing it out, splitting this every
    > eight characters and converting back to a single character. But this is
    > awfully slow.
    >
    > Can anyone suggest a faster way?


    Hmm, sounds like Huffman coding... see Ruby Quiz just gone :)

    If speed is critical it might be worth writing a C extension to do it.
     
    Brian Candler, May 17, 2007
    #3
  4. Richard Fairhurst

    Phrogz Guest

    Phrogz, May 17, 2007
    #4
  5. Daniel Martin wrote:
    > That pulls bytes off m five bytes at a time. The goal is to try to
    > strike a balance between letting m get too large (which makes
    > manipulating it in memory slightly slower) and letting pack - written
    > in C - do its job efficiently. (pack is going to be more efficient
    > the larger the input)


    Thanks (and to everyone else who replied) for a really good bunch of
    suggestions.

    Packing ten bytes at a time seems to be optimal and has shaved a whole
    load off the execution time.
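
    For anyone wanting to repeat the experiment, a rough timing harness
    might look like the sketch below. The pack_chunked helper, the random
    workload and the list of chunk sizes are all made up for illustration;
    the best value will depend on the Ruby version and the real data.

```ruby
# Hypothetical helper: pack an array of bit-string records, flushing the
# buffer whenever it holds at least chunk_bytes * 8 bits.
def pack_chunked(records, chunk_bytes)
  chunk_bits = chunk_bytes * 8
  pending = String.new
  out = String.new
  records.each do |r|
    pending << r
    while pending.length >= chunk_bits
      out << [pending.slice!(0, chunk_bits)].pack("B*")
    end
  end
  out << [pending].pack("B*")   # remainder, zero-padded to a full byte
end

# Made-up workload: 20,000 random records of 1 to 16 bits each.
rng = Random.new(42)
records = Array.new(20_000) { Array.new(rng.rand(1..16)) { rng.rand(2) }.join }

# Single pack at the end, used to check every chunk size gives the same bytes.
reference = [records.join].pack("B*")

[1, 5, 10, 50, 500].each do |n|
  t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = pack_chunked(records, n)
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0
  raise "mismatch at #{n} bytes" unless result == reference
  puts format("%4d bytes per flush: %.4f s", n, elapsed)
end
```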

    Thanks again,
    Richard :)

    --
    Posted via http://www.ruby-forum.com/.
     
    Richard Fairhurst, May 18, 2007
    #5
