Streams of bits

Discussion in 'Ruby' started by Richard Fairhurst, May 17, 2007.

  1. Hi,

    I'm writing a bit of Ruby to output SWF files.

    SWF's opcodes and arguments are variable-length streams of bits. They're
    packed in direct succession - i.e. not usually padded to byte
    boundaries.

    So, for example, you might have

    00111 5-bit record
    0110101 7-bit record
    0000110 7-bit record
    0001100 7-bit record
    1111101 7-bit record

    which would be packed as

    0b00111011,0b01010000,0b11000011,0b00111110,0b10000000

    (the final byte here is null-padded)

    I'm trying to write these opcode by opcode, and get a bytestream out the
    end of it. Currently I'm just appending each opcode to a long string
    (m+='00111'), and when it comes to writing it out, splitting this every
    eight characters and converting back to a single character. But this is
    awfully slow.

    Can anyone suggest a faster way?

    (Apologies if this shows up twice, I've been arguing with Google Groups
    today. ;) )

    cheers
    Richard
     
    Richard Fairhurst, May 17, 2007
    #1
    1. Advertisements

  2. How are you going from your string of 0s and 1s to bytes?

    Might I suggest that the fastest way to do that part of the job is
    this?

    outstring = [m].pack("B*")

    That'll pack things the way you seem to want them, and it'll
    appropriately null-pad the last byte.

    If you want to crunch 0s and 1s into bytes as you're building up your
    opcode string, one possibility is to add this into whatever loop it is
    that is appending opcodes:

    while m.length > 40 do
    outstring += [m.slice!(0...40)].pack("B*")
    end

    That pulls bytes off m five bytes at a time. The goal is to try to
    strike a balance between letting m get too large (which makes
    manipulating it in memory slightly slower) and letting pack - written
    in C - do its job efficiently. (pack is going to be more efficient
    the larger the input)

    You'll still need to at the very end do
    outstring += [m].pack("B*")
    to get the remainder.

    You can experiment with how much to pull off of m at a time to see
    what value makes your particular program fastest. For your purposes,
    you may well find that the fastest solution is to not do any
    0s-and-1s-to-bytes operations in your loop, and simply use pack at the
    end.
     
    Daniel Martin, May 17, 2007
    #2
    1. Advertisements

  3. Hmm, sounds like Huffman coding... see Ruby Quiz just gone :)

    If speed is critical it might be worth writing a C extension to do it.
     
    Brian Candler, May 17, 2007
    #3
  4. Richard Fairhurst

    Phrogz Guest

    Phrogz, May 17, 2007
    #4
  5. Thanks (and to everyone else who replied) for a really good bunch of
    suggestions.

    Packing ten bytes at a time seems to be optimal and has shaved a whole
    load of the execution time.

    Thanks again,
    Richard :)
     
    Richard Fairhurst, May 18, 2007
    #5
    1. Advertisements

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments (here). After that, you can post your question and our members will help you out.