Converting variable-length binary value

Discussion in 'Ruby' started by Brian Candler, Jun 4, 2005.

  1. I'm sure there ought to be a Ruby function to do this, but I've been
    scratching my head whilst going through the Pickaxe book :)

    I want to encode/decode a positive number to/from a variable-length
    big-endian binary string.

    Originally the best I could come up with was:

    str = "\001\002\003"
    val = 0
    str.each_byte { |b| val = (val << 8) | b }
    p val
    # => 66051

    val = 1234
    str = ""
    while (val > 0)
      str = (val & 0xff).chr + str
      val >>= 8
    end
    p str
    # => "\004\322"

    pack/unpack seem only to work for fixed lengths, e.g. 2 or 4 bytes.

    Is there a faster or simpler way of doing this in Ruby?

    Then I discovered I can go via hex:

    p "\001\002\003".unpack("H*")[0].hex
    # => 66051

    str = 1234.to_s(16)
    str = "0#{str}" if str.length % 2 != 0
    val = [str].pack("H*")
    p val
    # => "\004\322"

    That's still pretty nasty. Any better offers?

    Thanks,

    Brian.
    Brian Candler, Jun 4, 2005
    #1
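On a current Ruby, the two loops in the question can be written without going through hex. This is a minimal sketch, assuming Ruby 2.4 or later (for `Integer#digits`); the method names `decode_be` and `encode_be` are made up for illustration and do not come from the thread.

```ruby
# Hypothetical helper names; assumes Ruby >= 2.4 for Integer#digits.

# Fold the bytes of a big-endian string into one non-negative integer.
def decode_be(str)
  str.each_byte.inject(0) { |acc, byte| (acc << 8) | byte }
end

# Split a non-negative integer into base-256 digits (least significant
# first), reverse them into big-endian order, and pack them as raw bytes.
def encode_be(val)
  raise ArgumentError, "negative value" if val < 0
  val.digits(256).reverse.pack("C*")
end

p decode_be("\001\002\003")   # => 66051
p encode_be(1234).bytes       # => [4, 210]
```

One difference from the original `while` loop: `encode_be(0)` yields a single zero byte, whereas the loop yields an empty string for 0.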

  2. On 6/4/05, Brian Candler <> wrote:
    > I'm sure there ought to be a Ruby function to do this, but I've been
    > scratching my head whilst going through the Pickaxe book :)
    >
    > I want to encode/decode a positive number to/from a variable-length
    > big-endian binary string.
    >
    > Originally the best I could come up with was:
    >
    > str = "\001\002\003"
    > val = 0
    > str.each_byte { |b| val = (val << 8) | b }
    > p val
    > # => 66051
    >
    > val = 1234
    > str = ""
    > while (val > 0)
    > str = (val & 0xff).chr + str
    > val >>= 8
    > end
    > p str
    > # => "\004\322"
    >
    > pack/unpack seem only to work for fixed lengths, e.g. 2 or 4 bytes.
    >
    > Is there a faster or simpler way of doing this in Ruby?
    >
    > Then I discovered I can go via hex:
    >
    > p "\001\002\003".unpack("H*")[0].hex
    > # => 66051
    >
    > str = 1234.to_s(16)
    > str = "0#{str}" if str.length % 2 != 0
    > val = [str].pack("H*")
    > p val
    > # => "\004\322"
    >
    > That's still pretty nasty. Any better offers?

    You're on the right track... It looks like you're just doing too much
    work. How about:

    [int.to_s(16)].pack('H*')

    to pack it, and:

    string.unpack('H*').first.to_i(16)

    to unpack?

    Also, if you aren't tied to this exact format, there's the
    BER-compressed integer option in #pack/#unpack, which handles
    variable-length integers in a nice way:

    [12345678901234567890].pack('w')
    ==>"\201\253\252\252\261\316\330\374\225R"
    [1].pack('w')
    ==>"\001"

    cheers,
    Mark
    Mark Hubbart, Jun 4, 2005
    #2
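To make the `'w'` suggestion concrete, here is a small sketch of how the BER-compressed integer directive behaves. It is a different wire format from plain big-endian octets: each byte carries 7 data bits, most significant group first, with the high bit set on every byte except the last. The value 300 is chosen only for illustration.

```ruby
# 300 = 0b10_0101100, so the two 7-bit groups are 0000010 and 0101100.
# The first byte gets its continuation bit set: 0x82, then 0x2c.
packed = [300].pack("w")
p packed.bytes                          # => [130, 44]

# Round trip with a value too large for any fixed-width directive.
big = 12345678901234567890
restored = [big].pack("w").unpack("w").first
p restored == big                       # => true

# Several integers can share one string; "w*" reads them all back.
values = [1, 300, big]
p values.pack("w*").unpack("w*") == values   # => true
```

Because each encoded integer marks its own end, `unpack("w*")` needs no length prefix, but it does need the whole string in memory first.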

  3. > how about:
    >
    > [int.to_s(16)].pack('H*')


    That doesn't work if the number of hex digits is odd:

    irb(main):006:0> 1234.to_s(16)
    => "4d2"
    irb(main):007:0> [1234.to_s(16)].pack("H*")
    => "M " # that's \x4d \x20
    irb(main):008:0> ["04d2"].pack("H*")
    => "\004\322" # that's \x04 \xd2
    irb(main):009:0>
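The irb session above can be read this way: with an odd number of hex digits, `pack("H*")` fills the missing low nibble of the last byte with zero, so the digits end up shifted rather than right-aligned. A short sketch of the effect and of the left-padding fix; the helper name `int_to_be_string` is hypothetical.

```ruby
# An odd-length hex string packs as if a "0" were appended on the right.
p ["4d2"].pack("H*") == ["4d20"].pack("H*")   # => true

# Hypothetical helper: left-pad to a whole number of bytes before packing.
def int_to_be_string(val)
  hex = val.to_s(16)
  hex = "0" + hex if hex.length.odd?
  [hex].pack("H*")
end

p int_to_be_string(1234).bytes    # => [4, 210]
p int_to_be_string(66051).bytes   # => [1, 2, 3]
```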

    > Also, if you aren't tied to this exact format, there's the
    > BER-compressed integer option in #pack/#unpack, which handles
    > variable-length integers in a nice way


    As it happens, I'm unpacking BER. The length field in a BER-encoded element
    is encoded as a straightforward N octets. See [**] below:

    def ber_read(io)
      blk = io.read(2) # minimum: short tag, short length
      tag = blk[0] & 0x1f
      len = blk[1]

      if tag == 0x1f # long form
        tag = 0
        while true
          ch = io.getc
          blk << ch
          tag = (tag << 7) | (ch & 0x7f)
          break if (ch & 0x80) == 0
        end
        len = io.getc
        blk << len
      end

      if (len & 0x80) != 0 # long form
        len = len & 0x7f
        raise "Indefinite length encoding not supported" if len == 0
        offset = blk.length
        blk << io.read(len)
        # is there a more efficient way of doing this? [**]
        len = 0
        blk[offset..-1].each_byte { |b| len = (len << 8) | b }
      end

      offset = blk.length
      blk << io.read(len)
      return blk, [blk[0] >> 5, tag], offset
    end

    The reason for this is so that I can read a DER element from a stream
    (OpenSSL::ASN1::decode requires the data to be in memory first).

    You'll notice that the long-form *tag* is encoded in the format you mention;
    however I can't use unpack for that since the length isn't known up-front.

    Regards,

    Brian.
    Brian Candler, Jun 5, 2005
    #3
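The length-decoding step marked [**] can be isolated into a small method. The sketch below assumes Ruby 1.9 or later, where `String#[]` and `IO#getc` return one-character strings rather than integers, so it uses `readbyte` and `each_byte` instead. The name `read_ber_length` is hypothetical, and the sketch covers only the length field, not the tag handling in `ber_read`.

```ruby
require "stringio"

# Hypothetical helper, not from the thread: read one BER/DER length field
# from an IO-like object and return it as an Integer.
def read_ber_length(io)
  first = io.readbyte                     # raises EOFError at end of stream
  return first if (first & 0x80).zero?    # short form: the byte is the length

  count = first & 0x7f                    # long form: number of length octets
  raise "Indefinite length encoding not supported" if count.zero?

  octets = io.read(count)
  if octets.nil? || octets.bytesize < count
    raise EOFError, "truncated length field"
  end

  # Same fold as the each_byte loop marked [**], applied to the octets read.
  octets.each_byte.inject(0) { |len, byte| (len << 8) | byte }
end

p read_ber_length(StringIO.new("\x05".b))           # => 5
p read_ber_length(StringIO.new("\x81\xc8".b))       # => 200
p read_ber_length(StringIO.new("\x82\x01\x00".b))   # => 256
```

Whether this is faster than the original loop is not something the thread establishes; the fold does the same work, it just avoids slicing `blk` a second time.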
