Reading a Bitstream

Discussion in 'Python' started by Dietrich Epp, Nov 18, 2003.

  1. Dietrich Epp

    Dietrich Epp Guest

    Are there any good modules for reading a bitstream? Specifically, I
    have a string and I want to be able to get the next N bits as an
    integer. Right now I'm using struct.unpack and bit operations, it's a
    bit kludgy but it gets the right results.

    Thanks in advance.
    Dietrich Epp, Nov 18, 2003
    #1
    1. Advertising

  2. Dietrich Epp

    Miki Tebeka Guest

    Hello Dietrich,

    > Are there any good modules for reading a bitstream? Specifically, I
    > have a string and I want to be able to get the next N bits as an
    > integer. Right now I'm using struct.unpack and bit operations, it's a
    > bit kludgy but it gets the right results.

    Have you looked at 'array' and 'xdrlib.Upnacker'?

    HTH.
    Miki
    Miki Tebeka, Nov 18, 2003
    #2
    1. Advertising

  3. Dietrich Epp

    Dietrich Epp Guest

    On Nov 18, 2003, at 7:28 AM, Miki Tebeka wrote:

    > Hello Dietrich,
    >
    >> Are there any good modules for reading a bitstream? Specifically, I
    >> have a string and I want to be able to get the next N bits as an
    >> integer. Right now I'm using struct.unpack and bit operations, it's a
    >> bit kludgy but it gets the right results.

    > Have you looked at 'array' and 'xdrlib.Upnacker'?


    Both of those look like they're aligned to byte boundaries. Am I
    mistaken?

    The file I'm reading has fields ranging from 1 to 32 bits wide, and
    they are packed bit-to-bit.

    I guess I'll write my own module.
    Dietrich Epp, Nov 18, 2003
    #3
  4. Dietrich Epp wrote:

    > Are there any good modules for reading a bitstream? Specifically, I
    > have a string and I want to be able to get the next N bits as an
    > integer. Right now I'm using struct.unpack and bit operations, it's a
    > bit kludgy but it gets the right results.


    As Miki wrote, the array module will probably give you what
    you want more easily than struct.unpack. If you need more
    help, just post a few more details and I will post a code
    snippet. (As to the rest of Miki's post, I'm not sure that
    I really want to know what an "Upnacker" is :)

    Pat
    Patrick Maupin, Nov 19, 2003
    #4
  5. Dietrich Epp

    Dietrich Epp Guest

    On Nov 18, 2003, at 6:10 PM, Patrick Maupin wrote:

    > Dietrich Epp wrote:
    >
    >> Are there any good modules for reading a bitstream? Specifically, I
    >> have a string and I want to be able to get the next N bits as an
    >> integer. Right now I'm using struct.unpack and bit operations, it's a
    >> bit kludgy but it gets the right results.

    >
    > As Miki wrote, the array module will probably give you what
    > you want more easily than struct.unpack. If you need more
    > help, just post a few more details and I will post a code
    > snippet. (As to the rest of Miki's post, I'm not sure that
    > I really want to know what an "Upnacker" is :)


    Maybe I should clarify: I need to read bit fields. Neither are they
    aligned to bytes or do they have fixed offsets. In fact, in one part
    of the file there is a list of objects which starts with a 9 bit object
    type followed by fields whose length and number depend on that object
    type, ranging from a dummy 1-bit field to a tuple of four fields of
    length 9, 5, 8, and 8 bits.

    I looked at the array module and can't find what I'm looking for.
    Here's a bit of typical usage.

    def readStuff(bytes):
    bits = BitStream(bytes[2:])
    isSimple = bits.Get(1)
    objType = chr(bits.Get(8))
    objType += chr(bits.Get(8))
    objType += chr(bits.Get(8))
    objType += chr(bits.Get(8))
    count = bits.Get(3)
    bits.Ignore(5)
    if not isSimple:
    objId = bits.Get(32)
    bytes = bytes[2+bits.PartialBytesRead():]
    return bytes, objType

    This is basically the gamut of what I want to do. I have a string, and
    create a bit stream object. I read fields from the bit stream, some
    may not be present, then return an object and the string that comes
    after it. The objects are aligned to bytes in this case even though
    their fields aren't.

    I can't figure out how to get array to do this. Array does not look at
    all suited to reading a bit stream. struct.unpack *does* work right
    now, with a lot of help, I was wondering if there was an easier way.
    Dietrich Epp, Nov 19, 2003
    #5
  6. On Wed, 19 Nov 2003 01:47:26 -0800, Dietrich Epp <> wrote:

    >
    >On Nov 18, 2003, at 6:10 PM, Patrick Maupin wrote:
    >
    >> Dietrich Epp wrote:
    >>
    >>> Are there any good modules for reading a bitstream? Specifically, I
    >>> have a string and I want to be able to get the next N bits as an
    >>> integer. Right now I'm using struct.unpack and bit operations, it's a
    >>> bit kludgy but it gets the right results.

    >>
    >> As Miki wrote, the array module will probably give you what
    >> you want more easily than struct.unpack. If you need more
    >> help, just post a few more details and I will post a code
    >> snippet. (As to the rest of Miki's post, I'm not sure that
    >> I really want to know what an "Upnacker" is :)

    >
    >Maybe I should clarify: I need to read bit fields. Neither are they
    >aligned to bytes or do they have fixed offsets. In fact, in one part
    >of the file there is a list of objects which starts with a 9 bit object
    >type followed by fields whose length and number depend on that object
    >type, ranging from a dummy 1-bit field to a tuple of four fields of
    >length 9, 5, 8, and 8 bits.
    >
    >I looked at the array module and can't find what I'm looking for.
    >Here's a bit of typical usage.
    >
    >def readStuff(bytes):
    > bits = BitStream(bytes[2:])
    > isSimple = bits.Get(1)
    > objType = chr(bits.Get(8))
    > objType += chr(bits.Get(8))
    > objType += chr(bits.Get(8))
    > objType += chr(bits.Get(8))
    > count = bits.Get(3)
    > bits.Ignore(5)
    > if not isSimple:
    > objId = bits.Get(32)
    > bytes = bytes[2+bits.PartialBytesRead():]
    > return bytes, objType
    >
    >This is basically the gamut of what I want to do. I have a string, and
    >create a bit stream object. I read fields from the bit stream, some
    >may not be present, then return an object and the string that comes
    >after it. The objects are aligned to bytes in this case even though
    >their fields aren't.
    >
    >I can't figure out how to get array to do this. Array does not look at
    >all suited to reading a bit stream. struct.unpack *does* work right
    >now, with a lot of help, I was wondering if there was an easier way.
    >
    >
    >

    Maybe this will do something for you?
    Note that this is a response to your post, and not something previously tested,
    (in fact not tested beyond what you see ;-) and it will be slow if you have
    huge amounts of data to process.

    You pass a string to the constructor, specifying big-endian if not little-endian,
    and then you use the read method to read bit fields, which may optionally have
    their most significant bits interpreted as sign bits.

    E.g., reading 4-bit chunks or bits, little-endian and big-endian:

    >>> import sbits
    >>> sb = sbits.SBits('01234567')
    >>> for i in xrange(8*2): print sb.read(4),

    ...
    0 3 1 3 2 3 3 3 4 3 5 3 6 3 7 3
    >>> sb = sbits.SBits('01234567',False)
    >>> for i in xrange(8*2): print sb.read(4),

    ...
    3 0 3 1 3 2 3 3 3 4 3 5 3 6 3 7
    >>> sb = sbits.SBits('\x05')
    >>> for i in xrange(8): print sb.read(1),

    ...
    1 0 1 0 0 0 0 0
    >>> sb = sbits.SBits('\x05',False)
    >>> for i in xrange(8): print sb.read(1),

    ...
    0 0 0 0 0 1 0 1

    >>> sb = sbits.SBits('01234567')
    >>> hex(sb.read(64))

    '0x3736353433323130L'
    >>> sb = sbits.SBits('01234567',False)
    >>> hex(sb.read(64))

    '0x3031323334353637L'
    >>> sb = sbits.SBits('01234567')
    >>> hex(sb.read(32))

    '0x33323130'
    >>> hex(sb.read(32))

    '0x37363534'
    >>> sb = sbits.SBits('01234567',False)
    >>> hex(sb.read(32))

    '0x30313233'
    >>> hex(sb.read(32))

    '0x34353637'

    Sorry for the lack of doc strings ;-/
    Please let me know if/when you find a bug.


    ====< sbits.py >=========================================
    import itertools
    class SBits(object):
    def __init__(self, s='', little_endian=True):
    self.le = little_endian
    self.buf = 0L
    self.bufbits=0
    self.getbyte = itertools.imap(ord, s).next
    def read(self, nb=0, signed=False):
    try:
    while self.bufbits<nb:
    if self.le:
    self.buf |= (long(self.getbyte())<<self.bufbits) # put at top
    else:
    self.buf = (self.buf<<8) | self.getbyte()
    self.bufbits+=8
    except StopIteration: # no more getbyte data
    raise EOFError, 'Failed to read %s bits from available %s.'%(nb, self.bufbits)
    self.bufbits -= nb
    if self.le:
    ret = self.buf & ((1L<<nb)-1)
    self.buf >>= nb
    else:
    ret = self.buf>>self.bufbits
    self.buf &= ((1L<<self.bufbits)-1)
    if signed:
    signbit = 1L<<(nb-1)
    if signbit & ret:
    ret = ret - signbit -signbit
    if -2**31 <= ret < 2**31: return int(ret)
    return ret #, nb

    def test():
    sb = SBits('\x03'*(sum(xrange(37))+7))
    bits = [sb.read(wid, wid&1>0) for wid in xrange(37)]
    hexis = map(hex,bits)
    shouldbe = [
    '0x0', '0xffffffff', '0x1', '0x0', '0xc', '0x0', '0x6', '0x18',
    '0x30', '0x30', '0x18', '0xfffffe06', '0xc0', '0xc0c', '0x2060', '0x181',
    '0x303', '0xffff0303', '0x18181', '0x6060', '0xc0c0c', '0xc0c0', '0x60606', '0x181818',
    '0x303030', '0x303030', '0x181818', '0xfe060606', '0xc0c0c0', '0xc0c0c0c', '0x20606060', '0x1818181',
    '0x3030303', '-0xFCFCFCFDL', '0x181818181L', '0x60606060', '0xC0C0C0C0CL']
    for i,h in enumerate(hexis): print '%12s%s'%(h,'\n'[:i%4==3]),
    print '\n-----\nThat was%s what was expected.\n-----'%((' not','')[hexis==shouldbe],)

    sb = SBits('\xc0'*(sum(xrange(37))+7), False)
    bits = [sb.read(wid, wid&1>0) for wid in xrange(37)]
    hexis = map(hex,bits)
    shouldbe = [
    '0x0', '0xffffffff', '0x2', '0x0', '0x3', '0x0', '0x18', '0xc',
    '0xc', '0x18', '0x60', '0x303', '0x30', '0x606', '0x181', '0xffffc0c0',
    '0xc0c0', '0xffff8181', '0x20606', '0x3030', '0x30303', '0x6060', '0x181818', '0xc0c0c',
    '0xc0c0c', '0x181818', '0x606060', '0x3030303', '0x303030', '0x6060606', '0x1818181', '0xc0c0c0c0',
    '0xC0C0C0C0L', '0x81818181', '0x206060606L', '0x30303030', '0x303030303L']
    for i,h in enumerate(hexis): print '%12s%s'%(h,'\n'[:i%4==3]),
    print '\n-----\nThat was%s what was expected.\n-----'%((' not','')[hexis==shouldbe],)
    if __name__ == '__main__':
    test()
    =========================================================

    Regards,
    Bengt Richter
    Bengt Richter, Nov 19, 2003
    #6
  7. Dietrich Epp

    Dietrich Epp Guest

    On Nov 19, 2003, at 7:02 PM, Bengt Richter wrote:

    [snip]

    > Maybe this will do something for you?
    > Note that this is a response to your post, and not something
    > previously tested,
    > (in fact not tested beyond what you see ;-) and it will be slow if you
    > have
    > huge amounts of data to process.
    >
    > You pass a string to the constructor, specifying big-endian if not
    > little-endian,
    > and then you use the read method to read bit fields, which may
    > optionally have
    > their most significant bits interpreted as sign bits.
    >
    > E.g., reading 4-bit chunks or bits, little-endian and big-endian:


    [snip]

    It looks like what I did before I started using struct.unpack. I think
    I'll stick with my current code, which, although it looks like ugly C
    code, works. Someday I'll probably get the itch again and make a
    decent module out of it.
    Dietrich Epp, Nov 20, 2003
    #7
  8. Dietrich Epp wrote:

    > Maybe I should clarify: I need to read bit fields. Neither are they
    > aligned to bytes or do they have fixed offsets. In fact, in one part
    > of the file there is a list of objects which starts with a 9 bit object
    > type followed by fields whose length and number depend on that object
    > type, ranging from a dummy 1-bit field to a tuple of four fields of
    > length 9, 5, 8, and 8 bits.
    >
    > I looked at the array module and can't find what I'm looking for.
    > Here's a bit of typical usage.
    >
    > def readStuff(bytes):
    > bits = BitStream(bytes[2:])
    > isSimple = bits.Get(1)
    > objType = chr(bits.Get(8))
    > objType += chr(bits.Get(8))
    > objType += chr(bits.Get(8))
    > objType += chr(bits.Get(8))
    > count = bits.Get(3)
    > bits.Ignore(5)
    > if not isSimple:
    > objId = bits.Get(32)
    > bytes = bytes[2+bits.PartialBytesRead():]
    > return bytes, objType
    >


    The fact that you want to read bitfields was perfectly clear.
    What was not clear to me was that you were apparently asking
    if there was a module which does _exactly_ what you want, and
    I thought you might be merely asking if there was a more suitable
    module to build your bitstream code on than the struct module.

    For this task, IMO struct is unwieldy, and array is much more
    suitable, because the indexing works very nicely, and it is
    quick and painless to convert a string into an array.

    If you need gobs of performance, you may end up writing a C
    module, but if you need moderate performance, you can probably
    do better with an array implementation than with a struct
    implementation, especially if you go ahead and dump an
    entire file into an array.

    Here is a module (completely untested, because it's bedtime)
    which uses array to provide this sort of functionality.

    I call the primary function "read" instead of "Get" because
    it's kinda-sorta modelled on a file object. Also, I didn't
    provide an "Ignore", because all you have to do is call
    "read" without assigning the results to anything.

    Hope this helps.

    Pat


    """
    bitarray class allows reading from a bitstream.

    The number of requested bits on a read are returned
    as a positive long integer.

    Limitations of this implementation:
    - bitstream is not writable after initialization
    - Must be initialized with a string (not a list or tuple)
    - ASSUMES LITTLE-ENDIAN BITSTREAMS
    - NOT TESTED

    Any limitation could be fixed with a six pack :)
    """

    import array

    class bitarray(object):

    # Support the new Athlon 64's by dynamically
    # calculating number of bytes per long,
    # and optimistically assume we'll have 256-bit
    # processors soon :)

    _bytesperlong = 32/len(array.array('L',32*' '))
    _bitsperlong = _bytesperlong*8

    def __init__(self,source_str):
    self._totalbits = len(source_str) * 8
    self._position = 0

    # Pad to longword boundary, then make an array

    source_str += -len(source_str) % self._bytessperlong * chr(0)
    self._bitstream = array.array('L',source_str)

    def seek(self,offset,whence=0):
    self._position = offset + (0,self._position,self._totalbits)[whence]

    def tell(self):
    return self._position

    def read(self,numbits):
    bitsperlong,position = self._bitsperlong,self._position

    if position < 0 or position + numbits > self._totalbits:
    raise IndexError, "Invalid bitarray._position/numbits"

    longaddress,bitoffset = divmod(position,bitsperlong)

    # We may read bits in the final word after ones we care
    # about, so create a mask to remove them later.

    finalmask = (1L << numbits) - 1

    # We may read bits in the first word before the ones we
    # care about, so bump the total bits to read by this
    # amount, so we read enough higher-order bits.

    numbits += bitoffset

    # Read and concatenate every long which contains a bit we need

    outval,outshift = 0L,0
    while numbits > 0:
    outval += self._bitstream[longaddress] << outshift
    longaddress += 1
    outshift += bitsperlong
    numbits -= bitsperlong

    # numbits is now basically a negative number which tells us
    # how many bits to back up from our current position.

    self._position = longaddress*bitsperlong + numbits

    # Shift right to strip off the low-order bits we
    # don't want, then 'and' with the mask to strip
    # off the high-order bits we don't want.

    return (outval >> bitoffset) & finalmask
    Patrick Maupin, Nov 21, 2003
    #8
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Zak

    bitstream compatibility

    Zak, Oct 21, 2003, in forum: VHDL
    Replies:
    0
    Views:
    447
  2. Stefan Oedenkoven
    Replies:
    1
    Views:
    515
    zinga
    Jan 4, 2005
  3. lightofspace

    Case range with bitstream: VHDL

    lightofspace, Oct 31, 2006, in forum: VHDL
    Replies:
    0
    Views:
    1,609
    lightofspace
    Oct 31, 2006
  4. quad

    Bitstream programming

    quad, Jan 2, 2007, in forum: VHDL
    Replies:
    0
    Views:
    420
  5. PSN
    Replies:
    1
    Views:
    380
    Chris Fairles
    Dec 13, 2007
Loading...

Share This Page