Byte oriented data types in python

Discussion in 'Python' started by Ravi, Jan 24, 2009.

  1. Ravi

    Ravi Guest

    I have the following packet format which I have to send over Bluetooth.

    packet_type (1 byte unsigned) || packet_length (1 byte unsigned) ||
    packet_data(variable)

    How do I construct these using Python data types, given that int and
    float have no limits and their sizes are not well defined?
     
    Ravi, Jan 24, 2009
    #1

  2. > packet_type (1 byte unsigned) || packet_length (1 byte unsigned) ||
    > packet_data(variable)
    >
    > How to construct these using python data types, as int and float have
    > no limits and their sizes are not well defined.


    In Python 2.x, use the regular string type: chr(n) will create a single
    byte, and the + operator will do the concatenation.

    In Python 3.x, use the bytes type (bytes() instead of chr()).

    Regards,
    Martin
     
    Martin v. Löwis, Jan 24, 2009
    #2

  3. Ravi

    Ravi Guest


    > Take a look at the struct and ctypes modules.


    struct is really not the right choice. It returns an expanded string of
    the data, and this means larger latency over Bluetooth.

    ctypes is basically for interfacing with libraries written in C
    (this I read from the Python docs).
     
    Ravi, Jan 25, 2009
    #3
  4. Ravi

    Ravi Guest

    On Jan 25, 12:52 am, "Martin v. Löwis" wrote:
    > > packet_type (1 byte unsigned) || packet_length (1 byte unsigned) ||
    > > packet_data(variable)

    >
    > > How to construct these using python data types, as int and float have
    > > no limits and their sizes are not well defined.

    >
    > In Python 2.x, use the regular string type: chr(n) will create a single
    > byte, and the + operator will do the concatenation.
    >
    > In Python 3.x, use the bytes type (bytes() instead of chr()).


    This looks really helpful thanks!
     
    Ravi, Jan 25, 2009
    #4
  5. Steve Holden

    Steve Holden Guest

    Ravi wrote:
    >> Take a look at the struct and ctypes modules.

    >
    > struct is really not the choice. it returns an expanded string of the
    > data and this means larger latency over bluetooth.
    >

    If you read the module documentation more carefully you will see that it
    "converts" between the various native data types and character strings.
    Thus each native data type occupies only as many bytes as are required
    to store it in its native form (modulo any alignments needed).

    > ctypes is basically for the interface with libraries written in C
    > (this I read from the python docs)
    >

    I believe it *is* the struct module you need.

    regards
    Steve
    --
    Steve Holden +1 571 484 6266 +1 800 494 3119
    Holden Web LLC http://www.holdenweb.com/
     
    Steve Holden, Jan 25, 2009
    #5
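
A note on Steve's point that struct output is compact: the following is a minimal sketch, assuming Python 3 and made-up field values (3 and 4), of how many bytes struct actually produces. The variable names are illustrative only.

```python
import struct

# Two unsigned 1-byte fields in network byte order: exactly two bytes.
header = struct.pack("!BB", 3, 4)
print(header)                    # b'\x03\x04'
print(struct.calcsize("!BB"))    # 2

# Alignment padding only appears with native alignment (the default "@" mode).
# With an explicit byte-order prefix such as "<" or "!", no padding is added.
packed_size = struct.calcsize("<bi")   # 1 + 4 bytes, no padding
native_size = struct.calcsize("@bi")   # platform dependent, never below 5
print(packed_size, native_size)
```

For a wire protocol, an explicit prefix such as "!" is usually the safer choice, since the result then does not depend on the platform's alignment rules.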

  6. >>> Take a look at the struct and ctypes modules.

    >> struct is really not the choice. it returns an expanded string of the
    >> data and this means larger latency over bluetooth.

    >
    > I don't know what you mean by "returns an expanded string of
    > the data".
    >
    > I do know that struct does exactly what you requested.


    I disagree. He has a format (type, length, value), with the
    value being variable-sized. How do you do that in the struct
    module?

    > It converts between Python objects and what is basically a C
    > "struct" where you specify the endianness of each field and
    > what sort of packing/padding you want.


    Sure. However, in the specific case, there is really no C
    struct that can reasonably represent the data. Hence you
    cannot really use the struct module.

    > I use the struct module frequently to implement binary
    > communications protocols in Python. I've used Python/struct
    > with transport layers ranging from Ethernet (raw, TCP, and UDP)
    > to async serial, to CAN.


    Do you use it for the fixed-size parts, or also for the variable-sized
    data?

    Regards,
    Martin
     
    Martin v. Löwis, Jan 25, 2009
    #6
  7. >> I disagree. He has a format (type, length, value), with the
    >> value being variable-sized. How do you do that in the struct
    >> module?

    >
    > You construct a format string for the "value" portion based on
    > the type/length header.


    Can you kindly provide example code on how to do this?

    > I don't see how that can be the case. There may not be a
    > single C struct that can represent all frames, but for every
    > frame you should be able to come up with a C struct that can
    > represent that frame.


    Sure. You would normally have a struct such as

    struct TLV {
        char type;
        char length;
        char *data;
    };

    However, the in-memory representation of that struct is *not*
    meant to be sent over the wire. In particular, the character
    pointer has no meaning outside the address space, and is thus
    not to be sent.

    > Both. For variable size/format stuff you decode the first few
    > bytes and use them to figure out what format/layout to use for
    > the next chunk of data. It's pretty much the same thing you do
    > in other languages.


    In the example he gave, I would just avoid using the struct module
    entirely, as it does not provide any additional value:

    def encode(type, length, value):
        return chr(type)+chr(length)+value

    Regards,
    Martin
     
    Martin v. Löwis, Jan 25, 2009
    #7
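
Martin's distinction between the in-memory struct and the on-the-wire format can be illustrated with ctypes. This is only a sketch, assuming Python 3; the `TLV` class mirrors the C struct quoted in the post, and the type value 17 and the payload are illustrative.

```python
import ctypes

class TLV(ctypes.Structure):
    # Mirrors the C struct from the post: two 1-byte fields and a pointer.
    _fields_ = [
        ("type", ctypes.c_ubyte),
        ("length", ctypes.c_ubyte),
        ("data", ctypes.c_char_p),
    ]

payload = b"call me"
frame = TLV(17, len(payload), payload)

# In-memory size: two bytes, padding and a pointer. It does not depend on
# the payload, and the pointer value is meaningless to the receiver.
in_memory_size = ctypes.sizeof(TLV)

# Wire form: the two header bytes followed by the payload bytes themselves.
wire = bytes([frame.type, frame.length]) + frame.data
print(in_memory_size, len(wire))
```

The size of the structure stays the same however long the payload is, which is why the structure itself cannot be what gets sent.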
  8. John Machin

    John Machin Guest

    On Jan 26, 2:28 am, Ravi wrote:
    > On Jan 25, 12:52 am, "Martin v. Löwis" wrote:
    >
    > > > packet_type (1 byte unsigned) || packet_length (1 byte unsigned) ||
    > > > packet_data(variable)

    >
    > > > How to construct these using python data types, as int and float have
    > > > no limits and their sizes are not well defined.

    >
    > > In Python 2.x, use the regular string type: chr(n) will create a single
    > > byte, and the + operator will do the concatenation.

    >
    > > In Python 3.x, use the bytes type (bytes() instead of chr()).

    >
    > This looks really helpful thanks!


    Provided that you don't take Martin's last sentence too literally :)


    | Python 2.6.1 (r261:67517, Dec 4 2008, 16:51:00) [MSC v.1500 32 bit (Intel)] on win32
    | >>> p_data = b"abcd" # Omit the b prefix if using 2.5 or earlier
    | >>> p_len = len(p_data)
    | >>> p_type = 3
    | >>> chr(p_type) + chr(p_len) + p_data
    | '\x03\x04abcd'

    | Python 3.0 (r30:67507, Dec 3 2008, 20:14:27) [MSC v.1500 32 bit (Intel)] on win32
    | >>> p_data = b"abcd"
    | >>> p_len = len(p_data)
    | >>> p_type = 3
    | >>> bytes(p_type) + bytes(p_len) + p_data # literal translation
    | b'\x00\x00\x00\x00\x00\x00\x00abcd'
    | >>> bytes(3)
    | b'\x00\x00\x00'
    | >>> bytes(10)
    | b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
    | >>> bytes([p_type]) + bytes([p_len]) + p_data
    | b'\x03\x04abcd'
    | >>> bytes([p_type, p_len]) + p_data
    | b'\x03\x04abcd'

    Am I missing a better way to translate chr(n) from 2.x to 3.x? The
    meaning assigned to bytes(n) in 3.X is "interesting":

    2.X:
        nuls = '\0' * n
        out_byte = chr(n)

    3.X:
        nuls = b'\0' * n
    or
        nuls = bytes(n)
        out_byte = bytes([n])

    Looks to me like there was already a reasonable way of getting a bytes
    object containing a variable number of zero bytes. Any particular
    reason why bytes(n) was given this specialised meaning? Can't be the
    speed, because the speed of bytes(n) on my box is about 50% of the
    speed of the * expression for n = 16 and about 65% for n = 1024.

    Cheers,
    John
     
    John Machin, Jan 25, 2009
    #8
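
Putting John's `bytes([...])` translation together with the packet format from the first post, a Python 3 encoder and decoder might look like the sketch below. The function names, the range checks and the three-part return value of `decode` are assumptions added for illustration; nobody in the thread posted them.

```python
def encode(ptype: int, payload: bytes) -> bytes:
    """Build type (1 byte) || length (1 byte) || payload."""
    if not 0 <= ptype <= 255:
        raise ValueError("packet type must fit in one unsigned byte")
    if len(payload) > 255:
        raise ValueError("payload too long for a 1-byte length field")
    # bytes([a, b]) builds a 2-byte object from two integers in range(256).
    return bytes([ptype, len(payload)]) + payload


def decode(packet: bytes):
    """Split one packet into (type, payload, remaining bytes)."""
    if len(packet) < 2:
        raise ValueError("need at least the 2-byte header")
    ptype, length = packet[0], packet[1]   # indexing bytes yields ints in 3.x
    if len(packet) < 2 + length:
        raise ValueError("truncated payload")
    return ptype, packet[2:2 + length], packet[2 + length:]


packet = encode(3, b"abcd")
print(packet)            # b'\x03\x04abcd'
print(decode(packet))    # (3, b'abcd', b'')
```

Returning the remaining bytes lets the caller decode several packets from one receive buffer in a loop.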
  9. > Looks to me like there was already a reasonable way of getting a bytes
    > object containing a variable number of zero bytes. Any particular
    > reason why bytes(n) was given this specialised meaning?


    I think it was because bytes() was originally mutable, and you need a
    way to create a buffer of n bytes. Now that bytes() ended up immutable
    (and bytearray was added), it's perhaps not so useful anymore. Of
    course, it would be confusing if bytes(4) created a sequence of one
    byte, yet bytearray(4) created four bytes.

    Regards,
    Martin
     
    Martin v. Löwis, Jan 25, 2009
    #9
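
The different meanings discussed in the two posts above can be checked side by side. A short sketch, assuming Python 3; the variable names are illustrative.

```python
zeros = bytes(4)          # four zero bytes, immutable
one_byte = bytes([4])     # a single byte with value 4
buf = bytearray(4)        # four zero bytes, mutable
buf[0] = 0x11             # in-place assignment works on a bytearray

# int.to_bytes is another way to get a single byte from an integer.
also_one = (4).to_bytes(1, "big")

print(zeros, one_byte, bytes(buf), also_one)
```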
  10. > dtype = ord(rawdata[0])
    > dcount, = struct.unpack("!H",rawdata[1:3])  # unpack returns a tuple
    > if dtype == 1:
    >     fmtstr = "!" + "H"*dcount
    > elif dtype == 2:
    >     fmtstr = "!" + "f"*dcount
    > rlen = struct.calcsize(fmtstr)
    >
    > data = struct.unpack(fmtstr,rawdata[3:3+rlen])
    >
    > leftover = rawdata[3+rlen:]


    Unfortunately, that does not work in the example. We have
    a message type (an integer), and a variable-length string.
    So how do you compute the struct format for that?

    >> Sure. You would normally have a struct such as
    >>
    >> struct TLV{
    >> char type;
    >> char length;
    >> char *data;
    >> };
    >>
    >> However, the in-memory representation of that struct is *not*
    >> meant to be sent over the wire. In particular, the character
    >> pointer has no meaning outside the address space, and is thus
    >> not to be sent.

    >
    > Well if it's not representing the layout of the data we're
    > trying to deal with, then it's irrelevant. We are talking
    > about how to convert Python objects to/from data in the
    > 'on-the-wire' format, right?


    Right: ON-THE-WIRE, not IN MEMORY. In memory, there is a
    pointer. On the wire, there are no pointers.

    > Like this?
    >
    > >>> def encode(type,length,value):
    > ...     return chr(type)+chr(length)+value
    > ...
    > >>> print encode('float', 1, 3.14159)
    > Traceback (most recent call last):
    >   File "<stdin>", line 1, in <module>
    >   File "<stdin>", line 2, in encode
    > TypeError: an integer is required


    No:

    py> CONNECT_REQUEST=17
    py> payload="call me"
    py> encode(CONNECT_REQUEST, len(payload), payload)
    '\x11\x07call me'

    Regards,
    Martin
     
    Martin v. Löwis, Jan 25, 2009
    #10
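
For comparison, the quoted struct-based decoder at the top of the post above can be written as a self-contained Python 3 function. This is a sketch under stated assumptions: the frame layout (1-byte type, 2-byte big-endian count, then that many values) is inferred from the quoted snippet, and the `ELEMENT_CODES` table and the `decode_frame` name are hypothetical.

```python
import struct

# Assumed mapping from frame type to struct element code:
# type 1 means unsigned 16-bit ints, type 2 means 32-bit floats.
ELEMENT_CODES = {1: "H", 2: "f"}


def decode_frame(rawdata: bytes):
    """Return (values, leftover) for one frame at the start of rawdata."""
    dtype = rawdata[0]
    (dcount,) = struct.unpack("!H", rawdata[1:3])   # unpack returns a tuple
    # Build the format for the value part from the decoded header.
    fmtstr = "!%d%s" % (dcount, ELEMENT_CODES[dtype])
    rlen = struct.calcsize(fmtstr)
    values = struct.unpack(fmtstr, rawdata[3:3 + rlen])
    return values, rawdata[3 + rlen:]


raw = struct.pack("!BH3H", 1, 3, 10, 20, 30) + b"rest"
print(decode_frame(raw))   # ((10, 20, 30), b'rest')
```

This approach fits frames whose value part is a run of numbers. It does not by itself settle Martin's objection about the original packet, whose value is a plain byte string.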
  11. >> Unfortunately, that does not work in the example. We have
    >> a message type (an integer), and a variable-length string.
    >> So how do you compute the struct format for that?

    >
    > I'm confused. Are you asking for an introductory tutorial on
    > programming in Python?


    Perhaps. I honestly do not know how to deal with variable-sized
    strings in the struct module in a reasonable way, and thus believe
    that this module is incapable of actually supporting them
    (unless you use inappropriate trickery).

    However, as you keep claiming that the struct module is what
    should be used, I must be missing something about the struct
    module.

    > I don't understand your point.
    >
    >> py> CONNECT_REQUEST=17
    >> py> payload="call me"
    >> py> encode(CONNECT_REQUEST, len(payload), payload)
    >> '\x11\x07call me'

    >
    > If all your data is comprised of 8-bit bytes, then you don't
    > need the struct module.


    Go back to the original message of the OP. It says

    # I have following packet format which I have to send over Bluetooth.
    # packet_type (1 byte unsigned) || packet_length (1 byte unsigned) ||
    # packet_data(variable)

    So yes, all his data is comprised of 8-bit bytes, and yes, he doesn't
    need the struct module. Hence I'm puzzled why people suggest that
    he uses the struct module.

    I think the key answer is "use the string type, it is appropriate
    to represent byte oriented data in python" (also see the subject
    of this thread)

    Regards,
    Martin
     
    Martin v. Löwis, Jan 25, 2009
    #11
  12. > It deals with variable sized fields just fine:
    >
    > dtype = 18
    > dlength = 32
    > format = "!BB%ds" % dlength
    >
    > rawdata = struct.pack(format, dtype, dlength, data)


    I wouldn't call this "just fine", though - it involves
    a % operator to even compute the format string. IMO,
    it is *much* better not to use the struct module for this
    kind of problem, and instead rely on regular string
    concatenation.

    Regards,
    Martin
     
    Martin v. Löwis, Jan 25, 2009
    #12
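
The two approaches debated above produce identical bytes for this packet format, so the choice is about readability rather than output. A minimal sketch, assuming Python 3 and a made-up 32-byte payload:

```python
import struct

dtype = 18
data = b"x" * 32
dlength = len(data)

# struct version: the count for "s" has to be formatted into the format
# string. struct.pack takes the fields as separate arguments, not a tuple.
via_struct = struct.pack("!BB%ds" % dlength, dtype, dlength, data)

# Concatenation version, as Martin suggests (bytes([...]) in Python 3).
via_concat = bytes([dtype, dlength]) + data

print(via_struct == via_concat)   # True
```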
  13. John Machin

    John Machin Guest

    On Jan 26, 10:53 am, "Martin v. Löwis" wrote:
    > > It deals with variable sized fields just fine:

    >
    > > dtype = 18
    > > dlength = 32
    > > format = "!BB%ds" % dlength

    >
    > > rawdata = struct.pack(format, dtype, dlength, data)

    >
    > I wouldn't call this "just fine", though - it involves
    > a % operator to even compute the format string. IMO,
    > it is *much* better not to use the struct module for this
    > kind of problem, and instead rely on regular string
    > concatenation.
    >


    IMO, it would be a good idea if struct.[un]pack supported a variable *
    length operator that could appear anywhere that an integer constant
    could appear, as in C's printf etc and Python's % formatting:

    dlen = len(data)
    rawdata = struct.pack("!BB*s", dtype, dlen, dlen, data)
    # and on the other end of the wire:
    dtype, dlen = struct.unpack("!BB", rawdata[:2])
    data = struct.unpack("!*s", rawdata[2:], dlen)
    # more than 1 count arg could be used if necessary
    # *s would return a string
    # *B, *H, *I, etc would return a tuple of ints in (3.X-speak)

    I've worked with variable-length data that looked like
    len1, len2, len3, data1, data2, data3
    and the * gadget would have been very handy:
    len1, len2, len3 = unpack('!BBB', raw[:3])
    data1, data2, data3 = unpack('!*H*i*d', raw[3:], len1, len2, len3)

    Note the semantics of '!*H*i*d' would be different from '!8H2i7d'
    because otherwise you'd need to do:
    bundle = unpack('!*H*i*d', raw[3:], len1, len2, len3)
    data1 = bundle[:len1]
    data2 = bundle[len1:len1+len2]
    data3 = bundle[len1+len2:]
     
    John Machin, Jan 26, 2009
    #13
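
The `*` count operator John proposes does not exist in the struct module. With the module as it stands, the same len1, len2, len3, data1, data2, data3 layout can be handled by formatting the counts into the format string and slicing the result. The sketch below assumes Python 3; the `unpack_counted` name is hypothetical.

```python
import struct


def unpack_counted(raw: bytes):
    """Decode three 1-byte counts, then len1 H, len2 i and len3 d values."""
    len1, len2, len3 = struct.unpack("!BBB", raw[:3])
    fmt = "!%dH%di%dd" % (len1, len2, len3)
    bundle = struct.unpack(fmt, raw[3:3 + struct.calcsize(fmt)])
    # Slice the flat tuple back into the three groups.
    data1 = bundle[:len1]
    data2 = bundle[len1:len1 + len2]
    data3 = bundle[len1 + len2:]
    return data1, data2, data3


raw = struct.pack("!BBB2H1i1d", 2, 1, 1, 7, 8, -5, 1.5)
print(unpack_counted(raw))   # ((7, 8), (-5,), (1.5,))
```

The slicing step is exactly the extra work John's proposed semantics would have removed.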
