Byte-oriented data types in Python

Discussion in 'Python' started by Ravi, Jan 24, 2009.

  1. Ravi

    Ravi Guest

    I have the following packet format which I have to send over Bluetooth.

    packet_type (1 byte unsigned) || packet_length (1 byte unsigned) ||
    packet_data(variable)

    How do I construct these using Python data types, given that int and
    float have no fixed limits and their sizes are not well defined?
     
    Ravi, Jan 24, 2009
    #1

  2. Martin v. Löwis

    In Python 2.x, use the regular string type: chr(n) will create a single
    byte, and the + operator will do the concatenation.

    In Python 3.x, use the bytes type (bytes() instead of chr()).

    Regards,
    Martin
     
    Martin v. Löwis, Jan 24, 2009
    #2

  3. Ravi

    Ravi Guest

    struct is really not the right choice. It returns an expanded string of
    the data, and this means larger latency over Bluetooth.

    ctypes is basically for interfacing with libraries written in C
    (this I read in the Python docs).
     
    Ravi, Jan 25, 2009
    #3
  4. Ravi

    Ravi Guest

    This looks really helpful, thanks!
     
    Ravi, Jan 25, 2009
    #4
  5. Steve Holden

    Steve Holden Guest

    If you read the module documentation more carefully you will see that it
    "converts" between the various native data types and character strings.
    Thus each native data type occupies only as many bytes as are required
    to store it in its native form (modulo any alignments needed).
    I believe it *is* the struct module you need.
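
    A minimal sketch of that point, assuming Python 3: the packed result is exactly as long as the fields it describes, and padding appears only with native alignment and mixed field sizes. The values 3 and 4 are arbitrary.

```python
import struct

# Two unsigned 1-byte fields ("B") pack into exactly two bytes.
header = struct.pack("BB", 3, 4)
print(header)                   # b'\x03\x04'
print(struct.calcsize("BB"))    # 2

# With native alignment, a 1-byte field followed by a 4-byte field may be
# padded; the result is platform dependent, so no value is shown here.
print(struct.calcsize("BI"))

# A "!" (network byte order) prefix disables alignment padding.
print(struct.calcsize("!BI"))   # 5
```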

    regards
    Steve
     
    Steve Holden, Jan 25, 2009
    #5
  6. Martin v. Löwis

    I disagree. He has a format (type, length, value), with the
    value being variable-sized. How do you do that in the struct
    module?

    Sure. However, in the specific case, there is really no C
    struct that can reasonably represent the data. Hence you
    cannot really use the struct module.

    Do you use it for the fixed-size parts, or also for the variable-sized
    data?

    Regards,
    Martin
     
    Martin v. Löwis, Jan 25, 2009
    #6
  7. Martin v. Löwis

    > Can you kindly provide example code on how to do this?

    Sure. You would normally have a struct such as

    struct TLV {
        char type;
        char length;
        char *data;
    };

    However, the in-memory representation of that struct is *not*
    meant to be sent over the wire. In particular, the character
    pointer has no meaning outside the address space, and is thus
    not to be sent.
    In the example he gave, I would just avoid using the struct module
    entirely, as it does not provide any additional value:

    def encode(type, length, value):
        # Python 2.x: chr(n) yields a one-byte str; value is a str as well
        return chr(type) + chr(length) + value

    Regards,
    Martin
     
    Martin v. Löwis, Jan 25, 2009
    #7
  8. John Machin

    John Machin Guest

    Provided that you don't take Martin's last sentence too literally :)


    | Python 2.6.1 (r261:67517, Dec 4 2008, 16:51:00) [MSC v.1500 32 bit
    (Intel)] on win32
    | >>> p_data = b"abcd" # Omit the b prefix if using 2.5 or earlier
    | >>> p_len = len(p_data)
    | >>> p_type = 3
    | >>> chr(p_type) + chr(p_len) + p_data
    | '\x03\x04abcd'

    | Python 3.0 (r30:67507, Dec 3 2008, 20:14:27) [MSC v.1500 32 bit
    (Intel)] on win32
    | >>> p_data = b"abcd"
    | >>> p_len = len(p_data)
    | >>> p_type = 3
    | >>> bytes(p_type) + bytes(p_len) + p_data # literal translation
    | b'\x00\x00\x00\x00\x00\x00\x00abcd'
    | >>> bytes(3)
    | b'\x00\x00\x00'
    | >>> bytes(10)
    | b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
    | >>> bytes([p_type]) + bytes([p_len]) + p_data
    | b'\x03\x04abcd'
    | >>> bytes([p_type, p_len]) + p_data
    | b'\x03\x04abcd'

    Am I missing a better way to translate chr(n) from 2.x to 3.x? The
    meaning assigned to bytes(n) in 3.X is "interesting":

    2.X:
    nuls = '\0' * n
    out_byte = chr(n)

    3.X:
    nuls = b'\0' * n
    or
    nuls = bytes(n)
    out_byte = bytes([n])
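
    As a sketch of the 3.x translation above, a hypothetical encode() helper could look like the following. The name echoes the 2.x function earlier in the thread; the range checks and the to_bytes comparison (which assumes Python 3.2 or later) are additions, not something the thread proposes.

```python
import struct

def encode(ptype, value):
    """Build type || length || value with 1-byte unsigned type and length."""
    if not 0 <= ptype <= 255:
        raise ValueError("packet type must fit in one unsigned byte")
    if len(value) > 255:
        raise ValueError("payload too long for a 1-byte length field")
    # bytes([a, b]) builds a bytes object from a list of integer values
    return bytes([ptype, len(value)]) + value

packet = encode(3, b"abcd")
print(packet)  # b'\x03\x04abcd'

# Three equivalent spellings of "a single byte with value n":
n = 3
print(bytes([n]) == struct.pack("B", n) == n.to_bytes(1, "big"))  # True
```

    Computing the length inside the helper, rather than passing it in, removes one way for the header and the payload to disagree.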

    Looks to me like there was already a reasonable way of getting a bytes
    object containing a variable number of zero bytes. Any particular
    reason why bytes(n) was given this specialised meaning? Can't be the
    speed, because the speed of bytes(n) on my box is about 50% of the
    speed of the * expression for n = 16 and about 65% for n = 1024.

    Cheers,
    John
     
    John Machin, Jan 25, 2009
    #8
  9. Martin v. Löwis

    I think it was because bytes() was originally mutable, and you need a
    way to create a buffer of n bytes. Now that bytes() ended up immutable
    (and bytearray was added), it's perhaps not so useful anymore. Of
    course, it would be confusing if bytes(4) created a sequence of one
    byte, yet bytearray(4) created four bytes.
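
    The parallel between the two constructors can be seen in a short sketch, assuming Python 3; the variable names are illustrative only.

```python
zeros = bytes(4)       # immutable: four zero bytes
buf = bytearray(4)     # mutable: four zero bytes, usable as a buffer

buf[0] = 0x11          # item assignment works on bytearray
buf[1] = 0x02
print(zeros)           # b'\x00\x00\x00\x00'
print(buf)             # bytearray(b'\x11\x02\x00\x00')

try:
    zeros[0] = 1       # item assignment is rejected on bytes
    immutable = False
except TypeError:
    immutable = True
print(immutable)       # True
```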

    Regards,
    Martin
     
    Martin v. Löwis, Jan 25, 2009
    #9
  10. Martin v. Löwis

    > dtype = ord(rawdata[0])

    Unfortunately, that does not work in the example. We have
    a message type (an integer), and a variable-length string.
    So how do you compute the struct format for that?

    Right: ON-THE-WIRE, not IN MEMORY. In memory, there is a
    pointer. On the wire, there are no pointers.

    No:

    py> CONNECT_REQUEST=17
    py> payload="call me"
    py> encode(CONNECT_REQUEST, len(payload), payload)
    '\x11\x07call me'
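
    For the receiving side, a minimal sketch assuming Python 3 and one complete packet per buffer might look like this. decode() is a hypothetical name, and the truncation checks are an addition. In 3.x, indexing a bytes object already yields an int, so the ord() call from the 2.x snippet is not needed.

```python
def decode(rawdata):
    """Split one type || length || value packet into (type, value)."""
    if len(rawdata) < 2:
        raise ValueError("packet shorter than its 2-byte header")
    ptype, length = rawdata[0], rawdata[1]   # ints in Python 3
    value = rawdata[2:2 + length]
    if len(value) != length:
        raise ValueError("packet truncated")
    return ptype, value

print(decode(b"\x11\x07call me"))  # (17, b'call me')
```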

    Regards,
    Martin
     
    Martin v. Löwis, Jan 25, 2009
    #10
  11. Martin v. Löwis

    Perhaps. I honestly do not know how to deal with variable-sized
    strings in the struct module in a reasonable way, and thus believe
    that this module is incapable of actually supporting them
    (unless you use inappropriate trickery).

    However, as you keep claiming that the struct module is what
    should be used, I must be missing something about the struct
    module.

    Go back to the original message of the OP. It says

    # I have the following packet format which I have to send over Bluetooth.
    # packet_type (1 byte unsigned) || packet_length (1 byte unsigned) ||
    # packet_data(variable)

    So yes, all his data consists of 8-bit bytes, and yes, he doesn't
    need the struct module. Hence I'm puzzled why people suggest that
    he uses the struct module.

    I think the key answer is "use the string type, it is appropriate
    to represent byte oriented data in python" (also see the subject
    of this thread)

    Regards,
    Martin
     
    Martin v. Löwis, Jan 25, 2009
    #11
  12. Martin v. Löwis

    > It deals with variable sized fields just fine:

    I wouldn't call this "just fine", though - it involves
    a % operator to even compute the format string. IMO,
    it is *much* better not to use the struct module for this
    kind of problem, and instead rely on regular string
    concatenation.
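
    The two approaches being compared can be put side by side in a short sketch, assuming Python 3. The message type 17 and the payload are the values from the earlier example; the variable names are illustrative.

```python
import struct

ptype, data = 17, b"call me"

# struct approach: the format string has to be computed for every packet.
fmt = "!BB%ds" % len(data)             # "!BB7s" for this payload
via_struct = struct.pack(fmt, ptype, len(data), data)

# Concatenation approach: no format string is involved.
via_concat = bytes([ptype, len(data)]) + data

print(via_struct)                # b'\x11\x07call me'
print(via_struct == via_concat)  # True
```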

    Regards,
    Martin
     
    Martin v. Löwis, Jan 25, 2009
    #12
  13. John Machin

    John Machin Guest

    IMO, it would be a good idea if struct.[un]pack supported a variable *
    length operator that could appear anywhere that an integer constant
    could appear, as in C's printf etc and Python's % formatting:

    dlen = len(data)
    rawdata = struct.pack("!BB*s", dtype, dlen, dlen, data)
    # and on the other end of the wire:
    dtype, dlen = struct.unpack("!BB", rawdata[:2])
    data = struct.unpack("!*s", rawdata[2:], dlen)
    # more than 1 count arg could be used if necessary
    # *s would return a string
    # *B, *H, *I, etc would return a tuple of ints in (3.X-speak)

    I've worked with variable-length data that looked like
    len1, len2, len3, data1, data2, data3
    and the * gadget would have been very handy:
    len1, len2, len3 = unpack('!BBB', raw[:3])
    data1, data2, data3 = unpack('!*H*i*d', raw[3:], len1, len2, len3)

    Note the semantics of '!*H*i*d' would be different from '!8H2i7d'
    because otherwise you'd need to do:
    bundle = unpack('!*H*i*d', raw[3:], len1, len2, len3)
    data1 = bundle[:len1]
    data2 = bundle[len1:len1+len2]
    data3 = bundle[len1+len2:]
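
    Since the * operator is only a proposal, the grouped result can be approximated today with a small helper built on struct.unpack_from and struct.calcsize. This is a sketch assuming Python 3; unpack_counted is a hypothetical name and not part of the struct module.

```python
import struct

def unpack_counted(raw, codes, counts, byte_order="!"):
    """Unpack consecutive runs: counts[i] items of format code codes[i].

    Returns one tuple per run, so no manual slicing is needed afterwards.
    """
    groups, offset = [], 0
    for code, count in zip(codes, counts):
        fmt = "%s%d%s" % (byte_order, count, code)
        groups.append(struct.unpack_from(fmt, raw, offset))
        offset += struct.calcsize(fmt)
    return groups

# Layout: len1, len2, len3, data1 (H), data2 (i), data3 (d)
raw = struct.pack("!BBB2H1i1d", 2, 1, 1, 10, 20, -5, 1.5)
len1, len2, len3 = struct.unpack("!BBB", raw[:3])
data1, data2, data3 = unpack_counted(raw[3:], "Hid", (len1, len2, len3))
print(data1, data2, data3)  # (10, 20) (-5,) (1.5,)
```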
     
    John Machin, Jan 26, 2009
    #13