pack a three byte int

Discussion in 'Python' started by p.lavarre@ieee.org, Nov 8, 2006.

  1. Guest

    Can Python not express the idea of a three-byte int?

    For instance, in the working example below, can we somehow collapse the
    three calls of struct.pack into one?

    >>> import struct
    >>>
    >>> skip = 0x123456 ; count = 0x80
    >>>
    >>> cdb = ''
    >>> cdb += struct.pack('>B', 0x08)
    >>> cdb += struct.pack('>I', skip)[-3:]
    >>> cdb += struct.pack('>BB', count, 0)
    >>>
    >>> print ' '.join(['%02X' % ord(xx) for xx in cdb])
    08 12 34 56 80 00
    >>>


    I ask because I'm trying to refactor working code that is concise:

    >>> cdb0 = '\x08' '\x12\x34\x56' '\x80' '\0'
    >>> print ' '.join(['%02X' % ord(xx) for xx in cdb0])
    08 12 34 56 80 00
    >>>


    Thanks in advance, Pat LaVarre
    , Nov 8, 2006
    #1

  2. John Machin Guest

    wrote:
    > Can Python not express the idea of a three-byte int?


    It is a bit hard to determine what that (rhetorical?) question means.
    Possible answers:
    1. Not as concisely as a one-byte struct code -- as you presumably have
    already determined by reading the manual ...
    2. No, but when 24-bit machines become as popular as they were in the
    1960s, feel free to submit an enhancement request :)

    >
    > For instance, in the working example below, can we somehow collapse the
    > three calls of struct.pack into one?
    >
    > >>> import struct
    > >>>
    > >>> skip = 0x123456 ; count = 0x80
    > >>>
    > >>> cdb = ''
    > >>> cdb += struct.pack('>B', 0x08)
    > >>> cdb += struct.pack('>I', skip)[-3:]
    > >>> cdb += struct.pack('>BB', count, 0)
    > >>>
    > >>> print ' '.join(['%02X' % ord(xx) for xx in cdb])

    > 08 12 34 56 80 00
    > >>>


    You could try throwing the superfluous bits away before packing instead
    of after:

    | >>> from struct import pack
    | >>> skip = 0x123456; count = 0x80
    | >>> hi, lo = divmod(skip, 0x10000)
    | >>> cdb = pack(">BBHBB", 0x08, hi, lo, count, 0)
    | >>> ' '.join(["%02X" % ord(x) for x in cdb])
    | '08 12 34 56 80 00'

    but why do you want to do that to concise working code???
    John Machin, Nov 8, 2006
    #2
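A minimal sketch of the divmod approach from post #2, rewritten for Python 3, where `struct.pack` returns `bytes` instead of `str`. The function name `pack_read6` and its parameter names are hypothetical, chosen only for illustration, and the range check is an addition that the original post does not have.

```python
import struct

def pack_read6(skip, count, opcode=0x08):
    """Pack opcode, a 3-byte big-endian skip, count, and a zero control byte."""
    if not 0 <= skip <= 0xFFFFFF:
        raise ValueError("skip must fit in 24 bits")
    # Split the 24-bit value into its top byte and its low 16 bits,
    # because struct has codes for 1 and 2 bytes but none for 3.
    hi, lo = divmod(skip, 0x10000)
    return struct.pack(">BBHBB", opcode, hi, lo, count, 0)

cdb = pack_read6(0x123456, 0x80)
print(" ".join("%02X" % b for b in cdb))  # prints: 08 12 34 56 80 00
```

Iterating over `bytes` in Python 3 yields ints, so the `ord()` call from the 2006 session is no longer needed.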

  3. Dave Opstad Guest

    In article <>,
    wrote:

    > Can Python not express the idea of a three-byte int?
    >
    > For instance, in the working example below, can we somehow collapse the
    > three calls of struct.pack into one?
    >
    > >>> import struct
    > >>>
    > >>> skip = 0x123456 ; count = 0x80
    > >>>
    > >>> cdb = ''
    > >>> cdb += struct.pack('>B', 0x08)
    > >>> cdb += struct.pack('>I', skip)[-3:]
    > >>> cdb += struct.pack('>BB', count, 0)


    Why not something like this:

    skip += struct.pack(">L", skip)[1:]

    Dave
    Dave Opstad, Nov 9, 2006
    #3
  4. Dave Opstad Guest

    Sorry, that should have been:

    cdb += struct.pack(">L", skip)[1:]

    Dave
    Dave Opstad, Nov 9, 2006
    #4
  5. John Machin Guest

    Dave Opstad wrote:
    > Sorry, that should have been:
    >
    > cdb += struct.pack(">L", skip)[1:]
    >


    ">L" and ">I" produce exactly the same 4-byte result. The change from
    [-3:] to [1:] is a minor cosmetic improvement, but obscures the
    underlying ... a bit like putting mascara on a pig. I got the
    impression that the OP was interested in more radical improvement.

    Cheers,
    John
    John Machin, Nov 9, 2006
    #5
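The claims in post #5 can be checked directly. This is a small sketch assuming Python 3; the final lines use `int.to_bytes`, which did not exist when the thread was written and is shown only as a modern alternative to slicing.

```python
import struct

skip = 0x123456

as_I = struct.pack(">I", skip)
as_L = struct.pack(">L", skip)

# With an explicit ">" prefix, both codes use the standard 4-byte size.
assert as_I == as_L == b"\x00\x12\x34\x56"

# Dropping the first byte and keeping the last three give the same result,
# but only because the packed field happens to be exactly 4 bytes wide.
assert as_I[1:] == as_I[-3:] == b"\x12\x34\x56"

# Python 3 alternative: ask for three big-endian bytes directly.
three = skip.to_bytes(3, "big")
assert three == as_I[-3:]
```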
  6. Guest

    > Not as concisely as a one-byte struct code

    Help, what do you mean?

    > you presumably... read... the manual ...


    Did I reread the wrong parts? I see I could define a ctypes.Structure
    since 2.5, but that would be neither concise, nor since 2.3.

    > when 24-bit machines become ... popular


    Indeed the struct's defined recently, ~1980, were contorted to make
    them easy to say in C, which makes them easy to say in Python, e.g.:

    X28Read10 = 0x28
    cdb = struct.pack('>BBIBHB', X28Read10, 0, skip, 0, count, 0)

    But when talking the 1960's lingo I find I am actually resorting to
    horrors like:

    X12Inquiry = 0x12
    xxs = [0] * 6
    xxs[0] = X12Inquiry
    xxs[4] = allocationLength
    rq = ''.join([chr(xx) for xx in xxs])

    Surely this is wrong? A failure on my part to think in Python?
    , Nov 9, 2006
    #6
  7. Guest

    > > > cdb0 = '\x08' '\x01\x23\x45' '\x80' '\0'
    > >
    > > cdb = ''
    > > cdb += struct.pack('>B', 0x08)
    > > cdb += struct.pack('>I', skip)[-3:]
    > > cdb += struct.pack('>BB', count, 0)

    >
    > The change from [-3:] to [1:] is a minor cosmetic improvement,


    Ouch, [1:] works while sizeof I is 4, yes, but that's not what I meant,
    more generally.

    Something else I tried that doesn't work is:

    skip = 0x12345 ; count = 0x80
    struct.pack('>8b3b21b8b8b', 0x08, 0, skip, count, 0)

    That doesn't work, because in Python struct b means signed char, not
    bit; and a struct repeat count adds fields, rather than making a field
    wider.

    But do you see what I mean? The fields of this struct have 8, 3, 21,
    8, and 8 bits. The "skip" field has 21 bits, the "count" field has 8
    bits, I'd like to vary both of those.

    > why do you want to do that to concise working code???


    cdb0 = '\x08' '\x01\x23\x45' '\x80' '\0'
    works, but it's not parameterised. Writing out the hex literal always
    packs the same x12345 and x80 values into those 21 and 8 bit fields.

    I see I can concisely print and eval the big-endian hex:

    X08Read6 = 0x08
    skip = 0x12345 ; count = 0x80
    hex = '%02X' % X08Read6 + ('%06X' % skip) + ('%02X' % count) + '00'
    ''.join([chr(int(hex[ix:ix+2],0x10)) for ix in range(0,len(hex),2)])

    But that's ugly too. Maybe least ugly so far, if I bury the
    join-chr-int-for-range-len-2 in a def.

    Is there no less ugly way to say pack bits, rather than pack bytes, in
    Python?
    , Nov 9, 2006
    #7
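One way to "pack bits rather than bytes", as asked in post #7, is to shift every field into a single integer and convert that integer to bytes once. The sketch below assumes Python 3 (`int.to_bytes`); `pack_bits` is a hypothetical helper, not part of the `struct` module or of any library mentioned in the thread.

```python
def pack_bits(widths, values):
    """Pack unsigned ints into consecutive big-endian bit fields."""
    if len(widths) != len(values):
        raise ValueError("need exactly one value per field width")
    total_bits = sum(widths)
    if total_bits % 8:
        raise ValueError("field widths must add up to whole bytes")
    acc = 0
    for width, value in zip(widths, values):
        if not 0 <= value < (1 << width):
            raise ValueError("%#x does not fit in %d bits" % (value, width))
        # Make room for the next field, then drop it into the low bits.
        acc = (acc << width) | value
    return acc.to_bytes(total_bits // 8, "big")

# The 8, 3, 21, 8, 8 bit layout described in the post.
cdb = pack_bits((8, 3, 21, 8, 8), (0x08, 0, 0x12345, 0x80, 0))
print(" ".join("%02X" % b for b in cdb))  # prints: 08 01 23 45 80 00
```

Unlike a repeat count in a `struct` format, each width here makes one field wider instead of adding more fields.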
  8. John Machin Guest

    wrote:
    > > Not as concisely as a one-byte struct code

    >
    > Help, what do you mean?


    Help, what did you mean by the question?

    "struct" == "Python struct module"

    Struct module has (concise) codes B, H, I, Q for unsigned integers of
    lengths 1, 2, 4, 8, but does *not* have a code for 3-byte integers.

    >
    > > you presumably... read... the manual ...

    >
    > Did I reread the wrong parts? I see I could define a ctypes.Structure
    > since 2.5, but that would be neither concise, nor since 2.3.


    Looks like you ignored the first word in the sentence ("Not").

    >
    > > when 24-bit machines become ... popular

    >
    > Indeed the struct's defined recently, ~1980, were contorted to make
    > them easy to say in C, which makes them easy to say in Python, e.g.:
    >
    > X28Read10 = 0x28
    > cdb = struct.pack('>BBIBHB', X28Read10, 0, skip, 0, count, 0)
    >
    > But when talking the 1960's lingo I find I am actually resorting to
    > horrors like:
    >
    > X12Inquiry = 0x12
    > xxs = [0] * 6
    > xxs[0] = X12Inquiry
    > xxs[4] = allocationLength
    > rq = ''.join([chr(xx) for xx in xxs])
    >
    > Surely this is wrong? A failure on my part to think in Python?


    It looks wrong (and a few other adjectives), irrespective of what
    problem it is trying to solve. Looks like little-endian 4-byte integer
    followed by 2-byte integer ... what's wrong with struct.pack("<IH",
    X12Inquiry, allocationLength) ????

    Your original question asked about bigendian 3-byte integers; have you
    read the suggested solution that I posted? Does it do what you asked
    (one pack call instead of three)????
    John Machin, Nov 9, 2006
    #8
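The `"<IH"` suggestion in post #8 does reproduce the six bytes built by the list-and-join code, as this sketch shows. It assumes Python 3 and uses 0x24 as an arbitrary example value for `allocationLength`.

```python
import struct

X12Inquiry = 0x12
allocationLength = 0x24  # example value, not taken from the post being answered

# The byte-list approach: six zero bytes with two of them filled in.
xxs = [0] * 6
xxs[0] = X12Inquiry
xxs[4] = allocationLength
by_list = bytes(xxs)

# The struct approach: a little-endian 4-byte int followed by a 2-byte int.
by_struct = struct.pack("<IH", X12Inquiry, allocationLength)

assert by_list == by_struct == b"\x12\x00\x00\x00\x24\x00"
```

The two agree only while both values fit in one byte. The `"<IH"` reading treats bytes 1 to 3 and byte 5 as the high bytes of little-endian integers, which is a coincidence of the layout and not how the command block defines its fields.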
  9. Guest

    > Help, what did you mean by the question?

    How does Python express the idea:

    i) Produce the six bytes '\x08' '\x01\x23\x45' '\x80' '\0' at run-time
    when given the tuple (0x08, 0x12345, 0x80, 0).

    ii) Produce the six bytes '\x12' '\0\0\0' '\x24' '\0' when given the
    tuple (0x12, 0, 0x24, 0).

    iii) And so on.

    So far, everything I write is ugly. Help?

    > Looks like you ignored ...


    I guess you're asking me to leave the mystery of my question alone long
    enough to show more plainly that indeed I am trying to make sense of
    every word of every answer.

    I guess I should do that in separate replies, cc'ed back into this same
    thread. Please stay tuned.

    Thanks in advance, Pat LaVarre
    , Nov 10, 2006
    #9
  10. Guest

    > "struct" == "Python struct module"
    >
    > Struct module has (concise) codes B, H, I, Q for unsigned integers of
    > lengths 1, 2, 4, 8, but does *not* have a code for 3-byte integers.


    I thought that's what the manual meant, but I was unsure, thank you.

    > > > 1. Not as concisely as a one-byte struct code

    >
    > Looks like you ignored the first word in the sentence ("Not").


    I agree I have no confident idea of what your English meant.

    I guess you're hinting at the solution you think I should find obvious,
    without volunteering what that is.

    Yes? If so, then:

    I guess for you "a one-byte struct code" is a 'B' provided as a "format
    character" of the fmt parameter of the struct.pack function.

    Yes? if so, then:

    You recommend shattering the three byte int:

    skip = 0x012345 ; count = 0x80
    struct.pack('>6B', 0x08, skip >> 0x10, skip >> 8, skip, count, 0)

    Except you know that chokes over:

    DeprecationWarning: 'B' format requires 0 <= number <= 255

    So actually you recommend:

    def lossypack(fmt, *args):
        return struct.pack(fmt, *[(arg & 0xFF) for arg in args])
    skip = 0x012345 ; count = 0x80
    lossypack('>6B', 0x08, skip >> 0x10, skip >> 8, skip, count, 0)

    Yes?

    > > I guess you're asking me ...
    > > to show more plainly that indeed I am trying
    > > to make sense of every word of every answer


    Am I helping?
    , Nov 10, 2006
    #10
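A sketch of the byte-by-byte variant from post #10 with the masking written out per argument, so that no value is silently truncated by a blanket `& 0xFF`. It assumes Python 3, where an out-of-range `'B'` argument raises `struct.error` instead of issuing a `DeprecationWarning`. The name `pack_cdb6` is hypothetical.

```python
import struct

def pack_cdb6(opcode, skip, count, control=0):
    """Pack a 6-byte block, splitting the 3-byte skip into single bytes."""
    if not 0 <= skip <= 0xFFFFFF:
        raise ValueError("skip must fit in 24 bits")
    return struct.pack(
        ">6B",
        opcode,
        (skip >> 16) & 0xFF,  # most significant byte
        (skip >> 8) & 0xFF,   # middle byte
        skip & 0xFF,          # least significant byte
        count,
        control,
    )

cdb = pack_cdb6(0x08, 0x012345, 0x80)

# Without the masks, struct refuses values that do not fit in one byte.
try:
    struct.pack(">B", 0x012345)
    unmasked_rejected = False
except struct.error:
    unmasked_rejected = True
```

Masking only the shifted pieces keeps the range errors for `opcode`, `count`, and `control`, which a wrapper that masks every argument would hide.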
  11. Guest

    > > when talking the 1960's lingo
    > > ...
    > > X12Inquiry = 0x12
    > > xxs = [0] * 6
    > > xxs[0] = X12Inquiry
    > > xxs[4] = allocationLength
    > > rq = ''.join([chr(xx) for xx in xxs])

    >
    > It looks wrong (and a few other adjectives),


    Ah, we agree, thank you for saying.

    > Looks like little-endian 4-byte integer
    > followed by 2-byte integer ... what's wrong with struct.pack("<IH",
    > X12Inquiry, allocationLength) ????


    Pack '<IH' doesn't match how the code that I'm refactoring thinks about
    these things.

    The people who wrote this stuff forty years ago were thinking of bit
    fields - here bit lengths of 8 then 3 then 21 then 8 then 8 bits -
    cheating only when the bit boundaries happened to hit byte boundaries.

    Yes, as you describe in this example, I could cheat when the boundaries
    happen to hit H or I boundaries as well, but then I'm still left coping
    with the cases where the fields split on byte boundaries that are not H
    or I boundaries, such as the example:

    > > skip = 0x123456; count = 0x80
    > > hi, lo = divmod(skip, 0x10000)

    >
    > Does it do what you asked (one pack call instead of three)????


    One pack call, not three, yes.

    Shatters the 3 byte int into 1 and 2 bytes by divmod of (0xFFFF + 1),
    yes.

    > > I guess you're asking me ...
    > > to show more plainly that indeed I am trying
    > > to make sense of every word of every answer


    Am I helping?
    , Nov 10, 2006
    #11
  12. Guest

    Speaking as the OP, perhaps I should mention:

    > > [-3:] to [1:] is a minor cosmetic improvement


    To my eye, that's Not an improvement.

    '\x08' '\x01\x23\x45' '\x80' '\0' is the correct pack of (0x08,
    0x12345, 0x80, 0) because '\x01\x23\x45' are the significant low three
    bytes of a big-endian x12345, thus [-3:].

    The [1:] fact that we can keep the 3 significant bytes by tossing
    exactly 1 byte away after rounding the bit length of that digital
    number up to the nearest power of two which happens to be 4 = 3 + 1 is
    merely incidental - not of central significance.
    , Nov 10, 2006
    #12
  13. John Machin Guest

    wrote:
    > > "struct" == "Python struct module"
    > >
    > > Struct module has (concise) codes B, H, I, Q for unsigned integers of
    > > lengths 1, 2, 4, 8, but does *not* have a code for 3-byte integers.

    >
    > I thought that's what the manual meant, but I was unsure, thank you.


    If it doesn't have a code for 3-byte integers in the table of codes, it
    doesn't have one. What's to be unsure about??

    >
    > > > > 1. Not as concisely as a one-byte struct code

    > >
    > > Looks like you ignored the first word in the sentence ("Not").

    >
    > I agree I have no confident idea of what your English meant.


    "Not" is rather unambiguous.

    >
    > I guess you're hinting at the solution you think I should find obvious,
    > without volunteering what that is.


    I did volunteer what it is -- try reading the message again. You may
    need to use the PageDown key :)

    >
    > Yes? If so, then:


    No, not at all. Stop guessing. I ask again: have you read the solution
    that I gave???
    Here it is again:

    """
    You could try throwing the superfluous bits away before packing instead
    of after:

    | >>> from struct import pack
    | >>> skip = 0x123456; count = 0x80
    | >>> hi, lo = divmod(skip, 0x10000)
    | >>> cdb = pack(">BBHBB", 0x08, hi, lo, count, 0)
    | >>> ' '.join(["%02X" % ord(x) for x in cdb])
    | '08 12 34 56 80 00'
    """


    >
    > I guess for you "a one-byte struct code" is a 'B' provided as a "format
    > character" of the fmt parameter of the struct.pack function.
    >
    > Yes?


    Yes -- but I thought we were already over that.


    > if so, then:


    This does not follow.

    >
    > You recommend shattering the three byte int:


    Yes, but not like that. See above.

    >
    > skip = 0x012345 ; count = 0x80
    > struct.pack('>6B', 0x08, skip >> 0x10, skip >> 8, skip, count, 0)
    >
    > Except you know that chokes over:
    >
    > DeprecationWarning: 'B' format requires 0 <= number <= 255


    That is not "choking" -- that is barfing; it is telling you that you
    have done something silly.

    >
    > So actually you recommend:
    >
    > def lossypack(fmt, *args):
    >     return struct.pack(fmt, *[(arg & 0xFF) for arg in args])
    > skip = 0x012345 ; count = 0x80
    > lossypack('>6B', 0x08, skip >> 0x10, skip >> 8, skip, count, 0)
    >
    > Yes?


    No, never, not in a pink fit.

    >
    > > > I guess you're asking me ...
    > > > to show more plainly that indeed I am trying
    > > > to make sense of every word of every answer


    You guess wrongly.

    >
    > Am I helping?


    No.
    John Machin, Nov 10, 2006
    #13
  14. Guest

    Perhaps Python can't concisely say three-byte int ...

    But Python can say six-nybble hex:

    >>> import binascii
    >>> cdb = binascii.unhexlify('%02X%06X%02X%02X' % (0x08, 0x12345, 0x80, 0))
    >>> binascii.hexlify(cdb)

    '080123458000'
    >>>


    Thanks again for patiently helping me find this. A shortcut is:

    http://docs.python.org/lib/genindex.html
    search: hex
    , Nov 10, 2006
    #14
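The hex-string approach from post #14 carries over to Python 3 with small changes. The sketch below generalises it to a list of (value, byte width) pairs and adds a range check, because `%06X` pads short values but does not truncate long ones, so an oversized value would otherwise shift every later field. `hex_pack` is a hypothetical helper name.

```python
import binascii

def hex_pack(fields):
    """Pack (value, byte_width) pairs as big-endian bytes via a hex string."""
    digits = []
    for value, width in fields:
        if not 0 <= value < (1 << (8 * width)):
            raise ValueError("%#x does not fit in %d bytes" % (value, width))
        # "%0*X" takes the zero-padded digit count from the argument tuple.
        digits.append("%0*X" % (2 * width, value))
    return binascii.unhexlify("".join(digits))

cdb = hex_pack([(0x08, 1), (0x12345, 3), (0x80, 1), (0, 1)])
print(binascii.hexlify(cdb))  # prints: b'080123458000'
```

In Python 3, `hexlify` returns `bytes`, hence the `b'...'` prefix in the printed result.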
  15. John Machin Guest

    wrote:


    >
    > Pack '<IH' doesn't match how the code that I'm refactoring thinks about
    > these things.
    >
    > The people who wrote this stuff forty years ago were thinking of bit
    > fields - here bit lengths of 8 then 3 then 21 then 8 then 8 bits -
    > cheating only when the bit boundaries happened to hit byte boundaries.
    >
    > Yes, as you describe in this example, I could cheat when the boundaries
    > happen to hit H or I boundaries as well, but then I'm still left coping
    > with the cases where the fields split on byte boundaries that are not H
    > or I boundaries, such as the example:
    >
    > Am I helping?


    Yes, you have *finally* said unambiguously what your problem really is
    -- field lengths not a multiple of 8 bits. I suggest that you start a
    new thread, write it out logically and ask for assistance. You should
    get some sensible answers. I will apologise in advance for not
    participating; I'm exhausted.

    Cheers,
    John
    John Machin, Nov 10, 2006
    #15
  16. John Machin Guest

    wrote:
    > Speaking as the OP, perhaps I should mention:
    >
    > > > [-3:] to [1:] is a minor cosmetic improvement

    >
    > To my eye, that's Not an improvement.
    >
    > '\x08' '\x01\x23\x45' '\x80' '\0' is the correct pack of (0x08,
    > 0x12345, 0x80, 0) because '\x01\x23\x45' are the significant low three
    > bytes of a big-endian x12345, thus [-3:].
    >
    > The [1:] fact that we can keep the 3 significant bytes by tossing
    > exactly 1 byte away after rounding the bit length of that digital
    > number up to the nearest power of two which happens to be 4 = 3 + 1 is
    > merely incidental - not of central significance.


    I said *cosmetic* improvement and also said "obscures the underlying"
    need-to-know that 4 - 3 == 1 (which I didn't contemplate needing 2
    paragraphs of laborious explanation).
    John Machin, Nov 10, 2006
    #16
  17. Gabriel Genellina Guest

    At Thursday 9/11/2006 22:24, wrote:

    >Perhaps Python can't concisely say three-byte int ...
    >
    >But Python can say six-nybble hex:
    >
    > >>> import binascii
    > >>> cdb = binascii.unhexlify('%02X%06X%02X%02X' % (0x08, 0x12345, 0x80, 0))
    > >>> binascii.hexlify(cdb)

    >'080123458000'


    The only problem I can see is that this code is endianness-dependent;
    the suggested versions using pack(">...") are not. But this may not be
    of concern to you.


    --
    Gabriel Genellina
    Softlab SRL

    Gabriel Genellina, Nov 10, 2006
    #17
  18. Guest

    > > ... Python can say six-nybble hex:
    > >
    > > >>> import binascii
    > > >>> cdb = binascii.unhexlify('%02X%06X%02X%02X' % (0x08, 0x12345, 0x80, 0))
    > > >>> binascii.hexlify(cdb)

    > >'080123458000'

    >
    > The only problem I can see is that this code is endianness-dependent;
    > the suggested versions using pack(">...") not. But this may not be of
    > concern to you.


    Thanks for cautioning us. I suspect we agree:

    i) pack('>...') can't say three byte int.
    ii) binascii.hexlify evals bytes in the order printed.
    iii) %X prints the bytes of an int in big-endian order.
    iv) struct.unpack '>' of struct.pack '<' flips the bytes of an int
    v) struct.unpack '<' of struct.pack '>' flips the bytes of an int
    vi) [::-1] flips a string of bytes.

    In practice, all my lil-endian structs live by the C/Python-struct-pack
    law requiring the byte size of a field to be a power of two, so I can
    use Python-struct-pack to express them concisely. Only my big-endian
    structs are old enough to violate that recently (since ~1972)
    popularised convention, so only those do I construct with
    binascii.unhexlify.

    Often I wrap a big-endian struct in a lil-endian struct, but I'm ok
    calling hexlify to make the big-endian struct and then catenating it
    into the lil-endian struct, e.g., in the following cdb is big-endian
    inside cbwBytes:

    cbwBytes = struct.pack('<IIIBBB',
    cbw.dSignature, cbw.dTag, cbw.dDataTransferLength,
    cbw.bmFlags, cbw.bLun, cbw.bCbLength,
    ) + cdb[0:cdbLength] + ('\0' * (0x10 - cdbLength))
    , Nov 10, 2006
    #18
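Points iii) to vi) in post #18 can be checked with a few assertions. This is a sketch assuming Python 3, using 0x12345678 as an arbitrary example value.

```python
import struct

value = 0x12345678

big = struct.pack(">I", value)
little = struct.pack("<I", value)

# iii) %X prints the most significant digits first, matching big-endian order.
assert "%08X" % value == "12345678"
assert big == b"\x12\x34\x56\x78"

# vi) Reversing the byte string converts one byte order into the other.
assert little == big[::-1]

# iv) and v) Packing in one order and unpacking in the other swaps the bytes.
flipped_a = struct.unpack(">I", little)[0]
flipped_b = struct.unpack("<I", big)[0]
assert flipped_a == flipped_b == 0x78563412
```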
  19. Gabriel Genellina Guest

    At Friday 10/11/2006 00:08, wrote:

    > > > >>> import binascii
    > > > >>> cdb = binascii.unhexlify('%02X%06X%02X%02X' % (0x08, 0x12345, 0x80, 0))
    > > > >>> binascii.hexlify(cdb)
    > > >'080123458000'

    > >
    > > The only problem I can see is that this code is endianness-dependent;
    > > the suggested versions using pack(">...") not. But this may not be of
    > > concern to you.

    >
    >Thanks for cautioning us. I suspect we agree:
    >
    >i) pack('>...') can't say three byte int.
    >ii) binascii.hexlify evals bytes in the order printed.
    >iii) %X prints the bytes of an int in big-endian order.
    >iv) struct.unpack '>' of struct.pack '<' flips the bytes of an int
    >v) struct.unpack '<' of struct.pack '>' flips the bytes of an int
    >vi) [::-1] flips a string of bytes.


    Yes to all.

    >In practice, all my lil-endian structs live by the C/Python-struct-pack
    >law requiring the byte size of a field to be a power of two, so I can
    >use Python-struct-pack to express them concisely. Only my big-endian
    >structs are old enough to violate that recently (since ~1972)
    >popularised convention, so only those do I construct with
    >binascii.unhexlify.


    So you would have no problems. I stand corrected: the code above will
    always generate big-endian numbers.


    --
    Gabriel Genellina
    Softlab SRL

    Gabriel Genellina, Nov 10, 2006
    #19
