Convert binary char array to integer with reordering

Discussion in 'C Programming' started by Dave, Oct 25, 2005.

  1. Dave

    Dave Guest

    Hi all,

    I have a 4 byte char array with the binary data for two 16-bit signed
    integers in it like this:

    Index 3 2 1 0
    Data Bh Bl Ah Al

    Where Bh is the high byte of signed 16-bit integer B and so on.

    I want to create 32-bit integers A and B with the data in the char
    array.

    I have tried things like (and various other permutations):

    A = (data[1] << 8) | (unsigned int)data[0];
    B = (data[3] << 8) | (unsigned int)data[2];

    and this works except for when say data[1] = 0x00 and data[0] = 0x80.
    In this case, Al gets sign extended all the way to the top of A giving
    0xffffff80 which is wrong of course.

    I read about the arithmetic conversions and I believe it is these that
    are converting the right operand to signed and causing the sign
    extension.
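
The effect described above can be reproduced in isolation. The following is a minimal sketch, not code from the post: the helper names are hypothetical, and it uses `signed char` explicitly because whether plain `char` is signed is implementation-defined. It assumes 8-bit bytes.

```c
#include <limits.h>

/* Hypothetical helpers, introduced only for illustration. */

/* Converts a signed char straight to unsigned int. A negative value is
 * reduced modulo UINT_MAX + 1, so every upper bit ends up set. */
unsigned int widen_direct(signed char c)
{
    return (unsigned int)c;
}

/* Goes through unsigned char first, so only the low 8 bits survive
 * (assuming CHAR_BIT == 8). */
unsigned int widen_via_uchar(signed char c)
{
    return (unsigned int)(unsigned char)c;
}
```

With a 32-bit `unsigned int`, `widen_direct` applied to the byte 0x80 yields 0xffffff80, while `widen_via_uchar` yields 0x80.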

    At the moment, I am getting things right like this:

    int A;
    int B;
    char buildA[4], buildB[4], data[4];

    // data[] gets filled here

    buildA[0] = data[0];
    buildA[1] = data[1];

    if (data[1] >> 7)
    {
    buildA[2] = (char)0xff;
    buildA[3] = (char)0xff;
    }
    else
    {
    buildA[2] = 0x00;
    buildA[3] = 0x00;
    }

    buildB[0] = data[2];
    buildB[1] = data[3];

    if (data[3] >> 7)
    {
    buildB[2] = (char)0xff;
    buildB[3] = (char)0xff;
    }
    else
    {
    buildB[2] = 0x00;
    buildB[3] = 0x00;
    }

    A = *((int*)buildA);
    B = *((int*)buildB);

    Surely there is a cleaner way?

    Many thanks for your time,

    Dave
     
    Dave, Oct 25, 2005
    #1

  2. In article <>,
    Dave <> wrote:
    >Hi all,
    >
    >I have a 4 byte char array with the binary data for two 16-bit signed
    >integers in it like this:
    >
    >Index 3 2 1 0
    >Data Bh Bl Ah Al
    >
    >Where Bh is the high byte of signed 16-bit integer B and so on.


    Not portable. Can't discuss it here. Blah, blah, blah.
     
    Kenny McCormack, Oct 25, 2005
    #2

  3. Dave

    Dick de Boer Guest

    If you convert a signed char to an unsigned int, the result is
    sign-extended, because the char is signed. Try:
    A = (data[1] << 8) | (unsigned char)data[0];
    B = (data[3] << 8) | (unsigned char)data[2];
    (Or make the array of type unsigned char)

    Now, the unsigned char is default promoted to signed int, and the signed int
    should have the same value as the unsigned char...
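
A minimal sketch of the same idea as a function, under stated assumptions: 8-bit bytes, and a 16-bit two's complement value stored in the array. The name `combine16` is hypothetical. It is a variation rather than the exact expression above: it multiplies by 256 instead of shifting, because left-shifting a negative value is undefined behaviour in C.

```c
/* Hypothetical helper: hi is the high byte, kept signed so that its sign
 * carries into the result; lo is the low byte, forced to 0..255. */
int combine16(signed char hi, signed char lo)
{
    /* hi * 256 lies in -32768..32512 and the low byte adds 0..255,
     * so the sum always fits in an int. */
    return hi * 256 + (unsigned char)lo;
}
```

For the failing case in the question (high byte 0x00, low byte 0x80) this returns 128 rather than a negative number.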

    DickB
     
    Dick de Boer, Oct 25, 2005
    #3
  4. Dave

    Jordan Abel Guest

    On 2005-10-25, Kenny McCormack <> wrote:
    > In article <>,
    > Dave <> wrote:
    >>Hi all,
    >>
    >>I have a 4 byte char array with the binary data for two 16-bit signed
    >>integers in it like this:
    >>
    >>Index 3 2 1 0
    >>Data Bh Bl Ah Al
    >>
    >>Where Bh is the high byte of signed 16-bit integer B and so on.

    >
    > Not portable. Can't discuss it here. Blah, blah, blah.


    says who?

    int16_t A = (data[1]<<8)+data[0];
    int16_t B = (data[3]<<8)+data[2];

    looks portable to me. chars have to be at least 8 bits [to represent values
    from -127 to 127 signed, 0 to 255 unsigned, they have to be], and int16_t where
    present is exactly 16 bits and signed. Now ideally you should be using unsigned
    char for this, but I don't think it actually matters in this case [well, I
    suppose negative zero could still be a trap representation on non twos
    complement systems].
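
One detail worth checking in expressions of this shape: in C, `+` binds more tightly than `<<`, so the parentheses around the shift matter. The sketch below uses hypothetical helper names and unsigned operands purely to show the two parses side by side.

```c
/* Hypothetical helpers illustrating operator precedence only. */

/* What is usually intended: shift the high byte, then add the low byte. */
unsigned join_bytes(unsigned hi, unsigned lo)
{
    return (hi << 8) + lo;
}

/* How `hi << 8 + lo` is parsed when written without parentheses:
 * the addition happens first, so the shift count becomes 8 + lo. */
unsigned join_bytes_unparenthesized(unsigned hi, unsigned lo)
{
    return hi << (8 + lo);
}
```

With hi = 1 and lo = 2, the first form gives 258 and the second gives 1024.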

    Now, the precise _meaning_ of those bytes with regards to the integer value you
    end up with may differ in that negative numbers [in this case, where high bit
    of byte 1 or 3 is set] can be represented in precisely three different ways
    according to the standard, but assuming that he got the byte values in a
    portable way in the first place and is using them on the same machine where he
    generated them, he can put them back the same way he got them out, and assuming
    he used a type guaranteed to be exactly 16 bits (say, c99 int16_t, <stdint.h>)
    he loses no information in doing so.

    While this exercise may seem pointless, it could be intended as a method of
    serialization [in which case, though, he may wish to guarantee a particular
    signed representation of his values as well as a byte order]

    I can't imagine what (other than the possibility that char may be signed) is
    non-portable about this? Was it the use of names that resemble (but aren't
    actually the same as) x86 registers that made you think "non-portable!"?

    --
    How's that for my first post?
     
    Jordan Abel, Oct 25, 2005
    #4
  5. Dave

    Dave Guest

    Hi Dick,

    I used the code you quoted above (left data as char) and it worked
    perfectly.

    I think data needs to stay char so that the sign extension does happen
    with data[1] and [3] as required.

    Many thanks,

    Dave
     
    Dave, Oct 25, 2005
    #5
  6. Dave

    Richard Bos Guest

    Jordan Abel <> wrote:

    > On 2005-10-25, Kenny McCormack <> wrote:
    > > Not portable. Can't discuss it here. Blah, blah, blah.

    >
    > says who?


    Says a loser with a chip on his shoulder who still hasn't got over being
    told, once, that he himself posted something off-topic. Just ignore him
    when he's in this mode.

    Richard
     
    Richard Bos, Oct 25, 2005
    #6
  7. In article <>,
    Jordan Abel <> wrote:
    >int16_t A = (data[1]<<8)+data[0];
    >int16_t B = (data[3]<<8)+data[2];


    >looks portable to me.


    int16_t does not exist in C89, and in C99 it is optional.
    It simply doesn't exist on C99 systems that have (say) 18 bit ints.
    That makes it standardized but not portable.
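
Because `<stdint.h>` defines `INT16_MAX` only when `int16_t` itself is provided, the optional type can be detected at compile time. The following is a sketch of one way to fall back to `int_least16_t`; the names `sample16`, `SAMPLE16_IS_EXACT` and `sample16_storage_bits` are hypothetical and not from the thread.

```c
#include <limits.h>
#include <stdint.h>

#ifdef INT16_MAX
typedef int16_t sample16;        /* exactly 16 bits, no padding */
#define SAMPLE16_IS_EXACT 1
#else
typedef int_least16_t sample16;  /* at least 16 bits, required by C99 */
#define SAMPLE16_IS_EXACT 0
#endif

/* Storage size of the chosen type in bits, including any padding bits. */
unsigned sample16_storage_bits(void)
{
    return (unsigned)(sizeof(sample16) * CHAR_BIT);
}
```

Code that relies on exactly 16 bits can then test `SAMPLE16_IS_EXACT` instead of failing to compile on implementations without `int16_t`.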

    --
    Chocolate is "more than a food but less than a drug" -- RJ Huxtable
     
    Walter Roberson, Oct 25, 2005
    #7
  8. Dave

    Richard Bos Guest

    "Dave" <> wrote:

    > I want to create 32-bit integers A and B with the data in the char
    > array.
    >
    > I have tried things like (and various other permutations):
    >
    > A = (data[1] << 8) | (unsigned int)data[0];
    > B = (data[3] << 8) | (unsigned int)data[2];
    >
    > and this works except for when say data[1] = 0x00 and data[0] = 0x80.
    > In this case, Al gets sign extended all the way to the top of A giving
    > 0xffffff80 which is wrong of course.


    > char buildA[4], buildB[4], data[4];


    An alternative to the other solutions, perhaps safer because you have
    less chance of running into signed integer overflow, is to make data
    (and because of this, also buildA and buildB) arrays of unsigned int
    instead. They won't get sign-extended then simply because they won't
    have any sign.
    Note also that, since A and B are signed ints, you do not know for
    certain that they are 32 bits - use long or int32_t (or even
    int_least32_t, which is guaranteed to exist under C99) to get around
    this. What's worse, if you ever get an array that represents a value
    that doesn't fit in 31 bits - that is, 32 minus the sign bit - you cause
    undefined behaviour. Again, an unsigned type (uint_least32_t?) could be
    a good solution.
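
A minimal sketch of this approach, assuming the layout from the original post (low byte first, 16-bit two's complement values) and using only the low 8 bits of each element. The function name `decode_s16le` is hypothetical. The bit pattern is assembled in an unsigned type and the sign is applied by arithmetic, so nothing depends on the width or sign representation of the host's `int`.

```c
#include <stdint.h>

/* Hypothetical helper: p[0] is the low byte, p[1] the high byte. */
int_least32_t decode_s16le(const unsigned char *p)
{
    /* Assemble the 16-bit pattern in an unsigned type: no sign
     * extension and no signed overflow can occur here. */
    uint_least32_t u = ((uint_least32_t)(p[1] & 0xFFu) << 8) | (p[0] & 0xFFu);

    /* Patterns with the top bit set stand for negative values;
     * map them by subtraction instead of reinterpreting bits. */
    if (u & 0x8000u)
        return (int_least32_t)u - 65536L;
    return (int_least32_t)u;
}
```

For the array in the question, A would be `decode_s16le(data)` and B would be `decode_s16le(data + 2)`, with `data` declared as `unsigned char`.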

    Richard
     
    Richard Bos, Oct 25, 2005
    #8
  9. In article <djljr4$cuq$>,
    Walter Roberson <-cnrc.gc.ca> wrote:
    >In article <>,
    >Jordan Abel <> wrote:
    >>int16_t A = (data[1]<<8)+data[0];
    >>int16_t B = (data[3]<<8)+data[2];

    >
    >>looks portable to me.

    >
    >int16_t does not exist in C89, and in C99 it is optional.
    >It simply doesn't exist on C99 systems that have (say) 18 bit ints.
    >That makes it standardized but not portable.


    Exactly.
     
    Kenny McCormack, Oct 25, 2005
    #9
  10. Dave

    Default User Guest

    Dick de Boer wrote:

    > If you convert a signed char to an unsigned int, the result is
    > sign-extended, because the char is signed. Try: A = (data[1] << 8) |
    > (unsigned char)data[0]; B = (data[3] << 8) | (unsigned char)data[2];
    > (Or make the array of type unsigned char)
    >
    > Now, the unsigned char is default promoted to signed int, and the
    > signed int should have the same value as the unsigned char...
    >
    > DickB
    >
    > "Dave" <> wrote in message
    > news:...
    > > Hi all,




    Please don't top-post. Your replies belong following or (preferably)
    interspersed with properly trimmed quotes.



    Brian
     
    Default User, Oct 25, 2005
    #10
  11. Dave

    Jordan Abel Guest

    On 2005-10-25, Kenny McCormack <> wrote:
    > In article <djljr4$cuq$>,
    > Walter Roberson <-cnrc.gc.ca> wrote:
    >>In article <>,
    >>Jordan Abel <> wrote:
    >>>int16_t A = (data[1]<<8)+data[0];
    >>>int16_t B = (data[3]<<8)+data[2];

    >>
    >>>looks portable to me.

    >>
    >>int16_t does not exist in C89, and in C99 it is optional.
    >>It simply doesn't exist on C99 systems that have (say) 18 bit ints.
    >>That makes it standardized but not portable.

    >
    > Exactly.


    which i call "portable enough" - guaranteed not to appear to work on systems
    where it won't work.

    but, if you must, here's the ruthlessly portable version:

    assumptions: unsigned char data[2] contains the low 8 bits (data[0]) and the
    high 8 bits (data[1]) of a signed 16-bit integer with the same sign
    representation as the host, regardless of the actual byte size or integer
    width of the host.

    int x =
    ((data[1] & 0177U) << 8) /* non-sign-bit portion of high 'byte' */
    | data[0] /* low 'byte' */
    | ((~0U<<15)&((int)(signed char)( (data[1] & 0200U) << (CHAR_BIT-8) )))
    /* that monster uses the system's sign extension to put the sign bit in the
    * right place, and properly extend it on 2s-comp or 1s-comp systems. */
    ;

    There may be superfluous parentheses - I never could remember the order of
    shifting vs bitwise and, and i'm irrationally uncomfortable with the cast
    operator

    on a signed-magnitude 36-bit system with nine-bit bytes, this should convert
    the bytes: 010011010 011001001

    to the value
    100000000000000000000001101011001001

    or -6857 decimal.

    I didn't bother with systems with a char of less than 8 bits since that's not
    allowed by the standard.
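
The worked example above can be checked by value rather than by bit pattern. This sketch assumes the two bytes encode a 16-bit sign-and-magnitude number and uses a hypothetical helper name, `decode_sm16`. It is not the expression from the post: it computes the magnitude and applies the sign arithmetically, so it does not depend on the host's own representation.

```c
/* Hypothetical helper: hi and lo are the high and low 8-bit groups
 * of a 16-bit sign-and-magnitude value. */
long decode_sm16(unsigned hi, unsigned lo)
{
    /* The low 7 bits of hi and all 8 bits of lo form the magnitude. */
    long magnitude = (long)(((hi & 0x7Fu) << 8) | (lo & 0xFFu));

    /* Bit 7 of hi is the sign bit. */
    return (hi & 0x80u) ? -magnitude : magnitude;
}
```

The low 8 bits of the two nine-bit bytes in the example are 0x9A and 0xC9, and `decode_sm16(0x9A, 0xC9)` gives -6857, matching the figure quoted.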
     
    Jordan Abel, Oct 25, 2005
    #11
  12. Dave

    Jordan Abel Guest

    On 2005-10-25, Jordan Abel <> wrote:
    > int x =
    > ((data[1] & 0177U) << 8) /* non-sign-bit portion of high 'byte' */
    > | data[0] /* low 'byte' */
    > | ((~0U<<15)&((int)(signed char)( (data[1] & 0200U) << (CHAR_BIT-8) )))
    > /* that monster uses the system's sign extension to put the sign bit in the
    > * right place, and properly extend it on 2s-comp or 1s-comp systems. */

    /* oops - forgot */
    | (~0<-1?(data[1]&0200U)?(~0)<<15:0:0)
    /* can anyone guess what that one does? */
    > ;
     
    Jordan Abel, Oct 25, 2005
    #12
  13. Dave

    Jordan Abel Guest

    On 2005-10-25, Jordan Abel <> wrote:
    > int x =
    > ((data[1] & 0177U) << 8) /* non-sign-bit portion of high 'byte' */
    > | data[0] /* low 'byte' */
    > | ((~0U<<15)&((int)(signed char)( (data[1] & 0200U) << (CHAR_BIT-8) )))
    > /* that monster uses the system's sign extension to put the sign bit in the
    > * right place, and properly extend it on 2s-comp or 1s-comp systems. */

    /* oops - forgot */
    | (~0>=-1?(data[1]&0200U)?(~0)<<15:0:0)
    /* can anyone guess what that one does? */
    > ;
     
    Jordan Abel, Oct 25, 2005
    #13
  14. Jordan Abel said:

    > On 2005-10-25, Jordan Abel <> wrote:
    >> int x =
    >> ((data[1] & 0177U) << 8) /* non-sign-bit portion of high 'byte' */
    >> | data[0] /* low 'byte' */
    >> | ((~0U<<15)&((int)(signed char)( (data[1] & 0200U) << (CHAR_BIT-8) )))
    >> /* that monster uses the system's sign extension to put the sign bit
    >> in the
    >> * right place, and properly extend it on 2s-comp or 1s-comp systems.
    >> */

    > /* oops - forgot */
    > | (~0>=-1?(data[1]&0200U)?(~0)<<15:0:0)
    > /* can anyone guess what that one does? */
    >> ;


    Gives a possible trap representation on ones' comp systems? :)

    --
    Richard Heathfield
    "Usenet is a strange place" - dmr 29/7/2005
    http://www.cpax.org.uk
    email: rjh at above domain (but drop the www, obviously)
     
    Richard Heathfield, Oct 25, 2005
    #14
  15. In article <djlqrn$oq7$-infra.bt.com>,
    Richard Heathfield <> wrote:
    >Jordan Abel said:
    >
    >> On 2005-10-25, Jordan Abel <> wrote:
    >>> int x =
    >>> ((data[1] & 0177U) << 8) /* non-sign-bit portion of high 'byte' */
    >>> | data[0] /* low 'byte' */
    >>> | ((~0U<<15)&((int)(signed char)( (data[1] & 0200U) << (CHAR_BIT-8) )))
    >>> /* that monster uses the system's sign extension to put the sign bit
    >>> in the
    >>> * right place, and properly extend it on 2s-comp or 1s-comp systems.
    >>> */

    >> /* oops - forgot */
    >> | (~0>=-1?(data[1]&0200U)?(~0)<<15:0:0)
    >> /* can anyone guess what that one does? */
    >>> ;

    >
    >Gives a possible trap representation on ones' comp systems? :)


    Exactly.
     
    Kenny McCormack, Oct 25, 2005
    #15
  16. Dave

    Jordan Abel Guest

    On 2005-10-25, Richard Heathfield <> wrote:
    > Jordan Abel said:
    >
    >> On 2005-10-25, Jordan Abel <> wrote:
    >>> int x =
    >>> ((data[1] & 0177U) << 8) /* non-sign-bit portion of high 'byte' */
    >>> | data[0] /* low 'byte' */
    >>> | ((~0U<<15)&((int)(signed char)( (data[1] & 0200U) << (CHAR_BIT-8) )))
    >>> /* that monster uses the system's sign extension to put the sign bit
    >>> in the
    >>> * right place, and properly extend it on 2s-comp or 1s-comp systems.
    >>> */

    >> /* oops - forgot */
    >> | (~0>=-1?(data[1]&0200U)?(~0)<<15:0:0)
    >> /* can anyone guess what that one does? */
    >>> ;

    >
    > Gives a possible trap representation on ones' comp systems? :)


    ~1>=-2, as i just _told_ you on ##c that i'd modify that to if challenged on
    this basis ;)
     
    Jordan Abel, Oct 25, 2005
    #16
  17. Dave

    Dick de Boer Guest

    "Default User" <> wrote in message
    news:...
    > Dick de Boer wrote:
    >
    > <cut top-post.
    >
    > Please don't top-post. Your replies belong following or (preferably)
    > interspersed with properly trimmed quotes.
    >

    Sorry, slip of my finger (mind)

    Dick
     
    Dick de Boer, Oct 26, 2005
    #17