Joining 2 char to 1 short

Discussion in 'C Programming' started by Steffen Loringer, Jun 30, 2006.

  1. Hi,

    I'm using the following function to join 2 char (byte) into one short on
    a 32 bit X86 platform:

    unsigned short joinUnsigShort(unsigned char a,unsigned char b)
    {
    unsigned short val = 0;
    val = a;
    val <<= 8;
    val |= b;
    return val;
    }

    Will this also work if compiled on a PowerPC? Are there better ways to
    do it?

    Thanks a lot!

    Steve
    Steffen Loringer, Jun 30, 2006
    #1

  2. Steffen Loringer

    Richard Bos Guest

    Steffen Loringer <> wrote:

    > I'm using the following function to join 2 char (byte) into one short on
    > a 32 bit X86 platform:
    >
    > unsigned short joinUnsigShort(unsigned char a,unsigned char b)
    > {
    > unsigned short val = 0;
    > val = a;
    > val <<= 8;
    > val |= b;
    > return val;
    > }
    >
    > Will this also work if compiled on a PowerPC?


    Depends on what you want. Of course, it _does_ assume that CHAR_BIT is 8
    and sizeof (short) is at least 2. Both of those are very common; neither
    is guaranteed. It is possible to encounter devices where CHAR_BIT is 32,
    and sizeof (char) == sizeof (short) == sizeof (int) == 1.

    However, since an unsigned short must be able to hold at least 2**16-1,
    and therefore be at least 16 bits wide, CHAR_BIT being exactly 8 already
    implies sizeof (short) being at least 2. (The implication doesn't hold
    the other way; for example, a system where sizeof (short) is 2 but
    CHAR_BIT is 9 is quite legal: a 36-bit word, with char a quarter word
    and short half of one.)

    OTOH, the assumption that CHAR_BIT is 8 can be removed by the
    marvelously exotic expedient of replacing 8 by CHAR_BIT. The assumption
    that sizeof (short) >= 2 is harder to get rid of.
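
    A minimal sketch of how these assumptions could be checked at compile
    time instead of only being noted in the documentation. The function name
    join_checked and the #error messages are hypothetical, and the sketch
    deliberately targets only platforms with 8-bit bytes.

```c
#include <limits.h>

/* Assumption: this code is only meant for platforms with 8-bit bytes. */
#if CHAR_BIT != 8
#error "this code assumes CHAR_BIT == 8"
#endif

/* USHRT_MAX is at least 65535 on every conforming implementation,
   so with CHAR_BIT == 8 an unsigned short has room for two bytes. */
#if USHRT_MAX < 0xFFFF
#error "unsigned short is narrower than 16 bits"
#endif

/* Hypothetical name: a goes in the high-order byte, b in the low-order byte. */
unsigned short join_checked(unsigned char a, unsigned char b)
{
    /* Convert to unsigned int before shifting to avoid a signed promotion. */
    return (unsigned short)(((unsigned int)a << 8) | b);
}
```

    With this in place, join_checked(0x12, 0x34) yields 0x1234, and a
    platform that violates either assumption fails to compile instead of
    silently producing a different value.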

    > Are there better ways to do it?


    Yes; provided you are willing to put a note in the documentation that
    the code assumes that sizeof (short) >= 2, your entire function can be
    replaced by

    #include <limits.h> /* for CHAR_BIT */

    unsigned short joinUnsigShort(unsigned char a,unsigned char b)
    {
    return (a<<CHAR_BIT) + b;
    }

    Richard
    Richard Bos, Jun 30, 2006
    #2


  3. >> Are there better ways to do it?

    >
    > Yes; provided you are willing to put a note in the documentation that
    > the code assumes that sizeof (short) >= 2, your entire function can be
    > replaced by
    >
    > unsigned short joinUnsigShort(unsigned char a,unsigned char b)
    > {
    > return (a<<CHAR_BIT) + b;
    > }
    >
    > Richard


    So I assume bit-shifting in this function is independent of
    big/little endian systems!? Does the compiler take care of correct
    shifting in both cases?
    Steffen Loringer, Jun 30, 2006
    #3
  4. Steffen Loringer

    Chris Dollin Guest

    Steffen Loringer wrote:

    >>> Are there better ways to do it?

    >>
    >> Yes; provided you are willing to put a note in the documentation that
    >> the code assumes that sizeof (short) >= 2, your entire function can be
    >> replaced by
    >>
    >> unsigned short joinUnsigShort(unsigned char a,unsigned char b)
    >> {
    >> return (a<<CHAR_BIT) + b;
    >> }
    >>
    >> Richard

    >
    > So I assume bit-shifting in this function is independent of
    > big/little endian systems!? Does the compiler take care of correct
    > shifting in both cases?


    There isn't a bigendian/littleendian issue to worry about here. Why did
    you think there was?

    --
    Chris "seeker" Dollin
    "Life is full of mysteries. Consider this one of them." Sinclair, /Babylon 5/
    Chris Dollin, Jun 30, 2006
    #4
  5. Steffen Loringer

    pete Guest

    Steffen Loringer wrote:
    >
    > >> Are there better ways to do it?

    > >
    > > Yes; provided you are willing to
    > > put a note in the documentation that
    > > the code assumes that sizeof (short) >= 2,
    > > your entire function can be replaced by
    > >
    > > unsigned short joinUnsigShort(unsigned char a,unsigned char b)
    > > {
    > > return (a<<CHAR_BIT) + b;
    > > }
    > >
    > > Richard

    >
    > So I assume bit-shifting in this function is independent of
    > big/little endian systems!? Does the compiler take care of correct
    > shifting in both cases?


    Yes.
    The meaning of the code might be more obvious
    if done with arithmetic operations.

    unsigned short joinUnsigShort(unsigned char a,unsigned char b)
    {
    return (UCHAR_MAX + 1U) * a + b;
    }

    That also avoids the undefined behavior associated
    with shifting too far, if sizeof(short) is greater than one.
    And, (UCHAR_MAX + 1U) * a, is likely to be compiled
    to code which is just as fast as the shifting code.

    If sizeof(short) is equal to one, then the function returns b,
    so the "sizeof (short) > 1" documentation
    that Richard Bos mentioned, still applies.

    --
    pete
    pete, Jun 30, 2006
    #5
  6. "Steffen Loringer" <> wrote in message
    news:...
    >
    > >> Are there better ways to do it?

    > >
    > > Yes; provided you are willing to put a note in the documentation that
    > > the code assumes that sizeof (short) >= 2, your entire function can be
    > > replaced by
    > >
    > > unsigned short joinUnsigShort(unsigned char a,unsigned char b)
    > > {
    > > return (a<<CHAR_BIT) + b;
    > > }
    > >
    > > Richard

    >
    > So I assume bit-shifting in this function is independent of
    > big/little endian systems!? Does the compiler take care of correct
    > shifting in both cases?
    >


    <OT>
    Have a look at the functions
    ntohs() [htons()] which convert a short value from network byte order (big
    endian) to host byte order (whatever your host system is) [and vice versa]
    </OT>
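
    A minimal sketch of what using ntohs() could look like when the two
    bytes arrive in network (big endian) order. ntohs() comes from POSIX
    <arpa/inet.h>, not from standard C; the helper name read_be16_ntohs is
    hypothetical.

```c
#include <arpa/inet.h>  /* ntohs(); POSIX, not part of standard C */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a big-endian 16-bit value from a byte buffer. */
uint16_t read_be16_ntohs(const unsigned char buf[2])
{
    uint16_t raw;
    memcpy(&raw, buf, sizeof raw);  /* copy the bytes in memory order */
    return ntohs(raw);              /* network (big endian) to host order */
}
```

    For a buffer holding the bytes 0x12 and 0x34 this returns 0x1234 on
    both little-endian and big-endian hosts.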
    Sven Fülster, Jun 30, 2006
    #6
  7. Sven Fülster wrote:
    > "Steffen Loringer" <> wrote in message
    > news:...
    >>>> Are there better ways to do it?
    >>> Yes; provided you are willing to put a note in the documentation that
    >>> the code assumes that sizeof (short) >= 2, your entire function can be
    >>> replaced by
    >>>
    >>> unsigned short joinUnsigShort(unsigned char a,unsigned char b)
    >>> {
    >>> return (a<<CHAR_BIT) + b;
    >>> }
    >>>
    >>> Richard

    >> So I assume bit-shifting in this function is independent of
    >> big/little endian systems!? Does the compiler take care of correct
    >> shifting in both cases?
    >>

    >
    > <OT>
    > Have a look at the functions
    > ntohs() [htons()] which convert a short value from network byte order (big
    > endian) to host byte order (whatever your host system is) [and vice versa]
    > </OT>


    Or get/put them directly in the right order(big endian in this case) :) ?
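    #include <stdint.h> /* uint8_t, uint16_t */

    /* Read a 16-bit value stored high-order byte first (big endian). */
    uint16_t unpack16(uint8_t buf[2])
    {
    return buf[0]<<8 | buf[1];
    }
    /* Store a 16-bit value high-order byte first (big endian). */
    void pack16(uint16_t val,uint8_t buf[2])
    {
    buf[0] = val>>8;
    buf[1] = val&0xff;
    }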
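
    A self-contained sketch of the same pack/unpack idea with a round-trip
    check. The names unpack16_be, pack16_be and roundtrip_ok are
    hypothetical, and the explicit casts are an addition, not part of the
    functions above.

```c
#include <stdint.h>

/* Read a 16-bit value stored high-order byte first (big endian). */
uint16_t unpack16_be(const uint8_t buf[2])
{
    /* Convert to unsigned int before shifting to avoid a signed promotion. */
    return (uint16_t)(((unsigned int)buf[0] << 8) | buf[1]);
}

/* Store a 16-bit value high-order byte first (big endian). */
void pack16_be(uint16_t val, uint8_t buf[2])
{
    buf[0] = (uint8_t)(val >> 8);    /* high-order byte first */
    buf[1] = (uint8_t)(val & 0xFF);  /* low-order byte second */
}

/* Returns 1 if packing and then unpacking val gives val back. */
int roundtrip_ok(uint16_t val)
{
    uint8_t buf[2];
    pack16_be(val, buf);
    return unpack16_be(buf) == val;
}
```

    Because the byte order is fixed by the code itself, the buffer contents
    are the same on every host and no ntohs()/htons() call is needed.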
    "Nils O. Selåsdal", Jun 30, 2006
    #7
  8. Steffen Loringer

    pete Guest

    pete wrote:
    >
    > Steffen Loringer wrote:
    > >
    > > >> Are there better ways to do it?
    > > >
    > > > Yes; provided you are willing to
    > > > put a note in the documentation that
    > > > the code assumes that sizeof (short) >= 2,
    > > > your entire function can be replaced by
    > > >
    > > > unsigned short joinUnsigShort(unsigned char a,unsigned char b)
    > > > {
    > > > return (a<<CHAR_BIT) + b;
    > > > }
    > > >
    > > > Richard

    > >
    > > So I assume bit-shifting in this function is independent of
    > > big/little endian systems!? Does the compiler take care of correct
    > > shifting in both cases?

    >
    > Yes.
    > The meaning of the code might be more obvious
    > if done with arithmetic operations.
    >
    > unsigned short joinUnsigShort(unsigned char a,unsigned char b)
    > {
    > return (UCHAR_MAX + 1U) * a + b;
    > }
    >
    > That also avoids the undefined behavior associated
    > with shifting too far, if sizeof(short)
    > is


    should be "isn't"

    > greater than one.
    > And, (UCHAR_MAX + 1U) * a, is likely to be compiled
    > to code which is just as fast as the shifting code.
    >
    > If sizeof(short) is equal to one, then the function returns b,
    > so the "sizeof (short) > 1" documentation
    > that Richard Bos mentioned, still applies.
    >
    > --
    > pete


    --
    pete
    pete, Jun 30, 2006
    #8
  9. Steffen Loringer posted:

    > Hi,
    >
    > I'm using the following function to join 2 char (byte) into one short on
    > a 32 bit X86 platform:
    >
    > unsigned short joinUnsigShort(unsigned char a,unsigned char b)
    > {
    > unsigned short val = 0;
    > val = a;



    Ridiculously inefficient.

    Why set a variable's value to zero, and then immediately give it another
    value?


    > val <<= 8;



    val <<= CHAR_BIT; /* A byte isn't always 8 bits */


    > val |= b;
    > return val;
    > }
    >
    > Will this also work if compiled on a PowerPC? Are there better ways to
    > do it?



    Not necessarily. If unsigned char has 16 value bits and so does short,
    then you won't get the result you want.



    --

    Frederick Gotham
    Frederick Gotham, Jun 30, 2006
    #9
  10. pete posted:

    > Steffen Loringer wrote:
    >>
    >> >> Are there better ways to do it?
    >> >
    >> > Yes; provided you are willing to
    >> > put a note in the documentation that
    >> > the code assumes that sizeof (short) >= 2,
    >> > your entire function can be replaced by
    >> >
    >> > unsigned short joinUnsigShort(unsigned char a,unsigned char b)
    >> > {
    >> > return (a<<CHAR_BIT) + b;
    >> > }
    >> >
    >> > Richard

    >>
    >> So I assume bit-shifting in this function is independent of
    >> big/little endian systems!? Does the compiler take care of correct
    >> shifting in both cases?

    >
    > Yes.
    > The meaning of the code might be more obvious
    > if done with arithmetic operations.
    >
    > unsigned short joinUnsigShort(unsigned char a,unsigned char b)
    > {
    > return (UCHAR_MAX + 1U) * a + b;
    > }



    If I'm not mistaken, (UCHAR_MAX + 1U) is well within its rights to
    evaluate to zero. Imagine the following system:

    unsigned char: 32 bits, no padding
    unsigned short: 32 bits, no padding
    unsigned int: 32 bits, no padding
    unsigned long: 32 bits, no padding
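
    A minimal sketch of this wraparound and of one way to guard against it.
    On the system described above UCHAR_MAX equals UINT_MAX, so
    UCHAR_MAX + 1U wraps to 0. The names wrapped_max_plus_one and join_arith
    and the #error text are hypothetical.

```c
#include <limits.h>

/* Refuse to compile where unsigned char is as wide as unsigned int,
   because UCHAR_MAX + 1U would then wrap around to 0. */
#if UCHAR_MAX == UINT_MAX
#error "UCHAR_MAX + 1U would be 0 on this platform"
#endif

/* Unsigned arithmetic is modulo 2^N, so the maximum value plus one is 0.
   Shown with unsigned int, which wraps this way on every platform. */
unsigned int wrapped_max_plus_one(void)
{
    return UINT_MAX + 1U;
}

/* Hypothetical name: the arithmetic join from earlier in the thread,
   usable here because of the guard above. */
unsigned short join_arith(unsigned char a, unsigned char b)
{
    return (unsigned short)((UCHAR_MAX + 1U) * a + b);
}
```

    On a platform with 8-bit bytes, join_arith(0x12, 0x34) gives 0x1234; on
    the 32-bit-char system sketched above, the guard stops compilation.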



    --

    Frederick Gotham
    Frederick Gotham, Jun 30, 2006
    #10
  11. Steffen Loringer

    Tom St Denis Guest

    Frederick Gotham wrote:
    > > unsigned short val = 0;
    > > val = a;

    >
    > Ridiculously inefficient.
    >
    > Why set a variable's value to zero, and then immediately give it another
    > value?


    It's poorly written, not inefficient. I suggest you look at the output
    of an optimizing compiler from time to time :)

    > > val <<= 8;

    >
    >
    > val <<= CHAR_BIT; /* A byte isn't always 8 bits */


    Um, what if you want to pack 8 bit words into a larger word? For [say]
    network coding you want to be explicit.

    > > val |= b;
    > > return val;
    > > }
    > >
    > > Will this also work if compiled on a PowerPC? Are there better ways to
    > > do it?

    >
    >
    > Not necessarily. If unsigned char has 16 value bits and so does short,
    > then you won't get the result you want.


    Which is why he shifted by 8 and not CHAR_BIT. Chances are the inputs
    are range-limited to 0..255.

    Speaking as someone who writes code on an x86 and has it executed on all
    manner of MIPS, PPC, SPARC, IA64 and others I think I know what I'm
    talking about here.

    If the intent was to pack two 8 bit values into a single integer the
    routine

    unsigned pack(unsigned char a, unsigned char b) { return (a<<8)|b; }

    Will work fine.

    Of course in my code I use macros for all this and I explicitly cast
    everything to char or long [unsigned of course] just to cover my bases.

    Tom
    Tom St Denis, Jun 30, 2006
    #11
  12. Steffen Loringer

    pete Guest

    Frederick Gotham wrote:
    >
    > pete posted:
    >
    > > Steffen Loringer wrote:
    > >>
    > >> >> Are there better ways to do it?
    > >> >
    > >> > Yes; provided you are willing to
    > >> > put a note in the documentation that
    > >> > the code assumes that sizeof (short) >= 2,
    > >> > your entire function can be replaced by
    > >> >
    > >> > unsigned short joinUnsigShort(unsigned char a,unsigned char b)
    > >> > {
    > >> > return (a<<CHAR_BIT) + b;
    > >> > }
    > >> >
    > >> > Richard
    > >>
    > >> So I assume bit-shifting in this function is independent of
    > >> big/little endian systems!? Does the compiler take care of correct
    > >> shifting in both cases?

    > >
    > > Yes.
    > > The meaning of the code might be more obvious
    > > if done with arithmetic operations.
    > >
    > > unsigned short joinUnsigShort(unsigned char a,unsigned char b)
    > > {
    > > return (UCHAR_MAX + 1U) * a + b;
    > > }

    >
    > If I'm not mistaken, (UCHAR_MAX + 1U) is well within its rights to
    > evaluate to zero.


    Yes.

    The part of my post which you snipped, addresses that issue:

    "If sizeof(short) is equal to one,
    then the function returns b,
    so the "sizeof (short) > 1" documentation
    that Richard Bos mentioned, still applies."

    --
    pete
    pete, Jun 30, 2006
    #12
  13. Tom St Denis posted:



    > Speaking as someone who writes code on an x86 and has it executed on all
    > manner of MIPS, PPC, SPARC, IA64 and others I think I know what I'm
    > talking about here.



    That shouldn't be an issue if you're writing portable code.

    By the way, was the injection of arrogance intentional?



    > unsigned pack(unsigned char a, unsigned char b) { return (a<<8)|b; }
    >
    > Will work fine.



    I'd probably do something like:

    #include <assert.h>

    unsigned short Pack( unsigned char a, unsigned char b )
    {
    assert( a <= 255 );
    assert( b <= 255 );

    return ( a << 8 ) | b;
    }


    --

    Frederick Gotham
    Frederick Gotham, Jun 30, 2006
    #13
  14. Steffen Loringer <> writes:
    > Hi,
    >
    > I'm using the following function to join 2 char (byte) into one short on
    > a 32 bit X86 platform:
    >
    > unsigned short joinUnsigShort(unsigned char a,unsigned char b)
    > {
    > unsigned short val = 0;
    > val = a;
    > val <<= 8;
    > val |= b;
    > return val;
    > }
    >
    > Will this also work if compiled on a PowerPC? Are there better ways to
    > do it?


    It will do what it says. Whether it will do what you want depends on
    exactly what you want, which you haven't quite told us.

    There are many possible ways to join two chars into a short. What
    exactly are you trying to accomplish? Do you want the value of "a" in
    the high-order bits of the result? Do you want it in the leftmost
    (lowest address) portion of the result?
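
    The two readings of the question can be sketched side by side. Both
    function names are hypothetical, and the second one assumes
    sizeof(unsigned short) >= 2.

```c
#include <string.h>

/* Reading 1: a becomes the high-order bits of the result.
   The returned value is the same whatever the byte order. */
unsigned short join_by_value(unsigned char a, unsigned char b)
{
    return (unsigned short)(((unsigned int)a << 8) | b);
}

/* Reading 2: a is stored at the lowest address of the result.
   The returned value depends on the platform's byte order.
   Assumes sizeof(unsigned short) >= 2. */
unsigned short join_by_memory(unsigned char a, unsigned char b)
{
    unsigned char bytes[sizeof(unsigned short)] = {0};
    unsigned short result;
    bytes[0] = a;
    bytes[1] = b;
    memcpy(&result, bytes, sizeof result);
    return result;
}
```

    join_by_value(0x12, 0x34) is 0x1234 everywhere. With a 16-bit short,
    join_by_memory(0x12, 0x34) is 0x1234 on a big-endian host and 0x3412 on
    a little-endian one, which is the distinction the question is about.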

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
    Keith Thompson, Jun 30, 2006
    #14
  15. Steffen Loringer

    Richard Bos Guest

    "Sven Fülster" <> wrote:

    > "Steffen Loringer" <> wrote in message
    > news:...
    > >
    > > >> Are there better ways to do it?
    > > >
    > > > Yes; provided you are willing to put a note in the documentation that
    > > > the code assumes that sizeof (short) >= 2, your entire function can be
    > > > replaced by
    > > >
    > > > unsigned short joinUnsigShort(unsigned char a,unsigned char b)
    > > > {
    > > > return (a<<CHAR_BIT) + b;
    > > > }

    > >
    > > So I assume bit-shifting in this function is independent of
    > > big/little endian systems!? Does the compiler take care of correct
    > > shifting in both cases?


    Yes. The shift operators work on values, not on representations. This
    code always puts a in the higher-value byte of the return value, and b
    in the lower-value one.
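
    A short sketch of what "values, not representations" means in practice:
    for unsigned operands whose result fits, shifting left by 8 is the same
    as multiplying by 256, and the bytes can be recovered by value as well.
    All four function names are hypothetical, and the inputs are assumed to
    be in the range 0..255.

```c
/* Join using a shift. */
unsigned short join_shift(unsigned char a, unsigned char b)
{
    return (unsigned short)(((unsigned int)a << 8) | b);
}

/* Join using multiplication; gives the same value as join_shift. */
unsigned short join_multiply(unsigned char a, unsigned char b)
{
    return (unsigned short)((unsigned int)a * 256u + b);
}

/* Recover the two bytes again, also purely by value. */
unsigned char high_byte(unsigned short v)
{
    return (unsigned char)((v >> 8) & 0xFFu);
}

unsigned char low_byte(unsigned short v)
{
    return (unsigned char)(v & 0xFFu);
}
```

    None of these functions looks at how the short is laid out in memory,
    so they behave identically on x86 and PowerPC.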

    > <OT>
    > Have a look at the functions ntohs() [htons()]


    Or rather, don't. Not only will they make your code less portable, they
    will not add any functionality in this case.

    Richard
    Richard Bos, Jul 3, 2006
    #15
  16. On Fri, 30 Jun 2006 11:18:17 GMT, pete <> wrote:

    > Steffen Loringer wrote:
    > >
    > > >> Are there better ways to do it?
    > > >
    > > > Yes; provided you are willing to
    > > > put a note in the documentation that
    > > > the code assumes that sizeof (short) >= 2,
    > > > your entire function can be replaced by
    > > >
    > > > unsigned short joinUnsigShort(unsigned char a,unsigned char b)
    > > > {
    > > > return (a<<CHAR_BIT) + b;


    To be picky, at least ( (unsigned short)a << CHAR_BIT ) + b
    or (as I prefer) to either use both bitwise
    (unsigned short)a << CHAR_BIT | b /* precedence ok */
    or both arithmetic as you do below.

    Otherwise unsigned char of 8 or even 9 or 12 bits can promote to
    signed int of as little as 15+1 bits, in which left shift of some
    uchar values by CHAR_BIT overflows and produces U.B.

    <snip>
    > The meaning of the code might be more obvious
    > if done with arithmetic operations.
    >
    > unsigned short joinUnsigShort(unsigned char a,unsigned char b)
    > {
    > return (UCHAR_MAX + 1U) * a + b;
    > }
    >
    > That also avoids the undefined behavior associated
    > with shifting too far, if sizeof(short) is greater than one.


    Corrected to isn't. But size or even width of short doesn't matter. It
    is U.B. if unsigned int is the same width as unsigned char (so the
    promoted value is being shifted by >= its width) OR if (signed) int is
    wider than char but not 'more than twice' (precisely, does not have at
    least twice as many value/magnitude bits, plus sign).
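
    One way to sidestep both cases, sketched under the assumption that the
    intent is to join two 8-bit quantities: mask each input to 8 bits, shift
    by a fixed 8, and do the work in unsigned long. The function name
    join_promotion_safe is hypothetical.

```c
/* Shifting in unsigned long avoids both problems described above:
   there is no promotion to signed int, and unsigned long has at least
   32 bits, so a shift by 8 is always narrower than the type. */
unsigned short join_promotion_safe(unsigned char a, unsigned char b)
{
    unsigned long hi = (unsigned long)(a & 0xFFu) << 8;  /* high-order byte */
    unsigned long lo = (unsigned long)(b & 0xFFu);       /* low-order byte */
    return (unsigned short)(hi | lo);
}
```

    The result never exceeds 0xFFFF, so it fits in an unsigned short on
    every conforming implementation.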

    > And, (UCHAR_MAX + 1U) * a, is likely to be compiled
    > to code which is just as fast as the shifting code.
    >
    > If sizeof(short) is equal to one, then the function returns b,
    > so the "sizeof (short) > 1" documentation
    > that Richard Bos mentioned, still applies.



    - David.Thompson1 at worldnet.att.net
    Dave Thompson, Jul 10, 2006
    #16
