Unions vs endianness

Discussion in 'C Programming' started by root, Nov 27, 2009.

  1. root

    root Guest

    Friends

    I am bit twiddling on a 32 bit integer quantity. Often I need to look at
    only the low 16 bits.

    For this I currently AND with a mask.

    It seems to me that it might simplify the code instead to have a union,
    union u {
    int32_t dw;
    int16_t w;
    };

    But my question is: will this run into portability problems if I move
    the code to a platform with different endianness?

    /root


    root, Nov 27, 2009
    #1

  2. Seebs

    Seebs Guest

    On 2009-11-27, root <> wrote:
    > I am bit twiddling on a 32 bit integer quantity. Often I need to look at
    > only the low 16 bits.
    >
    > For this I currently AND with a mask.
    >
    > It seems to me that it might simplify the code instead to have a union,
    > union u {
    > int32_t dw;
    > int16_t w;
    > };
    >
    > But my question is: will this run into portability problems if I move
    > the code to a platform with different endianness?


    Probably. Even ignoring the more general undefined behavior thing, there
    is nothing about this to tell you whether you get the low-order or high-order
    16 bits. Typically, that'd be low-order (little-endian) or high-order
    (big-endian), but there have been weirder cases out there, etcetera.

    Suggestion: Use a macro for the mask, use bit masking in it, and don't
    sweat it -- I'd guess most modern compilers generate plenty-efficient code.

    Another thing to consider is using an unsigned type, and then just doing
    ((uint16_t) x) to get the low-order bits.
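A minimal sketch of both suggestions, assuming the quantity is held in a uint32_t. The names LOW16_MASK, LOW16, low16_cast and low16_demo are made up for illustration and do not come from the thread:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, not from the thread. */
#define LOW16_MASK 0xFFFFu
#define LOW16(x)   ((uint32_t)(x) & LOW16_MASK)

/* Converting to an unsigned 16-bit type reduces the value modulo 65536,
   so it keeps the low-order 16 bits regardless of byte order. */
static uint16_t low16_cast(uint32_t x)
{
    return (uint16_t)x;
}

/* Both forms agree; neither depends on how the bytes sit in memory. */
static void low16_demo(void)
{
    uint32_t dw = 0x12345678u;
    assert(LOW16(dw) == 0x5678u);
    assert(low16_cast(dw) == 0x5678u);
}
```

Both forms operate on the value rather than on its storage, which is why they should behave the same on little-endian and big-endian machines.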

    -s
    --
    Copyright 2009, all wrongs reversed. Peter Seebach /
    http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
    http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
    Seebs, Nov 28, 2009
    #2

  3. On Nov 28, 6:06 am, root <> wrote:
    > I am bit twiddling on a 32 bit integer quantity. Often I need to look at
    > only the low 16 bits.
    >
    > It seems to me that it might simplify the code instead to have a union,
    > union u {
    > int32_t dw;
    > int16_t w;
    > };


    My Jpeg compressor has a union something like
    union u {
    int32_t u_dw;
    int16_t u_w[2];
    #define u_hiword u_w[0]   /* high-order half on this system */
    #define u_loword u_w[1]   /* low-order half on this system */
    };
    with config/ifdef set up to reverse hiword/loword on some
    systems. I suppose that it is NOT guaranteed that EITHER
    hiword/loword assignment will work. Just hold your
    breath and remember to doublecheck when you port to a
    new system.
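A rough sketch of what such a config/ifdef could look like. This is not the actual configuration of the JPEG compressor, which the thread does not show. It assumes a compiler that predefines `__BYTE_ORDER__` (GCC and Clang do), renames the union to `u32_halves`, and uses unsigned members to keep the comparisons simple; `U_HI_INDEX`, `U_LO_INDEX` and `check_halves` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout selection; __BYTE_ORDER__ is a GCC/Clang
   predefined macro, other compilers need their own test here. */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define U_HI_INDEX 0
#define U_LO_INDEX 1
#elif defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#define U_HI_INDEX 1
#define U_LO_INDEX 0
#else
#error "unknown byte order: set U_HI_INDEX and U_LO_INDEX by hand"
#endif

union u32_halves {
    uint32_t u_dw;
    uint16_t u_w[2];
};
#define u_hiword u_w[U_HI_INDEX]
#define u_loword u_w[U_LO_INDEX]

/* The "doublecheck when you port" step, written as a run-time check. */
static void check_halves(void)
{
    union u32_halves x;
    x.u_dw = 0x12345678u;
    assert(x.u_hiword == (x.u_dw >> 16));
    assert(x.u_loword == (x.u_dw & 0xFFFFu));
}
```

Calling check_halves() once at start-up turns a wrong hiword/loword choice into an immediate failure instead of a subtle accuracy loss.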

    An interesting thing about this union in my Jpeg compressor
    is that the code worked fine, if you used the *wrong*
    hiword/loword assignment, except for a *slight* loss of arithmetic
    accuracy at the highest fidelity settings. Clever c.l.c'ers,
    with this hint, may have little trouble deducing what I was
    doing with this union.

    James Dow Allen
    James Dow Allen, Nov 29, 2009
    #3
  4. Noob

    Noob Guest

    James Dow Allen wrote:

    > An interesting thing about this union in my Jpeg compressor [...]


    Would you recommend your JPEG compressor over libjpeg, or are you
    just writing it for your own personal use?

    http://jpegclub.org/
    Noob, Dec 1, 2009
    #4
  5. On 29 Nov, 14:16, "Malcolm McLean" <> wrote:
    > "root" <> wrote in message


    > > I am bit twiddling on a 32 bit integer quantity. Often I need to look at
    > > only the low 16 bits.

    >
    > > For this I currently AND with a mask.

    >
    > > It seems to me that it might simplify the code instead to have a union,
    > > union u {
    > > int32_t dw;
    > > int16_t w;
    > > };

    >
    > Yes. In reality you will either get the high two bytes or the low two bytes
    > of dw in w, depending on the endianness of the machine.


    ...unless you run it on a PDP-11


    > However on a strict reading of the standard what you are doing invokes
    > undefined behaviour, because you are writing to one member and reading from
    > another.
    >
    > The AND masking scheme is a bit messy, but it is the best solution.


    It must be me, but a mask always looks pretty clear to me; maybe I played
    in the binary too much as a child. Unions just give me the willies.

    w = dw & 0xffff; /* what could be clearer? */
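To make the contrast concrete, here is a small sketch that puts the mask next to root's original union. The helper names (is_little_endian, low_by_mask, first_two_bytes, union_vs_mask_demo) are made up for illustration, and the run-time probe only tells plain little-endian hosts apart from the rest:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

union u {
    int32_t dw;
    int16_t w;
};

/* Run-time probe: is the least significant byte stored first? */
static int is_little_endian(void)
{
    uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* The mask works on the value, so it always yields the low 16 bits. */
static int32_t low_by_mask(int32_t dw)
{
    return dw & 0xffff;
}

/* The union member overlays the first two bytes in memory. */
static int16_t first_two_bytes(int32_t dw)
{
    union u x;
    x.dw = dw;
    return x.w;
}

static void union_vs_mask_demo(void)
{
    int32_t dw = 0x12345678;
    assert(low_by_mask(dw) == 0x5678);
    /* Holds on plain little-endian hosts; big-endian hosts see 0x1234. */
    if (is_little_endian())
        assert(first_two_bytes(dw) == 0x5678);
}
```

The mask result is the same everywhere, while the union result has to be qualified by the host's byte order.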
    Nick Keighley, Dec 1, 2009
    #5
  6. On Dec 1, 7:09 pm, Noob <r...@127.0.0.1> wrote:
    > James Dow Allen wrote:
    > > An interesting thing about this union in my Jpeg compressor [...]

    >
    > Would you recommend your JPEG compressor over libjpeg, or are you
    > just writing it for your own personal use?


    I'd guess mine's still faster; maybe I'll download the other
    some day and check. You're welcome to apply for a license on
    my method, but I won't see any of the revenue. :-(

    Did you solve the puzzle in the message to which you replied?

    James
    James Dow Allen, Dec 1, 2009
    #6
  7. Phil Carmody

    Phil Carmody Guest

    James Dow Allen <> writes:
    > My Jpeg compressor has a union something like
    > union u {
    > int32_t u_dw;
    > int16_t u_w[2];
    > #define u_hiword u_w[0]
    > #define u_loword u_w[1]
    > };
    > with config/ifdef set up to reverse hiword/loword on some
    > systems. I suppose that it is NOT guaranteed that EITHER
    > hiword/loword assignment will work. Just hold your
    > breath and remember to doublecheck when you port to a
    > new system.
    >
    > An interesting thing about this union in my Jpeg compressor
    > is that the code worked fine, if you used the *wrong*
    > hiword/loword assignment, except for a *slight* loss of arithmetic
    > accuracy at the highest fidelity settings. Clever c.l.c'ers,
    > with this hint, may have little trouble deducing what I was
    > doing with this union.


    WSITD - U and V?

    Phil
    --
    Any true emperor never needs to wear clothes. -- Devany on r.a.s.f1
    Phil Carmody, Dec 1, 2009
    #7
  8. On Dec 2, 4:45 am, Phil Carmody <>
    wrote:
    > James Dow Allen <> writes:
    > > My Jpeg compressor has a union something like
    > >    union u {
    > >          int32_t  u_dw;
    > >          int16_t  u_w[2];
    > >    #define  u_hiword u_w[0]
    > >    #define  u_loword u_w[1]
    > >    };
    > > with config/ifdef set up to reverse hiword/loword on some
    > > systems....

    > ..
    > > An interesting thing about this union in my Jpeg compressor
    > > is that the code worked fine, if you used the *wrong*
    > > hiword/loword assignment, except for a *slight* loss of arithmetic
    > > accuracy at the highest fidelity settings.  Clever c.l.c'ers,
    > > with this hint, may have little trouble deducing what I was
    > > doing with this union.

    >
    > WSITD - U and V?


    Translating this for the acronym-impaired, I think Phil means:
    > Wanton Smack in the Derriere - Cr Cb ?
    > where Cr, Cb are the chromaticity values
    > in a method like JFIF.


    Well, simply swapping U and V will be much worse than a
    "slight loss of arithmetic accuracy" on any but *extremely*
    drab color images. :)

    The explanation is actually rather simple,
    but it involves a special (probably little-known)
    technique, and may be *very* difficult to guess
    without more clues. (If I thought there was an
    interest for such puzzles, I might rephrase
    it and post in comp.graphics or somewhere.)

    Here's a big hint, though you'll still
    need to put your thinking cap on for the
    complete explanation:

    Fvkgrra ovgf bs cerpvfvba ner whfg rabhtu
    sbe onfryvar Wcrt vs vg'f cebcreyl pbqrq.
    Jvgu gur snhygl uvjbeq/ybjbeq fjnc, *fbzr*
    bs gur qngn raqrq hc, va rssrpg, jvgu bayl
    svsgrra ovgf bs cerpvfvba.

    Wnzrf Qbj Nyyra
    James Dow Allen, Dec 2, 2009
    #8
  9. On Tue, 1 Dec 2009 04:53:34 -0800 (PST), Nick Keighley
    <> wrote:

    > On 29 Nov, 14:16, "Malcolm McLean" <> wrote:
    > > "root" <> wrote in message


    > > > union u {
    > > > int32_t dw;
    > > > int16_t w;
    > > > };

    > >
    > > Yes. In reality you will either get the high two bytes or the low two bytes
    > > of dw in w, depending on the endianness of the machine.

    >
    > ...unless you run it on a PDP-11
    >

    Even on -11; you get the high 2 bytes. It's just not *consistent* with
    the otherwise little-endian behavior where e.g. int16 punned as int8
    (or u-char) gives you the low byte.
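The three layouts can be compared without a PDP-11 at hand by writing the bytes out explicitly. A sketch, assuming the usual description of the PDP-11 32-bit format (high 16-bit word first, each word stored low byte first); the sample value and the names word_at_offset0 and layout_demo are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Byte images of the 32-bit value 0x0A0B0C0D, lowest address first. */
static const uint8_t little_endian[4] = { 0x0D, 0x0C, 0x0B, 0x0A };
static const uint8_t big_endian[4]    = { 0x0A, 0x0B, 0x0C, 0x0D };
static const uint8_t pdp_endian[4]    = { 0x0B, 0x0A, 0x0D, 0x0C };

/* Decode the 16-bit word at offset 0, i.e. what a 16-bit union member
   overlaying the 32-bit value would read. */
static uint16_t word_at_offset0(const uint8_t *bytes, int low_byte_first)
{
    if (low_byte_first)
        return (uint16_t)(bytes[0] | (bytes[1] << 8));
    return (uint16_t)((bytes[0] << 8) | bytes[1]);
}

static void layout_demo(void)
{
    assert(word_at_offset0(little_endian, 1) == 0x0C0D); /* low word  */
    assert(word_at_offset0(big_endian, 0)    == 0x0A0B); /* high word */
    assert(word_at_offset0(pdp_endian, 1)    == 0x0A0B); /* high word */
}
```

In the PDP-style image the word at offset 0 is the high word, as on a big-endian machine, yet the first byte of that word (0x0B) is its low byte, as on a little-endian one.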
    David Thompson, Dec 17, 2009
    #9
  10. Fair warning (Was: Unions vs endianness)

    On Dec 2, 5:34 pm, James Dow Allen <> wrote:
    > > > My Jpeg compressor has a union something like
    > > >    union u {
    > > >          int32_t  u_dw;
    > > >          int16_t  u_w[2];
    > > >    #define  u_hiword u_w[0]
    > > >    #define  u_loword u_w[1]
    > > >    };
    > > > with config/ifdef set up to reverse hiword/loword on some
    > > > systems....

    > > ..
    > > > An interesting thing about this union in my Jpeg compressor
    > > > is that the code worked fine, if you used the *wrong*
    > > > hiword/loword assignment, except for a *slight* loss of arithmetic
    > > > accuracy at the highest fidelity settings.  Clever c.l.c'ers,
    > > > with this hint, may have little trouble deducing what I was
    > > > doing with this union.

    >
    > The explanation is actually rather simple,
    > but it involves a special (probably little-known)
    > technique, and may be *very* difficult to guess
    > without more clues.  (If I thought there was an
    > interest for such puzzles, I might rephrase
    > it and post in comp.graphics or somewhere.)
    >
    > Here's a big hint, though you'll still
    > need to put your thinking cap on for the
    > complete explanation:
    >
    > Fvkgrra ovgf bs cerpvfvba ner whfg rabhtu
    > sbe onfryvar Wcrt vs vg'f cebcreyl pbqrq.
    > Jvgu gur snhygl uvjbeq/ybjbeq fjnc, *fbzr*
    > bs gur qngn raqrq hc, va rssrpg, jvgu bayl
    > svsgrra ovgf bs cerpvfvba.
    >
    > Wnzrf Qbj Nyyra


    There was a reply in this thread 2 weeks later,
    and I thought someone might have solved the puzzle!
    No....

    I created this "programming puzzle"
    spontaneously when I noticed the exact
    union definition I used in this thread.
    No one in c.l.c has solved the puzzle.

    I believe the puzzle to be a difficult
    challenge about low-level programming, but fair,
    and could even be phrased as a good interview question.
    I will probably post it to alt.math.rec in a
    few days, expecting someone there may solve it quickly.

    Thus it becomes a challenge for c.l.c!
    Are you going to let a.m.r show you up?
    :) :)

    James Dow Allen
    James Dow Allen, Dec 18, 2009
    #10
  11. gwowen

    gwowen Guest

    Re: Fair warning (Was: Unions vs endianness)

    On Dec 18, 10:55 am, James Dow Allen <> wrote:
    > I believe the puzzle to be a difficult
    > challenge about low-level programming, but fair,
    > and could even be phrased as a good interview question.
    > I will probably post it to alt.math.rec in a
    > few days, expecting someone there may solve it quickly.


    My guess would be you're doing a very fast DCT by rearranging
    (butterflying or diagonalisation or some combination) on the
    individual bit level, and due to the symmetry of the rearranging bits
    get interleaved in such a way that the same bit in the top and bottom
    halves have almost the same meaning...

    Close?
    gwowen, Dec 18, 2009
    #11
  12. Re: Fair warning (Was: Unions vs endianness)

    On Dec 18, 8:51 pm, gwowen <> wrote:
    > On Dec 18, 10:55 am, James Dow Allen <> wrote:
    > My guess would be you're doing a very fast DCT by rearranging
    > (butterflying or diagonalisation or some combination) on the
    > individual bit level, and due to the symmetry of the rearranging bits
    > get interleaved in such a way that the same bit in the top and bottom
    > halves have almost the same meaning...
    >
    > Close?


    Not too close. It *is* a very fast DCT, but the butterfly procedure
    itself is fairly ordinary, using adds, subtracts, shifts and
    multiplies by very small integers. The data in the DCT procedure
    is 16-bit, but the processor itself does 32-bit arithmetic.

    One would expect (hi <--> lo) reversal to be catastrophic ...
    or have no effect at all if the distinction was inessential.
    Instead the reversal leads to *tiny* loss of precision.
    The puzzle is: what exactly was I doing to get this symptom?

    James
    James Dow Allen, Dec 18, 2009
    #12