Unions vs endianness

root · Nov 27, 2009

Friends

I am bit twiddling on a 32 bit integer quantity. Often I need to look at
only the low 16 bits.

For this I currently AND with a mask.

It seems to me that it might simplify the code instead to have a union,
union u {
    int32_t dw;
    int16_t w;
};

But my question is: will this run into portability problems if I move
the code to a platform with different endianness?

/root

Seebs · Nov 28, 2009

root said:
I am bit twiddling on a 32 bit integer quantity. Often I need to look at
only the low 16 bits.

For this I currently AND with a mask.

It seems to me that it might simplify the code instead to have a union,
union u {
    int32_t dw;
    int16_t w;
};

But my question is: will this run into portability problems if I move
the code to a platform with different endianness?

Probably. Even ignoring the more general undefined behavior thing, there
is nothing about this to tell you whether you get the low-order or high-order
16 bits. Typically, that'd be low-order (little-endian) or high-order
(big-endian), but there have been weirder cases out there, etcetera.

Suggestion: Use a macro for the mask, use bit masking in it, and don't
sweat it -- I'd guess most modern compilers generate plenty-efficient code.

Another thing to consider is using an unsigned type, and then just doing
((uint16_t) x) to get the low-order bits.
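
A minimal sketch of both suggestions, for illustration only. The names LOW16_MASK, low16_mask and low16_cast are hypothetical and do not come from the thread. Both versions operate on the value rather than on its bytes in memory, so the result does not depend on endianness.

```c
#include <stdint.h>

/* Hypothetical name: a mask selecting the low-order 16 bits. */
#define LOW16_MASK UINT32_C(0xFFFF)

/* Masking keeps bits 0..15 of the value, whatever the byte order in memory. */
uint16_t low16_mask(uint32_t x)
{
    return (uint16_t)(x & LOW16_MASK);
}

/* Converting to an unsigned 16-bit type reduces the value modulo 65536,
   which yields the same low-order 16 bits. */
uint16_t low16_cast(uint32_t x)
{
    return (uint16_t)x;
}
```

With an unsigned argument the two functions return the same result for every input, so the choice between them is a matter of taste.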

-s

James Dow Allen · Nov 29, 2009

root said:
I am bit twiddling on a 32 bit integer quantity. Often I need to look at
only the low 16 bits.

It seems to me that it might simplify the code instead to have a union,
union u {
    int32_t dw;
    int16_t w;
};

My Jpeg compressor has a union something like
union u {
    int32_t u_dw;
    int16_t u_w[2];
#define u_hiword u_w[0]
#define u_loword u_w[1]
};
with config/ifdef set up to reverse hiword/loword on some
systems. I suppose that it is NOT guaranteed that EITHER
hiword/loword assignment will work. Just hold your
breath and remember to double-check when you port to a
new system.
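
The "double-check when you port" step can be turned into a small runtime probe. The following is only a sketch, not the compressor's actual configuration code: the function name loword_index is hypothetical, and it assumes a C compiler that lets you read a union member other than the one last written.

```c
#include <stdint.h>

union u {
    int32_t u_dw;
    int16_t u_w[2];
};

/* Hypothetical helper: returns the index into u_w that holds the
   low-order word on this machine, or -1 if neither layout matches. */
int loword_index(void)
{
    union u probe;
    probe.u_dw = 0x00010002;              /* high word 1, low word 2 */
    if (probe.u_w[0] == 2 && probe.u_w[1] == 1)
        return 0;                         /* low word stored first  */
    if (probe.u_w[0] == 1 && probe.u_w[1] == 2)
        return 1;                         /* high word stored first */
    return -1;                            /* something weirder      */
}
```

A start-up test could compare the returned index with the one the hiword/loword macros assume and refuse to run on a mismatch.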

An interesting thing about this union in my Jpeg compressor
is that the code worked fine, if you used the *wrong*
hiword/loword assignment, except for a *slight* loss of arithmetic
accuracy at the highest fidelity settings. Clever c.l.c'ers,
with this hint, may have little trouble deducing what I was
doing with this union.

James Dow Allen

Noob · Dec 1, 2009

James said:
An interesting thing about this union in my Jpeg compressor [...]

Would you recommend your JPEG compressor over libjpeg, or are you
just writing it for your own personal use?

http://jpegclub.org/

Nick Keighley · Dec 1, 2009

Yes. In reality you will get either the high two bytes or the low two bytes
of dw in w, depending on the endianness of the machine.

...unless you run it on a PDP-11

However, on a strict reading of the standard what you are doing invokes
undefined behaviour, because you are writing to one member and reading from
another.

The AND masking scheme is a bit messy, but it is the best solution.

Must be me, but a mask always looks pretty clear to me; maybe I played
in the binary too much as a child. Unions just give me the willies.

w = dw & 0xffff; /* what could be clearer? */
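
To see the difference between the two approaches without a union, here is a small sketch that copies the first two bytes of the 32-bit object with memcpy, which sidesteps the question of reading a different union member. The helper name first_half_in_memory is hypothetical. It returns whichever half sits at the lowest address, while the mask always returns the low-order half.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reads the 16-bit object located at the lowest
   address of dw, which is what the union member w would overlay. */
uint16_t first_half_in_memory(uint32_t dw)
{
    uint16_t w;
    memcpy(&w, &dw, sizeof w);   /* copy the two lowest-addressed bytes */
    return w;
}

/* The mask works on the value, so it is the same on every machine. */
uint16_t low_half_by_mask(uint32_t dw)
{
    return (uint16_t)(dw & 0xffff);
}
```

On a little-endian machine both functions agree; on a big-endian one the first returns the high-order half instead.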

James Dow Allen · Dec 1, 2009

Noob said:
James said:
An interesting thing about this union in my Jpeg compressor [...]

Would you recommend your JPEG compressor over libjpeg, or are you
just writing it for your own personal use?

I'd guess mine's still faster; maybe I'll download the other
some day and check. You're welcome to apply for a license on
my method, but I won't see any of the revenue. :-(

Did you solve the puzzle in the message to which you replied?

James

Phil Carmody · Dec 1, 2009

James Dow Allen said:
My Jpeg compressor has a union something like
union u {
    int32_t u_dw;
    int16_t u_w[2];
#define u_hiword u_w[0]
#define u_loword u_w[1]
};
with config/ifdef set up to reverse hiword/loword on some
systems. I suppose that it is NOT guaranteed that EITHER
hiword/loword assignment will work. Just hold your
breath and remember to double-check when you port to a
new system.

An interesting thing about this union in my Jpeg compressor
is that the code worked fine, if you used the *wrong*
hiword/loword assignment, except for a *slight* loss of arithmetic
accuracy at the highest fidelity settings. Clever c.l.c'ers,
with this hint, may have little trouble deducing what I was
doing with this union.

WSITD - U and V?

Phil

James Dow Allen · Dec 2, 2009

Phil Carmody said:
James Dow Allen said:
My Jpeg compressor has a union something like
union u {
    int32_t u_dw;
    int16_t u_w[2];
#define u_hiword u_w[0]
#define u_loword u_w[1]
};
with config/ifdef set up to reverse hiword/loword on some
systems. [...]
An interesting thing about this union in my Jpeg compressor
is that the code worked fine, if you used the *wrong*
hiword/loword assignment, except for a *slight* loss of arithmetic
accuracy at the highest fidelity settings. Clever c.l.c'ers,
with this hint, may have little trouble deducing what I was
doing with this union.

WSITD - U and V?

Translating this for the acronym-impaired, I think Phil means:

Wanton Smack in the Derriere - Cr Cb ?
where Cr, Cb are the chromaticity values
in a method like JFIF.

Well, simply swapping U and V will be much worse than a
"slight loss of arithmetic accuracy" on any but *extremely*
drab color images.

The explanation is actually rather simple,
but it involves a special (probably little-known)
technique, and may be *very* difficult to guess
without more clues. (If I thought there was an
interest for such puzzles, I might rephrase
it and post in comp.graphics or somewhere.)

Here's a big hint, though you'll still
need to put your thinking cap on for the
complete explanation:

Fvkgrra ovgf bs cerpvfvba ner whfg rabhtu
sbe onfryvar Wcrt vs vg'f cebcreyl pbqrq.
Jvgu gur snhygl uvjbeq/ybjbeq fjnc, *fbzr*
bs gur qngn raqrq hc, va rssrpg, jvgu bayl
svsgrra ovgf bs cerpvfvba.

Wnzrf Qbj Nyyra

David Thompson · Dec 17, 2009

...unless you run it on a PDP-11

Even on -11; you get the high 2 bytes. It's just not *consistent* with
the otherwise little-endian behavior where e.g. int16 punned as int8
(or u-char) gives you the low byte.
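
A small simulation may make the three cases concrete. This is a sketch under stated assumptions, not code from the thread: the names layout, store32 and load16_first are hypothetical, and the PDP-11 case assumes the commonly described layout in which a 32-bit value is stored high word first with each 16-bit word little-endian.

```c
#include <stdint.h>

enum layout { LAYOUT_LITTLE, LAYOUT_BIG, LAYOUT_PDP };

/* Store v into mem[0..3] the way each layout would place it in memory. */
void store32(uint8_t mem[4], uint32_t v, enum layout l)
{
    uint8_t b0 = (uint8_t)(v & 0xff);          /* least significant byte */
    uint8_t b1 = (uint8_t)((v >> 8) & 0xff);
    uint8_t b2 = (uint8_t)((v >> 16) & 0xff);
    uint8_t b3 = (uint8_t)((v >> 24) & 0xff);  /* most significant byte  */

    switch (l) {
    case LAYOUT_LITTLE: mem[0] = b0; mem[1] = b1; mem[2] = b2; mem[3] = b3; break;
    case LAYOUT_BIG:    mem[0] = b3; mem[1] = b2; mem[2] = b1; mem[3] = b0; break;
    case LAYOUT_PDP:    mem[0] = b2; mem[1] = b3; mem[2] = b0; mem[3] = b1; break;
    }
}

/* Read the 16-bit word at mem[0..1] using the layout's 16-bit byte order
   (little-endian for LAYOUT_LITTLE and LAYOUT_PDP, big-endian otherwise). */
uint16_t load16_first(const uint8_t mem[4], enum layout l)
{
    if (l == LAYOUT_BIG)
        return (uint16_t)((mem[0] << 8) | mem[1]);
    return (uint16_t)(mem[0] | (mem[1] << 8));
}
```

For the value 0x0A0B0C0D, the first 16-bit word comes out as the low half under LAYOUT_LITTLE and as the high half under both LAYOUT_BIG and LAYOUT_PDP, while the first byte under LAYOUT_PDP belongs to a little-endian word. That is the inconsistency described above.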

James Dow Allen · Dec 18, 2009

My Jpeg compressor has a union something like
union u {
    int32_t u_dw;
    int16_t u_w[2];
#define u_hiword u_w[0]
#define u_loword u_w[1]
};
with config/ifdef set up to reverse hiword/loword on some
systems. [...]
An interesting thing about this union in my Jpeg compressor
is that the code worked fine, if you used the *wrong*
hiword/loword assignment, except for a *slight* loss of arithmetic
accuracy at the highest fidelity settings. Clever c.l.c'ers,
with this hint, may have little trouble deducing what I was
doing with this union.

The explanation is actually rather simple,
but it involves a special (probably little-known)
technique, and may be *very* difficult to guess
without more clues. (If I thought there was an
interest for such puzzles, I might rephrase
it and post in comp.graphics or somewhere.)

Here's a big hint, though you'll still
need to put your thinking cap on for the
complete explanation:

Fvkgrra ovgf bs cerpvfvba ner whfg rabhtu
sbe onfryvar Wcrt vs vg'f cebcreyl pbqrq.
Jvgu gur snhygl uvjbeq/ybjbeq fjnc, *fbzr*
bs gur qngn raqrq hc, va rssrpg, jvgu bayl
svsgrra ovgf bs cerpvfvba.

Wnzrf Qbj Nyyra

There was a reply in this thread 2 weeks later,
and I thought someone might have solved the puzzle!
No....

I created this "programming puzzle"
spontaneously when I noticed the exact
union definition I used in this thread.
No one in c.l.c has solved the puzzle.

I believe the puzzle to be a difficult
challenge about low-level programming, but fair,
and could even be phrased as a good interview question.
I will probably post it to alt.math.rec in a
few days, expecting someone there may solve it quickly.

Thus it becomes a challenge for c.l.c!
Are you going to let a.m.r show you up?

James Dow Allen

gwowen · Dec 18, 2009

James Dow Allen said:
challenge about low-level programming, but fair,
and could even be phrased as a good interview question.
I will probably post it to alt.math.rec in a
few days, expecting someone there may solve it quickly.

My guess would be you're doing a very fast DCT by rearranging
(butterflying or diagonalisation or some combination) on the
individual bit level, and due to the symmetry of the rearranging, bits
get interleaved in such a way that the same bit in the top and bottom
halves have almost the same meaning...

Close?

James Dow Allen · Dec 18, 2009

gwowen said:
My guess would be you're doing a very fast DCT by rearranging
(butterflying or diagonalisation or some combination) on the
individual bit level, and due to the symmetry of the rearranging, bits
get interleaved in such a way that the same bit in the top and bottom
halves have almost the same meaning...

Close?

Not too close. It *is* a very fast DCT, but the butterfly procedure
itself is fairly ordinary, using adds, subtracts, shifts and
multiplies by very small integers. The data in the DCT procedure
is 16-bit, but the processor itself does 32-bit arithmetic.

One would expect (hi <--> lo) reversal to be catastrophic ...
or have no effect at all if the distinction was inessential.
Instead the reversal leads to a *tiny* loss of precision.
The puzzle is: what exactly was I doing to get this symptom?

James
