Discuss portability of this endian testing code

G Patel

Code in question (assuming a system with CHAR_BIT == 8):

union
{
    unsigned int whole;
    unsigned char bytes[sizeof(int)];
} var;

var.whole = 0xFF;

if( var.bytes[0] == 0xFF )
    printf("\nLITTLE ENDIAN\n");
else if( var.bytes[sizeof(int)-1] == 0xFF )
    printf("\nBIG ENDIAN\n");
else
    printf("\nHUH???\n");




I'm wondering about the portability of the above endian tester code
(don't worry about CHAR_BIT != 8 as an issue).

I've read some really knowledgeable posts on clc before that kept
emphasizing the fact that C pastes a small abstraction layer over
memory/hardware, and that 2 contiguous bytes in C's layer are not
necessarily 2 contiguous bytes in RAM (or process memory space), or
that the order of the bytes in C's layer is not necessarily the same
as in hardware. So with this in mind, can the above code be made more
portable?

Thanks

Gaya
 
Keith Thompson

G Patel said:
Code in question (assuming CHAR_BIT = 8 system):

union
{
unsigned int whole;
unsigned char bytes[sizeof(int)];
} var;

var.whole = 0xFF;

if( var.bytes[0] == 0xFF )
printf("\nLITTLE ENDIAN\n");
else if( var.bytes[sizeof(int)-1] == 0xFF )
printf("\nBIG ENDIAN\n");
else
printf("\nHUH???\n");

I'm wondering about the portability of the above ENDIAN tester code
(don't worry about CHAR_BIT != 8 as an issue).

I've read some really knowledgeable posts on clc before that kept
emphasizing the fact that C pastes a small abstraction layer over
memory/hardware. And that 2 contiguous bytes in C's layer is not
necessarily 2 contiguous in RAM (or process memory space) -or- the
order of the bytes in C's layer is not necessarily the same as
hardware. So with this in mind, can the above code be made more
portable?

I think you're talking about the distinction between physical memory
and virtual memory. On systems with virtual memory, that's all you
can see; it's not possible to access physical memory directly (except
*maybe* by some horribly system-specific low-level technique).

Within a C object, memory is contiguous, and addresses of successive
bytes are adjacent. (There are no guarantees across distinct objects,
except that their addresses are unique; any attempt to apply a
relational operator to the addresses of two objects, such as
"&obj1 < &obj2", invokes undefined behavior.)

I'd use "sizeof(unsigned int)" everywhere you used "sizeof(int)",
since you declared the "whole" member as unsigned int. It happens
that int and unsigned int are guaranteed to be the same size, but I
just now had to check the standard to confirm that; if you use
"unsigned int" consistently, the question doesn't arise.

You said not to worry about CHAR_BIT != 8, but I'll still mention that
the code will report LITTLE ENDIAN if sizeof(unsigned int) == 1 (which
can only happen if CHAR_BIT >= 16). If an int is a single byte, then
it has no meaningful byte ordering.

Apart from that, I think the code will work reliably as long as
unsigned int has no padding bits. If it does have padding bits,
there's a possibility that the 0xFF won't land in either the
high-order or the low-order byte. In that case, printing "HUH???" is
probably good enough.
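
For illustration, the points above could be folded into a function along these lines. This is only a sketch, not a definitive test: the enum and the name detect_byte_order are invented for the example. It uses unsigned int throughout, treats the one-byte case separately, and reports "unknown" when 0xFF lands in neither end byte. It still inspects only the first and last bytes, so it says nothing about the bytes in between.

```c
#include <string.h>

/* Hypothetical names, invented for this sketch. */
enum byte_order { ORDER_LITTLE, ORDER_BIG, ORDER_NONE, ORDER_UNKNOWN };

enum byte_order detect_byte_order(void)
{
    unsigned int whole = 0xFF;
    unsigned char bytes[sizeof whole];

    /* Copy the object representation; this avoids the union. */
    memcpy(bytes, &whole, sizeof whole);

    if (sizeof whole == 1)
        return ORDER_NONE;      /* a single byte has no byte ordering */
    if (bytes[0] == 0xFF)
        return ORDER_LITTLE;    /* low-order byte at the lowest address */
    if (bytes[sizeof whole - 1] == 0xFF)
        return ORDER_BIG;       /* low-order byte at the highest address */
    return ORDER_UNKNOWN;       /* e.g. padding bits or a mixed order */
}
```

A caller would switch on the result and print whatever message it likes; returning a value rather than printing keeps the check usable from other code.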
 
lovecreatesbeauty

Keith said:
G Patel said:
Code in question (assuming CHAR_BIT = 8 system):

union
{
unsigned int whole;
unsigned char bytes[sizeof(int)];
} var;

var.whole = 0xFF;

if( var.bytes[0] == 0xFF )
printf("\nLITTLE ENDIAN\n");
else if( var.bytes[sizeof(int)-1] == 0xFF )
printf("\nBIG ENDIAN\n");
else
printf("\nHUH???\n");

I'm wondering about the portability of the above ENDIAN tester code
(don't worry about CHAR_BIT != 8 as an issue).

I've read some really knowledgeable posts on clc before that kept
emphasizing the fact that C pastes a small abstraction layer over
memory/hardware. And that 2 contiguous bytes in C's layer is not
necessarily 2 contiguous in RAM (or process memory space) -or- the
order of the bytes in C's layer is not necessarily the same as
hardware. So with this in mind, can the above code be made more
portable?
You said not to worry about CHAR_BIT != 8, but I'll still mention that
the code will report LITTLE_ENDIAN if sizeof(unsigned int) == 1 (which
can only happen if CHAR_BIT >= 16). If an int is a single byte, then
it has no meaningful byte ordering.

I used unsigned short with G Patel's code and got LITTLE ENDIAN on
Linux + i386 and BIG ENDIAN on an HP 9000/800/rp3410.
CHAR_BIT != 8 is unrelated to sizeof(unsigned int) != 4. Byte order
is meaningful wherever a value is represented in more than one byte.
 
Frederick Gotham

G Patel posted:
union
{
unsigned int whole;
unsigned char bytes[sizeof(int)];
} var;


This has been discussed several times.

Watch out for:

(1) Padding inside an int.
(2) Trap values for an int.
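
As a rough way to check point (1) on a given implementation, one can count the value bits of unsigned int and compare that with the size of its object representation. The helper names below are made up for this sketch. Note that inspecting an object through unsigned char is always allowed, so the byte-wise reads themselves cannot hit a trap representation.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: counts the value bits of unsigned int by
   shifting UINT_MAX right until it reaches zero. */
int uint_value_bits(void)
{
    unsigned int v = UINT_MAX;
    int bits = 0;

    while (v != 0) {
        v >>= 1;
        bits++;
    }
    return bits;
}

/* Nonzero if the object representation has more bits than the value,
   i.e. if unsigned int contains padding bits. */
int uint_has_padding(void)
{
    return sizeof(unsigned int) * CHAR_BIT != (size_t)uint_value_bits();
}
```

If uint_has_padding() returns nonzero, a byte-by-byte endianness test on unsigned int should not be trusted without further checks.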

I posted some C++ code a while back which does this; give me a minute and
I'll convert it to C... *time passes*. Here's my best shot at a C-ification.

(I wasn't sure which format specifier would print a size_t.)

#include <stddef.h>
#include <limits.h>

typedef unsigned UType;

typedef struct ByteIndexes {
    size_t indexes[sizeof(UType)];
} ByteIndexes;

/* Writes the value k into the byte of significance k
   (k = 1 .. sizeof(UType)-1), then reads the bytes back in memory
   order. Assumes sizeof(UType) > 1 and no padding bits. */
ByteIndexes DetermineIndexes(void)
{
    ByteIndexes bi;
    size_t *pindex = bi.indexes;

    UType guinea_pig = 0;
    char unsigned const *p = (char unsigned const*)&guinea_pig;
    char unsigned const *const pover = (char unsigned const*)(&guinea_pig + 1);

    UType byte_number = 1;

    do guinea_pig |= byte_number << CHAR_BIT * byte_number;
    while (++byte_number != sizeof guinea_pig);

    do *pindex++ = *p++;
    while(p != pover);

    return bi;
}

/* Looks up entry i of the table, building the table on first use. */
size_t LSBIndexToByteIndex(size_t const i)
{
    ByteIndexes static bi;

    int static first_time = 1;

    if(first_time) first_time = 0, bi = DetermineIndexes();

    return bi.indexes[i];
}

#include <stdio.h>

int main(void)
{
    size_t i;

    printf("============================\n"
           "|| Byte Order ||\n"
           "============================\n\n"
           "LSB: Byte 0 -- Memory Address ");

    for(i = 0;;)
    {
        printf("%lu\n",LSBIndexToByteIndex(i++));

        if (sizeof(UType) == i) break;

        printf(
            sizeof(UType) == i + 1 ?
                "MSB: Byte %lu -- Memory Address "
              : " Byte %lu -- Memory Address ",i);
    }

    return 0;
}
 
Clark S. Cox III

Frederick said:
G Patel posted:
union
{
unsigned int whole;
unsigned char bytes[sizeof(int)];
} var;


This has been discussed several times.

Watch out for:

(1) Padding inside an int.
(2) Trap values for an int.

I posted some C++ code a while back which does this; give me a minute and
I'll convert it to C... *time passes*. Here's my best shot at a C-ification.

(I wasn't sure which format specifier would print a size_t.)

FYI: As of C99, %zu will print a size_t. Before C99, there wasn't one,
and the best one could do was to cast the size_t to an unsigned long and
then use %lu.
 
Keith Thompson

Frederick Gotham said:
I posted some C++ code a while back which does this; give me a minute and
I'll convert it to C... *time passes*. Here's my best shot at a C-ification.

(I wasn't sure which format specifier would print a size_t.) [...]
size_t LSBIndexToByteIndex(size_t const i) [...]
printf("%lu\n",LSBIndexToByteIndex(i++));

That's not it. "%lu" expects an unsigned long, which is often
compatible with size_t (in fact I don't think I've ever seen a system
where it isn't), but it's not guaranteed.

C99 has "%zu", but a lot of *printf() implementations don't support
that.

The most portable solution is to use "%lu" and cast the size_t
argument to unsigned long:

printf("%lu\n", (unsigned long)LSBIndexToByteIndex(i++));

This can fail if size_t is bigger than unsigned long (not possible in
C90 and explicitly discouraged in C99) *and* if the actual value
exceeds ULONG_MAX; in that case, the printed result will be reduced
modulo ULONG_MAX+1.
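
A small sketch of the cast approach, wrapped in a helper so it can be reused. The name format_size is invented for this example, and the caller is assumed to supply a buffer large enough for the digits of ULONG_MAX plus a terminating null. On a C99 implementation whose printf family supports it, "%zu" with no cast would do the same job.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats a size_t using only C90 features.
   Returns the number of characters written, as sprintf does.
   buf must be large enough for the digits of ULONG_MAX plus '\0'. */
int format_size(char *buf, size_t n)
{
    /* The cast makes the argument type match "%lu" exactly. */
    return sprintf(buf, "%lu", (unsigned long)n);
}
```

A 32-byte buffer is enough for a 64-bit unsigned long, which needs at most 20 decimal digits.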
 
Gordon Burditt

Code in question (assuming CHAR_BIT = 8 system):

There are 24 possible byte orders if sizeof(int) = 4, and 40,320
possible byte orders if sizeof(int) = 8. (Note that this does NOT
depend on CHAR_BIT = 8, but it still works if it is).

Your code mis-identifies some of these as big-endian, some as little-endian,
and identifies some of these (correctly, I guess) under the collective
name HUH??? .
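
To make the counting concrete: the figures are 4! = 24 and 8! = 40,320, one order per permutation of the bytes. The sketch below uses names invented for the example and assumes n >= 2. It computes that count and classifies a layout as little-endian or big-endian only if every byte is in the expected place, not just an end byte. Here sig[j] is taken to mean the significance (0 = least significant) of the byte stored at memory offset j.

```c
#include <stddef.h>

/* Hypothetical names, invented for this sketch. */
enum order_class { CLASS_LITTLE, CLASS_BIG, CLASS_OTHER };

/* Number of possible byte orders for an n-byte integer: n factorial.
   Overflows unsigned long for large n; fine for n = 4 or 8. */
unsigned long count_orders(size_t n)
{
    unsigned long result = 1;
    size_t k;

    for (k = 2; k <= n; k++)
        result *= (unsigned long)k;
    return result;
}

/* sig[j] = significance of the byte at memory offset j; assumes n >= 2. */
enum order_class classify_order(const unsigned char *sig, size_t n)
{
    int little = 1, big = 1;
    size_t j;

    for (j = 0; j < n; j++) {
        if (sig[j] != j)
            little = 0;         /* byte j is not where little-endian puts it */
        if (sig[j] != n - 1 - j)
            big = 0;            /* byte j is not where big-endian puts it */
    }
    if (little)
        return CLASS_LITTLE;
    if (big)
        return CLASS_BIG;
    return CLASS_OTHER;
}
```

For example, a layout of {0, 2, 1, 3} has its least significant byte first, so a test that looks only at the first byte would call it little-endian, while classify_order reports CLASS_OTHER.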
 
