# Endianness...

Discussion in 'C++' started by pellepluttrick@yahoo.com, Feb 2, 2005.

1. ### Guest

Hi,
I thought I understood this stuff - but...

This little program (taken from W. Richard Stevens, Unix Network
Programming, Volume 1) determines a machine's endianness:

    #include <iostream>
    using namespace std;

    int main()
    {
        union {
            unsigned short us;
            char c[sizeof(unsigned short)];
        } un;

        un.us = 0x1234;

        if (sizeof(unsigned short) == 2) {
            if (un.c[0] == 0x12 && un.c[1] == 0x34)
                cout << "big-endian\n";
            else if (un.c[0] == 0x34 && un.c[1] == 0x12)
                cout << "little-endian\n";
            else
                cout << "unknown\n";
        } else
            cout << "sizeof(unsigned short) = " << sizeof(unsigned short)
                 << ".\n";
    }

On my machine it prints little-endian (as expected). This obviously
must mean that the bits are laid out in memory as: "0x34, 0x12". Right?

So if I want to send an unsigned short over the network from this
machine (which should be done in network byte-order/big-endian) I must
convert the order of the bytes in memory so this unsigned short is sent
as "0x12, 0x34". Right? (Oh, btw I do not have htons on this platform
:-( ... )

Now the following short block of code converts the unsigned short into
a stream of two bytes:

    // Remember: stored in memory as 0x34 0x12
    unsigned short us = 0x1234;

    char buf[2];

    // Convert from little-endian to big endian
    buf[0] = us & 0xFF; // Should now contain 0x12
    buf[1] = (us >> 8) & 0xFF; // Should now contain 0x34

    // Print the stream...
    printf("0x%.2X", buf[0]);
    printf(", ");
    printf("0x%.2X", buf[1]);
    printf("\n");

To my great surprise it printed:
0x34, 0x12

What?!?!?!?

If I change to:
    buf[0] = (us >> 8) & 0xFF;
    buf[1] = us & 0xFF;
everything works OK but I think it should not! Sigh...

/Pelle

, Feb 2, 2005

2. ### Victor Bazarov Guest

wrote:
> [...]
> // Remember: stored in memory as 0x34 0x12

That means that 0x34 has *lower* address than 0x12.

> unsigned short us = 0x1234;
>
> char buf[2];
>
> // Convert from little-endian to big endian
> buf[0] = us & 0xFF; // Should now contain 0x12

Why? (0x1234 & 0x00ff) gives 0x34.

> buf[1] = (us >> 8) & 0xFF; // Should now contain 0x34

Why? (0x1234 >> 8) gives 0x12.

> // Print the stream...
> printf("0x%.2X", buf[0]);
> printf(", ");
> printf("0x%.2X", buf[1]);
> printf("\n");
>
> To my great surprise it printed:
> 0x34, 0x12
>
> What?!?!?!?
>
> If I change to:
> buf[0] = (us >> 8) & 0xFF;
> buf[1] = us & 0xFF;
> everything works OK but I think it should not! Sigh...

Write down what the bit patterns look like and perform shifts and ANDs.
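
Written out, the arithmetic looks like the sketch below. The helper names `low_byte`, `high_byte` and `print_bytes` are made up for illustration and do not appear in the original code; the bit patterns in the comments are for the value 0x1234 from the question.

```cpp
#include <cstdio>

// Hypothetical helpers, named only for this illustration.
// 0x1234 in binary is 0001 0010 0011 0100.

// 0001 0010 0011 0100 & 0000 0000 1111 1111 = 0000 0000 0011 0100 (0x34)
constexpr unsigned char low_byte(unsigned short v)
{
    return static_cast<unsigned char>(v & 0xFF);
}

// 0001 0010 0011 0100 >> 8 = 0000 0000 0001 0010 (0x12)
constexpr unsigned char high_byte(unsigned short v)
{
    return static_cast<unsigned char>((v >> 8) & 0xFF);
}

// Prints the low byte first, then the high byte, like the code in the question.
void print_bytes(unsigned short v)
{
    std::printf("0x%.2X, 0x%.2X\n",
                static_cast<unsigned>(low_byte(v)),
                static_cast<unsigned>(high_byte(v)));
}
```

Calling `print_bytes(0x1234)` prints `0x34, 0x12`, which is the output that surprised the original poster.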

Victor

Victor Bazarov, Feb 2, 2005

3. ### Victor Bazarov Guest

Victor Bazarov wrote:
> wrote:
>
>> [...]
>> // Remember: stored in memory as 0x34 0x12
>
> That means that 0x34 has *lower* address than 0x12.
>
>> unsigned short us = 0x1234;
>>
>> char buf[2];
>>
>> // Convert from little-endian to big endian
>> buf[0] = us & 0xFF; // Should now contain 0x12
>
> Why? (0x1234 & 0x00ff) gives 0x34.

Wanted to add: "...regardless of endianness."

>> buf[1] = (us >> 8) & 0xFF; // Should now contain 0x34
>
> Why? (0x1234 >> 8) gives 0x12.

regardless of endianness, again.

> [...]

Victor Bazarov, Feb 2, 2005
4. ### Gianni Mariani Guest

Gianni Mariani, Feb 3, 2005
5. ### Howard Guest

"Victor Bazarov" <> wrote in message
news:gsaMd.41110\$01.us.to.verio.net...
> Victor Bazarov wrote:
>> wrote:
>>
>>> [...]
>>> // Remember: stored in memory as 0x34 0x12
>>
>> That means that 0x34 has *lower* address than 0x12.
>>
>>> unsigned short us = 0x1234;
>>>
>>> char buf[2];
>>>
>>> // Convert from little-endian to big endian
>>> buf[0] = us & 0xFF; // Should now contain 0x12
>>
>> Why? (0x1234 & 0x00ff) gives 0x34.
>
> Wanted to add: "...regardless of endianness."
>
>>> buf[1] = (us >> 8) & 0xFF; // Should now contain 0x34
>>
>> Why? (0x1234 >> 8) gives 0x12.
>
> regardless of endianness, again.
>
>> [...]

Hi Pelle

Just in case it wasn't clear to you what Victor was saying: when you
shift bits like that, you don't have to worry about the order of the bytes
in physical memory. Just treat it as you would on paper. Shifting
to the right, for example, moves the high-order bits towards the
low-order side, and it does not matter whether the low-order byte is stored
at a higher or lower address in physical memory. So, using your example,
0x1234 >> 8 will always result in 0x0012, and never in 0x3400,
regardless of the machine's byte ordering.
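
A small sketch of that distinction, assuming a 16-bit `unsigned short`. Both function names are invented for this example. The first works on the value and gives the same answer everywhere; the second looks at how the value is stored and is the only one whose result depends on the host.

```cpp
#include <cstring>

// Value-level view: the shift operates on the number, not on its storage,
// so this returns 0x0012 for 0x1234 on any machine.
constexpr unsigned short shift_right_8(unsigned short v)
{
    return static_cast<unsigned short>(v >> 8);
}

// Memory-level view: returns the byte stored at the lowest address of v.
// For 0x1234 (and a 2-byte short) this is 0x34 on a little-endian host
// and 0x12 on a big-endian host.
unsigned char first_byte_in_memory(unsigned short v)
{
    unsigned char bytes[sizeof v];
    std::memcpy(bytes, &v, sizeof v);  // copy the object representation
    return bytes[0];
}
```

Only code that inspects storage, like `first_byte_in_memory` or the union in the original program, can observe the byte order at all.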

You only really need to worry about byte ordering when reading values
from a data stream on a machine with one ordering when that data was
written on a machine with the opposite ordering.
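
One way to sidestep the problem when `htons` is not available is to build the stream byte by byte with shifts, so the host's ordering never enters the picture. This is only a sketch; `put_u16_be` and `get_u16_be` are hypothetical names, and it assumes the value fits in 16 bits.

```cpp
// Writes v into out[0] and out[1] in network byte order
// (most significant byte first), whatever the host ordering is.
inline void put_u16_be(unsigned char* out, unsigned short v)
{
    out[0] = static_cast<unsigned char>((v >> 8) & 0xFF);  // high byte
    out[1] = static_cast<unsigned char>(v & 0xFF);         // low byte
}

// Reads a 16-bit value stored in network byte order at in[0], in[1].
inline unsigned short get_u16_be(const unsigned char* in)
{
    return static_cast<unsigned short>((in[0] << 8) | in[1]);
}
```

For example, after `put_u16_be(buf, 0x1234)` the buffer holds 0x12 followed by 0x34, and `get_u16_be(buf)` gives back 0x1234 on either kind of machine. Using `unsigned char` for the buffer also avoids sign-extension surprises when a byte above 0x7F is passed to `printf`.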

(Our software uses a compile flag for this: when compiling for the PC,
the byte ordering is intentionally reversed when reading or writing, so
that we know it will match what's done on the Mac. When compiled on the
Mac, reading and writing are done without changing the ordering. We use
#defines and #ifdefs to determine whether to call the order-reversing code
or not.)
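
A compile-flag scheme along those lines might look roughly like this. It is a guess at the general shape, not the actual code being described; the macro `SWAP_BYTES_ON_IO` and the functions `swap16` and `to_file_order` are invented for the example.

```cpp
// Reverses the two bytes of a 16-bit value.
constexpr unsigned short swap16(unsigned short v)
{
    return static_cast<unsigned short>(((v & 0xFFu) << 8) | ((v >> 8) & 0xFFu));
}

// Hypothetical flag: define SWAP_BYTES_ON_IO when building for the platform
// whose byte order differs from the one the data format uses.
#ifdef SWAP_BYTES_ON_IO
constexpr unsigned short to_file_order(unsigned short v) { return swap16(v); }
#else
constexpr unsigned short to_file_order(unsigned short v) { return v; }
#endif
```

Because swapping twice restores the original value, the same function can be applied both before writing and after reading. The cost of this design is that the build has to be told the host's byte order, which the shift-based approach does not need.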

-Howard

Howard, Feb 3, 2005