Endianness...

pellepluttrick

Hi,
I thought I understood this stuff - but...

This little program (taken from W. Richard Stevens' Unix Network
Programming, Volume 1) determines a machine's endianness:

#include <iostream>
using namespace std;

int main()
{
    union {
        unsigned short us;
        char c[sizeof(unsigned short)];
    } un;

    un.us = 0x1234;

    if (sizeof(unsigned short) == 2) {
        if (un.c[0] == 0x12 && un.c[1] == 0x34)
            cout << "big-endian\n";
        else if (un.c[0] == 0x34 && un.c[1] == 0x12)
            cout << "little-endian\n";
        else
            cout << "unknown\n";
    } else
        cout << "sizeof(unsigned short) = " << sizeof(unsigned short)
             << ".\n";
}

On my machine it prints little-endian (as expected). This obviously
must mean that the bytes are laid out in memory as "0x34, 0x12". Right?

So if I want to send an unsigned short over the network from this
machine (which should be done in network byte-order/big-endian) I must
convert the order of the bytes in memory so this unsigned short is sent
as "0x12, 0x34". Right? (Oh, btw I do not have htons on this platform
:-( ... )

Now the following short block of code converts the unsigned short into
a stream of two bytes:

// Remember: stored in memory as 0x34 0x12
unsigned short us = 0x1234;

char buf[2];

// Convert from little-endian to big endian
buf[0] = us & 0xFF; // Should now contain 0x12
buf[1] = (us >> 8) & 0xFF; // Should now contain 0x34

// Print the stream...
printf("0x%.2X", buf[0]);
printf(", ");
printf("0x%.2X", buf[1]);
printf("\n");

To my great surprise it printed:
0x34, 0x12

What?!?!?!?

If I change to:
buf[0] = (us >> 8) & 0xFF;
buf[1] = us & 0xFF;
everything works OK but I think it should not! Sigh...
Please enlighten me!

/Pelle
 
Victor Bazarov

[...]
// Remember: stored in memory as 0x34 0x12

That means that 0x34 has *lower* address than 0x12.
unsigned short us = 0x1234;

char buf[2];

// Convert from little-endian to big endian
buf[0] = us & 0xFF; // Should now contain 0x12

Why? (0x1234 & 0x00ff) gives 0x34.
buf[1] = (us >> 8) & 0xFF; // Should now contain 0x34

Why? (0x1234 >> 8) gives 0x12.
// Print the stream...
printf("0x%.2X", buf[0]);
printf(", ");
printf("0x%.2X", buf[1]);
printf("\n");

To my great surprise it printed:
0x34, 0x12

What?!?!?!?

If I change to:
buf[0] = (us >> 8) & 0xFF;
buf[1] = us & 0xFF;
everything works OK but I think it should not! Sigh...
Please enlighten me!

Write down what the bit patterns look like and perform shifts and ANDs.
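
For instance, a minimal sketch using the same names you already have
(nothing here depends on the machine's byte order):

#include <cstdio>

int main()
{
    unsigned short us = 0x1234;

    // The AND and the shift operate on the value 0x1234 itself, not on
    // its bytes in memory, so the results are the same on a big-endian
    // and a little-endian machine.
    unsigned char lo = us & 0xFF;         // 0x34 - the low-order byte
    unsigned char hi = (us >> 8) & 0xFF;  // 0x12 - the high-order byte

    // To send in network (big-endian) order, the high byte goes first.
    unsigned char buf[2] = { hi, lo };

    std::printf("0x%.2X, 0x%.2X\n", buf[0], buf[1]);  // prints 0x12, 0x34
}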

Victor
 
Victor Bazarov

Victor said:
[...]
// Remember: stored in memory as 0x34 0x12


That means that 0x34 has *lower* address than 0x12.
unsigned short us = 0x1234;

char buf[2];

// Convert from little-endian to big endian
buf[0] = us & 0xFF; // Should now contain 0x12


Why? (0x1234 & 0x00ff) gives 0x34.

Wanted to add: "...regardless of endianness."
buf[1] = (us >> 8) & 0xFF; // Should now contain 0x34


Why? (0x1234 >> 8) gives 0x12.

regardless of endianness, again.
 
Howard

Victor Bazarov said:
Victor said:
[...]
// Remember: stored in memory as 0x34 0x12


That means that 0x34 has *lower* address than 0x12.
unsigned short us = 0x1234;

char buf[2];

// Convert from little-endian to big endian
buf[0] = us & 0xFF; // Should now contain 0x12


Why? (0x1234 & 0x00ff) gives 0x34.

Wanted to add: "...regardless of endianness."
buf[1] = (us >> 8) & 0xFF; // Should now contain 0x34


Why? (0x1234 >> 8) gives 0x12.

regardless of endianness, again.

Hi Pelle

Just in case it wasn't clear to you what Victor was saying: when you
shift bits like that, you don't have to worry about what the order of bytes
in the physical memory is. Just treat it like you would on paper. Shifting
to the right, for example, will shift the high-order bits towards the
low-order side, and it does not matter whether the low-order byte is stored
in a higher or lower location in physical memory. So, using your example,
doing 0x1234 >> 8 will always result in 0x0012, and never in 0x3400,
regardless of the machine's byte ordering.

You only really need to worry about byte ordering when reading values
from a data stream on a machine that has one ordering when that data was
written using the opposite ordering.
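
For example, a value that was written in network (big-endian) order can
be read back with shifts, so the same code works on either kind of
machine. A minimal sketch (the buffer contents here are just an example):

#include <cstdio>

int main()
{
    // Two bytes as they might arrive from a big-endian stream,
    // high-order byte first.
    unsigned char buf[2] = { 0x12, 0x34 };

    // Rebuild the value with shifts; no assumption is made about how
    // this machine lays out an unsigned short in memory.
    unsigned short us = static_cast<unsigned short>((buf[0] << 8) | buf[1]);

    std::printf("0x%.4X\n", static_cast<unsigned>(us));  // 0x1234 on any host
}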

(What our software does is use a compile flag that, when compiling for
the PC, byte ordering is intentionally reversed when reading or writing, so
that we know it will match what's done on the Mac. When compiled on the
Mac, reading and writing is done without changing the ordering. We use
#defines and #ifdefs to determine whether to call the order-reversing code
or not.)
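
A rough sketch of that kind of switch might look like the following; the
macro and function names are only illustrative, not our actual code:

#include <cstdio>

// Hypothetical compile flag: defined (e.g. by the build) when compiling
// for the little-endian target, so that stream data stays big-endian.
#define SWAP_BYTES 1

static unsigned short to_stream_order(unsigned short us)
{
#if SWAP_BYTES
    // Reverse the two bytes before writing (and again after reading).
    return static_cast<unsigned short>(((us & 0xFF) << 8) | ((us >> 8) & 0xFF));
#else
    // Native order already matches the stream; pass the value through.
    return us;
#endif
}

int main()
{
    unsigned short us = 0x1234;
    // With SWAP_BYTES set this prints 0x3412; writing that value's two
    // bytes to the stream puts 0x12 first on a little-endian machine.
    std::printf("0x%.4X\n", static_cast<unsigned>(to_stream_order(us)));
}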

-Howard
 
