Big Endian - Little Endian

Niranjan

I have this program:

void main()
{
    int i=1;
    if((*(char*)&i)==1)
        printf("The machine is little endian.");
    else
        printf("The machine is big endian.");
}

This program tells me if the machine uses big-endian or little-endian
format.
But I am not able to understand the working of this program.
Can someone please explain how it works?

Thanks in advance.

Thanks,
Niranjan.
 
dertopper

I have this program:

void main()
{
    int i=1;
    if((*(char*)&i)==1)
        printf("The machine is little endian.");
    else
        printf("The machine is big endian.");
}

This program tells me if the machine uses big-endian or little-endian
format.
But I am not able to understand the working of this program.
Can someone please explain how it works?

This allocates space for an integer variable (usually 4 bytes) on the
stack (if your environment supports this concept :). The bit pattern
for big-endian systems will be 00000001_hex, and 01000000_hex for
little-endian systems.
if((*(char*)&i)==1)

The pointer to i is cast to a pointer to char, leaving the pointer
value the same, but changing how the pointer is treated.

   a      a+1    a+2    a+3
-----------------------------
|  01  |  00  |  00  |  00  |   little endian
-----------------------------
   ^
   |
pointer to i

   a      a+1    a+2    a+3
-----------------------------
|  00  |  00  |  00  |  01  |   big endian
-----------------------------
   ^
   |
pointer to i

After the pointer has been cast to char*, it either points to a byte
that contains a one or a zero.
printf("The machine is little endian.");
else
printf("The machine is big endian.");

Regards,
Stuart
 
Juha Nieminen

This allocates space for an integer variable (usually 4 bytes) on the
stack (if your environment supports this concept :).

Btw, does the standard guarantee that sizeof(int) > sizeof(char)?
(OTOH, would endianness have any meaning in a system where they have
the same size?)

Moreover, does the standard guarantee that you can reinterpret-cast an
int* to a char*, and then dereference it safely?
 
Pascal J. Bourguignon

Juha Nieminen said:
Btw, does the standard guarantee that sizeof(int) > sizeof(char)?
No.


(OTOH, would endianness have any meaning in a system where they have
the same size?)

It could still have meaning: not from the C point of view, but from
the host's, and for interchange, it could still matter.

AFAIK, a C implementation could choose to have char = short = int = 32-bit
even on an octet-addressed machine, by using only 32-bit aligned pointers.
Then the byte sex would matter for any IPC.

On the other hand, on a machine where the byte size is really 32-bit,
the natural choice for the C compiler would be char = short = int =
32-bit and there wouldn't be any meaningful byte sex consideration for
local IPC, but this would still matter for remote IPC, network byte
order. But in this case, it would have to be solved cleanly, with
arithmetic on ints, instead of tricks on bytes.
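
For instance, a sketch of such arithmetic: writing a 32-bit value as
four octets in network (big-endian) order with shifts and masks, so
the host's byte order never enters into it. (The helper name here is
made up for illustration.)

#include <cstdint>

// Hypothetical helper: serialize a 32-bit value into network
// (big-endian) byte order using only value arithmetic. The host's
// internal byte order is irrelevant to the result.
void store_be32(std::uint32_t v, unsigned char* out)
{
    out[0] = static_cast<unsigned char>((v >> 24) & 0xFF);
    out[1] = static_cast<unsigned char>((v >> 16) & 0xFF);
    out[2] = static_cast<unsigned char>((v >> 8) & 0xFF);
    out[3] = static_cast<unsigned char>(v & 0xFF);
}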

Moreover, does the standard guarantee that you can reinterpret-cast an
int* to a char*, and then dereference it safely?

AFAIK, no. That is, there could be pad bits, or some other strange
mapping, so the test proposed wouldn't be right.

But in practice, on current machines, it works.
 
Nick Keighley

AFAIK, no.  That is, there could be pad bits, or some other strange
mapping, so the test proposed wouldn't be right.

But in practice, on current machines, it works.

C (and by inheritance C++) guarantees that you can cast any
data pointer ("object" in C-speak) to a pointer to unsigned
char and be able to deref it. Hence you can always hex
dump the representation of an object. Unsigned chars *cannot*
have trap representations.
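
For instance, a sketch of such a hex dump, relying on the unsigned
char guarantee (the function name is made up for illustration):

#include <cstddef>
#include <cstdio>

// Dump the object representation of any object, byte by byte.
// Reading through unsigned char* is the aliasing that the standard
// guarantees to be safe.
template <typename T>
void hex_dump(const T& obj)
{
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&obj);
    for (std::size_t i = 0; i < sizeof(T); ++i)
        std::printf("%02x ", static_cast<unsigned>(p[i]));
    std::printf("\n");
}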

Also, big-endian and little-endian don't exhaust the possibilities;
there are strange DEC-endians as well.
 
Bo Persson

Francis said:
I've heard the DEC PDP-11 endianness called "PDP-endian", in which
the value 0x01234567 is stored as the bytes 0x23 0x01 0x67 0x45 at
increasing memory addresses.

Yes.

Some processors have run-time configured endianness; it can be
different for different programs.

The DEC-endians also include the possibility of different endianness
for integers and floating point.



Bo Persson
 
Niranjan

All,

Thanks for your answers.
The only thing that I am still not clear about is why we need to
typecast int* to char*.
What difference does it make?
We are not doing any pointer arithmetic or iterations here.
Then what is it that forces us to explicitly cast the pointer to
(char *)?

Thanks,
Niranjan.
 
Bo Persson

Niranjan said:
All,

Thanks for your answers.
The only thing that I am still not clear about is why we need to
typecast int* to char*.
What difference does it make?
We are not doing any pointer arithmetic or iterations here.
Then what is it that forces us to explicitly cast the pointer to
(char *)?

Type char does double duty in C++, as it is both a character type and
the language's definition of a byte. So by casting an int pointer to a
char pointer, you access the first byte of the object.

Assuming, again, that an int is larger than a char, you access just
one part of the integer.

As you have seen in other posts, this doesn't cover all the corner
cases, but it works well on popular desktop computers. On the other
hand, these computers are also known to be little endian anyway.
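
As an aside, the same first byte can be read with memcpy instead of a
pointer cast; a sketch, equivalent in effect:

#include <cstdio>
#include <cstring>

int main()
{
    int i = 1;
    unsigned char first_byte;
    // Copy out the lowest-addressed byte of i's representation
    // instead of dereferencing a cast pointer; same result.
    std::memcpy(&first_byte, &i, 1);
    std::printf("%s\n", first_byte == 1 ? "little endian" : "big endian");
}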


Bo Persson
 
dertopper

All,

Thanks for your answers.
The only thing that I am still not clear about is why we need to
typecast int* to char*.
What difference does it make?
We are not doing any pointer arithmetic or iterations here.
Then what is it that forces us to explicitly cast the pointer to
(char *)?

Thanks,
Niranjan.

The explicit cast is necessary to force the compiler to treat the
memory of the integer variable as if it were memory holding a char
variable. Dereferencing the cast pointer will read only the byte at
location a, whereas dereferencing the original int pointer will read
the bytes at locations a to a + 3.

Regards,
Stuart
 
James Kanze

I have this program:

void main()
{
    int i=1;
    if((*(char*)&i)==1)
        printf("The machine is little endian.");
    else
        printf("The machine is big endian.");
}
This program tells me if the machine uses big-endian or little-endian
format.

And if it uses some other format?

But I am not able to understand the working of this program.
Can someone please explain how it works?

It doesn't work. It has unspecified behavior. (Actually, it
shouldn't even compile, because of the void main, but that's a
different issue.)

FWIW: I've yet to find a case where I needed to know byte order
and didn't have other, more important implementation
dependencies. (Off hand, the only time I can remember needing to
know byte order was when implementing modf in the C standard
library. And of course, that code very much depended on knowing
many of the details of the floating point format as well.)
 
James Kanze

This allocates space for an integer variable (usually 4 bytes)
on the stack (if your environment supports this concept :).

Often 4 bytes today, but I imagine that there are still a lot of
machines where it is 2 bytes. Values of 6 and 1 are also known,
and other values wouldn't surprise me either.
The bit pattern for big-endian systems will be 00000001_hex,
and 01000000_hex for little-endian systems.

The bit pattern is required by the standard to be 0x00000001
(supposing 32 bits). No other alternatives are allowed.
The pointer to i is cast to a pointer to char, leaving the pointer
value the same, but changing how the pointer is treated.

Again, on most machines. There are (or have been) machines
where the pointer value will change; there are (or have been)
machines where the two pointers will not even have the same
size.
   a      a+1    a+2    a+3
-----------------------------
|  01  |  00  |  00  |  00  |   little endian
-----------------------------
   ^
   |
pointer to i

   a      a+1    a+2    a+3
-----------------------------
|  00  |  00  |  00  |  01  |   big endian
-----------------------------
   ^
   |
pointer to i

After the pointer has been cast to char*, it either points
to a byte that contains a one or a zero.

That's generally true on most modern general purpose machines,
but you can't count on it.
 
James Kanze

Btw, does the standard guarantee that sizeof(int) > sizeof(char)?
No.

(OTOH, would endianness have any meaning in a system where they have
the same size?)

Does it really have any meaning internally even when the sizes
are different?
Moreover, does the standard guarantee that you can reinterpret-cast an
int* to a char*, and then dereference it safely?

Yes, but the results are unspecified.
 
James Kanze

On 27 Aug, 14:39, (e-mail address removed) (Nick Keighley)
wrote:
C (and by inheritance C++) guarantees that you can cast any
data pointer ("object" in C-speak) to a pointer to unsigned
char and be able to deref it. Hence you can always hex
dump the representation of an object. Unsigned chars *cannot*
have trap representations.

The C++ standard also makes this guarantee for char, I think.
On the other hand, it doesn't guarantee that two chars which
compare equal will have the same bit representation; only
unsigned char guarantees that.
Also, big-endian and little-endian don't exhaust the
possibilities; there are strange DEC-endians as well.

Or earlier versions of Microsoft C on a PC, where 32-bit longs
had the order 2301 (where each digit is the power of 256
represented in the byte). Note too that there are (currently)
machines with 9-bit chars, machines where ints contain padding
bits, machines which don't use two's complement, etc.

Pascal stated it more or less clearly. Internally, you don't
care about byte order---there may not even be any. Externally,
the protocol defines the *values* for each octet, given an
integral value; you use value operations on the int to get them.
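
For instance, the reading side as a sketch: rebuild the value from the
protocol-defined octets with shifts, never looking at the host's
layout (the helper name is made up for illustration):

#include <cstdint>

// Hypothetical helper: rebuild a 32-bit value from four octets in
// network (big-endian) order, using value arithmetic only.
std::uint32_t load_be32(const unsigned char* in)
{
    return (static_cast<std::uint32_t>(in[0]) << 24)
         | (static_cast<std::uint32_t>(in[1]) << 16)
         | (static_cast<std::uint32_t>(in[2]) << 8)
         |  static_cast<std::uint32_t>(in[3]);
}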
 
Juha Nieminen

James said:
Does it really have any meaning internally even when the sizes
are different?

If you write some values to a file, e.g. with fwrite(), it can make a
difference.
 
James Kanze

If you write some values to a file, e.g. with fwrite(), it can
make a difference.

I'm afraid I don't understand. fwrite() really doesn't do
anything that ostream::write() doesn't; it just has an interface
which pretends to. If you have to reread the file at some
future date, possibly with a different program, or a new version
of the same program, then you have to write (and read) a
specified format. Neither fwrite() nor ostream::write() does
this; neither really makes much sense unless the argument is a
preformatted buffer (except for the case where you are using the
file as extended memory within the program---frequent back in
the days of 8 or 16 bit processors and a total memory of only
64 KB, but I can't imagine the need today, with a 64-bit virtual
address space).
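
To make the difference concrete, a sketch contrasting a raw dump with
a defined on-disk format (file names made up; error handling minimal):

#include <cstdint>
#include <cstdio>

int main()
{
    std::uint32_t value = 0x01234567;

    // Raw dump: writes the host's in-memory byte order, so the file
    // can only be reread reliably on the same kind of platform.
    if (std::FILE* raw = std::fopen("raw.bin", "wb")) {
        std::fwrite(&value, sizeof value, 1, raw);
        std::fclose(raw);
    }

    // Defined format: always big-endian on disk, rereadable anywhere.
    unsigned char buf[4];
    buf[0] = static_cast<unsigned char>((value >> 24) & 0xFF);
    buf[1] = static_cast<unsigned char>((value >> 16) & 0xFF);
    buf[2] = static_cast<unsigned char>((value >> 8) & 0xFF);
    buf[3] = static_cast<unsigned char>(value & 0xFF);
    if (std::FILE* portable = std::fopen("portable.bin", "wb")) {
        std::fwrite(buf, 1, 4, portable);
        std::fclose(portable);
    }
}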
 
Juha Nieminen

James said:
but I can't imagine the need today, with a 64-bit virtual
address space).

Clearly you have never needed to read/write humongous amounts of data
as fast as possible.
 
James Kanze

Clearly you have never needed to read/write humongous amounts
of data as fast as possible.

You'd be surprised :).

In the good old days, when we had to fit the application into
64KB, just dumping the bits was a very efficient way to
implement overlays of data (and of code, for that matter, if the
system and the linker supported it). Today, however, the
address space of virtual memory is larger than the biggest disks
I can get my hands on, so the only reason I would explicitly
write to disk (as opposed to paging) is because I need to be
able to reread it later. Which means that it must have a
defined format, and just dumping the bits doesn't work.
 
peter koch

You'd be surprised :).

In the good old days, when we had to fit the application into
64KB, just dumping the bits was a very efficient way to
implement overlays of data (and of code, for that matter, if the
system and the linker supported it).  Today, however, the
address space of virtual memory is larger than the biggest disks
I can get my hands on, so the only reason I would explicitly
write to disk (as opposed to paging) is because I need to be
able to reread it later.  Which means that it must have a
defined format, and just dumping the bits doesn't work.

I mostly agree, but there are exceptions, and they are not THAT few.
One apparent exception is databases: if you want high performance and
reliability, there is no way around writing data explicitly to disk,
and doing so in a binary format. Of course, portability suffers, but
you don't want to port to esoteric machines anyway.

/Peter
 
Juha Nieminen

Ian said:
Not if you use an endian-neutral file system.

fwrite() writes a byte array to the file. How can the file system
"know" what should be "little endian" or "big endian" in a raw byte
array? It can't.
 
