Chris Torek
> Types have representations. Endianness is part of that
> representation.
This is, well, wrong. I was hoping for an easier way to put that,
but it is just plain wrong ... except in some cases.
Consider a machine that has "endian control" in the CPU, in the
instruction set, and in the MMU. (SPARCV9 implementations do this.)
That is, there is an "invert endian-ness" bit in the CPU, so that
you can "run in either mode", *plus* an "invert endian-ness" bit
in some kind(s) of memory access instructions -- on V9 one uses
the "alternate address space" extensions for this -- so that one
can, for instance, refer to shared memory regions that are being
used by a processor running in the "other" endianness. If you
set the invert bit in both the CPU and the instruction, you get
big-endian ("native", as it were) byte order. However, if the
invert bit is set in the MMU entry for the page, you get the "other"
endian-ness yet again: if all three bits are set (CPU, instruction,
and MMU), you get "little-endian" byte order.
Yet all these accesses can be done as 16-bit "short", 32-bit "int",
or 64-bit "long" / "long long" (depending on compiler mode). (Well,
admittedly, the compiler does not generate lda/sta instructions on
its own, but the inversion bits in the CPU and/or MMU can still be
set.)
Clearly, on V9 SPARCs, endianness is not due to C-level types after
all. So what *is* it due to?
As I have said before, endianness arises from "disassembly and
reassembly". Any atomic entity -- anything that is never taken
apart -- has no need for the concept of "endianness". But when
you take a large item and chop it up into small pieces, then shuffle
the small parts from point A to point B, and finally reassemble
the small parts into a large item, *then* it matters: do you take
the small parts from left to right, or from top to bottom, or
outside in, or inside out, or what?
When you let CPU #1 take a "large" value, like a 32-bit integer,
apart, and then shuffle the pieces -- such as "four 8-bit bytes"
-- over to CPU #2 and ask it to reassemble the 32-bit integer, you
subject yourself to possible different orders. If CPU #1 takes
the value apart from outside in, so that the "first byte" is the
most significant and the second byte is the least significant, but
CPU #2 assembles them "right to left", the value CPU #2 delivers
is not the value CPU #1 took apart. When you let memory subsystem
Q take the value apart, and ask memory subsystem R to reassemble
it, you again subject yourself to possible different orders.
It is tempting to think (or assume) that, on any particular machine,
the C compiler's type is the sole determinant of how everything on
that machine will disassemble and/or reassemble "original values"
to/from "bytes". On some -- perhaps even many -- machines, this
is actually true. But it is not universal, as the SPARC example
illustrates.
The key to understanding "endian-ness" is to think about the slicing
and splicing of values. You must figure out:
- who is doing the slicing or splicing, and from that,
- what order they will use, and
- why.
On a few (admittedly common) machines, there is just the one entity
that does this -- the CPU -- at least as far as C programs are
concerned, and it has just the one order. Not all machines are
that simple. Any machine with "endian-ness control bits" is more
complicated, for instance.