bit order in xx-endian system


zhangsonglovexiaoniuniu

hi all,

I have a problem:

assume we have a 2-byte variable -- *b = 0x0102*,

and I know its byte order in memory is as follows:

(little endian)
low ----->high
0x02 0x01
(big endian)
low ---->high
0x01 0x02

now my question is: what's the bit order? Is it like this:

low ----->high
little:
0000 0010 0000 0001
big:
0100 0000 1000 0000

if so or not, why?

thanks
Evan
 

Walter Roberson

assume we have a variable with 2 byte -- *b = 0x0102*,
now my question is what's the bit order? it is like:
low ----->high
little:
0000 0010 0000 0001
big:
0100 0000 1000 0000
if so or not, why?

There is pretty much no way of telling without reading the processor
documentation, and even that might deal in abstract bit numberings
that are not what is *really* used internally.

C does not impose very many restrictions on bit ordering.
It has to be consistent within any one type; the signed integral
types must use the same ordering as the corresponding
unsigned integral type of the same width. There is no guarantee
that char uses the same bit ordering as int, and no guarantee
that int uses the same bit ordering as long. Nearly all of
the C operations are defined by the effect they have on the *value*,
not on the bit representation; the few such as ~ that deal at the
bit level do not specify the effect on value (hence the semi-annual
arguments about the effect of ~ upon numbers that are not stored in
two's complement).

If a processor chose to *internally* order the bits within an octet
as 02481357 then you'd be fairly hard pressed to tell within C
as long as the processor took care of the associated shuffling
when you made value-oriented transformations such as << (which
is *not* defined in terms of bits.)
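
A minimal sketch of the value-versus-representation point above, assuming an implementation that provides uint16_t (so 8-bit bytes). The function names are made up for illustration. The shift-and-mask results are the same everywhere; only the copy of the object representation depends on the machine.

```c
#include <stdint.h>
#include <string.h>

/* High and low octet of a 16-bit value, computed from the value alone.
   These give the same answer on every conforming implementation. */
static unsigned high_octet(uint16_t v) { return (v >> 8) & 0xFFu; }
static unsigned low_octet(uint16_t v)  { return v & 0xFFu; }

/* Copy the object representation into out[0..1] so it can be inspected.
   Which octet lands in out[0] depends on the machine's byte order. */
static void memory_bytes(uint16_t v, unsigned char out[2])
{
    memcpy(out, &v, sizeof v);
}
```

For b = 0x0102, high_octet(b) is 0x01 and low_octet(b) is 0x02 on any machine, while memory_bytes(b, m) leaves either 02 01 or 01 02 in m. Nothing here reveals how the bits inside each octet are physically arranged.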
 

Golden California Girls

Ah, I see what you are thinking, but that isn't part of your question.

When you are talking about byte endianness (of multi-byte words) the byte is
the smallest unit of transfer. The arrangement of the bits in the byte isn't
important because the byte is transferred at once, not a bit at a time.

If you want to talk about a serial line and the order of the bits of the byte,
that is yet a different endian system and could be either way. However, AFAIK
all serial lines operate the same way, as that was standardized a long, long
time ago. Never saw a switch on an ASR 33 to change the order! Parity, yes.

Going back to that byte, unless you are on a 4-bit CPU, it really is a parallel
access to the RAM or register. All 8 bits come across at once. Your question
is undefined.
 

zhangsonglovexiaoniuniu

Actually, what I care about is transmission through the network.

Take the following example:

what's the difference in bit order between these two?

1. char a;

2. struct {
       unsigned a:1;
       unsigned b:1;
       unsigned c:1;
       unsigned d:1;
       unsigned e:1;
       unsigned f:1;
       unsigned g:1;
       unsigned h:1;
   } a;
 

Walter Roberson

Actually, what i care about is the transmission through the network.

The bit order for transmission through a network depends upon
the transmission media. And it is invisible to application programs,
because no matter what bit order is used, the octets are reassembled
before they reach the program. The transmission order is only of
interest if you want to examine the waveforms.

FYI, traditional ethernet transmitted octet by octet, LSB
(Least Significant Bit) first. But gigabit ethernet transmits
groups of bits in parallel (and it wasn't the first, there being
various parallel fibre transmission mechanisms.)

as the following example,
what's the difference of the bit-order?
1. char a;
2.struct {
unsigned a:1;
unsigned b:1;
unsigned c:1;
unsigned d:1;
unsigned e:1;
unsigned f:1;
unsigned g:1;
unsigned h:1;
}a;

Oh, wrong question, but for a different reason! C does not define
the order of bitfields; 'h' could end up as the first bit or 'h'
could end up as the last bit. Not only that, but C does not define
the size of the "storage unit" that is being filled up, so your
structure 'a' is not necessarily going to be only 8 bits long:
it wouldn't be surprising to find out that it was 32 bits long
(and then you get into questions about -which- 8 of those 32 bits
would be affected by the structure.) And always keep in mind
that 'char' is not necessarily 8 bits long: 8 bits is the -minimum-
but there are some systems with char of 9 bits and some systems
with char of 16 or 32 bits.
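
A small probe along these lines shows what one particular compiler chose. It is only a sketch: struct flags and locate_a are names invented here, and the answer it reports is implementation-defined, so it tells you about the compiler you ran it on and nothing more.

```c
#include <string.h>

struct flags {
    unsigned a:1;
    unsigned b:1;
    unsigned c:1;
    unsigned d:1;
    unsigned e:1;
    unsigned f:1;
    unsigned g:1;
    unsigned h:1;
};

/* Zero a struct flags, set only member a, and report where that bit
   landed: *byte_index receives the offset of the first nonzero byte and
   the return value is that byte's contents. */
static unsigned char locate_a(size_t *byte_index)
{
    struct flags f;
    unsigned char raw[sizeof f];
    size_t i;

    memset(&f, 0, sizeof f);
    f.a = 1;
    memcpy(raw, &f, sizeof f);

    for (i = 0; i < sizeof f; i++) {
        if (raw[i] != 0) {
            *byte_index = i;
            return raw[i];
        }
    }
    *byte_index = sizeof f;   /* no nonzero byte found */
    return 0;
}
```

Comparing sizeof(struct flags) and the byte returned by locate_a across compilers is a quick way to see that neither the size of the storage unit nor the position of 'a' can be relied on.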
 

James Fang

Big- versus little-endian doesn't affect the ordering of bits inside a byte.
 

Chris Torek

... i know [the machine uses a particular] byte order ...
my question is what's the bit order?

"Ordering", also known as "endianness", arises from the process of
taking something apart at one location, then putting it back together
again at a new, different place.

Suppose you are moving from one apartment to another, and you need
to have your bed moved. The obvious, simple way to do this is to
take the bed out of the first apartment, load it on a truck, cart
it to the new place, and unload it. But what if the bed is too
big for the truck? Or, what if (for some reason) it is a whole lot
cheaper to saw the bed into pieces that fit in small boxes, and
ship those?

In this case, you get Fred to take the bed apart into little pieces.
The bed is then shipped to the new place, where Joe puts it back
together.

Unfortunately, Fred takes beds apart from headboard to footboard and
left to right, while Joe assumes they arrive the other way around,
and puts the bed together wrong.

This is an endian-ness conflict: Fred took it apart, then Joe put
it together, and Fred and Joe used different rules.

In a computer, if endian-ness ever arises, the first question to
ask is not "which order", but rather "who is doing the taking-apart,
and who is doing the putting-together?" You have to find out *who*
before you can find out -- or tell -- *how*.

In this case, "who" could be the machine itself, or it could be
the compiler. There is no guarantee that the compiler uses the
same order as the machine. The machine may have a "favored" order,
or may not. If the machine has some disassembly-and-reassembly
order that is "faster" or "more efficient", the compiler will
*probably* use that. If not, the compiler will use ... something.

Your best bet, in general, is *not* to have "the compiler" or "the
machine" do the disassembly and reassembly, but rather to do it
yourself. Then *you* control the order, and you can guarantee that
it goes whichever way it is supposed to. You should only delegate
the task to someone else (such as the machine or compiler) if you
have some way to control the order, or a guarantee of some sort
that the order will not matter. (For an example of the latter,
imagine you are shipping the bed again, but this time Fred will do
both the disassembly and reassembly, and will do it the same way
each time. In this case, you need not care what order he uses --
he just has to be consistent.)
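
Doing the disassembly and reassembly yourself can be sketched as follows for a 16-bit value. The helper names put16be and get16be are made up for this example, and big-endian is just the order picked here; the point is that the order is fixed by the code, not by the machine or the compiler.

```c
#include <stdint.h>

/* Take a 16-bit value apart, most significant octet first. */
static void put16be(unsigned char *buf, uint16_t v)
{
    buf[0] = (unsigned char)((v >> 8) & 0xFFu);
    buf[1] = (unsigned char)(v & 0xFFu);
}

/* Put it back together using the same rule. */
static uint16_t get16be(const unsigned char *buf)
{
    return (uint16_t)(((unsigned)buf[0] << 8) | buf[1]);
}
```

After put16be(buf, 0x0102) the buffer holds 01 02 on every machine, and get16be(buf) gives back 0x0102 whether the receiver is big-endian or little-endian.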
 

zhangsonglovexiaoniuniu

sorry, I gave a wrong example, so here is a modified one:

struct {
    unsigned char a:1;
    unsigned char b:1;
    unsigned char c:1;
    unsigned char d:1;
    unsigned char e:1;
    unsigned char f:1;
    unsigned char g:1;
    unsigned char h:1;
} a;


Suppose I have two machines:
A little endian
B big endian

now I want to transfer *structure a* from A to B.
On A I give *a* a value: a.a = 0x1;
when B receives *a* and checks a.a, what's the result?


In another case with the same two machines, this time we
transfer *char a*.
what's the result?

thanks to you all for replying. really.
 

zhangsonglovexiaoniuniu

my problem comes from the following structure definition:

typedef struct {
    unsigned id :16;        /* query identification number */
#if BYTE_ORDER == BIG_ENDIAN
    /* fields in third byte */
    unsigned qr :1;         /* response flag */
    unsigned opcode :4;     /* purpose of message */
    unsigned aa :1;         /* authoritative answer */
    unsigned tc :1;         /* truncated message */
    unsigned rd :1;         /* recursion desired */
    /* fields in fourth byte */
    unsigned ra :1;         /* recursion available */
    unsigned unused :1;     /* unused bits (MBZ as of 4.9.3a3) */
    unsigned ad :1;         /* authentic data from named */
    unsigned cd :1;         /* checking disabled by resolver */
    unsigned rcode :4;      /* response code */
#endif
#if BYTE_ORDER == LITTLE_ENDIAN || BYTE_ORDER == PDP_ENDIAN
    /* fields in third byte */
    unsigned rd :1;         /* recursion desired */
    unsigned tc :1;         /* truncated message */
    unsigned aa :1;         /* authoritative answer */
    unsigned opcode :4;     /* purpose of message */
    unsigned qr :1;         /* response flag */
    /* fields in fourth byte */
    unsigned rcode :4;      /* response code */
    unsigned cd :1;         /* checking disabled by resolver */
    unsigned ad :1;         /* authentic data from named */
    unsigned unused :1;     /* unused bits (MBZ as of 4.9.3a3) */
    unsigned ra :1;         /* recursion available */
#endif
    /* remaining bytes */
    unsigned qdcount :16;   /* number of question entries */
    unsigned ancount :16;   /* number of answer entries */
    unsigned nscount :16;   /* number of authority entries */
    unsigned arcount :16;   /* number of resource entries */
} DNS_HEADER;
 

Walter Roberson

sorry, i give a wrong example. so here, i modify it:
struct {
unsigned char a:1;
unsigned char b:1;
unsigned char c:1;
unsigned char d:1;
unsigned char e:1;
unsigned char f:1;
unsigned char g:1;
unsigned char h:1;
}a;

That has exactly the same remarks as before: C does not define
the order of bitfields, and C does not define the size of the
"storage unit" that is being filled up. It doesn't matter whether
you use unsigned b:1 or unsigned int b:1 or unsigned char b:1 :
if there is "room in the storage unit" then the next bitfield is
put into the storage unit, and if there isn't enough room in the
storage unit then the next bitfield is put in the next storage unit.
But C doesn't mandate any particular size of storage unit to put
the result in; the closest it comes to that is that one of the examples
in the C89 standard shows a case in which 32 bits are used --
which, if anything, hints that the "storage unit" used for bitfields
is at least 32 bits long (possibly even if only 1 bit is needed;
the standard doesn't say that all of the storage units must be the
same size...)

Given a condition, i have two machine:
A little endian
B big endian
now i want to transfer *structure a* from A to B.
in A I give *a* a value: a.a = 0x1;
when B got *a* and check the value a.a = ? //what's the result?

When you transfer structure a from A to B, you would have to
transfer sizeof(a) characters. As explored above, C doesn't tell
us whether sizeof(a) will be sizeof(char) or sizeof(int) or
something else (and there are machines in which sizeof(char)
is the same as sizeof(int) -- machines in which char are 32 bits long).

The order in which bit-fields are stored within a structure
is independent of whether the machine is big-endian or
little-endian. Some big-endian machines might do it one way
and other big-endian machines might do it another way, and
it could depend upon the compiler version on the same machine.

For example, some compilers might look and say "Ah, a 1 bit field
at the end of the structure; the easiest way to extract that
would be to have the structure be a 32 bit integer and rotate
the integer left so that the bit goes into the carry bit; then
I can use the SCC (Set Condition Carry) operation to set a location
to whatever is in the carry bit." And the same compiler targeted at a
different model of processor in the same architecture family might
have similar reasoning except noting that the second processor
doesn't have a rotate-left instruction, just a rotate-right
instruction, so it might decide to order the fields exactly the
opposite way. (There are important architecture families
that offer a rotation only in one of the two directions.)

So no matter whether machine A is big endian or little endian,
you don't know (without further coding) which bit of the structure
was set or how many bytes are in the structure; and when the
structure arrives across at B, you don't know (without further
coding) which bit of the structure B's a.a is talking about.
The B's a.a could end up drawn from one of the padding bytes
from A's structure a... and the contents of padding bytes are
not controlled by the C standard.
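
The "further coding" could look something like the sketch below, which packs the eight flags into one octet by value, so both machines agree on which bit means what. The struct and function names are invented for this example, and putting 'a' in the most significant bit is an arbitrary choice made here, not something the thread or the C standard prescribes.

```c
/* Eight one-bit flags, each stored as 0 or 1. */
struct flags8 {
    unsigned a, b, c, d, e, f, g, h;
};

/* Wire format chosen for this sketch: a is the most significant bit
   of the octet and h the least significant. */
static unsigned char pack_flags(const struct flags8 *s)
{
    return (unsigned char)(((s->a & 1u) << 7) | ((s->b & 1u) << 6) |
                           ((s->c & 1u) << 5) | ((s->d & 1u) << 4) |
                           ((s->e & 1u) << 3) | ((s->f & 1u) << 2) |
                           ((s->g & 1u) << 1) |  (s->h & 1u));
}

/* Inverse of pack_flags: recover the eight flags from one octet. */
static void unpack_flags(unsigned char octet, struct flags8 *s)
{
    s->a = (octet >> 7) & 1u;
    s->b = (octet >> 6) & 1u;
    s->c = (octet >> 5) & 1u;
    s->d = (octet >> 4) & 1u;
    s->e = (octet >> 3) & 1u;
    s->f = (octet >> 2) & 1u;
    s->g = (octet >> 1) & 1u;
    s->h = octet & 1u;
}
```

Machine A sends the single octet from pack_flags, machine B calls unpack_flags on it, and a.a comes out as 1 regardless of either machine's byte order or bit-field layout.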

To make matters worse, "big-endian" and "little-endian" do not
have a fixed meaning when it comes to values with more than 2 bytes.
If byte 1 is the most significant byte, byte 2 the next most,
byte 3 the third most, and byte 4 the least significant byte,
then a big-endian machine -might- store a four-byte int in
increasing memory in the order 1234 with the address of the int
corresponding to the address of byte 1, but it also could potentially
store in increasing memory in the order 3412 with the address of
the int corresponding to the address of byte 1 -- as long as the
address corresponds to the most significant byte, both machines
would be "big-endian". You get similar problems with "little-endian"
machines: indeed, it is *common* for "little-endian" machines
not to store four-byte ints in the order 4321. Consider, for
example, that a machine that stored bytes in the order 1234
but the address of the int was the address of byte 4 would be
"little-endian": big-endian vs little-endian doesn't even tell
you which direction in memory the bytes run.

Given another condition with the above two machine, and this time we
transfer *char a*
what's the result?

Which char a? In your modified example, you do not have a char a.
You used char in describing your bitfields, but that doesn't mean
that the overall structure a will fit into a char.

If you have a char on machine A and you transfer it to
machine B, then you still have to deal with the possibility
that the number of bits in a char on A is not the same
as the number of bits in a char on B. But if they -do- happen
to be the same length, and you set a char on A to some
particular *value* and transmit it through a network to
machine B, then no matter how the network in between represents
values, when the char is read on B then (assuming same-sized chars)
the char on B will have the same *value* as was set on A.


Now another problem: the same *value* on A and B do not
necessarily represent the same *character* on A and B.
The program on machine A might happen to use ISO-8859-1
("Latin-1") but machine B might happen to use EBCDIC, or the
program on machine B might happen to be running in ISO-8859-7
("Greek") or might happen to be running in Unicode with
UTF-16 encoding. So when you transfer characters between
programs, you have to have the programs agree as to which
character encoding is to be used for the transfer
(which won't necessarily be the same encoding that either
program is using to talk to the users). For example,
the two programs might decide to simplify things by always
using UTF-32-BE to transfer data, even though the
programs might be talking to the user in simple ASCII.
 

vippstar

When you transfer structure a from A to B, you would have to
transfer sizeof(a) characters. As explored above, C doesn't tell
us whether sizeof(a) will be sizeof(char) or sizeof(int) or
something else (and there are machines in which sizeof(char)
is the same as sizeof(int) -- machines in which char are 32 bits long).

char doesn't have to be 32 bits long to have the same sizeof as int:
a 16-bit char would do, since int only needs to be 16 bits wide.
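
A tiny sketch of why that is so, with int_storage_bits as a made-up helper name: sizeof counts in units of char, and CHAR_BIT from <limits.h> says how many bits each char has.

```c
#include <limits.h>
#include <stddef.h>

/* Storage width of an int in bits. sizeof counts chars, not octets,
   so the result depends on CHAR_BIT as well as on sizeof(int). */
static size_t int_storage_bits(void)
{
    return sizeof(int) * CHAR_BIT;
}
```

sizeof(char) is 1 by definition, so sizeof(int) == sizeof(char) becomes possible as soon as CHAR_BIT is 16 or more.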
 

Walter Roberson

C does not define BYTE_ORDER, and I am not -aware- of any C compilers
that define it. BYTE_ORDER and BIG_ENDIAN and LITTLE_ENDIAN are very
likely tested and defined by a program outside of the program you
are showing -- perhaps by a program such as "GNU automake". You
would have to look at that outside test to understand what the program
considers BIG_ENDIAN or LITTLE_ENDIAN. And from the point of view
of C, you need to consider the possibility that you happen to be
working on a system for which *neither* of that outside program's
tests for BIG_ENDIAN or LITTLE_ENDIAN holds true.

But my advice in a case such as this is not to start by looking
at the program that is constructing the BYTE_ORDER test. My advice
in a case like this is to refer right back to the standard that
defines DNS headers. The standards that define DNS headers are always
very particular about the exact order of bits transmitted. Code such
as you have quoted is attempting to match the specification in the
standard. However (probably for ease of programming), the code
restricts itself to a small number of the possible meanings of
"big endian" and "little endian", taking the most -common- cases,
and Just Not Working Right for the less common cases.
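
Working straight from the header layout in the DNS standards (RFC 1035, with the AD and CD bits added by the later DNSSEC RFCs), the flag fields can be pulled out of the received octets with shifts and masks, with no BYTE_ORDER test at all. This is only a sketch: struct dns_flags, dns_id and dns_parse_flags are names invented here, not part of any resolver library.

```c
/* Flag fields of a DNS header, one member per field. */
struct dns_flags {
    unsigned qr, opcode, aa, tc, rd;   /* from the third octet  */
    unsigned ra, z, ad, cd, rcode;     /* from the fourth octet */
};

/* Query ID: first two octets, most significant octet first. */
static unsigned dns_id(const unsigned char *hdr)
{
    return ((unsigned)hdr[0] << 8) | hdr[1];
}

/* hdr points at the 12-octet DNS header exactly as received. */
static void dns_parse_flags(const unsigned char *hdr, struct dns_flags *f)
{
    unsigned hi = hdr[2];
    unsigned lo = hdr[3];

    f->qr     = (hi >> 7) & 1u;
    f->opcode = (hi >> 3) & 0xFu;
    f->aa     = (hi >> 2) & 1u;
    f->tc     = (hi >> 1) & 1u;
    f->rd     =  hi       & 1u;

    f->ra     = (lo >> 7) & 1u;
    f->z      = (lo >> 6) & 1u;
    f->ad     = (lo >> 5) & 1u;
    f->cd     = (lo >> 4) & 1u;
    f->rcode  =  lo       & 0xFu;
}
```

For a header beginning 12 34 85 83, this gives id 0x1234, qr 1, aa 1, rd 1, ra 1 and rcode 3 on any machine, because every step is defined in terms of values.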
 

zhangsonglovexiaoniuniu

Sir,

Thanks, really!
Well, I have so much to learn. Thanks a lot again.
Only one question: when and how do we care about *bit* order?

BRS,
Evan
 

Walter Roberson

only one question, when and how we care about *bit* order?

A few possibilities:

A) When you need to meet an external interface that specifies bit order;

B) When you need to design an interface (e.g., a client/server network)
in which the bit order is important. However, if you have control
over the interface design, then it may be better to specify the
interface in terms of values rather than in terms of bits;

C) In some programs, especially compression programs, you are
effectively working with a bit-stream rather than a byte stream;
though usually it is still better to specify the interface in terms
of values rather than in terms of bits (for example, if you created
the compressed file on a machine with 8-bit bytes and you transferred
it to a machine with 9-bit bytes, you want it to be well defined
as to whether the extra 9th bit becomes the next bit in the stream
or if the 9th bit is effectively skipped/ignored.)
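
For case C, a bit-stream writer that is specified in terms of values might be sketched like this. The names are invented for the example, and two choices are assumptions made here: bits fill each octet starting from the most significant end, and the final partial octet is padded with zero bits.

```c
#include <stddef.h>

struct bitwriter {
    unsigned char *buf;   /* output buffer, one octet per element */
    size_t cap;           /* capacity of buf in octets */
    size_t len;           /* octets completed so far */
    unsigned acc;         /* bits collected for the current octet */
    unsigned nbits;       /* how many bits acc holds (0..7) */
};

static void bw_init(struct bitwriter *w, unsigned char *buf, size_t cap)
{
    w->buf = buf;
    w->cap = cap;
    w->len = 0;
    w->acc = 0;
    w->nbits = 0;
}

/* Append one bit. Returns 0 on success, -1 if the buffer is full. */
static int bw_put(struct bitwriter *w, unsigned bit)
{
    if (w->len >= w->cap)
        return -1;
    w->acc = (w->acc << 1) | (bit & 1u);
    if (++w->nbits == 8) {
        w->buf[w->len++] = (unsigned char)w->acc;
        w->acc = 0;
        w->nbits = 0;
    }
    return 0;
}

/* Pad the final partial octet with zero bits and store it. */
static int bw_flush(struct bitwriter *w)
{
    while (w->nbits != 0) {
        if (bw_put(w, 0) != 0)
            return -1;
    }
    return 0;
}
```

Writing the bits 1, 0, 1 and then flushing produces the single octet value 0xA0. Because each element holds a value in 0..255, a machine with 9-bit bytes would store the same values and simply leave its extra bit unused.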
 

Kaz Kylheku

now my question is what's the bit order?

Bytes are addressable. So that raises the question of which of two
adjacent bytes has the higher address, and hence byte order is born.

Bits are not individually addressable in C. So the question of bit
order is meaningless.

Bit order matters a lot in serial communication. Serial ports and
ethernets transmit the least significant bit of an octet first.

Some machines have bit operations in which bits can be addressed by
number, e.g. from 0 to 31. For instance, the Motorola 68000 has a
BTST instruction for testing the bit of a destination operand:

BTST Dn, <ea>
BTST #<immediate>, <ea>

The bit number is either in a data register D0-D7 or an immediate
value. The value zero refers to the least significant bit. In this
sense, bits actually are addressable by the instruction set
architecture. Consequently, the machine can be said to be little
endian with respect to bits (little bitian?), since the least
significant bit is assigned position zero. (It's big endian at the
byte level: a multi-byte operand is stored with the most significant
byte at the lowest address).

Of course, this bit endianness isn't ``real'' in the sense that it's
not visible to other devices attached to the same bus as the 68000.
It's an internal convention within the bit test instructions only.
True endianness is visible outside of the processor.
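
In C the same numbering convention (bit 0 is the least significant bit) falls out of the shift operator, which works on values. A minimal sketch, with bit_is_set as a made-up name and n assumed to be in the range 0 to 31:

```c
#include <stdint.h>

/* Test bit n of value, counting from 0 at the least significant bit.
   n must be less than 32. */
static int bit_is_set(uint32_t value, unsigned n)
{
    return (int)((value >> n) & 1u);
}
```

For 0x0102, bits 1 and 8 are set and bit 0 is clear, on big-endian and little-endian machines alike.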
 

Walter Roberson

Kaz Kylheku said:
Bit order matters a lot in serial communication. Serial ports and
ethernets transmit the least significant bit of an octet first.

Is that true for (e.g.,) SNA, that the least significant bit
is transmitted first? And how does the above transmission
scheme deal with the cases where the serial port's character
size is not exactly an octet wide, such as if the termios
c_cflag CSIZE field is set to CS5, CS6, or CS7? If parity
bits are in effect for the serial port, then at which point are they
transmitted?
 

Kaz Kylheku

Is that true for (e.g.,) SNA, that the least significant bit
is transmitted first?

SNA is an entire communication protocol suite, analogous to TCP/IP.
There is a data link protocol in the suite, SDLC, but that's layer 2.
It's defined in terms of octets, not bits on a wire. I can't easily
find any info on the IBM communication hardware for which SDLC was
originally targeted.

And how does the above transmission
scheme deal with the cases where the serial port's character
size is not exactly an octet wide, such as if the termios
c_cflag CSIZE field is set to CS5, CS6, or CS7? If parity
bits are in effect for the serial port, then at which point are they
transmitted?

First a framing bit is sent, the start bit. Then the data bits, least
significant first. Then the parity bit, if any, followed by the stop
bit(s).
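
The frame order just described can be sketched as a function that lists the line levels in the order they are sent. Everything named here (uart_frame, enum parity) is hypothetical, and the sketch assumes the usual logic-level convention of a 0 start bit and 1 stop bits.

```c
#include <stddef.h>

enum parity { PARITY_NONE, PARITY_EVEN, PARITY_ODD };

/* Write the bits of one asynchronous serial frame into out, in
   transmission order, and return how many were written. out needs room
   for 1 + databits + 1 + stopbits entries. */
static size_t uart_frame(unsigned value, unsigned databits,
                         enum parity par, unsigned stopbits,
                         unsigned char *out)
{
    size_t n = 0;
    unsigned ones = 0;
    unsigned i;

    out[n++] = 0;                              /* start bit */
    for (i = 0; i < databits; i++) {           /* data, LSB first */
        unsigned bit = (value >> i) & 1u;
        ones += bit;
        out[n++] = (unsigned char)bit;
    }
    if (par == PARITY_EVEN)                    /* total count of 1s even */
        out[n++] = (unsigned char)(ones & 1u);
    else if (par == PARITY_ODD)                /* total count of 1s odd */
        out[n++] = (unsigned char)((ones & 1u) ^ 1u);
    for (i = 0; i < stopbits; i++)             /* stop bit(s) */
        out[n++] = 1;
    return n;
}
```

For the value 0x41 with 8 data bits, even parity and one stop bit, the sequence is 0, then 1 0 0 0 0 0 1 0, then parity 0, then stop 1. With databits set to 5, 6 or 7 only that many low-order bits of the value are sent.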
 

Walter Roberson

First a framing bit is sent, the start bit. Then the data bits, least
significant first. Then the parity bit, if any, followed by the stop
bit(s).

So you're saying that if I have CS5 set, the least significant bit
of an octet is sent first. Which octet is that? CS5 might be BAUDOT,
and my system char size is not necessarily octets -- could be
two BAUDOT symbols in a 10 bit field for example.
 
