(e-mail address removed) <[email protected]> wrote:
1. char a;
2. struct {
       unsigned a:1;
       unsigned b:1;
       [...]
   } a;
Oh, wrong question, but for a different reason! C does not define
the order of bitfields; 'h' could end up as the first bit or 'h'
could end up as the last bit. Not only that, but C does not define
the size of the "storage unit" that is being filled up, so your
structure 'a' is not necessarily going to be only 8 bits long:
Sorry, I gave a wrong example, so here I modify it:
struct {
    unsigned char a:1;
    unsigned char b:1;
    unsigned char c:1;
    unsigned char d:1;
    unsigned char e:1;
    unsigned char f:1;
    unsigned char g:1;
    unsigned char h:1;
} a;
That has exactly the same remarks as before: C does not define
the order of bitfields, and C does not define the size of the
"storage unit" that is being filled up. It doesn't matter whether
you use unsigned b:1 or unsigned int b:1 or unsigned char b:1 :
if there is "room in the storage unit" then the next bitfield is
put into the storage unit, and if there isn't enough room in the
storage unit then the next bitfield is put in the next storage unit.
But C doesn't mandate any particular size of storage unit to put
the result in; the closest it comes to that is that one of the examples
in the C89 standard shows a case in which 32 bits are used --
which, if anything, hints that the "storage unit" used for bitfields
is at least 32 bits long (possibly even if only 1 bit is needed;
the standard doesn't say that all of the storage units must be the
same size...)
Given a condition, I have two machines:
A: little endian
B: big endian
Now I want to transfer *structure a* from A to B.
On A I give *a* a value: a.a = 0x1;
when B gets *a* and checks the value, a.a = ? // what's the result?
When you transfer structure a from A to B, you would have to
transfer sizeof(a) characters. As explored above, C doesn't tell
us whether sizeof(a) will be sizeof(char) or sizeof(int) or
something else (and there are machines in which sizeof(char)
is the same as sizeof(int) -- machines on which chars are 32 bits long).
The order in which bit-fields are stored within a structure
is independent of whether the machine is big-endian or
little-endian. Some big-endian machines might do it one way
and other big-endian machines might do it another way, and
it could depend upon the compiler version on the same machine.
For example, some compilers might look and say "Ah, a 1 bit field
at the end of the structure; the easiest way to extract that
would be to have the structure be a 32 bit integer and rotate
the integer left so that the bit goes into the carry bit; then
I can use the SCC (Set Condition Carry) operation to set a location
to whatever is in the carry bit." And the same compiler targeted for
a different model of processor in the same architecture family might
have similar reasoning except noting that the second processor
doesn't have a rotate-left instruction, just a rotate-right
instruction, so it might decide to order the fields exactly the
opposite way. (There are important architecture families
that offer a rotation only in one of the two directions.)
So no matter whether machine A is big endian or little endian,
you don't know (without further coding) which bit of the structure
was set or how many bytes are in the structure; and when the
structure arrives across at B, you don't know (without further
coding) which bit of the structure B's a.a is talking about.
The B's a.a could end up drawn from one of the padding bytes
from A's structure a... and the contents of padding bytes are
not controlled by the C standard.
To make matters worse, "big-endian" and "little-endian" do not
have a fixed meaning when it comes to values with more than 2 bytes.
If byte 1 is the most significant byte, byte 2 the next most,
byte 3 the third most, and byte 4 the least significant byte,
then a big-endian machine -might- store a four-byte int in
increasing memory in the order 1234 with the address of the int
corresponding to the address of byte 1, but it also could potentially
store in increasing memory in the order 3412 with the address of
the int corresponding to the address of byte 1 -- as long as the
address corresponds to the most significant byte, both machines
would be "big-endian". You get similar problems with "little-endian"
machines: indeed, it is *common* for "little-endian" machines
not to store four-byte ints in the order 4321. Consider, for
example, that a machine that stored bytes in the order 1234
but the address of the int was the address of byte 4 would be
"little-endian": big-endian vs little-endian doesn't even tell
you which direction in memory the bytes run.
Given another condition with the above two machines, and this time we
transfer *char a*:
what's the result?
Which char a? In your modified example, you do not have a char a.
You used char in describing your bitfields, but that doesn't mean
that the overall structure a will fit into a char.
If you have a char on machine A and you transfer it to
machine B, then you still have to deal with the possibility
that the number of bits in a char on A is not the same
as the number of bits in a char on B. But if they -do- happen
to be the same length, and you set a char on A to some
particular *value* and transmit it through a network to
machine B, then no matter how the network in between represents
values, when the char is read on B then (assuming same-sized chars)
the char on B will have the same *value* as was set on A.
Now another problem: the same *value* on A and B do not
necessarily represent the same *character* on A and B.
The program on machine A might happen to use ISO-8859-1
("Latin-1") but machine B might happen to use EBCDIC, or the
program on machine B might happen to be running in ISO-8859-7
("Greek") or might happen to be running in Unicode with
UTF-16 encoding. So when you transfer characters between
programs, you have to have the programs agree as to which
character encoding is to be used for the transfer
(which won't necessarily be the same encoding that either
program is using to talk to the users... for example,
the two programs might decide to simplify things by always
using UTF-32-BE to transfer data, even though the
programs might be talking to the user in simple ASCII.)