Hello, Arthur!
You wrote on Tue, 12 Feb 2008 21:59:55 -0800 (PST):
[skip]
Thanks for your explanations.
A> and refers it using
A> u.i = 0x12;
A> u.c[0] = 0x12;
A> the compiler will simply convert them into instructions like this:
A> movl $0x12, _u
A> movb $0x12, _u
A> The compiler uses same symbols for u.i and u.c[0].
A> The reason why they look different in your debugger is that
A> Intel CPUs use little-endian.
A> un.s is placed in memory like this:
A> 0x02 0x01
A> when referred as u.s, it means a short int 0x0102, i.e. s = 0x102
A> when referred as u.c, it means an array of char, {0x02, 0x01}
But both u.i and u.c are placed in memory on the same little-endian machine,
so why do they look different? I can't grasp how it is done.
With best regards, Roman Mashak.
Hello! Your compiler records that un.s is a short int
and un.c[] is an array of char. When you compile your program with
-g, it passes that type information to your debugger, so the debugger
knows it too.
To understand why they look different, keep in mind that
both un.c and un.s are symbols that are simply addresses in
memory. (And the two addresses are the same.)
Suppose the union 'un' has been placed at address 0x80490d4. When
you execute 'un.s = 0x102;', the processor sets the byte at
0x80490d4 to 0x02 and the byte at 0x80490d5 to 0x01, because the
Intel CPU is little-endian and stores the least significant byte first.
When you refer to un.s, since sizeof(short) is 2 (on most 32-bit
systems), the CPU fetches the two bytes at 0x80490d4 (0x02) and
0x80490d5 (0x01) and combines them in little-endian order, with the byte
at the higher address as the most significant one. That gives 0x0102,
i.e. 0x102, just as your debugger reports.
But when you refer to un.c, since sizeof(char) is 1, the CPU fetches
one byte at 0x80490d4 (0x02) and presents it to the debugger, and then
fetches the next one (0x01). It doesn't combine them, so
they appear exactly as they are laid out in memory, 0x02, 0x01, just as
your debugger reports.
Sorry for the lengthy explanation.