In said:
> How is a character stored in a word aligned machine? Assuming on 64bit
> machine, 1 byte is reserved for a char, is it the case that only 1
> byte is used to store the character and the rest 7 bytes are wasted,
> or my assumption is wrong?
It is wrong, due to the special properties of the type unsigned char: it
can be used to examine the representation of any other type. Therefore,
this type cannot, by definition, have "wasted" bits (they are called
padding bits in the C99 standard).
So, the possible sizes of char on a 64-bit machine are 8, 16, 32 and 64
bits: a char has at least 8 bits, and a whole number of chars must fill
the word exactly. If the size is less than 64 bits, sizeof(word) > 1 and
multiple chars can be stored in a word (the word can be aliased with an
array of char).
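As a rough illustration of the aliasing described above, here is a minimal
sketch in C. The names word_t, word_bits, set_char and get_char are made up
for this example, and unsigned long long is only assumed to be the
machine's 64-bit word.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the machine word (assumed to be 64 bits). */
typedef unsigned long long word_t;

/* Width of the word in bits: sizeof counts chars, CHAR_BIT is bits per char. */
static size_t word_bits(void)
{
    return sizeof(word_t) * CHAR_BIT;
}

/* Return a copy of w with the char at position idx replaced by c.
   The word is copied into an array of unsigned char, modified, and
   copied back, so every bit of the word belongs to some char. */
static word_t set_char(word_t w, size_t idx, unsigned char c)
{
    unsigned char bytes[sizeof w];
    memcpy(bytes, &w, sizeof w);
    bytes[idx] = c;
    memcpy(&w, bytes, sizeof w);
    return w;
}

/* Read the char at position idx by viewing the word through an
   unsigned char pointer, which C allows for any object. */
static unsigned char get_char(word_t w, size_t idx)
{
    const unsigned char *p = (const unsigned char *)&w;
    return p[idx];
}
```

With 8-bit chars, sizeof(word_t) is 8, so idx can range from 0 to 7 and
all eight chars share the one word with nothing wasted.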
There is only one known architecture with 64-bit word addressing (no
octet-based addressing) where C was implemented: the Cray vector
processor used in the old Cray supercomputers. char is an 8-bit type
on that particular platform.
> If my assumption is right, what are the
> performance issues in retrieving value of a character variable over
> other data types like integer, double or float.
Because the machine uses word addressing, char pointers need to store more
data than all other pointers (the address or position of the byte inside
the word). There are two ways of storing this additional information:
in the low bits, which optimises char pointer arithmetic, but requires
additional operations when the pointer is dereferenced, or in the upper
bits, which simplifies pointer dereferencing (the higher bits are
ignored, as the address space is only 48-bit) but complicates char
pointer arithmetic. I believe both ways have been used in different
implementations. Either way, after retrieving the word containing the
char, the char itself has to be extracted from the word, and this takes
some additional shifting and masking, so char access is slower. This is
not much of a problem in practice, as these machines were intended as
number crunchers, not for intensive character manipulation.
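The two pointer layouts can be simulated in portable C to make the
trade-off concrete. This is only a sketch under stated assumptions, not
what any Cray compiler actually did: the offset is assumed to take 3 bits
(8-bit chars in a 64-bit word), the "upper bits" layout is assumed to use
bits 61-63, offset 0 is assumed to be the least significant byte of the
word, and all names (lo_ptr, hi_ptr and their helpers) are hypothetical.

```c
#include <stdint.h>

#define ADDR_BITS 48                                  /* assumed address width */
#define ADDR_MASK ((UINT64_C(1) << ADDR_BITS) - 1)
#define OFF_SHIFT 61                                  /* assumed offset position */

/* Extract char number off (0..7) from a word by shifting and masking.
   Assumption: offset 0 is the least significant byte. */
static unsigned char extract(uint64_t w, unsigned off)
{
    return (unsigned char)((w >> (8u * off)) & 0xFFu);
}

/* Layout 1: offset in the LOW bits. The pointer is a linear char index. */
typedef uint64_t lo_ptr;

static lo_ptr lo_make(uint64_t word_index, unsigned off)
{
    return (word_index << 3) | off;
}

/* Arithmetic is a plain integer add. */
static lo_ptr lo_add(lo_ptr p, int64_t n)
{
    return p + (uint64_t)n;
}

/* Dereferencing needs an extra shift to recover the word address. */
static unsigned char lo_load(const uint64_t *mem, lo_ptr p)
{
    return extract(mem[p >> 3], (unsigned)(p & 7u));
}

/* Layout 2: offset in the UPPER bits, word address in the low 48 bits. */
typedef uint64_t hi_ptr;

static hi_ptr hi_make(uint64_t word_index, unsigned off)
{
    return ((uint64_t)off << OFF_SHIFT) | (word_index & ADDR_MASK);
}

/* Arithmetic must recombine the two fields, add, and split them again. */
static hi_ptr hi_add(hi_ptr p, int64_t n)
{
    uint64_t linear = ((p & ADDR_MASK) << 3) | (p >> OFF_SHIFT);
    linear += (uint64_t)n;
    return hi_make(linear >> 3, (unsigned)(linear & 7u));
}

/* The explicit mask models hardware that ignores the upper address bits,
   so on such a machine the word fetch itself would cost nothing extra. */
static unsigned char hi_load(const uint64_t *mem, hi_ptr p)
{
    return extract(mem[p & ADDR_MASK], (unsigned)(p >> OFF_SHIFT));
}
```

In both layouts the final extract() step is the same; the layouts differ
only in whether the extra work lands in the pointer arithmetic or in
forming the word address.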
The other, more common, 64-bit architectures use octet-based addressing
and things are no different from the more common 32-bit architectures.
Dan