Internal representation of char == unsigned small int?

Sathyaish

When you say char in C, it internally means "an unsigned small integer
with 1-byte memory", right? More importantly, the internal
representation of char does not mean "int" as in
"machine-dependent/register-size-dependent integer, which is normally
four bytes on 32-bit processors", right?
 
Lew Pitcher

Sathyaish said:
When you say char in C, it internally means "an unsigned small integer
with 1-byte memory", right? More importantly, the internal
representation of char does not mean "int" as in
"machine-dependent/register-size-dependent integer, which is normally
four bytes on 32-bit processors", right?

Nope.

When you say char in C, you mean an object large enough to store any member of
the basic execution character set. You mean an object that is guaranteed to be
able to represent a range of unsigned values between 0 and 65535, and/or a
range of signed values between -128 and 127.

/How/ the compiler implements this object is up to the compiler. So long as it
meets the minimum requirements of a char, then any storage size is legal.

FWIW, by definition, a char takes 1 byte. However, that 1 byte /can/ be 8 or 9
or 32 or 64 or 128 or even 5000 bits wide, as required by the compiler.
/And/ an int object can have the same size as a char object (or, to put it
another way, a char object can have the same size as an int object).


--
Lew Pitcher

Master Codewright & JOAT-in-training | GPG public key available on request
Registered Linux User #112576 (http://counter.li.org/)
Slackware - Because I know what I'm doing.
 
Lew Pitcher

Lew said:
Nope.

When you say char in C, you mean an object large enough to store any member of
the basic execution character set. You mean an object that is guaranteed to be
able to represent a range of unsigned values between 0 and 65535, and/or a
range of signed values between -128 and 127.

Gakkk. When will I learn to proofread?? I meant unsigned values between 0 and 255.

/How/ the compiler implements this object is up to the compiler. So long as it
meets the minimum requirements of a char, then any storage size is legal.

FWIW, by definition, a char takes 1 byte. However, that 1 byte /can/ be 8 or 9
or 32 or 64 or 128 or even 5000 bits wide, as required by the compiler.
/And/ an int object can have the same size as a char object (or, to put it
another way, a char object can have the same size as an int object).


--
Lew Pitcher

Master Codewright & JOAT-in-training | GPG public key available on request
Registered Linux User #112576 (http://counter.li.org/)
Slackware - Because I know what I'm doing.

 
Keith Thompson

Sathyaish said:
When you say char in C, it internally means "an unsigned small integer
with 1-byte memory", right? More importantly, the internal
representation of char does not mean "int" as in
"machine-dependent/register-size-dependent integer, which is normally
four bytes on 32-bit processors", right?

C has three distinct one-byte types: char, signed char, and unsigned
char. "Plain" char has the same representation as either signed char
or unsigned char. The minimum ranges are -127..+127 for signed char,
0..255 for unsigned char.
 
Malcolm

Sathyaish said:
When you say char in C, it internally means "an unsigned small integer
with 1-byte memory", right? More importantly, the internal
representation of char does not mean "int" as in
"machine-dependent/register-size-dependent integer, which is normally
four bytes on 32-bit processors", right?
In English, we use glyphs to represent characters. So capital A is an
upwards-pointing triangle with a raised lower edge, capital B is a straight
line with two semicircles, and so on.

This is a good system for pencil and paper, but trying to store such shapes
directly on a computer would be very wasteful. So instead we use a code - 10
means A, 11 means B, 12 means C, and so on.

Usually this code will be ASCII, and usually characters will occupy 8 bits.
However, you normally don't have to worry about this. C abstracts the
representation and handles it for you. If you want an A, you just type
char ch = 'A';

Unfortunately, the designers of C made a mistake. On their machine, bytes,
the smallest addressable unit of memory, happened to be 8 bits, which was
also perfect for the ASCII code. So they decided to use the same word for a
character and a byte: "char". This causes huge problems when we try to go to
non-Latin languages, but we have to live with it.

The result is that you will often see "unsigned char" or more occasionally
"signed char" used as a small integer. You are not guaranteed 8 bits, though
this is by far the most common value. The macro CHAR_BIT gives you the
number of bits in a char.
 
Keith Thompson

Malcolm said:
The result is that you will often see "unsigned char" or more occasionally
"signed char" used as a small integer. You are not guaranteed 8 bits, though
this is by far the most common value. The macro CHAR_BIT gives you the
number of bits in a char.

But you are guaranteed *at least* 8 bits.
 
