On what does size of data types depend?


Sunil

Hi all,

I am using the gcc compiler on Linux. I compiled a small program:
int main()
{
printf("char : %d\n",sizeof(char));
printf("unsigned char : %d\n",sizeof(unsigned char));
printf("short : %d\n",sizeof(short));
printf("unsigned short : %d\n",sizeof(unsigned short));
printf("int : %d\n",sizeof(int));
printf("unsigned int : %d\n",sizeof(unsigned int));
printf("long : %d\n",sizeof(long));
printf("unsigned long : %d\n",sizeof(unsigned long));
printf("long long : %d\n",sizeof(long long));
printf("unsigned long long : %d\n",sizeof(unsigned long
long));
}

Result was

char : 1
unsigned char : 1
short : 2
unsigned short : 2
int : 4
unsigned int : 4
long : 4
unsigned long : 4
long long : 8
unsigned long long : 8


What I want to know is: what will be the effect if I use int in
place of long in applications running on Linux, and on what factors
does the size of data types depend?

Thanks
Sunil.
 

EventHelix.com

In general, the code you write should not depend upon the size of a
type on a platform.

The size of a type depends upon the processor architecture and the
compiler vendor. For example, a 16-bit MCU might consider an int to be
2 bytes and a long to be 4 bytes.

On a 64-bit processor, a compiler vendor may decide to use 8 bytes to
represent an int.
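
When code really must pin down a width, C99's <stdint.h> names the
widths explicitly. A minimal sketch, assuming a C99 compiler (the
PRIu32 macro comes from <inttypes.h>):

#include <inttypes.h>
#include <stdio.h>

int main(void)
{
    uint32_t crc = 0xDEADBEEFu;  /* exactly 32 bits wherever this compiles */
    int_fast16_t i = 0;          /* at least 16 bits, whatever is fastest */

    printf("crc = %" PRIu32 ", i = %d\n", crc, (int)i);
    return 0;
}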
 

Eric Sosman

Sunil said:
Hi all,

I am using the gcc compiler on Linux. I compiled a small program:
int main()
{
printf("char : %d\n",sizeof(char));
printf("unsigned char : %d\n",sizeof(unsigned char));
printf("short : %d\n",sizeof(short));
printf("unsigned short : %d\n",sizeof(unsigned short));
printf("int : %d\n",sizeof(int));
printf("unsigned int : %d\n",sizeof(unsigned int));
printf("long : %d\n",sizeof(long));
printf("unsigned long : %d\n",sizeof(unsigned long));
printf("long long : %d\n",sizeof(long long));
printf("unsigned long long : %d\n",sizeof(unsigned long
long));
}

On some systems I have used, the output would claim
that all the types are of size zero. Hint: What type of
value does "%d" expect, and what type of value does sizeof
produce?

You've also neglected to #include <stdio.h> and to
return a value from main() -- the latter is all right in
C99 but not in C90, and is doubtful practice in any case.
You're using gcc, which can produce a lot of helpful
diagnostics if you ask for them: ask for them.

Result was

char : 1
unsigned char : 1
short : 2
unsigned short : 2
int : 4
unsigned int : 4
long : 4
unsigned long : 4
long long : 8
unsigned long long : 8

What I want to know is: what will be the effect if I use int in
place of long in applications running on Linux, and on what factors
does the size of data types depend?

On your system and using your compiler with your choice
of compile-time options, `long' and `int' appear to have the
same size. This suggests that they probably also have the
same range, but you'd need to display the actual values from
<limits.h> to be completely certain.

But even if they have the same size, the same range, and
the same representation, they remain distinct types. You can
prove this to yourself with a small test program:

int func(void);
long (*fptr)(void) = func; /* incompatible pointer types: a diagnostic is required */

You may wonder why "A difference that makes no difference"
is, in this case, truly a difference. The answer is that the
fact that `int' and `long' look the same on your machine under
current conditions does not mean that they look the same on all
machines, or even on your machine with different compiler flags.
If you intend never to move your program to another machine,
never to upgrade your current machine, never to move to 64-bit
Linux, and never to change compilers, then you can ignore the
distinction between `int' and `long' (except in situations like
that I've shown above, where the compiler is required to be
more scrupulous than you).

On the other hand, if you think that your current system
and all its software might not be the very last one you ever
want to use, you should not pretend that `int' and `long' are
the same thing. Three useful references:

The Tenth Commandment
http://www.lysator.liu.se/c/ten-commandments.html

FAQ Question 1.1
http://www.eskimo.com/~scs/C-faq/top.html

IAQ Question 1.1
http://www.plethora.net/~seebs/faqs/c-iaq.html
(be sure to read the footnotes)
 

tanmoy87544

Sunil said:
What I want to know is: what will be the effect if I use int in
place of long in applications running on Linux, and on what factors
does the size of data types depend?

The size depends on the implementation. Incidentally, the size is
measured in units of the space occupied by a char, which is not
guaranteed to take 8 bits of space, though it often does.

In most cases, the exact amount of space something takes should not
concern the programmer:

- use char when talking about characters;
- unsigned char when talking about raw memory;
- short or unsigned short when space savings are important;
- signed char or unsigned char for even more savings;
- long and unsigned long when dealing with large integers;
- long long and unsigned long long when the integers may be really large;
- int and unsigned int when you want whatever is the most `natural'
  integer representation in the implementation (i.e. int or unsigned
  int ought to be what you use unless there is reason to deviate).

Use the signed types if you think of the values as integers; use the
unsigned types if you treat them as bit patterns, need the extra range
at the large positive end, or the logic of the program is such that
blindly converting negative numbers to large positive integers is the
`right thing' to do!

The C standard does guarantee some minimum range of values for each of
these types: look at those, and use them to decide when you want space
savings and when your integers may become large in magnitude when
applying the advice above. But don't apply the rules blindly ...
experience teaches you what is likely to be the best data type. C99
also gives you more fine-grained control over integral types: look at
them. In rare cases, bitfields might also be useful.
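
For instance, <stdint.h> offers exact-width, minimum-width, and
fastest-width families. A minimal C99 sketch:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t       exact = 0;  /* exactly 32 bits, where such a type exists */
    uint_least16_t small = 0;  /* smallest type with at least 16 bits */
    uint_fast16_t  quick = 0;  /* fastest type with at least 16 bits */

    printf("%lu %lu %lu\n",
           (unsigned long)sizeof exact,
           (unsigned long)sizeof small,
           (unsigned long)sizeof quick);
    return 0;
}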

Do not gratuitously put in `linux' dependencies: not all Linux
platforms will have exactly the same behaviour anyway. Even if they
are the same size, int and long are not interchangeable; a program
will often become `incorrect' if you change ints to longs without
changing anything else. Even though this makes the behaviour
undefined, in practice, on current implementations, it is not likely
to create a difference except in warnings from the compiler. But why
take the chance?

Note that sizeof gives you the space occupied in memory, and it is
possible for an implementation not to use all of that space
effectively, so use the macros like CHAR_MAX if you need the exact
ranges. An implementation may also not use the same representation in
memory for different types of the same size (for example, there is no
bar to using little-endian ints and big-endian longs, as long as the
compiler takes care to do the bit manipulations properly; no
implementation I know of does that yet). The implementation can
further require different alignments for the two types (thus, it might
decide on a 2-byte alignment for ints and a 4-byte alignment for
longs, planning on using 16-bit bus operations for ints and 32-bit
operations for longs; again, I know of no implementation that does
that).
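
One informal way to observe an alignment choice is to let the compiler
pad a structure and inspect the offset; a sketch (the offset of `l'
usually, though not necessarily, equals long's alignment requirement):

#include <stddef.h>
#include <stdio.h>

struct probe {
    char c;  /* one byte, then whatever padding long's alignment demands */
    long l;
};

int main(void)
{
    printf("offsetof(struct probe, l) = %lu\n",
           (unsigned long)offsetof(struct probe, l));
    return 0;
}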

In short, you are asking questions which you `should' not be. C is
trying to provide a level of abstraction: a division of labour between
the programmer and the implementation. The programmer describes what
he/she wants, plus some low-level things like whether space or range
is more important; the implementation takes care of the hardware and
makes the program run according to the specification. The standard
provides the language for unambiguous communication. Your questions
cross that border, violating one of the raisons d'être of high-level
languages.

Sure, there are occasions when you need to know your precise hardware
and how your implementation maps your code to it. The phrasing of your
question seems to suggest you are not in that situation, though.
 

John Devereux

Sunil said:
Hi all,

I am using the gcc compiler on Linux. I compiled a small program:
int main()
{
printf("char : %d\n",sizeof(char));
printf("unsigned char : %d\n",sizeof(unsigned char));
printf("short : %d\n",sizeof(short));

<SNIP>

This brings to mind something that I have wondered about.

I often see advice elsewhere, and in other people's programs,
suggesting hiding all C "fundamental" types behind typedefs such as

typedef char CHAR;
typedef int INT32;
typedef unsigned int UINT32;
typedef char* PCHAR;

The theory is that application code which always uses these typedefs
will be more likely to run on multiple systems (provided the typedefs
are changed of course).

I used to do this. Then I found out that C99 defined things like
"uint32_t", so I started using these versions instead. But after
following this group for a while I now find even these ugly and don't
use them unless unavoidable.

What do people here think is best?
 

Lowell Gilbert

John Devereux said:
<SNIP>

This brings to mind something that I have wondered about.

I often see advice elsewhere, and in other people's programs,
suggesting hiding all C "fundamental" types behind typedefs such as

typedef char CHAR;
typedef int INT32;
typedef unsigned int UINT32;
typedef char* PCHAR;

The theory is that application code which always uses these typedefs
will be more likely to run on multiple systems (provided the typedefs
are changed of course).

I used to do this. Then I found out that C99 defined things like
"uint32_t", so I started using these versions instead. But after
following this group for a while I now find even these ugly and don't
use them unless unavoidable.

What do people here think is best?

Depends on whether the code really needs to depend on the size of its
variables.
 

tanmoy87544

John said:
I used to do this. Then I found out that C99 defined things like
"uint32_t", so I started using these versions instead. But after
following this group for a while I now find even these ugly and don't
use them unless unavoidable.

What do people here think is best?

Good coding style can rarely be encapsulated into simple rules! I
suggest that these C99 features be used in favour of pragmas (like `I
want this thing to be really fast') or making unwarranted assumptions
like unsigned int can hold 32 bits. If I can get away by just using
long instead of int, though, I do it in preference to using the more
precise specifications.

The idea is that even though we can avoid them, we often do not write a
strictly conforming code because a fast algorithm using bit
manipulations might be available if we made assumptions about the exact
number of bits, and conditioning everything on limits.h may be
unnecessary for the project at hand. Or, we may be tempted to use
compiler flags to guarantee speed or space savings. If the C99
integral type definitions solve the problem (by, at worst, detecting
problem at compile time), use them in preference to silent breakage
when code is ported.

Most often, I find no particular reason to use them, and I do not.
 

Skarmander

Eric said:
On some systems I have used, the output would claim
that all the types are of size zero. Hint: What type of
value does "%d" expect, and what type of value does sizeof
produce?
<snip>
What exactly *is* the format specifier for size_t in C90? C99 has "%zu",
but is (say) "%lu" guaranteed to work?

S.
 

John Devereux

Lowell Gilbert said:
Depends on whether the code really needs to depend on the size of its
variables.

What I am asking is whether one should habitually use them *just in
case* something breaks when run on another platform. I have seen
programs where int, long etc. are *never used* except in a "types.h"
header.
 

Eric Sosman

Skarmander wrote on 10/05/05 10:30:
Eric said:
Sunil said:
printf("char : %d\n",sizeof(char));
[...]

On some systems I have used, the output would claim
that all the types are of size zero. Hint: What type of
value does "%d" expect, and what type of value does sizeof
produce?
<snip>
What exactly *is* the format specifier for size_t in C90? C99 has "%zu",
but is (say) "%lu" guaranteed to work?

The usual C90 way is

printf ("size = %lu\n", (unsigned long)sizeof(Type));

This works because size_t must be an unsigned integer type,
C90 has only four such types, and unsigned long can handle
all the values of any of the four.

In C99 the number of unsigned integer types is much
larger, and varies from one implementation to another. The
widest unsigned integer type is uintmax_t, so one could
write (using another C99-invented length modifier)

printf ("size = %ju\n", (uintmax_t)sizeof(Type));

I do not know for sure why the committee decided to
invent the "z" length modifier, but two motivations seem
plausible:

- That silly cast is a pain, and since other length
modifiers were already being invented it was easy
to introduce a new one for size_t.

- On small machines size_t might be as narrow as 16
bits, while uintmax_t must be at least 64 bits.
Working with 64-bit values might require a multi-
precision software library that would otherwise
not be needed. The "z" modifier lets one avoid
using uintmax_t, and might allow the implementation
to exclude the unnecessary library (recall systems
that tried to omit software floating-point support
when they thought the program wouldn't use it.)
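
Side by side, in one runnable sketch (C99; only the first printf is
also valid C90):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    size_t n = sizeof(long);

    printf("C90 style: %lu\n", (unsigned long)n);  /* cast to match %lu */
    printf("C99 style: %zu\n", n);                 /* 'z' matches size_t */
    printf("uintmax_t: %ju\n", (uintmax_t)n);      /* widest unsigned type */
    return 0;
}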
 

Skarmander

Eric said:
Skarmander wrote on 10/05/05 10:30:
Eric Sosman wrote:
Sunil wrote:
printf("char : %d\n",sizeof(char));
[...]

On some systems I have used, the output would claim
that all the types are of size zero. Hint: What type of
value does "%d" expect, and what type of value does sizeof
produce?

<snip>
What exactly *is* the format specifier for size_t in C90? C99 has "%zu",
but is (say) "%lu" guaranteed to work?


The usual C90 way is

printf ("size = %lu\n", (unsigned long)sizeof(Type));

This works because size_t must be an unsigned integer type,
C90 has only four such types, and unsigned long can handle
all the values of any of the four.
I do not know for sure why the committee decided to
invent the "z" length modifier, but two motivations seem
plausible:
<snip>

I'm guessing because it offers a valuable abstraction. The argument by
elimination one has to apply for C90 is shaky: a future implementation
that uses 64-bit size_t's but 32-bit longs (and 64-bit long longs,
presumably) will find its format specifiers outdated.

After all, there's a reason size_t wasn't just defined as "unsigned
long", and the format specifiers should allow for it.

S.
 

Simon Biber

Skarmander said:
<snip>

I'm guessing because it offers a valuable abstraction. The argument by
elimination one has to apply for C90 is shaky: a future implementation
that uses 64-bit size_t's but 32-bit longs (and 64-bit long longs,
presumably) will find its format specifiers outdated.

The point is, C90 and C99 are different languages, and correct code on
one is not necessarily correct code on the other. Pick one of the two,
and set up your compiler options to match that choice.

If you are writing C90 code, there cannot be any integer type larger
than unsigned long. Therefore, casting size_t to unsigned long must
preserve the correct value. Any implementation that has 64-bit size_t
but 32-bit long DOES NOT CONFORM TO C90.

If you are writing for C99, then you should be using the %zu specifier
and not trying to cast to unsigned long.
 

Skarmander

Simon said:
The point is, C90 and C99 are different languages, and correct code on
one is not necessarily correct code on the other. Pick one of the two,
and set up your compiler options to match that choice.

If you are writing C90 code, there cannot be any integer type larger
than unsigned long. Therefore, casting size_t to unsigned long must
preserve the correct value. Any implementation that has 64-bit size_t
but 32-bit long DOES NOT CONFORM TO C90.

Oh, that's a good point. An implementation wouldn't be allowed to do
that in C90 mode even if it could.

That is, I think. The standard *is* worded in such a way that makes it
impossible for size_t to be an integral type different from unsigned
char, short, int, long, right? It'll say something like "the integral
types are such and such" and "size_t must be an unsigned integral type",
so that size_t is always convertible to an unsigned long without loss.

If you are writing for C99, then you should be using the %zu specifier
and not trying to cast to unsigned long.

Yes, but that wasn't exactly the point. The question was why C99 added
it in the first place. And in my opinion, this was to settle the matter
once and for all. Had C99 not added "%zu", then C99 would be subject to
the same problem, possibly limiting the implementation artificially. C90
needs %lu, C99 would have needed %llu, etc. (Not that I imagine many
successors to C99 will boost the ranges of integral types, but you get
the point.)

The comparison here is not between C90 and C99, but between C99 and its
hypothetical successor. The "just use the specifier for the biggest
integer in the language" approach is not stable (and not clean), and the
simple introduction of a new specifier to cover the abstraction solves
this issue now and forever.

S.
 

Keith Thompson

John Devereux said:
<SNIP>

This brings to mind something that I have wondered about.

I often see advice elsewhere, and in other people's programs,
suggesting hiding all C "fundamental" types behind typedefs such as

typedef char CHAR;
typedef int INT32;
typedef unsigned int UINT32;
typedef char* PCHAR;

The theory is that application code which always uses these typedefs
will be more likely to run on multiple systems (provided the typedefs
are changed of course).

I used to do this. Then I found out that C99 defined things like
"uint32_t", so I started using these versions instead. But after
following this group for a while I now find even these ugly and don't
use them unless unavoidable.

Of the typedefs above, I'd have to say that CHAR and PCHAR are utterly
useless. Presumably there's never any reason to define CHAR as
anything other than char, or PCHAR as anything other than char*. If
so, just use char and char* directly, so the reader doesn't have to
wonder if CHAR and PCHAR have been defined properly. If not, the
names CHAR and PCHAR are misleading.

As for INT32 and UINT32, of course those definitions will have to be
changed for systems where int and unsigned int are something other
than 32 bits. C99, as you've seen, provides int32_t and uint32_t in
<stdint.h>. If you don't have a C99 compiler, you can define them
yourself. Doug Gwyn has written a public domain implementation of
some of the new C99 headers for use with C90; see
<http://www.lysator.liu.se/c/q8/>.
 

Emmanuel Delahaye

Skarmander wrote:
What exactly *is* the format specifier for size_t in C90? C99 has "%zu",
but is (say) "%lu" guaranteed to work?

Yes, with a cast to (unsigned long).
 

Christian Bau

"Sunil said:
What I want to know is: what will be the effect if I use int in
place of long in applications running on Linux, and on what factors
does the size of data types depend?

int is guaranteed to be capable of holding values in the range -32767 to
+32767, nothing else. You can use int if your data is in that range.

long is guaranteed to be capable of holding values in the range from
about -2,000,000,000 to +2,000,000,000. Use long if your data can be
outside the range guaranteed to be available for int, but not outside
the larger range.

Size of datatypes depends on whatever the compiler writer thought was a
good idea. Don't be surprised if sizeof (long) or sizeof (void *) is
greater than four.
 

Christian Bau

John Devereux said:
I often see advice elsewhere, and in other people's programs,
suggesting hiding all C "fundamental" types behind typedefs such as

typedef char CHAR;
typedef int INT32;
typedef unsigned int UINT32;
typedef char* PCHAR;

The theory is that application code which always uses these typedefs
will be more likely to run on multiple systems (provided the typedefs
are changed of course).

I used to do this. Then I found out that C99 defined things like
"uint32_t", so I started using these versions instead. But after
following this group for a while I now find even these ugly and don't
use them unless unavoidable.

typedef char CHAR; and typedef char* PCHAR; are just plain stupid.

"int", "long" etc. , properly used, is the best way to code. However,
they are often not properly used, and there will be lots of trouble
because of that when 64 bit systems become more widely available. If you
use a typedef like INT32 or uint32_t, at least I know what assumptions
you made.
 

Christian Bau

Skarmander said:
<snip>
What exactly *is* the format specifier for size_t in C90? C99 has "%zu",
but is (say) "%lu" guaranteed to work?

printf("short: %lu\n", (unsigned long) sizeof(short));

will work as long as a short is fewer than four billion bytes :)

(I remember seeing a bug while a program was being ported: A function
took an argument of type long, and the value passed was "- sizeof
(short)". The function received a value of 65534 which was a bit
unexpected. And yes, the compiler was right. )
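
The effect is easy to reproduce (a sketch; the value printed depends
on the width of size_t on your system):

#include <stdio.h>

int main(void)
{
    /* sizeof's result is an unsigned size_t, so negating it wraps
       around. With a 16-bit size_t and sizeof(short) == 2, the
       result is 65534. */
    printf("%lu\n", (unsigned long)-sizeof(short));
    return 0;
}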
 

Alexei A. Frounze

....
printf("short: %lu\n", (unsigned long) sizeof(short));

will work as long as a short is fewer than four billion bytes :)

Correct :)
(I remember seeing a bug while a program was being ported: A function
took an argument of type long, and the value passed was "- sizeof
(short)". The function received a value of 65534 which was a bit
unexpected. And yes, the compiler was right. )

C is wonderful in this respect. Perhaps because of this, Java AFAIK
has no unsigned types.

Alex
 
