Standard integer types vs <stdint.h> types

euler70

char and unsigned char have specific purposes: char is useful for
representing characters of the basic execution character set and
unsigned char is useful for representing the values of individual
bytes. The remainder of the standard integer types are general
purpose. Their only requirement is to satisfy a minimum range of
values, and also int "has the natural size suggested by the
architecture of the execution environment". What are the reasons for
using these types instead of the int_fastN_t types of <stdint.h>?

If I want just a signed integer with a width of at least 16 bits, then
why choose int instead of int_fast16_t? int_fast16_t is the "fastest"
signed integer type with a width of at least 16 bits, while int is
simply a signed integer type with a width of at least 16 bits. It seems
that choosing int_fast16_t is at least as good as choosing int. The
same argument can be made for N = 8, 16, 32, 64, and for the
corresponding unsigned types.
<stdint.h> also offers int_leastN_t types, which are useful when
keeping a small storage size is the greatest concern.
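
For concreteness, here is a minimal sketch of how the three <stdint.h>
flavours state intent (the variable names are only illustrative):

    #include <stdint.h>

    int_fast16_t counter;  /* fastest signed type with at least 16 bits */
    int_least16_t stored;  /* smallest signed type with at least 16 bits */
    int16_t wire;          /* exactly 16 bits; optional, two's complement */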

The only benefit of using the standard integer types I can see is that
their ranges may expand as the C standard progresses, so code that
uses them might stay useful over time. For example fseek of <stdio.h>
uses a long for its offset parameter, so if the range of long grows
then fseek will automatically offer a wider range of offsets.
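
A sketch of that interface (the function name is mine; error handling
omitted):

    #include <stdio.h>

    /* fseek's offset parameter is a long, so the reachable range of
       positions automatically grows if an implementation widens long. */
    int restore_position(FILE *fp, long off)
    {
        return fseek(fp, off, SEEK_SET);
    }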

It's also interesting to note (or maybe not) that in the Windows
world, the idea of long being general purpose has somewhat been
destroyed and long has become a type that must have exactly 32 bits.
 
Malcolm McLean

euler70 said:
<snip question asking why one would use the standard integer types
instead of the int_fastN_t types of <stdint.h>>

Yes, we're rapidly going down the path of destroying the C basic integer
types.

Once you start inventing types like int_fast16_t people will use them, and
the language becomes more and more difficult to read.

My own view is that you should be able to stick to char for characters and
int for integers, in almost every situation. However this is only tenable if
you can use int as both an arbitrary array index and a fast type.
 
Flash Gordon

Malcolm McLean wrote, On 18/01/08 09:09:
Yes. Also pointer to char is the type normally taken by "string"
functions including (but not limited to) those in the standard library.
The remainder of the standard integer types are general
purpose. Their only requirement is to satisfy a minimum range of
values, and also int "has the natural size suggested by the
architecture of the execution environment".
Yes.
What are the reasons for
using these types instead of the int_fastN_t types of <stdint.h>?


Well, stdint.h was only introduced in the 1999 standard, a standard that
is not fully implemented by many compilers and not at all by at least
one major player.

Yes, this is true, and it is an excellent reason for using the standard
types. Relying on <stdint.h> does limit portability to implementations
that do not support this part of C99; however, it is always possible to
write your own version of stdint.h for those systems.
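
For instance, a minimal stand-in might look like this. It is only a
sketch: the typedefs assume a platform where short is 16 bits and int
is 32 bits, and must be re-checked for each implementation:

    /* my_stdint.h - partial <stdint.h> substitute for a pre-C99
       compiler; ASSUMES 16-bit short and 32-bit int. */
    typedef short int_least16_t;
    typedef int   int_fast16_t;
    typedef int   int_least32_t;
    typedef int   int_fast32_t;
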
<stdint.h> also offers int_leastN_t types, which are useful when
keeping a small storage size is the greatest concern.


Yes. Also due to the possibly smaller size they can be *faster* for some
purposes. For example if the smaller size means everything is kept in
cache instead of having to be fetched.

Yes, that is a potential advantage.
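
A sketch of that cache effect (the sizes are typical, not guaranteed):

    #include <stdint.h>

    #define ENTRIES 1000000

    /* The least-width type keeps the table small, so more of it stays
       in cache; with a 32-bit int the same table would be roughly four
       times larger. */
    uint_least8_t flags[ENTRIES];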

There is an argument that fseek should have used another type...

Of course, that is why there is fsetpos.
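
A sketch of the difference (the function name is mine; error handling
kept minimal):

    #include <stdio.h>

    /* fpos_t is not limited to long's range, so a position saved with
       fgetpos can be restored with fsetpos even on files too large for
       ftell/fseek to address. */
    int peek_ahead(FILE *fp)
    {
        fpos_t pos;
        if (fgetpos(fp, &pos) != 0)
            return -1;
        /* ... read ahead here ... */
        return fsetpos(fp, &pos);
    }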

Yes, and it is not entirely the fault of MS. It is the programmers who
assumed that it would always be exactly 32 bits and/or assumed it would
always be the same size as int. Not breaking such 3rd party code, as I
understand it, was the reason for MS keeping long as 32 bits on Win64.
Yes, we're rapidly going down the path of destroying the C basic integer
types.

Note that this is an opinion that seems to be almost unique to Malcolm.
A lot of what Malcolm disagrees with was part of the original standard
published in 1989 so he has a very strange idea of "rapidly".

Note that some others think the fixed-width types are a mistake,
although some of us disagree, there being arguments on both sides. More
people (I think) would have liked things like int32_t being the fast
types, with int_exact or int_fixed for the optional fixed-size types.
Once you start inventing types like int_fast16_t people will use them,
and the language becomes more and more difficult to read.

In *your* opinion.
My own view is that you should be able to stick to char for characters
and int for integers, in almost every situation. However this is only
tenable if you can use int as both an arbitrary array index and a fast
type.

Which is not its purpose and does not agree with the way modern
processors work, since a 32-bit integer will often be faster than a
64-bit integer (even on 64-bit hardware), yet you need a 64-bit integer
for an arbitrary index. Similar things were true for earlier hardware
in terms of 16/32 bits. So Malcolm's view is that C has to meet two
contradictory requirements at the same time.
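
A sketch of how the two roles can be split (the names are mine):

    #include <stddef.h>
    #include <stdint.h>

    /* The index must span the whole object, so it is a size_t;
       the arithmetic uses the fast at-least-32-bit type. */
    int_fast32_t sum32(const int_fast32_t *a, size_t n)
    {
        int_fast32_t total = 0;
        size_t i;
        for (i = 0; i < n; i++)
            total += a[i];
        return total;
    }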
 
Bart C

Malcolm McLean said:
Yes, we're rapidly going down the path of destroying the C basic integer
types.

Once you start inventing types like int_fast16_t people will use them, and
the language becomes more and more difficult to read.
My own view is that you should be able to stick to char for characters and
int for integers, in almost every situation. However this is only tenable
if you can use int as both an arbitrary array index and a fast type.

My own view is the opposite. How can one program without knowing the bitsize
of one's datatypes? But I've probably spent too many years low-level
programming.

I gave one example in c.l.c recently of a long int datatype that was either
32 or 64-bits depending on compiler -- on the same processor. So if you port
a supposedly portable program from one compiler to another, it could run
much slower (or faster).

Anyway for interfacing with other (binary) software, the exact sizes of
datatypes becomes important. Or (in my case) trying to interface from
another language to C.
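
A sketch of the kind of declaration involved (the record layout is
mine; byte order and padding still have to be agreed separately):

    #include <stdint.h>

    /* A record exchanged with other (binary) software needs fields of
       known width, which the exact-width types pin down. */
    struct sample_record {
        uint32_t timestamp;
        int16_t  level;
        uint8_t  channel;
        uint8_t  flags;
    };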
 
Richard Heathfield

Flash Gordon said:
Malcolm McLean wrote, On 18/01/08 09:09:


Note that this is an opinion that seems to be almost unique to Malcolm.
A lot of what Malcolm disagrees with was part of the original standard
published in 1989 so he has a very strange idea of "rapidly".

Here at least, I can agree with you.
Note that some others think the fixed-width types are a mistake,
although some of us disagree, there being arguments on both sides. More
people (I think) would have liked things like int32_t being the fast
types, with int_exact or int_fixed for the optional fixed-size types.

I would rather not have seen these types at all, since they seem to me to
enshrine poor practice. But I recognise that there are some spheres of
programming in which they could prove useful.
In *your* opinion.

I think he has a point. At the very least, it becomes *uglier* to read. C
as it stands, if well-written, is at least a relatively elegant language,
not just technically and syntactically but also visually. All these
stretched-out underscore-infested type names will be a real nuisance when
scanning quickly through unfamiliar code.

<snip>
 
Richard Heathfield

Bart C said:
How can one program without knowing the bitsize of one's datatypes?

We know the minimum value range of our data types - why would we need to
know more than that?
But I've probably spent too many years low-level programming.

Okay, that's one reason. Any more? Huh? Huh? :)
 
Malcolm McLean

Bart C said:
My own view is the opposite. How can one program without knowing the
bitsize of one's datatypes? But I've probably spent too many years
low-level programming.
If you've got to interface with assembly, yes, there is no real option other
than to check through the calling code making sure that the bits are as you
want them. I don't think there's an answer; sometimes you want integers the
size of registers, sometimes fixed sizes, and the assembly code is as likely
to know as the calling C code.

However normally you don't. The integer represents something, which is
normally an index into an array (even chars are really usually indices into
glyph tables). So what you need is an integer that can index the biggest
array possible, and is also fast. Which on some architectures is a bit of a
contradiction, because the vast majority of your arrays will never grow to
more than a hundred items or so, whilst there is a flat memory space that is
many gigabytes in size.
 
Flash Gordon

Bart C wrote, On 18/01/08 11:03:

My own view is the opposite. How can one program without knowing the bitsize
of one's datatypes?

A lot of the time, very easily.
But I've probably spent too many years low-level
programming.
Possibly.

I gave one example in c.l.c recently of a long int datatype that was either
32 or 64-bits depending on compiler -- on the same processor. So if you port
a supposedly portable program from one compiler to another, it could run
much slower (or faster).

It could run much slower or faster due to the quality of implementation
of the standard library as well.
Anyway for interfacing with other (binary) software, the exact sizes of
datatypes becomes important. Or (in my case) trying to interface from
another language to C.

Often you *still* don't need to know. I can interface C and C++ without
needing to know; all I need to know is that the compilers I am using
for the two languages use the same sizes, not what they are. If it is
simply a binary library designed for C on my platform, then again all I
need to know is the types (which should be specified in a header file)
and not their sizes.

There *are* times when you need to know, and there are times when you
need to know a type has at least a specific range, but a lot of the time
you do not care if it is larger.
 
pete

If I want just a signed integer with a width of at least 16 bits, then
why choose int instead of int_fast16_t? int_fast16_t is the "fastest"
signed integer type with a width of at least 16 bits, while int is
simply a signed integer type with a width of at least 16 bits.

Two reasons:
1. int is everywhere; int_fast16_t isn't available on all C
implementations.
2. int exactly fits the description of what you said you wanted:
"just a signed integer with a width of at least 16 bits".
 
Flash Gordon

Richard Heathfield wrote, On 18/01/08 11:29:
Flash Gordon said:
In *your* opinion.
I think he has a point. At the very least, it becomes *uglier* to read. C
as it stands, if well-written, is at least a relatively elegant language,
not just technically and syntactically but also visually. All these
stretched-out underscore-infested type names will be a real nuisance when
scanning quickly through unfamiliar code.

That, in my opinion, is an argument over the chosen names rather than
the addition of the types. Personally I don't find underscores in names
a problem for scanning, especially once I have learnt the patterns.
 
Bart C

Richard Heathfield said:
Bart C said:
How can one program without knowing the bitsize of one's datatypes?
We know the minimum value range of our data types - why would we need to
know more than that?

For ... performance?

If I know I need, say, 32 bits, but not 64 bits, which would be
overkill, what do I use? int could be only 16 bits; long int is at
least 32 but could be 64.

I would have to choose long int but at the risk of being inefficient (I
might need a few hundred million of them).

If I distribute an application as source code, I don't have control of
the final size unless I specify the exact compiler and version.

So it's necessary to use alternatives, like the stuff in stdint.h.
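
A sketch of the choice described above (the function name and the
figure of 200 million are only illustrative):

    #include <stdint.h>
    #include <stdlib.h>

    #define COUNT 200000000  /* a few hundred million values */

    /* int_least32_t promises at least 32 bits without risking the
       memory cost of a 64-bit long for every element. */
    int_least32_t *make_values(void)
    {
        return malloc(COUNT * sizeof(int_least32_t));
    }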
 
Richard Heathfield

Bart C said:
For ... performance?

<shrug> For some, maybe. I generally find that simply selecting good
algorithms is sufficient to give me "good-enough" performance. Yeah,
absolutely, there are some situations where you need to hack every last
spare clock out, but they are not as common as people like to imagine. I'd
rather write clear code than fast code (which doesn't mean I don't like my
code to be fast). And in any case, when you start selecting types based on
their performance, it won't be long before you discover that what's faster
on one machine could well turn out to be slower on another.

So it's necessary to use alternatives, like the stuff in stdint.h.

If you'd said that *you* find it necessary, okay, I'd have to accept that,
obviously. I don't think I've ever found it necessary, though.
 
euler70

Richard Heathfield said:
<shrug> For some, maybe. I generally find that simply selecting good
algorithms is sufficient to give me "good-enough" performance. Yeah,
absolutely, there are some situations where you need to hack every last
spare clock out, but they are not as common as people like to imagine. I'd
rather write clear code than fast code (which doesn't mean I don't like my
code to be fast). And in any case, when you start selecting types based on
their performance, it won't be long before you discover that what's faster
on one machine could well turn out to be slower on another.
[snip]

The problem with signed char and [unsigned] (short|int|long|long long)
is that they are too general purpose. They are at an awkward
not-very-useful spot between the super-high-level "give me an integer
object in which I can store any integer value" and the super-low-level
"give me this many bits that I can play with". As a result, what seems
to have happened in practice is that different camps have created
their own de-facto purposes for some of these types. For example in
the world of Windows, long is essentially a type that has exactly 32
bits. Elsewhere, long may be the de-facto 64-bit type.
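
The variation is easy to observe; this small sketch prints a different
answer under different platform conventions:

    #include <stdio.h>
    #include <limits.h>

    int main(void)
    {
        /* 32 under the Windows convention, often 64 elsewhere */
        printf("long is %d bits here\n", (int)(sizeof(long) * CHAR_BIT));
        return 0;
    }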

For portable code, this can become detrimental to efficiency. If I
want a fast type with at least 32-bit range, C90 says I should choose
long. This might end up being a good choice for one compiler, but on
another compiler, where a different de-facto purpose for long causes it
to be significantly less efficient than another available
at-least-32-bit type, half of the intent behind my choice of long has
been ruined.

If you argue against the preceding paragraph by saying "you should not
be so concerned about efficiency", then I think your reasoning is a
very short step away from concluding that we can discard our worries
about type ranges and efficiency and simply use only intmax_t and
uintmax_t everywhere. Surely this is not the level of abstraction that
C is intended for.

This is the reasoning that has led me to conclude that the
int_(fast|least)N types are more useful than signed char and
[unsigned] (short|int|long|long long). They allow me to state my
entire intent instead of stating only half of it and hoping the other
half works out. Having types that allow me to say "I want the
(fastest|smallest) type that gives me at least N bits" is more useful
than having types that only allow me to say "I want a type that gives
me at least N bits".
 
James Kuyper

euler70 said:
<snip question asking why one would use the standard integer types
instead of the int_fastN_t types of <stdint.h>>


Keep in mind that the C type system grew over decades. The committee
considers backwards compatibility to be very important (IMO, correctly),
but it has also attempted to alleviate some of the problems associated
with the original language design. As a result of those conflicting
goals, C has a lot of internal inconsistencies.

If it had been designed from scratch with something similar to the
current result in mind, we would probably have only the size-named types
from stdint.h, they wouldn't require a special header, and they'd
probably have simpler, shorter names. Aside from the fact that their
names are easier to type, char, short, int, and long don't have any
inherent advantage over the size-named types.

If and when C99 gets fully adopted by most mainstream compilers and
programmers, the main remaining reason for using char, short, int, or
long will be that your code must be compatible with an interface
defined in terms of those types. That applies to almost the entire C
standard library, as well as large parts of most of the other libraries
I've used.
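
For instance, a sketch of such a constraint (the function name is
mine): strtol is specified in terms of long, so calling code traffics
in long regardless of what <stdint.h> offers:

    #include <stdlib.h>

    /* The return type is dictated by the library interface. */
    long parse_decimal(const char *s)
    {
        return strtol(s, NULL, 10);
    }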
 
Malcolm McLean

euler70 said:
<snip argument that the int_(fast|least)N types let one state the
entire intent>

I absolutely agree.
I think we are creating a mess with all these integer types and conventions
on how they should be used.

However generally I want an integer which can index or count an array, and
is fast, and is signed, because intermediate calculations might go below
zero.
This is usually semi-achievable. There will be a fast type the same
width as the address bus; the one bit lost to the sign we can ignore,
as we are unlikely to want a char array taking up half of memory (we
can always resort to "unsigned" for that special situation). But it
might not be the fastest type, and most arrays will in fact be a lot
smaller than the largest possible array.
 
Richard Heathfield

Flash Gordon said:
Richard Heathfield wrote, On 18/01/08 11:29:
At the very least, [C with new integer types] becomes *uglier* to read.
C as it stands, if well-written, is at least a relatively elegant
language, not just technically and syntactically but also visually. All
these stretched-out underscore-infested type names will be a real
nuisance when scanning quickly through unfamiliar code.

That, in my opinion, is an argument over the chosen names rather than
the addition of the types.

Yes, it is. My argument against the new types *was* that they are
unnecessary, but I accept that what I really mean is that *I* don't see a
need for them in the kind of code I tend to write. If they will bring real
benefits to other C programmers, well, they're a wart I can live with,
since at least I won't have to come across it all that often, and then
only in other people's code, not my own.

But they could have found better names, surely? Abigail, for instance. Or
Rhododendron.

Yeah, all right, maybe not those precise names... :)
Personally I don't find underscores in names
a problem for scanning, especially once I have learnt the patterns.

Is ugliness a problem? I guess ugliness is in the eye of the beholder.
 
Richard Heathfield

euler70 said:

If I want a fast type with at least 32-bit range, C90 says I should
choose long. This might end up being a good choice for one compiler,
but on another compiler, where a different de-facto purpose for long
causes it to be significantly less efficient than another available
at-least-32-bit type, half of the intent behind my choice of long has
been ruined.

If you argue against the preceding paragraph by saying "you should not
be so concerned about efficiency",

It depends. :) We cannot and should not *ignore* efficiency.
Nevertheless, there is more to life than speed. Correctness, clarity,
generality and portability are all important too. But, as I said before,
there *are* occasions when you need to push the hardware as fast as it
will go. I do accept that.

<snip>
 
Flash Gordon

Malcolm McLean wrote, On 18/01/08 12:19:
However normally you don't. The integer represents something, which is
normally an index into an array (even chars are really usually indices
into glyph tables).

Please be aware that Malcolm seems to be the only person who thinks this.
So what you need is an integer that can index the
biggest array possible, and is also fast. Which on some architectures is
a bit of a contradiction, because the vast majority of your arrays will
never grow to more than a hundred items or so, whilst there is a flat
memory space that is many gigabytes in size.

As a result Malcolm seems to be the only person who thinks this.

The flaws in Malcolm's arguments have been pointed out many times so you
should be able to find them using Google.
 
Flash Gordon

Richard Heathfield wrote, On 18/01/08 13:37:
Flash Gordon said:

<snip discussion of types in stdint.h>

I think we reached this point before.
But they could have found better names, surely? Abigail, for instance. Or
Rhododendron.

Yeah, all right, maybe not those precise names... :)

Of course not those names, they should be Brenda, Heather... ;-)
Is ugliness a problem? I guess ugliness is in the eye of the beholder.

It is indeed. I'm so used to underscores in names that I don't see them
as such; I just see N words grouped together.
 
Flash Gordon

Malcolm McLean wrote, On 18/01/08 13:25:
euler70 said:
<snip argument that the int_(fast|least)N types let one state the
entire intent>

I absolutely agree.


Do you realise you are agreeing to having several integer types? I
thought you wanted there to be only a single one-size-fits-hardly-anyone
integer type!
I think we are creating a mess with all these integer types and
conventions on how they should be used.

Urm, he was just saying to use the new types which you say make things worse!
However generally I want an integer which can index or count an array,
and is fast, and is signed, because intermediate calculations might go
below zero.

Unsigned arithmetic handles that nicely, but to meet your desires I
suggest you use ptrdiff_t and size_t as appropriate. These types have
been there since at least the 1989 standard was implemented.
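
A sketch of those types in their intended roles (the function name is
mine; it assumes n does not exceed PTRDIFF_MAX):

    #include <stddef.h>

    /* size_t spans the index range of any object; ptrdiff_t is the
       signed type for index arithmetic that may go below zero. */
    ptrdiff_t find_last(const int *a, size_t n, int key)
    {
        ptrdiff_t i;
        for (i = (ptrdiff_t)n - 1; i >= 0; i--)
            if (a[i] == key)
                return i;   /* index of the last match */
        return -1;          /* not found */
    }
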
This is usually semi-achievable. There will be a fast type the same
width as the address bus; the one bit lost to the sign we can ignore,
as we are unlikely to want a char array taking up half of memory (we
can always resort to "unsigned" for that special situation). But it
might not be the fastest type, and most arrays will in fact be a lot
smaller than the largest possible array.

See above, the types you want have been supported for a long time. If
you don't like the spelling you can typedef them to something else.
 
