Standard integer types vs <stdint.h> types


Malcolm McLean

Flash Gordon said:
See above, the types you want have been supported for a long time. If you
don't like the spelling you can typedef them to something else.
bool breaks libraries.

If ANSI decides that "s64" shall be the name of a new type, that's fine. If
I define it myself, either the whole world adopts my convention, which is to
say I have usurped ANSI / ISO, or it is a nuisance to everybody trying to
call or read my code.
 

CBFalconer

Bart said:
For ... performance?

If I know I need, say, 32-bits, but not 64-bits which would be an
overkill, what do I use? int could be only 16. long int is at
least 32 but could be 64.

I would have to choose long int but at the risk of being
inefficient (I might need a few hundred million of them).

The basic problem here is historic. C was organized to allow for
16 bit processors, which led to the 16 bit minimum size for ints.
Without that the minimum sizes for short, int, long could have been
16, 32, 64, and long long would never have arisen. We can't change
the minima without breaking all sorts of programs.

At least the situation is such that there can be no more than one
trap value for 16 bit ints. :)
 

CBFalconer

.... snip ...

For portable code, this can become detrimental to efficiency. If I
want a fast type with at least 32-bit range, C90 says I should
choose long. This might end up being a good choice for one
compiler, but on another compiler, where long has a different
de-facto purpose that makes it significantly less efficient than
another available at-least-32-bit type, half of the intent behind
my choice of long has been ruined.

If you argue against the preceding paragraph by saying "you should
not be so concerned about efficiency", then I think your reasoning
is a very short step away from concluding that we can discard our
worries about type ranges and efficiency and simply use only
intmax_t and uintmax_t everywhere. Surely this is not the level of
abstraction that C is intended for.

I think you have exaggerated the problem. In general there will
only be one beast where you have to be critical of the exact size.
You should set up your own type, and alias it to the appropriate
standard type by using limits.h and conditional defines. Then
write your code in terms of that type. Note that this also allows
for use of other than the multiple of 8 sizes.
 
J

jacob navia

Flash said:
There *are* times when you need to know, and there are times when you
need to know a type has at least a specific range, but a lot of the time
you do not care if it is larger.

Rarely.

If I write

int a = 45000;

that will not work in 16 bit implementations.

You HAVE to know the bitsize to know how much data
you can put into an integer!

Potentially, all assignments are affected unless
you never use integer data bigger than 32767.
 

jacob navia

Richard said:
Bart C said:


We know the minimum value range of our data types - why would we need to
know more than that?


Okay, that's one reason. Any more? Huh? Huh? :)

Consider

int i = 38700;

If you do not care about int size you will never know that
this will not work in 16 bit implementations, or (worst)
i will be initialized to something completely different.

Of course YOU never use data bigger than 32767, as everyone knows.

:)
 

Flash Gordon

Malcolm McLean wrote, On 18/01/08 15:21:
bool breaks libraries.

So don't use it. Use _Bool if you want instead. There is no requirement
for you to #include <stdbool.h>.
If ANSI decides that "s64" shall be the name of a new type, that's fine.
If I define it myself, either the whole world adopts my convention,
which is to say I have usurped ANSI / ISO, or it is a nuisance to
everybody trying to call or read my code.

Misusing int as you are wont to do causes problems.

Either get yourself on the standard committee and try and get the
language changed
Or find a language that is defined as you want and use that
Or write your own language and use that
Or put up with the language as it is defined.

In any case, by your definition of what is wrong it was wrong in the
1989 standard so stop complaining that these things are rapidly
destroying the language, since that is patently false.
 

Richard Heathfield

jacob navia said:
Rarely.

If I write

int a = 45000;

that will not work in 16 bit implementations.

That is not a counter-example to Flash's point. If you need an integer type
that can store 45000, then "you need to know a type has at least a
specific range". Since the least range for int is -32767 to +32767, you
know that this won't do in portable code, so you move up to long int:

long int a = 45000;

This will work just fine in 16 bit implementations, no matter how big a
long int is, because it *must* have at least 32 bits.
You HAVE to know the bitsize to know how much data
you can put into an integer!

No, you have to know whether the value you wish to store will fit into the
range for a given type. If so, it doesn't matter how many /more/ bits it
has. And if not, it still doesn't matter how many bits it has - if your
value doesn't fit, it doesn't fit.
Potentially, all assignments are affected unless
you never use integer data bigger than 32767.

Right, which is why we have this thing called long int.
 

Richard Heathfield

jacob navia said:

Consider

int i = 38700;

Why? It's not guaranteed to work. Instead, consider:

long int i = 38700;

which *is* guaranteed to work.
If you do not care about int size you will never know that
this will not work in 16 bit implementations, or (worst)
i will be initialized to something completely different.

I don't know whose point you're arguing against, but the above response
does not address any statement I have made.
Of course YOU never use data bigger than 32767, as everyone knows.

Not so. But when I do, I use long int. And if I need values higher than
2147483647 and know they can't be negative, I'll use unsigned long int.
And if I need values higher than 4294967295, I'll use my bignum library.
 

Malcolm McLean

jacob navia said:
If I write

int a = 45000;

that will not work in 16 bit implementations.

You HAVE to know the bitsize to know how much data
you can put into an integer!
The question is, why do you want an integer equal to 45000?
We can think of lots of possible reasons, but a good one might be that you
have 45000 customer records you wish to read into memory.
Which means that you need 45000 * N bytes of contiguous storage, which most
machines will happily provide for fairly reasonable values of N. But not 16
bit machines.
As long as int is the same as a pointer and memory space is flat, everything
is fine.
 

Bart C

Malcolm said:
I think we are creating a mess with all these integer types and
conventions on how they should be used.

That's probably inevitable with C having to target every conceivable
hardware and having to maintain compatibility with every line of code ever
written.

My main interest is PCs which are typically 32/64 bits now, and will be at
least that for the foreseeable future. So I don't really care about anything
other than 8,16,32,64 bits. (And it can't be a coincidence that these sizes
figure heavily in stdint.h.)
However generally I want an integer which can index or count an
array, and is fast, and is signed, because intermediate calculations
might go below zero.
This is usually semi-achievable. There will be a fast type the same
width as the address bus, the one bit we can ignore as we are
unlikely to want a char array taking up half memory (we can always
resort to "unsigned" for that special situation), but it might not be
the fastest type, and most arrays will in fact be a lot smaller than
the largest possible array.

The number of (physical) address bits appearing on the cpu pinouts is
unlikely to be a full 64 bits. Even virtual address spaces are a lot less
than 64-bits I would guess.

(64-bits addresses a *lot* of memory. It might just be enough to address all
the memory in every desktop PC in the world.)

If your task-space is 4GB or less, then your array indexes, size_t's
and so on don't need to be more than 32-bit unsigned. But your file-system
is likely to need 64 bits.

(Actually 48-bits will probably suffice for most purposes but that is too
odd a figure.)
 

Malcolm McLean

Bart C said:
That's probably inevitable with C having to target every conceivable
hardware and having to maintain compatibility with every line of code ever
written.

My main interest is PCs which are typically 32/64 bits now, and will be at
least that for the foreseeable future. So I don't really care about anything
other than 8,16,32,64 bits. (And it can't be a coincidence that these sizes
figure heavily in stdint.h.)
Yes. In reality we have 8 bit, 16 bit, 32 bit and 64 bit integers to worry
about. And somehow we've managed to create a zoo of types.
The number of (physical) address bits appearing on the cpu pinouts is
unlikely to be a full 64 bits. Even virtual address spaces are a lot less
than 64-bits I would guess.

(64-bits addresses a *lot* of memory. It might just be enough to address
all the memory in every desktop PC in the world.)

If your task-space is 4GB or less, then your array indexes, size_t's
and so on don't need to be more than 32-bit unsigned. But your file-system
is likely to need 64 bits.

(Actually 48-bits will probably suffice for most purposes but that is too
odd a figure.)
48 bits is probably enough, but let's go with 64.
64 bits will address every byte on the biggest hard drive in the world, for
the conceivable future. So if integer calculations are in 64 bits, we should
have no problems.
Unfortunately there are speed considerations. But if a computer has more
than 4GB of addressable memory, these can't be too serious. There must be a
fast way of addressing all 5GB, or the processor is not of much practical
use. So 64 bits can be the default, which you use almost everywhere, except
in very tight inner loops where maybe speed matters, and except for special
data - raw audio data is traditionally 16 bits per sample, for example, and
you wouldn't normally want to quadruple the memory footprint. You do need
other types, but only rarely.
 

Paul Hsieh

char and unsigned char have specific purposes: char is useful for
representing characters of the basic execution character set and
unsigned char is useful for representing the values of individual
bytes. The remainder of the standard integer types are general
purpose. Their only requirement is to satisfy a minimum range of
values, and also int "has the natural size suggested by the
architecture of the execution environment". What are the reasons for
using these types instead of the int_fastN_t types of <stdint.h>?


The (u)int_fast<N>_t types? I imagine the C language committee will
make up some rationale that was voiced by some vendor who had some
specific reason for wanting to steer programmers into the direction of
using them for some specific application. This ignores, of course (as
the committee is prone to doing), the fact that a compatible system
created by a different vendor might want a different set of integer
types to be used.

Personally I highly recommend that users stay away from the
(u)int_fast<N>_t types, since they just put programmers in the same
position of not knowing what they've really got that using int and
long does. The compiler cannot definitively make a policy about which
integers are fast before the programmers have even begun to write
code.
If I want just a signed integer with at least 16 width, then why
choose int instead of int_fast16_t? int_fast16_t is the "fastest"
signed integer type that has at least 16 width, while int is simply a
signed integer type that has at least 16 width. It seems that choosing
int_fast16_t is at least as good as choosing int. This argument can be
made for N=8,16,32,64, and for the corresponding unsigned types.

Presumably a system might have a way of emulating 32 bits exactly that
is slow (but for compatibility reasons is chosen as the type for
[unsigned] long), but can perform with 40 bits somewhat faster even
though the int type is only 16 bits. It might be best to pick the
40 bit integer type for an inner loop, but the 32 bit integer type
for storing in a data structure that you want to fit into a cache, or
to be externally compatible with other systems.
<stdint.h> also offers int_leastN_t types, which are useful when
keeping a small storage size is the greatest concern.


Yeah. It's pretty unusual that I would use them though. I don't know
when anyone else would.
The only benefit of using the standard integer types I can see is that
their ranges may expand as the C standard progresses, so code that
uses them might stay useful over time.

Well, I use int when I really don't need to know its specific size and
I don't have a need to go over 32767. Similarly for long. They are
also needed for backward compatibility (stdint.h is a new thing from
C99, and has been added to some non-C99 compilers (gcc, Intel, Open
Watcom for example, but not Visual C++).)
[...] For example fseek of <stdio.h>
uses a long for its offset parameter, so if the range of long grows
then fseek will automatically offer a wider range of offsets.

Right except that this is nonsense as I and others have posted
before. "long" has to retain system compatibility even as file
systems change over time. intmax_t (a type which is practically
guaranteed to change as the compiler changes) is the only possible
correct type to use for file system offsets.
It's also interesting to note (or maybe not) that in the Windows
world, the idea of long being general purpose has somewhat been
destroyed and long has become a type that must have exactly 32 bits.

This has been destroyed on *ALL* systems which have made a transition
from one bitness to another (16->32 or 32->64, as typical examples.)
I am unaware of any compiler which, upon upgrading, simply changed the
size of one of the integer types just to correspond to the underlying
system's new capabilities.

stdint.h has been necessary for C from the very beginning, but this
reality only dawned on the committee in 1999. Of course now that the
industry has basically rejected C99 (for other reasons) we are left
without a pervasive stdint.h interface that we can reliably use. I
have personally done what I can to rectify this situation by making
http://www.pobox.com/~qed/pstdint.h publicly available. (If you use
it and notice that it doesn't work on your system please help me
update it.)
 

Paul Hsieh

Bart C said:


We know the minimum value range of our data types - why would we need to
know more than that?

You want to do math in a ring, without knowing what ring you are in?
More specifically: exactly how integers wrap around matters to a large
class of applications (like those that accept input.)

The reason why questions like this are cropping up is because of the
transition from 32 bits to 64 bits which is happening right now.
People want to know if they can still get away with using just 32 bits
with just a little more work, and not break their backwards
compatibility by pushing everything to 64 bits -- but doing so means
they need to *know* when their integers wrap around.
Okay, that's one reason. Any more? Huh? Huh? :)

How about creating a big integer library? How are you supposed to
capture/detect a carry if your underlying system happens to be
capturing it for you in extra integer bits it happened to magically
give you?

This also seriously affects some algorithms like primality testing.
If you know the size of your integers is less than 36 bits, there are
well known fast algorithms that can test for primality
deterministically in finite time. If it's more bits, then they only
work *nearly all the time*. So if you implement:

long factor (long n) {
    long f;
    if (isPrimeUpTo36bits (n)) return 1;  /* Not factorable */
    f = quickDivisor (n);
    if (f > 1) return f;
    for (f = 3; ; f += 2) {
        if (0 == (n % f)) return f;
    }
}

How do you even know if the algorithm terminates? If the system
decides that long is 40 bits, then it does not terminate. If the
system decides that long is 32 bits then it does.

We could force this to work by putting an extra condition in the for
loop which might cost in terms of performance. (It actually doesn't
in this case, but the (n % f) can be modified in a sort of "strength
reduction" way (using code much to complicated for a quick USENET
post) where it *does* matter.)
 

jxh

I think he has a point. At the very least, it becomes
*uglier* to read. C as it stands, if well-written, is at
least a relatively elegant language, not just technically
and syntactically but also visually. All these stretched-out
underscore-infested type names will be a real nuisance when
scanning quickly through unfamiliar code.

I think it would have been nicer to extend the C syntax to allow
bit-field style specifiers for regular integral objects.

unsigned int u16 : 16; /* u16 takes up 16 bits */

-- James
 

jameskuyper

Paul said:
You want to do math in a ring, without knowing what ring you are in?
More specifically: exactly how integers wrap around matters to a large
class of applications (like those that accept input.)

If wrap-around is important to your program, then yes, you need an
exact-sized type. I would disagree that wrap-around is needed for any
program that does input. In most of the programs I've ever written,
wrap around is something to be avoided, not something to be used.
Knowing the minimum ranges of a data type, and having symbolic access
(through limits.h) to the actual ranges, is sufficient for avoiding
wrap-around.
 

jameskuyper

(e-mail address removed) wrote:
....
.... I would disagree that wrap-around is needed for any
program that does input. ...

That should have been "every", not "any".
 

Ian Collins

CBFalconer said:
I think you have exaggerated the problem. In general there will
only be one beast where you have to be critical of the exact size.
You should set up your own type, and alias it to the appropriate
standard type by using limits.h and conditional defines. Then
write your code in terms of that type.

Isn't that what <stdint.h> is for?
Note that this also allows
for use of other than the multiple of 8 sizes.
Eh?
 

Ian Collins

Malcolm said:
The question is, why do you want an integer equal to 45000?
We can think of lots of possible reasons, but a good one might be that
you have 45000 customer records you wish to read into memory.
Which means that you need 45000 * N bytes of contiguous storage, which
most machines will happily provide for fairly reasonable values of N.
But not 16 bit machines.
As long as int is the same as a pointer and memory space is flat,
everything is fine.
Even on a 16 bit machine where int is the same as a pointer and memory
space is flat? You appear to have contradicted yourself.
 

Paul Hsieh

jacob navia said:

(This has the additional problem that on most 32-bit compilers, where it
*does* work, it will *not* give you a warning indicating that this is
not portable.)
Why? It's not guaranteed to work. Instead, consider:

long int i = 38700;

which *is* guaranteed to work.

It also might be unnecessarily slow. You are letting the compiler
vendor make decisions for you.

int32_t i = 38700;

is also guaranteed to work, and is totally non-controversial about
what it means or is doing.
I don't know whose point you're arguing against, but the above response
does not address any statement I have made.


Not so. But when I do, I use long int. And if I need values higher than
2147483647 and know they can't be negative, I'll use unsigned long int.
And if I need values higher than 4294967295, I'll use my bignum library.

So 1) you are dismissing the possibility of using floating point. 2)
Either your bignum library has support for file offsets or you have
vowed not to support the platform specific extensions for dealing with
file offsets on your machine.
 

Richard Heathfield

Paul Hsieh said:
You want to do math in a ring, without knowing what ring you are in?

If it matters (which it doesn't always), it is wise to select the ring
precisely, by applying appropriate mathematical operations.
More specifically: exactly how integers wrap around matters to a large
class of applications (like those that accept input.)

Unsigned integers wrap around to 0 at U<TYPE>_MAX + 1. As long as I know
that, either it's good enough (in which case that's fine), or it isn't, in
which case I have to force the behaviour I want (see above).

For signed integers, overflow invokes undefined behaviour anyway, so the
matter doesn't arise.

How about creating a big integer library?

Done that. Didn't need fixed size ints. Next question.
This also seriously affects some algorithms like primality testing.
If you know the size of your integers is less than 36 bits, there are
well known fast algorithms that can test for primality
deterministically in finite time.

I'm trying to think of a use for less-than-36-bit primes, and failing.
 
