size_t

Steffen Fiksdal

Can somebody please give me some rules of thumb about
when I should be using size_t instead of, for example, int?

Is size_t *always* typedef'd as the largest unsigned integral type on all
systems?

I know that functions such as strlen() return a size_t,
and that the result should be received that way.

I am currently creating a crypto library. Should all my
API functions use size_t parameters instead of, for example, int parameters,
and why?

And... should I bother using size_t in places other than where stdlib
functions need/return them in my library? (If yes, why?)

I ask because my code now works on Linux, Sun Solaris and windoze (32
bits), and I don't want strange problems running it on 64-bit
architectures.

Best regards
Steffen
 

Ben Pfaff

Steffen Fiksdal said:
Can somebody please give me some rules of thumb about
when I should be using size_t instead of, for example, int?

Use size_t to represent the size of an object.
Is size_t *always* typedef'd as the largest unsigned integral type on all
systems?

No. (That's uintmax_t, at least on C99 systems.)
I am currently creating a crypto library. Should all my
API functions use size_t parameters instead of, for example, int parameters,
and why?

size_t is the appropriate type to represent the size of an
object. If you're representing a size, use size_t, otherwise
pick another appropriate type.

It is not appropriate to simply search-and-replace `int' by
`size_t'.
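
To see why, consider the classic mixed signed/unsigned comparison; a
minimal sketch, assuming a typical implementation where size_t is at
least as wide as int:

#include <stdio.h>
#include <stddef.h>

int main(void)
{
    int i = -1;
    size_t n = 10;

    /* The usual arithmetic conversions turn i into a huge unsigned
       value before the comparison, so the "obvious" test fails. */
    if (i < n)
        printf("i < n\n");
    else
        printf("i >= n  (i was converted to a large unsigned value)\n");
    return 0;
}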
 

Eric Sosman

Steffen said:
Can somebody please give me some rules of thumb about
when I should be using size_t instead of, for example, int?

Generally, use a size_t whenever you're describing the
size of something. Some people also recommend using size_t
for array indices, but this is sometimes inconvenient.
Is size_t *always* typedef'd as the largest unsigned integral type on all
systems?

No; there may be integer types larger than any possible object size.
For example, you'll find systems where size_t is 32 bits but
unsigned long long is 64.
I know that functions such as strlen() return a size_t,
and that the result should be received that way.

I am currently creating a crypto library. Should all my
API functions use size_t parameters instead of, for example, int parameters,
and why?

Yes, probably. Why? Because an int may not be able to
describe the sizes of the buffers and what-not you pass back
and forth. Remember, an int could be as narrow as 16 bits.
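
To make that concrete: a hypothetical interface sketch for a crypto
library, with every buffer length carried as a size_t (the function
names are invented, not from any real library):

#include <stddef.h>

/* Interface sketch: buffer lengths travel as size_t, so callers can
   pass any object size the platform supports, 16-bit int or not. */
int crypto_encrypt(unsigned char *out, size_t out_len,
                   const unsigned char *in, size_t in_len);

size_t crypto_digest_size(void);  /* a size is returned as size_t too */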
And... should I bother using size_t in places other than where stdlib
functions need/return them in my library? (If yes, why?)

Use size_t for object sizes, array sizes, and so on. If
you're dealing with arrays that might have more than 32767
elements, use size_t array indices.
I ask because my code now works on Linux, Sun Solaris and windoze (32
bits), and I don't want strange problems running it on 64-bit
architectures.

Using size_t where appropriate will avoid some kinds of
trouble. Turn the question around and think of it this way:
Would you use signed char to describe buffer sizes? If not,
why not?
 

Keith Thompson

Steffen Fiksdal said:
Can somebody please give me some rules of thumb about
when I should be using size_t instead of, for example, int?

Is size_t *always* typedef'd as the largest unsigned integral type on all
systems?

No. size_t is an unsigned type capable of representing the size (in
bytes) of an object (though it's conceivable that there could be
objects too big for size_t). For example, a 32-bit system could have
a 32-bit size_t but a 64-bit "long long".
I know that functions such as strlen() return a size_t,
and that the result should be received that way.

I am currently creating a crypto library. Should all my
API functions use size_t parameters instead of, for example, int parameters,
and why?

And... should I bother using size_t in places other than where stdlib
functions need/return them in my library? (If yes, why?)

I ask because my code now works on Linux, Sun Solaris and windoze (32
bits), and I don't want strange problems running it on 64-bit
architectures.

size_t is the best type to use if you want to represent sizes of
objects. Using int to represent object sizes is likely to work on
most modern systems, but it isn't guaranteed. On the other hand, the
fact that size_t is unsigned can create some pitfalls. For example,
the following is an infinite loop:

size_t s;
for (s = 10; s >= 0; s--) {
    /* ... */
}
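
A common idiom to count down safely with an unsigned type is to test
before decrementing; a minimal sketch:

#include <stdio.h>
#include <stddef.h>

int main(void)
{
    size_t s;

    /* "s-- > 0" tests first, then decrements: the body sees s from
       9 down to 0, and the loop stops without wrapping past zero. */
    for (s = 10; s-- > 0; ) {
        printf("s = %zu\n", s);   /* %zu: the C99 conversion for size_t */
    }
    return 0;
}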
 

Chris Croughton

Steffen Fiksdal said:
Can somebody please give me some rules of thumb about
when I should be using size_t instead of, for example, int?

Is size_t *always* typedef'd as the largest unsigned integral type on all
systems?

No, it only needs to be able to handle the size of any object (for
instance, on a 32-bit system the maximum address is 32 bits, but C99
(and gcc) supports a "long long" type which is guaranteed to be at
least 64 bits).
I know that functions such as strlen() return a size_t,
and that the result should be received that way.

I am currently creating a crypto library. Should all my
API functions use size_t parameters instead of, for example, int parameters,
and why?

Parameters which are the sizes of things, like buffer lengths, and
return values which are sizes of objects, should be size_t. Signed
values and algorithm parameters should be an appropriate integer type.
And... should I bother using size_t in places other than where stdlib
functions need/return them in my library? (If yes, why?)

Again, yes, when dealing with things like buffer offsets (unless you
know that they won't be bigger than 32K-1).
I ask because my code now works on Linux, Sun Solaris and windoze (32
bits), and I don't want strange problems running it on 64-bit
architectures.

size_t will always be big enough to cope with object sizes. The best
thing would be to use the sized integer types in stdint.h if you are
using a C99 compiler and library, but if not you may want to detect it
using the values from limits.h (LLONG_MAX will only be defined if long
long is supported -- unfortunately it might not be defined even if long
long is supported, for instance using gcc in C89 mode).

Alternatively, if you want to stick with a Unix-type platform (including
Cygwin under Win32) you can use autoconf which will (with appropriate
rules) determine the sizes of types at installation time so you can use
them in preprocessor tests to typedef appropriate types.
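
As a rough sketch of that limits.h approach (the typedef name is
invented for illustration):

#include <limits.h>

/* If LLONG_MAX is visible, the implementation exposes long long
   (C99, or gcc outside strict C89 mode); otherwise fall back to
   unsigned long, the widest unsigned type C89 guarantees. */
#if defined(LLONG_MAX)
typedef unsigned long long widest_uint;
#else
typedef unsigned long widest_uint;
#endif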

Chris C
 

Ravi Uday

Keith Thompson said:
size_t is the best type to use if you want to represent sizes of
objects. Using int to represent object sizes is likely to work on
most modern systems, but it isn't guaranteed. On the other hand, the
fact that size_t is unsigned can create some pitfalls. For example,
the following is an infinite loop:

size_t s;
for (s = 10; s >= 0; s--) {
    /* ... */
}
What are the pitfalls of size_t! This loop appears straightforward to me,
yet it runs into an infinite loop as you suggested.
For me it prints (when using the %u specifier):
s = 10
s = 9
s = 8
s = 7
s = 6
s = 5
s = 4
s = 3
s = 2
s = 1
s = 0 <------- Shouldn't the for loop break after this?
s = 4294967295
s = 4294967294
s = 4294967293

Thanks,
- Ravi
 

CBFalconer

Ravi said:
What are the pitfalls of size_t! This loop appears straightforward
to me, yet it runs into an infinite loop as you suggested.
For me it prints (when using the %u specifier):
s = 10
s = 9
s = 8
s = 7
s = 6
s = 5
s = 4
s = 3
s = 2
s = 1
s = 0 <------- Shouldn't the for loop break after this?
s = 4294967295
s = 4294967294
s = 4294967293

Why? Where do you see a value of s that fails to meet the simple
test "s >= 0"? The operations on s follow the rules for
unsigned integral values, in which it is hard to uniquely express a
negative value.
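
A side note on the output above: %u only happens to match size_t when
size_t is unsigned int. A minimal sketch of the portable alternatives:

#include <stdio.h>
#include <stddef.h>

int main(void)
{
    size_t s = (size_t)-1;              /* -1 converts to the maximum
                                           size_t value, by the same
                                           modular rule */
    printf("%zu\n", s);                 /* C99: %zu matches size_t */
    printf("%lu\n", (unsigned long)s);  /* C89: cast to unsigned long */
    return 0;
}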
 

Dan Pop

Ben Pfaff said:
Use size_t to represent the size of an object.


No. (That's uintmax_t, at least on C99 systems.)

And unsigned long on C89 implementations.

Dan
 

Dan Pop

Keith Thompson said:
size_t is the best type to use if you want to represent sizes of
objects.

Only if you need to be able to represent the size of any object supported
by the implementation. Which is seldom the case.
Using int to represent object sizes is likely to work on
most modern systems, but it isn't guaranteed.

It is guaranteed to fail on most modern 64-bit systems, that use 32-bit
int and 64-bit size_t.

The best argument in favour of using size_t in a function interface is
consistency with the standard C library itself, which uses size_t for
(almost) all function arguments that represent the size of an object.

So, if your own library is known not to manipulate *any* objects whose
size exceeds, say, 10000 bytes, the type int is just fine for the job,
but size_t (marginally) improves your code's readability.
On the other hand, the
fact that size_t is unsigned can create some pitfalls. For example,
the following is an infinite loop:

size_t s;
for (s = 10; s >= 0; s--) {
    /* ... */
}

That's why you seldom want to have size_t variables, other than the ones
declared in your function interfaces.

Unsigned types are better reserved for bitwise operations *only*, unless
you really need the additional range they provide. So, even if you know
that your "count" variable is never going to have a negative value, it's
still better to make it int and not unsigned.
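
To illustrate the convention being advocated (a sketch; the flag names
are invented), unsigned types carry bit patterns while plain int does
the counting:

#include <stdio.h>

#define FLAG_ENCRYPT 0x1u   /* bit flags: unsigned territory */
#define FLAG_SIGN    0x2u

int main(void)
{
    unsigned flags = FLAG_ENCRYPT | FLAG_SIGN;  /* bitwise: unsigned */
    int count;                                  /* counting: plain int */

    for (count = 0; count < 2; count++)
        printf("pass %d, flags = %#x\n", count, flags);
    return 0;
}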

Dan
 

Keith Thompson

Dan Pop said:
And unsigned long on C89 implementations.

Agreed -- but many pre-C99 implementations provide "long long" and
"unsigned long long" as an extension. Such implementations *should*
provide a way to disable any extensions, including the "long long" and
"unsigned long long" types.
 

Keith Thompson

Dan Pop said:
Only if you need to be able to represent the size of any object supported
by the implementation. Which is seldom the case.

But there's no harm in using size_t to represent sizes even if your
program will never create an object bigger than INT_MAX bytes (other
than the gotchas of using an unsigned type).
It is guaranteed to fail on most modern 64-bit systems, that use 32-bit
int and 64-bit size_t.

You're right, my mistake. But it will fail only if you have objects
bigger than 2**31-1 bytes.

[...]
That's why you seldom want to have size_t variables, other than the ones
declared in your function interfaces.

Unsigned types are better reserved for bitwise operations *only*, unless
you really need the additional range they provide. So, even if you know
that your "count" variable is never going to have a negative value, it's
still better to make it int and not unsigned.

One approach is to avoid using unsigned types except where absolutely
necessary. Another is to use them whenever they seem appropriate
(e.g., using size_t for object sizes and array lengths) and watch out
for the pitfalls. I'm not sure which approach is better in general.

Intuitively, signed types act more like mathematical integers than
unsigned types do. When programming with signed types, you're more
likely to be able to get away with thinking of them as mathematical
integers. Things break down when you reach the limits of the type.
For signed types, the limits are likely to be beyond any values you're
going to deal with (but overflow is still an issue); for unsigned
types, the lower limit is 0, and you're very likely to run into it.

POSIX provides a ssize_t typedef, the signed equivalent of size_t.
ISO C does not. (ptrdiff_t is likely to be the same size as size_t,
but using it for anything other than pointer subtraction would be
ugly.)
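
For completeness, a small sketch of ptrdiff_t in its intended role,
pointer subtraction:

#include <stddef.h>
#include <stdio.h>

int main(void)
{
    char buf[100];
    char *p = buf + 42;

    ptrdiff_t off = p - buf;        /* pointer subtraction yields ptrdiff_t */
    printf("offset = %td\n", off);  /* %td: the C99 conversion for ptrdiff_t */
    return 0;
}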
 

Dan Pop

Keith Thompson said:
Agreed -- but many pre-C99 implementations provide "long long" and
"unsigned long long" as an extension. Such implementations *should*
provide a way to disable any extensions, including the "long long" and
"unsigned long long" types.

If they don't, they are not C implementations, just like lcc-win32.

Dan
 

Dan Pop

Keith Thompson said:
But there's no harm in using size_t to represent sizes even if your
program will never create an object bigger than INT_MAX bytes (other
than the gotchas of using an unsigned type).

There is no harm, indeed, but there are the pitfalls associated with
any unsigned type that doesn't get promoted to int. You've mentioned
one of them below.
You're right, my mistake. But it will fail only if you have objects
bigger than 2**31-1 bytes.

If you're not worried about large objects, there is little point in
bothering with size_t, except for readability purposes.
One approach is to avoid using unsigned types except where absolutely
necessary. Another is to use them whenever they seem appropriate
(e.g., using size_t for object sizes and array lengths) and watch out
for the pitfalls. I'm not sure which approach is better in general.

The one with fewer pitfalls, of course. Even if you're competent and
careful, the next maintainer of your code may not be.

Dan
 

Keith Thompson

Dan Pop said:
It is guaranteed to fail on most modern 64-bit systems, that use 32-bit
int and 64-bit size_t.

You're right, my mistake. But it will fail only if you have objects
bigger than 2**31-1 bytes.

If you're not worried about large objects, there is little point in
bothering with size_t, except for readability purposes.

If you're not worried about large objects, it's quite possible that
you should be, and that your program or a future version of it will
some day be dealing with larger objects than you've anticipated. I'd
rather write code that can deal with objects in general. It's
actually easier than trying to guess what assumptions I can get away
with making.

I don't think using size_t is that much of a bother anyway.

[...]
The one with fewer pitfalls, of course. Even if you're competent and
careful, the next maintainer of your code may not be.

If the next maintainer of my code is incompetent and careless, there's
nothing I can do to keep him from screwing it up somehow. If I want
code with training wheels, I'll use a language other than C.
 

Dan Pop

Keith Thompson said:
It is guaranteed to fail on most modern 64-bit systems, that use 32-bit
int and 64-bit size_t.

You're right, my mistake. But it will fail only if you have objects
bigger than 2**31-1 bytes.

If you're not worried about large objects, there is little point in
bothering with size_t, except for readability purposes.

If you're not worried about large objects, it's quite possible that
you should be, and that your program or a future version of it will
some day be dealing with larger objects than you've anticipated.

If the largest object currently needed has a size of 100 bytes, it's
hard to imagine a future version of the same program needing to handle
objects above 2 gigabytes. These sizes are usually imposed by the
problem and are invariant between different versions of the program.
I'd
rather write code that can deal with objects in general. It's
actually easier than trying to guess what assumptions I can get away
with making.

Who said *anything* about guessing? The sizes are usually well defined
by the problem. If you're writing a server and the size of the largest
valid request supported by the protocol you're implementing is 100 bytes,
there is little point in writing your code so that it can handle 10
gigabyte requests: if such a request ever arrives, it is guaranteed to
be invalid and you lose nothing by truncating it to the size of the
largest valid request plus one.

In most real life programs, being able to handle arbitrarily sized
objects has some (non-trivial) costs. These costs must be very
carefully weighed against the actual necessity for handling such
objects.
I don't think using size_t is that much of a bother anyway.

Using size_t doesn't solve, by magic, all the problems related to
supporting arbitrarily sized objects, but it has all the disadvantages
related to the usage of unsigned objects in contexts not requiring them.
If the next maintainer of my code is incompetent and careless, there's
nothing I can do to keep him from screwing it up somehow.

But there are plenty of things you can do to help him get
things right.
If I want code with training wheels, I'll use a language other than C.

I'm not aware of any language suitable for writing such code.

Dan
 

Albert van der Horst

Keith Thompson said:
One approach is to avoid using unsigned types except where absolutely
necessary. Another is to use them whenever they seem appropriate
(e.g., using size_t for object sizes and array lengths) and watch out
for the pitfalls. I'm not sure which approach is better in general.

I agree with Dan here. Unsigneds are not "appropriate" for array
lengths just because a negative array length happens not to make much
sense. They are inappropriate because they are used as cardinals
(how many) and ordinals (where). Or in plain English, they are
used to count. That is what integers are for.

I would add that it is most unfortunate that many times
in the 16-bit era the difference between counting to 30000
and counting to 60000 made a huge difference in the usability
of a program. The difference between being able to handle
2G and 4G is much less, so I hope programmers get away from the
habit of squeezing that little extra range out of a number, just
because negative values seem to make no sense.

In my opinion making size_t an unsigned type was a design error.
Intuitively, signed types act more like mathematical integers than
unsigned types do. When programming with signed types, you're more

Or better: unsigned integers are not integers at all. They are
mathematical objects sometimes best considered as modulo integers,
sometimes as bitmaps.

It is quite practical to keep them rigorously separate:
signed : use + - * / < > <= >= || &&
unsigned : use | & ^ << >>

Mix : you are not allowed.
Ask the senior programmer to write a function for you to call
and add it to mix_signed_unsigned.c.

A normal int is the mathematical integer, with a range restriction.
likely to be able to get away with thinking of them as mathematical
integers. Things break down when you reach the limits of the type.
For signed types, the limits are likely to be beyond any values you're
going to deal with (but overflow is still an issue); for unsigned
types, the lower limit is 0, and you're very likely to run into it.

"... not likely to be beyond ..."? What statistics are you using here?
60,000 points is not an unlikely resolution for graphics today.
Now calculate a Cartesian distance.
And what was the US deficit again?
Ranges are *always* a concern.

Associating unsigned with a lower limit of zero ...
There are languages where you can specify a range restriction,
like Ada. This makes sense because the compiler helps you in
checking those ranges.
Unsigned is not a "range". It is: "it so happens that I am not
able to represent negative numbers."
POSIX provides a ssize_t typedef, the signed equivalent of size_t.
ISO C does not. (ptrdiff_t is likely to be the same size as size_t,
but using it for anything other than pointer subtraction would be
ugly.)

I didn't know that. I'm going to use it instead of size_t where
possible. Still, sizeof returns an unsigned, or not?
Thanks.

Groetjes Albert
 

Lawrence Kirby

On Tue, 21 Dec 2004 13:23:04 +0000, Albert van der Horst wrote:

[...]
A normal int is the mathematical integer, with a range restriction.

It would be equally correct to say an unsigned int is a mathematical
integer with a (different) range restriction. In both cases wrong things
happen from the point of view of the mathematical model if you attempt to
exceed the range of each type. The nature of the "wrong thing" doesn't
really matter, the program must simply avoid it happening at all.

There are many quantities that can't be negative and therefore could
naturally be represented using unsigned types. Most counts, for example:
"the number of people who live in my town/city". The main problem
seems to be a need to represent a value just outside the "valid" range,
which makes things like indexing easier.
"... not likely to be beyond ..."? What statistics are you using here?
60,000 points is not an unlikely resolution for graphics today. Now
calculate a Cartesian distance.
And what was the US deficit again?
Ranges are *always* a concern.

Associating unsigned with a lower limit of zero ... There are languages
where you can specify a range restriction, like Ada. This makes sense
because the compiler helps you in checking those ranges. Unsigned is not
a "range". It is: "it so happens that I am not able to represent
negative numbers."

Unsigned integer types have ranges that do not include negative numbers.
I didn't know that. I'm going to use it instead of size_t where
possible. Still, sizeof returns an unsigned, or not? Thanks.

Yes, the result of sizeof has type size_t which is some unsigned integer
type.
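
A one-line demonstration of that unsignedness (a minimal sketch):

#include <stdio.h>

int main(void)
{
    /* sizeof yields a size_t, an unsigned type, so 1 - 2 wraps
       to a huge positive value instead of becoming -1. */
    printf("sizeof(char) - 2 = %zu\n", sizeof(char) - 2);
    return 0;
}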

Lawrence
 

Albert van der Horst

Lawrence Kirby said:
It would be equally correct to say an unsigned int is a mathematical
integer with a (different) range restriction. In both cases wrong things
happen from the point of view of the mathematical model if you attempt
to exceed the range of each type.

When you instruct a newbie, is that your starting point to explain
what an unsigned int is?
IMO it is very bad to associate an unsigned int
with the mathematical integers. You probably wouldn't say that
if you were raised on Algol 68, where the similar concept is called
`bytes'. Conceptually, you are driving yourself in the wrong direction.
(And formally, you are plain wrong. An unsigned int is a modular
integer, as far as mathematics goes. If you insist on using the word
"range", at least use "circular range".)
The concept "range" implies, or at least suggests, that
"when you keep adding one, you hit a boundary", be it a run-time
error or undefined behaviour. As we all know, this is not true for
unsigneds.

"Equally valid" would be saying:

signed int corresponds to Ada type : INTEGER range -0x8000 .. 0x7fff
unsigned int corresponds to Ada type : INTEGER range 0 .. 0xffff

There is so much more to unsigned than range checking that this is
wrong at worst, counterproductive at best.
It is not tacking different range checks onto the same type,
if range checking were the habit at all in C.

(The Ada code is probably not valid. It is used for illustration only.
If it helps to understand, fine. Otherwise please ignore.)

Groetjes Albert

--
 

Lawrence Kirby

Albert van der Horst said:
When you instruct a newbie, is that your starting point to explain
what an unsigned int is?

Perhaps less formally, but it wouldn't be a bad way to start. I doubt
whether I'd start by wading into the modular properties of unsigned types.
IMO it is very bad to associate an unsigned int
with the mathematical integers.

There is a very natural association between unsigned integers and
counting numbers, perhaps even more so than signed integers.
You probably wouldn't say that
if you were raised on Algol 68, where the similar concept is called
`bytes'. Conceptually, you are driving yourself in the wrong direction.

Maybe a different direction from Algol 68, but not wrong.
(And formally, you are plain wrong. An unsigned int is a modular
integer, as far as mathematics goes. If you insist on using the word
"range", at least use "circular range".)

This thread was discussing a case where that is irrelevant, i.e. where
anything that deviates from the behaviour of non-finite integer arithmetic
is an error. For unsigned integer types the error manifests in modular
reduction; for signed integer types it manifests in undefined behaviour.
That doesn't make signed arithmetic more like non-finite arithmetic than
unsigned arithmetic, it just means that unsigned arithmetic can also be
applied in areas not relevant to the problem in hand.
The concept "range" implies, or at least suggests, that
"when you keep adding one, you hit a boundary".

The range of a type specifies what values that type can represent. It
simply means that it can't represent values outside that range. 0 to
UINT_MAX is a perfectly good range specification for integers. At least
the standard thinks so. :)
Be it a run-time
error or undefined behaviour. As we all know, this is not true for
unsigneds.

And what if that undefined behaviour happens to end up as "wraps around to
the most negative value the type can represent"? Undefined behaviour
simply isn't a boundary in the sense that you would like it to be. It
isn't a wall, it is a fluffy cloud full of mirrors and razorblades.
"Equally valid" would be saying:

signed int corresponds to Ada type : INTEGER range -0x8000 .. 0x7fff
unsigned int corresponds to Ada type : INTEGER range 0 .. 0xffff

There is so much more to unsigned than range checking that this is
wrong at worst, counterproductive at best.

I have never claimed that a range defines the full nature of unsigned
integer types. What I claim is that unsigned integers can reasonably and
correctly be used for arithmetic operations that are limited to the range
they can represent, as is the case for signed integer types. Of course
they have other uses too. C's like that: for example, ints can be and are
used to hold character values (and char is just an integer type), where
various other languages use a separate non-integer type for this purpose.
It is not tacking different range checks onto the same type, if range
checking were the habit at all in C.

There is simply no range checking built into the C language. So when you
use integers for arithmetic operations it is up to you to make sure that
all results are in the range that the type supports, be it signed or
unsigned.
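
One way to honour that obligation for unsigned arithmetic is an
explicit guard before the operation; a minimal sketch (the function
name is invented):

#include <stddef.h>

/* Returns 1 and stores a + b in *sum if the result fits in size_t,
   0 if the addition would be reduced modulo SIZE_MAX + 1. */
int add_sizes_checked(size_t a, size_t b, size_t *sum)
{
    if (a > (size_t)-1 - b)   /* (size_t)-1 is the maximum size_t value */
        return 0;
    *sum = a + b;
    return 1;
}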

Lawrence
 
