When/why to use size_t

Alex Vinokur · May 23, 2006

Why was the size_t type defined in compilers in addition to unsigned
int/long?
When/why should one use size_t?

Alex Vinokur
email: alex DOT vinokur AT gmail DOT com
http://mathforum.org/library/view/10978.html
http://sourceforge.net/users/alexvn

santosh · May 23, 2006

Alex said:
Why was the size_t type defined in compilers in addition to unsigned
int/long?
When/why should one use size_t?

size_t was defined as a type to hold the size of an object in C. It is
capable of representing the size of the largest possible object
supported by a C implementation. The sizeof operator yields a value of
this type. You can use size_t to hold the size of objects you declare
and manipulate in your programs. It's slightly more portable and
clearer than using unsigned or unsigned long. But different programmers
have their own preferences in this regard.
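
A minimal sketch of the point above, assuming a made-up struct: sizeof yields a size_t, so a size_t can hold the result without any conversion. The names `struct record` and `record_array_bytes` are hypothetical and do not come from the thread.

```c
#include <stddef.h>  /* size_t */

/* Hypothetical record type, used only for illustration. */
struct record {
    int id;
    double value;
};

/* Total bytes occupied by an array of n records.
   sizeof yields a size_t, so the arithmetic stays in one type.
   Overflow of the multiplication is not checked in this sketch. */
size_t record_array_bytes(size_t n)
{
    return n * sizeof(struct record);
}
```

Calling record_array_bytes(3) gives the same value as writing 3 * sizeof(struct record) directly.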

pete · May 23, 2006

Alex said:
Why was the size_t type defined in compilers in addition to unsigned
int/long?

In the case where unsigned is big enough to do the job
and also a faster type than long unsigned,
size_t should be unsigned.
In a case where unsigned isn't big enough,
then size_t should be long unsigned, (C89).

When/why should one use size_t?

When counting bytes or elements of an array,
or when interfacing with the return values
of the sizeof operator or functions that return size_t.
I like to match data types in a C program
whenever it's not too difficult.
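
As a rough illustration of matching types (the function name `count_longer_than` is an assumption made for this sketch), the element count, the index, and the value returned by strlen can all be size_t, so no comparison mixes signed and unsigned operands:

```c
#include <stddef.h>
#include <string.h>

/* Count how many of the n strings in words[] are longer than min_len.
   The element count, the index, and the strlen() result are all
   size_t, so no comparison mixes signed and unsigned types. */
size_t count_longer_than(const char *const *words, size_t n, size_t min_len)
{
    size_t i;
    size_t count = 0;

    for (i = 0; i < n; i++) {
        if (strlen(words[i]) > min_len) {
            ++count;
        }
    }
    return count;
}
```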

Richard Heathfield · May 23, 2006

Alex Vinokur said:

Why was the size_t type defined in compilers in addition to unsigned
int/long?

Because size_t has a different purpose to unsigned int and unsigned long.
Its primary purpose is to be able to contain a value equal to the size of
any object, although it can certainly be used for other things.

When/why should one use size_t?

When one is storing either the size of an object, or a count of objects. To
illustrate this, just look at some of the standard library functions that
use size_t:

int setvbuf(FILE *stream, char *buf, int mode, size_t size);
size_t fread(void *ptr, size_t size, size_t nmemb,
             FILE *stream);
size_t fwrite(const void *ptr, size_t size, size_t nmemb,
              FILE *stream);
void *calloc(size_t nmemb, size_t size);
void *malloc(size_t size);
void *realloc(void *ptr, size_t size);
void *bsearch(const void *key, const void *base,
              size_t nmemb, size_t size,
              int (*compar)(const void *, const void *));
void qsort(void *base, size_t nmemb, size_t size,
           int (*compar)(const void *, const void *));
int mblen(const char *s, size_t n);
int mbtowc(wchar_t *pwc, const char *s, size_t n);
size_t mbstowcs(wchar_t *pwcs, const char *s, size_t n);
size_t wcstombs(char *s, const wchar_t *pwcs, size_t n);
void *memcpy(void *s1, const void *s2, size_t n);
void *memmove(void *s1, const void *s2, size_t n);
char *strncpy(char *s1, const char *s2, size_t n);
char *strncat(char *s1, const char *s2, size_t n);
int memcmp(const void *s1, const void *s2, size_t n);
int strncmp(const char *s1, const char *s2, size_t n);
size_t strxfrm(char *s1, const char *s2, size_t n);
void *memchr(const void *s, int c, size_t n);
size_t strcspn(const char *s1, const char *s2);
size_t strspn(const char *s1, const char *s2);
void *memset(void *s, int c, size_t n);
size_t strlen(const char *s);
size_t strftime(char *s, size_t maxsize,
                const char *format, const struct tm *timeptr);
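
A small sketch of how three of the functions listed above fit together when the intermediate value is kept in a size_t. The helper name `copy_string` is hypothetical; only strlen, malloc, and memcpy come from the standard library.

```c
#include <stdlib.h>
#include <string.h>

/* Return a malloc'd copy of the string s, or NULL if allocation
   fails. The caller is responsible for calling free() on the result. */
char *copy_string(const char *s)
{
    size_t len = strlen(s);        /* strlen returns size_t */
    char *copy = malloc(len + 1);  /* malloc takes a size_t */

    if (copy != NULL) {
        memcpy(copy, s, len + 1);  /* memcpy takes a size_t; +1 copies the '\0' */
    }
    return copy;
}
```

Because every size in the chain has the same type, nothing has to be cast or range-checked between the calls.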

pete · May 23, 2006

Richard said:
Alex Vinokur said:

When one is storing either the size of an object,
or a count of objects.

I use long unsigned to count the nodes in a list.

I don't think that the maximum number of objects
that can be allocated by malloc,
is related to how many bytes can be in an object.

Richard Heathfield · May 23, 2006

pete said:

I use long unsigned to count the nodes in a list.

I don't think that the maximum number of objects
that can be allocated by malloc,
is related to how many bytes can be in an object.

That is a fair point, but bear in mind that several standard functions do
take or return size_t for an object count - examples include calloc, fread,
fwrite, strspn, and strlen.

pete · May 23, 2006

Richard said:
pete said:

That is a fair point,
but bear in mind that several standard functions do
take or return size_t for an object count
- examples include calloc, fread, fwrite, strspn, and strlen.

I use it like this:

static long unsigned node_count(list_type *head)
{
    long unsigned count;

    for (count = 0; head != NULL; head = head -> next) {
        ++count;
    }
    return count;
}

Ben Pfaff · May 23, 2006

Richard Heathfield said:
Alex Vinokur said:

When one is storing either the size of an object, or a count of
objects.

It may be worth clarifying that in particular it's appropriate
for storing a count of objects *in memory at any given time*.
When objects are stored on disk, then using a different type may
be worthwhile, because there are, quite possibly, orders of
magnitude more disk space than memory. Similarly, when counting
objects that are not necessarily in memory at one time, e.g. when
objects may be created and destroyed dynamically, then there may
be more objects total over time than can fit in size_t and it may
be reasonable to consider using a different type.
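
One way to picture this distinction, as a sketch only: keep the number of objects currently in memory in a size_t, and keep the running total of objects ever created in a separate type. The struct and function names below are hypothetical, and uintmax_t (C99) is just one possible choice; it is at least as wide as size_t but not guaranteed to be wider.

```c
#include <stddef.h>
#include <stdint.h>  /* uintmax_t (C99) */

/* Hypothetical bookkeeping for dynamically created objects. */
struct object_stats {
    size_t live;              /* objects in memory right now */
    uintmax_t total_created;  /* objects ever created, over time */
};

/* Record that one object was created. */
void stats_on_create(struct object_stats *s)
{
    ++s->live;
    ++s->total_created;
}

/* Record that one object was destroyed. */
void stats_on_destroy(struct object_stats *s)
{
    if (s->live > 0) {
        --s->live;
    }
}
```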

Malcolm · May 23, 2006

Alex Vinokur said:
Why was the size_t type defined in compilers in addition to unsigned
int/long?
When/why should one use size_t?

size_t is an uglification of the C language.
A bit like const or noalias, it seems sensible at first sight, but the
implications weren't thought through.

OK. Someone might want to allocate more memory than will fit in an int. So
make malloc() take a size_t.
Note that if an int is, as is the intention, the natural size for an integer
on that platform, one needs to ask how an amount of memory can fail to fit
in a register. But pass that by.

So now a string can be size_t bytes long. So all the string functions need
to take size_t instead of integers, and return them.

Then it gets worse. Any array could have been allocated with malloc(). So
now all your array indices are size_t. So all the counts of array sizes are
size_t as well. So anything that is the result of an operation of a count of
things in the computer is a size_t. So integers virtually disappear from
your code. size_t has run through it.

But size_t's are unsigned. This introduces subtle problems into code.

For instance if we count down a loop
while(N-- > 0)
suddenly the code breaks, because N is of course a count of something, and so
a size_t.

Also, what if we want to subtract two size_ts, for instance in computing x y
coordinates for graphics?
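
To make the subtraction concern concrete, here is a hedged sketch with two hypothetical helpers. Subtracting a larger size_t from a smaller one does not give a negative number; the result wraps around to a very large unsigned value. Ordering the operands first is one possible way to avoid that.

```c
#include <stddef.h>

/* Plain subtraction: when a < b the result wraps around to a very
   large value, because size_t is an unsigned type. */
size_t wrapped_difference(size_t a, size_t b)
{
    return a - b;
}

/* Distance between two values, with the operands ordered first so
   that the subtraction never wraps. */
size_t size_distance(size_t a, size_t b)
{
    return a > b ? a - b : b - a;
}
```

Here wrapped_difference(3, 5) equals (size_t)-2, while size_distance(3, 5) equals 2.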

It also becomes harder to validate parameters. Image dimensions cannot be
negative. Therefore

void myimagefunction(unsigned char *rgb, int width, int height)
{
    assert(width >= 0);
    assert(height >= 0);
}

If we are passed random garbage to this function there is a 75% chance
of the asserts triggering. If the function is called more than a few
times with random garbage, the chance of an assert triggering is
effectively certain.
However width and height are indices, so they've got to be size_t's, since
the image buffer is allocated with malloc().

size_t is a terrible idea that has no place in C code.

CBFalconer · May 23, 2006

pete said:
.... snip ...

I use it like this:

static long unsigned node_count(list_type *head)
{
    long unsigned count;

    for (count = 0; head != NULL; head = head -> next) {
        ++count;
    }
    return count;
}

Which I would precede with a warning comment about execution time
on circular lists.
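
The reason for the warning is that on a circular list head never becomes NULL, so the counting loop never ends. As a sketch that goes beyond a comment, a caller could check for a loop first with Floyd's two-pointer method. This is a technique swapped in for illustration, not something proposed in the thread, and the node layout below is an assumption because the thread never shows the definition of list_type.

```c
#include <stddef.h>

/* Hypothetical node type; the real list_type is not shown in the thread. */
typedef struct list_node {
    struct list_node *next;
} list_type;

/* Returns 1 if following the next pointers from head never reaches
   NULL (the list loops back on itself), 0 otherwise.
   Floyd's method: one pointer moves one step at a time, the other two. */
int list_is_circular(const list_type *head)
{
    const list_type *slow = head;
    const list_type *fast = head;

    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
        if (slow == fast) {
            return 1;
        }
    }
    return 0;
}
```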


Richard Heathfield · May 23, 2006

Malcolm said:

size_t is an uglification of the C language.

I disagree. I think it's a useful abstraction that has an important place in
code. The only thing that's a bit annoying is the need to cast when
printing the darn things - a problem that will go away in about 50 or 60
years, when C99 becomes widespread.
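
For reference, a sketch of the two printing styles being contrasted. The function names are hypothetical. In C89 there is no conversion specifier for size_t, so the usual approach is to cast to unsigned long and use %lu; C99 adds the z length modifier.

```c
#include <stddef.h>
#include <stdio.h>

/* C89 style: cast to unsigned long and use %lu. A value above
   ULONG_MAX would be truncated on an implementation where size_t
   is wider than unsigned long. buf must be large enough. */
int format_size_c89(char *buf, size_t n)
{
    return sprintf(buf, "%lu", (unsigned long)n);
}

/* C99 style: the z length modifier matches size_t, so no cast. */
int format_size_c99(char *buf, size_t bufsize, size_t n)
{
    return snprintf(buf, bufsize, "%zu", n);
}
```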

So anything that is the result of an operation of a count
of things in the computer is a size_t. So integers virtually disappear
from your code. size_t has run through it.

Personally, I don't see this as a huge problem, or indeed necessarily true.
I am a great fan of size_t, and use it all over the place. And yet, if I
pop into my base development directory on this 'ere machine and do this:

find -name \*.c | xargs grep -w int | wc -l

I find 2565 hits, and another 783 in the headers. To give you something to
compare those numbers with, the figures for size_t are 1229 and 578
respectively.

But size_t's are unsigned. This introduces subtle problems into code.

For instance if we count down a loop
while(N-- > 0)
suddenly the code breaks, because N is of course a count of something, and
so a size_t.

Why is the code broken?

#include <stdio.h>

int main(void)
{
    int foo[] = { 0, 1, 2, 3, 4 };
    size_t N = sizeof foo / sizeof foo[0];
    while(N-- > 0)
    {
        printf(" %d", foo[N]);
    }
    putchar('\n');

    return 0;
}

Output:

4 3 2 1 0

which is exactly what is expected.

It's true that, after the loop ends, N has the value (size_t)-1, but so
what? It was only a loop counter.
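
The same behaviour can be checked in a function form. This is a sketch with a hypothetical name, `sum_backwards`: it walks the array with the while (n-- > 0) idiom and hands back the counter's value after the loop.

```c
#include <stddef.h>

/* Sum the n elements of a[] from last to first using the
   while (n-- > 0) idiom, and report the counter's final value
   through final_n. */
long sum_backwards(const int *a, size_t n, size_t *final_n)
{
    long sum = 0;

    while (n-- > 0) {
        sum += a[n];  /* n was already decremented, so this is a valid index */
    }
    *final_n = n;     /* the last n-- wrapped 0 around to (size_t)-1 */
    return sum;
}
```

Every index used inside the loop is in range; only the final, unused value of the counter has wrapped.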

Also, what if we want to subtract two size_ts, for instance in computing x
y coordinates for graphics?

Well, in graphics we often want to do rotation and scaling and stuff, and
doing these calcs using integer arithmetic will introduce unacceptable
inaccuracies, so I generally use floating-point for all operations in the
"universe", and then just convert to int at the end for working out an
exact dumping ground for a particular colour. I don't see co-ordinates as
being isomorphic to a number of objects.

It also becomes harder to validate parameters. Image dimensions cannot be
negative.

Just out of curiosity, what meaning would you ascribe to negative image
dimensions? Negative width and height don't make much sense to me.

Therefore

void myimagefunction(unsigned char *rgb, int width, int height)
{
    assert(width >= 0);
    assert(height >= 0);
}

If we are passed random garbage to this function there is a 75% chance
of the asserts triggering.

Solution: don't pass random garbage to your functions.

size_t is a terrible idea that has no place in C code.

size_t is a great idea that has an important place in C code.

Keith Thompson · May 23, 2006

Malcolm said:
size_t is an uglification of the C language.
A bit like const or noalias, it seems sensible at first sight, but the
implications weren't thought through.

OK. Someone might want to allocate more memory than will fit in an int. So
make malloc() take a size_t.
Note that if an int is, as is the intention, the natural size for an integer
on that platform, one needs to ask how an amount of memory can fail to fit
in a register. But pass that by.

No, let's not pass that by.

There's no requirement for an int to be the size of a register. On a
modern 64-bit system, there's a very good reason for it not to be.
Making int 32 bits allows:
char = 8 bits
short = 16 bits
int = 32 bits
long = 64 bits

If you want to support 8-bit, 16-bit, and 32-bit integers (without
resorting to C99-style extended integer types), you have no choice but
to make int 32 bits.

Using int to represent sizes would make it impossible to have an
object bigger than 2 gigabytes (4 gigabytes if you use unsigned int).
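
A small sketch of that limit, using a hypothetical helper `fits_in_int`. It assumes size_t is at least as wide as int, which holds on mainstream implementations but is not spelled out by the standard. On a system with 32-bit int, INT_MAX is 2147483647, so any byte count of 2 gigabytes or more fails the check.

```c
#include <limits.h>  /* INT_MAX */
#include <stddef.h>  /* size_t */

/* Returns 1 if the byte count n can be represented as an int,
   0 otherwise. Assumes size_t is at least as wide as int. */
int fits_in_int(size_t n)
{
    return n <= (size_t)INT_MAX;
}
```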

So now a string can be size_t bytes long. So all the string functions need
to take size_t instead of integers, and return them.

Then it gets worse. Any array could have been allocated with malloc(). So
now all your array indices are size_t. So all the counts of array sizes are
size_t as well. So anything that is the result of an operation of a count of
things in the computer is a size_t. So integers virtually disappear from
your code. size_t has run through it.

Good.

But size_t's are unsigned. This introduces subtle problems into code.

For instance if we count down a loop
while(N-- > 0)
suddenly the code breaks, because N is of course a count of something, and so
a size_t.

Yes, using unsigned types requires some care. Perhaps standardizing
ssize_t, the signed type corresponding to size_t, would have been
helpful.

[snip]

size_t is a terrible idea that has no place in C code.

I disagree.

Joe Smith · May 23, 2006

Malcolm said:
size_t is an uglification of the C language.

This is an interesting perspective. Is size_t ugly in and of itself, or in
the hands of the hackmeisters?

A bit like const or noalias, it seems sensible at first sight, but the
implications weren't thought through.

This would surprise me.

[makes his case]

size_t is a terrible idea that has no place in C code.

No place whatsoever? Mr. Heathfield lists about twenty places where it
would seem necessary. Is he wrong? Does he mention things that a
reasonable person would never use? I've never used size_t, as I have not
used many of the features of the C language. When I step up to the
compiler, I know darn well the size of the things I'm trying to work with.
But what if, instead of a cowboy on a keyboard, you had to develop with
differing teams and modules. Furthermore, the boss can't even tell you much
about the object until the last phase. I think right about then I'd be
reaching for size_t and a Long Island Iced Tea, but maybe in reverse order.
joe
