What's the deal with size_t?

Tubular Technician · Nov 6, 2007

Hello, World!

Reading this group for some time I came to the conclusion that
people here are split into several factions regarding size_t,
including, but not limited to,

* size_t is the right thing to use for every var that holds the
number of or size in bytes of things.

* size_t should only be used when dealing with library functions.

* size_t should really be a signed type (less warnings)

* size_t is unnecessary (size of object in memory never exceeds
what can be held in an integer).

* size_t is visually unpleasant.

* size_t clutters up / is an uglification of the language
(solving only a theoretical problem).

* size_t usage may be non-portable because it won't be around
anymore in 100 years.

Sooo... what's the real deal with size_t? Where should it be
used/avoided (examples?)

Ben Pfaff · Nov 6, 2007

Tubular Technician said:
* size_t is the right thing to use for every var that holds the
number of or size in bytes of things.

Only the size of an object held in memory, or a number or size
that can be no greater than the maximum size of an object held in
memory. Thus, size_t is not appropriate for holding the size of
disk file because a disk file can be larger than memory (use
off_t instead).
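
A minimal sketch of that distinction, assuming a C99 compiler. off_t is POSIX rather than standard C, so the sketch uses uintmax_t as a stand-in for a file size; fits_in_size_t is a hypothetical helper name, not a library function.

```c
#include <stdint.h>   /* SIZE_MAX, uintmax_t */

/* Hypothetical helper: returns 1 if a file size (held in a wide
   unsigned type) can also be represented as a size_t, 0 otherwise. */
int fits_in_size_t(uintmax_t file_size)
{
    return file_size <= SIZE_MAX;
}
```

A program might make this check before trying to read a whole file into a single malloc'd buffer.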

Jeffrey Stedfast · Nov 6, 2007

Tubular said:
Hello, World!

Reading this group for some time I came to the conclusion that
people here are split into several factions regarding size_t,
including, but not limited to,

* size_t is the right thing to use for every var that holds the
number of or size in bytes of things.

As far as I'm aware, the C library I/O functions (as well as many
functions that take "number of items" arguments) use size_t, so it is best
to match types.

* size_t should only be used when dealing with library functions.

That's a lot of library functions ;-)

* size_t should really be a signed type (less warnings)

except that size_t is often used to hold values that may be larger than
a signed integer can hold.

there is an ssize_t which is signed

* size_t is unnecessary (size of object in memory never exceeds
what can be held in an integer).

size_t is often used to refer to sizes of objects on disk, which can
easily exceed what can be held in a signed integer.

Max size_t is 4GB on 32bit architectures, I'm not entirely sure if that
changes on 64bit architectures, but Linux file sizes can range up to at
least 4GB and I believe that Linux also supports at least 4GB of memory,
so using a signed integer is not sufficient if you want your software to
be portable.

* size_t is visually unpleasant.

that's in the eye of the beholder and isn't a technical reason to
use/not use size_t

* size_t clutters up / is an uglification of the language
(solving only a theoretical problem).

Again, not a very technical argument.

* size_t usage may be non-portable because it won't be around
anymore in 100 years.

I won't be around in 100 years, so I'm not terribly worried whether it
will or won't be around by then.

I suspect, however, that size_t will not be going away.

Sooo... what's the real deal with size_t? Where should it be
used/avoided (examples?)

It should be used whenever you call a library function that
uses it and, likely, whenever you want to express the size of an object
in bytes.

Jeff

Richard Heathfield · Nov 6, 2007

Tubular Technician said:

Sooo... what's the real deal with size_t? Where should it be
used/avoided (examples?)

Observe where, how, and why the standard library uses size_t. Then go thou
and do likewise. The standard library is very often (although by no means
always) a reasonable guide to good practice in your own code.

A great many standard library functions use size_t, not only to specify the
size of an object, but also to enumerate objects.

Peter Nilsson · Nov 6, 2007

Tubular Technician said:
Hello, World!

Reading this group for some time I came to the conclusion
that people here are split into several factions regarding
size_t, including, but not limited to,

* size_t is the right thing to use for every var that holds
the number of or size in bytes of things.

For memory objects, it's hard to escape that fact.

* size_t should only be used when dealing with library
functions.

Sounds more like a phobia than sense.

* size_t should really be a signed type (less warnings)

How big is -1 bytes?

People get warnings for 2 reasons: bad code and mother hen
clucking compilers. Both are easy to rectify. The latter
is easy to ignore.
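
The warnings in question usually come from mixing signed and unsigned operands. A small sketch, assuming size_t has at least the rank of int (true on common platforms, though not guaranteed); both function names are made up for illustration.

```c
#include <stddef.h>

/* What the usual arithmetic conversions do to `i < n` under the stated
   assumption: the int is converted to size_t first, so a negative i
   becomes a very large value. */
int converted_less(int i, size_t n)
{
    return (size_t)i < n;
}

/* A comparison that treats negative values as smaller than any size. */
int safe_less(int i, size_t n)
{
    return i < 0 || (size_t)i < n;
}
```

With these, converted_less(-1, 1) is false while safe_less(-1, 1) is true, which is the mismatch a sign-compare warning is pointing at.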

* size_t is unnecessary (size of object in memory never
exceeds what can be held in an integer).

False for a number of 16-bit int systems that can address
more than 64K of memory. As 64-bit systems accessing more
than 4G of RAM become prevalent, you'll see more
implementations in the same boat, because many will
keep int at 32 bits for backwards compatibility with
programs written by programmers who failed to learn from
their mistakes when 32-bit systems took over 16-bit ones.

* size_t is visually unpleasant.

C is visually unpleasant.

[But size_t is a lot more appealing to me than FILE!]

* size_t clutters up / is an uglification of the language
(solving only a theoretical problem).

No, that's ptrdiff_t.

* size_t usage may be non-portable because it won't be
around anymore in 100 years.

If size_t won't be around, then it'll be because C won't
be around.

Sooo... what's the real deal with size_t? Where should
it be used/avoided (examples?)

Use it where you need to use it. Typical usage is as a size,
count or index of memory objects.

Fact is, it doesn't really matter whether you use size_t
or int. You still have to be conscious of memory limitations
and going beyond the limits of whatever integer you're using.

It's simpler to use size_t in most cases because detecting
wrap around is easy and, unlike int, it _is_ guaranteed to
be able to index any object allocated through normal means.
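
One way the wrap-around check might look, as a sketch only. Unsigned arithmetic is reduced modulo SIZE_MAX + 1, so a wrapped sum is smaller than either operand; checked_add and checked_mul are hypothetical names, and SIZE_MAX assumes C99's <stdint.h>.

```c
#include <stddef.h>
#include <stdint.h>   /* SIZE_MAX */

/* Stores a + b in *out and returns 1, or returns 0 if the sum would wrap. */
int checked_add(size_t a, size_t b, size_t *out)
{
    size_t sum = a + b;   /* wraps modulo SIZE_MAX + 1 */
    if (sum < a)          /* a wrapped sum is smaller than its operands */
        return 0;
    *out = sum;
    return 1;
}

/* Stores a * b in *out and returns 1, or returns 0 if the product would wrap. */
int checked_mul(size_t a, size_t b, size_t *out)
{
    if (a != 0 && b > SIZE_MAX / a)   /* test before multiplying */
        return 0;
    *out = a * b;
    return 1;
}
```

The same test on int would need care, since signed overflow is undefined rather than wrapping.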

The only problem with size_t, AFAICS, is that it is not
required to have a rank of unsigned int or above.

cr88192 · Nov 6, 2007

Richard Heathfield said:
Tubular Technician said:

Observe where, how, and why the standard library uses size_t. Then go thou
and do likewise. The standard library is very often (although by no means
always) a reasonable guide to good practice in your own code.

Just be careful about adopting their naming conventions; otherwise you may
run into clashes...

my naming conventions often go like this:
<lib>_<name> smaller libs or externally usable names.
<part>_<name> rarer, usually for older code or if I may at some
point split the code into another lib
<lib>_<part>_<name> a general convention.

in the above, lib is usually all caps, part is often mixed case
(FirstLettersAreCaps), and name is often similar to part.

another convention used in some of my libs (for front-end API functions in
cases where I 'formally' specify the external API, rather than just making a
lib and just using whatever code the lib contains):
<prefix><name>

where prefix is all lower case (usually the lower-case equivalent of lib),
and name is as before (though, in a few cases, I have used all lower case
names, but this has mostly been for 'core' functions intended almost as
extensions or alternatives to the standard library).

usually, only a few front-end functions use these interfaces, with nearly
everything else (internal to the lib) using the previous conventions.

my rules are far more lax for front-end code, but usually this is only a
minor part of my projects.

A great many standard library functions use size_t, not only to specify the
size of an object, but also to enumerate objects.

yes.

size_t is good, albeit in the past I have traditionally not used it much...

Richard Bos · Nov 6, 2007

Jeffrey Stedfast said:
Tubular Technician wrote:

Hardly. There are most people, who vary from option one to option two,
and then there's Malcolm, who is scared of underlines.

except that size_t is often used to hold values that may be larger than
a signed integer can hold.

there is an ssize_t which is signed

Not in C, there isn't. Maybe in C++.

size_t is often used to refer to sizes of objects on disk,

More likely in memory...

which can easily exceed what can be held in a signed integer.

...for which this is just as true. For the sizes of disk objects, size_t
may not even be enough.

Max size_t is 4GB on 32bit architectures,

Nonsense. size_t may be any unsigned integer type; what's there to stop
an implementation on a "32bit architecture", whatever that means in this
case (and it could mean several things), from making a size_t 64 bits?
After all, if it's a C99 implementation, it already _has_ to make long
long at least that large.
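
What the standard does pin down can be checked directly. A sketch assuming C99's <stdint.h>; the function names are invented here. C99 requires only that SIZE_MAX be at least 65535, so 16 value bits is the floor and anything above that is the implementation's choice.

```c
#include <stddef.h>
#include <stdint.h>   /* SIZE_MAX */

/* size_t is unsigned, so converting -1 gives its largest value. */
int size_t_is_unsigned(void)
{
    return (size_t)-1 > 0;
}

/* Counts the value bits of size_t by shifting SIZE_MAX down to zero. */
unsigned size_t_value_bits(void)
{
    size_t max = SIZE_MAX;
    unsigned bits = 0;
    while (max != 0) {
        bits++;
        max >>= 1;
    }
    return bits;
}
```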

It should be used whenever you call a library function that
uses it and, likely, whenever you want to express the size of an object
in bytes.

Quite. And it should be avoided if you're a scaredycat who doesn't want
to face up to any other integer types than int and char.

Richard

Ian Collins · Nov 6, 2007

Richard said:
Not in C, there isn't. Maybe in C++.

I think it's Posix.

santosh · Nov 6, 2007

Hello, World!

Reading this group for some time I came to the conclusion that
people here are split into several factions regarding size_t,
including, but not limited to,

* size_t is the right thing to use for every var that holds the
number of or size in bytes of things.

Yes.

* size_t should only be used when dealing with library functions.

Why? That's daft.

* size_t should really be a signed type (less warnings)

Not worth it.

* size_t is unnecessary (size of object in memory never exceeds
what can be held in an integer).

There is no such requirement. It is provably false.

* size_t is visually unpleasant.

Weakest of all arguments against it.

* size_t usage may be non-portable because it won't be around
anymore in 100 years.

Why is that?

Sooo... what's the real deal with size_t? Where should it be
used/avoided (examples?)

It's appropriate to hold the sizes of objects, arrays and often to hold
indexes as well. If you really need a signed value then it's, of
course, not appropriate. If you don't mind skirting portability there
is POSIX's ssize_t.

Jeffrey Stedfast · Nov 6, 2007

Richard said:
Hardly. There are most people, who vary from option one to option two,
and then there's Malcolm, who is scared of underlines.

Not in C, there isn't. Maybe in C++.

Ah, this could be a GNUism (I wouldn't be surprised)

More likely in memory...

...for which this is just as true. For the sizes of disk objects, size_t
may not even be enough.

true enough, my mistake.

Nonsense. size_t may be any unsigned integer type; what's there to stop
an implementation on a "32bit architecture", whatever that means in this
case (and it could mean several things), from making a size_t 64 bits?

sorry, I meant to say "at least", as in, "Max size_t is at least 4GB on
32bit" because it has to be able to at least hold values that large.

After all, if it's a C99 implementation, it already _has_ to make long
long at least that large.

Agreed.

Quite. And it should be avoided if you're a scaredycat who doesn't want
to face up to any other integer types than int and char.

Richard

Jeff

Eric Sosman · Nov 6, 2007

Tubular Technician wrote On 11/05/07 20:16,:

Hello, World!

Reading this group for some time I came to the conclusion that
people here are split into several factions regarding size_t,
including, but not limited to,

* size_t is the right thing to use for every var that holds the
number of or size in bytes of things.

I've never seen this claim made.

* size_t should only be used when dealing with library functions.

Nor this one.

* size_t should really be a signed type (less warnings)

Nor this one.

* size_t is unnecessary (size of object in memory never exceeds
what can be held in an integer).

This claim has been made, and also refuted with actual
examples of real contemporary machines.

* size_t is visually unpleasant.

Big deal. I'm sure some of the people who post here
are visually unpleasant, too.

* size_t clutters up / is an uglification of the language
(solving only a theoretical problem).

This claim has been made, and made, and made, and made,
and made, by one person who never tires of making it, and
making it, and making it, and making it, and making it. You
have probably awakened him, and now we'll get to see him do
it all over again, again, again, again, again. Thanks a lot.

* size_t usage may be non-portable because it won't be around
anymore in 100 years.

Neither will C.

Sooo... what's the real deal with size_t? Where should it be
used/avoided (examples?)

See Richard Heathfield's reply.

Andrey Tarasevich · Nov 6, 2007

Tubular said:
...
* size_t is the right thing to use for every var that holds the
number of or size in bytes of things.

The first part ("number of things") is incorrect. 'size_t' should not be
used for that purpose (with some exceptions, see below). The second part
("size in bytes of things") is correct - that's exactly what 'size_t' is
there to be used for.

* size_t should only be used when dealing with library functions.

No, if by "library functions" you mean "standard library functions". See
above: 'size_t' should be used whenever there's a need to store or pass
a generic "size in bytes of things".

* size_t should really be a signed type (less warnings)

False.

* size_t is unnecessary (size of object in memory never exceeds
what can be held in an integer).

What "integer"? '[unsigned] int'? That's false. Note, BTW, that 'size_t'
is an "integer" itself.

* size_t is visually unpleasant.

It is consistent with the other legacy naming conventions used in
C89/90. Consistency is immeasurably more important than visual
pleasantry. Moreover, consistency begets visual pleasantry, once you get
used to it.

* size_t clutters up / is an uglification of the language
(solving only a theoretical problem).

False.

* size_t usage may be non-portable because it won't be around
anymore in 100 years.

Hm... I just don't know what to say about it. Doesn't make much sense.

Sooo... what's the real deal with size_t? Where should it be
used/avoided (examples?)

As it says above, 'size_t' should be used whenever there's a need to
store or pass a generic "size in bytes of things". I.e it should be used
for absolute sizes of objects in bytes (see 'malloc' for example).

Note, that it is also limitedly acceptable to use 'size_t' to express
array sizes and indexes (i.e. element counts), but only in contexts that
are specifically targeted at _arrays_ (as opposed to _containers_ in
general) and _generic_ arrays at that (as opposed to
_application-specific_ arrays, which should use application-specific
types for that purpose) (see 'calloc' for example or string manipulation
functions).
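
To make the malloc/calloc contrast concrete, here is a rough sketch. malloc takes one size_t byte count, while calloc takes an element count and an element size, both size_t. alloc_array is a hypothetical helper, not a library function, and SIZE_MAX assumes C99.

```c
#include <stdint.h>   /* SIZE_MAX */
#include <stdlib.h>   /* malloc, size_t */

/* Hypothetical helper: allocates room for `count` elements of
   `elem_size` bytes each, or returns NULL if the total byte count
   cannot be represented in a size_t. */
void *alloc_array(size_t count, size_t elem_size)
{
    if (elem_size != 0 && count > SIZE_MAX / elem_size)
        return NULL;                  /* count * elem_size would wrap */
    return malloc(count * elem_size);
}
```

The count and the byte size are both size_t here only because the product has to fit in one; an application-specific count type could be converted at this boundary.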

Keith Thompson · Nov 6, 2007

Eric Sosman said:
Tubular Technician wrote On 11/05/07 20:16,: [...]

* size_t is unnecessary (size of object in memory never exceeds
what can be held in an integer).

This claim has been made, and also refuted with actual
examples of real contemporary machines.

[...]

The claim *as stated* is correct. The size of any object in memory
can never exceed what can be held in an integer (ignoring a nitpicking
controversy about how big an object calloc() can create); size_t is,
after all, an integer type.

Blurring the distinction between "int" and "integer" is one of the
worst errors I see here. There are a number of integer types in C,
ranging from char to long long (and perhaps more if the implementation
provides one or more extended integer types). "int" is just one of
those types (it's also a keyword that can be used in the names of
several other integer types).

(The standard might use the term "integral types" rather than "integer
types"; I'm not sure, it doesn't make much difference to the point,
and my copy of the standard isn't handy at the moment.)

John Bode · Nov 7, 2007

Hello, World!

Reading this group for some time I came to the conclusion that
people here are split into several factions regarding size_t,
including, but not limited to,

* size_t is the right thing to use for every var that holds the
number of or size in bytes of things.

Generally I agree, although if you know your upper bound isn't going
to exceed one of the other integral types, they should work just as
well.

* size_t should only be used when dealing with library functions.

Disagree; I'll use it when it makes sense.

* size_t should really be a signed type (less warnings)

Nonsensical. How many objects in memory take up < 0 bytes?

* size_t is unnecessary (size of object in memory never exceeds
what can be held in an integer).

As others have pointed out, this isn't necessarily true.

* size_t is visually unpleasant.

Just about the entire C language is visually unpleasant.

* size_t clutters up / is an uglification of the language
(solving only a theoretical problem).

The problem is far from theoretical -- depending on the architecture,
the regular integral types may not be wide enough to represent the
size of an object in memory.

* size_t usage may be non-portable because it won't be around
anymore in 100 years.

If anyone is seriously making that argument, then they are in need of
a beating with a copy of Schildt. That's retarded.

Sooo... what's the real deal with size_t? Where should it be
used/avoided (examples?)

I use size_t types anytime I'm dealing with arrays or allocating
memory.

Malcolm McLean · Nov 7, 2007

Andrey Tarasevich said:
As it says above, 'size_t' should be used whenever there's a need to
store or pass a generic "size in bytes of things". I.e it should be used
for absolute sizes of objects in bytes (see 'malloc' for example).

Note, that it is also limitedly acceptable to use 'size_t' to express
array sizes and indexes (i.e. element counts), but only in contexts that
are specifically targeted at _arrays_ (as opposed to _containers_ in
general) and _generic_ arrays at that (as opposed to
_application-specific_ arrays, which should use application-specific
types for that purpose) (see 'calloc' for example or string manipulation
functions).

The problem is that, in C, the array is by far the most common data
structure. So size_t needs to be used whenever the maximum dimensions of the
array cannot be specified by the programmer.
That creates another problem. Frequently we know that the array will be
relatively small, but it doesn't make much sense to specify a limit. For
instance the number of children in a class must, in Britain, be thirty or
less by law. However that law is often breached. But not so flagrantly that
you have classes of hundreds.
So should the class count be an int or a size_t? If we want to pass a class
to qsort() to rank the children by mark, the function will take a size_t.

Charlton Wilbur · Nov 7, 2007

MMcL> The problem is that, in C, the array is by far the most
MMcL> common data structure. So size_t needs to be used whenever
MMcL> the maximum dimensions of the array cannot be specified by
MMcL> the programmer.

The dialect of C I use most often *requires* the maximum dimension of
the array to be specified by the programmer, either by providing a
constant size at compile-time or by calling a memory allocation
function with a size at run-time.

MMcL> Frequently we know that the array will be relatively small,
MMcL> but it doesn't make much sense to specify a limit. For
MMcL> instance the number of children in a class must, in Britain,
MMcL> be thirty or less by law. However that law is often
MMcL> breached. But not so flagrantly that you have classes of
MMcL> hundreds. So should the class count be an int or a size_t?
MMcL> If we want to pass a class to qsort() to rank the children
MMcL> by mark, the function will take a size_t.

Yes, and the dialect of C I use most often will implicitly cast an int
to a size_t as necessary. Does your compiler not support that feature?
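
For what it's worth, the conversion being described might look like this. With the prototype from <stdlib.h> in scope, an int argument is converted to size_t without a cast; rank_class and compare_marks are names made up for the example, and the guard is there because a negative count would convert to a huge size_t.

```c
#include <stdlib.h>   /* qsort */

/* Orders two int marks ascending, for use with qsort. */
static int compare_marks(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Sorts a class's marks; class_count is a plain int. */
void rank_class(int *marks, int class_count)
{
    if (class_count <= 0)
        return;                       /* never pass a negative count on */
    /* class_count is converted to size_t by the qsort prototype. */
    qsort(marks, class_count, sizeof marks[0], compare_marks);
}
```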

Charlton

Malcolm McLean · Nov 7, 2007

Charlton Wilbur said:
MMcL> The problem is that, in C, the array is by far the most
MMcL> common data structure. So size_t needs to be used whenever
MMcL> the maximum dimensions of the array cannot be specified by
MMcL> the programmer.

The dialect of C I use most often *requires* the maximum dimension of
the array to be specified by the programmer, either by providing a
constant size at compile-time or by calling a memory allocation
function with a size at run-time.

MMcL> Frequently we know that the array will be relatively small,
MMcL> but it doesn't make much sense to specify a limit. For
MMcL> instance the number of children in a class must, in Britain,
MMcL> be thirty or less by law. However that law is often
MMcL> breached. But not so flagrantly that you have classes of
MMcL> hundreds. So should the class count be an int or a size_t?
MMcL> If we want to pass a class to qsort() to rank the children
MMcL> by mark, the function will take a size_t.

Yes, and the dialect of C I use most often will implicitly cast an int
to a size_t as necessary. Does your compiler not support that feature?

You need to explicitly cast, or it will warn.

Whilst quite often arrays are in fact fixed at compile time, this is often
at quite a high level or late stage. You might have three thousand "rooms"
in an adventure game, however the "search rooms for named object" function
will be written long before this limit is fixed. So the number of rooms
really needs to be a size_t. The programmer of the search function cannot
dictate how big the final game will be. That's if we go down the size_t
route.

Also, a lot of variables are in fact passed by indirection. If you've got
more than one class in the school, sizes might well be stored in an array.
Index numbers giving a key for each pupil probably will be as well.

Flash Gordon · Nov 7, 2007

Malcolm McLean wrote, On 07/11/07 22:10:

You need to explicitly cast, or it will warn.

The C standard does not require that and it's not a problem I have.

Whilst quite often arrays are in fact fixed at compile time, this is
often at quite a high level or late stage. You might have three thousand
"rooms" in an adventure game, however the "search rooms for named object"
function will be written long before this limit is fixed. So the number
of rooms really needs to be a size_t. The programmer of the search
function cannot dictate how big the final game will be. That's if we go
down the size_t route.

However, if in your scenario the programmer of the search function uses
the type int you can have an infinite game? Don't be an idiot. Of
course, if they have decided to leave the decision until later or think
there may be reason to change there is another useful language feature
called typedef.

typedef whatever room_size_t; /* "whatever" stands for the chosen integer type */

Also, a lot of variables are in fact passed by indirection.

In my experience most variables are not passed by indirection (or, in C
terms, most of the time I do not pass a pointer to the variable).

If you've
got more than one class in the school, sizes might well be stored in an
array. Index numbers giving a key for each pupil probably will be as well.

See suggestion above.

Of course, in the example of class sizes you know it is guaranteed to be
less than 32767 so using int is safe if that is the choice you want to make.
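
The typedef suggestion could be sketched roughly as follows. The choice of unsigned int, the struct room layout, and find_room are all assumptions for illustration; the point is only that the search function is written against room_size_t and the underlying type can be changed later in one place.

```c
#include <string.h>   /* strcmp */

/* Assumed for the example: the type can be changed here later. */
typedef unsigned int room_size_t;

/* Hypothetical room: just the name of the object it holds. */
struct room {
    const char *object;
};

/* Returns the index of the first room holding `name`, or `count`
   if no room holds it. */
room_size_t find_room(const struct room *rooms, room_size_t count,
                      const char *name)
{
    room_size_t i;
    for (i = 0; i < count; i++)
        if (strcmp(rooms[i].object, name) == 0)
            return i;
    return count;
}
```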

Ben Bacarisse · Nov 8, 2007

Malcolm McLean said:
You need to explicitly cast, or it will warn.

That is a QOI issue, not a language issue, surely. With gcc I can
choose.

Tubular Technician · Nov 8, 2007

Andrey said:
The first part ("number of things") is incorrect. 'size_t' should not be
used for that purpose (with some exceptions, see below). The second part
("size in bytes of things") is correct - that's exactly what 'size_t' is
there to be used for.

I was thinking of

struct foo {
/* ... */
};
struct foo Foo[X];
#define FOOSIZE (sizeof Foo / sizeof Foo[0])
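
A hedged sketch of how that count might be used: sizeof yields a size_t, so FOOSIZE is a size_t too, and a size_t index matches it without conversions. The array length 8, the id member, and the two helper functions are assumptions added for illustration.

```c
#include <stddef.h>   /* size_t */

struct foo {
    int id;           /* assumed member, for the example only */
};
struct foo Foo[8];    /* 8 stands in for X */
#define FOOSIZE (sizeof Foo / sizeof Foo[0])

/* Gives each element an id equal to its index. */
void foo_init(void)
{
    size_t i;
    for (i = 0; i < FOOSIZE; i++)
        Foo[i].id = (int)i;
}

/* Returns the element count as a size_t. */
size_t foo_count(void)
{
    return FOOSIZE;
}
```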

No, if by "library functions" you mean "standard library functions". See
above: 'size_t' should be used whenever there's a need to store or pass
a generic "size in bytes of things".

I think the above claim may come from the fact that size_t is not built
into the language like char, int, long, etc. For example (I know there
are probably better examples), if <stdio.h> is included, it makes
available both function prototypes dealing with FILE objects as well
as the FILE type itself.

In case of size_t, it is made available when including headers that use
it, but is not available by the bare language, that is, the compiler
knows by itself the max. size of memory that can be addressed, it knows
by itself about sizeof and the (unnamed) type it yields, but to create an
object of said type inclusion of a header is needed.

(BTW, what is the exact rationale behind this? Was there a fear at the
time size_t was added to the language that making it available
unconditionally may conflict with preexisting code, somewhat like with
C99 when bool was added?)

Hm... I just don't know what to say about it. Doesn't make much sense.

IIRC, this claim was made in the big book discussion thread some time
ago and was given as a reason why the example code in said book does
not use size_t.
