Using size_t clearly (appropriately?)


Mark Odell

I've always declared variables used as indexes into arrays to be of
type 'size_t'. I have had it brought to my attention, recently, that
size_t is used to indicate "a count of bytes" and that using it
otherwise is confusing. I also thought that size_t could be signed but
it seems I was wrong on that one.

So if you were to see code iterating through a table of Foo objects
using an index of size_t type, would it be confusing? Should I have
used an index of type int or unsigned int instead?

Thanks,
 

Morris Dovey

Mark Odell said:

| I've always declared variables used as indexes into arrays to be of
| type 'size_t'. I have had it brought to my attention, recently, that
| size_t is used to indicate "a count of bytes" and that using it
| otherwise is confusing. I also thought that size_t could be signed
| but it seems I was wrong on that one.
|
| So if you were to see code iterating through a table of Foo objects
| using an index of size_t type, would it be confusing? Should I have
| used an index of type int or unsigned int instead?

Certainly not confusing. Perhaps confidence-building.
 

Michael Mair

Mark said:
I've always declared variables used as indexes into arrays to be of
type 'size_t'. I have had it brought to my attention, recently, that
size_t is used to indicate "a count of bytes" and that using it
otherwise is confusing. I also thought that size_t could be signed but
it seems I was wrong on that one.

So if you were to see code iterating through a table of Foo objects
using an index of size_t type, would it be confusing? Should I have
used an index of type int or unsigned int instead?

I would think "here is someone who thought about what an index is"...
:)
If ssize_t were standard C, I'd accept that as well, because it makes
loops that count downwards easier to deal with.

Typedefs used to define certain roles, say
typedef .... Index;
inspire the same confidence.

int, long, size_t, and maybe unsigned long are perfectly fine
choices for array indices.


Cheers
Michael
 

Andrew Poelstra

| I've always declared variables used as indexes into arrays to be of
| type 'size_t'. I have had it brought to my attention, recently, that
| size_t is used to indicate "a count of bytes" and that using it
| otherwise is confusing. I also thought that size_t could be signed but
| it seems I was wrong on that one.
|
| So if you were to see code iterating through a table of Foo objects
| using an index of size_t type, would it be confusing? Should I have
| used an index of type int or unsigned int instead?
|
| Thanks,

It wouldn't be confusing at all. In fact, there are situations where
you would /want/ to have size_t as your type. For example, you could
be working with strings and be counting length.

I can't see why size_t would ever be signed. However, you shouldn't
be using negative numbers in most loops.

Now, if your coding guidelines tell you not to use "size_t" for
applications that are not "a count of bytes" (array indexing /is/
a count of bytes, IMHO), then go with that. Random people from
USENet don't trump your boss, even though we think we do. :)
 

Richard Heathfield

Mark Odell said:
| I've always declared variables used as indexes into arrays to be of
| type 'size_t'. I have had it brought to my attention, recently, that
| size_t is used to indicate "a count of bytes"

Who says?

Typical standard library functions use size_t in contexts where the value in
question is either the /size/ of an object, in bytes, or the /number/ of
objects that are relevant in the call. Look, for example, at calloc, fread,
fwrite.

| and that using it otherwise is confusing.

It's a great type for an index, too. Someone said it's harder to use size_t
to count backwards, but it's not.

for(i = n; i-- > 0; )
{
    foo(bar + i);
}

| I also thought that size_t could be signed but it seems I was wrong on
| that one.

Yes, you're right - you were wrong. :) It must be an unsigned type.

| So if you were to see code iterating through a table of Foo objects
| using an index of size_t type, would it be confusing?

Not in the slightest.
 

Keith Thompson

Andrew Poelstra said:
It wouldn't be confusing at all. In fact, there are situations where
you would /want/ to have size_t as your type. For example, you could
be working with strings and be counting length.

I can't see why size_t would ever be signed. However, you shouldn't
be using negative numbers in most loops.

size_t is guaranteed to be unsigned.

One possible drawback of using size_t (or any unsigned type) is that a
loop like this:

size_t i;
for (i = MAX; i >= 0; i --) {
    /* ... */
}

will never terminate, since i will *always* be >= 0. The same issue
applies to signed types:

int i;
for (i = whatever; i <= INT_MAX; i ++) {
    /* ... */
}

but it doesn't come up as often. (Also, decrementing a size_t with
the value 0 is well defined; incrementing an int with the value
INT_MAX, or decrementing an int with the value INT_MIN, invokes
undefined behavior.)

Both signed and unsigned integers behave like mathematical integers as
long as you stay away from the ends of their ranges. The difference
is that the ends of the range of a signed integer type are way out
there, and you're likely not to encounter them; the lower range of an
unsigned type is 0, and it's easy to run into that if you're not
careful.
 

pete

Mark said:
| I've always declared variables used as indexes into arrays to be of
| type 'size_t'. I have had it brought to my attention, recently, that
| size_t is used to indicate "a count of bytes" and that using it
| otherwise is confusing.

Then the nmemb parameter of qsort
must be more confusing to them, than it is to you.

#include <stdlib.h>
void qsort(void *base, size_t nmemb, size_t size,
int (*compar)(const void *, const void *));

| I also thought that size_t could be signed but
| it seems I was wrong on that one.
|
| So if you were to see code iterating through a table of Foo objects
| using an index of size_t type, would it be confusing?

No.

The one and only problem that I have with size_t,
is the lack of a size_t format specifier for fprintf in C89.
 

Al Balmer

| It wouldn't be confusing at all. In fact, there are situations where
| you would /want/ to have size_t as your type. For example, you could
| be working with strings and be counting length.
|
| I can't see why size_t would ever be signed. However, you shouldn't
| be using negative numbers in most loops.

Posix puts their ssize_t (signed size_t) to use for functions that
return either a count or -1. I don't know of anything in standard C
that could use that feature.

| Now, if your coding guidelines tell you not to use "size_t" for
| applications that are not "a count of bytes" (array indexing /is/
| a count of bytes, IMHO), then go with that. Random people from
| USENet don't trump your boss, even though we think we do. :)

The standard specifies size_t for some things that are not a count of
bytes.
 

William Ahern

| Posix puts their ssize_t (signed size_t) to use for functions that return
| either a count or -1. I don't know of anything in standard C that could
| use that feature.

snprintf. Or, basically, anything in the printf family. Some, like snprintf(),
even take size_t lengths as arguments while returning an int. Very awkward.
Not that ssize_t is particularly less awkward, but at least it provides a
greater range in practice, and in some scenarios ssize_t could even
solve the issue entirely:

typedef unsigned long size_t;
typedef long long ssize_t;

Where LLONG_MAX >= ULONG_MAX.
 

Andrey Tarasevich

Mark said:
I've always declared variables used as indexes into arrays to be of
type 'size_t'. I have had it brought to my attention, recently, that
size_t is used to indicate "a count of bytes" and that using it
otherwise is confusing.

It might be. Not so much "confusing" as conceptually incorrect. The 'size_t' type
is intended to represent the concept of 'size of an object'. The number of
elements in an array is described by a completely different concept: 'number
of elements in a container'. Note that in the case of a generic container these two
concepts are completely unrelated. In the particular case of an _array_ there's
a certain "parasitic" relationship between the two: the latter cannot be greater
than the former. This is often used as a justification for using 'size_t' to
represent array indices. This is false reasoning. In the general case, once again,
using 'size_t' for this purpose is a conceptual error.

In certain particular cases though 'size_t' could be appropriate as an array
index type. For example, when one needs to iterate through an array of raw
memory bytes (i.e. array of 'unsigned char'). Another example would be generic
purpose functions that work with "generic" arrays, i.e. functions that are not
tied to a concrete application-specific area. String processing functions and
functions of 'memset'/'memcpy'/etc group, 'bsearch' and 'qsort' functions belong
to that category.

It is also worth noting (and it looks like you know that already) that operator
'[]' accepts signed integral arguments, which indicates that in the most generic

| I also thought that size_t could be signed but
| it seems I was wrong on that one.

Yes, 'size_t' is always unsigned.

| So if you were to see code iterating through a table of Foo objects
| using an index of size_t type, would it be confusing?

The first question that has to be answered here is what exactly is 'Foo'. If
this is an application-specific type, then the use of 'size_t' for indexing
would be incorrect. Normally, regardless of whether there are any arrays of
'Foo' in the code, the programmer would have already made a choice of type that
should be used to represent the quantities of 'Foo'. For example, that could be
something like 'typedef unsigned TFooQuantity;' or simply 'unsigned' without any
extra 'typedef's. That type is the type that should be used as index type in
'Foo' arrays, not 'size_t'.

| Should I have
| used an index of type int or unsigned int instead?

See above. You should ask yourself: what type are you using to represent the
concept of 'quantity' of objects of type 'Foo' in your code. That's exactly the
type you should use for array indexing.
 

Keith Thompson

pete said:
Mark Odell wrote: [...]
So if you were to see code iterating through a table of Foo objects
using an index of size_t type, would it be confusing?

No.

The one and only problem that I have with size_t,
is the lack of a size_t format specifier for fprintf in C89.

Which, of course, is easy to work around:

fprintf(some_file, "size = %lu\n", (unsigned long)sizeof whatever);

This isn't guaranteed to work in C99, but a #if test on
__STDC_VERSION__ will solve that.
 

pete

Andrey said:
It might be. Not so much "confusing" as conceptually incorrect. The 'size_t' type
is intended to represent the concept of 'size of an object'. The number of
elements in an array is described by a completely different concept: 'number
of elements in a container'. Note that in the case of a generic container these two
concepts are completely unrelated. In the particular case of an _array_ there's
a certain "parasitic" relationship between the two: the latter cannot be greater
than the former. This is often used as a justification for using 'size_t' to
represent array indices. This is false reasoning. In the general case, once again,
using 'size_t' for this purpose is a conceptual error.

In certain particular cases though 'size_t' could be appropriate as an array
index type. For example, when one needs to iterate through an array of raw
memory bytes (i.e. array of 'unsigned char'). Another example would be generic
purpose functions that work with "generic" arrays, i.e. functions that are not
tied to a concrete application-specific area. String processing functions and
functions of 'memset'/'memcpy'/etc group, 'bsearch' and 'qsort' functions belong
to that category.

That's not a bad explanation.
 

pete

pete said:
That's not a bad explanation.

But, if I were going to compare the array index
to a size_t expression or assign a size_t value
to an index variable, I would still probably use
a size_t type index variable.
 

Richard Heathfield

Andrey Tarasevich said:
It might be. Not so much "confusing" as conceptually incorrect. The 'size_t'
type is intended to represent the concept of 'size of an object'.

The calloc, qsort, bsearch, fread, and fwrite standard library functions all
use size_t to count a number of objects, and are therefore counter-examples
(insofar as the Standard is definitively correct).
 

Keith Thompson

Andrey Tarasevich said:
It might be. Not so much "confusing" as conceptually
incorrect. The 'size_t' type is intended to represent the
concept of 'size of an object'. The number of elements in an array is
described by a completely different concept: 'number of elements
in a container'. Note that in the case of a generic container these two
concepts are completely unrelated. In the particular case of an
_array_ there's a certain "parasitic" relationship between the two:
the latter cannot be greater than the former. This is often used as
a justification for using 'size_t' to represent array indices. This
is false reasoning. In the general case, once again, using 'size_t'
for this purpose is a conceptual error.

That's well argued, but I disagree.

We use what we have. We have a type size_t that's designed to count
sizes (in bytes) of objects. We don't have a similar type that's
designed to count the number of elements in an array of struct foobar.
If we had such a type, I'd advocate using it (for example, if
declaring "struct foobar" implicitly created an unsigned int typedef
called, say, "struct_foobar_count").

Using size_t to count objects isn't ideal, but it's what we have.
Since objects (other than bit fields, which we generally wouldn't be
interested in counting) are at least one byte each, we know that
size_t has *at least* enough range for the purpose. I don't believe
any other type would be any better, and size_t isn't sufficiently bad
that I'd recommend avoiding it.

If the language had a type to be used generically for counting
objects, surely it would be just an alias for size_t, since the
objects could be bytes in an array. I'm not greatly distressed by the
fact that it's called "size_t" rather than "object_count_t".
 

Andrey Tarasevich

Richard said:
The calloc, qsort, bsearch, fread, and fwrite standard library functions all
use size_t to count a number of objects, and are therefore counter-examples
(insofar as the Standard is definitively correct).

All these functions are excellent examples of generic array processing
functions, with which 'size_t' is perfectly appropriate. I explicitly
mentioned it in my message. I actually mentioned some of these functions
as well.
 

Andrey Tarasevich

Keith said:
That's well argued, but I disagree.

I think we already had this discussion before.

| We use what we have. We have a type size_t that's designed to count
| sizes (in bytes) of objects. We don't have a similar type that's
| designed to count the number of elements in an array of struct foobar.
| If we had such a type, I'd advocate using it (for example, if
| declaring "struct foobar" implicitly created an unsigned int typedef
| called, say, "struct_foobar_count").

That only appears so. Whenever some type (say 'struct foobar') is given
some application-specific meaning (say, describing an employee in a
company) and represents a 'countable' object (say, we normally have many
employees in a company), there is a need to choose a type that will
be used to represent these 'counts', these application-specific
quantities. Note that we are not talking about any "arrays" yet, but
the need for a type that represents the 'quantity' already exists.

Now, once we start using arrays, that 'quantity' type immediately
springs to mind as the best choice for the index type. Note that we indeed
"use what we have", as you said in your message. I just want to say that
by the time we get to arrays, we will already "have" the index type, and
it is not 'size_t'. 'size_t' is a bad choice to represent generic
'quantities' for obvious reasons (it might simply not have the range;
think of a segmented 16-bit platform with a 16-bit 'size_t').

Once again, 'quantities' predate 'arrays'. By the time we get to
'arrays' (or any other containers, for that matter) we should have
already made all the necessary choices about 'quantity' types.

| Using size_t to count objects isn't ideal, but it's what we have.
| Since objects (other than bit fields, which we generally wouldn't be
| interested in counting) are at least one byte each, we know that
| size_t has *at least* enough range for the purpose.

In the general case 'size_t' is not applicable for counting objects at all.
In the general case its range is not sufficient (16-bit platform again).
Yes, 'size_t' is applicable for counting _array_ _elements_, but that's
nothing more than a language-specific parasitic relationship between the
byte-size of an array and the number of elements in it. Letting this
parasitic relationship seep into the design of application-specific
code is not the right thing to do.

| I don't believe
| any other type would be any better, and size_t isn't sufficiently bad
| that I'd recommend avoiding it.
|
| If the language had a type to be used generically for counting
| objects, surely it would be just an alias for size_t, since the
| objects could be bytes in an array. I'm not greatly distressed by the
| fact that it's called "size_t" rather than "object_count_t".

Once again, on a traditional 16-bit segmented platform with 16-bit
'size_t' the difference between the concept of 'object size' and 'object
count' is especially obvious. As is the inappropriateness of choosing
'size_t' as generic object count type.
 

Richard Heathfield

Andrey Tarasevich said:

All these functions are excellent examples of generic array processing
functions, with which 'size_t' is perfectly appropriate. I explicitly
mentioned it in my message. I actually mentioned some of these functions
as well.

True enough, but I fail to see why you consider them exceptions.

Okay, let's take a different tack. The canonical way to determine the number
of elements in an array (cf C89 3.3.3.4) is: sizeof array / sizeof array[0]

Now, sizeof yields size_t. What is the natural type to use for storing the
result of a division of size_t by size_t? I would argue that it's size_t.
Certainly the division will yield an unsigned type as its result. So it
makes perfect sense to do this:

size_t i;

for(i = 0; i < sizeof array / sizeof array[0]; i++)

Yes? Well, I doubt whether I've convinced you, but maybe some others here
will be swayed by this argument. :)
 
