type of array index?

shmartonak · Mar 24, 2005

For maximum portability what should the type of an array index be? Can
any integer type be used safely? Or should I only use an unsigned type?
Or what?

If I'm using pointers to access array elements as *(mptr+k) where I've
declared

MYTYPE *mptr;

what should be the type of 'k'? Should it be ptrdiff_t?

OT: near as I can tell my implementation of gcc doesn't have an
<stddef.h> file with ptrdiff_t defined. Am I overlooking something?

--

Jonathan Bartlett · Mar 24, 2005

what should be the type of 'k'? Should it be ptrdiff_t?

It should be any type of integer.

Jon

Jonathan Bartlett · Mar 24, 2005

If you really want ptrdiff_t, it appears to be in malloc.h. stdint.h
has its limits. Interestingly, obstack.h thinks that it appears in
stddef.h, too. Perhaps someone should report that to the maintainers.
Using gcc 3.3.1.

Jon

Michael Mair · Mar 24, 2005

For maximum portability what should the type of an array index be? Can
any integer type be used safely? Or should I only use an unsigned type?
Or what?

Essentially, there are two choices:
1. Use int whenever sufficient, use a wider type when necessary.
Sufficient means sufficient with respect to the minimal requirements
from the standard.
2. Always use size_t.
size_t is sufficient for all arrays and for indexing all storage you
can allocate (bytewise).

Personally, I favor the second approach but I am not religious
about it. Drawbacks: You have to be more careful due to the inherent
nonnegativity of the index type, e.g.
for (i=MAX-1; i>=0; i--)
operation_on(array);
either has to become
for (i=MAX; i>0; i--)
operation_on(array[i-1]);
or
for (i=MAX-1; i!=-1; i--)
operation_on(array);
which I like better.
As for int: It is possible that int indices are "faster" as int usually
is the "natural" integer type of the system but if this is really
the case, one can still optimise where necessary. Undefined behaviour
due to forgetting about "old-fashioned" 16 bit ints and resulting
strange errors are less nice.

However, there have been many discussions about that.
Do as you like and look for range/sanity checks as necessary (in both
cases).

If I'm using pointers to access array elements as *(mptr+k) where I've
declared

MYTYPE *mptr;

what should be the type of 'k'? Should it be ptrdiff_t?

Click to expand...

Any integer type which suffices for your range requirements.
I would use ptrdiff_t only where appropriate.

If ssize_t had made it into the standard, I'd suggest that.

OT: near as I can tell my implementation of gcc doesn't have an
<stddef.h> file with ptrdiff_t defined. Am I overlooking something?

Click to expand...

That you cannot find it in the header file does not mean that
it is not there by some magic (or simply comes in from another
included file).

Just try it out:
If

#include <stddef.h>
int main (void) { int a[3]; ptrdiff_t b = (a+2)-(a+1); return 0; }

does not compile, file a bug report.
If it does compile without the #include, do the same.

Cheers
Michael

Michael Mair · Mar 24, 2005

Jonathan said:
If you really want ptrdiff_t, it appears to be in malloc.h. stdint.h

malloc.h is not a C standard header, thus offtopic.

has its limits. Interestingly, obstack.h thinks that it appears in

dito for obstack.h

stddef.h, too. Perhaps someone should report that to the maintainers.
Using gcc 3.3.1.

If including <stddef.h> has the effect that size_t or ptrdiff_t
are available when they were not before, everything is fine:
The implementation may do each and everything behind the scenes.

I did not yet have any problems with gcc's compliance in _this_
respect.

Cheers
Michael

Andrey Tarasevich · Mar 24, 2005

For maximum portability what should the type of an array index be? Can
any integer type be used safely? Or should I only use an unsigned type?
Or what?

If I'm using pointers to access array elements as *(mptr+k) where I've
declared

MYTYPE *mptr;

what should be the type of 'k'? Should it be ptrdiff_t?

It depends.

In concrete (non-generic) code the type is usually dictated by the
natural properties of the application area. You should normally have a
type that designates the total amount of objects of 'MYTYPE' already.
This type is an obvious candidate for the array index type. For example,
if this is an array of, say, file handles and you use 'unsigned short'
object to store the total number of files, then 'unsigned short' would
be a natural choice for the index type for this array. This, of course,
applies only if you don't care about negative indexing. For negative
indexing you'd have to use either 'short' or 'int', depending on the
required range.

In generic code 'ptrdiff_t' is the first candidate for array index type,
which also supports negative indices. If you don't care about negative
indices or you want to emphasize the fact that negative indices are not
allowed in some context, then you might want to go with 'size_t'. This
unsigned type is large enough for indexing of any array.

However, from the pedantic point of view, 'size_t' is intended to
implement a concept of "object size", while array index is more related
to the concept of "container element count". These two concepts are not
related and using 'size_t' for array indexing is a conceptual error. It
might be more elegant to "hide" the 'size_t' behind a typedef name as
follows

typedef size_t pos_ptrdiff_t;

and use 'pos_ptrdiff_t' for generic non-negative array indexing.

Michael Mair · Mar 24, 2005

Andrey said:
It depends.

In concrete (non-generic) code the type is usually dictated by the
natural properties of the application area. You should normally have a
type that designates the total amount of objects of 'MYTYPE' already.
This type is an obvious candidate for the array index type. For example,
if this is an array of, say, file handles and you use 'unsigned short'
object to store the total number of files, then 'unsigned short' would
be a natural choice for the index type for this array. This, of course,
applies only if you don't care about negative indexing. For negative
indexing you'd have to use either 'short' or 'int', depending on the
required range.

In generic code 'ptrdiff_t' is the first candidate for array index type,
which also supports negative indices. If you don't care about negative
indices or you want to emphasize the fact that negative indices are not
allowed in some context, then you might want to go with 'size_t'. This
unsigned type is large enough for indexing of any array.

However, from the pedantic point of view, 'size_t' is intended to
implement a concept of "object size", while array index is more related
to the concept of "container element count". These two concepts are not
related and using 'size_t' for array indexing is a conceptual error. It
might be more elegant to "hide" the 'size_t' behind a typedef name as
follows

typedef size_t pos_ptrdiff_t;

and use 'pos_ptrdiff_t' for generic non-negative array indexing.

Nicely put.
Something for a confessing size_t-user like me to think about

Grabbed and stored.

Cheers
Michael

Keith Thompson · Mar 24, 2005

Jonathan Bartlett said:
If you really want ptrdiff_t, it appears to be in malloc.h. stdint.h
has its limits. Interestingly, obstack.h thinks that it appears in
stddef.h, too. Perhaps someone should report that to the
maintainers. Using gcc 3.3.1.

ptrdiff_t is declared in <stddef.h>. <malloc.h> is not a standard
header (the malloc() function is declared in <stdlib.h>), nor is
<obstack.h>. And <stdint.h> is a standard header in C99, but not in
C90, so not all current implementations will provide it (though it's
not too difficult to roll your own).

those who know me have no need of my name · Mar 25, 2005

in comp.lang.c i read:

OT: near as I can tell my implementation of gcc doesn't have an
<stddef.h> file with ptrdiff_t defined. Am I overlooking something?

there is no requirement by the standard that a file exist -- it's allowed
to be magical, perhaps entirely internal. though in fact it does exist,
it's just that it's not where you are looking.

Stephen Sprunk · Mar 25, 2005

Andrey Tarasevich said:
It depends. ....
In generic code 'ptrdiff_t' is the first candidate for array index type,
which also supports negative indices. If you don't care about negative
indices or you want to emphasize the fact that negative indices are not
allowed in some context, then you might want to go with 'size_t'. This
unsigned type is large enough for indexing of any array.

However, from the pedantic point of view, 'size_t' is intended to
implement a concept of "object size", while array index is more related
to the concept of "container element count". These two concepts are not
related and using 'size_t' for array indexing is a conceptual error.

Since an array is an object, and size_t is defined to be large enough to
represent the size of any object, size_t naturally must be able to represent
any possible array subscript.

In _The Standard C Library_, P. J. Plauger writes:

"Unlike ptrdiff_t, however, size_t is /very/ useful. It is the safest type
to represent any integer data object you use as an array subscript. You
don't have to worry if a small array evolves into a very large one as the
program changes. Subscript arithmetic will never overflow when performed in
type size_t." and "You should make a point of using type size_t /anywhere/
your program performs array subscripting or address arithmetic." (emphasis
in original)

The only reason one might prefer ptrdiff_t for array subscripts is where,
due to looping, you need the ability to count down past zero. In this case,
I would prefer ssize_t (where available) over ptrdiff_t, but first I'd
consider whether it was feasible to restructure the loop condition such that
a signed subscript wasn't needed.

S

Andrey Tarasevich · Mar 25, 2005

Stephen said:
Since an array is an object, and size_t is defined to be large enough to
represent the size of any object, size_t naturally must be able to represent
any possible array subscript.

That's the very reason why I suggest using 'size_t' for array indexing
in generic context. For example, in a generic array-support library.

At application level there's rarely a need to have an array just for the
sake of having an array. What is normally needed at application level is
a container with index-based random-access interface. This could be a
"traditional" array, this could be an associative array, this could be
something like a deque. Today it can be one, tomorrow - another. At this
conceptual level there's no relation between container element count and
object size. The fact that this relation holds for a "traditional" array
is noting more than an accident, a low level detail, which has
absolutely no reason to play any role in the process of choosing the
index type.

In _The Standard C Library_, P. J. Plauger writes:

"Unlike ptrdiff_t, however, size_t is /very/ useful. It is the safest type
to represent any integer data object you use as an array subscript. You
don't have to worry if a small array evolves into a very large one as the
program changes. Subscript arithmetic will never overflow when performed in
type size_t." and "You should make a point of using type size_t /anywhere/
your program performs array subscripting or address arithmetic." (emphasis
in original)

Great, but applies mostly to library-level (generic) code.

If in my program a have a dedicated typename for designating the day of
the week, say

typedef enum DayOfTheWeek { /* ... */ } DayOfTheWeek;

and in some place I need to iterate through an array indexed by the day
of the week, then I'd make a point of using 'DayOfTheWeek' to represent
the index, never the completely irrelevant 'size_t'.

The only reason one might prefer ptrdiff_t for array subscripts is where,
due to looping, you need the ability to count down past zero. In this case,
I would prefer ssize_t (where available) over ptrdiff_t, but first I'd
consider whether it was feasible to restructure the loop condition such that
a signed subscript wasn't needed.

I would do the latter.

Tor Rustad · Mar 29, 2005

Any integer type is allowed. For example main() uses argc:

int main(int argc, char* argv[])

If I'm using pointers to access array elements as *(mptr+k) where
I've declared

Click to expand...

[...]

However, from the pedantic point of view, 'size_t' is intended to
implement a concept of "object size", while array index is more
related to the concept of "container element count". These two
concepts are not related and using 'size_t' for array indexing
is a conceptual error.

Not correct, an array is an object and

sizeof array

return the size of that object. Hence, size_t is a good type choice
for an array index . Note that number of array elements

size_t N = sizeof (array) / sizeof (array[0]);

will never overflow with size_t.

Andrey Tarasevich · Mar 29, 2005

Tor said:
...

However, from the pedantic point of view, 'size_t' is intended to
implement a concept of "object size", while array index is more
related to the concept of "container element count". These two
concepts are not related and using 'size_t' for array indexing
is a conceptual error.

Click to expand...

Not correct, an array is an object and

sizeof array

return the size of that object. Hence, size_t is a good type choice
for an array index . Note that number of array elements

size_t N = sizeof (array) / sizeof (array[0]);

will never overflow with size_t.
...

And? This is exactly what I said in my message before the quoted part.
But how is this relevant within the context of paragraph quoted above?

Once again, the concept of "object size" is not related to the concept
of "container element count". The connection between the two for
built-in C arrays is purely parasitic. It is a mere coincidence, which
makes absolutely no difference at conceptual level.

Stephen Sprunk · Mar 29, 2005

Andrey Tarasevich said:
That's the very reason why I suggest using 'size_t' for array indexing
in generic context. For example, in a generic array-support library.

At application level there's rarely a need to have an array just for
the sake of having an array.

Right, but if you get in the habit of writing code where the index is an
int, sooner or later you'll write code that (after it's been edited by a
dozen other coders) overflows that int but size_t would have worked.

Sure, size_t may be less efficient on some systems, but IMHO rarely enough
to offset the possibility of bugs in the future.

Great, but applies mostly to library-level (generic) code.

IMHO, it applies to nearly all application code as well.

If in my program a have a dedicated typename for designating
the day of the week, say
...

In that case, yes, I might declare the index as an int or even short, but
clear-cut cases like that are exceptions.

In most cases, the original coder has no idea what his code will eventually
evolve into, how other components of the same system will abuse it, or
whether the index will overflow ten years later and crash some
mission-critical system. Why take the chance unless you can prove it's
affecting performance?

S

Keith Thompson · Mar 29, 2005

Andrey Tarasevich said:
Once again, the concept of "object size" is not related to the concept
of "container element count". The connection between the two for
built-in C arrays is purely parasitic. It is a mere coincidence, which
makes absolutely no difference at conceptual level.

Object size and container element count are the same for character
arrays, and since an array element can't be smaller than one byte,
you're guaranteed that size_t is at least big enough to hold any array
index.

If you're writing code that deals with arrays generically (such as the
standard qsort() function), it's probably safest to use size_t. If
you're using something more specific to a given problem domain, and
you happen to know that an array will never have more than INT_MAX
elements, you can use int.

Andrey Tarasevich · Mar 29, 2005

Keith said:
[...]

Once again, the concept of "object size" is not related to the concept
of "container element count". The connection between the two for
built-in C arrays is purely parasitic. It is a mere coincidence, which
makes absolutely no difference at conceptual level.

Click to expand...

Object size and container element count are the same for character
arrays, and since an array element can't be smaller than one byte,
you're guaranteed that size_t is at least big enough to hold any array
index.

That's pretty much what I said in the message quoted by the previous poster.

If you're writing code that deals with arrays generically (such as the
standard qsort() function), it's probably safest to use size_t.

Also, exactly what I said in that message.

If
you're using something more specific to a given problem domain, and
you happen to know that an array will never have more than INT_MAX
elements, you can use int.

Same here.

Keith Thompson · Mar 29, 2005

Andrey Tarasevich said:
Keith said:

[...]

Once again, the concept of "object size" is not related to the concept
of "container element count". The connection between the two for
built-in C arrays is purely parasitic. It is a mere coincidence, which
makes absolutely no difference at conceptual level.

Click to expand...

Object size and container element count are the same for character
arrays, and since an array element can't be smaller than one byte,
you're guaranteed that size_t is at least big enough to hold any array
index.

Click to expand...

That's pretty much what I said in the message quoted by the previous poster.

You said that "object size" and "container element count" are not
related. I showed how they are related. They're not identical, of
course, but the nature of their relationship is such that it's
(almost) always safe to use size_t as an array index.

(If you use a pointer into the middle of an array object to provide
negative indices, then size_t won't work, but that's fairly unusual.
If you have a data structure that implements bit arrays, size_t may
not be big enough, but such a data structure can't be implemented with
array syntax in standard C.)

Andrey Tarasevich · Mar 30, 2005

Keith said:
[...]
Once again, the concept of "object size" is not related to the concept
of "container element count". The connection between the two for
built-in C arrays is purely parasitic. It is a mere coincidence, which
makes absolutely no difference at conceptual level.

Object size and container element count are the same for character
arrays, and since an array element can't be smaller than one byte,
you're guaranteed that size_t is at least big enough to hold any array
index.

Click to expand...

That's pretty much what I said in the message quoted by the previous poster.

Click to expand...

You said that "object size" and "container element count" are not
related.

I said that they are not related at conceptual level. These are two
different concepts.

I showed how they are related.

Not as concepts.

They're not identical, of
course, but the nature of their relationship is such that it's
(almost) always safe to use size_t as an array index.

As an array index - of course. As an index into a generic container -
no. As I said before, is most cases application does not specifically
need an array. It just needs a container with similar interface. In
situations like this an array can be easily "unplugged" from the client
code and another container can be "plugged-in" instead (assuming that
the interface specifications match). Hardcoding an array-specific index
type in this case will only introduce another unnecessary coupling
between the client code and the concrete type of the container.

Andrey Tarasevich · Mar 31, 2005

Stephen said:
Right, but if you get in the habit of writing code where the index is an
int, sooner or later you'll write code that (after it's been edited by a
dozen other coders) overflows that int but size_t would have worked.

Well, just as easily you can get into a situation where 'size_t'
overflows and a dedicated application-specific type works. One day one
of those other coders decides to replace the built-in array with a
custom implementation of some "disjoint" array, like 'deque', for
example. Size of that container is no longer limited by the range of
'size_t'.

Sure, size_t may be less efficient on some systems, but IMHO rarely enough
to offset the possibility of bugs in the future.

IMHO, it applies to nearly all application code as well.

In that case, yes, I might declare the index as an int or even short, but
clear-cut cases like that are exceptions.

In my opinion, a closer look at most practical cases usually reveals
that in essence they are indeed such clear-cut cases.

In most cases, the original coder has no idea what his code will eventually
evolve into, how other components of the same system will abuse it, or
whether the index will overflow ten years later and crash some
mission-critical system. Why take the chance unless you can prove it's
affecting performance?

That's exactly what I'm talking about. The code can easily evolve from
using an array to using a different type of container, with element
count is no longer restrained by the range of 'size_t'. In situations
like this one has to look through the entire code and carefully replace
'size_t' with some other type. This usually involves a lot of chance-taking.

Stephen Sprunk · Mar 31, 2005

Andrey Tarasevich said:
Well, just as easily you can get into a situation where 'size_t'
overflows and a dedicated application-specific type works.

Per the standard, size_t CANNOT overflow when used as an array index.

One day one of those other coders decides to replace the built-in
array with a custom implementation of some "disjoint" array, like
'deque', for example. Size of that container is no longer limited by
the range of 'size_t'.

A 'deque' is not an array and therefore any problem indexing them with
size_t is moot.

[ Rest of post snipped because it doesn't apply to arrays. ]

The OP specifically asked about the type to use for an array index.
Hijacking the thread to generalize to other non-standard container types is
irrelevant at best.

S

Accessing array index addresses with custom datatype in a function	0	Jun 2, 2022
Program to find the largest integer element of an array.	1	Mar 2, 2022
Fatal error: Uncaught Error: Cannot use object of type WP_Error as array in	0	Dec 23, 2021
array index type	26	Oct 24, 2007
Outputting signal values to terminal Within Character Array	0	Dec 10, 2021
qsort returning index	2	Oct 25, 2011
Ternary Conditional Fun Macros for Any Type	20	Aug 1, 2010
Pointer problem when trying to abstract away an array type	11	May 11, 2009

type of array index?

shmartonak

Jonathan Bartlett

Jonathan Bartlett

Michael Mair

Michael Mair

Andrey Tarasevich

Michael Mair

Keith Thompson

those who know me have no need of my name

Stephen Sprunk

Andrey Tarasevich

Tor Rustad

Andrey Tarasevich

Stephen Sprunk

Keith Thompson

Andrey Tarasevich

Keith Thompson

Andrey Tarasevich

Andrey Tarasevich

Stephen Sprunk

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads