int or size_t as the index for the string


Mike Wahler

Malcolm said:
size_t is correct, int is traditional and what any C programmer would use.

Not what I'd use, nor do I think would most C programmers.

Why int? Because int is the natural integer type for the program to use.

Why? IMO context determines what the 'natural' type to use is.
In the case of object sizes or array subscripts, the 'natural'
(and more important, guaranteed to work) type is 'size_t'.
If a string is so long that an int won't hold its length, then it's likely that
the program is hopelessly broken anyway.

I fail to see any logic in that assertion.
Is a program that reads e.g. 40,000 bytes from
a file into an array of characters 'broken'?
Note I said likely.

What makes it likely? Do you for some reason find it
unlikely for an array to have more than 32767 elements?
Why?
You can of course construct an artificial program which
uses a single massive string

I suppose 'massive' is a subjective term. But I have
written real production code which used such 'massive'
arrays.
and laugh when the int i solution fails on it.

Undefined behavior is no laughing matter in real code.

Why not always use 'size_t' and *know* it will *always*
work, rather than using 'int' and having it *possibly*
or *probably* work? There are already too many uncertainties
in life. I'll take the guarantee, thank you very much.

-Mike
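
A minimal sketch of the guarantee being argued for here (the function and the test string are invented for illustration): strlen() returns size_t, so a size_t index can never be too narrow for the length it counts, whereas an int index overflows, with undefined behaviour, once the length exceeds INT_MAX.

#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Illustrative example, not from the thread: count spaces in a string
   of arbitrary length.  strlen() returns size_t, so a size_t index can
   represent any value that 'len' can take. */
static void count_spaces(const char *s)
{
    size_t len = strlen(s);
    size_t spaces = 0;
    size_t i;

    for (i = 0; i < len; ++i)   /* works for any object size */
        if (s[i] == ' ')
            ++spaces;

    printf("%zu spaces in %zu characters\n", spaces, len);
}

int main(void)
{
    count_spaces("a short example string");
    /* An int index would overflow on any string longer than INT_MAX
       characters, and INT_MAX may be as small as 32767. */
    printf("INT_MAX here is %d\n", INT_MAX);
    return 0;
}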
 

qazmlp

size_t len = strlen( myCString ) ;

for( WHAT i = 0 ; i < len ; ++i )
{
// iterate through the string
// and do some operations
}
 

Malcolm

qazmlp said:
size_t len = strlen( myCString ) ;

for( WHAT i = 0 ; i < len ; ++i )
{
// iterate through the string
// and do some operations
}
size_t is correct, int is traditional and what any C programmer would use.

Why int? Because int is the natural integer type for the program to use. If
a string is so long that an int won't hold its length, then it's likely that
the program is hopelessly broken anyway.

Note I said likely. You can of course construct an artificial program which
uses a single massive string and laugh when the int i solution fails on it.
 

Malcolm

Mike Wahler said:
I fail to see any logic in that assertion.
Is a program that reads e.g. 40,000 bytes from
a file into an array of characters 'broken'?
If int is only 16 bits wide then, probably, yes, things are pretty hopeless.
Such a machine is unlikely to have a clock speed fast enough to deal with a
40,000-character array in decent time, for example.
Also, if you expect a 40,000-char file, then you should also entertain the
possibility that you will have to deal with a 66,000-char one. If you only
expect a file of 2,000-3,000 lines, but are not setting an absolute limit on
the user, you don't gain much by setting that limit at 40,000 rather than 32,000.
As I said, you can construct a counter-example. E.g. maybe the 40,000 chars
are part of the Hebrew Bible and you are writing "Bible code" searching
software. Since the text of the Hebrew Bible is unlikely to change, this
strategy doesn't risk overflowing the buffer on the next input.
Why not always use 'size_t' and *know* it will *always*
work, rather than using 'int' and having it *possibly*
or *probably* work? There are already too many uncertainties
in life. I'll take the guarantee, thank you very much.
Because size_t is an ANSI kludge to get round the fact that technically an
object in C can be enormous.
It is arguably useful when you are genuinely passing a memory size, but
problems mount when you use size_t for a count.
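
One concrete illustration of the "count" problem (a sketch added for illustration, not taken from any post in the thread): the habitual backwards loop never terminates when the index is unsigned, because "i >= 0" is always true for a size_t.

#include <stdio.h>
#include <string.h>

int main(void)
{
    const char *s = "abc";
    size_t len = strlen(s);
    size_t i;

    /* BUG if enabled: i is unsigned, so "i >= 0" is always true; when i
       reaches 0 the decrement wraps to SIZE_MAX and s[i] reads far out
       of bounds.

    for (i = len - 1; i >= 0; --i)
        putchar(s[i]);
    */

    /* A common correct form: test i > 0 and index with i - 1. */
    for (i = len; i > 0; --i)
        putchar(s[i - 1]);
    putchar('\n');

    return 0;
}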

For starters, loops are by definition executed many times. int i is almost
guaranteed to fit in a register. size_t has no such guarantee. So why not
just use int i for the inner loops? Now you've got the worst of both
worlds - size_ts cluttering up the code, whilst it still won't run on big
inputs.

Another problem is what happens when we do arithmetic with the index. This
happens quite a bit, and you will get signed / unsigned mismatches. With
pointer arithmetic, size_t looks like a count of bytes (it's a size, right).

MYSTRUCT *ptr = x;
size_t i;
for (i = 0; i < 10; i++)
    ptr + i;

looks like we are adding 0-9 bytes to ptr. This will confuse people who
aren't too familiar with C, as well as people who expect code to be written
clearly.
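
For what it is worth, here is a compilable sketch of that fragment (the contents of MYSTRUCT are invented, since the post does not show them): ptr + i moves i whole elements, i.e. i * sizeof(MYSTRUCT) bytes, and it does so whether i is a size_t or an int, so the objection is about how the line reads rather than about what it does.

#include <stdio.h>

/* Stand-in definition; Malcolm's MYSTRUCT is not shown in the post. */
typedef struct {
    double payload[4];
} MYSTRUCT;

int main(void)
{
    MYSTRUCT x[10];
    MYSTRUCT *ptr = x;
    size_t i;

    for (i = 0; i < 10; i++) {
        /* ptr + i points i whole elements past ptr, i.e.
           i * sizeof(MYSTRUCT) bytes, regardless of the type of i. */
        printf("ptr + %zu is %zu bytes past ptr\n",
               i, (size_t)((char *)(ptr + i) - (char *)ptr));
    }
    return 0;
}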
 

pete

Malcolm said:
int i is almost guaranteed to fit in a register.
size_t has no such guarantee.

Notice that you just said that int i has no such guarantee either.
- size_ts cluttering up the code

size_t has meaning.
If you have int used for both data and indexing,
the code becomes harder to read.
whilst it still won't run on big inputs.

Another problem is what happens when we do arithmetic
with the index. This happens quite a bit,
and you will get signed / unsigned mismatches.

ptrdiff_t is also available instead of size_t.
With pointer arithmetic, size_t looks like a count of bytes

size_t looks like a count.
This will confuse people who aren't too familiar with C,
as well as people who expect code to be written clearly.

I think that it's easier to read code when size_t
is used for counting and int is used for integer data.
I especially notice this when I see sorting functions on the net
that were written to sort arrays of integers.
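
A short sketch of the convention pete describes (the sort routine itself is invented for illustration): size_t carries the element count and the positions, int carries the values being sorted, so the two roles stay visually distinct.

#include <stdio.h>

/* Illustrative example, not from the thread: insertion sort over an
   array of int, with size_t for the count and positions and int for
   the data being sorted. */
static void sort_ints(int *a, size_t n)
{
    size_t i;

    for (i = 1; i < n; ++i) {
        int key = a[i];      /* data: int */
        size_t j = i;        /* position: size_t */

        while (j > 0 && a[j - 1] > key) {
            a[j] = a[j - 1];
            --j;
        }
        a[j] = key;
    }
}

int main(void)
{
    int data[] = { 5, -2, 9, 0, 3 };
    size_t n = sizeof data / sizeof data[0];
    size_t i;

    sort_ints(data, n);
    for (i = 0; i < n; ++i)
        printf("%d ", data[i]);
    putchar('\n');
    return 0;
}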
 
