usage of size_t

Keith Thompson · Feb 22, 2010

gwowen said:
I know the difference. I've known the difference for years. But I
still have to think about it (if you see what I mean. I know which
one's Ant and which one's Dec, but I have to think about that too).
If I'm reading some code, I don't want my concentration unnecessarily
broken by having to recall some syntactical nicety, even a
relatively. The next guy to read my code may have to think harder
than me.

That test for inequality is implicit: an explicit one would look like
while(--i != 0). I defer to your knowledge on whether this counts as
a conversion to bool, but whatever such an implicit test is called, I
don't care for it with --i or i--. That's writing for the compiler,
not the human reader.

The test for inequality is part of the definition of the while
statement; it also occurs in "while (x > 0)" (x > 0 yields 0 or 1;
the behavior of the while statement is controlled by whether the
result is unequal to 0).

There could hardly be a conversion to bool, since bool (or _Bool)
didn't exist in C prior to C99. C99 could have changed the rules for
conditions, but it didn't.
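A small sketch of the point above, assuming a C99 or later compiler. The helper names are made up for illustration. It shows that a relational expression is an ordinary int with value 0 or 1, and that a while loop keeps running as long as its controlling expression compares unequal to 0.

```c
/* A relational expression has type int and yields exactly 0 or 1. */
int relational_result(int x)
{
    return x > 0;
}

/* sizeof does not evaluate its operand; it only looks at the type of (x > 0). */
int relational_has_int_size(void)
{
    int x = 5;
    return sizeof(x > 0) == sizeof(int);
}

/* Counts the iterations of "while (n)": the loop stops only when n == 0. */
unsigned count_iterations(unsigned n)
{
    unsigned iterations = 0;
    while (n) {          /* behaves the same as while (n != 0) */
        n--;
        iterations++;
    }
    return iterations;
}
```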

That's what the language says. Personally, though, I agree with you.
I dislike the use of expressions that aren't logically Boolean
as conditions. By "logically Boolean", I mean having two possible
meaningful values, where 0 denotes a false condition and anything
non-zero denotes a true condition, with no meaningful distinction
among non-zero values. This includes results of certain operators,
values of variables used in this way, and of course anything of
type _Bool. For other kinds of expressions, I prefer an explicit
"!= 0" or "!= '\0'" or "!= 0.0" or "!= NULL".

But plenty of C programmers don't feel that way, and we all need
to be able to read and understand their code.
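
To make the preference for explicit comparisons concrete, here is a hedged sketch. The function count_nonblank and its parameter are hypothetical; the point is only where a comparison is spelled out and where a logically Boolean value is used directly.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: counts the non-space characters in s. */
size_t count_nonblank(const char *s)
{
    size_t count = 0;

    if (s == NULL) {                 /* pointer: explicit comparison with NULL */
        return 0;
    }
    while (*s != '\0') {             /* character: explicit comparison with '\0' */
        bool is_blank = (*s == ' '); /* logically Boolean, so tested directly */
        if (!is_blank) {
            count++;
        }
        s++;
    }
    return count;
}
```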

Personally, I almost never use --i as anything but a stand-alone
expression, and don't use i-- if I can possibly help it. Is there a
compiler anywhere for which

z = i--;

produces different code than

z = i;
i = i-1;

Maybe, maybe not. They have the same effect; it doesn't make
sense to choose one or the other based on the generated code
(unless you're working around a serious compiler bug).
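
For the record, a minimal sketch showing that the two spellings leave z and i in the same state. The struct and function names are invented for this example.

```c
/* Holds the values of z and i after one of the two forms has run. */
struct state {
    int z;
    int i;
};

/* Combined form: z receives the old value of i, then i is decremented. */
struct state combined(int i)
{
    struct state s;
    s.z = i--;
    s.i = i;
    return s;
}

/* Split form: the same two steps written as separate statements. */
struct state split(int i)
{
    struct state s;
    s.z = i;
    i = i - 1;
    s.i = i;
    return s;
}
```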

And, if not, which one is clearer to a neophyte C coder who's been
given my code to maintain (poor bastard), or a Fortran programmer
trying to see how my C code works, or a mathematician checking my
implementation of his algorithm? Yes, it's a minor stylistic point,
and they're automatically subjective, but that's my opinion. I don't
doubt yours is at least as valid, and probably more widely held.

As a standalone statement, I find "i = i - 1;" *less* clear than
"i--;" or "--i;". It would make me ask myself whether the author
is unaware of the "--" operator, and therefore probably shouldn't
be writing C.

And I don't write code for neophyte programmers (except for
examples I post here), programmers who don't know the language,
or non-programmers. It needs to be clear to my peers, including
myself a year later. My style may be more straightforward than some
(I might write "z = i; i--;" rather than "z = i--;"), but I'm not
going to dumb it down to cater to people who probably won't see my
code anyway.

Keith Thompson · Feb 22, 2010

Phil Carmody said:
I don't remember C being a strictly typed language, on the
presumption that implies something similar to being strongly
typed.

That depends on what you mean by "strongly typed".

Malcolm's point (which you partially snipped) is that int* and
size_t* are incompatible types, and he's correct; assigning an int*
to a size_t*, or vice versa, is a constraint violation.

In a sense, C pointer types (other than void*) are "strongly typed",
but C arithmetic types are not.
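
A sketch of what that means in practice, with a hypothetical function get_length that reports a length through a size_t*. Passing the address of an int would be the constraint violation described above, so that call is left commented out; converting the value afterwards is fine, because arithmetic types convert implicitly.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical API: stores the length of s in *out. */
void get_length(const char *s, size_t *out)
{
    *out = strlen(s);
}

int length_as_int(const char *s)
{
    size_t len;
    int n;

    /* get_length(s, &n);  constraint violation: pointer to int, not pointer to size_t */
    get_length(s, &len);   /* correct: the pointer types match */

    n = (int)len;          /* value conversion; assumes len fits in an int */
    return n;
}
```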

Keith Thompson · Feb 22, 2010

gwowen said:
Yes, that's exactly right.

And that strikes me as a specialized and fairly rare requirement.

If you have a specific need for your code to be legible by "those
with some knowledge of algorithms and pseudo-code, but little or
no knowledge of C", then of course that's what you should do.

Most of the rest of us have no such requirement, and catering to it
has a non-zero cost (making the code less clear to the experienced C
programmers who are actually likely to read it) that we're unwilling
to pay.

Yet another case is writing code for the purpose of teaching C.
Such code needs to be readable by inexperienced C programmers,
but it should introduce common C idioms. It sounds like you're
not in that position.

Code should be written with its audience in mind -- and the compiler
is not the only audience to be considered.

Keith Thompson · Feb 22, 2010

Francis Moreau said:
Well, size_t is the type of the value returned by sizeof(). And
sizeof() returns the number of bytes (ie char) of its operand. So I
assumed that size_t was introduced to represent a number of char.

And a consequence of that is that, since array elements are at
least one byte, size_t can also safely be used to represent a
number of array elements, or an array index. No other predefined
type guarantees this (other than uintmax_t, which is overkill).
Maybe a smaller type than size_t suffices to count elements of an
array of double, but it's not worth the effort to figure that out.
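
As a rough illustration, assuming nothing beyond standard C: the element count of an array is sizeof applied to the array divided by sizeof applied to one element. Both are size_t values, so the count always fits in size_t. The function and macro names are made up.

```c
#include <stddef.h>

/* Sums an array of double; n is an element count, not a byte count. */
double sum_doubles(const double *values, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        total += values[i];
    }
    return total;
}

/* Element count of a real array (this does not work on a pointer). */
#define ARRAY_COUNT(a) (sizeof (a) / sizeof (a)[0])
```

A caller holding `double samples[] = {1.0, 2.0, 3.5};` would pass `ARRAY_COUNT(samples)` as the second argument.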

There are only a limited number of predefined integer types.
We can't expect each one to have a name that precisely
reflects all the purposes for which it can be used. A name like
"size_or_count_or_index_t" might be more accurate than "size_t",
but I like "size_t" just fine.

Ersek, Laszlo · Feb 22, 2010

That doesn't wash with me.

Putting the decrement in the body makes it less clear.

If a post decrement is too clever for the reader then so is using C.

for (i = 0; i < N; ++i) {
    arr[i];
}

is the same as (barring continue / break etc)

i = 0;
while (i < N) {
    arr[i];
    ++i;
}

and to reverse the traversal, I like to write the following:

i = N;
while (0 < i)
{
    --i;
    arr[i];
}

Reasons:

- The syntax of arr[i] doesn't change.

- The set of arr subscripts is the same as before ([0 .. N-1]), only in
reverse order.

- "i" traverses the exact same value set as before ([0 .. N]), only in
reverse order.

The formula

i = N;
while (i--) {
    arr[i];
}

contains one superfluous decrement and invalidates the last point: "i"
will finish with (type)-1, and the set of values visited by "i" will
become [0 .. N] U { (type)-1 }.
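
The difference in where i ends up can be checked directly. This is a sketch with invented function names; it uses size_t for the index, so (type)-1 is SIZE_MAX.

```c
#include <stddef.h>
#include <stdint.h>

/* Reverse traversal with the decrement in the body; returns the final i. */
size_t final_index_decrement_in_body(const int *arr, size_t n, long *sum)
{
    size_t i = n;
    while (0 < i) {
        --i;
        *sum += arr[i];      /* visits arr[n-1] down to arr[0] */
    }
    return i;                /* always 0 */
}

/* Reverse traversal with while (i--); returns the final i. */
size_t final_index_post_decrement(const int *arr, size_t n, long *sum)
{
    size_t i = n;
    while (i--) {
        *sum += arr[i];      /* same elements, same order */
    }
    return i;                /* wrapped around to SIZE_MAX */
}
```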

Cheers,
lacos

Seebs · Feb 22, 2010

I'm just trying to understand what (expert) people can deduce when
they're seeing an object whose type is size_t.

That it's non-negative.

For example, if you see the following declaration:

int do_something_on_an_array(struct foo array[], size_t len);

Does 'len' parameter imply a size in bytes of 'array' (the one that
sizeof() operator would return assuming the length of the array is
known) or does it mean the number of objects of type 'struct foo' in
'array'.

I would expect the latter.

The times when I'll assume len to be in bytes will be when it's associated
with a char * or void *. Otherwise, I assume it to be in the relevant unit.
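
A sketch of the two conventions side by side. The declaration of do_something_on_an_array comes from the question; the struct member, the function bodies, and clear_bytes are hypothetical.

```c
#include <stddef.h>
#include <string.h>

struct foo {
    int value;   /* hypothetical member, for illustration only */
};

/* len counts struct foo objects, the "relevant unit" for this array. */
int do_something_on_an_array(struct foo array[], size_t len)
{
    int total = 0;
    for (size_t i = 0; i < len; i++) {
        total += array[i].value;
    }
    return total;
}

/* With void *, len is a byte count, as with memset and memcpy. */
void clear_bytes(void *buf, size_t len)
{
    memset(buf, 0, len);
}
```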

-s

John Bode · Feb 22, 2010

The whole use of size_t is a load of bollox and c.l.c pedantry.

Use an int when you know you're indexing an array of "only a few"
elements.

This attitude has bitten me in the ass more than once in a mixed-
platform environment (where I spent most of the '90s and the first
half of the '00s); I was working on a machine with 32-bit ints, and
what started out as "only a few" wound up growing into "a metric
assload", where "a metric assload" > 2^16. I didn't realize there was
a problem until the next build went into testing, where the 16-bit int
machine started blowing up.

As a result of the time I wasted on that and other projects, I now
uniformly declare array indices as size_t, regardless of the array
size. Requirements change, environments change, and it just makes
sense to pick the one type that will work in any conceivable scenario,
rather than saying "I'll use an int here, an unsigned long there, and
size_t that other place, and if the requirements or operating
environment change I can always waste a little time to go back and
tweak the small stuff." Use the thing that works everywhere all the
time and be done with it; you've got bigger issues to worry about.

That's not pedantry, that's bitter experience.
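
To illustrate the failure mode without a 16-bit machine, here is a hedged sketch. It assumes nothing about the original project; the function and macro names are hypothetical, and 32767 is the smallest INT_MAX the standard permits, which is what a 16-bit int gives.

```c
#include <stddef.h>

/* Smallest INT_MAX the C standard allows (a 16-bit int). */
#define MIN_PORTABLE_INT_MAX 32767u

/* Returns 1 if every index of an n-element array fits in any conforming int. */
int indices_fit_in_any_int(size_t n)
{
    return n == 0 || n - 1 <= MIN_PORTABLE_INT_MAX;
}
```

An array that has grown to 70000 elements fails this check, so an int index is only safe for it on platforms where int happens to be wider than 16 bits.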

I know I find it more readable and just more natural. The
counter and index is an integer.

So is size_t (an integer, that is).

size_t is silly in all but the most extreme circumstances.

It's one of the few things I know I won't have to go back and change
later. That makes up for any silliness.

ImpalerCore · Feb 22, 2010

Hello,

I usually use 'unsigned int' type for variables which hold the length
of a buffer.

However, someone suggested that I use 'size_t'.

So I took a look at the C99 spec to see what it says about size_t:
it's the type of the value returned by sizeof() (6.5.3.4 p4) and
its max value is 65535 (7.18.3 p2).

size_t doesn't seem to be the good type to use when the variable of
that type describes the number of elements of a buffer whose type is
not 'char' and if the buffer size is less than 65535 bytes.

Is that correct ?

Thanks

I'll throw in my size_t issue here.

I have a few functions that use a 'ssize_t' parameter, where -1 is
used to indicate to use the whole string, or the end of the container,
i.e.

my_string_t* my_string_append_n( my_string_t* mstr, const char* str,
ssize_t n );

If n is -1, the entire string 'str' is appended to 'mstr'.

I use size_t on a regular basis, but what is the correct type for
ssize_t? I currently check something like HAVE_SSIZE_T, and if it
doesn't exist, define ssize_t as ptrdiff_t. Is this a viable way to
approach indexes that you want to allow negative values? If not, how
would you do it?
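
One way to sketch the fallback described above. HAVE_SSIZE_T is the configuration macro mentioned in the post; my_ssize_t and resolve_length are hypothetical names. The sketch does not settle whether ptrdiff_t is the right substitute; it only shows the mechanics of the -1 convention.

```c
#include <stddef.h>
#include <string.h>

#ifdef HAVE_SSIZE_T
#include <sys/types.h>          /* POSIX: provides ssize_t */
typedef ssize_t my_ssize_t;
#else
typedef ptrdiff_t my_ssize_t;   /* standard C signed type, used as a stand-in */
#endif

/*
 * Turns the "n or -1" convention into a plain size_t length:
 * a negative n means "use the whole string".
 */
size_t resolve_length(const char *str, my_ssize_t n)
{
    size_t full = strlen(str);

    if (n < 0) {
        return full;
    }
    /* Assumption: clamp to the string length rather than read past the end. */
    return ((size_t)n < full) ? (size_t)n : full;
}
```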

Thanks for the opinions.

Bill Cunningham · Feb 22, 2010

That doesn't wash with me.

Putting the decrement in the body makes it less clear.

If a post decrement is too clever for the reader then so is using C.

Use a debugger.

bartc · Feb 22, 2010

Richard Tobin said:
I once saw someone on Usenet mistakenly use 2^70 as the number of
atoms in the universe, instead of 10^70, which is out by a factor of
about 10^49.

And I'd heard it was 10^80. Which is enough material for ten thousand
million universes, if your figure is correct. And if mine is the correct
one, your universe would only run to about one galaxy.

Nick Keighley · Feb 23, 2010

Use a debugger.

no, don't
a debugger is not the right tool to learn what C constructs do

Nick Keighley · Feb 23, 2010

Assuming he's conversant on the finicky details of C, such as which
operations can overflow, the effects of shifting, etc, etc, etc, etc,
etc. And if he knows this, a little pre-inc or post-inc isn't going to
bother him a bit, nor is an implicit comparison to zero.

yes, it's hard to imagine a mathematician being knowledgeable about
rounding and overflow but not being capable of understanding pre- and
post- increment operators. I'd accept I might have to explain them to
him, once.

? this is a phrase that seems to get tossed about a lot. Isn't this
the "true Scotsman" fallacy? If they don't write C like me they aren't
*really* a C programmer.

The guy who said if he saw
i = i + 1;

he'd suspect the whole code base! What planet is he from?!

yes

I've been using C, personally and professionally, for, oh, a couple
decades and change and I have yet to encounter C code from anyone who was
not either a rank newbie or attempting obfuscation which used constructs
such as you suggest.

in what sense is this code obfuscated! There are nearly always several
reasonable ways to skin a cat in C. This C-as-APL isn't the only way
to code.

They may be out there, professional C coders who, for some reason, avoid
standard C idioms, but I don't know _where_ out there.

I block out a lot of algorithms in
Matlab, then port the timing-critical bits to C (mainly to avoid the
copy-in-copy-out that plagues matrix operations in Matlab). ++i does
nothing in Matlab, and i++ is a syntax error, so you see a lot of

i = i+1;

When porting, I'm not going to change that to ++i just for idiomatic
reasons.

Which would be fine in a module or routine documented as such: "Code
generated [in|by] MatLab; the use of non-idiomatic expressions is
annoying but functional."

good grief. If I saw a comment like that I'd think "pretentious ****",
but that's just me.

Nick Keighley · Feb 23, 2010

Send me your paycheck each pay period, and I'll send you back an order or
three less and we'll see.

I once heard someone to refer to the earth's population as 6 million.
To which I responded "where'd everyone else go?!"

gwowen · Feb 23, 2010

Why? A comment explaining that neophyte-like code looks that way for a
reason, not because the coder is, in fact a neophyte, would be very
helpful to maintainers.

Or, alternatively, just drop the assumption that "code that doesn't
look like mine is neophyte code". How about judging the quality of
the coder by the robustness and correctness of the code, rather than
whether they use certain syntactic idioms.

True, this will require more thought, but occasionally using thought
rather than dogma is beneficial.

Michael Foukarakis · Feb 23, 2010

Sadly, it's also less clear. It requires the reader to remember the
difference between --i and i--, and it requires them to be aware of
the implicit int-to-bool conversion.

If the reader doesn't remember those, maybe he/she is worried about
concepts different than the clarity of idioms.

Michael Foukarakis · Feb 23, 2010

and

c) size_t is just a very misleading name for
something that doesn't hold a size (ie index)

Since when is index a synonym for size?

Michael Foukarakis · Feb 23, 2010

Richard Heathfield said:

gwowen wrote:
Idioms is there for those as wants to count down:
size_t i = N;
while(i--)
is simpler, shorter, and more correcterer.
Sadly, it's also less clear.
I disagree. I would argue that it's a well-known idiom. Still, I
accept that there are arguments on both sides.
It requires the reader to remember the
difference between --i and i--, and it requires them to be aware of
the implicit int-to-bool conversion.
I would expect any serious C programmer to be aware of both of these
without having to think too strenuously about it, but the second at
least is easily dealt with:
while(i-- > 0)
It's idiomatic precisely
because, until you've seen it many times, it requires more thought
than should be necessary.
size_t i=N-1; // implicitly assume N!=0
do {
foo(i); // or more likely foo(bar[i-1])
i = i - 1; // or --i or i--, as you prefer.
} while(i != 0);
I find my version much easier to read. But then I would, wouldn't I?

I see yours coping with N==0, and the others not coping with it,
to be the black and white distinguisher.

Au contraire, Blackadder. As I posted earlier...
Having given it some thought, I now prefer this...

while(i != 0) { // Here i tells us how many loop iterations remain
    --i; // i now indexes the i'th element of an array....
    foo(bar[i]);
};

That loop will make anyone who reads your code cringe, for three
reasons:

1) A while(i != 0) loop with an array indexed by i needs special
handling, which adds to confusion. (not the case with do { } while(),
hint hint)
2) Pre-decrement (--i) as special case handling raises questions. Why
--i; over i--;, or the more verbose, less idiomatic and generally
clearer i = i - 1; ?
3) It's neither common usage nor idiomatic per popular use of the term
(natural to a specific group of people, usually large enough to even
be recognized as one).

People without good C skills will be more concerned with (1) and (2)
when trying to understand if the loop does indeed what it's supposed
to.

Why avoid common idioms if you're indeed trying to write
comprehensible code? This is a trivial thing we're talking about; a
loop. This:

for(i = 0; i < N; i++) {
    foo(bar[i]);
}

is a million (give or take) times clearer than your solution, is
probably what any C newbie has been taught first in the subject of
loops, and conveys exactly the intended behaviour. If you are so
desperate to put it in a while loop, you can use:

while(i++ < N)
    foo(bar[i - 1]); /* i has already been incremented by the test */

which does NOT raise the concerns you mention (implicit in-/equality
test, etc).
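
One detail worth checking when converting the for loop to that while form: the increment is part of the test, so by the time the body runs, i has already moved on. A minimal sketch with invented names, recording which values of i each body sees:

```c
#include <stddef.h>

/* Records the value of i seen by each iteration of the for loop. */
size_t seen_by_for(size_t n, size_t *seen)
{
    size_t i, count = 0;
    for (i = 0; i < n; i++) {
        seen[count++] = i;          /* 0, 1, ..., n-1 */
    }
    return count;
}

/* Same for while (i++ < n): the body runs after the increment. */
size_t seen_by_while(size_t n, size_t *seen)
{
    size_t i = 0, count = 0;
    while (i++ < n) {
        seen[count++] = i;          /* 1, 2, ..., n */
    }
    return count;
}
```

So an array access inside the while body needs i - 1 to reach the same elements as the for loop.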

To any programmer who is not
familiar with idiomatic C, and is used to writing a language that does
not have the --i idiom[0] is not clear. To someone familiar with
idiom, yes of course it is, but to anyone else unclear.

Yet you advocate for it. Why? Are you always this inconsistent with
yourself?

Seebs · Feb 23, 2010

Or, alternatively, just drop the assumption that "code that doesn't
look like mine is neophyte code". How about judging the quality of
the coder by the robustness and correctness of the code, rather than
whether they use certain syntactic idioms.

There isn't enough time in the world to give every piece of code the level
of review you'd give to something you knew was written by, say, Nilges,
or Bill Cunningham.

In practice, heuristics are an EXTREMELY effective way to allocate scarce
resources. The heuristic that certain kinds of quirky writing are a red
flag that the rest of the code will likely contain weirdness, errors, or
things that need careful re-reading to comprehend them, turns out to be
stunningly effective.

True, this will require more thought, but occasionally using thought
rather than dogma is beneficial.

It's nearly always beneficial. However, heuristics aren't dogma; they're just
a first pass to quickly spot cases where it's likely to be necessary to spend
extra time studying some code.

-s

Richard Bos · Feb 23, 2010

Malcolm McLean said:
size_t is an int designed by committee.
Nonsense...

The idea was that you would have a special type to hold amounts of
memory. Since, usually, the address space of a processor is the same
as the pointer width which is the same as an integer data register,

....because this is nonsense. There have been _many_ situations in which
sizeof(void *) != sizeof (size_t) != sizeof (int).
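
The three sizes are easy to inspect on any given implementation. This sketch only reports them; which of them differ depends entirely on the platform, and the struct and function names are made up.

```c
#include <stddef.h>
#include <stdio.h>

/* Sizes, in bytes, of the three types under discussion. */
struct type_sizes {
    size_t pointer_size;
    size_t size_t_size;
    size_t int_size;
};

struct type_sizes query_type_sizes(void)
{
    struct type_sizes s;
    s.pointer_size = sizeof(void *);
    s.size_t_size  = sizeof(size_t);
    s.int_size     = sizeof(int);
    return s;
}

void print_type_sizes(void)
{
    struct type_sizes s = query_type_sizes();
    /* %zu is the C99 conversion specifier for size_t. */
    printf("void*: %zu, size_t: %zu, int: %zu\n",
           s.pointer_size, s.size_t_size, s.int_size);
}
```

On a typical 64-bit Linux or Windows build this reports 8, 8 and 4, so int already differs from the other two.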

Richard

Richard Bos · Feb 23, 2010

gwowen said:
Your code is clear for people with good C skills.
My code is clear for people without good C skills, and clear (but non-
idiomatic) for those with good skills, and clear-but-hideous for C
mavens. I'm OK with that.

People without good C skills should learn a bit more before trying to
hack on a program written in C. If you want COBOL, you know where to
find it.

So, if you want your code understood as widely as possible, don't be a
vicar of Bray; grasp the nettle, and do Yeoman's service and all
things being equal, Bob's your uncle and you'll come up smelling of
roses... Otherwise you'll do a Devon Loch, be hoist by your own
petard, be gone for a right royal Burton, or otherwise come a
cropper. I wouldn't touch idiomatic English with a bargepole. It's
just not cricket.

And yet, I would prefer a novel written in English for literate readers
_not_ to eschew idioms. You should compare C to a Shaw play or a book by
Joyce. Do not write C as if you are Dr. Seuss - that's what BASIC is
for.

Richard
