usage of size_t

Tim Rentsch

santosh said:
Francis Moreau said:
Well, size_t is the type of the value returned by sizeof(). And
sizeof() returns the number of bytes (i.e. chars) of its operand. So I
assumed that size_t was introduced to represent a number of char.

As far as I know, the purpose of size_t seems to be to serve as a
portable type to hold the sizes of objects. However it's also the
only type guaranteed to hold the number of elements of an object, in
a strict sense. [snip]

Where in the Standard do you find this guarantee? As far
as I know there is no such requirement.
 
Tim Rentsch

I would expect any serious C programmer to be aware of both of these
without having to think too strenuously about it

I also think it's unclear, but not because the reader is likely to
be unaware of the difference. The trouble is that it seems natural
for a test at the top of a loop to be testing the value that will
be used in the loop, but here it is testing a different value.

I suppose you could use

for(i=N-1; i != (size_t)-1; i--)

but it's not pretty.

Why not just this:

for(i=N-1; i != -1; i--)

which works for any integer type, either signed or
unsigned, whose conversion rank is at least that of int.
 
Tim Rentsch

Malcolm McLean said:
Right, but size_t is maximally portable (whatever that means) while
unsigned long is not.
size_t is the only type that is guaranteed to be able to index any
array. [snip]

Oh? Which paragraphs in the Standard provide that guarantee?
 
Tim Rentsch

IIRC the only way to get two such pointers was by pointer manipulation
that would be considered to have undefined behaviour in ISO C, BICBW.

That seems highly unlikely, since the implementation-defined conversions
between pointers and integers almost certainly suffice to construct
the offending pointer values without relying on undefined behavior.
 
Seebs

Certainly that seems to be the expectation, but I don't
think it's required or necessarily guaranteed. I'm
pretty sure a conforming implementation could have
SIZE_MAX == 65535 but still allow

char too_big[ 100000 ];

for example. And that's only one way of getting a buffer of
more than SIZE_MAX bytes.

I can't see how. If it does that, what does it give you for sizeof(too_big)?

-s
 
Ian Collins

Tim said:
Malcolm McLean said:
Right, but size_t is maximally portable (whatever that means) while
unsigned long is not.
size_t is the only type that is guaranteed to be able to index any
array. [snip]

Oh? Which paragraphs in the Standard provide that guarantee?

The result type of the sizeof() operator is size_t, so the range of
size_t has to be large enough to index any array.
 
Ian Collins

Tim said:
Certainly that seems to be the expectation, but I don't
think it's required or necessarily guaranteed. I'm
pretty sure a conforming implementation could have
SIZE_MAX == 65535 but still allow

char too_big[ 100000 ];

for example. And that's only one way of getting a buffer of
more than SIZE_MAX bytes.

No, it couldn't. That would break sizeof().
 
Keith Thompson

Tim Rentsch said:
Malcolm McLean said:
Right, but size_t is maximally portable (whatever that means) while
unsigned long is not.
size_t is the only type that is guaranteed to be able to index any
array. [snip]

Oh? Which paragraphs in the Standard provide that guarantee?

size_t is the type yielded by the sizeof operator. The sizeof
operator may be applied to any type or expression, and yields the
size in bytes of the type or expression. There is no permission
for this to fail. Since the number of elements of an array cannot
exceed its size in bytes, it follows that the number of elements
can be represented as a size_t.

Finding the relevant paragraphs in the Standard is left as an
exercise. (You'll likely disagree with some of my reasoning anyway.)
 
Chris M. Thomasson

Richard Heathfield said:
Chris M. Thomasson wrote:


Chris: I'd like to give you a considered reply, but I lack the time right
now to do your question justice (e.g. by reading your article with
sufficient care). If you could possibly remind me in a few days, I'll do
my best to pay your article the attention it looks like it deserves.

I was thinking that one might want to special case a pointer wrt a
value-based container.
 
Tim Rentsch

Seebs said:
Certainly that seems to be the expectation, but I don't
think it's required or necessarily guaranteed. I'm
pretty sure a conforming implementation could have
SIZE_MAX == 65535 but still allow

char too_big[ 100000 ];

for example. And that's only one way of getting a buffer of
more than SIZE_MAX bytes.

I can't see how. If it does that, what does it give you for sizeof(too_big)?

It can because there is no syntax error and no constraint
violation. Probably the declaration itself is already undefined
behavior, but assuming it isn't, doing 'sizeof too_big' would
yield a value that the result type can't represent, which is to
say an exceptional condition, in other words undefined behavior.
Since any program containing such a declaration can never be
strictly conforming, the undefined behavior gives the
implementation license to define the behavior however it wants;
for example, the expression 'sizeof too_big' could be 100000,
despite that value being too large for a size_t variable to
contain.
 
Tim Rentsch

Ian Collins said:
Tim said:
Certainly that seems to be the expectation, but I don't
think it's required or necessarily guaranteed. I'm
pretty sure a conforming implementation could have
SIZE_MAX == 65535 but still allow

char too_big[ 100000 ];

for example. And that's only one way of getting a buffer of
more than SIZE_MAX bytes.

No, it couldn't. That would break sizeof().

Please see the explanation in my other most recent
response here (to Seebs's message).
 
Tim Rentsch

Keith Thompson said:
Tim Rentsch said:
Malcolm McLean said:
Right, but size_t is maximally portable (whatever that means) while
unsigned long is not.

size_t is the only type that is guaranteed to be able to index any
array. [snip]

Oh? Which paragraphs in the Standard provide that guarantee?

size_t is the type yielded by the sizeof operator. The sizeof
operator may be applied to any type or expression, and yields the
size in bytes of the type or expression. There is no permission
for this to fail.

It has to work only for programs with defined behavior.
Since the number of elements of an array cannot
exceed its size in bytes, it follows that the number of elements
can be represented as a size_t.

Finding the relevant paragraphs in the Standard is left as an
exercise. (You'll likely disagree with some of my reasoning anyway.)

Consider an implementation with SIZE_MAX == 65535, and supplying
an array definition

char too_big[ 100000 ] = {0};

to that implementation. This array definition is legal syntax
and contains no constraint violations -- right? Hence it may be
accepted, either because it's legal or because the implementation
has chosen to define an extension for the undefined behavior. If
it's undefined behavior we're already home free, so let's suppose
for a moment it isn't (i.e. assume temporarily the declaration is
legal). In that case 'sizeof too_big' would yield a value that
is outside the range of what size_t can represent, which is an
exceptional condition, which is undefined behavior. Hence the
implementation can allow such a declaration, yet size_t cannot
hold a value large enough to index it.

If you disagree with any of the above, would you be so kind
as to supply appropriate section citations or references?

Unless a program contains a syntax error or constraint violation,
it may be accepted without complaint by a conforming implementation.
Since any program with an array of 100000 bytes is never strictly
conforming, no required diagnostics plus undefined behavior
means the implementation is free to do whatever it wants.
 
Tim Rentsch

Ian Collins said:
Tim said:
Malcolm McLean said:
Right, but size_t is maximally portable (whatever that means) while
unsigned long is not.

size_t is the only type that is guaranteed to be able to index any
array. [snip]

Oh? Which paragraphs in the Standard provide that guarantee?

The result type of the sizeof() operator is size_t, so the range of
size_t has to be large enough to index any array.

I've responded on this in reply to Keith Thompson's followup.
 
Keith Thompson

Tim Rentsch said:
Keith Thompson said:
Tim Rentsch said:
Right, but size_t is maximally portable (whatever that means) while
unsigned long is not.

size_t is the only type that is guaranteed to be able to index any
array. [snip]

Oh? Which paragraphs in the Standard provide that guarantee?

size_t is the type yielded by the sizeof operator. The sizeof
operator may be applied to any type or expression, and yields the
size in bytes of the type or expression. There is no permission
for this to fail.

It has to work only for programs with defined behavior.
Since the number of elements of an array cannot
exceed its size in bytes, it follows that the number of elements
can be represented as a size_t.

Finding the relevant paragraphs in the Standard is left as an
exercise. (You'll likely disagree with some of my reasoning anyway.)

Consider an implementation with SIZE_MAX == 65535, and supplying
an array definition

char too_big[ 100000 ] = {0};

to that implementation. This array definition is legal syntax
and contains no constraint violations -- right? Hence it may be
accepted, either because it's legal or because the implementation
has chosen to define an extension for the undefined behavior. If
it's undefined behavior we're already home free, so let's suppose
for a moment it isn't (i.e. assume temporarily the declaration is
legal). In that case 'sizeof too_big' would yield a value that
is outside the range of what size_t can represent, which is an
exceptional condition, which is undefined behavior. Hence the
implementation can allow such a declaration, yet size_t cannot
hold a value large enough to index it.

If you disagree with any of the above, would you be so kind
as to supply appropriate section citations or references?

I think you're right. If the compiler accepts the above declaration
then the mathematical result of ``sizeof too_big'', 100000, cannot be
represented in the appropriate type, size_t. sizeof is just another
operator; its behavior on overflow should be the same as for any other
operator.

But wait, I just realized something very odd. size_t is an unsigned
type. C99 6.2.5p9 says:

A computation involving unsigned operands can never overflow,
because a result that cannot be represented by the resulting
unsigned integer type is reduced modulo the number that is one
greater than the largest value that can be represented by the
resulting type.

So one could argue that, given SIZE_MAX==65535, ``sizeof too_big''
must be 34464.

(On the other hand, ``sizeof too_big'' doesn't have unsigned
*operands*; it's the expression as a whole whose result is unsigned.)

I suspect the authors of the standard weren't thinking about "sizeof"
when they wrote 6.2.5p9.
Unless a program contains a syntax error or constraint violation,
it may be accepted without complaint by a conforming implementation.
Since any program with an array of 100000 bytes is never strictly
conforming, no required diagnostics plus undefined behavior
means the implementation is free to do whatever it wants.

In practice, I think most or all implementations sidestep the issue
by disallowing objects bigger than SIZE_MAX bytes -- or, to put it
another way, by making size_t big enough to represent the size of
any supported object. For example, given:

#include <stddef.h>
char huge[(unsigned long long)(size_t)-1];

gcc says:

c.c:2: error: size of array 'huge' is too large

But as you've argued, it doesn't *have* to reject it.
 
Keith Thompson

Keith Thompson said:
In practice, I think most or all implementations sidestep the issue
by disallowing objects bigger than SIZE_MAX bytes -- or, to put it
another way, by making size_t big enough to represent the size of
any supported object. For example, given:

#include <stddef.h>
char huge[(unsigned long long)(size_t)-1];

gcc says:

c.c:2: error: size of array 'huge' is too large

But as you've argued, it doesn't *have* to reject it.

I realize that's not the best possible example, since the expression
actually is small enough to fit in a size_t, and the cast to unsigned
long long is superfluous. (I had tried to use SIZE_MAX, but I forgot
that it's defined in <stdint.h>, not in <limits.h>.)

Here's a better example:

#include <stdint.h>
char huge[(uintmax_t)SIZE_MAX + 1];

which triggers the same error message.
 
Tim Rentsch

Keith Thompson said:
Tim Rentsch said:
Keith Thompson said:
Right, but size_t is maximally portable (whatever that means) while
unsigned long is not.

size_t is the only type that is guaranteed to be able to index any
array. [snip]

Oh? Which paragraphs in the Standard provide that guarantee?

size_t is the type yielded by the sizeof operator. The sizeof
operator may be applied to any type or expression, and yields the
size in bytes of the type or expression. There is no permission
for this to fail.

It has to work only for programs with defined behavior.
Since the number of elements of an array cannot
exceed its size in bytes, it follows that the number of elements
can be represented as a size_t.

Finding the relevant paragraphs in the Standard is left as an
exercise. (You'll likely disagree with some of my reasoning anyway.)

Consider an implementation with SIZE_MAX == 65535, and supplying
an array definition

char too_big[ 100000 ] = {0};

to that implementation. This array definition is legal syntax
and contains no constraint violations -- right? Hence it may be
accepted, either because it's legal or because the implementation
has chosen to define an extension for the undefined behavior. If
it's undefined behavior we're already home free, so let's suppose
for a moment it isn't (i.e. assume temporarily the declaration is
legal). In that case 'sizeof too_big' would yield a value that
is outside the range of what size_t can represent, which is an
exceptional condition, which is undefined behavior. Hence the
implementation can allow such a declaration, yet size_t cannot
hold a value large enough to index it.

If you disagree with any of the above, would you be so kind
as to supply appropriate section citations or references?

I think you're right. If the compiler accepts the above declaration
then the mathematical result of ``sizeof too_big'', 100000, cannot be
represented in the appropriate type, size_t. sizeof is just another
operator; its behavior on overflow should be the same as for any other
operator.

But wait, I just realized something very odd. size_t is an unsigned
type. C99 6.2.5p9 says:

A computation involving unsigned operands can never overflow,
because a result that cannot be represented by the resulting
unsigned integer type is reduced modulo the number that is one
greater than the largest value that can be represented by the
resulting type.

So one could argue that, given SIZE_MAX==65535, ``sizeof too_big''
must be 34464.

It would be if the value were converted, but it isn't; the value
just "is".
(On the other hand, ``sizeof too_big'' doesn't have unsigned
*operands*; it's the expression as a whole whose result is unsigned.)
Right.

I suspect the authors of the standard weren't thinking about "sizeof"
when they wrote 6.2.5p9.

Perhaps so, although actually I wasn't thinking of overflow
but an 'exceptional condition' in the sense of 6.5p5,

(that is, if the result is not mathematically defined or
not in the range of representable values for its type)

which definitely seems to apply to the situation here.
In practice, I think most or all implementations sidestep the issue
by disallowing objects bigger than SIZE_MAX bytes

Yes, in fact disallowing objects bigger than some number even
smaller than SIZE_MAX bytes, as you kind of mention in your
followup.
-- or, to put it
another way, by making size_t big enough to represent the size of
any supported object.

I think the first phrasing is more accurate, since even in an
implementation that doesn't allow declarations that would result in
objects larger than SIZE_MAX (and I agree that this is basically all
implementations), it might still be possible to obtain such objects
using extra-linguistic function calls. These objects might even be
directly accessible, for example using an (unsigned long long)
index.
For example, given:

#include <stddef.h>
char huge[(unsigned long long)(size_t)-1];

gcc says:

c.c:2: error: size of array 'huge' is too large

But as you've argued, it doesn't *have* to reject it.

Right. Most implementations do, but they don't have to.
 
Keith Thompson

Tim Rentsch said:
It would be if the value were converted, but it isn't; the value
just "is".

The value yielded by the sizeof operator "just is" of type size_t.
If the reduction modulo *_MAX+1 applies to sizeof, then it applies
just as it would for any other operator; no conversion is involved.
For example:

size_t s = 1;
s = -s;  /* No conversion happens, but the mathematical value
            -1 is reduced modulo SIZE_MAX+1 to SIZE_MAX */

[...]
Perhaps so, although actually I wasn't thinking of overflow
but an 'exceptional condition' in the sense of 6.5p5,

(that is, if the result is not mathematically defined or
not in the range of representable values for its type)

which definitely seems to apply to the situation here.

That certainly makes more sense for sizeof than reduction modulo
SIZE_MAX+1 does. Now all we need is a definitive statement that
6.5p5 applies to sizeof and 6.2.5p9 doesn't.

[...]
Yes, in fact disallowing objects bigger than some number even
smaller than SIZE_MAX bytes, as you kind of mention in your
followup.


I think the first phrasing is more accurate, since even in an
implementation that doesn't allow declarations that would result in
objects larger than SIZE_MAX (and I agree that this is basically all
implementations), it might still be possible to obtain such objects
using extra-linguistic function calls. These objects might even be
directly accessible, for example using an (unsigned long long)
index.

I suspect that an implementation that enabled the creation of such
huge objects would still make size_t big enough to represent their
size. If I were an implementer, I'd first decide how big an object
can be, then decide how big size_t needs to be based on that.

[...]
 
Tim Rentsch

Keith Thompson said:
Tim Rentsch said:
It would be if the value were converted, but it isn't; the value
just "is".

The value yielded by the sizeof operator "just is" of type size_t.
If the reduction modulo *_MAX+1 applies to sizeof, then it applies
just as it would for any other operator; no conversion is involved.
For example:

size_t s = 1;
s = -s;  /* No conversion happens, but the mathematical value
            -1 is reduced modulo SIZE_MAX+1 to SIZE_MAX */

[...]

Oh, I see what you're saying now. A conversion would be
sufficient, but isn't necessary.
That certainly makes more sense for sizeof than reduction modulo
SIZE_MAX+1 does. Now all we need is a definitive statement that
6.5p5 applies to sizeof and 6.2.5p9 doesn't.

6.2.5p9 does _not_ apply, arguably because it isn't a "computation",
but certainly because sizeof doesn't have unsigned operands (the
operand of sizeof is a type); it yields an unsigned result, but
doesn't have unsigned operand(s).
[...]
Yes, in fact disallowing objects bigger than some number even
smaller than SIZE_MAX bytes, as you kind of mention in your
followup.


I think the first phrasing is more accurate, since even in an
implementation that doesn't allow declarations that would result in
objects larger than SIZE_MAX (and I agree that this is basically all
implementations), it might still be possible to obtain such objects
using extra-linguistic function calls. These objects might even be
directly accessible, for example using an (unsigned long long)
index.

I suspect that an implementation that enabled the creation of such
huge objects would still make size_t big enough to represent their
size. If I were an implementer, I'd first decide how big an object
can be, then decide how big size_t needs to be based on that.

Ahh, but what I'm talking about is an _external_ library that
makes such objects available unbeknownst to the implementation.
So the implementation would be ignorant of the possibility of
over-large objects.
 
Keith Thompson

Tim Rentsch said:
Ahh, but what I'm talking about is an _external_ library that
makes such objects available unbeknownst to the implementation.
So the implementation would be ignorant of the possibility of
over-large objects.

I'm skeptical that that would even be possible in most
implementations. On the implementations I've seen, size_t is big
enough to span the machine's entire addressing space; an object
bigger than SIZE_MAX bytes isn't even possible.

The only plausible scenario I can think of is an implementation
that deliberately restricts the size of an object for some reason.
If some external magic provides the address of a huge object, it's
not clear that a program could even index it (an attempt to do so
presumably would invoke undefined behavior). I suspect we're in
DS9K territory.
 
James

Keith Thompson said:
I'm skeptical that that would even be possible in most
implementations. On the implementations I've seen, size_t is big
enough to span the machine's entire addressing space; an object
bigger than SIZE_MAX bytes isn't even possible.

Is this 100% guaranteed to return NULL:


assert(! calloc(2, SIZE_MAX));
 
