Tim Rentsch said:
Trying to regroup... The constraints in the above cases are in
some sense necessary because there is no reasonable way to
make sense of what's being expressed otherwise. I contend
that having objects larger than SIZE_MAX doesn't fall in
that category, because there is a way to make sense of
such objects; it's only if we insist on using values
of type size_t to hold their sizes or their lengths that
we run into trouble.
[...]
The fundamental question is this: can the size of any object *always*
be represented as a size_t?
The standard doesn't currently answer this question, at least not
clearly. It says that size_t is the type of the result of sizeof,
but there are objects whose size cannot be (directly) determined
using sizeof. We can't use malloc() to create an object bigger than
SIZE_MAX bytes, because malloc()'s size parameter is of type size_t.
We *might* be able to create such an object with calloc(SIZE_MAX,
2); on the other hand, an implementation needn't support such calls
(calloc() can fail and return NULL), and I suspect the inventor(s)
of calloc() didn't intend it for that purpose.
A new version of the standard *could* establish a new rule that no
object may exceed SIZE_MAX bytes. (It's been argued, unpersuasively
IMHO, that this rule is already implicit in the current standard.)
I think the following would be sufficient to establish this:
-- A non-VLA type whose size exceeds SIZE_MAX bytes is a constraint
violation.
-- A VLA type whose size exceeds SIZE_MAX bytes causes the program's
behavior to be undefined.
-- A call to calloc() where the mathematical product of the two
arguments exceeds SIZE_MAX must return a null pointer.
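To illustrate the third rule, here is a minimal sketch of how a program
could get that behavior today, whatever the underlying calloc() does.
The name strict_calloc is hypothetical (it is not a standard function),
and the sketch assumes only <stdint.h> for SIZE_MAX. The division avoids
computing the product itself, which would wrap around in size_t.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper, not part of the standard library: returns a
 * null pointer whenever the mathematical product nmemb * size would
 * exceed SIZE_MAX, and otherwise defers to calloc(). */
void *strict_calloc(size_t nmemb, size_t size)
{
    /* nmemb * size > SIZE_MAX  is equivalent to
     * size != 0 && nmemb > SIZE_MAX / size,
     * and the right-hand form cannot overflow. */
    if (size != 0 && nmemb > SIZE_MAX / size) {
        return NULL;
    }
    return calloc(nmemb, size);
}
```

With this wrapper, strict_calloc(SIZE_MAX, 2) always yields a null
pointer, so any object it does return has a size that fits in a size_t.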
Now the question is whether this would be a good idea, and that's
a matter of opinion. In my opinion, it would be. Implementations
could still support objects as large as they're able to; they'd just
have to define size_t appropriately. (All implementations I'm aware
of already do this.) If you create an object by calling calloc(),
you wouldn't have to worry about how to represent its size. If a
function takes a pointer to the first element of an array, you can
reliably use size_t to index it.
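As a small sketch of that last point: if no object can exceed SIZE_MAX
bytes, a function like the following can walk any array of unsigned char
it is handed, because every valid index fits in a size_t. The function
name count_nonzero and its parameters are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical example: counts the nonzero bytes in the array whose
 * first element is pointed to by first and which has count elements.
 * Under the proposed rule, count <= SIZE_MAX covers every possible
 * array, so a size_t index can reach every element. */
size_t count_nonzero(const unsigned char *first, size_t count)
{
    size_t nonzero = 0;
    for (size_t i = 0; i < count; i++) {
        if (first[i] != 0) {
            nonzero++;
        }
    }
    return nonzero;
}
```

If objects bigger than SIZE_MAX bytes were permitted, the count
parameter could not describe them, and the loop index would wrap around
before reaching the end.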
The alternative would be to permit objects bigger than SIZE_MAX
bytes, but such objects couldn't be created by malloc(), which
strikes me as an unduly arbitrary restriction. Usually, if I
want to create a really big object, malloc() is the way to do it.
Switching to calloc() because it can create a bigger object seems
silly, since it imposes the overhead of zeroing the object (which
could be significant for something that big).
I just think that a clear statement that size_t can represent the
size of any object (because that's what size_t is for) makes for
a cleaner language.
And if you want a collection of data larger than SIZE_MAX bytes,
you can always use a file.