malloc and maximum size

  • Thread starter V.Subramanian, India

Ben Bacarisse

Keith Thompson said:
In particular, I can argue that a conforming compiler can permit

typedef char[2 * SIZE_MAX] way_too_big;

Irrelevant to your point, but you mean, I think,

typedef char way_too_big[2 * SIZE_MAX];

<snip>
 

James Kuyper

I realize now that I passed over the paragraph above without commenting
on it. I disagree with its assertions. sizeof expressions have a type,
and size_t is a typedef for that type, but that's the definition of the
typedef, not the definition of the type. That type can be, and usually
is, a type that can already be named in some other manner by the user,
such as "unsigned long". It's defined in terms of its properties, not
by the fact that one of the things it's used for is as the result type
of sizeof expressions. It must be suitable for such use, or the typedef
is inappropriate, but its maximum value is a property of the type
itself, not a property of one particular use of that type.
Well, the maximum value returnable by sizeof has to be the same as the
maximum value representable by size_t:

sizeof(char[(size_t)-1]) == (size_t)(-1)

Assuming I understand what you're getting at, you're begging the
question here. [But the expression doesn't make sense to me, so
the assumption may well be wrong.]

I had not intended to beg the question. Per 6.3.1.3p2, (size_t)(-1) is
precisely "the maximum value that can be represented" as a size_t.

Then you were indeed begging the question; the equality is *exactly*
what I'm questioning.

I don't see how "begging the question" comes into play. I thought the
truth of that equality was perfectly clear, and did not present a
detailed argument for it until after I realized that you disagreed.
However, in that argument, I did not assume that equality to be true, I
generated that conclusion from various more fundamental features of C.
char[(size_t)(-1)] is an array type whose size in bytes is (size_t)(-1).
sizeof(char[(size_t)(-1)]) must therefore yield a value which is
(size_t)(-1). Therefore, the largest value that sizeof could possibly
yield must be (size_t)(-1), the same as the maximum representable value.

I see; I was assuming you were giving a C expression, since it looked
like one. Thanks for the explanation.

Yes, your assumption was correct; sizeof(char[(size_t)(-1)]) is indeed a
C expression.

Keith has already addressed your other points, though not exactly in the
fashion that I would have. It will simplify things if I respond to his
message, rather than yours.
 

James Kuyper

On 10/14/2011 07:27 PM, Keith Thompson wrote:
....
C99 5.2.4.1 discusses translation limits (127 nesting levels of blocks,
63 nesting levels of conditional inclusion, etc.). It requires an
implementation to accept 65535 bytes in an object, but that doesn't
apply to the above expression, since it doesn't refer to an object.

On the other hand, 5.2.4.1 doesn't even say that an implementation
must always accept 65535 bytes in an object. It says:

The implementation shall be able to translate and execute at least
one program that contains at least one instance of every one of the
following limits:

followed by a list of limits. By rejecting "sizeof(char[3500000000])",
gcc isn't violating any requirement in 5.2.4.1; presumably it does
translate and execute that one program (which doesn't include an
instance of "sizeof(char[3500000000])").

Presumably an implementation may legitimately reject a program violating
some limit not listed in 5.2.4.1, such as a trillion-line translation
unit. So I *think* that rejecting any program referring to a type
bigger than 2**31 bytes is permitted. But I'm not 100% certain.

As I'm sure you're already aware, 5.2.4.1 is one of my least favorite
clauses in the standard. Taking its words literally, what it fails to
promise renders virtually the entire rest of the standard meaningless;
this is just one example of that fact. Except when I'm arguing against
someone who disagrees on that point, I generally prefer to pretend that
5.2.4.1 says something that allows the rest of the standard to be
meaningful.

In this particular case, I'd prefer to believe that the standard's
requirement that sizeof(type) yield the size of the specified type
(6.5.3.4p2) always applies, except when it's impossible for it to do so,
because the size is too big to be represented by a size_t.
sizeof(char[(size_t)(-1)]) can be represented by a size_t, so it should
not be exempted from that requirement. I will, however, reluctantly
agree that the standard is not clear on that point.
 

James Kuyper

What (arguably) requires the implementation to accept
"sizeof(char[3500000000])" is the same thing that requires it to
accept "sizeof(char[10])", namely the sections discussing the sizeof
operator, the char type, integer constants, and array types.

Why should the compiler be forced to allow the definition of a type that
defines an object bigger than is allowed??

No type was defined by that expression.
Note that all valid object sizes must fit within a size_t,

More accurately, all sizeof expressions must have a result that fits
within a size_t. That's true whether they have a type argument or an
expression argument. It's true whether their expression argument is an
lvalue or an rvalue. It's true whether or not there's any actual object
referred to by the lvalue expression that is their argument.

It's also, apparently, true whether or not it's actually possible for
the size to be represented by a size_t, because the standard specifies
no exemption from that requirement just because it can't be met. By my
standards that constitutes a defect in the standard.
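
A minimal sketch of that point (an editorial illustration, not from the
original posts): sizeof accepts a type, an rvalue expression, or an
lvalue, and, except for variable length arrays, its operand is not
evaluated, so no object need exist:

#include <stdio.h>

int main(void) {
    int *p = NULL;                        /* no int object exists anywhere */

    printf("%zu\n", sizeof(char[10]));    /* type argument */
    printf("%zu\n", sizeof(1 + 1));       /* rvalue expression argument */
    printf("%zu\n", sizeof *p);           /* lvalue argument; *p is never
                                             evaluated, so no object is needed */
    return 0;
}

Each result has type size_t, and so must fit within a size_t.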
 

James Kuyper

On 10/14/2011 07:03 PM, Keith Thompson wrote:
....
Interesting. For the following program:

#include <stdio.h>
int main(void) {
    printf("%zu\n", sizeof (char[(size_t)-1]));
    return 0;
}

"gcc -std=c99" rejects it with an error message:

c.c: In function ‘main’:
c.c:3:5: error: size of unnamed array is too large

Change the array size to (size_t)-1/2 and it compiles and prints
"2147483647". Change it to (size_t)-1/2+1 and it rejects it with the
same error message.

Does this make "gcc -std=c99" non-conforming?

I've argued that such behavior would make it non-conforming, but the
implementation-limits issue is sufficiently foggy that it's not at all
clear.
 

James Kuyper

Do you mean 0xFFFFFF?


0xFF000000?

I spent a lot of time recently writing code with hex masks for 64-bit
types. The shorter, correct versions you've given above didn't look right
- they were too short. :-(
 

Richard Damon

What (arguably) requires the implementation to accept
"sizeof(char[3500000000])" is the same thing that requires it to
accept "sizeof(char[10])", namely the sections discussing the sizeof
operator, the char type, integer constants, and array types.

Why should the compiler be forced to allow the definition of a type that
defines an object bigger than is allowed??

No type was defined by that expression.
Note that all valid object sizes must fit within a size_t,

More accurately, all sizeof expressions must have a result that fits
within a size_t. That's true whether they have a type argument or an
expression argument. It's true whether their expression argument is an
lvalue or an rvalue. It's true whether or not there's any actual object
referred to by the lvalue expression that is their argument.

It's also, apparently, true whether or not it's actually possible for
the size to be represented by a size_t, because the standard specifies
no exemption from that requirement just because it can't be met. By my
standards that constitutes a defect in the standard.

If there exists a type/object whose size can not be expressed by sizeof,
then the compiler is going to be obligated to generate a diagnostic and
fail to compile the program. If it doesn't, it can't meet the
requirements of the standard. If it allowed the type/object to exist,
then it is going to need to rely on the ability to fail for exceeding an
implementation limit.

Due to the various problems that arise by having objects bigger than can
be expressed by size_t, it is better for the compiler to generate the
diagnostic on attempting to create the enormous type rather than wait
for the user to apply sizeof to it.
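
A sketch of that recommendation (the typedef name is illustrative; the
exact diagnostic wording varies by compiler): gcc, as reported elsewhere
in this thread, rejects such a type as soon as it is declared rather
than waiting for sizeof to be applied:

#include <stdint.h>

/* The mathematical size of this type is 2 * SIZE_MAX, which no size_t
   can represent; gcc diagnoses it at the point of declaration with a
   "too large" error. */
typedef char way_too_big[SIZE_MAX][2];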
 

Keith Thompson

Ben Bacarisse said:
Keith Thompson said:
In particular, I can argue that a conforming compiler can permit

typedef char[2 * SIZE_MAX] way_too_big;

Irrelevant to your point, but you mean, I think,

typedef char way_too_big[2 * SIZE_MAX];

<snip>

Yes, thanks.
 

Keith Thompson

James Kuyper said:
As I'm sure you're already aware, 5.2.4.1 is one of my least favorite
clauses in the standard. Taking its words literally, what it fails to
promise renders virtually the entire rest of the standard meaningless;
this is just one example of that fact. Except when I'm arguing against
someone who disagrees on that point, I generally prefer to pretend that
5.2.4.1 says something that allows the rest of the standard to be
meaningful.
[...]

The problem is that compilers can run out of resources in a plethora
of circumstances, and it's difficult or impossible to define a
coherent set of criteria such that you can reasonably require that *all*
programs meeting those criteria be "small" enough that we can
expect all compilers to accept them.

I find 5.2.4.1 a clever way to deal with it. An implementation
*could* satisfy the rather oddly stated requirement by cheating
(recognizing one specific program that meets all the limits and
does nothing, and treating it as "int main(void) { }"). But in the
absence of that, the easiest way to satisfy 5.2.4.1 is probably to
write a compiler that doesn't impose any fixed limits (as suggested
by the footnote).
 

Keith Thompson

Richard Damon said:
On 10/14/2011 08:41 PM, Richard Damon wrote: [...]
Note that all valid object sizes must fit within a size_t,

More accurately, all sizeof expressions must have a result that fits
within a size_t. That's true whether they have a type argument or an
expression argument. It's true whether their expression argument is an
lvalue or an rvalue. It's true whether or not there's any actual object
referred to by the lvalue expression that is their argument.

It's also, apparently, true whether or not it's actually possible for
the size to be represented by a size_t, because the standard specifies
no exemption from that requirement just because it can't be met. By my
standards that constitutes a defect in the standard.

If there exists a type/object whose size can not be expressed by sizeof,
then the compiler is going to be obligated to generate a diagnostic and
fail to compile the program. If it doesn't, it can't meet the
requirements of the standard. If it allowed the type/object to exist,
then it is going to need to rely on the ability to fail for exceeding an
implementation limit.

An implementation is *never* required to fail to compile a program
unless it contains a "#error" directive. Violations of syntax rules and
constraints require a diagnostic; once the diagnostic is issued, the
behavior is undefined.
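
For reference, a minimal sketch of the one construct that does require
translation to fail (C99 4p4):

/* The only directive an implementation must refuse to translate: */
#error "this translation unit must be rejected"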

But do you mean that a diagnostic is required even if the program
doesn't invoke sizeof? What requirements can't it meet?
Due to the various problems that arise by having objects bigger than can
be expressed by size_t, it is better for the compiler to generate the
diagnostic on attempting to create the enormous type rather than wait
for the user to apply sizeof to it.

C99 6.5p5:

If an _exceptional condition_ occurs during the evaluation of
an expression (that is, if the result is not mathematically
defined or not in the range of representable values for its
type), the behavior is undefined.

I think that applies to "sizeof" as much as it applies to "+".
Which means that a program that evaluates "sizeof (char[SIZE_MAX][2])"
has undefined behavior; no diagnostic is required. (On typical
implementations, writing the size as SIZE_MAX+1 would not express this:
the addition is done in an unsigned type, so SIZE_MAX+1 wraps to 0, and
a zero array size is a constraint violation.)
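
A sketch contrasting the two cases (how any particular compiler treats
either line is precisely what is in dispute in this thread):

#include <stdint.h>
#include <stdio.h>

int main(void) {
    /* Representable: the result fits in size_t, so if the type is
       accepted at all, the value must be SIZE_MAX. (gcc, as shown
       above, rejects the type anyway.) */
    printf("%zu\n", sizeof(char[SIZE_MAX]));

    /* Not representable: mathematically 2 * SIZE_MAX; per C99 6.5p5
       this is an exceptional condition, so the behavior is undefined
       and no diagnostic is required. */
    /* printf("%zu\n", sizeof(char[SIZE_MAX][2])); */

    return 0;
}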
 

James Kuyper

On 10/14/11 10:59 PM, James Kuyper wrote: ....

If there exists a type/object whose size can not be expressed by sizeof,
then the compiler is going to be obligated to generate a diagnostic and
fail to compile the program.

Citation, please? Diagnostics are mandatory only for syntax errors,
constraint violations, and #error directives. Which of those applies to
a program that uses the type char[2][SIZE_MAX]?

If that type is used to define an object, an implementation limit may be
exceeded, but no diagnostic is required when that happens. If the type
is used as an argument for sizeof(), what the standard specifies for the
behavior is an impossibility - but again, no diagnostic is required. If
the type is used in any other way, I don't see any special problems due
to the size of the type.
... If it doesn't, it can't meet the
requirements of the standard.

If such a type is used neither to define an object nor as the argument
of a sizeof expression, what are the requirements that you're referring
to which cannot be met?
 

James Kuyper

....
The problem is that compilers can run out of resources in a plethora
of circumstances, and it's difficult or impossible to define a
coherent set of criteria such that you can reasonably require that *all*
programs meeting those criteria be "small" enough that we can
expect all compilers to accept them.

I find 5.2.4.1 a clever way to deal with it. An implementation
*could* satisfy the rather oddly stated requirement by cheating
(recognizing one specific program that meets all the limits and
does nothing, and treating it as "int main(void) { }"). But in the
absence of that, the easiest way to satisfy 5.2.4.1 is probably to
write a compiler that doesn't impose any fixed limits (as suggested
by the footnote).

Oh, I understand the reasons why it's written that way. That doesn't
mean I have to like it.
 

Richard Damon

On 10/14/2011 07:27 PM, Keith Thompson wrote:
...
C99 5.2.4.1 discusses translation limits (127 nesting levels of blocks,
63 nesting levels of conditional inclusion, etc.). It requires an
implementation to accept 65535 bytes in an object, but that doesn't
apply to the above expression, since it doesn't refer to an object.

On the other hand, 5.2.4.1 doesn't even say that an implementation
must always accept 65535 bytes in an object. It says:

The implementation shall be able to translate and execute at least
one program that contains at least one instance of every one of the
following limits:

followed by a list of limits. By rejecting "sizeof(char[3500000000])",
gcc isn't violating any requirement in 5.2.4.1; presumably it does
translate and execute that one program (which doesn't include an
instance of "sizeof(char[3500000000])").

Presumably an implementation may legitimately reject a program violating
some limit not listed in 5.2.4.1, such as a trillion-line translation
unit. So I *think* that rejecting any program referring to a type
bigger than 2**31 bytes is permitted. But I'm not 100% certain.

As I'm sure you're already aware, 5.2.4.1 is one of my least favorite
clauses in the standard. Taking its words literally, what it fails to
promise renders virtually the entire rest of the standard meaningless;
this is just one example of that fact. Except when I'm arguing against
someone who disagrees on that point, I generally prefer to pretend that
5.2.4.1 says something that allows the rest of the standard to be
meaningful.

One interesting thing to note is that, in reading 5.2.4.1, the standard
does NOT give the implementation any right to fail to translate a
program that exceeds the limits. It can NOT be claimed to be undefined
or unspecified behavior, as the standard's definition of conformance is
NOT limited to these limits (the only restriction I find is that a
strictly conforming program can not violate those limits). The standard
requires that the compiler must be able to succeed on one given program,
but grants no permission to fail.

I suppose that this is really a defect in the standard, and the standard
needs some language to allow the implementation to establish limits, and
to generate a diagnostic and reject a program that exceeds these limits.

Exceeding implementation limits must NOT be considered undefined
behavior (unless the definition of implementation limits is vastly
improved), as currently you cannot know whether a given program exceeds
the implementation limits, so considering it undefined behavior, and not
requiring a diagnostic, means it is impossible to show any given
implementation to be "non-conforming", as any non-conforming behavior can
be blamed on exceeding some implementation limit.
 

Keith Thompson

James Kuyper said:
Oh, I understand the reasons why it's written that way. That doesn't
mean I have to like it.

Do you have ideas for improving it?
 

James Kuyper

Do you have ideas for improving it?

1. Require that any program that contains no syntax errors, constraint
violations, or undefined behavior, and which does not exceed any of
those limits, must be translated and, when executed, must produce
a) the behavior defined by the standard for that program, insofar as it
is defined by the standard;
b) the behavior defined by the implementation's documentation, insofar
as the behavior is implementation-defined;
c) behavior that is within the permitted range of possibilities, insofar
as it is unspecified.

2. Expand the list of implementation limits to include every feature a
program might possess that might make it difficult to satisfy that
requirement.

3. Lower the values of the implementation limits enough (but no more
than necessary) to make it acceptably easy to create an implementation
satisfying that requirement.

I can't fill in the details on items 2 and 3; that would have to be done
by people who are experts in compiler design, about which I know little.
When I've previously discussed ideas along this line, I've been told that
a) the list of implementation limits called for by item 2 would have to
be infinitely long.

b) to achieve item 3 the limits would have to be so low that EVERY
useful program would exceed at least one limit.

I doubt that those assertions are true, but I'm not enough of an expert
to be sure. a) is a serious flaw, if true. b) merely means that the best
we can do is not very good; it would still be better than the current
situation.
 

James Kuyper

On 10/14/11 10:51 PM, James Kuyper wrote: ....

One interesting thing to note is that, in reading 5.2.4.1, the standard
does NOT give the implementation any right to fail to translate a
program that exceeds the limits.

The standard requires acceptance of strictly conforming programs;
without defining what "accept" means. In ordinary English, I can accept
a gift without even opening it; and I can discard it after it accepting
it. Whether or not something else was actually intended, the failure to
provide any alternative to the common English defining of "accept"
renders that requirement toothless.

Note that, in any event, a program which exceeds the minimum
implementation limits in 5.2.4.1 is not strictly conforming.

5.2.4.1 is the ONLY clause that mandates that a conforming
implementation of C be able to translate and execute any program, and
that requirement only applies to the "one program", which is pretty
unlikely to be any program I ever care to execute. YMMV :)
... It can NOT be claimed to be undefined
or unspecified behavior, as the standard's definition of conformance is
NOT limited to these limits (the only restriction I find is that a
strictly conforming program can not violate those limits). The standard
requires that the compiler must be able to succeed on one given program,
but grants no permission to fail.

It also imposes no requirement that it succeed, with one worthless
exception.
I suppose that this is really a defect in the standard, and the standard
needs some language to allow the implementation to establish limits, and
to generate a diagnostic and reject a program that exceeds these limits.

Exceeding implementation limits must NOT be considered undefined
behavior (unless the definition of implementation limits is vastly
improved), as currently you cannot know whether a given program exceeds
the implementation limits, so considering it undefined behavior, and not
requiring a diagnostic, means it is impossible to show any given
implementation to be "non-conforming", as any non-conforming behavior can
be blamed on exceeding some implementation limit.

While I agree that exceeding an implementation limit does not
render the behavior undefined, it does cancel the ONLY requirement that
an implementation even "accept" the program, much less translate and
execute it.

Oddly enough, the one program that an implementation is required to be
able to translate and execute is one that happens to be permitted to
exceed the minimum implementation limits.
 

Seebs

This question is only for understanding purposes.

Good luck with that.
What is the reason for allocating so much less memory when the
parameter type of malloc is size_t?

I am having a really hard time answering this question, because I can't
figure out how it can be a question.

Imagine that a system has INT_MAX of roughly 2 billion (say, 2^31 - 1).

void
print_calendar(int year) {
    /* ... */
}

Do you expect this function to produce useful results for all values of
"year" between -2 billion and +2 billion? Why or why not?

The type may limit the possible range of values, but it is not the ONLY
limit on the range of possible values. Your question makes as much sense
as, observing that someone used "signed char" for a day of the week, asking
why there are not 127 days in a week now.

-s
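
A minimal sketch of the practical upshot for the original question:
SIZE_MAX bounds what a size_t can represent, not what malloc can
actually deliver, so the return value must always be checked:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    /* size_t can represent SIZE_MAX, but no realistic implementation
       can hand out that many bytes in a single object. */
    void *p = malloc(SIZE_MAX);
    if (p == NULL)
        printf("malloc(SIZE_MAX) failed, as it typically does\n");
    else
        free(p);    /* wildly unlikely, but handle it anyway */
    return 0;
}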
 

Tim Rentsch

Stephen Sprunk said:
SIZE_MAX, defined in stdint.h, is the largest value that can be
represented in type size_t. However, when I pass SIZE_MAX as the
argument to malloc, it is unable to allocate SIZE_MAX bytes of memory.

size_t must be capable of representing the size of the largest object
the implementation can create. [snip]

The Standard does not impose this requirement.
 

Tim Rentsch

James Kuyper said:
Certainly there's no difference in *most* cases.

For size_t, the type is just defined as the output type for sizeof,
which is defined in turn as yielding the size of its operand. So I can
see an argument that the "maximum value" of a size_t could be viewed as
the largest value that sizeof could possibly yield on the particular
implementation.

Well, the maximum value returnable by sizeof has to be the same as the
maximum value representable by size_t:

sizeof(char[(size_t)-1]) == (size_t)(-1)

It has been argued that, since it's not possible for the standard's
specifications for the value and type of sizeof(char[SIZE_MAX][2]) to be
simultaneously satisfied, an implementation is free, among other
possibilities, to treat it as having a value that is larger than
SIZE_MAX. Personally, I think such code should be rejected, but it's not
clear that the standard requires, or even allows, it to be rejected.

However, there's no possible basis that I'm aware of for rejecting
sizeof(char[(size_t)-1]).

Only strictly conforming programs are Standard-ly required to be
accepted. Any other programs (which this example presumably was on
the implementation in question) don't have to be.
 

Tim Rentsch

Lowell Gilbert said:
James Kuyper said:
On 10/14/2011 11:10 AM, Lowell Gilbert wrote:
...

What distinction do you see between "largest value representable" of a
type and "maximum value" for a type? The phrases seem synonymous to me.

Certainly there's no difference in *most* cases.

For size_t, the type is just defined as the output type for
sizeof, [snip]

That's backwards. The sizeof operator is defined as
returning a value of type size_t, not the other way
around. SIZE_MAX is the largest value the type can
hold, not the largest possible result of applying sizeof.
 
