malloc and maximum size

  • Thread starter V.Subramanian, India

Tim Rentsch

Lowell Gilbert said:
James Kuyper said:
On 10/14/2011 11:10 AM, Lowell Gilbert wrote:
...
Does SIZE_MAX have to be the largest value representable in a size_t?
The standard just says that it's the "maximum value" for the type,

What distinction do you see between "largest value representable" of a
type and "maximum value" for a type? The phrases seem synonymous to me.

Certainly there's no difference in *most* cases.

For size_t, the type is just defined as the output type for sizeof,
which is defined in turn as yielding the size of its operand. So I can
see an argument that the "maximum value" of a size_t could be viewed as
the largest value that sizeof could possibly yield on the particular
implementation.

Well, the maximum value returnable by sizeof has to be the same as the
maximum value representable by size_t:

sizeof(char[(size_t)-1]) == (size_t)(-1)

Assuming I understand what you're getting at, you're begging the
question here. [But the expression doesn't make sense to me, so
the assumption may well be wrong.]
It has been argued that, since it's not possible for the standard's
specifications for the value and type of sizeof(char[SIZE_MAX][2]) to be
simultaneously satisfied, an implementation is free, among other
possibilities, to treat it as having a value that is larger than
SIZE_MAX. Personally, I think such code should be rejected, but it's not
clear that the standard requires, or even allows, it to be rejected.

Right; this argument has gone around a few times, and I don't think
there's a definitive answer either.
However, there's no possible basis that I'm aware of for rejecting
sizeof(char[(size_t)-1]).

I assume you mean SIZE_MAX rather than (size_t)-1, in which case I agree.
But that doesn't mean SIZE_MAX has to be the largest value a size_t can
hold, if there are other limits on how big an object can be.
With respect to your claim "often is" - can you cite any implementation
where the equality expression given above is NOT true?

I think you're misreading my claim. There certainly are systems with
architectural limitations on how big an object can be, well short of
what fits in a size_t.

For example, the machine I'm posting from has 32-bit words, from which
it gets a 4-gigabyte range for its size_t. However, the operating
system reserves some of the address space, so no program can ever have
more than 3 gigabytes. As a result, sizeof can *never* return more than
3 gigabytes. This prompted me to wonder whether it would be legitimate
for the system's SIZE_MAX to be 3 gigabytes.

The type size_t is an unsigned integer type. The range of unsigned
integer types is specified in 6.2.6.1 p1; 3 gigabytes isn't one of
the possibilities. The range of unsigned types depends only on how
many value bits are present in their representation; nothing else.
That applies to size_t just as much as it does any other unsigned
integer type.

Does that help?
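[Editorial illustration, not part of the original post: a minimal sketch of
the point above. The maximum of an unsigned integer type is always of the
form 2**N - 1 for N value bits, i.e. a value that is all one bits; 3
gigabytes (0xC0000000) is not of that form, so it cannot be SIZE_MAX.
Assumes a C99 hosted environment with <stdint.h>.]

#include <stdint.h>
#include <stdio.h>

/* True iff x has the form 2**N - 1 (an unbroken run of low one bits). */
static int all_ones(uintmax_t x)
{
    return (x & (x + 1)) == 0;
}

int main(void)
{
    printf("SIZE_MAX   = %ju: %s\n", (uintmax_t)SIZE_MAX,
           all_ones((uintmax_t)SIZE_MAX) ? "2**N - 1" : "not 2**N - 1");
    printf("3 GB value = %ju: %s\n", (uintmax_t)0xC0000000u,
           all_ones((uintmax_t)0xC0000000u) ? "2**N - 1" : "not 2**N - 1");
    return 0;
}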
 

Tim Rentsch

James Kuyper said:
What (arguably) requires the implementation to accept
"sizeof(char[3500000000])" is the same thing that requires it to
accept "sizeof(char[10])", namely the sections discussing the sizeof
operator, the char type, integer constants, and array types.

Why should the compiler be forced to allow the definition of a type that
defines an object bigger than is allowed??

No type was defined by that expression.
Note that all valid object sizes must fit within a size_t,

More accurately, all sizeof expressions must have a result that fits
within a size_t. [snip]

The Standard does not impose such a requirement.
It's also, apparently, true whether or not it's actually possible for
the size to be represented by a size_t, because the standard specifies
no exemption from that requirement just because it can't be met. By my
standards that constitutes a defect in the standard.

There's no defect; it's simply undefined behavior, under 6.5 p5.
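[Editorial sketch, not from the original exchange: the case being argued
about can be written down directly. The mathematical size of
char[SIZE_MAX][2] is 2 * SIZE_MAX, which cannot fit in a size_t, so on the
reading above the sizeof evaluation falls under 6.5p5; in practice a
compiler may instead reject the declaration outright, as gcc is shown to
do later in this thread.]

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* The element count SIZE_MAX is representable, but the total size
       2 * SIZE_MAX is not; expect either undefined behavior (6.5p5) or
       rejection at translation time. */
    printf("%zu\n", sizeof (char[SIZE_MAX][2]));
    return 0;
}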
 

Tim Rentsch

Richard Damon said:
On 10/14/11 7:27 PM, Keith Thompson wrote:

What (arguably) requires the implementation to accept
"sizeof(char[3500000000])" is the same thing that requires it to
accept "sizeof(char[10])", namely the sections discussing the sizeof
operator, the char type, integer constants, and array types.


Why should the compiler be forced to allow the definition of a type that
defines an object bigger than is allowed??

No type was defined by that expression.
Note that all valid object sizes must fit within a size_t,

More accurately, all sizeof expressions must have a result that fits
within a size_t. That's true whether they have a type argument or an
expression argument. It's true whether their expression argument is an
lvalue or an rvalue. It's true whether or not there's any actual object
referred to by the lvalue expression that is their argument.

It's also, apparently, true whether or not it's actually possible for
the size to be represented by a size_t, because the standard specifies
no exemption from that requirement just because it can't be met. By my
standards that constitutes a defect in the standard.

If there exists a type/object whose size can not be expressed by
sizeof, then the compiler is going to be obligated to generate a
diagnostic and fail to compile the program.

In fact an implementation isn't obligated to do either of those.
The program could just work.
If it doesn't, it can't
meet the requirements of the standard. If it allowed the type/object
to exist, then it is going to need to rely on the ability to fail for
exceeding an implementation limit.

I think you're imagining a requirement that the Standard doesn't
actually impose.

Due to the various problems that arise by having objects bigger than
can be expressed by size_t, it is better for the compiler to generate
the diagnostic on attempting to create the enormous type rather than
wait for the user to apply sizeof to it.

Probably true, but that's a QOI question, not a conformance
question.
 

Tim Rentsch

James Kuyper said:
On 10/14/2011 07:27 PM, Keith Thompson wrote:
...
C99 5.2.4.1 discusses translation limits (127 nesting levels of blocks,
63 nesting levels of conditional inclusion, etc.). It requires an
implementation to accept 65535 bytes in an object, but that doesn't
apply to the above expression, since it doesn't refer to an object.

On the other hand, 5.2.4.1 doesn't even say that an implementation
must always accept 65535 bytes in an object. It says:

The implementation shall be able to translate and execute at least
one program that contains at least one instance of every one of the
following limits:

followed by a list of limits. By rejecting "sizeof(char[3500000000])",
gcc isn't violating any requirement in 5.2.4.1; presumably it does
translate and execute that one program (which doesn't include an
instance of "sizeof(char[3500000000])").

Presumably an implementation may legitimately reject a program violating
some limit not listed in 5.2.4.1, such as a trillion-line translation
unit. So I *think* that rejecting any program referring to a type
bigger than 2**31 bytes is permitted. But I'm not 100% certain.

As I'm sure you're already aware, 5.2.4.1 is one of my least favorite
clauses in the standard. Taking its words literally, what it fails to
promise renders virtually the entire rest of the standard meaningless;
[snip]

I'm sure that's true for some, but most people who read
the Standard have no real problem with it. Show of hands,
anyone?
 

Tim Rentsch

Richard Damon said:
On 10/14/2011 07:27 PM, Keith Thompson wrote:
...
C99 5.2.4.1 discusses translation limits (127 nesting levels of blocks,
63 nesting levels of conditional inclusion, etc.). It requires an
implementation to accept 65535 bytes in an object, but that doesn't
apply to the above expression, since it doesn't refer to an object.

On the other hand, 5.2.4.1 doesn't even say that an implementation
must always accept 65535 bytes in an object. It says:

The implementation shall be able to translate and execute at least
one program that contains at least one instance of every one of the
following limits:

followed by a list of limits. By rejecting "sizeof(char[3500000000])",
gcc isn't violating any requirement in 5.2.4.1; presumably it does
translate and execute that one program (which doesn't include an
instance of "sizeof(char[3500000000])").

Presumably an implementation may legitimately reject a program violating
some limit not listed in 5.2.4.1, such as a trillion-line translation
unit. So I *think* that rejecting any program referring to a type
bigger than 2**31 bytes is permitted. But I'm not 100% certain.

As I'm sure you're already aware, 5.2.4.1 is one of my least favorite
clauses in the standard. Taking its words literally, what it fails to
promise renders virtually the entire rest of the standard meaningless;
this is just one example of that fact. Except when I'm arguing against
someone who disagrees on that point, I generally prefer to pretend that
5.2.4.1 says something that allows the rest of the standard to be
meaningful.

One interesting thing to note, is that in reading 5.2.4.1, the
standard does NOT give the implementation any right to fail to
translate a program that exceeds the limits. It can NOT be claimed to
be undefined or unspecified behavior, as the standard's definition of
conformance is NOT limited to these limits (the only restriction I
find is that a strictly conforming program can not violate those
limits). The standard requires that the compiler must be able to
succeed on one given program, but grants no permission to fail.
[snip elaboration]

All that matters is that the Standard doesn't impose any requirement
to _accept_ an arbitrary program, except strictly conforming ones.
In the absence of any stated requirement on this axis, an
implementation is free to do as it chooses and still be
conforming.
 

Kaz Kylheku

James Kuyper said:
As I'm sure you're already aware, 5.2.4.1 is one of my least favorite
clauses in the standard. Taking its words literally, what it fails to
promise renders virtually the entire rest of the standard meaningless;
[snip]

I'm sure that's true for some, but most people who read
the Standard have no real problem with it. Show of hands,
anyone?

5.2.4.1 simply gives a set of requirements in such a way that it is clear
that all of the limits are independent. An implementation should be able to
translate a program which exercises each of the limits /simultaneously/.
That is the point. The implementation cannot assert that, oops, since you
declared an object of 65535 bytes, you can only have 63 nesting levels of
blocks, and not the required 127.

At the same time, it is not right to require an implementation to be able to
handle EVERY program which contains an instance of each of the limits. That
category includes arbitrarily large programs. Moreover, it is not a testable
requirement because the set of programs is infinite.

Should the requirement say "three programs"? Fifty? One program is enough to
show that the limits can be hit concurrently.

There is clearly no intent there that conformance is hinged to a single test
case which exercises some limits. Such a program does not have anywhere near
the required coverage anyway to properly test a compiler. Nowhere is it
stated that if the implementation translates at least one such a program, it is
blessed as conforming and the job is done. "shall" does not mean "shall only".

The standard does not provide any acceptance test suite (and that does not
make it meaningless either). It is meant to be used as a kind of "contract"
between implementors and users. It's clear to everyone involved that there are
numerous detailed requirements given in the standard, and that users want to
see the requirements implemented earnestly so that a wide range of programs can
be translated and not just a few hand-picked ones. It's up to the parties to
decide what constitutes an acceptance test. There often isn't one. If the
compiler isn't freeware, you may have some customer support, and failing that,
a "money back" type warranty where liability is limited to a refund of the
purchase price.
 

Tim Rentsch

James Kuyper said:
[snip]

Oddly enough, the one program that an implementation is required to be
able to translate and execute is one that happens to be permitted to
exceed the minimum implementation limits.

Silly. Programs don't need "permission" to exceed minimum
implementation limits. Whether an implementation accepts such
programs or not is up to it, either for its distinguished
program or any other one.
 

Tim Rentsch

Keith Thompson said:
James Kuyper said:
Not even sizeof(char[3500000000])? For an implementation where
(size_t)(-1) is 4294967295, what gives that implementation permission to
do anything with that expression other than yield a value of
(size_t)3500000000? The implementation is free to issue a diagnostic, of
course, but not to reject the code.

Interesting. For the following program:

#include <stdio.h>
int main(void) {
    printf("%zu\n", sizeof (char[(size_t)-1]));
    return 0;
}

"gcc -std=c99" rejects it with an error message:

c.c: In function 'main':
c.c:3:5: error: size of unnamed array is too large

Change the array size to (size_t)-1/2 and it compiles and prints
"2147483647". Change it to (size_t)-1/2+1 and it rejects it with the
same error message.

Does this make "gcc -std=c99" non-conforming?

No, because there is no requirement that this program
be accepted.
 

Tim Rentsch

James Kuyper said:
Do you have ideas for improving it?

1. Require that any program that contains no syntax errors,
constraint violations, or undefined behavior, and which does not exceed
any of those limits, must be translated and, when executed, must produce
a) the behavior defined by the standard for that program, insofar as it
is defined by the standard
b) the behavior defined by the implementation's documentation, insofar
as the behavior is implementation-defined
c) behavior that is within the permitted range of possibilities, insofar
as it is unspecified.

2. Expand the list of implementation limits to include every feature a
program might possess that might make it difficult to satisfy that
requirement.

3. Lower the values of the implementation limits enough (but no more
than necessary) to make it acceptably easy to create an implementation
satisfying that requirement. [snip incidental]

Interesting idea, but quite impractical.
 

Kaz Kylheku

1. Require that any program that contains no syntax errors,
constraint violations, or undefined behavior, and which does not exceed
any of those limits, must be translated and, when executed, must produce

This appears superfluous, because this notion of conformance follows from
all the other requirements. What you've written adds up to "an implementation
shall be conforming" (i.e. there exists no valid test case or other proof
method that finds it nonconforming.)

It already follows from the rest of the document. Whenever you take a
program that is within the limits, you can use the standard to deduce what that
program's behavior should be (if that program is also portable and has a
well-defined behavior). If you don't get that behavior, then you have hit upon
a test case indicating nonconformity (and you know that with or without
the proposed paragraph).

Moreover, the proposed paragraph troublingly suggests that the implementors
must validate their work by an astonishingly vast set of test cases (that will
fill more than the world's total hard drive storage, and yet which fails to
represent real-world situations) or failing that, prove it correct with
formal methods.
 

Kaz Kylheku

Keith Thompson said:
James Kuyper said:
Not even sizeof(char[3500000000])? For an implementation where
(size_t)(-1) is 4294967295, what gives that implementation permission to
do anything with that expression other than yield a value of
(size_t)3500000000? The implementation is free to issue a diagnostic, of
course, but not to reject the code.

Interesting. For the following program:

#include <stdio.h>
int main(void) {
    printf("%zu\n", sizeof (char[(size_t)-1]));
    return 0;
}

"gcc -std=c99" rejects it with an error message:

c.c: In function 'main':
c.c:3:5: error: size of unnamed array is too large

Change the array size to (size_t)-1/2 and it compiles and prints
"2147483647". Change it to (size_t)-1/2+1 and it rejects it with the
same error message.

Does this make "gcc -std=c99" non-conforming?

No, because there is no requirement that this program
be accepted.

If for every feature of the above program, we can deduce what the behavior
should be (and thus in doing so we must find that it does not violate any
limit, or invoke any UB) then we can deduce that this program must be accepted,
or else it represents a test case that shows the implementation to be
nonconforming.

I don't believe that it violates any limits or invokes any UB, which leads me
to conclude that it does show the above gcc installation to be nonconforming.

This all follows from the software engineering sense of what it means to have a
test case, what requirements are, and what it means for a test case to fail.

It might be a test case that is considered "degenerate", which is a code word
for "something that the users probably won't run into or care about".
Or that it's not a "showstopper bug".

But that sort of negotiation or classification of defects is outside of the
standard.

Now if you make a program which stresses each of the limits and gcc chokes on
it, then you have not proven a nonconformity. This is because 5.2.4.1
has the effect of excusing the implementors from handling /all/ such programs.
They have to handle only one, and it can be any program from that set;
not restricted to one that is chosen by the users of the implementation.

If you change your program so that it does not exercise all of the limits
(maybe by scaling back just one of them) and if you can still reproduce the
failure, then you have found a nonconformance.
 

Shao Miller

This question is only for understanding purposes.

[...]

What is the reason for allocating this much less memory while the
parameter type of malloc is size_t?

Normally, what factor drives the maximum size of a single chunk of
memory allocated by malloc?

Please explain.

I believe it's up to the implementation to a large extent. An
implementation might have 'size_t' sufficient to represent all of memory
with a flat memory model. It might then allow you to create a pointer
that points anywhere in memory like this:

unsigned char * byte_at_some_address = (unsigned char *) (size_t) XXX;

Or to represent an arbitrary pointer as an unsigned integer:

size_t address = (size_t) (void *) some_ptr;

If you want to test some limits, you could try this:

/* For use with C89 */

#include <stddef.h>
#include <stdlib.h>

static void * biggest_buf(void);

int main(void) {
    char * foo;
    char * bar;
    char * baz;

    foo = biggest_buf();
    bar = biggest_buf();
    baz = biggest_buf();
    free(foo);
    free(bar);
    free(baz);
    return EXIT_SUCCESS;
}

static void * biggest_buf(void) {
    char * big_buf;
    size_t max_size, big_size, test_size, highest_test;

    /* Find the maximum size_t */
    max_size = 0;
    --max_size;

    /* The switch jumps straight to the 'default:' label inside the loop,
       which initializes the search bounds before the first malloc attempt */
    switch (1) while (1) {
        /* big_size <= test_size <= highest_test */
        big_buf = malloc(test_size);
        if (!big_buf) {
            if (test_size == big_size) {
                /* Maximum allocation has changed. Start over */
                default:
                test_size = highest_test = max_size;
                big_size = 0;
                continue;
            }
            /*
             * We couldn't allocate a bigger buffer than last time.
             * Split the difference and try a smaller allocation
             */
            highest_test = test_size;
            test_size = (test_size - big_size) / 2 + big_size;
            continue;
        }

        /* Check if we've found the biggest allocation */
        if (test_size == big_size) {
            /* All done */
            break;
        }

        /* Otherwise, we might be able to allocate more */
        free(big_buf);
        big_size = test_size;
        test_size = (highest_test - big_size) / 2 + big_size;
        continue;
    }
    return big_buf;
}
 

Shao Miller

This question is only for understanding purposes.

[...]

What is the reason for allocating this much less memory while the
parameter type of malloc is size_t?

Normally, what factor drives the maximum size of a single chunk of
memory allocated by malloc?

Please explain.

Oh and just in case nobody else mentioned it yet, 'malloc' is supposed
to return a contiguous range of memory. So if you have 4 GiB of RAM (or
whatever), you might expect that there's going to be some stuff
occupying some of it, and the maximum _contiguous_ chunk won't even be
equal to the amount of available/unused RAM, as there'll probably be
some fragmentation.
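[Editorial sketch, not part of the original post: one quick way to see the
effect of contiguity and fragmentation is to compare a single large request
against the same total amount asked for in smaller pieces. The 2 GiB figure
is an arbitrary example and assumes size_t is at least 32 bits; on a
fragmented or address-space-limited system the split requests may succeed
where the single one fails.]

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    size_t big = (size_t)1 << 31;      /* one 2 GiB request (example size) */
    size_t i, got = 0;
    void *one;
    void *parts[4] = {0};

    one = malloc(big);
    printf("one 2 GiB chunk:     %s\n", one ? "ok" : "failed");
    free(one);

    for (i = 0; i < 4; i++) {          /* same total, in 512 MiB pieces */
        parts[i] = malloc(big / 4);
        if (parts[i] != NULL)
            got++;
    }
    printf("four 512 MiB chunks: %lu of 4 succeeded\n", (unsigned long)got);
    for (i = 0; i < 4; i++)
        free(parts[i]);
    return 0;
}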
 

Keith Thompson

Tim Rentsch said:
Only strictly conforming programs are Standard-ly required to be
accepted. Any other programs (which this example presumably was on
the implementation in question) don't have to be.

I think C99 4p3 contradicts that:

A program that is correct in all other aspects, operating on
correct data, containing unspecified behavior shall be a correct
program and act in accordance with 5.1.2.3.
 

Tim Rentsch

Kaz Kylheku said:
James Kuyper said:
As I'm sure you're already aware, 5.2.4.1 is one of my least favorite
clauses in the standard. Taking its words literally, what it fails to
promise renders virtually the entire rest of the standard meaningless;
[snip]

I'm sure that's true for some, but most people who read
the Standard have no real problem with it. Show of hands,
anyone?

5.2.4.1 simply gives a set of requirements in such a way that it is clear
that all of the limits are independent. An implementation should be able to
translate a program which exercises each of the limits /simultaneously/.
That is the point. The implementation cannot assert that, oops, since you
declared an object of 65535 bytes, you can only have 63 nesting levels of
blocks, and not the required 127. [snip some elaboration of that]

There is clearly no intent there that conformance is hinged to a single test
case which exercises some limits. [snip]

5.2.4.1 simply establishes a lower bound, and in my opinion a
fairly reasonable one. Everything above the 5.2.4.1 lower bound
is therefore relegated to QOI, which I think makes a lot of
sense, and certainly more sense than going to the other extreme.
The purpose of the Standard is to define the language (albeit
indirectly, by giving requirements for implementations of the
language); it is not to establish what constitutes a minimally
"acceptable" implementation, which is better left to some other
arena.
 

Tim Rentsch

Kaz Kylheku said:
Keith Thompson said:
[...]
Not even sizeof(char[3500000000])? For an implementation where
(size_t)(-1) is 4294967295, what gives that implementation permission to
do anything with that expression other than yield a value of
(size_t)3500000000? The implementation is free to issue a diagnostic, of
course, but not to reject the code.

Interesting. For the following program:

#include <stdio.h>
int main(void) {
    printf("%zu\n", sizeof (char[(size_t)-1]));
    return 0;
}

"gcc -std=c99" rejects it with an error message:

c.c: In function 'main':
c.c:3:5: error: size of unnamed array is too large

Change the array size to (size_t)-1/2 and it compiles and prints
"2147483647". Change it to (size_t)-1/2+1 and it rejects it with the
same error message.

Does this make "gcc -std=c99" non-conforming?

No, because there is no requirement that this program
be accepted.

If for every feature of the above program, we can deduce what the behavior
should be (and thus in doing so we must find that it does not violate any
limit, or invoke any UB) then we can deduce that this program must be accepted,
or else it represents a test case that shows the implementation to be
nonconforming. [snip elaboration]

I don't find any support for this conclusion in the Standard. The
Standard _does_ require that implementations accept any strictly
conforming program, but the program given above is not strictly
conforming. I'm not aware of any requirement in the Standard
that any program other than strictly conforming ones be accepted.
There is the one distinguished program (of each implementation's
choosing) that must be accepted and executed successfully, but the
program above cannot be that.

In the absence of a requirement to accept a particular program, I
believe implementations are not required to accept it, especially
since for some programs there is such a requirement.
 

Tim Rentsch

Keith Thompson said:
I think C99 4p3 contradicts that:

A program that is correct in all other aspects, operating on
correct data, containing unspecified behavior shall be a correct
program and act in accordance with 5.1.2.3.

You bring up a good point. I suppose one could argue that a
non-strictly conforming program that an implementation chooses
not to accept is not "correct in all other aspects", but
certainly the issue is open to debate.

However, even if we grant this exception, it is very narrow.
It speaks only to the possibility of unspecified behavior,
and not (for example) a program exceeding some minimum level
of required implementation limit, so 4p3 doesn't apply to
the program (not shown) under discussion.
 
