calloc/free: a perplexing observation


boris

Hi!

I'm seeking some answers about what seems to be a memory leak.

I have a loop that looks much like this:
double *largeArray = (double*) calloc();
for (...) {
    printf("iteration #...\n");
    for (...) {
        double *foo = (double*) calloc();
        .....
        .....
        largeArray[someIndex] = something;
        free(foo);
    }
}

Though the actual code is larger, it only differs in 20+ lines of
trivial math performed on stack variables.

Clearly, foo cannot be leaking since it's being freed (and no, it
cannot be allocated outside of the loop, since its size varies each
time).
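
For concreteness, a compilable version of the pattern would look
something like this -- the sizes, bounds and index arithmetic here are
placeholders I have made up, not the real ones:

#include <stdio.h>
#include <stdlib.h>

#define N 4000                /* placeholder dimension */

int main(void)
{
    size_t i, j;
    double *largeArray = calloc((size_t)N * N, sizeof *largeArray);

    if (largeArray == NULL) {
        fprintf(stderr, "calloc of largeArray failed\n");
        return EXIT_FAILURE;
    }

    for (i = 0; i < N; i++) {
        printf("iteration #%lu\n", (unsigned long)i);
        for (j = 0; j < N; j++) {
            /* scratch buffer whose size varies from pass to pass */
            double *foo = calloc(j + 1, sizeof *foo);

            if (foo == NULL) {
                fprintf(stderr, "calloc of foo failed\n");
                free(largeArray);
                return EXIT_FAILURE;
            }
            /* ... trivial math on stack variables ... */
            foo[0] = (double)(i + j);
            largeArray[i * (size_t)N + j] = foo[0];
            free(foo);
        }
    }

    free(largeArray);
    return 0;
}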

Now, when I monitor memory usage with top it grows relatively quickly
(300K per pass over the outer loop), thus there ought to be a memory
leak. At first I thought that the "largeArray" was being optimized not
to calloc all at once, but rather on demand, page by page (which would
be bizarre) but now I believe that might not be the case since the
"largeArray" is about 4000*4000 of double which should be about 16MB -
and I see usage of > 100MB after a few hundred iterations.

I'm using gcc 3.2.2 on i*86 Linux.
Any guesses would be appreciated.

Thanks!

Boris
 

Richard Tobin

"largeArray" is about 4000*4000 of double which should be about 16MB -

4000*4000 doubles is 16M * sizeof(double), which is 128MB if you have
8-byte doubles.

-- Richard
 

j

Hi!

I'm seeking some answers about what seems to be a memory leak.

I have a loop that looks much like this:
double *largeArray = (double*) calloc();
for (...) {
    printf("iteration #...\n");
    for (...) {
        double *foo = (double*) calloc();
        ....
        ....
        largeArray[someIndex] = something;
        free(foo);
    }
}

Though the actual code is larger, it only differs in 20+ lines of
trivial math performed on stack variables.

Clearly, foo cannot be leaking since it's being freed (and no, it
cannot be allocated outside of the loop, since its size varies each
time).

Now, when I monitor memory usage with top it grows relatively quickly
(300K per pass over the outer loop), thus there ought to be a memory
leak. At first I thought that the "largeArray" was being optimized not
to calloc all at once, but rather on demand, page by page (which would
be bizarre) but now I believe that might not be the case since the
"largeArray" is about 4000*4000 of double which should be about 16MB -
and I see usage of > 100MB after a few hundred iterations.

I'm using gcc 3.2.2 on i*86 Linux.
Any guesses would be appreciated.

I really do not see an issue with the code you have provided
(other than the unnecessary casts), but it is too incomplete to judge.
Can you not provide all of it? If not, I would recommend using
valgrind here, but that is off-topic for this newsgroup.
 

Clark S. Cox III

Hi!

I'm seeking some answers about what seems to be a memory leak.
[snip]

... since the
"largeArray" is about 4000*4000 of double which should be about 16MB -
and I see usage of > 100MB after a few hundred iterations.

Do the math again:

4,000 * 4,000
= 16,000,000

If sizeof(double) is 8 then:

8B * 16,000,000
= 128,000,000B
≈ 122 MB

So your usage of > 100MB seems to be right in line with what should be
expected.
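
You can have the program confirm the arithmetic for you; a quick check
along these lines (the 4000 comes from your description, and this
assumes 8-byte doubles as above) prints the same figure:

#include <stdio.h>

int main(void)
{
    size_t n = (size_t)4000 * 4000;

    printf("elements: %lu, bytes: %lu (about %lu MB)\n",
           (unsigned long)n,
           (unsigned long)(n * sizeof(double)),
           (unsigned long)(n * sizeof(double) / (1024 * 1024)));
    return 0;
}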
 

boris

Right-O. I'm an idiot: >100MB is exactly the right space usage.
However, why is it not all allocated with the first calloc of
largeArray - why do I see 'top' report ever-growing usage? This is
where I would probably want to use -fprefetch-loop-arrays, which is not
supported on my architecture according to gcc :)

As for providing more code, I could - but the rest of it is just junk -
this is all of the relevant code.

Boris
 

Michael Mair

Right-O. I'm an idiot: >100MB is exactly the right space usage.
However, why is it not all allocated with the first calloc of
largeArray - why do I see 'top' report ever-growing usage? This is
where I would probably want to use -fprefetch-loop-arrays, which is not
supported on my architecture according to gcc :)

As for providing more code, I could - but the rest of it is just junk -
this is all of the relevant code.

But it will not work when pasted into some sort of main() function.
#include <stdlib.h>
#include <stdio.h>

int main (void)
{   /* where are you looping? */
    printf("iteration #...\n");
    for (...) {
        /* ditto */
        double *foo = (double*) calloc();   /* how much are you calloc()ing? */
        ....
        ....
        largeArray[someIndex] = something;
        /* where are someIndex and something declared/initialized? */
        /* where is largeArray free()d? */

    return 0;
}

Now, give us the information requested or create a minimal example --
then we can help you.

Note that calloc() does not necessarily make sense for doubles:
all-bits-zero is not guaranteed to represent 0.0.
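
If you want every element to start out as a genuine 0.0 rather than
all-bits-zero, assign it explicitly; a small helper along these lines
(the name is invented) is enough:

#include <stddef.h>

/* set n doubles to a real 0.0, without relying on the representation */
static void zero_doubles(double *p, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        p[i] = 0.0;
}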

Please do not top-post.


Cheers
Michael
 

Keith Thompson

I'm seeking some answers about what seems to be a memory leak.

I have a loop that looks much like this:
double *largeArray = (double*) calloc();
for (...) {
    printf("iteration #...\n");
    for (...) {
        double *foo = (double*) calloc();
        ....
        ....
        largeArray[someIndex] = something;
        free(foo);
    }
}

Though the actual code is larger, it only differs in 20+ lines of
trivial math performed on stack variables.

Clearly, foo cannot be leaking since it's being freed (and no, it
cannot be allocated outside of the loop, since its size varies each
time).

Now, when I monitor memory usage with top it grows relatively quickly
(300K per pass over the outer loop), thus there ought to be a memory
leak. At first I thought that the "largeArray" was being optimized not
to calloc all at once, but rather on demand, page by page (which would
be bizarre) but now I believe that might not be the case since the
"largeArray" is about 4000*4000 of double which should be about 16MB -
and I see usage of > 100MB after a few hundred iterations.

Apart from your miscalculation of the size allocated for largeArray,
there's no guarantee that free() gives memory back to the operating
system. Very likely it stays within your program and becomes
available for further allocation. You don't give us a clue about what
arguments you're giving to calloc(), but it's possible that you're
fragmenting the heap and making it difficult for the system to
re-allocate the memory you've freed.

I would probably add some printf() statements to log all the calls to
calloc() and free(). For example:

double *foo = calloc(something, something_else);
/*
 * Don't cast the result of malloc() or calloc().
 */
printf("foo = calloc(%lu, %lu) --> [%p]\n",
       (unsigned long)something,
       (unsigned long)something_else,
       (void*)foo);
...
printf("free(foo), foo=[%p]\n", (void*)foo);
free(foo);

Analyze the results and make sure you're freeing everything you
allocate. If not, there's your problem; if so, the displayed
addresses may tell you something, or there may be some system-specific
way to trace the internal behavior of calloc() and free().
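
If hand-checking such a log gets tedious, a pair of small wrappers
(sketched here with invented names) can keep a running count of live
allocations for you:

#include <stdio.h>
#include <stdlib.h>

static long live_blocks = 0;

static void *counted_calloc(size_t nmemb, size_t size)
{
    void *p = calloc(nmemb, size);

    if (p != NULL)
        live_blocks++;
    printf("calloc(%lu, %lu) --> [%p], live blocks: %ld\n",
           (unsigned long)nmemb, (unsigned long)size, p, live_blocks);
    return p;
}

static void counted_free(void *p)
{
    if (p != NULL)
        live_blocks--;
    printf("free([%p]), live blocks: %ld\n", p, live_blocks);
    free(p);
}

If the count only ever goes up, something is not being freed; if it
stays flat while top still climbs, the growth is coming from somewhere
other than a leak.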

Incidentally, it's not safe to assume that calloc() will set all the
doubles in your allocated array to 0.0. It sets the allocated memory
to all-bits-zero. This is often the representation of 0.0, but the
language doesn't guarantee it.
 

Old Wolf

Right-O. I'm an idiot: >100MB is exactly the right space usage.
However, why is it not all allocated with the first calloc of
largeArray - why do I see 'top' report ever-growing usage? This is
where I would probably want to use -fprefetch-loop-arrays, which is not
supported on my architecture according to gcc :)

Probably your operating system is doing 'lazy allocation'. It will
allocate you an address space but not actually claim that memory yet.

Then when you try and access memory in the space you have been
given, it will go and actually allocate that memory.

If there is not actually any memory available then it will die
in a screaming heap, or start swapping endlessly.

I think the point of lazy allocation is so that if a programmer
is lazy and just mallocs a huge chunk at the start, then other
applications do not need to suffer the effects of having not
much memory available.
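
One way to watch this happening is to allocate a large block and then
touch it a slice at a time while keeping an eye on top. A rough sketch
(sleep() is POSIX, and the sizes are arbitrary):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>     /* sleep(): POSIX, not standard C */

int main(void)
{
    size_t n = (size_t)4000 * 4000;   /* roughly the OP's array size */
    size_t i;
    double *a = calloc(n, sizeof *a);

    if (a == NULL) {
        fprintf(stderr, "calloc failed\n");
        return EXIT_FAILURE;
    }

    /* On a lazily allocating system, the resident size reported by top
       grows as the pages are written, not at the calloc() call. */
    for (i = 0; i < n; i++) {
        a[i] = 1.0;
        if (i % (n / 10) == 0) {
            printf("touched %lu elements so far\n", (unsigned long)i);
            sleep(1);
        }
    }

    free(a);
    return 0;
}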
 

Keith Thompson

Old Wolf said:
Probably your operating system is doing 'lazy allocation'. It will
allocate you an address space but not actually claim that memory yet.

Then when you try and access memory in the space you have been
given, it will go and actually allocate that memory.

If there is not actually any memory available then it will die
in a screaming heap, or start swapping endlessly.

I think the point of lazy allocation is so that if a programmer
is lazy and just mallocs a huge chunk at the start, then other
applications do not need to suffer the effects of having not
much memory available.

It's pretty clear that lazy allocation is non-conforming. A program
should be able to determine whether enough memory is available when it
attempts to allocate it; that's why malloc() provides a simple and
clear mechanism for reporting failure. There's no way a program can
fail gracefully if the OS randomly kills it when it tries to access
memory it thinks it's already allocated.

The OP was using calloc(), which zeros the allocated memory, but
perhaps the system simulates that (so that the memory which springs
into existence when it's accessed looks like it's already filled with
zeros).

If your system does lazy allocation, one way to make it act as if it
were more nearly conforming would be to fill the allocated memory
with, say, 0xff bytes immediately after allocating it. That still
won't let it fail gracefully, but at least the failure will occur
sooner rather than later.

An experiment the OP might try is to fill the allocated memory with
some non-zero value immediately after calloc(), then fill it with
zeros again. Obviously this is going to slow things down (so you
won't want to do this in your production version), but it could be
useful to see whether this affects the memory behavior.
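
In outline, something like this (the function name is made up):

#include <stdlib.h>
#include <string.h>

/* allocate count doubles and immediately touch every byte of them */
static double *calloc_and_touch(size_t count)
{
    double *p = calloc(count, sizeof *p);

    if (p != NULL) {
        memset(p, 0xff, count * sizeof *p);  /* force the pages into existence */
        memset(p, 0, count * sizeof *p);     /* back to all-bits-zero */
    }
    return p;
}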
 

Christian Bau

Keith Thompson said:
An experiment the OP might try is to fill the allocated memory with
some non-zero value immediately after calloc(), then fill it with
zeros again. Obviously this is going to slow things down (so you
won't want to do this in your production version), but it could be
useful to see whether this affects the memory behavior.

I have seen exactly this method being used in serious production code -
a function "my_malloc ()" with the same arguments as malloc, that would
call malloc(), install a signal handler, fill the malloc()'d memory
with some data, and finally return the pointer. If anything went wrong
while filling the allocated memory, the signal handler would stop the
signal from propagating; in that case the pointer was free()d and the
function returned NULL. Truly horrible code to attempt to get a
conforming implementation.
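
For the curious, a rough sketch of that idea (POSIX signal handling
assumed; as noted, this is formally undefined behaviour, and whether a
failed touch even arrives as a catchable signal is entirely
system-dependent):

#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>

static sigjmp_buf touch_failed;

static void on_fault(int sig)
{
    (void)sig;
    siglongjmp(touch_failed, 1);
}

/* my_malloc: allocate, then touch every byte; if the touch traps, give up */
void *my_malloc(size_t size)
{
    struct sigaction sa, old_sa;
    void *p = malloc(size);

    if (p == NULL)
        return NULL;

    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGSEGV, &sa, &old_sa);

    if (sigsetjmp(touch_failed, 1) == 0) {
        memset(p, 0xff, size);    /* make the memory really exist */
    } else {
        free(p);                  /* the touch faulted: report failure */
        p = NULL;
    }

    sigaction(SIGSEGV, &old_sa, NULL);
    return p;
}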
 

Keith Thompson

Christian Bau said:
I have seen exactly this method being used in serious production code -
a function "my_malloc ()" with the same arguments as malloc, that would
call malloc(), install a signal handler, fill the malloc()'d memory
with some data, and finally return the pointer. If anything went wrong
while filling the allocated memory, the signal handler would stop the
signal from propagating; in that case the pointer was free()d and the
function returned NULL. Truly horrible code to attempt to get a
conforming implementation.

And the signal handler can't be implemented portably (since there's no
guarantee which signal will be raised if it fails). But it does seem
like a reasonable approach.

If you're willing to go a little further into the land of
non-portability, you can likely save some time by setting only one
byte per memory page. This requires knowing what a memory page is, of
course, and assumes that accessing any byte in a page will trap if and
only if accessing a single byte in the page will trap.
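
Something along these lines, perhaps (sysconf() and _SC_PAGESIZE are
POSIX, so this is already outside the standard):

#include <stddef.h>
#include <unistd.h>     /* sysconf(): POSIX */

/* touch one byte per page rather than every byte of the block */
static void touch_pages(void *p, size_t size)
{
    unsigned char *bytes = p;
    size_t step = (size_t)sysconf(_SC_PAGESIZE);
    size_t i;

    for (i = 0; i < size; i += step)
        bytes[i] = 0xff;
}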
 

CBFalconer

Keith said:
.... snip ...

It's pretty clear that lazy allocation is non-conforming. A program
should be able to determine whether enough memory is available when it
attempts to allocate it; that's why malloc() provides a simple and
clear mechanism for reporting failure. There's no way a program can
fail gracefully if the OS randomly kills it when it tries to access
memory it thinks it's already allocated.

That depends. (ever hear that phrase before ;-). If the memory is
not actually available when usage is attempted, the OS can simply
put the program to sleep until it is available. Remember, there
are no speed guarantees.

However if the action is to abort, then I agree with you.
 

Old Wolf

Keith said:
It's pretty clear that lazy allocation is non-conforming.

It's a different sort of non-conformance to something like,
say, giving the wrong result for a division.

Would you also say that any operating system that allows
the user to kill an application is non-conforming? (because
it allows the application to abort when the C standard says
it should have kept running).

Also, any system where stack overflow is a possibility is
non-conforming (which is pretty much every device with a
stack-based implementation for function calls), unless there
are some limits imposed by the standard which I'm not aware of.
But people have to program on these systems every day.
 

Chris Croughton

It's pretty clear that lazy allocation is non-conforming. A program
should be able to determine whether enough memory is available when it
attempts to allocate it; that's why malloc() provides a simple and
clear mechanism for reporting failure. There's no way a program can
fail gracefully if the OS randomly kills it when it tries to access
memory it thinks it's already allocated.

By that standard there are no fully conforming real world C compilers,
because all sorts of things can cause programs to abort (being swapped
out and never swapped back in again, running out of process space,
faulty disk drives, a stray cosmic ray in a RAM chip, etc.).

The OS could allocate the memory but it would not necessarily be 'real'
(it's called virtual memory for that reason); it /should/ make sure that
it doesn't allocate more in total than is available in the system, but
it need not do so (it could, for instance, display a message to the
operator to allocate more swap space, or even to attach a new drive,
when it needed the memory). (Similarly, airline companies should only
book seats they have, and return an error if they are full, but in
practice they assume a number of "no-shows" so over-commit their
resources in the hope that something will become free...)
The OP was using calloc(), which zeros the allocated memory, but
perhaps the system simulates that (so that the memory which springs
into existence when it's accessed looks like it's already filled with
zeros).

It could do that with any value, of course.
If your system does lazy allocation, one way to make it act as if it
were more nearly conforming would be to fill the allocated memory
with, say, 0xff bytes immediately after allocating it. That still
won't let it fail gracefully, but at least the failure will occur
sooner rather than later.

Write zeros, then 0xFFs, and then a random bit pattern in case the
system notices that complete allocation units are the same value and
deallocates them. But that still won't make it portable, because the
system is free to do whatever it likes as long as it returns the same
value when read as was written to it (and it can take as long as it
likes, since the C standards say nothing about performance; if it saves
everything to backing store and asks the operator to build a new machine
and reload the program, that is still conforming).
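
In code, that scribbling step might look something like this (the
function name is invented):

#include <stdlib.h>
#include <string.h>

/* overwrite the block with zeros, then 0xFF, then a pseudo-random
   pattern, so an allocator cannot spot uniform pages and quietly
   discard them */
static void scribble(unsigned char *p, size_t size)
{
    size_t i;

    memset(p, 0x00, size);
    memset(p, 0xff, size);
    for (i = 0; i < size; i++)
        p[i] = (unsigned char)(rand() & 0xff);
}
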
An experiment the OP might try is to fill the allocated memory with
some non-zero value immediately after calloc(), then fill it with
zeros again. Obviously this is going to slow things down (so you
won't want to do this in your production version), but it could be
useful to see whether this affects the memory behavior.

As I understood it the OP was simply noticing the output of the top(1)
command, which displays certain aspects of the process but not all of
the attributes. Certainly if the OP is interested they can do all sorts
of tests (I routinely run speed tests on various machines and compilers)
but it will only, at most, give a small indication of what the system
will do.

Chris C
 

Keith Thompson

Old Wolf said:
It's a different sort of non-conformance to something like,
say, giving the wrong result for a division.

I disagree. malloc() is supposed to return a null pointer to indicate
that the requested memory can't be allocated. If it returns a non-null
pointer for memory it can't actually deliver, it's non-conforming.
Would you also say that any operating system that allows
the user to kill an application is non-conforming? (because
it allows the application to abort when the C standard says
it should have kept running).

Programs can always be affected by external interactions (shutting off
the power if nothing else). I suppose it could be argued that
allowing the program not to finish is non-conforming, but I'm not
*quite* that picky.
Also, any system where stack overflow is a possibility is
non-conforming (which is pretty much every device with a
stack-based implementation for function calls), unless there
are some limits imposed by the standard which I'm not aware of.
But people have to program on these systems every day.

The standard allows implementations to impose limits.

No realistic implementation can provide unlimited resources (say, for
infinitely deep recursion or a loop that will take a trillion years to
complete). A realistic implementation can provide a malloc() that
doesn't lie to the client.
 

Old Wolf

Keith said:
I disagree. malloc() is supposed to return a null pointer to
indicate that the requested memory can't be allocated. If it
returns a non-null pointer for memory it can't actually deliver,
it's non-conforming.

I would argue that the as-if rule allows the OS to not actually
allocate the memory until it is needed.

Suppose for sake of clarity that the OS waits until memory is
available (if the application tries to write to an address
that hasn't been allocated by the OS yet). Then the application
cannot discern in any way that the memory has not been
allocated and there is no requirement for memory writes to occur
in any time frame.

Furthermore, the OS has actually allocated an address range
in the application's virtual address space. There is no
requirement for the virtual address space to be mapped to a
physical address space at the same time (in fact there can't
be, otherwise we could not have systems with disk-swapped
virtual memory).
 

Keith Thompson

Old Wolf said:
I would argue that the as-if rule allows the OS to not actually
allocate the memory until it is needed.

Suppose for sake of clarity that the OS waits until memory is
available (if the application tries to write to an address
that hasn't been allocated by the OS yet). Then the application
cannot discern in any way that the memory has not been
allocated and there is no requirement for memory writes to occur
in any time frame.
[...]

I agree *if* an attempt to access unavailable memory causes the
program to wait until it becomes available. If the attempt causes the
program to abort, the as-if rule doesn't apply.
 

Richard Tobin

Old Wolf said:
I would argue that the as-if rule allows the OS to not actually
allocate the memory until it is needed.

Right, but if it can then fail to allocate it when it is needed, it's
not conforming.

Lazy allocation that guarantees that there will be enough when it's
required is conforming.

-- Richard
 

Lawrence Kirby

I disagree. malloc() is supposed to return a null pointer to indicate
that the requested memory can't be allocated. If it returns a non-null
pointer for memory it can't actually deliver, it's non-conforming.

What we know is that in the *abstract machine* when malloc() returns
non-null an object has been properly created. What an actual
implementation is required to do is a very different and rather more
complex question. Clause 4p6 says

"A conforming hosted implementation shall accept any strictly conforming
program."

"Accept" is the key word here and the standard doesn't define further what
it means. I suggest that it means "not reject" in the sense of saying "I
won't compile this code because it isn't valid C". Consider that a
strictly conforming program can be arbitrarily long and no real-world
compiler is capable of translating every possible strictly conforming
program. The compiler can say "sorry, I can't translate this" in some
fashion, but not "your program is invalid".

5.1.2.3 places requirements on how an implementation must honour the
semantics of the abstract machine. This is based on observable behaviour,
notably I/O and the side effects on volatile objects. 5.1.2.3p5
says

"The least requirements on a conforming implementation are:

- At sequence points, volatile objects are stable in the sense that
previous accesses are complete and subsequent accesses have not yet
occurred.

- At program termination, all data written into files shall be identical
to the result that execution of the program according to the abstract
semantics would have produced.

- The input and output dynamics of interactive devices shall
take place as specified in 7.19.3. The intent of these requirements is
that unbuffered or line-buffered output appear as soon as possible, to
ensure that prompting messages actually appear prior to a program
waiting for input."

Note that what is lacking, and what MUST be lacking if there is any chance
of creating a real-world conforming implementation, is any sense that the
implementation must execute the program successfully to completion. All we
know is that IF a sequence point is reached volatile objects are stable,
IF we reach program termination (see 5.1.2.2.3) file output must match the
abstract machine, IF file I/O to interactive devices happens then it
should behave as per the abstract machine.

The standard doesn't guarantee that a conforming implementation will
execute any strictly conforming program to completion (except one
specified in 5.2.4.1), nor does it place any restrictions on how or why
the execution of a program might fail. All that can really be said is that
to the extent that a program does execute it must be consistent with the
abstract machine.

So, aborting the execution of a program, except the one specified by the
implementation w.r.t. 5.2.4.1, because the implementation doesn't
have enough memory available to continue the execution, is very much
allowed by the standard; not in the abstract machine but in an
implementation. 5.2.4.1 is interesting in that an overcommitting system
must make sure that it doesn't trip up for this program. Maybe. Any
multitasking system that doesn't reserve memory permanently for the
possible translation and execution of this program may find itself
unable to do so in some circumstances.

Consider:

void foo(void)
{
    char data[1000];

    puts("I got here");

    data[500] = 0;
}

Let's say an implementation aborted the execution of the program at the
statement data[500] = 0; due to hitting a stack quota limit. It spotted
this when the write operation caused a trap on an unmapped memory page and
it tried to allocate a new one. As far as the abstract machine is
concerned the definition of data causes that object to be fully created
when the block is entered, in much the same way that malloc() has created
an object when it returns non-null (for a non-zero argument). So this
is a non-conforming implementation if you consider overcommitting for
malloc'd memory non-conforming.
Programs can always be affected by external interactions (shutting off
the power if nothing else). I suppose it could be argued that
allowing the program not to finish is non-conforming, but I'm not
*quite* that picky.

But is the standard? I don't see anything to suggest that at all.
The standard allows implementations to impose limits.

For specific things, none of which are directly relevant here.
No realistic implementation can provide unlimited resources (say, for
infinitely deep recursion or a loop that will take a trillion years to
complete). A realistic implementation can provide a malloc() that
doesn't lie to the client.

The question here is conformance. Can a conforming implementation
overcommit or not?

Lawrence
 

Richard Bos

Chris Croughton said:
By that standard there are no fully conforming real world C compilers,
because all sorts of things can cause programs to abort (being swapped
out and never swapped back in again, running out of process space,
faulty disk drives, a stray cosmic ray in a RAM chip, etc.).

None of those are even remotely under the implementation's control,
though. malloc() is.

Richard
 
