Memory management strategies in C. (long)

J

jacob navia

If you have some spare time, I would like to know what do you
think about this.

Did I forget something? Something wrong somewhere?

Thanks

P.S. This is part of a tutorial I am writing for lcc-win32. This explains
why there are windows examples.

Memory management strategies
----------------------------

Each program needs some workspace to work in. How this space is managed
(allocated, recycled) makes a memory allocation strategy.
Here is a short description of some of the most popular ones.

1: Static buffers
-----------------
This is the simplest strategy. You reserve a fixed memory area (buffer)
at compile time, and you use that space and not a byte more during the
run time of the program.

Advantages:
----------
It is the fastest possible memory management method since it has no run-
time overhead. There is no memory allocation, nor recycling that incurs
in run time costs.

In memory starved systems (embedded systems, microcontroller applications,
etc) it is good to know that there is no possibilitiy of memory
fragmentation or other memory space costs associated with dynamic
allocation.

Disadvantages:
--------------
Since the amount of memory allocated to the program is fixed, it is not
possible to adapt memory consumption to the actual needs of the program.
The static buffers could be either over-dimensioned, wasting memory space,
or not enough to hold the data needed. Since the static buffers must be
patterned after the biggest possible input, they will be over-dimensioned
for the average case.

Unless programming is adapted to this strategy, it is difficult to reuse
memory being used in different buffers to make space for a temporary
surge in the space needs of the program.


2: Stack based allocation
-------------------------
The C standard allows for this when you write:

int fn(int a)
{
char workspace[10000];
...
}

In this case, the compiler generates code that allocates 10000 bytes
of storage from the stack. This is a refinement of the static buffers
strategy.
A variant of this strategy allows for dynamic allocation. Instead of
allocating a memory block of size "siz" with malloc, we can write:

char workspace[siz];

and the compiler will generate code that allocates "siz" bytes from the
program stack.

Advantages:
----------
Very fast allocation and deallocation. To allocate a memory block only a
few assembly instructions are needed. Deallocation is done without any
extra cost when the function where the variables are located exits.

Disadvantages:
-------------
There is no way to know if the allocation fails. If the stack has
reached its maximum size, the application will catastrophically fail
with a stack overflow exception.

There is no way to pass this memory block to a calling function. Only
functions called by the current function can see the memory allocated
with this method.

Even if the C99 standard is already several years old, some compilers do
not implement this. Microsoft compilers, for instance, do not allow this
type of allocation. A work-around is to use their _alloca function.
Instead of the code above you would write:

char *workspace = _alloca(siz);

3: "Arena" based allocation
---------------------------
This strategy is adapted when a lot of allocations are done in a
particular sequence of the program, allocations that can be released
in a single block when the data is no longer needed. The program
allocates a large amount of memory called "arena", and sub-allocates it
to the consuming routines needing memory. When a certain phase of the
program is reached, the whole chunk of memory is either marked as free
or released to the operating system.

The windows operating system provides support for this strategy with the
APIs CreateHeap, HeapAlloc, and others.

Advantages:
----------
Fewer calls to memory allocation/deallocation routines.

No global fragmentation of memory.

Disadvantages:
-------------
Since the size of the memory that will be needed is not known in advance,
once an arena is full, the strategy fails or needs to be complemented
with more sophisticated variations. A common solution is to make the
arena a linked list of blocks, what needs a small processing overhead.

Determining when the moment has come to release all memory is tricky
unless the data processed by the program has a logical structure that
adapts itself to this strategy. Since there is no way of preserving
data beyond the frontier where it is released, data that is to be
preserved must be copied into another location.

4:The malloc / free strategy
----------------------------
This is the strategy that is most widely used in the C language. The
standard provides the functions malloc and free. The program allocates
memory as needed, keeping track of each memory block, and freeing it
when no longer needed. The free function needs a pointer to the same
exact location that was returned by malloc.

Advantages:
----------
o It is very flexible, since the program can allocate as needed, without
being impossed any other limit besides the normal limit of available
memory.

o It is economic since the program doesn't grab any more memory than it
actually needs.
o It is portable since it is based in functions required by the C
language.

Disadvantages:
--------------
o It is very error prone. Any error will provoke obscure and difficult
to track bugs that need advanced programming skills to find. And the
possibilities of errors are numerous: freeing twice a memory block,
passing a wrong pointer to free, forgetting to free a block, etc.

o The time used by memory allocation functions can grow to an important
percentage of the total run time of the application. The complexity of
the application increases with all the code needed to keep track and
free the memory blocks.

o This strategy suffers from the memory fragmentation problem. After
many malloc/free cycles, the memory space can be littered with many
small blocks of memory, and when a request for a big block of memory
arrives, the malloc system fails even if there is enough free memory
to satisfy the request. Since it is impossible for the malloc system
to move memory blocks around, no memory consolidation can be done.

o Another problem is aliasing, i.e. when several pointers point to the
same object. It is the responsability of the programmer to invalidate
all pointers to an object that has been freed, but this can be very
difficult to do in practice. If any pointer to a freed object remains
in some data structure, the next time it will be used, the program can
catastrophically fail or return invalid results, depending on whether
the block was reallocated or not.


5: The malloc with no free strategy
-----------------------------------
This strategy uses only malloc, never freeing any memory. It is adapted
to transient programs, i.e. programs that do a well defined task and
then exit.
It relies on the operating system to reclaim the memory used by the
program.

Advantages:
-----------
o Simplified programming, since all the code needed to keep track of
memory blocks disappears.
o It is fast since expensive calls to free are avoided.

Disadvantages:
-------------
The program could use more memory than strictly needed.

It is very difficult to incorporate software using this strategy into
another program, i.e. to reuse it. This strategy can be easily
converted into an arena based strategy though, since only a call to
free the arena used by the program would be needed.


6: Automatic freeing (garbage collection).
-----------------------------------------
This stragegy relies upon a collector, i.e. a program that scans the
stack and the global area of the application looking for pointers to
its buffers. All the memory blocks that have a pointer to them, or to
an inner portion of them, are marked as used, the others are
considered free.

This strategy combines easy of use and reclaiming of memory in a
winning combination for most applications, and it is the recommended
strategy for people that do not feel like messing around in the debugger
to track memory accounting bugs.

Advantages:
-----------
Program logic is simplified and freed from the chores of keeping track
of memory blocks.
The program uses no more memory than needed since blocks no longer in
use are recycled.

Disadvantages:
--------------
It requires strict alignment of pointers in addresses multiple of four.
Normally, this is ensured by the compiler, but under certain packing
conditions (compilation option -Zp1) the following layout could be
disastrous:

#pragma pack(1)
struct {
short a;
char *ptr;
} s;

The pointer "ptr" will NOT be aligned in a memory address multiple of
four, and it will not be seen by the collector because the alignment
directive instructs the compiler to pack structure members.

You are supposed to store the pointers in memory accessible to the
collector. If you store pointers to memory allocated by the collector
in files, for instance, or in the "windows extra bytes" structure
maintained by the OS, the collector will not see them and it will
consider the memory they point to as free, releasing them again to the
application when new requests are done.

Whenever a full gc is done (a full scan of the stack and the heap), a
noticeable stop in program activity can be perceived by the user. In
normal applications this can take a bit less than a second in large
memory pools and slow machines. The collector tries to improve this
by doing small partial collections each time a call to its allocator
function is done.

If you have only one reference to a block, the block will be retained.
If you have stored somewhere a pointer to a block no longer needed, it
can be very difficult indeed to find it.

The garbage collector of lcc-win32 is a conservative one, i.e. if
something in the stack looks like a pointer, it will be assumed that
this is a pointer (fail-safe) and the memory block referenced will be
retained. This means that if by chance you are working with numeric
data that contains numbers that can be interpreted as valid memory
addresses more memory will be retained than strictly necessary. The
collector provides special APIs for allocating tables that contain
no pointers and whose contents will be ignored by the collector. Use
them to avoid this problems.

6:Mixed strategies
------------------
Obviously you can use any combination of this methods in your programs.
But some methods do not mix well. For instance combining malloc/free
with automatic garbage collection exposes you to more errors than
using only one method. If you pass to free() a pointer allocated with
GC_malloc chaos will reign in your memory areas.

To the contrary, the stack allocation strategy can be combined very
well with all other strategies since it is specially adapted to the
allocation of small buffers that make for many of the calls to the
allocator functions.
 
S

Samuel Barber

CBFalconer said:
This is not a part of any standard, and is thus totally
non-portable. Such systems as have provided this do so with
differing semantics. The user would be better advised to insist
on a compiler with at least the variable arrays of C99
implemented, where he can expect his code to remain portable to
more and more systems as time goes by.

Lousy advice. alloca() is a lot more portable than variable size
arrays. Anyway, the real portability issue lies with the amount of
stack used, not the language facility. On a non-virtual memory
machine, you don't have an "infinite" stack to play with.
Correction - used in programs written in the C language. The
standard language itself provides this ability.

To be picky, malloc() is a library routine, not a language facility.
There's a big difference.
Firstly, it is totally non-standard and thus not portable unless
the system itself, in source form, written in standard C, is
included. Even then, it is only suitable to a sub-class of
programs, and the exact characteristics required of such programs
are not as widely known as those of normal standard C programs.

There's at least one free GC package which has been widely ported.
Things are not as black-and-white as you apparently want to believe.

Sam
 
Z

Zeljko Vrba

On 21 Aug 2003 02:48:54 -0700,


That difference being what? malloc() is part of the C language. A
language that doesn't have malloc() is not C.
Oh, but it is. The standard allows for free-standing implementations which
are not required to support some (I don't know exactly which) library
facilities.

One platform that comes close to this is PalmOS which
- doesn't have malloc
- it has something similar called MemPtrNew
- dynamic memory allocation is *strongly* discouraged and severely limited
in allocation amounts available to the program
 
S

Samuel Barber

Daniel Haude said:
On 21 Aug 2003 02:48:54 -0700,


That difference being what? malloc() is part of the C language. A
language that doesn't have malloc() is not C.

There's an important distinction between language facilities
(implemented in the compiler) and libraries (which are often written
in C). You can take a standard C compiler and mate it with a
non-standard library...or no library at all. The language is still C,
but without the standard library.

Sam
 
S

Samuel Barber

CBFalconer said:
My point is that variable size arrays are in the standard, and
code using them will port to more and more systems as time goes
by.

You would have a point if it were clear that C99 is being rapidly
embraced by the C community.
alloca is not standardized, will disappear, and may well vary
between systems even if those systems implement it.

There's no reason to think that alloca will "disappear". Lots of
important code uses it.

Sam
 
C

Chris Torek

(To be precise, malloc() is part of any hosted implementation.)

There's an important distinction between language facilities
(implemented in the compiler) and libraries (which are often written
in C). You can take a standard C compiler and mate it with a
non-standard library...or no library at all.

Perhaps; perhaps not. One might also claim that there is an
important distinction between C Standard Library functions and
other (e.g., third-party) library functions; and in fact there
is. For instance, if you write your own memcpy() function, then
use a call like:

memcpy(dst, src, sizeof(*src));

and compile this code with versions of GCC, you will discover
that your function is never called:

% cat foo.c
void show(int *dst, int *src) {
memcpy(dst, src, sizeof *src);
}
% cc -O2 -S foo.c
% cat foo.s
.file "foo.c"
.text
.p2align 2,,3
.globl show
.type show,@function
show:
pushl %ebp
movl %esp, %ebp
movl 12(%ebp), %eax
movl (%eax), %edx
movl 8(%ebp), %eax
movl %edx, (%eax)
leave
ret
.Lfe1:
.size show,.Lfe1-show
.ident "GCC: (GNU) 3.2.2"
%

There is no call to memcpy() here -- gcc has replaced it with
an inline copy. If I change the memcpy() to copy 100 items I
get:

cld
movl $100, %ecx
rep
movsl

-- still no call to memcpy().

Any hosted C compiler may, at its discretion, replace any call to
any C standard library function with something that has the same
semantics as described by the C standard. This can even include
re-using the function name but changing the parameters, so that
an assembly-language implementation of (e.g.) pow() needs to take
three arguments instead of two.
 
A

Arthur J. O'Dwyer

Even if you don't include the standard headers?

Yup. But only using gcc -O3 [for the memcpy example anyway], at least
where I am. gcc -O2 doesn't do the inlining, although both do
complain about the undefined behavior.

-Arthur
 
C

Chris Torek

Chris Torek said:
... For instance, [gcc replaces memcpy() calls with different code]

Even if you don't include the standard headers?

Yes. Of course, one should include the proper header, or an
explicit declaration (which I failed to do in the examples)... :)

(Incidentally, on some platforms, gcc also calls memcpy and/or
memset for structure and array initialization, even under
-ffreestanding, so that if you are using gcc to build embedded
systems, you must provide the functions.)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,580
Members
45,055
Latest member
SlimSparkKetoACVReview

Latest Threads

Top