Sourin said:
Hi all,
I am trying to write code for my experiments. My work involves huge
datasets, and as such my code needs to be memory efficient. I did some
hand calculations of the amount of dynamic memory required for
my data structures (I used sizeof() to get the size of the structures).
But the mallinfo() function shows that approximately double that
amount of memory is being allocated. I am working with gcc 3.2.2 on a
Red Hat 9 machine. I come from an electrical engineering background
and don't know much about memory allocation or overheads. Can anyone
please enlighten me?
Others have already mentioned the possibility that some
of your struct types contain padding, but nobody so far has
suggested what you might do about it. One way to get rid of
the padding is to use non-Standard language extensions to tell
the compiler not to pad certain struct types; many compilers
support such extensions, although the details differ from one
compiler to the next. Unfortunately, this usually results in
slower and/or larger code; the padding existed for a reason,
and eliminating it exacts a penalty.
    Another approach, if you've got large arrays of these
padded structs, is to break the structs apart and "factor" the
data into parallel arrays. For example, instead of
    struct {
        short shrift;
        /* padding likely here */
        double trouble;
    } array[10000];
.... you could write
    short shrift[10000];
    double trouble[10000];
.... and the ten thousand padding slots simply vanish. Of
course, you've sacrificed some notational and mnemonic
convenience -- it's obvious that array[42].shrift and
array[42].trouble are somehow related, but in the revised
program the relationship between shrift[42] and trouble[42]
is less apparent. Functions like qsort() won't work with
this scheme, for example, because they don't know about
the relationship between these parallel arrays.
    One other possible contributor to greater-than-expected
memory consumption is the overhead of keeping track of each
allocated area. malloc() implementations differ, but most
will add somewhere between four and sixteen bytes to each
allocation, in addition to rounding the total up to a nice
multiple of (usually) four, eight, or sixteen bytes. In a
malloc(1) call, the overhead due to bookkeeping and rounding
will probably be much greater than the one-byte "payload,"
and if you make a hundred thousand such calls, most of your
program's memory will be tied up in this useless (to you)
overhead. malloc(1) is an extreme case, but the effect can
be significant even for moderate-sized structs -- an eight-
byte struct may incur eight bytes of overhead, for only
fifty percent "efficiency."
If you are dealing with many such smallish structs, a
way to reduce the bookkeeping and rounding overhead is to
allocate a whole block of them at once and dole them out
individually, thus incurring only one chunk of overhead
for the whole batch instead of one for each struct:
    struct thing *new_thing(void) {
        static struct thing *avail;
        static int howmany = 0;
        if (howmany == 0) {
            avail = malloc(1000 * sizeof *avail);
            if (avail == NULL)
                return NULL; /* out of memory */
            howmany = 1000;
        }
        return &avail[--howmany];
    }
The disadvantage of this approach is that you can no longer
free() or realloc() the individual structs; you can only do
so on an entire batch at once. Depending on the application,
this may or may not be a problem.