memory pool library (repost)

G

Grey Alien

I am looking to write a very simple memory pool library to store only
one data type at a time - i.e. to provide a contiguous block of memory
to be alloc'd free'd by the calling program. I am


I have come up with this API so far and will welcome any feedback on it.
In particular, I will need help with implementing the linked lists used
for record keeping as to which blocks were free or not (and for
"defragging" the pool when "holes" appear in the contiguous block)

This is what I've come up with so far. Comments/feedback welcome.


struct mempool_t
{

unsigned char * pool_ ;
unsigned int block_size ;

/* variable that keeps track of whether an element is free or not */
unsigned char * available ;
};


/* Public API */
/* Creates the pool, N elements of size item_size. Return 0 on succes, 1
otherwise */
int mempool_create(struct mempool_t *pool, const unsigned int
num_elements, const unsigned int item_size);

/* requests mem chunk from pool, 0 ptr if no more mem or rqst could not
be fulfilled */
unsigned char * mempool_malloc(struct mempool_t *pool, const unsigned
int num_elements) ;

/* requests zero-inited mem chunk from pool, 0 ptr if no more mem or
rqst could not be fulfilled */
unsigned char * mempool_calloc(struct mempool_t *pool, const unsigned
int num_elements) ;

/* requests mem chunk realloc from pool, 0 ptr if no more mem or rqst
could not be fulfilled */
unsigned char * mempool_realloc(struct mempool_t *pool, const unsigned
int num_elements) ;


/* requests mempool resize (expand only) 0 if succesful, 1 otherwise */
int mempool_resize(struct mempool_t **pool, const unsigned int
num_elements) ;


/* Local (static) functions */
int mempool_defrag(struct mempool_t * pool);
 
B

Ben Bacarisse

Grey Alien said:
I am looking to write a very simple memory pool library to store only
one data type at a time - i.e. to provide a contiguous block of memory
to be alloc'd free'd by the calling program. I am


I have come up with this API so far and will welcome any feedback on
it. In particular, I will need help with implementing the linked lists
used for record keeping as to which blocks were free or not (and for
"defragging" the pool when "holes" appear in the contiguous block)

OK, you have the API but maybe you could do this differently! The
costs of mallocs and frees can be ameliorated by replacing free with a
function that puts the block onto a "free chain" for a given size.
The allocator takes an item from the free chain (if there is one) and
calls malloc otherwise:

void *my_alloc(size_t s)
{
void *p;
size_t size_idx;
if (s >= MIN_ALLOC_SIZE && s < MAX_ALLOC_SIZE &&
free_chain[size_idx = s - MIN_ALLOC_SIZE] != NULL) {
p = free_chain[size_idx];
free_chain[size_idx] = *(void **)p; /* [1] */
}
else p = malloc(s);
return p;
}

A matching my_free never calls free -- it just puts the block into the
correct chain for its size.

This works very well when most of your program's allocations are of a
few known (and ideally small) sizes.

You can pre-prime the free chains with an initial list of blocks:

for (size = MIN_ALLOC_SIZE; size < MAX_ALLOC_SIZE; size++) {
int i;
for (i = 0; i < pre_alloc_count[size - MIN_ALLOC_SIZE]; i++) {
void *p = my_alloc(size);
my_free(p, size);
}
}

Obviously you can also do one large initial allocation and just
pretend that it is made up of little bits, but that complicates the
code.

You can also avoid wasting space in the free_chain array (at the
expense of some arithmetic) if you know that the common sizes are s0,
s0 + x, s0 + 2x, s0 + 3x and so on. This may or may not be worth
while. Complicate the code only if you know there is a benefit.

[1] This is just a sketch. The yucky cast can (and should) be removed
but using a structure whose first member is a void *. If you don't
mind using C99, you can use the tidied-up "struct hack":

struct alloc_unit { void *next; unsigned char extra[]; };

MIN_ALLOC_SIZE better be >= sizeof (void *).
 
B

Ben Bacarisse

Bad form to reply to one's own post, but I forgot to say that you may
*know* you need all the complexity of "defragging" and "memory pools".
If so of course just ignore me... but I wanted to suggest a very simple
API as an alternative as you seem to suggest the complexity is
worrying you.
 
C

Chris Dollin

Ben said:
OK, you have the API but maybe you could do this differently! The
costs of mallocs and frees can be ameliorated by replacing free with a
function that puts the block onto a "free chain" for a given size.

Trouble is, the built-in malloc+free my /already/ do this. Second-
guessing the implementation is losing game.

Presumably the OP has evidence that the easy way isn't good enough?
 
G

Grey Alien

Ben said:
I am looking to write a very simple memory pool library to store only
one data type at a time - i.e. to provide a contiguous block of memory
to be alloc'd free'd by the calling program. I am


I have come up with this API so far and will welcome any feedback on
it. In particular, I will need help with implementing the linked lists
used for record keeping as to which blocks were free or not (and for
"defragging" the pool when "holes" appear in the contiguous block)


OK, you have the API but maybe you could do this differently! The
costs of mallocs and frees can be ameliorated by replacing free with a
function that puts the block onto a "free chain" for a given size.
The allocator takes an item from the free chain (if there is one) and
calls malloc otherwise:

void *my_alloc(size_t s)
{
void *p;
size_t size_idx;
if (s >= MIN_ALLOC_SIZE && s < MAX_ALLOC_SIZE &&
free_chain[size_idx = s - MIN_ALLOC_SIZE] != NULL) {
p = free_chain[size_idx];
free_chain[size_idx] = *(void **)p; /* [1] */
}
else p = malloc(s);
return p;
}

A matching my_free never calls free -- it just puts the block into the
correct chain for its size.

This works very well when most of your program's allocations are of a
few known (and ideally small) sizes.

You can pre-prime the free chains with an initial list of blocks:

for (size = MIN_ALLOC_SIZE; size < MAX_ALLOC_SIZE; size++) {
int i;
for (i = 0; i < pre_alloc_count[size - MIN_ALLOC_SIZE]; i++) {
void *p = my_alloc(size);
my_free(p, size);
}
}

Obviously you can also do one large initial allocation and just
pretend that it is made up of little bits, but that complicates the
code.

You can also avoid wasting space in the free_chain array (at the
expense of some arithmetic) if you know that the common sizes are s0,
s0 + x, s0 + 2x, s0 + 3x and so on. This may or may not be worth
while. Complicate the code only if you know there is a benefit.

[1] This is just a sketch. The yucky cast can (and should) be removed
but using a structure whose first member is a void *. If you don't
mind using C99, you can use the tidied-up "struct hack":

struct alloc_unit { void *next; unsigned char extra[]; };

MIN_ALLOC_SIZE better be >= sizeof (void *).

Ben, Thanks for the post. I am still reading the code to make sure I
understand it fully. It does seem simple enough for what I want to do -
but I am also very wary about mucking around with memory for two reasons:

i). I have never written a memory pool before
ii). I normally work in C++, so this is terra incognito AFAIC

Could you "flesh out" your example, a little more when you have a
chance, so that there is "enough there" that can form a basis on which I
build the library?.

Essentially, the C source I am using has a struct like this:

struct stag_
{
double * arr_ ;
size_t arrsz ;
size_t pos ;
};

It has functions that alloc/ free/ realloc's the arr_ variable.
(double*). I have exposed functions that make extensive use of this
structure, to a scripting library - to run simulations on bioinformatic
data. There will be literally hundreds of 1000's of allocs/frees/realloc
per single script (even before you take loops etc into account). It is
a no-brainer that apart from the overhead of alloc/realloc etc - memory
will be badly defragged since we are dealing with very small "chunks"
(essentially - just a double). This is why I need a memory
pool/allocator which starts with a pool of previously allocated doubles
(although I'd prefer to make the library generic enough that any data
type can be stored in it, - specified at initialization, i.e. a Simple
Segregated Storage). In C++ lingo, I would make the allocator a
templated factory pattern.

Back to the C world, I simply need a way of storing a large pool of a
previously allocated type (specifically, in this case, a pool of
doubles), and then handle requests for malloc/calloc/realloc (memmove?)
and free from that pool. I hope someone can help.
 
B

Ben Bacarisse

Chris Dollin said:
Trouble is, the built-in malloc+free my /already/ do this. Second-
guessing the implementation is losing game.

I thought most implementations of free attempted to coalesce adjacent
blocks and that that was, for some allocation patterns, a significant
cost. If that is wrong (either part of it), then you are dead right.
I am not feeling confident...

My suggestion dates from a very old project in the distant past
(before void roamed the earth) and the costs of malloc/free may have
been very different then.
 
B

Ben Bacarisse

Grey Alien said:
Could you "flesh out" your example, a little more when you have a
chance, so that there is "enough there" that can form a basis on which
I build the library?.

Hang on. Chris Dollin made a good point that this scheme may buy you
nothing. I may have some code that uses it and so may be able to do
some tests...
 
C

CBFalconer

Ben said:
I thought most implementations of free attempted to coalesce adjacent
blocks and that that was, for some allocation patterns, a significant
cost. If that is wrong (either part of it), then you are dead right.
I am not feeling confident...

Yes, but most is not all. My nmalloc does, at most, two combines
per free (one above, one below the freed block), and usually one or
none. There are no lengthy searches. Similarly it goes to some
trouble to avoid the need for copying during realloc. Apart from
the debug mechanisms (set for gcc) it only requires sbrk from the
system, and the ability to convert void* to byte* and do arithmetic
on the result.
 
B

Ben Bacarisse

CBFalconer said:
Yes, but most is not all. My nmalloc does, at most, two combines
per free (one above, one below the freed block), and usually one or
none.

OK, but doing (sometimes) more than zero combines is going to be worse
than never doing any. That was the gist of my suggestion.

In a simple test of 10,000,000 randomly chosen mallocs and frees (all
of a few small sizes), a simple free chain can just beat the system
malloc/free setup by about 2.8s vs 3.1s (this is of course doing
almost nothing but the allocations so my suggestion, elsethead, is not
to bother!). Switching in your nmalloc is, maybe, a shade slower than the
standard ones at about 3.2s[1].

BTW, I get:
In file included from /usr/include/stdlib.h:438,
from nmalloc.c:77:
/usr/include/sys/types.h:151: error: conflicting types for ‘ulong’
nmalloc.c:74: error: previous declaration of ‘ulong’ was here

when building it. Simple to fix, but you probably want to sort it
out.

[1] I say "maybe" because all I did average half a dozen program run
times. The difference is probably not statistically significant.
 
B

Ben Bacarisse

I missed this first time round. With only one size, avoiding free
will buy you a tiny amount more, but still unlikely to be enough to be
worth while.
Could you "flesh out" your example, a little more when you have a
chance, so that there is "enough there" that can form a basis on which
I build the library?.

Following on form the warning from Chris Dollin that there may be no
point in doing this, I have done some simple tests. Over a large
number of allocations and frees (with about 50-50 split of frees and
mallocs), chaining and unchaining the blocks buys you a tiny speed
advantage: about 2.8s vs 3.1s for a program doing 10,000,000 actions
(about 5,000,000 each of malloc and free). The test program is
dominated by the random sampling, not the allocation and freeing of
memory.[1]

In short, you are unlikely to get any significant advantage unless you
program does almost thing with the data. Do you know you will have a
speed problem? Are you writing a LISP virtual machine where
allocating and freeing CONS cells is about all it does? Do you know
that your system's malloc and free are particularly bad?

I did not spot your point of allocating only one size. In that case
you will get a shade more advantage since you will not need to test
any sizes, and the technique is simple enough that I may be worth
doing anyway, but it smacks of premature optimisation.

I would suggest you keep the allocations (and de-allocations) of your
special-sized objects separate, with their own functions (but which
just call malloc and free). That way it is simple to plug in special
allocator for these single-sized objects if you decide it might be
worth while (after measuring!).

I was going to say, come back if you want me to say more, but a
single-size free chain is so short I'll include it here (untested, not
even compiled):

struct alloc_block {
struct alloc_block *next;
};

static struct alloc_block *free_chain = NULL;

void *special_alloc()
{
void *p = free_chain;
if (p)
free_chain = free_chain->next;
else p = malloc(SPECIAL_ALLOC_SIZE);
return p;
}

void special_free(void *p)
{
struct alloc_block *block = p;
block->next = free_chain;
free_chain = block;
}

If your library can expose the type being allocated, then you don't
need to go via void *.

[1] This is one system, with one implementation of malloc/free (gcc
4.1.2 and glibc, in this case). There may be systems where you get
*worse* performance, or some with bad C libraries where it really pays
off.
 
C

CBFalconer

Ben said:
OK, but doing (sometimes) more than zero combines is going to be
worse than never doing any. That was the gist of my suggestion.

The need for nmalloc showed up when I was systematically freeing
something line 10 to 20,000 (or more) items at once, and the system
went to sleep. That operation is now O(N) and fast, while it used
to be O(N*N). Other delays are minor.
 
T

Thad Smith

Grey said:
I am looking to write a very simple memory pool library to store only
one data type at a time - i.e. to provide a contiguous block of memory
to be alloc'd free'd by the calling program. I am

See my response in your earlier thread. In general, it is a bad idea to
start a new thread on the same topic because you splinter the discussion.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,780
Messages
2,569,611
Members
45,277
Latest member
VytoKetoReview

Latest Threads

Top