Suggestions for my mallocator

Discussion in 'C Programming' started by nfwavzrdvwl, Dec 27, 2006.

  1. nfwavzrdvwl

    nfwavzrdvwl Guest

    Hi all,

    This is my first posting on this forum! I hope it will be the first of
    many as I work my way up to be a C guru :)

    We had an assigment to write our own storage allocator, and mine was
    graded very poorly with comment "Hopelessly naive". Now I can see the
    main problem, which is that some calls to myfree() won't actually
    release the memory, but that would be pretty difficult to achieve, and
    the rest of it seems to work fine to me.

    If anyone here can make any constructive comments, that would be great

    #include <stdio.h>
    #include <string.h>
    #include <stddef.h>

    #define SYSTEM_RAM ( 1<<30 )
    #define MAX_TO_USE ( ((SYSTEM_RAM) / 3) * 2 )

    static char heap[MAX_TO_USE];
    static char *p=heap;
    static char *oblivion=heap+MAX_TO_USE;
    static char *last;

    void *mymalloc(long bytes)
    if(p+bytes<oblivion) {
    return (void *) last;
    return NULL;

    void myfree(void *q)
    if(q==last) {
    /* need something more sophisticated - we don't have it, so
    * let's just leak the memory :) */

    int main()
    char *t=mymalloc(14);
    strcpy(t,"Hello, world!");
    printf("%s\n", t);
    return 0;
    nfwavzrdvwl, Dec 27, 2006
    1. Advertisements

  2. said:
    The job of the deallocator is to make the storage available for re-use by
    the program. See pp185ff of K&R2 for an example.
    Richard Heathfield, Dec 27, 2006
    1. Advertisements

  3. nfwavzrdvwl

    nfwavzrdvwl Guest

    Mine does do that, but admittedly not in certain cases :~

    The only problem occurs if mymalloc and myfree calls are intertwined
    rather than nested like parentheses. And after all this was just an
    exercise - it's not like any production code will use my functions -
    and I'd guess that ultimately you need kernel calls for a serious
    implementation of malloc (keeping large arrays around the place can't
    really be ideal :?)

    Anyway, back to my code, it seems that the trouble is that to keep
    track of an arbitrary number of calls to mymalloc(), my allocator would
    itself need arbitrarily large storage, and it could only get this from
    malloc(), which sort of defeats the object. :???:
    nfwavzrdvwl, Dec 27, 2006
  4. nfwavzrdvwl

    Malcolm Guest

    No. It is quite legitimate to allow only a finite, though large number of
    allocations. If you have a megabyte of memory available, and allow only 1024
    allocations, then unless the average allocation is under 1K the user will
    never notice. plenty of commercial allocators will pad small allocations up
    to 1K.

    The important thing is that calls to malloc() and free() can be interleaved
    in any order, as long as the free() follows its matching malloc(), when
    memory is freed it is avialable for reallocation, and the blocks you
    allocate do not overlap. This is not trivial to achieve.

    For bonus points you want to use your storage area efficiently - avoid
    fragmentation - and you want the malloc() and the free() to execute in a
    short time.

    I've hand-code mallocs for production systems. These were embedded programs.
    The allocator didn't have to run in special kernel mode. It was simply a
    normal C function that accessed a global array.
    Malcolm, Dec 27, 2006
  5. said:
    And yet that is the typical usage, which is why you need to do more work
    than just a simple "start here" indicator. I refer you again to K&R2's
    example, which gives you a solid introduction to the issues and how to set
    about resolving them.
    ....which won't do you much good if you don't do it properly. Sorry, but
    there it is.
    Ultimately, yes, a production-quality memory manager in a protected-mode OS
    is going to need to ask the OS for its memory.
    No, you need to keep track of memory that's *returned* to you, so that you
    can give it out *again*. And that means keeping track of each block you
    allocate, rather than just saying "well, everything below offset X has been

    As I said before, see pp185ff of K&R2 for an example.
    Richard Heathfield, Dec 27, 2006
  6. You can only free the chunk of memory you have allocated last.
    Well, that's like building a car that has no steering wheel, no
    accelerator and no brake padel and just a single button labeled
    "Go" which, when you hold it down, makes the car go straight for-
    ward. You could also argue that "the only problem" is that it
    works only on a straight street with no other traffic...
    If you look up the pages in K&R2 Richard told you about you will
    find one way how to deal with that "unsurmountable" problem by
    just putting a structure at the start of each newly "allocated"
    block of memory that contains a field with the size and a pointer
    to the next block's address. By iterating through the linked list
    of pointers you can find the next fitting free block or the block
    you're asked to free. When you free a block you also joins it with
    a free block before or after it (if that exists) to avoid fragmen-
    tation. All that takes not much more than 50 lines of code (if you
    modify it to use memory only from a fixed area like in your case
    instead of asking the OS for more memory when running out of space)
    and at a "cost" of about the size of a pointer plus an integer size
    field for each block. The example code in K&R2 also deals with the
    alignment of the memory returned by malloc(), a problem your program
    does not even take into consideration.

    Regards, Jens
    Jens Thoms Toerring, Dec 27, 2006
  7. nfwavzrdvwl

    nfwavzrdvwl Guest

    Thanks to all on the forum for your thoughts :)

    I guess it's a question of trade-offs here between flexibility (how
    many allocations do you allow?), speed vs. memory usage (how much
    fragmentation is tolerable?), etc. I've come up with what seems like a
    pretty good way of doing this, which ensures absolutely no
    fragmentation occurs - once a block is freed, the deallocator
    immediately tidies things up. An array of pointers keeps references to
    all the currently allocated blocks.

    The only slight disadvantage of this approach is that it necessitates a
    slightly different API for malloc and free, with an extra level of

    #include <stdio.h>
    #include <string.h>
    #include <stddef.h>

    #define SYSTEM_RAM ( 1<<30 )
    #define MAX_TO_USE ( ((SYSTEM_RAM) / 3) * 2 )
    #define MAX_ALLOCS ( (MAX_TO_USE) >> 10 )

    /* set up heap to allocate from */
    static char heap[MAX_TO_USE], *top=heap, *oblivion=heap+MAX_TO_USE;

    /* pointers to allocated blocks */
    static char *blocks[MAX_ALLOCS];

    void **mymalloc(long bytes)
    char **q;
    if(top+bytes<oblivion) {
    /* find first free pointer in array */
    for(q=blocks; *q; q++)
    return NULL;
    return (void **) q;
    return NULL;

    void myfree(void **q)
    char **next, **r;
    long size;
    if(q==NULL || *q==NULL)
    /* OK... let's find the start of the block after the one to be freed
    for(r=blocks; r<blocks+MAX_ALLOCS; r++)
    if((char *) *q<*r && *r<*next)
    size=next-(char **) q; /* size of the block */
    /* shift down to fill in block to be freed */
    memmove(*q, *next, MAX_TO_USE-(next-blocks));
    /* move down all pointers above next */
    for(r=blocks; r<blocks+MAX_ALLOCS; r++)

    int main()
    char **s=(char **) mymalloc(8);
    char **t=(char **) mymalloc(7);
    strcpy(*s,"Hello, ");
    printf("%s%s\n", *s, *t);
    myfree((void **) s);
    myfree((void **) t);
    return 0;

    PS. Thanks for the heads-up on the K&R2 discussion - I've got a pdf of
    it coming down as a torrent as we speak, so I'll read it as soon as I
    get it :)
    nfwavzrdvwl, Dec 27, 2006
  8. nfwavzrdvwl

    websnarf Guest

    Well, the comment is fair. But the problem is that writing a good
    allocation system is actually quite difficult.

    Writing one that satisfies all possible useful criteria and runs in
    O(1) (both malloc and free) is basically impossible. You can do one
    that is O(ln(#allocations)) however it is horrendously complicated.
    Usually, however, its straight forward to write an allocation system
    that is O(1) *most* of the time, or which is O(1) all the time, but
    does not always successfully allocate or recover all of memory

    The classic "buddy allocator" system has each allocation point to the
    two allocations (or edges of a large block) that it is adjacent to.
    Each allocated unit would also have header data like a flag indicating
    whether it is allocated or free, and how large the allocation is. Thus
    freeing each block is easy to see -- you just switch the state to
    "free" and immediately merge with the any adjacent block that is free
    (this is always at most two steps.)

    The real problem is how do you do allocations. If you maintain a list
    of free entries (with internal references, like a linked list or
    something) then you can look through the list for entries that are
    large enough to fit your allocation. But then the operation is
    O(#allocations) which can be very slow. So the simplest thing to do is
    to partition these into lists of particular size groupings, powers of
    two are usually good, and just picking the first entry from each list
    after you've guaranteed that its in a size range that will sufficiently
    satisfy the allocation request. However, you are still stuck with the
    problem that unless you look through your entire lists in each category
    (thus bringing you back to O(#allocations)) you cannot guarantee "best
    fit" allocations. So you will commonly end up allocation too much

    Then there are other strage strategies like fibonacci allocators. The
    idea is that you start with a block the size of a fibonacci number, and
    you can always subdivide the block into two sub-blocks also with
    fibonacci number sizes. In this way you can avoid the overhead of
    storing the size of the block, since that in a sense is determined by
    your offset from the big block. Personally, I don't quite see the
    whole logic of this scheme, but people have done such things.

    Of course the other thing about allocations strategies is that they
    often entail system-level details. For example, the addresses for new
    space from the system may always be adjacent to your previous
    allocations. The way this is possible is if you have a dynamic memory
    mapping system where physical memory is fetched in distinct pieces, but
    which can be mapped into logical memory at a prescribed address. This
    is the way the UNIX-style "sbrk()" function works (Windows NT does not
    support a similar mechanism; you simply allocate in distinct blocks
    that may or may not be adjacent.)

    Another system detail has to do with multi-threading. To support
    multithreading environments you have to make sure that at most one
    thread is calling your memory allocation functions at a time. Without
    this kind of mechanism, multi-threading would not even work. This can
    often have a profound impact on allocator design -- using distinct
    regions with different threads can dramatically raise performance, if
    you design things right.

    So, in short, writing a good memory allocation system is not something
    can be undertaken lightly, unless you are satisfied with getting
    unimpressive answers.
    websnarf, Dec 27, 2006
  9. said:
    I suggest you buy a copy instead. Keeping yourself on the right side of the
    law is always a good idea.
    Richard Heathfield, Dec 27, 2006
  10. For ease of discussion below, let's assume top contains 0x10000 and
    oblivion contains 0x20000.
    Let's also assume bytes is 0x1000 for each of two calls.

    Why not use size_t for consistency

    A void** is not guaranteed to be able to hold the address of anything
    other than a void*.

    Since a void* can hold the address of anything, there is no reason not
    to use it.
    On first call, blocks[0] is NULL, *q is false, and loop exits before
    first iteration.

    On second call, loop exits when q points to block[1].
    On first call, blocks[0] now contains 0x10000.

    On second call, blocks[1] now contains 0x11000.
    On first call, top now contains 0x11000.

    On second call, top now contains 0x12000.
    Why are you returning the address of an element in blocks when you
    could just as easily return an address in heap.

    On first call, you return &blocks[0].

    On second call, your return &blocks[1].
    Let's assume we return block[1] first.

    Let's return block[0] second but pretend the first call never
    On first call, top contains 0x12000.

    On second call also.
    The second relation will not be evaluated until the first one is true.
    On first call, *q is 0x11000.
    On first iteration, *r is 0x10000; if is false.
    On second iteration *r is 0x11000; if is false.
    On all subsequent iterations, *r is NULL; if is false.

    On second call, *q is 0x10000.
    On first iteration, *r is ox10000; if is false
    On second call, *r is 0x11000 and *next is 0x12000; if is true.
    On subsequent calls, *r is NULL; if is false.
    You seem to be missing an exit from the for loop once you find the
    block you are interested in.

    On first call, you fall out of for loop never changing next and r
    pointing one byte beyond end of blocks.

    On second call, next is set to &blocks[1].
    This arithmetic does not yield the number of bytes in the block.

    On first call, next contains &top and q contains &block[1]. The
    difference has no meaning since it is simply a measure (but not in
    bytes) of the space between to unrelated objects defined at file

    On second call, next contains &blocks[1] and q contains block[0]. The
    difference is guaranteed to always be exactly 1.
    Your third argument computes the wrong number of bytes. next-blocks
    is the number of elements in blocks, not the number of bytes in heap.

    Furthermore, by moving data in heap, you have broken the interface
    with your users. Once you have returned the address of blocks[N] to a
    user and he has dereferenced it to get the address of heap[M] as the
    start of his allocated area, you may not move heap[M] to some other
    area in heap.
    Once you move any of the entries in blocks, you have broken the
    interface with your users. Once you have returned the address of
    blocks[N] to a user, you may not alter the value contained in
    blocks[N] until the user frees it. Your user is entitled to use the
    value in blocks[N] as often as he wants to get to the start of his
    allocated area.
    K&R2 is not legally available as a pdf.

    Remove del for email
    Barry Schwarz, Dec 27, 2006
  11. nfwavzrdvwl

    nfwavzrdvwl Guest

    Thanks for the reply... I think you've misunderstood my approach, which
    is understandable as I didn't give any information apart from a signal
    that it breaks the usual malloc/free API...
    That's the point of the pointers-to-pointers... in my scheme, the
    allocation black-box maintains an array of pointers, each of which
    points to a block of memory in the big array. However, the contract
    with the caller is as follows. Let's say he calls malloc(N) and is
    given p, which is a char **. Then *p is secretly an element of the
    array of pointers, so **p points to a block in the big array of size N.
    However, the caller isn't promised that *p is a constant pointer, only
    that p is a constant pointer and *p always points to the right thing -
    in other words, he always has to dereference twice as **p and mustn't
    store *p=q, fiddle around, then hope that *q will still be valid.

    The pay-off for this is that the allocator is guaranteed to be most
    efficient possible in storage terms, in the sense that fragementation
    is never allowed to occur.

    I agree that this is a slight disadvantage, but once again this is only
    an exercise, not a proposal for a genuine library function.

    I'll have to check the calculations in the rest of the code and debug
    Maybe that's strictly speaking true, but some considerations:
    1. I could wait a week and borrow it from the library - same result,
    i.e. I read their bit on allocators, but I've wasted a week.
    2. If Kernighan & Ritchie were around today they'd probably be in the
    spheres of Open Content, Free Documentation, etc., which is a more
    ethical approach to "IP".
    3. It's unlikely either of them will go hungry because I don't buy a
    copy of their book.
    nfwavzrdvwl, Dec 27, 2006
  12. said:

    No, that's a legal source.
    You'd spend that whole week sitting on your hands?
    K&R *are* around today, and so is their publisher. And they don't take
    kindly to copyright violations.
    So go steal a Ferrari, why don't you? What kind of morality is that?
    Richard Heathfield, Dec 27, 2006
  13. nfwavzrdvwl

    CBFalconer Guest

    It is inherently impossible to make a fragmentation proof allocator
    for C, because of the presence of arbitrary pointers. Any such
    would require indirect memory access for everything, which would be
    a monstrous performance hit when enforced, together with
    specialized compilers (or interpreters). However it is definitely
    possible to have O(1) performance for all operations and to always
    recover memory. It doesn't even require buddy systems (which have
    their own penalties). I have published such a system, which only
    requires care in writing.

    See <>
    CBFalconer, Dec 27, 2006
  14. nfwavzrdvwl

    nfwavzrdvwl Guest

    It's the morality of common sense and practicality.

    Scenario 1. I go to a library and find K&R. Under UK fair-use law I'm
    allowed to photocopy 5% for personal use. In this case I copy the pages
    on allocators. I end up with several sheets of paper containing the
    relevent bits from the book.

    Scenario 2. Someone shares a PDF of K&R with me over the internet. No
    one has stolen anything: he still has the file himself. I print out the
    pages on allocators and delete the file (for argument's sake). I end up
    with several sheets of paper containing the relevent bits from the

    If you're so anal that you worry about these different means to obtain
    an identical result, then I pity you.

    Copyright law is completely screwed up. I have a dual-boot
    Windows/Linux machine. I take my favourite DVD, and if I boot into
    Windows then I can legally play it, whereas if I boot into Linux then
    there is no legal way I can obtain the same result (reading the disc,
    displaying the movie on the screen). That may tell you that I should
    submit myself to this idiocy and never watch DVDs in Linux. It tells
    *me* however that the law is an ass and life is too short.
    nfwavzrdvwl, Dec 27, 2006
  15. On 27 Dec 2006 13:20:51 -0800, wrote:

    I make my living writing software. I have no interest in helping you
    steal from someone else who does the same.

    Remove del for email
    Barry Schwarz, Dec 27, 2006
  16. nfwavzrdvwl

    websnarf Guest

    Well more importantly, you can choose the ordering in which a sequence
    of allocations is freed, and therefore can always create whatever kind
    of fragmentation you like. So this clearly is not one of the criteria
    a C-style allocator can have.
    Ok, but what about also always being able to successfully allocate?
    I noticed you don't supply any seperate documentation for it. Can I
    assume the comments in the code are accurate? From what I can peruse:

    1) You seem to be neglecting at least one merge scenario. When you
    split off the excess from an allocation, you are not merging it with
    any possible adjacent free entries. In a sense this causes unnecessary
    fragmentation of the remaining free entries. Remember with your
    strategy, you want your free entries to be as large as possible. Try
    writing a benchmark with a lot of reallocs in it, and you will see the
    difference this makes.

    2) Your function size2bucket() is a linear search (though, of course,
    its count is bounded above by sizeof(ulong)*CHAR_BITS). Think about it
    as bits but seperated from the mathematical properties for a second.
    You can re-implement it using a binary search, which means on real
    system you would have an effective loop count of 5 or 6.

    3) You are also giving up as soon as your algorithm is unable to find
    memory using its "fast method". If you run out of memory, you are in
    typically in "screwedzville" anyways. You can simply go back to your
    free list and check the bucket one lower, and start scanning linearly
    for some fixed amount before truly bailing. This is important for
    certain scenarios where you are allocating near the system limit in
    terms of amount of memory, but nearly all your allocations are the
    exact same size (this scenario is very typical). With your strategy as
    coded, you can "run out of memory" even though you actually have plenty
    of memory available -- you just aren't willing to scan for it. Its
    very difficult to implement something which works robustly and all the
    time here, but you can capture typical cases without too much effort.

    In any event, it is as I said. You can implement something with O(1),
    but it ends up not always being able to allocate all available memory
    websnarf, Dec 28, 2006
  17. n> If Kernighan & Ritchie were around today they'd probably be in
    n> the spheres of Open Content, Free Documentation, etc., which is
    n> a more ethical approach to "IP".

    Kernighan & Ritchie *are* around today, and if they wanted their book
    released under a permissive license they could do so.

    N> It's unlikely either of them will go hungry because I don't buy
    n> a copy of their book.

    It's unlikely you'll get any help from people who make a living with
    intellectual property, now that you've demonstrated your cavalier
    attitude towards it.

    Charlton Wilbur, Dec 28, 2006
  18. nfwavzrdvwl

    Nelu Guest

    Yes, common sense, you're right.
    Or you can buy the whole book and reward the authors' effort.
    Are you sure there's a legal pdf copy of K&R2 on the Internet?
    Also, personal use doesn't authorize you or others to share it
    over the Internet - personal means you not half the world. The
    copyright system is not broken... you probably talk about the
    patent system being broken which is something different. Broken
    or not the law is the way it is. Until it gets changed you are
    breaking the law.
    So if I shoot you I should be safe because you would die anyway
    eventually and I would still get my goal accomplished (identical
    result). Screw the law, it's broken... The point is: we live by
    the law, broken or not. You can fight it (FSF vs. patents), fight
    to change it, but you don't break it until it's changed.
    Patents, again, not copyright. You're getting it wrong. It
    doesn't play on Linux because there is no program under Linux to
    run it, because the guys distributing software respect the law by
    not using stuff they don't agree with. You don't seem to get that.
    Nelu, Dec 28, 2006
  19. Another word for "strictly speaking true" is "true", and you can drop
    the "maybe".
    If they were around today? Not very bright, are you?

    If you're going to commit a crime, you probably shouldn't announce it
    to the world in a manner that allows you to be traced.
    Keith Thompson, Dec 28, 2006
  20. Nelu said:

    There isn't, as Prentice Hall will confirm if you ask them.

    Richard Heathfield, Dec 28, 2006
    1. Advertisements

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments (here). After that, you can post your question and our members will help you out.