Re: Storage management techniques in C

Discussion in 'C Programming' started by BGB / cr88192, Dec 24, 2009.

  1. "Richard Harter" <> wrote in message
    news:...
    >
    >
    > In this thread I would like to explore storage allocation
    > techniques in C programming.
    >
    > One of the performance issues of concern in C programming is the
    > amount of time spent in allocating and deallocating storage. In
    > measurements that I have made, and that others have reported,
    > upwards of 25% of the run time is spent in the allocator - and
    > it can be much higher, depending on the allocator design.
    >
    > The issue is simple; the more calls that are made to malloc and
    > free, the greater the time cost of memory allocation. What is
    > more, the likelihood of memory leaks goes up with the number of
    > calls to the allocator.
    >
    > What should be done to reduce the cost of allocation?
    >


    <snip>

    one thing that generally works fairly well IME for lots of objects all of a
    particular type and size (a fairly common case) is to have a special
    "free" function which doesn't actually free the object, but instead pushes
    it onto a type-specific free-list.

    when allocating an object, one first checks whether the free-list is
    non-NULL, and if so, simply grabs the first item and updates the list head
    to point at the next item.

    if the list is empty, fall back to a more general-purpose allocator.
    periodically, one may also release the free list for real (for example, via
    a GC callback, to maintain a memory quota, ...).

    in general, this can notably speed up allocation/deallocation for a given
    object type.
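
    a minimal sketch of the idea (the names are made up, and plain malloc()
    stands in as the general-purpose fallback):

    #include <stdlib.h>

    typedef struct FooObj_s FooObj;
    struct FooObj_s {
        FooObj *next;      //free-list link (unused while the object is live)
        int payload[4];    //whatever the object actually holds
    };

    static FooObj *foo_freelist = NULL;

    FooObj *FooObj_alloc(void)
    {
        FooObj *obj;
        if (foo_freelist) {                //fast path: pop from the free-list
            obj = foo_freelist;
            foo_freelist = obj->next;
            return obj;
        }
        return malloc(sizeof(FooObj));     //slow path: general-purpose allocator
    }

    void FooObj_free(FooObj *obj)
    {
        obj->next = foo_freelist;          //push onto the free-list, don't free()
        foo_freelist = obj;
    }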
     
    BGB / cr88192, Dec 24, 2009
    #1

  2. "Richard Harter" <> wrote in message
    news:...
    > On Thu, 24 Dec 2009 15:31:21 -0700, "BGB / cr88192"
    > <> wrote:
    >
    >>
    >>"Richard Harter" <> wrote in message
    >>news:...

    > <snip>
    >>
    >>one thing that generally works fairly well IME for lots of objects all of a
    >>particular type and size (a fairly common case) is to have a special
    >>"free" function which doesn't actually free the object, but instead pushes
    >>it onto a type-specific free-list.
    >>
    >><snip>

    >
    > It's a good technique; I've used it myself. However there are
    > situations where it doesn't work very well. See my reply to
    > bartc.
    >


    cases where objects are used in bursts, partial solutions:
    keep track of the number of free items and start actually freeing them once
    there are too many;
    in some cases, maybe also register a callback with the GC which can trigger
    bulk freeing if memory is tight.
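
    e.g., something along these lines (the cap value and names are invented for
    the sketch):

    #include <stdlib.h>

    typedef struct FooObj_s FooObj;
    struct FooObj_s { FooObj *next; int payload[4]; };

    static FooObj *foo_freelist = NULL;
    static int     foo_freecount = 0;

    #define FOO_FREE_MAX 4096              //arbitrary cap on hoarded objects

    void FooObj_free(FooObj *obj)
    {
        if (foo_freecount >= FOO_FREE_MAX) {
            free(obj);                     //list is full enough: really free it
            return;
        }
        obj->next = foo_freelist;
        foo_freelist = obj;
        foo_freecount++;
    }

    //hook this up as the GC / low-memory callback to dump the hoard in bulk
    void FooObj_flush_freelist(void)
    {
        while (foo_freelist) {
            FooObj *obj = foo_freelist;
            foo_freelist = obj->next;
            free(obj);
        }
        foo_freecount = 0;
    }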

    fragmentation:
    allocate objects in lumps (granted, this may not always turn out well, and
    may require special MM/GC support).

    actually, I have not really had too many problems with fragmentation IME,
    although I suspect this may be due to my use of generally "unorthodox"
    allocator algos (i.e.: bitmap and cell-based allocators, which oddly almost
    no one too far outside LISP-land tends to use...).

    granted, fragmentation-related performance issues may be difficult to
    detect, so possibly I might not have noticed them...


    in general though, in my newer GC I keep cons cells and object cells in
    different places, mostly for technical reasons.

    cons cells are in one place, with a special cons-specific allocator (special
    cases: a cons is always 2 pointers, and a cons is never more than 1 cell);
    object cells are in another place, and use a different allocator (for small
    objects, the allocator is specialized for small objects, as finding space
    for larger objects via bitmap searches would be slow);
    larger objects use free lists.

    one can debate what the exact ideal cutoff is between a cell-based allocator
    and a free-list allocator, but IME it hasn't been too much of an issue thus
    far.

    a cell allocator has a small per-object overhead, at the cost of a bitmap
    overhead that grows linearly with heap size; a free-list allocator has a
    higher per-object (header) overhead, but no such linear-space cost.

    for example, with masses of 10 byte objects, a 40 byte object header would
    be a killer;
    but, with multi-MB objects, a 6-8% space overhead would be a killer (say, if
    for every 10MB there were 820kB of bitmaps...), whereas the 40 byte object
    header would be insignificant.

    hence, the use of different strategies depending on object size, ...
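
    the top-level dispatch can then be as simple as this (the 256-byte cutoff
    and the names are made up, and the stubs just stand in for the real
    back-ends):

    #include <stdlib.h>

    #define SMALL_OBJ_MAX 256                     //made-up cutoff between strategies

    static void *cell_alloc(size_t size)     { return malloc(size); }  //bitmap/cell back-end
    static void *freelist_alloc(size_t size) { return malloc(size); }  //header+free-list back-end

    void *gc_alloc(size_t size)
    {
        if (size <= SMALL_OBJ_MAX)
            return cell_alloc(size);              //small: cell allocator
        return freelist_alloc(size);              //large: free-list allocator
    }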


    >
     
    BGB / cr88192, Dec 31, 2009
    #2

  3. "BGB / cr88192" <> wrote in message
    news:hhjcb5$pet$...

    > for example, with masses of 10 byte objects, a 40 byte object header would
    > be a killer;
    > but, with multi-MB objects, a 6-8% space overhead would be a killer (say,
    > if for every 10MB there were 820kB of bitmaps...), whereas the 40 byte
    > object header would be insignificant.


    What are the bitmaps used for?

    With 1 bit per byte, the overhead on 10MB is about 1.25MB. But with 1 bit
    per 16 bytes say, the overhead is some 80KB, some 0.8%.

    --
    Bartc
     
    bartc, Jan 1, 2010
    #3
  4. "bartc" <> wrote in message
    news:TAk%m.20822$...
    > "BGB / cr88192" <> wrote in message
    > news:hhjcb5$pet$...
    >
    >> for example, with masses of 10 byte objects, a 40 byte object header
    >> would be a killer;
    >> but, with multi-MB objects, a 6-8% space overhead would be a killer (say,
    >> if for every 10MB there were 820kB of bitmaps...), whereas the 40 byte
    >> object header would be insignificant.

    >
    > What are the bitmaps used for?
    >


    space allocation, and keeping track of which objects exist and where they
    live, ...

    this can be contrasted with the strategy of using object headers and linked
    lists for memory management.
    for example, with bitmaps, one need not actually even have object headers;
    in my case they exist mostly to contain semantic information (exact object
    size, type, flags, ...), and, given the lack of any embedded pointers, the
    entire header can be packed into 64 bits.

    some of my early allocators in this style did not have object headers, but
    eventually these headers became a part of the core MM/GC, as this ensured
    that the information in the header was available for every object
    (eliminating the occasional "odd man out" which did not have a header).
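
    such a packed header might look something like this (the field widths here
    are a guess, not the actual layout):

    #include <stdint.h>

    //size / type / flags packed into one 64-bit word, no pointers embedded
    typedef uint64_t ObjHeader;

    static ObjHeader objhdr_make(uint32_t size, uint32_t type, uint8_t flags)
    {
        return  (uint64_t)size                        //bits  0..31: object size
              | ((uint64_t)(type & 0xFFFFFF) << 32)   //bits 32..55: type id
              | ((uint64_t)flags             << 56);  //bits 56..63: flags
    }

    static uint32_t objhdr_size (ObjHeader h) { return (uint32_t)h; }
    static uint32_t objhdr_type (ObjHeader h) { return (uint32_t)((h >> 32) & 0xFFFFFF); }
    static uint8_t  objhdr_flags(ObjHeader h) { return (uint8_t)(h >> 56); }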


    for example, in a "traditional" MM, one could have a collection of
    free-lists (of assorted object sizes), and find a memory block of sufficient
    size from the appropriate list. in this case, the object would then likely
    be migrated to a list of live objects, with its size tracked via a size
    field in the header.

    so, a memory block might look something like:

    typedef struct MemBlock_s MemBlock;
    struct MemBlock_s {
        MemBlock *next_list;  //next block in free-list / live-list
        MemBlock *next_phys;  //next block physically (so we can merge/coalesce blocks)
        size_t size;          //object size
        int flags;            //some flags and stuff
        //...
        //data starts here
    };

    for very small objects, this header is massive...


    if the objects are, instead, tracked via an external bit pattern, we don't
    need to keep track of them via such headers.
    similarly, coalescing is "free", since freeing any 2 (or more) adjacent
    objects creates a bigger hole in the bitmap, meaning a space which can hold
    a bigger object (IOW: coalescing free space costs nothing extra).

    likewise, for GC this tends to be MUCH faster than working with free-lists.
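
    very roughly, the allocation side then amounts to scanning for a run of free
    cells and flipping them to "used", and freeing just clears the bits again (a
    toy sketch with a plain 1-bit-per-cell map; the real per-cell metadata is
    richer, as described further down):

    #include <stddef.h>

    #define CELL_SIZE 16
    #define NUM_CELLS (1 << 16)                      //64K cells = 1MB toy heap

    static unsigned char heap[NUM_CELLS * CELL_SIZE];
    static unsigned char cell_used[NUM_CELLS / 8];   //1 bit per 16-byte cell

    static int  cell_get(size_t i) { return (cell_used[i >> 3] >> (i & 7)) & 1; }
    static void cell_set(size_t i) { cell_used[i >> 3] |= (unsigned char)(1 << (i & 7)); }
    static void cell_clr(size_t i) { cell_used[i >> 3] &= (unsigned char)~(1 << (i & 7)); }

    //find a run of enough free cells, mark them used, hand back the space
    void *cell_alloc(size_t size)
    {
        size_t need = (size + CELL_SIZE - 1) / CELL_SIZE;
        size_t i, run = 0;
        if (!need) need = 1;
        for (i = 0; i < NUM_CELLS; i++) {
            run = cell_get(i) ? 0 : run + 1;
            if (run == need) {
                size_t base = i + 1 - need, j;
                for (j = base; j <= i; j++) cell_set(j);
                return heap + base * CELL_SIZE;
            }
        }
        return NULL;                                 //out of cells
    }

    //freeing just clears the bits; adjacent holes coalesce all by themselves
    void cell_free(void *p, size_t size)
    {
        size_t base = (size_t)((unsigned char *)p - heap) / CELL_SIZE;
        size_t need = (size + CELL_SIZE - 1) / CELL_SIZE, j;
        if (!need) need = 1;
        for (j = 0; j < need; j++) cell_clr(base + j);
    }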


    the one notable cost with this strategy is that performance turns to crap
    if one's objects are large, hence there is a good reason to manage larger
    objects via a free list instead (these 2 strategies seem to work fairly
    well when used in conjunction).

    this also works well because large objects tend to be fairly rare vs small
    objects, and the small number of large objects greatly reduces time spent
    in things like linked-list hopping.

    at the same time, I can also speed up the large-object GC by using a sorted
    array of large objects, so that I can locate objects quickly via a binary
    search, ...
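
    the lookup during a scan is then basically just an interval search over that
    sorted array, something like (structures made up for the sketch):

    #include <stddef.h>

    typedef struct {
        void  *base;
        size_t size;
    } LargeObj;

    static LargeObj large_objs[4096];   //kept sorted by base address
    static int      num_large;

    //binary search: which large object (if any) does this pointer fall inside?
    LargeObj *large_find(void *p)
    {
        int lo = 0, hi = num_large - 1;
        while (lo <= hi) {
            int mid = (lo + hi) / 2;
            unsigned char *base = large_objs[mid].base;
            if ((unsigned char *)p < base)
                hi = mid - 1;
            else if ((unsigned char *)p >= base + large_objs[mid].size)
                lo = mid + 1;
            else
                return &large_objs[mid];   //p points into this object
        }
        return NULL;                       //not a large-object pointer
    }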


    > With 1 bit per byte, the overhead on 10MB is about 1.25MB. But with 1 bit
    > per 16 bytes say, the overhead is some 80KB, some 0.8%.
    >


    it is typically 8 bits per 16 bytes in my current GC (or 6.25%).
    so, it is not "strictly" a bitmap in the 1-bit-per-item sense, but I lack a
    better term.

    in this setup:
    2 bits: cell type (free/header/data/reserved);
    2 bits: cell color (white/black/gray/extra_black);
    3 bits: gc-type (conservative/atomic, 'GCP': defiled, ref-count/many)
    1 bit: (?...)
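
    decoded, a per-cell byte would look something like this (the exact field
    positions are my reconstruction, so take them with a grain of salt):

    //one metadata byte per 16-byte cell (what the "bitmap" actually holds)
    #define CELL_TYPE(b)   ( (b)       & 0x03)   //2 bits: free/header/data/reserved
    #define CELL_COLOR(b)  (((b) >> 2) & 0x03)   //2 bits: white/black/gray/extra_black
    #define CELL_GCTYPE(b) (((b) >> 4) & 0x07)   //3 bits: conservative/atomic/...
    #define CELL_SPARE(b)  (((b) >> 7) & 0x01)   //1 bit: spare/unknown

    enum { CT_FREE = 0, CT_HEADER = 1, CT_DATA = 2, CT_RESERVED = 3 };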

    sadly, I can't really remember how all this works, as I didn't exactly
    document it so well, and getting this much required going back and trying to
    decipher a lot of the evil bit-twiddly...

    earlier versions of the GC (from years ago) used 4 bits per cell, but this
    was expanded to 8 bits both to improve performance and provide a few more
    bits for other uses.

    a bunch more info goes on in the object header (more bit-twiddly).

    'GCP' was an ill-advised feature (an attempt at pointer-based precise
    collection to be used in conjunction with conservative collection), which
    has mostly died back (it is no longer part of the main API), but it was
    what first added ref-counting in this incarnation (another past "sibling"
    GC had also added ref-counting earlier).

    there is also conservative ref-counting, implemented via 4 bits located in
    the object header, which is what is used now instead.
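
    with only 4 bits the counts basically have to saturate, roughly like this
    (just a sketch of the idea, not the actual code):

    #include <stdint.h>

    #define RC_SHIFT 0           //where the 4 refcount bits sit (made up here)
    #define RC_MASK  0x0F
    #define RC_MANY  0x0F        //saturated: "many" references

    static uint64_t rc_set(uint64_t hdr, unsigned rc)
    {
        return (hdr & ~((uint64_t)RC_MASK << RC_SHIFT)) | ((uint64_t)rc << RC_SHIFT);
    }

    static uint64_t rc_inc(uint64_t hdr)
    {
        unsigned rc = (hdr >> RC_SHIFT) & RC_MASK;
        if (rc < RC_MANY) rc++;            //stick at RC_MANY once saturated
        return rc_set(hdr, rc);
    }

    static uint64_t rc_dec(uint64_t hdr)
    {
        unsigned rc = (hdr >> RC_SHIFT) & RC_MASK;
        if (rc > 0 && rc < RC_MANY) rc--;  //a saturated count can never drop back to 0
        return rc_set(hdr, rc);
    }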


    > --
    > Bartc
     
    BGB / cr88192, Jan 2, 2010
    #4
