help with memory management strategy?

Discussion in 'C Programming' started by Ark Khasin, Nov 12, 2007.

  1. Ark Khasin

    Ark Khasin Guest

    My pet project is a command-line utility (preprocessor), so it runs and
    terminates. It uses lots of memory allocations; most of them are quite
    small. What to do with the allocated objects when they are no longer
    used? I consider the following "pure" strategies:
    - Meticulously free anything ever malloc'ed. This involves
    inefficiencies of calling free, but worse yet, in certain cases object
    duplication appears necessary to ensure uniqueness of custodial pointers.
    - Forget about no-longer-used objects and let them rot in the heap. This
    is the fastest and most compact but may lead to out-of-memory failures
    on a really complex processing job where method 1 would succeed. It
    would sound like negligence then, even though I've never observed this
    happen in practice.
    - Use a different allocator endowed with garbage collector. I am not
    concerned about non-deterministic performance since for my utility what
    counts is the total execution time. However, I am a little scared by
    disclaimers that a GC may be tricked into reclaiming memory which is
    still in use. [Does anyone know of a bullet-proof GC?]

    I'd like to ask for advice on which strategy (or a mix of them) is
    considered the best for this sort of tasks?
     
    Ark Khasin, Nov 12, 2007
    #1
    1. Advertisements

  2. Ark Khasin

    dj3vande Guest

    Don't worry about the inefficiencies of calling free; if they're
    noticeable at all, the inefficiencies of leaking a bunch of memory will
    be even worse.
    If you do this, make sure you document that the code doesn't clean up
    its allocated memory and is unsuitable for any re-use as part of a
    larger project.
    Better yet, just don't do it.
    (4) As (1), but use reference-counting to make your life easier.
    For objects that you'd want to have pointers to from widely different
    places, also allocate a reference count. When you're done with it in
    one place, decrement the reference count and if it's gone to zero
    deallocate it.
    This will give you a lot of the benefits of GC with very little of the
    hassle, but watch out for circular data structures.

    (5) Use pool allocators
    When you're doing a chunk of work that allocates memory that won't be
    needed after that chunk of work is done, store a pointer to the
    allocated memory somewhere that's easy to get at. When you're done
    with the memory, just forget about it; when you're done with the chunk
    of work, go through your list of allocated pointers and free all of
    them.

    (6) Re-think your design
    If you're having trouble keeping track of what's going where, that may
    indicate that your design is flawed. There may be a better design that
    makes it easier to keep track of things and make sure they all get
    deallocated when you're done with them.

    (7) Use different tools
    If you're suffering from the lack of a particular high-level
    abstraction, it may be easier to find a language that has it than to
    try to build it in C.
    (If you really do have a problem that lends itself well to GC, it might
    also be better to use a language with GC built in than to try to bolt
    it onto C.)

    I'd try (1), (6), (4), (5), and (7), in that order, and if I fell off
    the end of that list I'd give up and get a job mowing lawns instead.


    dave
     
    dj3vande, Nov 12, 2007
    #2
    1. Advertisements

  3. Ark Khasin

    Tor Rustad Guest

    Well, I would rather expect malloc() to be the heavy-weight call here,
    not free(). This method, is the traditional one, which works in all cases.
    Right, if being lazy, bad things can happen. If doing a lot of malloc's,
    I wouldn't recommend being lazy. :)
    There are issues with e.g. Boehm GC, but such cases appears to be rather
    constructed. For a non-safety related command line utility, I don't see
    the problem with using Boehm GC.
     
    Tor Rustad, Nov 12, 2007
    #3
  4. Ark Khasin

    Ark Khasin Guest

    The trouble is, the current version of the utility (Unimal) is used in
    the build process of a safety-critical product, so IMO it would not be
    wise to take chances. Currently, it uses a mix of thoughtful cleanup and
    deliberate leaks but I just thought there is a better way...
     
    Ark Khasin, Nov 12, 2007
    #4
  5. Ark Khasin

    CBFalconer Guest

    You omitted 'Use a malloc package that avoid the long sequences on
    free". I developed this because of the problem, and all operations
    are O(1). It is semi-portable, requiring being able to treat the
    memory as a big char array (or arrays), and needs gcc, because of
    gcc style variadic macros. It also depends on the presence of the
    sbrk() system call. It is available at:

    <http://cbfalconer.home.att.net/download/nmalloc.zip>
     
    CBFalconer, Nov 12, 2007
    #5
  6. Ark Khasin

    CBFalconer Guest

    To the contrary, the usual malloc package has O(n) performance for
    free, where n is the number of allocated blocks. The makes the
    operation of freeing a large segment of those blocks O(n * n).
    Elsethread I have given references to my malloc package, which is
    O(1) and makes the large segment of blocks free operation O(n).

    When you have n in the 10,000 to 20,000 range you can easily see
    the difference between O(n) and O(n * n) free() operation. My
    hashlib testing package has special provisions for this.
     
    CBFalconer, Nov 12, 2007
    #6
  7. Ark Khasin

    Tor Rustad Guest

    If building safety-critical program, I want tools which do their job
    well. During the build process, correctness of the output is critical,
    not really avoiding a failure to build. Extra memory can be added, so
    can swap space.

    The Boehm GC does indeed do some very low-level stuff, and using it will
    increase the intrinsic complexity of your tool, which may introduce some
    nasty bugs. An alternative, is to first use the GC as a *leak detector*,
    and also pull in a test-case where you compare the output, when GC is
    used and when it isn't. Those outputs should always be identical.

    If being in your shoes, I guess my choice would be going for explicit
    free() calls, but you are in a better position to judge this. Using GC
    as a leak-detector in debug builds for a while, will give you even
    better know how.
     
    Tor Rustad, Nov 13, 2007
    #7
  8. Ark Khasin

    Ark Khasin Guest

    Thank you indeed for this observation. It put me in the justifiably
    conservative mindset.
    This of course can uncover an error but cannot prove its absence. As a
    leak detector... I'll have to study this further because e.g. some leaks
    may remain intentional.
    Thank you
     
    Ark Khasin, Nov 14, 2007
    #8
    1. Advertisements

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments (here). After that, you can post your question and our members will help you out.