Defined and undefined C pointer manipulation

Discussion in 'C Programming' started by James Harris, Apr 26, 2014.

  1. (snip, someone wrote)
    One reason not to is the poor performance on virtual memory systems.
    I know some allocators keep at least part of the data elsewhere for
    that reason. If you follow a linked list through the beginning of
    each allocated block, you have to page in every one of those blocks.

    I know some do it differently, but I don't remember the details.
    You could keep the linked list somewhere else, with a pointer to its
    link in the list stored before each block, to make free fast.
    A hash table of free blocks should also be pretty fast.
    I suppose, but you could do that even without the data before
    each block.

    One that I have wondered about is web browsers and the data that
    they keep for each tab or window. Many perform poorly, doing a lot
    of paging, as their memory use gets big.

    -- glen
     
    glen herrmannsfeldt, Apr 27, 2014
    #21
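
    A minimal sketch of the "put the linked list somewhere else" idea: the free
    list is threaded through a separate table of records, so walking it never
    pages in the blocks themselves, and each block carries only a back-pointer
    to its record so that freeing stays fast. The names (ob_alloc, ob_free,
    struct record) are illustrative, and alignment niceties are ignored.

    #include <stddef.h>
    #include <stdlib.h>

    struct record {
        struct record *next;     /* free list lives here, out of band      */
        void          *block;    /* the malloc()ed block this describes    */
        size_t         size;
    };

    static struct record *free_list;      /* records whose blocks are free */

    void *ob_alloc(size_t size)
    {
        /* First-fit search touches only the records, never the blocks. */
        for (struct record **rp = &free_list; *rp; rp = &(*rp)->next) {
            if ((*rp)->size >= size) {
                struct record *r = *rp;
                *rp = r->next;
                return (struct record **)r->block + 1;   /* skip back-pointer */
            }
        }

        /* Nothing reusable: get a fresh block with room for the back-pointer. */
        struct record *r = malloc(sizeof *r);
        void *b = r ? malloc(sizeof(struct record *) + size) : NULL;
        if (b == NULL) { free(r); return NULL; }
        r->next  = NULL;
        r->block = b;
        r->size  = size;
        *(struct record **)b = r;                        /* back-pointer */
        return (struct record **)b + 1;
    }

    void ob_free(void *p)
    {
        /* O(1): the back-pointer just before the user data finds the record,
           which goes back onto the out-of-band free list. */
        struct record *r = ((struct record **)p)[-1];
        r->next   = free_list;
        free_list = r;
    }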

  2. (snip)
    The OS/360 (and successors) GETMAIN/FREEMAIN allows freeing
    part of a previous allocation. You can allocate a big block,
    then free the middle of it, leaving two smaller blocks.

    There is also a call that takes a minimum and a maximum, such that
    any sized block in that range can be returned, along with its length.
    That might be nice for a hash table, where bigger is better but
    smaller still works, and it can also reduce fragmentation.

    -- glen
     
    glen herrmannsfeldt, Apr 27, 2014
    #22
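
    This is not the actual GETMAIN interface, but a minimal C sketch of that
    kind of variable-sized request: ask for anything between a minimum and a
    maximum and be told how much you actually received. The back-off strategy
    here is purely illustrative.

    #include <stdlib.h>

    /* Return a block of at least min and at most max bytes, reporting the
       actual size in *got; useful when bigger is better but smaller will do. */
    void *alloc_between(size_t min, size_t max, size_t *got)
    {
        for (size_t n = max; ; n = n / 2 > min ? n / 2 : min) {
            void *p = malloc(n);
            if (p) { *got = n; return p; }
            if (n == min) break;          /* even the minimum request failed */
        }
        *got = 0;
        return NULL;
    }

    A hash table, for instance, could then size its bucket array to whatever
    came back.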

  3. (snip)
    The BLAST program for DNA and protein sequence searching does that.
    It generates finite state automata in an array of pointers, but also
    needs a bit to indicate when a match is made. On byte-addressed
    machines, where aligned pointers have low zero bits, it uses the
    low bit. On word-addressed Cray machines, it uses a high-order bit.
    (More specifically, it generates a DAWG.)

    I believe the comments specifically indicate Cray.

    In normal use, most characters are not matches, so the pointer is
    used directly, for a very fast inner loop. In the case of a match,
    it exits the loop and processes the hit.

    (It was some years ago that I was working with BLAST source.
    It could have changed by now.)

    -- glen
     
    glen herrmannsfeldt, Apr 27, 2014
    #23
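
    A minimal sketch of the low-bit tagging trick (not BLAST's actual code):
    because suitably aligned node pointers have a zero low bit, that bit can
    carry a "this transition is a match" flag, and the fast loop only slows
    down when it sees the tag. The state type and the four-letter alphabet
    are illustrative.

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    typedef struct state {
        struct state *next[4];           /* transitions for an A/C/G/T alphabet */
    } state;

    static state *tag_match(state *s)      { return (state *)((uintptr_t)s | 1u); }
    static int    is_match(const state *s) { return (int)((uintptr_t)s & 1u); }
    static state *untag(state *s)          { return (state *)((uintptr_t)s & ~(uintptr_t)1); }

    int main(void)
    {
        state *a = calloc(1, sizeof *a);
        state *b = calloc(1, sizeof *b);
        if (!a || !b) { free(a); free(b); return 1; }

        a->next[0] = b;                  /* ordinary transition                   */
        a->next[1] = tag_match(b);       /* same target, but reaching it is a hit */

        /* The inner loop just follows pointers; only a tagged one forces the
           slower "process the hit" path. */
        state *t = a->next[1];
        if (is_match(t))
            printf("hit, continue from state %p\n", (void *)untag(t));

        free(a);
        free(b);
        return 0;
    }
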
  4. BartC

    That's exactly what I've used for many years. It works extremely well.

    And you will know the size of a block more often than you might think. If
    you are allocating space for a struct, you obviously know the size of
    that struct. If you have a dynamic array, it's of little use unless you know
    its bounds.

    Only for things such as zero-terminated strings, where you do not store the
    length, would you need to calculate it to free it.

    Furthermore, if malloc() and free() already worked like that, it would be
    easier to build size-retaining versions on top of them than to do it the
    other way around.

    (I haven't considered the usefulness of block-size information when
    malloc/free need to manage memory blocks. The last time I implemented
    something like that and needed to know this, I used a separate bit-map; for
    allocations based on 16-byte chunks, the overhead would be only 0.8%. But
    for small power-of-two allocations, it hasn't been necessary and
    fragmentation hasn't been a problem.)
     
    BartC, Apr 28, 2014
    #24
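
    A minimal sketch, under assumptions of my own, of that kind of bit-map
    bookkeeping: one bit per 16-byte chunk of a pool (1/128 of the pool,
    roughly the 0.8% mentioned), with the caller handing the size back to the
    free routine, as in the size-retaining interface being discussed. The
    names and the pool size are illustrative.

    #include <stddef.h>

    #define POOL_CHUNKS 4096                       /* 4096 * 16 bytes = 64 KiB  */
    static unsigned char pool[POOL_CHUNKS * 16];
    static unsigned char used[POOL_CHUNKS / 8];    /* one bit per 16-byte chunk */

    static int  chunk_used(size_t i)  { return used[i / 8] >> (i % 8) & 1; }
    static void mark_chunk(size_t i, int v)
    {
        if (v) used[i / 8] |= (unsigned char)(1u << (i % 8));
        else   used[i / 8] &= (unsigned char)~(1u << (i % 8));
    }

    void *salloc(size_t size)
    {
        size_t need = (size + 15) / 16, run = 0;
        if (need == 0) need = 1;
        for (size_t i = 0; i < POOL_CHUNKS; i++) {
            run = chunk_used(i) ? 0 : run + 1;
            if (run == need) {                     /* found a long enough run */
                size_t first = i - need + 1;
                for (size_t j = first; j <= i; j++) mark_chunk(j, 1);
                return pool + first * 16;
            }
        }
        return NULL;                               /* pool exhausted */
    }

    void sfree(void *p, size_t size)               /* caller supplies the size */
    {
        size_t first = (size_t)((unsigned char *)p - pool) / 16;
        size_t need  = (size + 15) / 16;
        if (need == 0) need = 1;
        for (size_t j = first; j < first + need; j++) mark_chunk(j, 0);
    }
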
  5. Stefan Ram

    One does not have to abandon the standard malloc. Instead, one can allocate
    a large chunk for, say, 100 fixed blocks at once using malloc and then manage
    the memory inside this chunk oneself.

    One can save some of the effort of managing this by using container
    libraries. For example, C++ has ::std::deque, which can manage the block
    allocations transparently: C++ will by itself allocate larger chunks to
    store the entries of an ::std::deque. One just has to add a record of the
    free entries (so they can be reused before new entries are allocated) to
    have full memory management, based on malloc but possibly more efficient
    than one malloc/free per entry.
     
    Stefan Ram, Apr 28, 2014
    #25
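
    Staying in C rather than C++, a minimal sketch of the same idea: one
    malloc() call grabs a chunk of 100 fixed-size blocks, the free entries are
    kept on an internal list, and they are reused before malloc() is called
    again (the chunks themselves are never returned to malloc in this sketch).
    The block size, the count of 100 and the names are illustrative.

    #include <stdlib.h>

    enum { BLOCK_SIZE = 64, BLOCKS_PER_CHUNK = 100 };

    union block { union block *next; char data[BLOCK_SIZE]; };

    static union block *free_blocks;               /* record of the free entries */

    static int grow(void)                          /* one malloc for 100 blocks  */
    {
        union block *chunk = malloc(BLOCKS_PER_CHUNK * sizeof *chunk);
        if (chunk == NULL) return 0;
        for (int i = 0; i < BLOCKS_PER_CHUNK; i++) {   /* thread onto free list */
            chunk[i].next = free_blocks;
            free_blocks = &chunk[i];
        }
        return 1;
    }

    void *block_alloc(void)
    {
        if (free_blocks == NULL && !grow()) return NULL;
        union block *b = free_blocks;
        free_blocks = b->next;
        return b;
    }

    void block_free(void *p)                       /* reusable on the next alloc */
    {
        union block *b = p;
        b->next = free_blocks;
        free_blocks = b;
    }
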
  6. Ian Collins

    I find it's better to stick to the standard interface and use the best
    memory allocator library (at least in the Unix world, there tend to be
    several allocator libraries available on any one platform) for a
    particular application. What works best in a single threaded
    application may be hopeless in a multi-threaded one.
     
    Ian Collins, Apr 28, 2014
    #26
  7. This won't work because, to put it rather informally, # lines are
    processed before sizeof means anything. Sorry I can't offer an
    alternative.
     
    Ben Bacarisse, Apr 28, 2014
    #27
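
    The code being replied to was snipped, so this is only a guess at its
    shape, but the point can be shown in a few lines: by the time sizeof means
    anything the preprocessor has already run, so a size test on a # line has
    to use something the preprocessor knows about (such as the <limits.h>
    macros), or be left to the compiler.

    /* #if sizeof(long) == 8    -- rejected: sizeof has no meaning on a # line */

    #include <limits.h>

    #if LONG_MAX >= 0x7fffffffffffffff      /* the preprocessor can compare macros */
    typedef long my_int64;                  /* my_int64 is an illustrative name    */
    #else
    typedef long long my_int64;
    #endif

    /* Or let the compiler check after preprocessing: negative array size if wrong. */
    typedef char my_int64_is_wide_enough[sizeof(my_int64) >= 8 ? 1 : -1];
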
  8. James Kuyper

    The C standard never disallows anything - it just tells you what behavior
    is mandatory, prohibited, or neither, when code containing certain
    features is processed by a conforming implementation of C. That having
    been said, 6.5.12 is a constraint section, so code which doesn't conform
    to that "shall" is a constraint violation - which is as close as C ever
    comes to disallowing something.


    Way too far. The standard imposes no requirements on how an
    implementation deals with such code, except for the mandatory diagnostic.

    It's defined in the C standard, as being part of the C standard library.
    It's an optional part - only hosted implementations of C need to support
    it - but I'd have expected you to use different wording if that had been
    the distinction you were making.
    There are other allocators defined as being part of the C standard
    library (several were added in C2011). Any of those could be used
    without making the program any less of a C program. The same would not
    be true of any allocator that is neither defined by the C standard nor
    implemented using strictly conforming C code.
     
    James Kuyper, Apr 28, 2014
    #28
  9. There are two common patterns for heap memory use.

    The first is that calls to the allocator are matched by calls to the
    deallocator, either in the same function, or in nested constructor/
    destructor calls.
    The second is that a persistent graph is created, consisting of a large
    number of relatively small fixed-size nodes, linked by pointers.

    In both these cases it's easy to use extremely efficient algorithms to
    replace malloc(). Malloc() can then be kept in reserve for the few cases
    that don't fit (e.g. an expandable buffer not in a leaf routine).
    However it's not normally worth it, because whilst the performance gain
    will be decent, it won't usually transform the program, and it involves
    putting extra complexity and dependency into the code.
     
    Malcolm McLean, Apr 28, 2014
    #29
  10. Ian Collins

    You forgot 3: allocation and de-allocation in different threads.
    Really? If so, why do system designers spend so much effort
    implementing (usually more than one) efficient memory allocators?
     
    Ian Collins, Apr 28, 2014
    #30
  11. BartC

    There are lots of patterns. For example, large numbers of blocks of the same
    size, but not necessarily linked to each other. Or large numbers of blocks
    of different sizes, which are never going to be freed (until the program
    terminates). Or a combination of these. Or a collection of blocks which will
    eventually be freed en masse. Then there are blocks that can grow in size,
    and those that will be fixed (and won't need any over-allocation).

    But they will still depend on some underlying allocator of large,
    arbitrary-sized blocks.
     
    BartC, Apr 28, 2014
    #31
  12. There are other patterns. There are two common patterns that are easy
    to optimise. I gave an example of another, less easy to optimise pattern.
    Because they're trying to keep the same interface as malloc(), rather
    than doing a higher-level analysis of what patterns are actually going
    to be needed first? Because they are doing a higher-level analysis, but
    not doing it very well, because they are maybe less skilled, or less
    highly qualified than I am? Because they know that very simple efficient
    algorithms are available, but there's a case for using a very complex,
    over-engineered one, and they get paid to do engineering, not as a
    share of the venture's profits? Because they're writing for small specialised
    systems which have rather different requirements to those running
    general-purpose programs?

    There are lots of possible explanations.

    A stack allocator and a fixed block allocator are trivial to write, and will
    cover maybe 80% of memory allocations if a typical program is written
    to take advantage of them. I haven't actually done any statistics, but I
    know from long experience of programming that it's going to be something
    like that. But it does add an extra burden to the programmer. The stack
    allocator has to be called very carefully, with matching allocates and frees,
    and the fixed block allocators need setting up with the block size and the
    expected total allocation. But you can get a significant increase in
    performance. You can't alter the big-O complexity of the program, however,
    so you're never going to totally transform it.
     
    Malcolm McLean, Apr 28, 2014
    #32
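
    A minimal sketch of the kind of stack allocator being described, with
    illustrative names and sizes: allocation bumps a pointer, and freeing
    restores a previously saved mark, which is exactly why the allocates and
    frees have to nest properly.

    #include <stddef.h>

    #define STACK_BYTES (1 << 20)                   /* 1 MiB, illustrative */

    static unsigned char stack_pool[STACK_BYTES];
    static size_t stack_top;                        /* index of next free byte */

    void *stack_alloc(size_t size)
    {
        size = (size + 15) & ~(size_t)15;           /* keep blocks 16-byte aligned */
        if (size > STACK_BYTES - stack_top) return NULL;
        void *p = stack_pool + stack_top;
        stack_top += size;
        return p;
    }

    size_t stack_mark(void)         { return stack_top; }   /* remember a point */
    void   stack_release(size_t m)  { stack_top = m; }      /* free back to it  */

    A caller saves a mark on entry and releases back to it on exit, freeing
    everything allocated in between in one step.
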
  13. Ian Collins

    Normally because they want to get the best performance from a range of
    applications and hardware. The better their allocator behaves, the
    better the synthetic benchmark results they can brag about.
    So you end up reinventing a slab allocator, which any decent system will
    already have.
     
    Ian Collins, Apr 28, 2014
    #33
  14. Implementing. I did actually realise that fixed blocks could be allocated very
    easily and very quickly independently, but that was a long time ago, when I
    first started programming. Many people have used them since, and I'm sure
    I wasn't the first person to write one - I don't actually know if anyone lays
    claim to the title of first inventor. In games, where performance is at a premium,
    it's a standard technique.

    You can use one that already exists, or knock one up in five minutes, or take the one
    from my book "Basic Algorithms". It hardly matters. The important point is that
    a lot of persistent structures consist of large numbers of small blocks of equal size,
    linked together with pointers to form a graph. In this common situation, you can
    use a fixed block allocator. You might shave off an extra cycle or two by writing it
    specially for the hardware, but basically you can't beat a single pointer dereference
    and write to allocate and to free.

    However they are a bit fiddly. You have to set them up with the block size, and you
    have to decide how you will set the maximum allocation limit, or if you'll make it soft.
    So it's easier just to call malloc() for most purposes.
     
    Malcolm McLean, Apr 28, 2014
    #34
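
    A minimal sketch of that setup (illustrative names, not taken from any
    particular book or library): the pool is created with a block size and a
    hard maximum, and after that both allocating and freeing are a single read
    and write of a free-list link.

    #include <stdlib.h>
    #include <string.h>

    typedef struct fbpool {
        char  *store;        /* maxblocks * blocksize bytes                  */
        void  *free_list;    /* first word of each free block holds the link */
        size_t blocksize;
    } fbpool;

    /* Set up with the block size and the expected total allocation. */
    int fb_init(fbpool *p, size_t blocksize, size_t maxblocks)
    {
        if (blocksize < sizeof(void *)) blocksize = sizeof(void *);
        p->store = malloc(blocksize * maxblocks);
        if (p->store == NULL) return 0;
        p->blocksize = blocksize;
        p->free_list = NULL;
        for (size_t i = 0; i < maxblocks; i++) {       /* chain all the blocks */
            void *b = p->store + i * blocksize;
            memcpy(b, &p->free_list, sizeof p->free_list);
            p->free_list = b;
        }
        return 1;
    }

    void *fb_alloc(fbpool *p)                  /* one read of the link */
    {
        void *b = p->free_list;
        if (b) memcpy(&p->free_list, b, sizeof p->free_list);
        return b;                              /* NULL when the hard limit is hit */
    }

    void fb_free(fbpool *p, void *b)           /* one write of the link */
    {
        memcpy(b, &p->free_list, sizeof p->free_list);
        p->free_list = b;
    }
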
  15. You're absolutely right. But look at any code you happen to have written. Almost
    certainly you'll find that the calls to malloc() mostly look like this

    void foo()
    {
        char *buff1 = malloc(N);
        char *buff2 = malloc(M);

        callsubroutines();
        free(buff1);
        free(buff2);
    }

    or like this

    void foo()
    {
        OPAQUE *object = constructobject();

        callsubroutines();

        destroyobject(object);
    }

    So they can be replaced with a stack allocator, but you have to be careful. For example,
    the free calls in the first example need to be reversed.

    The other common situation is

    void foo()
    {
        NODE *root = rootnode;

        while (complex_condition)
        {
            NODE *sub = findnode(root, complex_criterion);
            addordeleteanode(sub);
            callsubroutines();
        }

        freealltheremainingnodes(root);
    }

    So you can use the fixed block allocator.

    Now not absolutely everything can be reworked with a little effort
    into these two patterns. As Ian said, if for some reason you need to
    allocate memory in one thread and free it in another, you're not
    going to be able to use a stack allocator, for example.

    But most can.
     
    Malcolm McLean, Apr 28, 2014
    #35