Linked list allocation

Discussion in 'C Programming' started by Jeff Bown, Dec 17, 2007.

  1. Jeff Bown

    Jeff Bown Guest

    Consider implementing a (doubly) linked list.

    The simplest strategy is to provide functions
    item_t add_item(item_t predecessor);
    void delete_item(item_t item);
    where add_item allocates memory for a new list item and returns it (or
    NULL), and delete_item frees that memory.

    However, if at startup the program immediately adds a large number of
    items to a list, then all those calls to malloc() become expensive.

    An alternative I thought of was that add_item should allocate a block,
    say of size (100 * size-needed-by-an-item), then use pointers inside
    this block until the time comes to allocate a new block for 100 items.

    The problem with this is that deleting items becomes a mess - you can
    only free a block when all 100 items in it have been deleted, so in the
    worst case (where the programmer allocates 101 items, frees 99, adds
    101, ...) you end up with for all intents and purposes a memory leak.

    Is there a better solution?
     
    Jeff Bown, Dec 17, 2007
    #1
    1. Advertising

  2. Jeff Bown

    Willem Guest

    Jeff wrote:
    ) Consider implementing a (doubly) linked list.
    )
    ) The simplest strategy is to provide functions
    ) item_t add_item(item_t predecessor);
    ) void delete_item(item_t item);
    ) where add_item allocates memory for a new list item and returns it (or
    ) NULL), and delete_item frees that memory.
    )
    ) However, if at startup the program immediately adds a large number of
    ) items to a list, then all those calls to malloc() become expensive.
    )
    ) An alternative I thought of was that add_item should allocate a block,
    ) say of size (100 * size-needed-by-an-item), then use pointers inside
    ) this block until the time comes to allocate a new block for 100 items.
    )
    ) The problem with this is that deleting items becomes a mess - you can
    ) only free a block when all 100 items in it have been deleted, so in the
    ) worst case (where the programmer allocates 101 items, frees 99, adds
    ) 101, ...) you end up with for all intents and purposes a memory leak.
    )
    ) Is there a better solution?

    It's very difficult for a solution to be strictly 'better' than another
    solution. So, you'll need to examine several solutions and see which best
    fits your needs.

    For example: With your alternative, you can reuse spots within a block if
    you can mark them 'empty'. This induces an extra cost of searching for
    empty blocks, however.

    For another example: You can store groups of items, together with a count,
    into larger blocks. This is tricky to get right, though, and you need to
    move memory inside a block around when deleting or adding items.

    For yet another example: You can periodically 'rehash' the list, sticking
    it into one or more big blocks sequentially. Like defragmenting a filesystem.


    SaSW, Willem
    --
    Disclaimer: I am in no way responsible for any of the statements
    made in the above text. For all I know I might be
    drugged or something..
    No I'm not paranoid. You all think I'm paranoid, don't you !
    #EOT
     
    Willem, Dec 17, 2007
    #2
    1. Advertising

  3. Jeff Bown

    Ben Pfaff Guest

    Jeff Bown <> writes:

    > Consider implementing a (doubly) linked list.
    >
    > The simplest strategy is to provide functions
    > item_t add_item(item_t predecessor);
    > void delete_item(item_t item);
    > where add_item allocates memory for a new list item and returns it (or
    > NULL), and delete_item frees that memory.
    >
    > However, if at startup the program immediately adds a large number of
    > items to a list, then all those calls to malloc() become expensive.


    A better solution is sometimes to embed the list elements
    directly in the elements being added to the list. If each
    element is only on at most a fixed number of lists, this approach
    is cheaper in terms of the number of necessary allocations.
    --
    Ben Pfaff
    http://benpfaff.org
     
    Ben Pfaff, Dec 17, 2007
    #3
  4. Jeff Bown

    pete Guest

    Jeff Bown wrote:
    >
    > Consider implementing a (doubly) linked list.
    >
    > The simplest strategy is to provide functions
    > item_t add_item(item_t predecessor);
    > void delete_item(item_t item);
    > where add_item allocates memory for a new list item and returns it (or
    > NULL), and delete_item frees that memory.
    >
    > However, if at startup the program immediately adds a large number of
    > items to a list, then all those calls to malloc() become expensive.
    >
    > An alternative I thought of was that add_item should allocate a block,
    > say of size (100 * size-needed-by-an-item), then use pointers inside
    > this block until the time comes to allocate a new block for 100 items.
    >
    > The problem with this is that deleting items becomes a mess - you can
    > only free a block when all 100 items in it have been deleted,
    > so in the
    > worst case (where the programmer allocates 101 items, frees 99, adds
    > 101, ...) you end up with for all intents and purposes a memory leak.
    >
    > Is there a better solution?


    Yes.
    The solution is to realise that the problem does not exist.

    The scenario that you describe
    of a whole bunch of malloc at startup
    to the extent that it makes a noticable difference
    and also of mallocing one big object
    for parts of your list to occupy,
    suggests that
    you have no idea of why you're using a list intsead of an array.

    The best use for a list,
    is when you want to grow your data format
    as data becomes available to the program.

    Freeing a list isn't supposed to be difficult.

    typedef struct list_node {
    struct list_node *next;
    void *data;
    } list_type;

    void list_free(list_type *node, void (*free_data)(void *))
    {
    list_type *next_node;

    while (node != NULL) {
    next_node = node -> next;
    free_data(node -> data);
    free(node);
    node = next_node;
    }
    }

    --
    pete
     
    pete, Dec 17, 2007
    #4
  5. Jeff Bown

    Gene Guest

    On Dec 17, 1:51 pm, Jeff Bown <> wrote:
    > Consider implementing a (doubly) linked list.
    >
    > The simplest strategy is to provide functions
    > item_t add_item(item_t predecessor);
    > void delete_item(item_t item);
    > where add_item allocates memory for a new list item and returns it (or
    > NULL), and delete_item frees that memory.
    >
    > However, if at startup the program immediately adds a large number of
    > items to a list, then all those calls to malloc() become expensive.
    >
    > An alternative I thought of was that add_item should allocate a block,
    > say of size (100 * size-needed-by-an-item), then use pointers inside
    > this block until the time comes to allocate a new block for 100 items.
    >
    > The problem with this is that deleting items becomes a mess - you can
    > only free a block when all 100 items in it have been deleted, so in the
    > worst case (where the programmer allocates 101 items, frees 99, adds
    > 101, ...) you end up with for all intents and purposes a memory leak.
    >
    > Is there a better solution?


    The usual idea is to maintain a free list where deleted cells are held
    pending future allocations. There aren't many cases where it would pay
    to return blocks to the malloc() heap. Unused free blocks will be
    paged out in due course, where they consume only some swap space - of
    interest only for _really_ long lists.

    Pete's rather condescending suggestion that you don't need lists and
    ought to use vectors instead has a flaw: O(1) time deletion of an
    arbitrary element can't be accomplished with a vector. It's perfectly
    reasonable to build up a big list at program startup if e.g. future
    computations involve decimating the list by arbitrary deletions.
     
    Gene, Dec 18, 2007
    #5
  6. On Mon, 17 Dec 2007 19:51:32 +0100 (CET), Jeff Bown
    <> wrote:

    >Consider implementing a (doubly) linked list.
    >
    >The simplest strategy is to provide functions
    >item_t add_item(item_t predecessor);
    >void delete_item(item_t item);
    >where add_item allocates memory for a new list item and returns it (or
    >NULL), and delete_item frees that memory.
    >
    >However, if at startup the program immediately adds a large number of
    >items to a list, then all those calls to malloc() become expensive.
    >
    >An alternative I thought of was that add_item should allocate a block,
    >say of size (100 * size-needed-by-an-item), then use pointers inside
    >this block until the time comes to allocate a new block for 100 items.
    >
    >The problem with this is that deleting items becomes a mess - you can
    >only free a block when all 100 items in it have been deleted, so in the
    >worst case (where the programmer allocates 101 items, frees 99, adds
    >101, ...) you end up with for all intents and purposes a memory leak.
    >
    >Is there a better solution?


    Your description is a little confusing - to me anyway, and
    perhaps to others judging from the variety of answers you got.

    Ordinarily a linked list (single, or double) consists of nodes,
    where a node (doubly linked) looks something like this:

    struct node {
    node * prev; /* Link to previous node */
    node * next; /* Link to next node */
    void * data; /* generic pointer to data */
    };

    and the prototypes look something like this:

    node * add_item(node * predecessor, void * datum);
    void delete_item(node * item);

    Usually you would have a few more routines but set that aside for
    the moment. The point I am after is that you seem to be mixing
    up nodes and data.

    Now, supposing these are your prototypes, how do you go about
    implementing them? I will suppose that you can write the code
    for inserting and deleting nodes from linked lists. The question
    at hand is: what about storage allocation and deallocation? It
    is important to recognize that there are two different kinds of
    things being allocated and freed, nodes and data.

    A generic linked list package shouldn't be responsible for
    allocating or freeing data, though there are standard techniques
    that can be used.

    For nodes a common technique is to (a) use a free list, and (b)
    allocate blocks of nodes, link them together, and add them to the
    free list. When adding items nodes are popped off the free list;
    when deleting items nodes are pushed onto the free list.

    HTH. HAND.
     
    Richard Harter, Dec 18, 2007
    #6
  7. Jeff Bown wrote:
    > Consider implementing a (doubly) linked list.
    >
    > The simplest strategy is to provide functions
    > item_t add_item(item_t predecessor);
    > void delete_item(item_t item);
    > where add_item allocates memory for a new list item and returns it (or
    > NULL), and delete_item frees that memory.
    >
    > However, if at startup the program immediately adds a large number of
    > items to a list, then all those calls to malloc() become expensive.
    >
    > An alternative I thought of was that add_item should allocate a block,
    > say of size (100 * size-needed-by-an-item), then use pointers inside
    > this block until the time comes to allocate a new block for 100 items.
    >
    > The problem with this is that deleting items becomes a mess - you can
    > only free a block when all 100 items in it have been deleted, so in the
    > worst case (where the programmer allocates 101 items, frees 99, adds
    > 101, ...) you end up with for all intents and purposes a memory leak.
    >
    > Is there a better solution?


    You can use a pool of linked list structs, and some malloc
    implementations actually create a sort of pool for objects of the same
    size. Perhaps a different type of data structure would be a better fit
    though. Often times programmers will use a linked list when an array of
    structs will do.

    There are also certain advantages to using a linked-list pool, or a
    specialized pool for any data structure. For instance if you're dumping
    a tree or linked list to the disk, and want to restore with minimal
    relocation and cost for the ability to dump the data structure to the
    disk, and restore. For the relocation you generally want to use an
    optional part of C99 (an intptr_t or uintptr_t) though.


    George
     
    George Peter Staplin, Dec 19, 2007
    #7
  8. Jeff Bown

    henry Guest

    On Dec 17, 9:34 pm, (Richard Harter) wrote:
    > On Mon, 17 Dec 2007 19:51:32 +0100 (CET), Jeff Bown
    >
    >
    >
    > <> wrote:
    > >Consider implementing a (doubly) linked list.

    >
    > >The simplest strategy is to provide functions
    > >item_t add_item(item_t predecessor);
    > >void delete_item(item_t item);
    > >where add_item allocates memory for a new list item and returns it (or
    > >NULL), and delete_item frees that memory.

    >
    > >However, if at startup the program immediately adds a large number of
    > >items to a list, then all those calls to malloc() become expensive.

    >
    > >An alternative I thought of was that add_item should allocate a block,
    > >say of size (100 * size-needed-by-an-item), then use pointers inside
    > >this block until the time comes to allocate a new block for 100 items.

    >
    > >The problem with this is that deleting items becomes a mess - you can
    > >only free a block when all 100 items in it have been deleted, so in the
    > >worst case (where the programmer allocates 101 items, frees 99, adds
    > >101, ...) you end up with for all intents and purposes a memory leak.

    >
    > >Is there a better solution?

    >
    > Your description is a little confusing - to me anyway, and
    > perhaps to others judging from the variety of answers you got.
    >
    > Ordinarily a linked list (single, or double) consists of nodes,
    > where a node (doubly linked) looks something like this:
    >
    > struct node {
    > node * prev; /* Link to previous node */
    > node * next; /* Link to next node */
    > void * data; /* generic pointer to data */
    >
    > };
    >
    > and the prototypes look something like this:
    >
    > node * add_item(node * predecessor, void * datum);
    > void delete_item(node * item);
    >
    > Usually you would have a few more routines but set that aside for
    > the moment. The point I am after is that you seem to be mixing
    > up nodes and data.
    >
    > Now, supposing these are your prototypes, how do you go about
    > implementing them? I will suppose that you can write the code
    > for inserting and deleting nodes from linked lists. The question
    > at hand is: what about storage allocation and deallocation? It
    > is important to recognize that there are two different kinds of
    > things being allocated and freed, nodes and data.
    >
    > A generic linked list package shouldn't be responsible for
    > allocating or freeing data, though there are standard techniques
    > that can be used.
    >
    > For nodes a common technique is to (a) use a free list, and (b)
    > allocate blocks of nodes, link them together, and add them to the
    > free list. When adding items nodes are popped off the free list;
    > when deleting items nodes are pushed onto the free list.
    >
    > HTH. HAND.


    I see some list add item using ** such as
    node * add_item(node ** predecessor, void * datum);

    Any benefit compare to node * add_item(node * predecessor, void *
    datum);?
     
    henry, Dec 26, 2007
    #8
  9. On Wed, 26 Dec 2007 11:20:26 -0800 (PST), henry
    <> wrote:

    >On Dec 17, 9:34 pm, (Richard Harter) wrote:
    >> On Mon, 17 Dec 2007 19:51:32 +0100 (CET), Jeff Bown
    >>
    >>
    >>
    >> <> wrote:
    >> >Consider implementing a (doubly) linked list.

    >>
    >> >The simplest strategy is to provide functions
    >> >item_t add_item(item_t predecessor);
    >> >void delete_item(item_t item);
    >> >where add_item allocates memory for a new list item and returns it (or
    >> >NULL), and delete_item frees that memory.

    >>
    >> >However, if at startup the program immediately adds a large number of
    >> >items to a list, then all those calls to malloc() become expensive.

    >>
    >> >An alternative I thought of was that add_item should allocate a block,
    >> >say of size (100 * size-needed-by-an-item), then use pointers inside
    >> >this block until the time comes to allocate a new block for 100 items.

    >>
    >> >The problem with this is that deleting items becomes a mess - you can
    >> >only free a block when all 100 items in it have been deleted, so in the
    >> >worst case (where the programmer allocates 101 items, frees 99, adds
    >> >101, ...) you end up with for all intents and purposes a memory leak.

    >>
    >> >Is there a better solution?

    >>
    >> Your description is a little confusing - to me anyway, and
    >> perhaps to others judging from the variety of answers you got.
    >>
    >> Ordinarily a linked list (single, or double) consists of nodes,
    >> where a node (doubly linked) looks something like this:
    >>
    >> struct node {
    >> node * prev; /* Link to previous node */
    >> node * next; /* Link to next node */
    >> void * data; /* generic pointer to data */
    >>
    >> };
    >>
    >> and the prototypes look something like this:
    >>
    >> node * add_item(node * predecessor, void * datum);
    >> void delete_item(node * item);
    >>
    >> Usually you would have a few more routines but set that aside for
    >> the moment. The point I am after is that you seem to be mixing
    >> up nodes and data.
    >>
    >> Now, supposing these are your prototypes, how do you go about
    >> implementing them? I will suppose that you can write the code
    >> for inserting and deleting nodes from linked lists. The question
    >> at hand is: what about storage allocation and deallocation? It
    >> is important to recognize that there are two different kinds of
    >> things being allocated and freed, nodes and data.
    >>
    >> A generic linked list package shouldn't be responsible for
    >> allocating or freeing data, though there are standard techniques
    >> that can be used.
    >>
    >> For nodes a common technique is to (a) use a free list, and (b)
    >> allocate blocks of nodes, link them together, and add them to the
    >> free list. When adding items nodes are popped off the free list;
    >> when deleting items nodes are pushed onto the free list.
    >>
    >> HTH. HAND.

    >
    >I see some list add item using ** such as
    >node * add_item(node ** predecessor, void * datum);
    >
    >Any benefit compare to node * add_item(node * predecessor, void *
    >datum);?


    Actually there can be. If the argument points directly to the
    predecessor node the implementation cannot change the location of
    the predecesssor node; if the argument is an indirect pointer
    (**) the implementation can change the location. The down side,
    such as it is, is that the code goes through an extra level of
    indirection.
     
    Richard Harter, Dec 27, 2007
    #9
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Chris Ritchey
    Replies:
    7
    Views:
    511
    emerth
    Jul 10, 2003
  2. Henk
    Replies:
    4
    Views:
    867
  3. Nikos Mitas

    Linked list allocation problem

    Nikos Mitas, Oct 8, 2005, in forum: C Programming
    Replies:
    7
    Views:
    543
    Barry Schwarz
    Oct 8, 2005
  4. fool
    Replies:
    14
    Views:
    542
    Barry Schwarz
    Jul 3, 2006
  5. joshd
    Replies:
    12
    Views:
    701
    John Carson
    Oct 2, 2006
Loading...

Share This Page