Linked list allocation

J

Jeff Bown

Consider implementing a (doubly) linked list.

The simplest strategy is to provide functions
item_t add_item(item_t predecessor);
void delete_item(item_t item);
where add_item allocates memory for a new list item and returns it (or
NULL), and delete_item frees that memory.

However, if at startup the program immediately adds a large number of
items to a list, then all those calls to malloc() become expensive.

An alternative I thought of was that add_item should allocate a block,
say of size (100 * size-needed-by-an-item), then use pointers inside
this block until the time comes to allocate a new block for 100 items.

The problem with this is that deleting items becomes a mess - you can
only free a block when all 100 items in it have been deleted, so in the
worst case (where the programmer allocates 101 items, frees 99, adds
101, ...) you end up with for all intents and purposes a memory leak.

Is there a better solution?
 
W

Willem

Jeff wrote:
) Consider implementing a (doubly) linked list.
)
) The simplest strategy is to provide functions
) item_t add_item(item_t predecessor);
) void delete_item(item_t item);
) where add_item allocates memory for a new list item and returns it (or
) NULL), and delete_item frees that memory.
)
) However, if at startup the program immediately adds a large number of
) items to a list, then all those calls to malloc() become expensive.
)
) An alternative I thought of was that add_item should allocate a block,
) say of size (100 * size-needed-by-an-item), then use pointers inside
) this block until the time comes to allocate a new block for 100 items.
)
) The problem with this is that deleting items becomes a mess - you can
) only free a block when all 100 items in it have been deleted, so in the
) worst case (where the programmer allocates 101 items, frees 99, adds
) 101, ...) you end up with for all intents and purposes a memory leak.
)
) Is there a better solution?

It's very difficult for a solution to be strictly 'better' than another
solution. So, you'll need to examine several solutions and see which best
fits your needs.

For example: With your alternative, you can reuse spots within a block if
you can mark them 'empty'. This induces an extra cost of searching for
empty blocks, however.

For another example: You can store groups of items, together with a count,
into larger blocks. This is tricky to get right, though, and you need to
move memory inside a block around when deleting or adding items.

For yet another example: You can periodically 'rehash' the list, sticking
it into one or more big blocks sequentially. Like defragmenting a filesystem.


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
 
B

Ben Pfaff

Jeff Bown said:
Consider implementing a (doubly) linked list.

The simplest strategy is to provide functions
item_t add_item(item_t predecessor);
void delete_item(item_t item);
where add_item allocates memory for a new list item and returns it (or
NULL), and delete_item frees that memory.

However, if at startup the program immediately adds a large number of
items to a list, then all those calls to malloc() become expensive.

A better solution is sometimes to embed the list elements
directly in the elements being added to the list. If each
element is only on at most a fixed number of lists, this approach
is cheaper in terms of the number of necessary allocations.
 
P

pete

Jeff said:
Consider implementing a (doubly) linked list.

The simplest strategy is to provide functions
item_t add_item(item_t predecessor);
void delete_item(item_t item);
where add_item allocates memory for a new list item and returns it (or
NULL), and delete_item frees that memory.

However, if at startup the program immediately adds a large number of
items to a list, then all those calls to malloc() become expensive.

An alternative I thought of was that add_item should allocate a block,
say of size (100 * size-needed-by-an-item), then use pointers inside
this block until the time comes to allocate a new block for 100 items.

The problem with this is that deleting items becomes a mess - you can
only free a block when all 100 items in it have been deleted,
so in the
worst case (where the programmer allocates 101 items, frees 99, adds
101, ...) you end up with for all intents and purposes a memory leak.

Is there a better solution?

Yes.
The solution is to realise that the problem does not exist.

The scenario that you describe
of a whole bunch of malloc at startup
to the extent that it makes a noticable difference
and also of mallocing one big object
for parts of your list to occupy,
suggests that
you have no idea of why you're using a list intsead of an array.

The best use for a list,
is when you want to grow your data format
as data becomes available to the program.

Freeing a list isn't supposed to be difficult.

typedef struct list_node {
struct list_node *next;
void *data;
} list_type;

void list_free(list_type *node, void (*free_data)(void *))
{
list_type *next_node;

while (node != NULL) {
next_node = node -> next;
free_data(node -> data);
free(node);
node = next_node;
}
}
 
G

Gene

Consider implementing a (doubly) linked list.

The simplest strategy is to provide functions
item_t add_item(item_t predecessor);
void delete_item(item_t item);
where add_item allocates memory for a new list item and returns it (or
NULL), and delete_item frees that memory.

However, if at startup the program immediately adds a large number of
items to a list, then all those calls to malloc() become expensive.

An alternative I thought of was that add_item should allocate a block,
say of size (100 * size-needed-by-an-item), then use pointers inside
this block until the time comes to allocate a new block for 100 items.

The problem with this is that deleting items becomes a mess - you can
only free a block when all 100 items in it have been deleted, so in the
worst case (where the programmer allocates 101 items, frees 99, adds
101, ...) you end up with for all intents and purposes a memory leak.

Is there a better solution?

The usual idea is to maintain a free list where deleted cells are held
pending future allocations. There aren't many cases where it would pay
to return blocks to the malloc() heap. Unused free blocks will be
paged out in due course, where they consume only some swap space - of
interest only for _really_ long lists.

Pete's rather condescending suggestion that you don't need lists and
ought to use vectors instead has a flaw: O(1) time deletion of an
arbitrary element can't be accomplished with a vector. It's perfectly
reasonable to build up a big list at program startup if e.g. future
computations involve decimating the list by arbitrary deletions.
 
R

Richard Harter

Consider implementing a (doubly) linked list.

The simplest strategy is to provide functions
item_t add_item(item_t predecessor);
void delete_item(item_t item);
where add_item allocates memory for a new list item and returns it (or
NULL), and delete_item frees that memory.

However, if at startup the program immediately adds a large number of
items to a list, then all those calls to malloc() become expensive.

An alternative I thought of was that add_item should allocate a block,
say of size (100 * size-needed-by-an-item), then use pointers inside
this block until the time comes to allocate a new block for 100 items.

The problem with this is that deleting items becomes a mess - you can
only free a block when all 100 items in it have been deleted, so in the
worst case (where the programmer allocates 101 items, frees 99, adds
101, ...) you end up with for all intents and purposes a memory leak.

Is there a better solution?

Your description is a little confusing - to me anyway, and
perhaps to others judging from the variety of answers you got.

Ordinarily a linked list (single, or double) consists of nodes,
where a node (doubly linked) looks something like this:

struct node {
node * prev; /* Link to previous node */
node * next; /* Link to next node */
void * data; /* generic pointer to data */
};

and the prototypes look something like this:

node * add_item(node * predecessor, void * datum);
void delete_item(node * item);

Usually you would have a few more routines but set that aside for
the moment. The point I am after is that you seem to be mixing
up nodes and data.

Now, supposing these are your prototypes, how do you go about
implementing them? I will suppose that you can write the code
for inserting and deleting nodes from linked lists. The question
at hand is: what about storage allocation and deallocation? It
is important to recognize that there are two different kinds of
things being allocated and freed, nodes and data.

A generic linked list package shouldn't be responsible for
allocating or freeing data, though there are standard techniques
that can be used.

For nodes a common technique is to (a) use a free list, and (b)
allocate blocks of nodes, link them together, and add them to the
free list. When adding items nodes are popped off the free list;
when deleting items nodes are pushed onto the free list.

HTH. HAND.
 
G

George Peter Staplin

Jeff said:
Consider implementing a (doubly) linked list.

The simplest strategy is to provide functions
item_t add_item(item_t predecessor);
void delete_item(item_t item);
where add_item allocates memory for a new list item and returns it (or
NULL), and delete_item frees that memory.

However, if at startup the program immediately adds a large number of
items to a list, then all those calls to malloc() become expensive.

An alternative I thought of was that add_item should allocate a block,
say of size (100 * size-needed-by-an-item), then use pointers inside
this block until the time comes to allocate a new block for 100 items.

The problem with this is that deleting items becomes a mess - you can
only free a block when all 100 items in it have been deleted, so in the
worst case (where the programmer allocates 101 items, frees 99, adds
101, ...) you end up with for all intents and purposes a memory leak.

Is there a better solution?

You can use a pool of linked list structs, and some malloc
implementations actually create a sort of pool for objects of the same
size. Perhaps a different type of data structure would be a better fit
though. Often times programmers will use a linked list when an array of
structs will do.

There are also certain advantages to using a linked-list pool, or a
specialized pool for any data structure. For instance if you're dumping
a tree or linked list to the disk, and want to restore with minimal
relocation and cost for the ability to dump the data structure to the
disk, and restore. For the relocation you generally want to use an
optional part of C99 (an intptr_t or uintptr_t) though.


George
 
H

henry

Your description is a little confusing - to me anyway, and
perhaps to others judging from the variety of answers you got.

Ordinarily a linked list (single, or double) consists of nodes,
where a node (doubly linked) looks something like this:

struct node {
node * prev; /* Link to previous node */
node * next; /* Link to next node */
void * data; /* generic pointer to data */

};

and the prototypes look something like this:

node * add_item(node * predecessor, void * datum);
void delete_item(node * item);

Usually you would have a few more routines but set that aside for
the moment. The point I am after is that you seem to be mixing
up nodes and data.

Now, supposing these are your prototypes, how do you go about
implementing them? I will suppose that you can write the code
for inserting and deleting nodes from linked lists. The question
at hand is: what about storage allocation and deallocation? It
is important to recognize that there are two different kinds of
things being allocated and freed, nodes and data.

A generic linked list package shouldn't be responsible for
allocating or freeing data, though there are standard techniques
that can be used.

For nodes a common technique is to (a) use a free list, and (b)
allocate blocks of nodes, link them together, and add them to the
free list. When adding items nodes are popped off the free list;
when deleting items nodes are pushed onto the free list.

HTH. HAND.

I see some list add item using ** such as
node * add_item(node ** predecessor, void * datum);

Any benefit compare to node * add_item(node * predecessor, void *
datum);?
 
R

Richard Harter

I see some list add item using ** such as
node * add_item(node ** predecessor, void * datum);

Any benefit compare to node * add_item(node * predecessor, void *
datum);?

Actually there can be. If the argument points directly to the
predecessor node the implementation cannot change the location of
the predecesssor node; if the argument is an indirect pointer
(**) the implementation can change the location. The down side,
such as it is, is that the code goes through an extra level of
indirection.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,756
Messages
2,569,535
Members
45,008
Latest member
obedient dusk

Latest Threads

Top