Joerg said:
Sounds interesting, though I don't understand it fully and
I don't have access to any copy of Knuth's books. Code might
be interesting, though. What about posting it or making it
available for download?
If I get it correctly, the algorithm you are talking about
is unstable, works well only for powers of two (except with
some special modification to avoid this) and doesn't
recognize natural runs in the input? I would like to test
it against mine to see how it performs.
#include <stddef.h>
#include <limits.h>

typedef struct unknown Node;    /* opaque node type; only its link field is touched */

#define NEXT(ptr) *(Node**)((char*)(ptr) + offset)  /* link field of a node, `offset' bytes in */
#define MAXBITS (sizeof(Node*) * CHAR_BIT)          /* enough slots for any list that fits in memory */

static void *
merge(Node *p, Node *q,
      size_t offset,
      int (*compare)(const void *, const void *))
/*
 * Internal routine to merge two non-empty sorted lists. If elements
 * of the two lists compare equal, those from the `p' list will precede
 * those from the `q' list in the merged output.
 */
{
    Node *head, **tail;

    tail = &head;
    for (;;) {
        if (compare(p, q) <= 0) {
            *tail = p;
            tail = &NEXT(p);
            if ( (p = NEXT(p)) == NULL ) {
                *tail = q;
                break;
            }
        }
        else {
            *tail = q;
            tail = &NEXT(q);
            if ( (q = NEXT(q)) == NULL ) {
                *tail = p;
                break;
            }
        }
    }
    return head;
}

void *                      /* returns: pointer to new list head */
listsort(
    void *head,             /* first item in original list */
    size_t offset,          /* byte offset of link field in each item */
    int (*compare)          /* item comparison function */
        (const void *, const void *))
/*
 * listsort() rearranges the links of a singly-linked NULL-terminated
 * list so they traverse the items in an order defined by a caller-
 * provided comparison function, and returns a pointer to the first
 * item in the rearranged list. The comparison function accepts two
 * item pointers and returns a negative, zero, or positive integer to
 * indicate that the first item compares less than, equal to, or greater
 * than the second. The sort is stable.
 */
{
    Node *list[MAXBITS-1];  /* sorted sub-lists of 2,4,8,... items */
    Node *p, *q;
    int bits, maxbits;

    list[ maxbits = 0 ] = NULL;
    while ( (p = head) != NULL ) {
        if ( (q = NEXT(p)) == NULL )
            break;                  /* lone final item: left in `head' for cleanup */
        head = NEXT(q);
        if (compare(p, q) <= 0) {   /* sort the pair p,q in-line (note 3a) */
            NEXT(q) = NULL;
        }
        else {
            NEXT(q) = p;
            NEXT(p) = NULL;
            p = q;
        }
        for (bits = 0; (q = list[bits]) != NULL; ) {
            list[bits] = NULL;      /* slot occupied: merge and "carry" left */
            p = merge(q, p, offset, compare);
            if (++bits > maxbits) {
                maxbits = bits;
                break;
            }
        }
        list[bits] = p;             /* deposit in the first empty slot */
    }
    for (bits = 0; bits <= maxbits; ++bits) {   /* cleanup: merge what remains */
        if ( (p = list[bits]) != NULL )
            head = (head == NULL) ? p : merge(p, head, offset, compare);
    }
    return head;
}

Notes:
1) This is packaged as a type-blind "generic" list sort, the
only assumptions being that the links are struct pointers and
that a list's final node has a NULL link. A type-aware version
(with knowledge of the nodes' structure and able to compare nodes
in-line) should run faster.
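
For concreteness, here's a minimal usage sketch; the `struct item'
type, item_compare(), and the little driver are made up for
illustration, but any node type with an embedded link pointer is
handled the same way (compile it together with the code above):

#include <stdio.h>
#include <stddef.h>

struct item {
    int value;
    struct item *next;          /* link field; its offset is what listsort() needs */
};

static int item_compare(const void *a, const void *b)
{
    const struct item *p = a, *q = b;   /* the sort passes node pointers */
    return (p->value > q->value) - (p->value < q->value);
}

int main(void)
{
    struct item nodes[] = { {3}, {1}, {4}, {1}, {5} };
    size_t i, n = sizeof nodes / sizeof nodes[0];
    struct item *head;

    for (i = 0; i + 1 < n; ++i)         /* build a NULL-terminated list */
        nodes[i].next = &nodes[i + 1];
    nodes[n - 1].next = NULL;

    head = listsort(&nodes[0], offsetof(struct item, next), item_compare);
    for ( ; head != NULL; head = head->next)
        printf("%d ", head->value);     /* prints: 1 1 3 4 5 */
    putchar('\n');
    return 0;
}
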
2) The sort doesn't actually need a comparison function that
returns -ve/zero/+ve: a Boolean `precedes' function would do as
well and could be faster. In this sort, I decided to stick with
a qsort-and-bsearch style, mostly because that's what people are
accustomed to.
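
As a sketch (not part of the code above), merge() rewritten around
a Boolean precedes(a, b) that returns nonzero when `a' must sort
before `b' would look like this; taking p unless q strictly
precedes it keeps the sort stable:

static void *
merge_precedes(Node *p, Node *q,
               size_t offset,
               int (*precedes)(const void *, const void *))
{
    Node *head, **tail;

    tail = &head;
    for (;;) {
        if (!precedes(q, p)) {          /* p keeps its place on ties: stable */
            *tail = p;
            tail = &NEXT(p);
            if ( (p = NEXT(p)) == NULL ) {
                *tail = q;
                break;
            }
        }
        else {
            *tail = q;
            tail = &NEXT(q);
            if ( (q = NEXT(q)) == NULL ) {
                *tail = p;
                break;
            }
        }
    }
    return head;
}
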
3) This code implements the basic McCarthy method without
protection against badly unbalanced merges in the cleanup phase.
3a) Well, not the "most basic" McCarthy: It sorts two-node
lists in-line instead of merging two one-node lists.
3b) Knuth suggests that "we can sort groups of, say, 16 items
using straight insertion," but in tests I've conducted and on the
machines available to me that doesn't seem to be a good idea. (He
makes the suggestion in connection with merge using arrays instead
of lists, and perhaps things would be different in that setting.)
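
For what it's worth, "straight insertion" in the list setting would
look something like the sketch below; insertion_group() is made up
for illustration and is not called by listsort() above:

static Node *
insertion_group(Node **headp, size_t k, size_t offset,
                int (*compare)(const void *, const void *))
/*
 * Detach up to `k' nodes from the front of *headp, sort them by
 * straight insertion, and return the sorted NULL-terminated group;
 * *headp is left pointing at the remainder of the list.
 */
{
    Node *sorted = NULL, *p;
    size_t i;

    for (i = 0; i < k && (p = *headp) != NULL; ++i) {
        Node **link = &sorted;
        *headp = NEXT(p);
        while (*link != NULL && compare(*link, p) <= 0)
            link = &NEXT(*link);    /* skip nodes that precede or tie p */
        NEXT(p) = *link;            /* splice p in after them: stable */
        *link = p;
    }
    return sorted;
}
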
4) The McCarthy algorithm may be more understandable if you
note the parallel to the process of counting in binary. When N
items (or item pairs, with the 3a optimization) have been consumed,
the elements of list[] correspond to the binary representation of
N: each zero bit of N corresponds to a NULL in list[], and a one
bit in position i to a non-NULL link to a sorted sub-list of 2^i
items (2^(i+1) with 3a).
Consuming the next item (or pair) "increments" N by working from
right to left: if the low-order bit is zero, the new item lands
in list[0]. Otherwise, it is merged with list[0], which becomes
NULL, and then the new two-item (two-pair) list "carries" to the
list[1] place, and so on.
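
If it helps, here is a stand-alone toy (purely illustrative, not
part of the sort) that does nothing but the "incrementing": an
array of flags stands in for list[], and the carry loop has exactly
the shape of the inner for loop in listsort():

#include <stdio.h>

#define SLOTS 8

static void increment(int slot[])
{
    int bits = 0;
    while (slot[bits] != 0) {       /* occupied: "merge" and carry left */
        slot[bits] = 0;
        ++bits;
    }
    slot[bits] = 1;                 /* deposit in the first empty slot */
}

int main(void)
{
    int slot[SLOTS] = {0};
    int n, i;

    for (n = 1; n <= 6; ++n) {
        increment(slot);
        printf("N=%d: ", n);
        for (i = SLOTS - 1; i >= 0; --i)
            putchar(slot[i] ? '1' : '0');   /* the binary representation of N */
        putchar('\n');
    }
    return 0;
}
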
5) To avoid wildly unbalanced merges during cleanup, use two
arrays alist[] and blist[] instead of the single array list[].
The "incrementing" process first tries to fill the NULL link in
alist[0], but if it's non-NULL it uses blist[0] instead. If
both are non-NULL, it merges those two lists, deposits the new
sub-list in alist[0] and NULLs blist[0], then "carries" the merge
result one place to the left:

for (bits = 0; ; p = q) {
    if (alist[bits] == NULL) {
        alist[bits] = p;
        break;
    }
    if (blist[bits] == NULL) {
        blist[bits] = p;
        break;
    }
    q = merge(alist[bits], blist[bits], offset, compare);
    alist[bits] = p;
    blist[bits] = NULL;
    if (++bits > maxbits) {
        maxbits = bits;
        alist[bits] = q;
        blist[bits] = NULL;
        break;
    }
}

In my tests, this method is just a hair slower than the original
on lists of length 2^k or 2^k-1, but shows a substantial gain for
lengths like 2^k+1. It is the fastest "general-purpose" list
merge I know.
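
The cleanup phase for this variant isn't shown above; one way to
write it (my sketch, using the same maxbits convention) is to sweep
both arrays from bit 0 upward, always putting the older sub-list in
front of the accumulated result so the sort stays stable:

for (bits = 0; bits <= maxbits; ++bits) {
    if ( (p = blist[bits]) != NULL )    /* blist[] holds the newer of the pair */
        head = (head == NULL) ? p : merge(p, head, offset, compare);
    if ( (p = alist[bits]) != NULL )    /* alist[] holds the older: it goes in front */
        head = (head == NULL) ? p : merge(p, head, offset, compare);
}
return head;
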
6) All the versions mentioned above are "straight" merges.
"Natural" merges that make use of the order already present in
the input list can beat the pants off them *if* the list is in
very good order already. However, in my tests the pre-existing
order must be very good indeed for natural to beat straight;
Knuth's remark in the answer to exercise 5.2.4-12
"We may conclude that natural merging is preferable to
straight merging when linked allocation is being used,
although it was inferior for sequential allocation."
is not supported by my data. (Hypothesis: Knuth was thinking
about the number of comparisons during the merges themselves,
but forgot about the N-1 comparisons required to discover the
boundaries between the initial runs. On "random" input, a
natural merge uses N-1 comparisons to create approximately N/2
initial runs, while a straight merge uses only N/2 comparisons
to make the same amount of progress.)
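
To make the parenthetical concrete: peeling one ascending run off
the front of a list costs one comparison per link followed, so
discovering all the run boundaries in an N-item list costs N-1
comparisons however many runs there turn out to be. A sketch
(split_run() is made up for illustration, not part of any code
above):

static Node *
split_run(Node **headp, size_t offset,
          int (*compare)(const void *, const void *))
/*
 * Detach the leading ascending run from a non-empty list at *headp
 * and return it NULL-terminated; *headp is left at the remainder.
 */
{
    Node *run = *headp, *p = run;

    while (NEXT(p) != NULL && compare(p, NEXT(p)) <= 0)
        p = NEXT(p);            /* one comparison per link followed */
    *headp = NEXT(p);           /* rest of the list, or NULL */
    NEXT(p) = NULL;             /* terminate the detached run */
    return run;
}
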
7) I've been working on and off -- mostly off -- on a paper
describing test results for various kinds of linked-list merges,
but I'm not ready to "publish," however informally. When last
I set it aside, I was having a hard time writing a linked-list
Quicksort that didn't exhibit pathological behavior. Maybe if
I convince myself it's just not possible, I'll finish the damn'
thing and post it.
8) An interesting (and possibly discouraging) observation:
Efficient linked-list sorting is relatively unimportant! First,
sorting a linked list brings very little "algorithmic advantage"
the way sorting an array does: In a sorted array you can use
things like binary search, but sorting a linked list offers no
similar gain. (Sorting cuts the average time for an unsuccessful
search in half, but if N is large there are better ways to search.)
Second, linked lists are "unfriendly" to the memory implementations
of contemporary systems -- bad locality, cache thrashing, etc. --
so a good programmer usually tries to keep them short; if N is
small, even an inefficient sort will be quick. Yes, there are
circumstances where an efficient linked-list sort is needed, but
IMHO they are (or should be) on the rare side.