J
jacob navia
Hi
In a recent discussion ("Open letter to Ian Collins") we found out that
the main problem in the C containers library was the time taken by the
Sort function.
The reasons for that is that it was a general sort function calling a
general comparison function that uses memcmp...
The first step to improve performance was to write data type specific
functions using a data type generic file. As a bonus, we get compile
time checking and better syntax.
Still, the generic sort function was taking a lot of time.
I have written then a data type generic quick sort function that
receives as a parameter a macro that is used to compare two elements.
This vastly improves performance.
The times are:
Generic sort function without any specialized heap
(using malloc)
real 0m17.029s
user 0m16.654s
sys 0m0.368s
Generic sort function using a specialized heap
(reduces the malloc overhead)
real 0m11.759s
user 0m11.500s
sys 0m0.252s
"Templated" sort function using a macro parameter
(reduces comparisons to a simple expression)
real 0m6.453s
user 0m6.165s
sys 0m0.281s
The expression used to compare two list elements is:
#define COMPARE_EXPRESSION(A, B) \
((*B)->Data > (*A)->Data ? -1 : (*B)->Data != (*A)->Data)
I thank Pete for that proposal, I wouldn't have come to it
alone
I would like to have the corresponding C++ times but I fear that
if I write it it would be unfair since I am not a C++ wizard...
Here is the C code for the example:
#include <stdlib.h>
#include "containers.h"
#include "intlist.h"
#define N 10000000
int main(void)
{
intList *L1;
size_t i;
int d;
long long sum=0,newSum=0;
Iterator *it;
int *pint;
L1 = iintList.Create(sizeof(int));
iintList.UseHeap(L1,NULL); // Use specialized Heap
// Add N random numbers to the integer list
for (i=0; i<N;i++) {
d = rand();
sum += d;
iintList.Add(L1,d);
}
// Sort it
iintList.Sort(L1);
// Go through the sorted list using an iterator
it = iintList.NewIterator(L1);
i=0;
for (pint = it->GetFirst(it); pint; pint = it->GetNext(it)) {
newSum += *pint;
}
// Verify that both sums are identical
if (sum != newSum)
printf("Failed\n");
else printf("Sum = %lld\n",sum);
// Destroy the list
iintList.Finalize(L1);
}
I have uploaded the latest code to:
http://code.google.com/p/ccl/
There you will find (in listgen.c) the "templated" version of quick sort
in C.
In a recent discussion ("Open letter to Ian Collins") we found out that
the main problem in the C containers library was the time taken by the
Sort function.
The reasons for that is that it was a general sort function calling a
general comparison function that uses memcmp...
The first step to improve performance was to write data type specific
functions using a data type generic file. As a bonus, we get compile
time checking and better syntax.
Still, the generic sort function was taking a lot of time.
I have written then a data type generic quick sort function that
receives as a parameter a macro that is used to compare two elements.
This vastly improves performance.
The times are:
Generic sort function without any specialized heap
(using malloc)
real 0m17.029s
user 0m16.654s
sys 0m0.368s
Generic sort function using a specialized heap
(reduces the malloc overhead)
real 0m11.759s
user 0m11.500s
sys 0m0.252s
"Templated" sort function using a macro parameter
(reduces comparisons to a simple expression)
real 0m6.453s
user 0m6.165s
sys 0m0.281s
The expression used to compare two list elements is:
#define COMPARE_EXPRESSION(A, B) \
((*B)->Data > (*A)->Data ? -1 : (*B)->Data != (*A)->Data)
I thank Pete for that proposal, I wouldn't have come to it
alone
I would like to have the corresponding C++ times but I fear that
if I write it it would be unfair since I am not a C++ wizard...
Here is the C code for the example:
#include <stdlib.h>
#include "containers.h"
#include "intlist.h"
#define N 10000000
int main(void)
{
intList *L1;
size_t i;
int d;
long long sum=0,newSum=0;
Iterator *it;
int *pint;
L1 = iintList.Create(sizeof(int));
iintList.UseHeap(L1,NULL); // Use specialized Heap
// Add N random numbers to the integer list
for (i=0; i<N;i++) {
d = rand();
sum += d;
iintList.Add(L1,d);
}
// Sort it
iintList.Sort(L1);
// Go through the sorted list using an iterator
it = iintList.NewIterator(L1);
i=0;
for (pint = it->GetFirst(it); pint; pint = it->GetNext(it)) {
newSum += *pint;
}
// Verify that both sums are identical
if (sum != newSum)
printf("Failed\n");
else printf("Sum = %lld\n",sum);
// Destroy the list
iintList.Finalize(L1);
}
I have uploaded the latest code to:
http://code.google.com/p/ccl/
There you will find (in listgen.c) the "templated" version of quick sort
in C.