qsort returning index

Discussion in 'C Programming' started by Malcolm McLean, Oct 25, 2011.

1. Malcolm McLeanGuest

qsort is fine if all data is held in structures. E.g. if you want to
sort employees by salary, and you have a structure with salary and
name and payroll number, you can simply call qsort on the array.

However if you have salaries in a parallel array, it is difficult.
It's necessary to set up a structure with salary and index numbers,
just to do the sort.

That's why I've written this function. It's a sort that returns a list
of indices. I think it's best to keep it non-recursive and as a single
function, so I've chosen shellsort as the algorithm.

/*
qsort function which returns the index of items sorted.
Uses Shell sort.
Params: buf - the array to be sorted
N - number of elements
size - element width
compfunc - qsort-style comparison function
Returns: malloced array with index numbers of sorted values in
original array.
*/
int *qsortindex(void *buf, int N, int size, int (*compfunc)(const void
*e1, const void *e2))
{
int ciura_intervals[] = {701, 301, 132, 57, 23, 10, 4, 1};
int i, ii, iii;
unsigned char *buff = buf;
unsigned char *tbuff;
int interval, res;
int t;
int passes;

/* set up return array. Initally all entries are in index order */
return 0;
for(i=0;i<N;i++)

/* temporary buffer to make data movement easier */
tbuff = malloc(size);
if(!tbuff)
{
return 0;
}

/* how many times to pass over the array ? */
passes = (int) (log(N)/log(2.3));
if(passes < 8)
passes = 8;

/* i goes from negative to + 7. 0 to 7 we use the ciura intervals
*/
for(i= -passes+8;i<8;i++)
{
if(i >= 0)
interval = ciura_intervals;
else
{
interval = 701;
/* if we've over 701 items to sort, use an intervals at about
2.3 spacing */
for(ii=0;ii<-i;ii++)
interval = (int) (interval * 2.3);
}
if(interval >= N)
continue;

/* now we've got our interval, pass over with the sort */
for(ii=0;ii<N;ii++)
{
memcpy(tbuff, buff + ii * size, size);
for(iii = ii - interval; iii >= 0; iii -= interval)
{
res = (*compfunc)(buff + iii*size, tbuff);
if(res < 0)
break;
memcpy(buff + (iii + interval) * size, buff + iii * size, size);
}
memcpy(buff + (iii + interval) * size, tbuff, size);
}
}

/* temporary buffer no longer needed */
free(tbuff);

}

--
Basic Algorithms, a massive compendium of C resources
http://www.malcolmmclean.site11.com/www

Malcolm McLean, Oct 25, 2011

2. GeneGuest

On Oct 25, 4:24 pm, Malcolm McLean <>
wrote:
> qsort is fine if all data is held in structures. E.g. if you want to
> sort employees by salary, and you have a structure with salary and
> name and payroll number, you can simply call qsort on the array.
>
> However if you have salaries in a parallel array, it is difficult.
> It's necessary to set up a structure with salary and index numbers,
> just to do the sort.
>
> That's why I've written this function. It's a sort that returns a list
> of indices. I think it's best to keep it non-recursive and as a single
> function, so I've chosen shellsort as the algorithm.
>
> /*
>   qsort function which returns the index of items sorted.
>   Uses Shell sort.
>   Params: buf - the array to be sorted
>           N - number of elements
>           size - element width
>           compfunc - qsort-style comparison function
>   Returns: malloced array with index numbers of sorted values in
> original array.
> */
> int *qsortindex(void *buf, int N, int size, int (*compfunc)(const void
> *e1, const void *e2))
> {
>   int ciura_intervals[] = {701, 301, 132, 57, 23, 10, 4, 1};
>   int i, ii, iii;
>   unsigned char *buff = buf;
>   unsigned char *tbuff;
>   int interval, res;
>   int t;
>   int passes;
>
>   /* set up return array. Initally all entries are in index order */
>   answer = malloc(N * sizeof(int));
>     return 0;
>   for(i=0;i<N;i++)
>
>   /* temporary buffer to make data movement easier */
>   tbuff = malloc(size);
>   if(!tbuff)
>   {
>     return 0;
>   }
>
>   /* how many times to pass over the array ? */
>   passes = (int) (log(N)/log(2.3));
>   if(passes < 8)
>     passes = 8;
>
>   /* i goes from negative to + 7. 0 to 7 we use the ciura intervals
> */
>   for(i= -passes+8;i<8;i++)
>   {
>         if(i >= 0)
>         interval = ciura_intervals;
>     else
>     {
>        interval = 701;
>        /* if we've over 701 items to sort, use an intervals at about
> 2.3 spacing */
>        for(ii=0;ii<-i;ii++)
>          interval = (int) (interval * 2.3);
>     }
>     if(interval >= N)
>       continue;
>
>       /* now we've got our interval, pass over with the sort */
>     for(ii=0;ii<N;ii++)
>     {
>           memcpy(tbuff, buff + ii * size, size);
>           for(iii = ii - interval; iii >= 0; iii -= interval)
>           {
>             res = (*compfunc)(buff + iii*size, tbuff);
>             if(res < 0)
>               break;
>             memcpy(buff + (iii + interval) * size, buff + iii* size, size);
>       }
>       memcpy(buff + (iii + interval) * size, tbuff, size);
>       answer[iii + interval] = t;
>     }
>   }
>
>   /* temporary buffer no longer needed */
>   free(tbuff);
>
>
> }

This is a good alternative to touching the data through global or file
level static variables. But it's also a good argument for including a
void* environment pointer in the design of all callbacks.

A sort done this way would be fine:

typedef int (*SORT_COMPARISON)(void *a, void *b, void *env);

// env is passed to all calls of cmp
void sort(void *items, unsigned n, size_t size, SORT_COMPARISON cmp,
void *env);

With such you could do a singly indirect sort with:

int indirect_cmp(void *iavp, void *ibvp, void* env)
{
const ITEM *items = env;
const unsigned ia = *(unsigned*)iavp;
const unsigned ib = *(unsigned*)ibvp;
return items[ia].key < items[ib].key ? -1 :
items[ia].key > items[ib].key ? +1 : 0;
}

Now somewhere else to sort:
{
ITEM array[]; // array to sort
unsigned indices[]; // array of indices into array

...

sort(indices, n, sizeof(unsigned), indirect_cmp, array);
}

This is strictly more general than your solution since you can do
fancier indirection than through an array of indices, e.g. through an
array of structs containing indices.

Gene, Oct 25, 2011

3. Malcolm McLeanGuest

On Oct 25, 9:17 pm, Gene <> wrote:
> On Oct 25, 4:24 pm, Malcolm McLean <>
>
> This is a good alternative to touching the data through global or file
> level static variables.  But it's also a good argument for including a
> void* environment pointer in the design of all callbacks.
>
> A sort done this way would be fine:
>
> typedef int (*SORT_COMPARISON)(void *a, void *b, void *env);
>
> // env is passed to all calls of cmp
> void sort(void *items, unsigned n, size_t size, SORT_COMPARISON cmp,
> void *env);
>
> With such you could do a singly indirect sort with:
>
> int indirect_cmp(void *iavp, void *ibvp, void* env)
> {
>   const ITEM *items = env;
>   const unsigned ia = *(unsigned*)iavp;
>   const unsigned ib = *(unsigned*)ibvp;
>   return items[ia].key < items[ib].key ? -1 :
>          items[ia].key > items[ib].key ? +1 : 0;
>
> }
>
> Now somewhere else to sort:
> {
>   ITEM array[];        // array to sort
>   unsigned indices[];  // array of indices into array
>
>   ...
>
>   sort(indices, n, sizeof(unsigned), indirect_cmp, array);
>
> }
>
> This is strictly more general than your solution since you can do
> fancier indirection than through an array of indices, e.g. through an
> array of structs containing indices.
>

That's a better answer. Thank you.
--
Basic Algorithms: now available as an ebook
http://www.malcolmmclean.site11.com/www

Malcolm McLean, Oct 26, 2011