qsort returning index

M

Malcolm McLean

qsort is fine if all data is held in structures. E.g. if you want to
sort employees by salary, and you have a structure with salary and
name and payroll number, you can simply call qsort on the array.

However if you have salaries in a parallel array, it is difficult.
It's necessary to set up a structure with salary and index numbers,
just to do the sort.

That's why I've written this function. It's a sort that returns a list
of indices. I think it's best to keep it non-recursive and as a single
function, so I've chosen shellsort as the algorithm.


/*
qsort function which returns the index of items sorted.
Uses Shell sort.
Params: buf - the array to be sorted
N - number of elements
size - element width
compfunc - qsort-style comparison function
Returns: malloced array with index numbers of sorted values in
original array.
*/
int *qsortindex(void *buf, int N, int size, int (*compfunc)(const void
*e1, const void *e2))
{
int *answer;
int ciura_intervals[] = {701, 301, 132, 57, 23, 10, 4, 1};
int i, ii, iii;
unsigned char *buff = buf;
unsigned char *tbuff;
int interval, res;
int t;
int passes;

/* set up return array. Initally all entries are in index order */
answer = malloc(N * sizeof(int));
if(!answer)
return 0;
for(i=0;i<N;i++)
answer = i;

/* temporary buffer to make data movement easier */
tbuff = malloc(size);
if(!tbuff)
{
free(answer);
return 0;
}

/* how many times to pass over the array ? */
passes = (int) (log(N)/log(2.3));
if(passes < 8)
passes = 8;

/* i goes from negative to + 7. 0 to 7 we use the ciura intervals
*/
for(i= -passes+8;i<8;i++)
{
if(i >= 0)
interval = ciura_intervals;
else
{
interval = 701;
/* if we've over 701 items to sort, use an intervals at about
2.3 spacing */
for(ii=0;ii<-i;ii++)
interval = (int) (interval * 2.3);
}
if(interval >= N)
continue;

/* now we've got our interval, pass over with the sort */
for(ii=0;ii<N;ii++)
{
t = answer[ii];
memcpy(tbuff, buff + ii * size, size);
for(iii = ii - interval; iii >= 0; iii -= interval)
{
res = (*compfunc)(buff + iii*size, tbuff);
if(res < 0)
break;
memcpy(buff + (iii + interval) * size, buff + iii * size, size);
answer[iii+interval] = answer[iii];
}
memcpy(buff + (iii + interval) * size, tbuff, size);
answer[iii + interval] = t;
}
}

/* temporary buffer no longer needed */
free(tbuff);

return answer;
}
 
G

Gene

qsort is fine if all data is held in structures. E.g. if you want to
sort employees by salary, and you have a structure with salary and
name and payroll number, you can simply call qsort on the array.

However if you have salaries in a parallel array, it is difficult.
It's necessary to set up a structure with salary and index numbers,
just to do the sort.

That's why I've written this function. It's a sort that returns a list
of indices. I think it's best to keep it non-recursive and as a single
function, so I've chosen shellsort as the algorithm.

/*
  qsort function which returns the index of items sorted.
  Uses Shell sort.
  Params: buf - the array to be sorted
          N - number of elements
          size - element width
          compfunc - qsort-style comparison function
  Returns: malloced array with index numbers of sorted values in
original array.
*/
int *qsortindex(void *buf, int N, int size, int (*compfunc)(const void
*e1, const void *e2))
{
  int *answer;
  int ciura_intervals[] = {701, 301, 132, 57, 23, 10, 4, 1};
  int i, ii, iii;
  unsigned char *buff = buf;
  unsigned char *tbuff;
  int interval, res;
  int t;
  int passes;

  /* set up return array. Initally all entries are in index order */
  answer = malloc(N * sizeof(int));
  if(!answer)
    return 0;
  for(i=0;i<N;i++)
    answer = i;

  /* temporary buffer to make data movement easier */
  tbuff = malloc(size);
  if(!tbuff)
  {
        free(answer);
    return 0;
  }

  /* how many times to pass over the array ? */
  passes = (int) (log(N)/log(2.3));
  if(passes < 8)
    passes = 8;

  /* i goes from negative to + 7. 0 to 7 we use the ciura intervals
*/
  for(i= -passes+8;i<8;i++)
  {
        if(i >= 0)
        interval = ciura_intervals;
    else
    {
       interval = 701;
       /* if we've over 701 items to sort, use an intervals at about
2.3 spacing */
       for(ii=0;ii<-i;ii++)
         interval = (int) (interval * 2.3);
    }
    if(interval >= N)
      continue;

      /* now we've got our interval, pass over with the sort */
    for(ii=0;ii<N;ii++)
    {
          t = answer[ii];
          memcpy(tbuff, buff + ii * size, size);
          for(iii = ii - interval; iii >= 0; iii -= interval)
          {
            res = (*compfunc)(buff + iii*size, tbuff);
            if(res < 0)
              break;
            memcpy(buff + (iii + interval) * size, buff + iii* size, size);
            answer[iii+interval] = answer[iii];
      }
      memcpy(buff + (iii + interval) * size, tbuff, size);
      answer[iii + interval] = t;
    }
  }

  /* temporary buffer no longer needed */
  free(tbuff);

  return answer;

}


This is a good alternative to touching the data through global or file
level static variables. But it's also a good argument for including a
void* environment pointer in the design of all callbacks.

A sort done this way would be fine:

typedef int (*SORT_COMPARISON)(void *a, void *b, void *env);

// env is passed to all calls of cmp
void sort(void *items, unsigned n, size_t size, SORT_COMPARISON cmp,
void *env);

With such you could do a singly indirect sort with:

int indirect_cmp(void *iavp, void *ibvp, void* env)
{
const ITEM *items = env;
const unsigned ia = *(unsigned*)iavp;
const unsigned ib = *(unsigned*)ibvp;
return items[ia].key < items[ib].key ? -1 :
items[ia].key > items[ib].key ? +1 : 0;
}

Now somewhere else to sort:
{
ITEM array[]; // array to sort
unsigned indices[]; // array of indices into array

...

sort(indices, n, sizeof(unsigned), indirect_cmp, array);
}

This is strictly more general than your solution since you can do
fancier indirection than through an array of indices, e.g. through an
array of structs containing indices.
 
M

Malcolm McLean

On Oct 25, 4:24 pm, Malcolm McLean <[email protected]>

This is a good alternative to touching the data through global or file
level static variables.  But it's also a good argument for including a
void* environment pointer in the design of all callbacks.

A sort done this way would be fine:

typedef int (*SORT_COMPARISON)(void *a, void *b, void *env);

// env is passed to all calls of cmp
void sort(void *items, unsigned n, size_t size, SORT_COMPARISON cmp,
void *env);

With such you could do a singly indirect sort with:

int indirect_cmp(void *iavp, void *ibvp, void* env)
{
  const ITEM *items = env;
  const unsigned ia = *(unsigned*)iavp;
  const unsigned ib = *(unsigned*)ibvp;
  return items[ia].key < items[ib].key ? -1 :
         items[ia].key > items[ib].key ? +1 : 0;

}

Now somewhere else to sort:
{
  ITEM array[];        // array to sort
  unsigned indices[];  // array of indices into array

  ...

  sort(indices, n, sizeof(unsigned), indirect_cmp, array);

}

This is strictly more general than your solution since you can do
fancier indirection than through an array of indices, e.g. through an
array of structs containing indices.
That's a better answer. Thank you.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,743
Messages
2,569,478
Members
44,899
Latest member
RodneyMcAu

Latest Threads

Top