# Generic Sort

Discussion in 'C Programming' started by AB, May 24, 2006.

1. ### ABGuest

Hello All,

I'm trying to replicate a general purpose sort function (think qsort)

void sort(void *arr, const int num, size_t size,
int (*cmp)(void *a, void *b))
{
int i = 0 ;
int j = 0 ;

for (i = (num - 1) ; i >= 0 ; i--)
{
for (j = 1 ; j <= i ; j++)
{
if(cmp((int *) &arr[j-1], (int *) &arr[j])) //Error
{
//swapping logic
}
}
}
}

MSVC 8 (2005) reports 2 errors when trying to call cmp:-

error C2036: 'void *' : unknown size
error C2036: 'void *' : unknown size

I've tried a few variations, but I can't seem to get it right. Can
anybody help?

AB, May 24, 2006

2. ### Richard HeathfieldGuest

AB said:

> Hello All,
>
> I'm trying to replicate a general purpose sort function (think qsort)
>
> void sort(void *arr, const int num, size_t size,
> int (*cmp)(void *a, void *b))

Better:

void sort(void *arr, size_t num, size_t size,
int (*cmp)(const void *a, const void *b))

> {
> int i = 0 ;
> int j = 0 ;
>
> for (i = (num - 1) ; i >= 0 ; i--)
> {
> for (j = 1 ; j <= i ; j++)
> {
> if(cmp((int *) &arr[j-1], (int *) &arr[j])) //Error

for(j = 1; j <= i; j++)
{
const unsigned char *left = arr;
const unsigned char *right;
left += (j - 1) * size;
right = left + size;
if((*cmp)(left, right) > 0)

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)

Richard Heathfield, May 24, 2006

3. ### sandyGuest

AB wrote:

> Hello All,
>
> I'm trying to replicate a general purpose sort function (think qsort)
>
> void sort(void *arr, const int num, size_t size,
> int (*cmp)(void *a, void *b))
> {
> int i = 0 ;
> int j = 0 ;
>
> for (i = (num - 1) ; i >= 0 ; i--)
> {
> for (j = 1 ; j <= i ; j++)
> {
> if(cmp((int *) &arr[j-1], (int *) &arr[j])) //Error
>

try if(cmp(int *)arr[j-1] , (int *) arr[j])))

> if(cmp((int *) &arr[j-1], (int *) &arr[j])) //Error

Why use & on a variable that is already a pointer (according to your
logic)? Does that make sense here? Check that part again.
{
> //swapping logic
> }
> }
> }
> }
>
> MSVC 8 (2005) reports 2 errors when trying to call cmp:-
>
> error C2036: 'void *' : unknown size
> error C2036: 'void *' : unknown size
>
> I've tried a few variations, but I can't seem to get it right. Can
> anybody help?
>

sandy, May 24, 2006
4. ### peteGuest

AB wrote:
>
> Hello All,
>
> I'm trying to replicate a general purpose sort function (think qsort)
>
> void sort(void *arr, const int num, size_t size,
> int (*cmp)(void *a, void *b))
> {
> int i = 0 ;
> int j = 0 ;
>
> for (i = (num - 1) ; i >= 0 ; i--)
> {
> for (j = 1 ; j <= i ; j++)
> {
> if(cmp((int *) &arr[j-1], (int *) &arr[j])) //Error
> {
> //swapping logic
> }
> }
> }
> }
>
> MSVC 8 (2005) reports 2 errors when trying to call cmp:-
>
> error C2036: 'void *' : unknown size
> error C2036: 'void *' : unknown size
>
> I've tried a few variations, but I can't seem to get it right. Can
> anybody help?

void sort(void *base, size_t nmemb, size_t size,
int (*compar)(const void *a, const void *b))
{
unsigned char *arr = base;
size_t i = nmemb ;
size_t j ;

while (i-- > 0) {
for (j = 0 ; i > j ; ++j) {
if (compar(arr + j * size, arr + (j + 1) * size) > 0) {
/* swapping logic */
}
}
}
}
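
The "swapping logic" placeholder has to exchange two elements of unknown type, so it can only work on bytes. A minimal sketch, assuming a hypothetical helper named swap_elems (not from the posts above) that copies through a small fixed buffer and expects the two elements not to overlap:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: exchange two non-overlapping elements of
   `size` bytes each, one chunk at a time through a fixed buffer. */
void swap_elems(void *a, void *b, size_t size)
{
    unsigned char tmp[64];
    unsigned char *pa = a;
    unsigned char *pb = b;

    while (size > 0) {
        /* Never copy more than the buffer can hold. */
        size_t chunk = size < sizeof tmp ? size : sizeof tmp;

        memcpy(tmp, pa, chunk);
        memcpy(pa, pb, chunk);
        memcpy(pb, tmp, chunk);

        pa += chunk;
        pb += chunk;
        size -= chunk;
    }
}
```

Inside the loop above it would be called as swap_elems(arr + j * size, arr + (j + 1) * size, size).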

--
pete

pete, May 24, 2006
5. ### peteGuest

sandy wrote:

> try if(cmp(int *)arr[j-1] , (int *) arr[j])))
>
> This might solve your problem.

Do you really think that swapping every pair
of elements that are unequal to each other,
is going to sort something?
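
To make the point concrete, here is a small sketch with hypothetical names (cmp_int, should_swap, should_swap_unequal). It assumes a qsort-style comparison that returns a negative, zero, or positive value, and contrasts the "> 0" test with the bare truth test:

```c
/* Three-way comparison for int, qsort style: negative, zero, or positive. */
int cmp_int(const void *a, const void *b)
{
    const int *x = a;
    const int *y = b;
    return (*x > *y) - (*x < *y);
}

/* Swap only when the left element sorts after the right one. */
int should_swap(const void *left, const void *right)
{
    return cmp_int(left, right) > 0;
}

/* The bare truth test: any nonzero result triggers a swap,
   so pairs that are already in order get swapped as well. */
int should_swap_unequal(const void *left, const void *right)
{
    return cmp_int(left, right) != 0;
}
```

For the ordered pair (1, 2), should_swap says no while should_swap_unequal says yes, which undoes the ordering on every pass.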

What's the (int *) cast all about?
It looks like it's for a generic sort
that only sorts arrays of int.

--
pete

pete, May 24, 2006
6. ### TomásGuest

AB posted:

> Hello All,
>
> I'm trying to replicate a general purpose sort function (think qsort)

I've tried this myself too. I'll just have a look through my code
directory... ah here we are:

(It was originally written in C++, so I translated it on-the-fly. I
didn't spend time optimizing it)

#include <string.h> /* for memcpy */

typedef int T;

void SortArray( T* p_start, unsigned long const len )
{
const T * const p_last = p_start + (len - 1);

do
{
T* p_lowest_value = p_start;

for( T* p = p_start; p <= p_last; ++p)
{
if ( *p < *p_lowest_value ) p_lowest_value = p;
}

if ( p_start == p_lowest_value ) continue;

char temp[ sizeof(T) ];

memcpy( temp, p_start, sizeof(temp) );

memcpy( p_start, p_lowest_value, sizeof(temp) );

memcpy( p_lowest_value, temp, sizeof(temp) ) ;
}
while (++p_start != p_last);
}

-Tomás

Tomás, May 24, 2006
7. ### peteGuest

Tomás wrote:
>
> AB posted:
>
> > Hello All,
> >
> > I'm trying to replicate a general purpose sort function (think qsort)

>
> I've tried this myself too. I'll just have a look through my code
> directory... ah here we are:
>
> (It was originally written in C++, so I translated it on-the-fly. I
> didn't spend time optimizing it)
>
> #include <stdlib.h>
>
> typedef int T;
>
> void SortArray( T* p_start, unsigned long const len )
> {
> const T * const p_last = p_start + (len - 1);
>
> do
> {
> T* p_lowest_value = p_start;
>
> for( T* p = p_start; p <= p_last; ++p)
> {
> if ( *p < *p_lowest_value ) p_lowest_value = p;
> }
>
> if ( p_start == p_lowest_value ) continue;
>
> char temp[ sizeof(T) ];
>
> memcpy( temp, p_start, sizeof(temp) );
>
> memcpy( p_start, p_lowest_value, sizeof(temp) );
>
> memcpy( p_lowest_value, temp, sizeof(temp) ) ;
> }
> while (++p_start != p_last);
> }

That's not a general purpose sort function (think qsort).
It doesn't have a general purpose comparison.

It couldn't sort this array on any key:
char *array[] = {"one", "three", "two"};
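
With a qsort-style interface the caller supplies the key. A minimal sketch using the standard qsort and strcmp; the names cmp_cstring and sort_strings are made up for illustration. Each element is a char *, so the comparison function receives pointers to char *:

```c
#include <stdlib.h>
#include <string.h>

/* Each argument points at an array element, i.e. at a `char *`. */
int cmp_cstring(const void *a, const void *b)
{
    const char *const *pa = a;
    const char *const *pb = b;
    return strcmp(*pa, *pb);
}

/* Sort an array of n string pointers in strcmp order. */
void sort_strings(char **array, size_t n)
{
    qsort(array, n, sizeof *array, cmp_cstring);
}
```

Sorting by length instead would only need a different comparison function; the sort itself stays the same.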

--
pete

pete, May 24, 2006
8. ### peteGuest

pete wrote:

> void sort(void *base, size_t nmemb, size_t size,
> int (*compar)(const void *a, const void *b))

The a and b parameters shouldn't be in there.
They don't cause any real problems,
but they're meaningless.

void sort(void *base, size_t nmemb, size_t size,
int (*compar)(const void *, const void *))

--
pete

pete, May 24, 2006
9. ### peteGuest

pete wrote:
>
> pete wrote:
>
> > void sort(void *base, size_t nmemb, size_t size,
> > int (*compar)(const void *a, const void *b))

>
> The a and b parameters
> shouldn't be in there.
> They don't cause any real problems,
> but they're meaningless.
>
> void sort(void *base, size_t nmemb, size_t size,
> int (*compar)(const void *, const void *))

The a and b parameter "names" shouldn't be in there.
The parameters themselves, do need to be there.

--
pete

pete, May 24, 2006
10. ### ABGuest

Thanks for your responses everyone. However I should make it clear that
I was not looking for advice on function declaration or optimization of
the sort procedure. The sort function itself is immaterial, what I
needed help on was passing the parameters to the comparison function.

For those unfamiliar with the CRT qsort should look it up. You can in
fact sort a sentence (for what it's worth) using a comparison function
that accepts two void pointers and returns an integer value. Look it up
on MSDN, that's the very example used to explain how qsort works, and
how to define a custom comparison function (operator) for it.

According to the documentation, the comparison function acceptable to
qsort (and what I'm trying to emulate) works something like this...

int compare(const void* a, const void *b)
{
if( *(type *) a < *(type *) b )
return -1 ;
else if( *(type *) a == *(type *) b )
return 0 ;
else if( *(type *) a > *(type *) b )
return 1 ;
return 0 ; /* default case, written here only to silence critics who
will say that "not all paths of the function return a value" */
}

AB, May 25, 2006
11. ### peteGuest

AB wrote:

> what I
> needed help on was passing the parameters to the comparison function.

That's what I posted.

if(cmp((int *) &arr[j-1], (int *) &arr[j]))

Compare that to this line:

if (compar(arr + j * size, arr + (j + 1) * size) > 0) {

For one thing,
you have no relational operator in your comparison line.
It swaps elements whenever they are unequal.

The other problem, is that in your posted code
you have
void *arr
which means that you can't have
arr[j]
In order to be able to write that, and make it have meaning,
the declaration of arr would have to be
int *arr
which means that your function would only be able to sort arrays
of type int.
Notice that your function doesn't use the size parameter.
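
As a sketch of what the size parameter is for: the address of element number index has to be computed in bytes, through an unsigned char pointer. The helper name elem_at is hypothetical, used here only for illustration:

```c
#include <stddef.h>

/* Hypothetical helper: address of element `index` in an array whose
   elements are `size` bytes each. `void *` cannot be indexed, so the
   arithmetic is done on an `unsigned char *`. */
void *elem_at(void *base, size_t index, size_t size)
{
    unsigned char *bytes = base;
    return bytes + index * size;
}
```

The failing call would then read cmp(elem_at(arr, j - 1, size), elem_at(arr, j, size)) > 0, with no (int *) cast anywhere.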

This would work,
if you only ever used it with arrays of type int:

void sort(void *base, size_t num, size_t size,
int (*cmp)(const void *a, const void *b))
{
int i = 0 ;
int j = 0 ;
int *arr = base;

for (i = (num - 1) ; i >= 0 ; i--)
{
for (j = 1 ; j <= i ; j++)
{
if(cmp((int *) &arr[j-1], (int *) &arr[j]) > 0)
{
/* swapping logic */
}
}
}
}

However, since a sort that only sorts arrays of type int
isn't your assignment, it should be more like this:

void sort(void *base, size_t nmemb, size_t size,
int (*compar)(const void *, const void *))
{
unsigned char *arr = base;
size_t i = nmemb ;
size_t j ;

while (i-- > 0) {
for (j = 0 ; i > j ; ++j) {
if (compar(arr + j * size, arr + (j + 1) * size) > 0) {
/* swapping logic */
}
}
}
}

Try not to be distracted by the difference in the loops.
That's not your code anymore, that's my code;
that's why it looks like that.
I tested it before I posted it the first time.

The original order of the array is:
4.000000 3.500000 1.000000 2.000000

The sorted order of the array is:
1.000000 2.000000 3.500000 4.000000

/* BEGIN new.c */

#include <stdio.h>

#define E_TYPE double

#define BYTE_SWAP(A, B) \
{ \
p1 = (A); \
p2 = (B); \
end = p2 + size; \
do { \
swap = *p1; \
*p1++ = *p2; \
*p2++ = swap; \
} while (p2 != end); \
}

typedef E_TYPE e_type;

int comparison(const void *arg1, const void *arg2);
void sort(void *base, size_t nmemb, size_t size,
int (*compar)(const void *, const void *));

int main(void)
{
e_type array[] = {(4), (3.5), (1), (2)};
size_t x;

puts("The original order of the array is:");
for (x = 0; x != sizeof array / sizeof *array; ++x) {
printf("%f ", array[x]);
}
puts("\n");
sort(array, sizeof array / sizeof *array,
sizeof *array, comparison);
puts("The sorted order of the array is:");
for (x = 0; x != sizeof array / sizeof *array; ++x) {
printf("%f ", array[x]);
}
putchar('\n');
return 0;
}

int comparison(const void *arg1, const void *arg2)
{
return *(e_type*)arg2 > *(e_type*)arg1 ? -1
: *(e_type*)arg2 != *(e_type*)arg1;
}

void sort(void *base, size_t nmemb, size_t size,
int (*compar)(const void *a, const void *b))
{
unsigned char *arr = base;
size_t i = nmemb ;
size_t j ;
unsigned char *p1, *p2, *end, swap;

while (i-- > 0) {
for (j = 0 ; i > j ; ++j) {
if (compar(arr + j * size, arr + (j + 1) * size) > 0) {
BYTE_SWAP(arr + j * size, arr + (j + 1) * size);
}
}
}
}

/* END new.c */

--
pete

pete, May 25, 2006
12. ### Ben PfaffGuest

"AB" <> writes:

> For those unfamiliar with the CRT qsort should look it up.

"CRT" isn't a common abbreviation here. I assume you mean "C
runtime".

> You can in fact sort a sentence (for what it's worth) using a
> comparison function that accepts two void pointers and returns
> an integer value. Look it up on MSDN, that's the very example
> used to explain how qsort works, and how to define a custom
> comparison function (operator) for it.

It would be better to use a C reference manual instead of MSDN,
which describes Microsoft's implementation, not the standard.

> According to the documentation, the comparison function acceptable to
> qsort (and what I'm trying to emulate) works something like this...
>
> int compare(const void* a, const void *b)
> {
> if( ( *(type *) a < *(type *) b )
> return -1 ;
> else if( ( *(type *) a == *(type *) b )
> return 0 ;
> else if( ( *(type *) a > *(type *) b )
> return 1 ;
> return 0 ; /* default case, written here only to silence critics who
> will say that "not all paths of the function return a value" */
> }
>

A better way:

int compare (const void *a_, const void *b_)
{
const int *a = a_;
const int *b = b_;
return *a < *b ? -1 : *a > *b;
}
--
"...deficient support can be a virtue.
It keeps the amateurs off."
--Bjarne Stroustrup

Ben Pfaff, May 25, 2006
13. ### CBFalconerGuest

AB wrote:
>

.... snip ...
>
> According to the documentation, the comparison function acceptable to
> qsort (and what I'm trying to emulate) works something like this...
>
> int compare(const void* a, const void *b)
> {
> if( ( *(type *) a < *(type *) b )
> return -1 ;
> else if( ( *(type *) a == *(type *) b )
> return 0 ;
> else if( ( *(type *) a > *(type *) b )
> return 1 ;
> return 0 ; /* default case, written here only to silence critics
> who will say that "not all paths of the function return a value" */
> }

So far, fine. The point is that YOU know the actual types of the
data to be compared, and how to do the comparison. The void*
pointers are to allow qsort to manipulate them without knowing
anything about their types. So the easiest thing to do is first
control the types, and you don't need any casts to do so. If the
types to be compared are T, then:

int compare(const void* a, const void *b)
{
const T *ap = a;
const T *bp = b;

and now you can write the rest of the routine using ap and bp, with
all the type checking of which C is capable. For integers the rest
of the code might then be:

if (*ap < *bp) return 1;
else if (*ap > *bp) return -1;
return 0;
}

You can let the optimizer decide if it really needs to create
locals for ap and bp. Meanwhile the actual code becomes crystal
clear. BTW, the default case is absolutely necessary.

For ints, another way to write the function body is:

return (*ap < *bp) - (*ap > *bp);

which has advantages on some architectures. Remember that logical
expressions evaluate to either 0 or 1.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the

CBFalconer, May 25, 2006
14. ### peteGuest

Ben Pfaff wrote:
>
> "AB" <> writes:
>
> > For those unfamiliar with the CRT qsort should look it up.

>
> "CRT" isn't a common abbreviation here. I assume you mean "C
> runtime".

I had no idea what it meant.

> int compare (const void *a_, const void *b_)
> {
> const int *a = a_;
> const int *b = b_;
> return *a < *b ? -1 : *a > *b;
> }

I've read of coding standards which reserve verb style names
for functions that have side effects,
suggesting that "comparison",
or perhaps even a nonword like "comp"
might be a better choice for the name of the function.

Any thoughts on that?

--
pete

pete, May 25, 2006
15. ### Ben PfaffGuest

pete <> writes:

> I've read of coding standards which reserve verb style names
> for functions that have side effects,
> suggesting that "comparison",
> or perhaps even a nonword like "comp"
> might be a better choice for the name of the function.
>
> Any thoughts on that?

I haven't heard of a coding standard that specifies that before.
It sounds to me like an interesting idea. If anyone here has
experience with such a coding standard, then I'd like to hear
about how it works out in practice.
--
"IMO, Perl is an excellent language to break your teeth on"
--Micah Cowan

Ben Pfaff, May 25, 2006
16. ### Michael MairGuest

pete schrieb:
> Ben Pfaff wrote:
>>int compare (const void *a_, const void *b_)
>>{
>> const int *a = a_;
>> const int *b = b_;
>> return *a < *b ? -1 : *a > *b;
>>}

>
> I've read of coding standards which reserve verb style names
> for functions that have side effects,
> suggesting that "comparison",
> or perhaps even a nonword like "comp"
> might be a better choice for the name of the function.
>
> Any thoughts on that?

The only similar thing I know in coding standards are rules that
predicates should always be side effect free, i.e. if the function
starts with is, has, contains etc., then it returns a Boolean value
(of some integer type) and is side effect free.
Possible extensions:
- Comparisons; here, you have some fixed part of the function
identifier, e.g. "starts with compare" or cmp, which indicates "no
side effects"
- constant get...() functions are side effect free
However, all other function identifiers are verb style names, too.

These are IMO more useful rules as the other rule may make it much
harder to create "good" identifiers for arbitrary projects.
In addition, if you decide to change what happens under the hood,
you would have to change function names -- due to the more
restrictive rule this would happen much more often than you need
to change from "is...()" to "obtain...State()".

Another aspect which should be clear beforehand: What do you see
as "side effect free" or "pure"? It may be perfectly reasonable to
separate changing debug and tracing information from changing
other information when making this distinction.

Cheers
Michael
--
E-Mail: Mine is an /at/ gmx /dot/ de address.

Michael Mair, May 25, 2006
17. ### ABGuest

Thanks to all those who replied to my post. I'd like to clarify that this
wasn't an assignment..it was just something that was on my todo list
for a while. I've always been partial to C++ and so a generic sort was
easy to implement with templates. However, when I thought about a port
to C, this problem arose.

Anyways, I've solved the problem and for those who're interested...

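#include <stddef.h> /* size_t */

typedef unsigned char byte ;

/* declared here because sort() calls it before its definition */
void swap_bytes(byte *a, byte *b, size_t size) ;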

//for sorting int in ascending order
int cmp_int_asc(const void* a, const void* b)
{
if(*(int *)a < *(int *)b)
return -1 ;
else if(*(int *)a == *(int *)b)
return 0 ;
else /* remaining case: *(int *)a > *(int *)b */
return 1 ;
}

void sort(void *base, int num, size_t size, int(*cmp)(const void* a,
const void* b))
{
byte *ptr2a,
*ptr2b,
*ptr2base;

ptr2base = (byte *)base ;

for(int i = num - 1; i >= 0; i--)
{
for(int j = 1; j <= i; j++)
{
ptr2a = (byte*) (ptr2base + size * (j - 1)) ;
ptr2b = (byte*) (ptr2base + size * j) ;

if(cmp(ptr2a, ptr2b) > 0)
swap_bytes(ptr2a, ptr2b, size) ;
}
}
}

void swap_bytes(byte *a, byte *b, size_t size)
{
byte temp ;

for(size_t i = 0; i < size; i++)
{
temp = *(a + i) ;
*(a + i) = *(b + i) ;
*(b + i) = temp ;
}
}

AB, May 26, 2006
18. ### Robert LatestGuest

On Thu, 25 May 2006 15:02:29 GMT,
pete <> wrote
in Msg. <>

> I've read of coding standards which reserve verb style names
> for functions that have side effects,
> suggesting that "comparison",
> or perhaps even a nonword like "comp"
> might be a better choice for the name of the function.

It depends on how you want to use the function. I once wrote a database
application that would sort arrays of structs by different fields, so
all my comparison functions started in "by_", resulting in beautiful
functions calls like:

qsort(record, ..., by_name);

This also reflects my tendency to use the singular for array names
because I prefer

record.name
over
records.name
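
A rough sketch of that naming scheme. The struct layout and the by_age function are invented for illustration and are not taken from the application described above:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type with two sortable fields. */
struct record {
    const char *name;
    int age;
};

/* Order records alphabetically by name. */
int by_name(const void *a, const void *b)
{
    const struct record *ra = a;
    const struct record *rb = b;
    return strcmp(ra->name, rb->name);
}

/* Order records by ascending age. */
int by_age(const void *a, const void *b)
{
    const struct record *ra = a;
    const struct record *rb = b;
    return (ra->age > rb->age) - (ra->age < rb->age);
}
```

A call then reads qsort(record, n, sizeof record[0], by_age); with n being the number of elements.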

robert

Robert Latest, May 26, 2006
19. ### Barry SchwarzGuest

On 26 May 2006 00:15:30 -0700, "AB" <> wrote:

>Thanks to all those who replied to my post. I'd like to clear that this
>wasn't an assignment..it was just something that was on my todo list
>for a while. I've always been partial to C++ and so a generic sort was
>easy to implement with templates. However, when I thought about a port
>to C, this problem arose.
>
>Anyways, I've solved the problem and for those who're interested...
>
>typedef unsigned char byte ;
>
>//for sorting int in ascending order
>int cmp_int_asc(const void* a, const void* b)
>{
> if(*(int *)a < *(int *)b)

Everywhere else you were careful to keep your code independent of the
sizeof an array element. Why here do you forsake that and insist on
int?

> return -1 ;
> else if(*(int *)a == *(int *)b)
> return 0 ;
> else if(*(int *)a > *(int *)b)
> return 1 ;
>}
>
>void sort(void *base, int num, size_t size, int(*cmp)(const void* a,
>const void* b))
>{
> byte temp,
> *ptr2a,
> *ptr2b,
> *ptr2base;
>
> ptr2base = (byte *)base ;

You don't need to cast a void* to assign it to another pointer.

>
> for(int i = num - 1; i >= 0; i--)
> {
> for(int j = 1; j <= i; j++)
> {
> ptr2a = (byte*) (ptr2base + size * (j - 1)) ;
> ptr2b = (byte*) (ptr2base + size * j) ;
>
> if(cmp(ptr2a, ptr2b) > 0)
> swap_bytes(ptr2a, ptr2b, size) ;
> }
> }
>}
>
>void swap_bytes(byte *a, byte *b, size_t size)
>{
> byte temp ;
>
> for(int i = 0; i < size; i++)
> {
> temp = *(a + i) ;
> *(a + i) = *(b + i) ;
> *(b + i) = temp ;
> }
>}

Remove del for email

Barry Schwarz, May 27, 2006
20. ### peteGuest

Barry Schwarz wrote:
>
> On 26 May 2006 00:15:30 -0700, "AB" <> wrote:

> >//for sorting int in ascending order
> >int cmp_int_asc(const void* a, const void* b)
> >{
> > if(*(int *)a < *(int *)b)

>
> Everywhere else you were careful to keep your code independent of the
> sizeof an array element. Why here do you forsake that and insist on
> int?

That's just a sample comparison function.
Those are written for specific data types.

--
pete

pete, May 27, 2006