qsort semantics

gauss010 · Sep 20, 2007

Suppose I have an object A of type char[M][N]. Each A is a buffer
containing a string, and I want to sort the M strings of A using the
strcmp function. The description of the qsort function says that I
must pass a pointer to the first object of the array to be sorted.
This leads me to write the following:

char A[M][N];
int cmp(const void *a, const void *b) {
int v = strcmp(*(char (*)[N])a, *(char (*)[N])b);
if (v < 0) return -1;
else if (v == 0) return 0;
else return 1;
}
void sortA(void) {
qsort(&A[0], sizeof A / sizeof *A, sizeof *A, cmp);
}

But I also want to be able to use my comparison function to sort other
arrays of strings whose second array dimension is of a different size.
For example, I may want to use it to do the same sort on an object of
type char[2*M][2*N]. Then I think of rewriting it like this:

char A[M][N], B[2*M][2*N];
int cmp(const void *a, const void *b) {
int v = strcmp(a,b);
if (v < 0) return -1;
else if (v == 0) return 0;
else return 1;
}
void sortA(void) {
qsort(&A[0][0], sizeof A / sizeof *A, sizeof *A, cmp);
}
void sortB(void) {
qsort(&B[0][0], sizeof B / sizeof *B, sizeof *B, cmp);
}

But now I'm not strictly following the description of the qsort
function. I'm now not *really* passing a pointer to the first object
of the array to be sorted, rather I'm passing a pointer to the first
byte of the first object of the array to be sorted.

Imagining how I would implement the qsort function myself, I know that
this will work fine, but abusing the description of the qsort function
like this makes me uncomfortable.

Do you think it's OK to write such code?

Army1987 · Sep 20, 2007

Suppose I have an object A of type char[M][N]. Each A is a buffer
containing a string, and I want to sort the M strings of A using the
strcmp function. The description of the qsort function says that I
must pass a pointer to the first object of the array to be sorted.
This leads me to write the following:

char A[M][N];
int cmp(const void *a, const void *b) {
int v = strcmp(*(char (*)[N])a, *(char (*)[N])b);

strcmp takes pointers to const chars, not to arrays.
If strcmp has a visible prototype here, you could just
int cmp(const void *a, const void *b) { return strcmp(a, b); }

K&R2 uses (int (* )(const void *, const void *))strcmp as an
argument to qsort, but I think the standard committee has since
deemed it UB (whereas const void *s are required to have the same
representations as const char*s, a particularly perverse
implementation could pass them to functions in different ways...)

if (v < 0) return -1;
else if (v == 0) return 0;
else return 1;

Click to expand...

Why couldn't you just return v?

}
void sortA(void) {
qsort(&A[0], sizeof A / sizeof *A, sizeof *A, cmp);
}

But I also want to be able to use my comparison function to sort other
arrays of strings whose second array dimension is of a different size.
For example, I may want to use it to do the same sort on an object of
type char[2*M][2*N]. Then I think of rewriting it like this:

char A[M][N], B[2*M][2*N];
int cmp(const void *a, const void *b) {
int v = strcmp(a,b);
if (v < 0) return -1;
else if (v == 0) return 0;
else return 1;
}
void sortA(void) {
qsort(&A[0][0], sizeof A / sizeof *A, sizeof *A, cmp);
}
void sortB(void) {
qsort(&B[0][0], sizeof B / sizeof *B, sizeof *B, cmp);
}

But now I'm not strictly following the description of the qsort
function. I'm now not *really* passing a pointer to the first object
of the array to be sorted, rather I'm passing a pointer to the first
byte of the first object of the array to be sorted.

Click to expand...

You're passing void *, which are essentially pointers to char with
syntactic salt on them. Of course they point to the first byte.
The only way for qsort() to know how large the objects they are
intended to point to are, is through the second and third
argument.

user923005 · Sep 20, 2007

Maybe use:
int compare (const void *aa, const void *bb)
{
const char * const *a = aa;
const char * const *b = bb;
return strcmp (a, b);
}

to compare C strings using qsort.

Army1987 · Sep 21, 2007

Maybe use:
int compare (const void *aa, const void *bb)
{
const char * const *a = aa;
const char * const *b = bb;
return strcmp (a, b);
}

to compare C strings using qsort.

Why?
strcmp takes two const char *s...

Barry Schwarz · Sep 21, 2007

Maybe use:
int compare (const void *aa, const void *bb)
{
const char * const *a = aa;
const char * const *b = bb;
return strcmp (a, b);
}

to compare C strings using qsort.

a and b are both qualified types of char**. strcmp takes slightly
different qualified types of char*.

Remove del for email

pete · Sep 21, 2007

Suppose I have an object A of type char[M][N]. Each A is a buffer
containing a string, and I want to sort the M strings of A using the
strcmp function. The description of the qsort function says that I
must pass a pointer to the first object of the array to be sorted.
This leads me to write the following:

char A[M][N];
int cmp(const void *a, const void *b) {
int v = strcmp(*(char (*)[N])a, *(char (*)[N])b);
if (v < 0) return -1;
else if (v == 0) return 0;
else return 1;
}
void sortA(void) {
qsort(&A[0], sizeof A / sizeof *A, sizeof *A, cmp);
}

But I also want to be able to use my comparison function to sort other
arrays of strings whose second array dimension is of a different size.
For example, I may want to use it to do the same sort on an object of
type char[2*M][2*N]. Then I think of rewriting it like this:

char A[M][N], B[2*M][2*N];
int cmp(const void *a, const void *b) {
int v = strcmp(a,b);
if (v < 0) return -1;
else if (v == 0) return 0;
else return 1;
}
void sortA(void) {
qsort(&A[0][0], sizeof A / sizeof *A, sizeof *A, cmp);
}
void sortB(void) {
qsort(&B[0][0], sizeof B / sizeof *B, sizeof *B, cmp);
}

But now I'm not strictly following the description of the qsort
function. I'm now not *really* passing a pointer to the first object
of the array to be sorted, rather I'm passing a pointer to the first
byte of the first object of the array to be sorted.

Imagining how I would implement the qsort function myself, I know that
this will work fine, but abusing the description of the qsort function
like this makes me uncomfortable.

Do you think it's OK to write such code?

/* BEGIN new.c */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NMEMB(A) (sizeof (A) / sizeof *(A))

int compar(const void *a, const void *b);

int main(void)
{
char array_1[][sizeof "three"] = {
"zero","one","two","three","four",
"five","six","seven","eight","nine"
};
char array_2[][2] = {
"9","8","7","6","5","4","3","2","1","0"
};
size_t index;

for (index = 0; index != NMEMB(array_1); ++index) {
puts(array_1[index]);
}
putchar('\n');
for (index = 0; index != NMEMB(array_2); ++index) {
puts(array_2[index]);
}
puts("After sorting with strcmp:");
qsort(array_1, NMEMB(array_1), sizeof *array_1, compar);
qsort(array_2, NMEMB(array_2), sizeof *array_2, compar);
for (index = 0; index != NMEMB(array_1); ++index) {
puts(array_1[index]);
}
putchar('\n');
for (index = 0; index != NMEMB(array_2); ++index) {
puts(array_2[index]);
}
return 0;
}

int compar(const void *a, const void *b)
{
return strcmp(a, b);
}

/* END new.c */

user923005 · Sep 21, 2007

a and b are both qualified types of char**. strcmp takes slightly
different qualified types of char*.

I can't believe I posted that. Let's call it temporary insanity.

Barry Schwarz · Sep 21, 2007

Suppose I have an object A of type char[M][N]. Each A is a buffer
containing a string, and I want to sort the M strings of A using the
strcmp function. The description of the qsort function says that I
must pass a pointer to the first object of the array to be sorted.
This leads me to write the following:

char A[M][N];
int cmp(const void *a, const void *b) {
int v = strcmp(*(char (*)[N])a, *(char (*)[N])b);

You cast a to "pointer to an array of N char". You then dereference
this expression which yields an array of N char. This expression does
not meet one of the three exceptions and is therefore converted to the
address of the first char in the array with type pointer to char. This
is what you want to pass to strcmp but you don't really care about the
size of the array. You could achieve the same with less typing with a
simple (char*)a.

But as you note below, even that is not necessary.

if (v < 0) return -1;
else if (v == 0) return 0;
else return 1;

Click to expand...

This is not necessary. You can just return v since the requirement is
for negative, 0, or positive but not for exactly -1 or +1.

}
void sortA(void) {
qsort(&A[0], sizeof A / sizeof *A, sizeof *A, cmp);
}

But I also want to be able to use my comparison function to sort other
arrays of strings whose second array dimension is of a different size.
For example, I may want to use it to do the same sort on an object of
type char[2*M][2*N]. Then I think of rewriting it like this:

char A[M][N], B[2*M][2*N];
int cmp(const void *a, const void *b) {
int v = strcmp(a,b);
if (v < 0) return -1;
else if (v == 0) return 0;
else return 1;
}
void sortA(void) {
qsort(&A[0][0], sizeof A / sizeof *A, sizeof *A, cmp);
}
void sortB(void) {
qsort(&B[0][0], sizeof B / sizeof *B, sizeof *B, cmp);
}

But now I'm not strictly following the description of the qsort
function. I'm now not *really* passing a pointer to the first object
of the array to be sorted, rather I'm passing a pointer to the first
byte of the first object of the array to be sorted.

Click to expand...

Sure you are. Since the qsort prototype will force the first argument
to be converted to void*, the type of the pointer in the calling
statement is irrelevant. While the expressions A, &A, A[0], &A[0],
and &A[0][0] all have different types~, the address they evaluate to
points to the same byte, namely the first byte of A. There is no way
for qsort to know which one you actually used in your calling
statement.

~Yes, I know A and &A[0] have the same type as do A[0] and &A[0][0].

You could even go one step further and remove the need for sortA and
sortB by using the calls to qsort in place of the calls to these
functions (eliminates the need for global variables) or a single call
to qsort with

void* ptr;
size_t count;
size_t size;
ptr = $; /* replace $ with A or B as appropriate */
size = sizeof *$;
count = sizeof $ / size;
qsort(ptr, count, size, cmp); /* will work for both A and B */

Imagining how I would implement the qsort function myself, I know that
this will work fine, but abusing the description of the qsort function
like this makes me uncomfortable.

Click to expand...

A void* is a void*, no matter what type the value was converted from.

Do you think it's OK to write such code?

Click to expand...

Yes.

Remove del for email

gauss010 · Sep 21, 2007

Suppose I have an object A of type char[M][N]. Each A is a buffer
containing a string, and I want to sort the M strings of A using the
strcmp function. The description of the qsort function says that I
must pass a pointer to the first object of the array to be sorted.
This leads me to write the following:

Click to expand...

char A[M][N];
int cmp(const void *a, const void *b) {
int v = strcmp(*(char (*)[N])a, *(char (*)[N])b);

Click to expand...

You cast a to "pointer to an array of N char". You then dereference
this expression which yields an array of N char. This expression does
not meet one of the three exceptions and is therefore converted to the
address of the first char in the array with type pointer to char. This
is what you want to pass to strcmp but you don't really care about the
size of the array. You could achieve the same with less typing with a
simple (char*)a.

But as you note below, even that is not necessary.

if (v < 0) return -1;
else if (v == 0) return 0;
else return 1;

Click to expand...

This is not necessary. You can just return v since the requirement is
for negative, 0, or positive but not for exactly -1 or +1.

My mistake. I incorrectly recalled that the comparison function needs
to return either -1, 0, or 1.

}
void sortA(void) {
qsort(&A[0], sizeof A / sizeof *A, sizeof *A, cmp);
}

Click to expand...

But I also want to be able to use my comparison function to sort other
arrays of strings whose second array dimension is of a different size.
For example, I may want to use it to do the same sort on an object of
type char[2*M][2*N]. Then I think of rewriting it like this:

Click to expand...

char A[M][N], B[2*M][2*N];
int cmp(const void *a, const void *b) {
int v = strcmp(a,b);
if (v < 0) return -1;
else if (v == 0) return 0;
else return 1;
}
void sortA(void) {
qsort(&A[0][0], sizeof A / sizeof *A, sizeof *A, cmp);
}
void sortB(void) {
qsort(&B[0][0], sizeof B / sizeof *B, sizeof *B, cmp);
}

Click to expand...

But now I'm not strictly following the description of the qsort
function. I'm now not *really* passing a pointer to the first object
of the array to be sorted, rather I'm passing a pointer to the first
byte of the first object of the array to be sorted.

Click to expand...

Sure you are. Since the qsort prototype will force the first argument
to be converted to void*, the type of the pointer in the calling
statement is irrelevant. While the expressions A, &A, A[0], &A[0],
and &A[0][0] all have different types~, the address they evaluate to
points to the same byte, namely the first byte of A. There is no way
for qsort to know which one you actually used in your calling
statement.

Click to expand...

Ah, this gets at the heart of my problem. I'm unsure when switching
between pointer types like this, and that's the reason why I used the
weird cast to char (*)[N] --- treating qsort as a magic black box that
I should receive pointers from as the same type I passed in seemed
more correct.

Suppose I have an array defined as T x[2][2], where T is any type. The
expression &x then has type pointer-to-array-of-2-array-of-2-T, and
the expression &x would be said to point to the object named x,
correct? But is this is only because of the type of &x (&x could be
considered to point to both x[0] and x[0][0] as well)? Is it always
OK then, to write expressions such as:

- (T *)&x or (T *)&x[0] to produce a pointer to x[0][0]?
- (T *)((char *)&x + sizeof x[0]) to produce a pointer to x[1][0]?
- any similarly contrived expression to point to a subobject of x?

These are intuitively correct to me, but I have trouble tracking the
reasoning through the standard as to why they should behave properly.

~Yes, I know A and &A[0] have the same type as do A[0] and &A[0][0].

Click to expand...

Nitpick: they don't have the same type, but are equivalent when not
used as the operand of sizeof nor &.

[snip]

Barry Schwarz · Sep 21, 2007

On Thu, 20 Sep 2007 21:39:09 -0700, (e-mail address removed) wrote:

snip

But I also want to be able to use my comparison function to sort other
arrays of strings whose second array dimension is of a different size.
For example, I may want to use it to do the same sort on an object of
type char[2*M][2*N]. Then I think of rewriting it like this:

Click to expand...

char A[M][N], B[2*M][2*N];
int cmp(const void *a, const void *b) {
int v = strcmp(a,b);
if (v < 0) return -1;
else if (v == 0) return 0;
else return 1;
}
void sortA(void) {
qsort(&A[0][0], sizeof A / sizeof *A, sizeof *A, cmp);
}
void sortB(void) {
qsort(&B[0][0], sizeof B / sizeof *B, sizeof *B, cmp);
}

Click to expand...

But now I'm not strictly following the description of the qsort
function. I'm now not *really* passing a pointer to the first object
of the array to be sorted, rather I'm passing a pointer to the first
byte of the first object of the array to be sorted.

Click to expand...

Sure you are. Since the qsort prototype will force the first argument
to be converted to void*, the type of the pointer in the calling
statement is irrelevant. While the expressions A, &A, A[0], &A[0],
and &A[0][0] all have different types~, the address they evaluate to
points to the same byte, namely the first byte of A. There is no way
for qsort to know which one you actually used in your calling
statement.

Click to expand...

Ah, this gets at the heart of my problem. I'm unsure when switching
between pointer types like this, and that's the reason why I used the
weird cast to char (*)[N] --- treating qsort as a magic black box that
I should receive pointers from as the same type I passed in seemed
more correct.

Since qsort has no idea what type or array you are dealing with, it
cannot pass that type info on to your compare function. (More
technically, you passed qsort a void* which removed any type
information from the argument.) The only thing qsort knows is the
starting address of the array, the size of each element, and the
number of elements.

Your compare function needs to know that each element is a string but
in this case it does not care what the max size of each string is. It
is sufficient to know the starting location of the string because that
is what strcmp needs to know.

Suppose I have an array defined as T x[2][2], where T is any type. The
expression &x then has type pointer-to-array-of-2-array-of-2-T, and
the expression &x would be said to point to the object named x,
correct?
Yes

But is this is only because of the type of &x (&x could be
considered to point to both x[0] and x[0][0] as well)?

While the address &x evaluates to is starting location of x which is
also the starting location of x[0] and also the starting location of
x[0][0], &x has the wrong type to point to either x[0] or x[0][0].
(This "nit" is similar to the problem first year physics students have
in remembering that most values have units as well as magnitude.)

Is it always
OK then, to write expressions such as:

- (T *)&x or (T *)&x[0] to produce a pointer to x[0][0]?

Probably. But the use of an unnecessarily complex expression with a
cast is the sign of a coder who couldn't be bothered to get things
correct in the first place. If you want a pointer to x[0][0], use
&x[0][0]. On the other hand, if you need the address of the sub-array
in the context of pointer to element (as with the strings we have been
discussing) use x.

Furthermore, consider the situation where x started out as an array of
floats. Somewhere later in your code you used (float*)&x to point to
x[0][0]. Then after reading this group for a while you decided it
would be better to change x to double. You would have to search
through your code looking for casts to change. If you had coded
&x[0][0] in the first place, that expression's type would change
automatically (what I call self adjusting code). If it were used in a
place where a double* was not appropriate, your compiler would
generate a diagnostic. And worst of all, if you had missed changing
the cast, you would be using the address of a double in a place where
you thought you were using the address of a float which would probably
lead to undesirable results(tm).

By the way, the & is superfluous in either of your two expressions.

- (T *)((char *)&x + sizeof x[0]) to produce a pointer to x[1][0]?
- any similarly contrived expression to point to a subobject of x?

These are intuitively correct to me, but I have trouble tracking the
reasoning through the standard as to why they should behave properly.

Click to expand...

In any array, sizeof x is a constant and does not depend on i.
Furthermore, the address of x[i+1] is always sizeof x[n] bytes beyond
the address of x. You need the cast to char* simply because
pointer arithmetic includes the implied scaling factor of
sizeof(object pointed to). For an array x, x+i is guaranteed to point
to x. For int with size 4, x[1] is obviously 4 bytes beyond x[0]
so the "pointer expression" x+1 must be equivalent to the "arithmetic
expression" x + 1*4. Therefore you could replace your expression
above with (T*)(x+1). But you really shouldn't use either. At this
point, you should also be able to explain why (T*)(&x+1) is wrong.

~Yes, I know A and &A[0] have the same type as do A[0] and &A[0][0].

Click to expand...

Nitpick: they don't have the same type, but are equivalent when not
used as the operand of sizeof nor &.

Click to expand...

It is stronger than equivalent. The standard says that under these
conditions (there is a third exception also), an expression of array
of T is **converted** to an expression of type pointer to T that
points to element 0 of the array. So A and &A[0] have exactly the
same type.

Remove del for email

Tor Rustad · Sep 21, 2007

user923005 said:
I can't believe I posted that. Let's call it temporary insanity.

I do remember an old regular named Dann Corbit.

Yup, he knew how how to call qsort()!

Qsort() messing with my entire Code	0	Apr 25, 2022
Qsort() is messing with my entire Code!!!	0	Apr 25, 2022
qsort returning index	2	Oct 25, 2011
qsort man page	23	Jul 21, 2012
calling qsort	59	Sep 9, 2008
If(strcmp(str, "") == 0) - What does this line of code mean?	0	Aug 8, 2022
Lexical Analysis on C++	1	Oct 31, 2023
How to apply qsort() with a parameter?	17	Oct 22, 2012

qsort semantics

gauss010

Army1987

user923005

Army1987

Barry Schwarz

pete

user923005

Barry Schwarz

gauss010

Barry Schwarz

Tor Rustad

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads