comparison function for qsort() question

Discussion in 'C Programming' started by David Mathog, Jan 30, 2008.

  1. David Mathog

    David Mathog Guest

    A piece of code I am working on uses qsort() to sort buffers containing
    packed binary fixed length words, however the number of bytes in a word
    will vary from one call of qsort() to another. We've discussed here
    before how qsort() doesn't provide any nice way to pass in extra data,
    so here the number of bytes is smuggled into the comparison function
    using the global integer variable gbl_bws (gbl_bws == GloBaL Binary Word
    Size). The word size varies but isn't likely to be huge in any case,
    with a maximum of probably around 8 bytes. The current code is this:

    static int compare_bw1(const void *p1, const void *p2){
    unsigned char *uc1 = (unsigned char *)p1;
    unsigned char *uc2 = (unsigned char *)p2;
    int i;
    for(i=0; i < gbl_bws; i++){
    if( uc1 > uc2 ){ return 1; }
    else if( uc1 < uc2 ){ return -1;}
    }
    return 0;
    }

    Another way of doing this is:

    static int compare_bw2(const void *p1, const void *p2){
    return(memcmp(p1,p2,gbl_bws);
    }

    Any thoughts on whether the second function form would usually be
    faster, slower, or the same?

    It seems to me to be pretty compiler dependent in that if the compiler
    actually makes a function call to memcmp() that call overhead is going
    to add up in a hurry when sorting many billions of words. On the other
    hand, if the compiler expands memcmp() in place that code is likely to
    be heavily optimized. I know that generally using memcpy() ends up
    being faster than using my own code, is that going to be generally true
    for memcmp() as well?

    On a related note, what does the standard say memcmp() will return for
    the following 4 cases? My guesses are shown in the last column.

    p1 p2 Guess (why)
    1 null null 0 (because they are identical, albeit undefined)
    2 null !null -1 (nothing < something )
    3 !null null 1 (something > nothing )
    -----------------------------------------------
    4 n == 0 0 {0 bytes of anything is always nothing)

    Since memcmp() only returns a comparison value, and not an error value
    which is really what's called for here, it presumably has these
    "comparisons" defined in the standard. The man pages I have seen do
    not address this point.

    Thanks,

    David Mathog
    David Mathog, Jan 30, 2008
    #1
    1. Advertising

  2. In article <fnqnln$8bg$>,
    David Mathog <> wrote:

    >On a related note, what does the standard say memcmp() will return for
    >the following 4 cases? My guesses are shown in the last column.
    >
    > p1 p2 Guess (why)
    >1 null null 0 (because they are identical, albeit undefined)
    >2 null !null -1 (nothing < something )
    >3 !null null 1 (something > nothing )
    >-----------------------------------------------
    >4 n == 0 0 {0 bytes of anything is always nothing)


    >Since memcmp() only returns a comparison value, and not an error value
    >which is really what's called for here, it presumably has these
    >"comparisons" defined in the standard. The man pages I have seen do
    >not address this point.


    The standard does not specifically say what happens for null pointers
    or for n == 0.

    I would think it likely that using a null pointer
    would be considered to be outside the range of valid parameters
    and so that the behaviour would be undefined.

    Returning 0 for n == 0 sounds like proper behaviour to me, but on
    different grounds: that all empty buffers are identical. Returning
    non-zero would imply that there was a difference found between
    the empty buffers, which could not be the case.
    --
    "No one has the right to destroy another person's belief by
    demanding empirical evidence." -- Ann Landers
    Walter Roberson, Jan 30, 2008
    #2
    1. Advertising

  3. David Mathog

    Eric Sosman Guest

    Walter Roberson wrote:
    > In article <fnqnln$8bg$>,
    > David Mathog <> wrote:
    >
    >> On a related note, what does the standard say memcmp() will return for
    >> the following 4 cases? My guesses are shown in the last column.
    >>
    >> p1 p2 Guess (why)
    >> 1 null null 0 (because they are identical, albeit undefined)
    >> 2 null !null -1 (nothing < something )
    >> 3 !null null 1 (something > nothing )
    >> -----------------------------------------------
    >> 4 n == 0 0 {0 bytes of anything is always nothing)

    >
    >> Since memcmp() only returns a comparison value, and not an error value
    >> which is really what's called for here, it presumably has these
    >> "comparisons" defined in the standard. The man pages I have seen do
    >> not address this point.

    >
    > The standard does not specifically say what happens for null pointers
    > or for n == 0.


    7.21.1p1 seems pretty specific:

    Where an argument declared as size_t n specifies the
    length of the array for a function, n can have the value
    zero on a call to that function. Unless explicitly stated
    otherwise in the description of a particular function in
    this subclause, pointer arguments on such a call shall
    still have valid values, as described in 7.1.4.

    Over in 7.1.4p1, we find:

    [...] unless explicitly stated otherwise [...] If an
    argument to a function has an invalid value (such as a
    value outside the domain of the function, or a pointer
    outside the address space of the program, or a null
    pointer, [...]) [...] the behavior is undefined.

    So, since the description of memcmp() mentions no exceptions,
    it's legal for the third argument to be zero but not legal for
    either of the first two to be NULL.

    --
    Eric Sosman, Jan 30, 2008
    #3
  4. -cnrc.gc.ca (Walter Roberson) wrote:
    > David Mathog  <> wrote:
    > > On a related note, what does the standard say memcmp()
    > > will return for the following 4 cases?  My guesses are
    > > shown in the last column.
    > >
    > >        p1     p2      Guess (why)
    > > 1      null   null    0  (because they are identical,
    > > albeit undefined)
    > > 2      null  !null   -1  (nothing < something )
    > > 3     !null   null    1  (something > nothing )
    > > -----------------------------------------------
    > > 4      n == 0         0  {0 bytes of anything is always
    > > nothing)
    > > Since memcmp() only returns a comparison value, and not
    > > an error alue which is really what's called for here,  it
    > > presumably has these "comparisons" defined in the
    > > standard.  The man pages I have seen do not address this
    > > point.

    >
    > The standard does not specifically say what happens for
    > null pointers or for n == 0.


    7.21.1p2:

    Where an argument declared as size_t n specifies the
    length of the array for a function, n can have the value
    zero on a call to that function. Unless explicitly
    stated otherwise in the description of a particular
    function in this subclause, pointer arguments on such a
    call shall still have valid values, as described in
    7.1.4. On such a call, a function that locates a
    character finds no occurrence, a function that compares
    two character sequences returns zero, and a function that
    copies characters copies zero characters.

    7.21.4.1p2

    The memcmp function compares the first n characters of
    the object pointed to by s1 to the first n characters of
    the object pointed to by s2.

    So, if either pointer is null, the behaviour is undefined.
    If 0 is passed (and both pointers are valid,) 0 will be
    returned.

    --
    Peter
    Peter Nilsson, Jan 30, 2008
    #4
  5. David Mathog

    David Mathog Guest

    David Mathog wrote:
    >
    > Any thoughts on whether the second function form would usually be
    > faster, slower, or the same?


    I forgot to mention, on gcc 4.1.2 on an opteron system the memcmp() form
    runs about 25% faster on a test with a sort of 125M words of 5 bytes
    each. So I knew that it worked better with _some_ compiler, I was just
    curious how common that was likely to be.

    Regards,

    David Mathog
    David Mathog, Jan 30, 2008
    #5
  6. David Mathog

    David Mathog Guest

    David Mathog wrote:
    > David Mathog wrote:
    >>
    >> Any thoughts on whether the second function form would usually be
    >> faster, slower, or the same?

    >
    > I forgot to mention, on gcc 4.1.2 on an opteron system the memcmp() form
    > runs about 25% faster on a test with a sort of 125M words of 5 bytes
    > each. So I knew that it worked better with _some_ compiler, I was just
    > curious how common that was likely to be.


    Turns out the fastest variant so far is neither of the above (all tests
    at gcc -03).

    Using memcmp() helped, dropping the run time to 164 seconds. Still, the
    one below ran in just 144 seconds. The program does a lot of IO,
    unclear how much of the time was in qsort() and how much in IO, but the
    IO sections were unchanged and should have run in exactly the same time.
    In the tests gbl_bws was 5 and there were around 120M words in the
    sort buffer.

    /* based on the memcpm() implementation here:
    http://mia.ece.uic.edu/cgi-bin/lxr/http/source/memcmp.c?v=openvpn-1.4.2
    */
    int compare_bw(const void *s1, const void *s2){
    register unsigned const char *p1 = s1, *p2 = s2;
    register int d;
    register int n;

    if (gbl_bws > 0){
    n = gbl_bws;
    while (n-- > 0){
    d = *p1++ - *p2++;
    if (d != 0)return(d);
    }
    }
    return 0;
    }

    Probably this program shouldn't be using the qsort() from the
    dynamically linked library anyway, since that can't inline the
    comparison function into the sort function. No big deal to pick
    up a qsort source and put it into this code, still, why should
    the programmer have to do this? Do any C compilers offer an
    option to compile and statically link their qsort() in order
    to achieve better optimization? In other words, rather than
    the end user having to physically paste the qsort code
    in somewhere, can some compilers do it automagically? This
    pretty much only comes up with qsort() since it makes a lot of
    calls to the comparison function, and so is ripe for some
    optimization at compile time.

    Regards,

    David Mathog
    David Mathog, Jan 30, 2008
    #6
  7. David Mathog

    Randy Howard Guest

    On Wed, 30 Jan 2008 16:43:48 -0600, David Mathog wrote
    (in article <fnquj3$b30$>):

    > David Mathog wrote:
    >>
    >> Any thoughts on whether the second function form would usually be
    >> faster, slower, or the same?

    >
    > I forgot to mention, on gcc 4.1.2 on an opteron system the memcmp() form
    > runs about 25% faster on a test with a sort of 125M words of 5 bytes
    > each. So I knew that it worked better with _some_ compiler, I was just
    > curious how common that was likely to be.


    memcmp() was hideously slow in some versions of Microsoft's library, it
    may have improved recently, I don't use it anymore.


    --
    Randy Howard (2reply remove FOOBAR)
    "The power of accurate observation is called cynicism by those
    who have not got it." - George Bernard Shaw
    Randy Howard, Jan 31, 2008
    #7
  8. David Mathog

    pete Guest

    David Mathog wrote:
    >
    > David Mathog wrote:
    > > David Mathog wrote:
    > >>
    > >> Any thoughts on whether the second function form would usually be
    > >> faster, slower, or the same?

    > >
    > > I forgot to mention,
    > > on gcc 4.1.2 on an opteron system the memcmp() form
    > > runs about 25% faster on a test with a sort of 125M words of 5 bytes
    > > each. So I knew that it worked better with _some_ compiler,
    > > I was just curious how common that was likely to be.

    >
    > Turns out the fastest variant
    > so far is neither of the above (all tests at gcc -03).
    >
    > Using memcmp() helped, dropping the run time to 164 seconds.
    > Still, the
    > one below ran in just 144 seconds. The program does a lot of IO,
    > unclear how much of the time was in qsort() and how much in IO,
    > but the
    > IO sections were unchanged and
    > should have run in exactly the same time.
    > In the tests gbl_bws was 5 and there were around 120M words in the
    > sort buffer.
    >
    > /* based on the memcpm() implementation here:
    > http://mia.ece.uic.edu/cgi-bin/lxr/http/source/memcmp.c?v=openvpn-1.4.2
    > */
    > int compare_bw(const void *s1, const void *s2){
    > register unsigned const char *p1 = s1, *p2 = s2;
    > register int d;
    > register int n;
    >
    > if (gbl_bws > 0){
    > n = gbl_bws;
    > while (n-- > 0){
    > d = *p1++ - *p2++;
    > if (d != 0)return(d);
    > }
    > }
    > return 0;
    > }
    >
    > Probably this program shouldn't be using the qsort() from the
    > dynamically linked library anyway, since that can't inline the
    > comparison function into the sort function. No big deal to pick
    > up a qsort source and put it into this code, still, why should
    > the programmer have to do this?


    I don't know.
    If you're sorting arrays of pointers, then you could try ucptrsort.
    If you're sorting arrays of arrays,
    then I'd have to write a MOV(A,B) macro.

    /* BEGIN ucptrsort.h */

    #ifndef H_UCPTRSORT_H
    #define H_UCPTRSORT_H

    #include <stddef.h>

    typedef unsigned char * e_type;

    extern size_t gbl_bws;

    void ucptrsort(e_type *array, size_t nmemb);
    void hsortm(e_type *array, size_t nmemb);
    void i_sort(e_type *array, size_t nmemb); /* unstable insertionsort */

    #endif

    /* END ucptrsort.h */


    /* BEGIN ucptrsort.c */

    #include <assert.h>
    #include <string.h>

    #include "ucptrsort.h"

    #define GT(A, B) (memcmp(A, B, gbl_bws) > 0)
    #define SMALL_PARTITION 50
    /*
    ** Reduce SMALL_PARTITION for slower moving data types.
    ** for example: 50 for unsigned, 20 for long double.
    ** SMALL_PARTITION must be made to be greater than or equal to 4.
    ** assert(SMALL_PARTITION >= 4);
    ** Use i_sort instead of ucptrsort, for sorting small arrays.
    */
    #define LU_RAND_SEED 123456789LU
    #define LU_RAND(S) ((S) * 69069 + 362437)
    /*
    ** Use:
    ** #define LU_RAND(S) ((S) * 69069 + 362437)
    ** for better sorting
    ** Use:
    ** #define LU_RAND(S) ((S) * 69069 + 362437 & 0XFFFFFFFF)
    ** to avoid implementation defined behavior
    */
    #define SWAP(A, B, T) ((void)((T) = *(A), *(A) = *(B), *(B) = (T)))
    #define SORT_3(A, B, C, T) \
    if (GT((A), (B))) { \
    (T) = *(A); \
    if (GT((C), (B))) { \
    *(A) = *(B); \
    if (GT(&(T), (C))) { \
    *(B) = *(C); \
    *(C) = (T); \
    } else { \
    *(B) = (T); \
    } \
    } else { \
    *(A) = *(C); \
    *(C) = (T); \
    } \
    } else { \
    if (GT((B), (C))) { \
    if (GT((A), (C))) { \
    (T) = *(A); \
    *(A) = *(C); \
    *(C) = *(B); \
    *(B) = (T); \
    } else { \
    SWAP((B), (C), (T)); \
    } \
    } \
    }
    #define SIFTDOWNM(A, I, N, P, C, T) \
    { \
    (P) = (I); \
    (T) = (A)[(P)]; \
    (C) = (P) * 2; \
    while ((N) > (C)) { \
    if (GT((A) + (C) + 1, (A) + (C))) { \
    ++(C); \
    } \
    if (GT((A) + (C), &(T))) { \
    (A)[(P)] = (A)[(C)]; \
    (P) = (C); \
    (C) *= 2; \
    } else { \
    break; \
    } \
    } \
    if ((N) == (C) && GT((A) + (C), &(T))) { \
    (A)[(P)] = (A)[(C)]; \
    (P) = (C); \
    } \
    (A)[(P)] = (T); \
    }

    static int sorted(e_type *array, size_t nmemb);
    static int rev_sorted(e_type *array, size_t nmemb);
    static void rev_array(e_type *array, size_t nmemb);
    static long unsigned
    ucptrloop(e_type *array, size_t nmemb, size_t d, long unsigned seed);

    void ucptrsort(e_type *array, size_t nmemb)
    {
    size_t d, n;

    if (nmemb > 1) {
    if (!rev_sorted(array, nmemb)) {
    d = 2;
    for (n = nmemb / 4; n; n /= 2) {
    ++d;
    }
    ucptrloop(array, nmemb, d * 2, LU_RAND_SEED);
    } else {
    rev_array(array, nmemb);
    }
    }
    }

    void hsortm(e_type *array, size_t nmemb)
    {
    size_t i, child, parent;
    e_type temp;

    if (nmemb > 1) {
    i = --nmemb / 2;
    do {
    SIFTDOWNM(array, i, nmemb, parent, child, temp);
    } while (i-- != 0);
    SWAP(array, array + nmemb, temp);
    while (--nmemb != 0) {
    SIFTDOWNM(array, 0, nmemb, parent, child, temp);
    SWAP(array, array + nmemb, temp);
    }
    }
    }

    void i_sort(e_type *array, size_t nmemb)
    {
    e_type temp, *first, *middle;
    size_t counter;

    if (nmemb > 1) {
    first = middle = 1 + array;
    counter = --nmemb;
    while (--counter != 0) {
    ++first;
    if (GT(middle, first)) {
    middle = first;
    }
    }
    if (GT(array, middle)) {
    SWAP(array, middle, temp);
    }
    ++array;
    while (--nmemb != 0) {
    first = array++;
    if (GT(first, array)) {
    middle = array;
    temp = *middle;
    do {
    *middle-- = *first--;
    } while (GT(first, &temp));
    *middle = temp;
    }
    }
    }
    }

    static int sorted(e_type *array, size_t nmemb)
    {
    while (--nmemb != 0 && !GT(array, array + 1)) {
    ++array;
    }
    return !nmemb;
    }

    static int rev_sorted(e_type *array, size_t nmemb)
    {
    while (--nmemb != 0 && !GT(array + 1, array)) {
    ++array;
    }
    return !nmemb;
    }

    static void rev_array(e_type *array, size_t nmemb)
    {
    e_type temp, *end;

    for (end = array + nmemb; --end > array; ++array) {
    SWAP(array, end, temp);
    }
    }

    static long unsigned
    ucptrloop(e_type *array, size_t nmemb, size_t d,
    long unsigned seed)
    {
    e_type temp, *first, *last;

    assert(SMALL_PARTITION >= 4);
    while (nmemb > SMALL_PARTITION) {
    if (sorted(array, nmemb)) {
    return seed;
    }
    if (!d--) {
    hsortm(array, nmemb);
    return seed;
    }
    seed = LU_RAND(seed);
    first = seed % --nmemb + array;
    SWAP(array, first, temp);
    first = 1 + array;
    last = nmemb + array;
    SORT_3(first, array, last, temp);
    do {
    ++first;
    } while (GT(array, first));
    do {
    --last;
    } while (GT(last, array));
    while (last > first) {
    SWAP(last, first, temp);
    do {
    ++first;
    } while (GT(array, first));
    do {
    --last;
    } while (GT(last, array));
    }
    SWAP(array, last, temp);
    seed = ucptrloop(last + 1, nmemb + array - last, d, seed);
    nmemb = last - array;
    }
    i_sort(array, nmemb);
    return seed;
    }

    /* END ucptrsort.c */

    --
    pete
    pete, Feb 1, 2008
    #8
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Debashish Chakravarty

    [possibly OT] the comparison function in qsort

    Debashish Chakravarty, Nov 23, 2003, in forum: C Programming
    Replies:
    0
    Views:
    312
    Debashish Chakravarty
    Nov 23, 2003
  2. Charlie Zender

    Need warning-free qsort() comparison functions

    Charlie Zender, Jan 1, 2004, in forum: C Programming
    Replies:
    21
    Views:
    1,256
  3. Spoon

    Comparison function in qsort()

    Spoon, Apr 12, 2006, in forum: C Programming
    Replies:
    14
    Views:
    620
    Robert Latest
    Apr 18, 2006
  4. , India

    question on qsort library function

    , India, Mar 3, 2007, in forum: C Programming
    Replies:
    14
    Views:
    613
  5. pereges

    How to use qsort function of C library ?

    pereges, Apr 8, 2008, in forum: C Programming
    Replies:
    17
    Views:
    569
    Barry Schwarz
    Apr 9, 2008
Loading...

Share This Page