Re: bsearch in C

Discussion in 'C Programming' started by Eric Sosman, Aug 7, 2003.

  1. Eric Sosman

    Eric Sosman Guest

    Edward A Thompson wrote:
    >
    > Is there any common knowledge out there about what the threshold for
    > bsearch (stdlib.h) performing better than a sequential search of an
    > array? If I have an array of 600 structure vs 6000?


    Too many unknowns: The nature and quality of the bsearch()
    implementation, the complexity of the comparison function, the
    comparative overhead of actually calling a function for each
    comparison as opposed to performing it in-line inside the search
    loop, the behavior of your virtual memory system, the phase of
    the moon, ...

    If the bsearch() implementation uses a binary search (not
    required by the Standard, but probably pretty common), an
    average successful search will take about lg(N)-1 probes to
    find a key in an N-element array, and an average unsuccessful
    search will take about lg(N) probes to establish that the key
    isn't there. For 600 elements that's nine-and-change probes;
    for 6000 it's twelve-and-change.

    As suggested above, you can't really predict the crossover
    point given such meager data. There's more than one way to
    search sequentially -- Knuth devotes more than twelve pages
    of TAOCP to the topic -- so it's hard to predict what sort of
    performance "a sequential search" will give. Still, a few
    *very* *general* *and* *approximate* ideas:

    - A sequential search in a haphazardly arranged array
    will probe about N/2 elements for a successful search,
    or all N elements in an unsuccessful search. This
    will be greater than lg(N) for all except the smallest
    arrays -- but the probes themselves may be a good deal
    simpler, and you needn't incur the cost of sorting the
    array beforehand.

    - If you sort the array, a successful sequential search
    still averages N/2 probes but an unsuccessful search
    drops from N-for-sure to N/2-on-the-average (with some
    assumptions about the distribution of the keys).

    - Self-organizing ("move to front" or "move toward front")
    arrays can exploit non-uniform search patterns in ways
    binary search cannot match.

    All very nice in theory, but not too useful in practice. The
    first -- the VERY FIRST -- thing you should do is measure the
    performance of your program and decide whether the searching
    contributes an unacceptable amount of running time. If it does,
    consider ways to reduce that time -- but if it doesn't, all the
    time you spend optimizing it is time wasted. Utterly wasted.

    Finally, a personal opinion (with which others may disagree,
    perhaps vehemently): don't use bsearch(). The binary search
    is such a simple algorithm to implement that you might as well
    just do it in-line where it's needed, and not pay performance
    penalties for generality you don't need. (By the way, you may
    be interested in an invention of mine: I'm calling it the "wheel.")

    --
    Eric Sosman, Aug 7, 2003
    #1
    1. Advertising

  2. Eric Sosman

    Eric Sosman Guest

    Ben Pfaff wrote:
    >
    > Eric Sosman <> writes:
    >
    > > Finally, a personal opinion (with which others may disagree,
    > > perhaps vehemently): don't use bsearch(). The binary search
    > > is such a simple algorithm to implement that you might as well
    > > just do it in-line where it's needed, and not pay performance
    > > penalties for generality you don't need. (By the way, you may
    > > be interested in an invention of mine: I'm calling it the "wheel.")

    >
    > bsearch() is most useful when you've already called qsort() on
    > the array, because you already have the comparison function
    > handy.


    ... except that the comparison functions for qsort() and
    bsearch() receive different arguments. The qsort() comparator
    gets pointers to two array elements, while the bsearch()
    comparator gets one pointer to an array element and one
    pointer to a "key." Unless the key happens to be the first
    (or only) thing in the array element, you can't use the
    same comparator for both qsort() and bsearch().

    --
    Eric Sosman, Aug 8, 2003
    #2
    1. Advertising

  3. On Fri, 8 Aug 2003, Eric Sosman wrote:
    >
    > Ben Pfaff wrote:
    > > Eric Sosman <> writes:
    > >
    > > > Finally, a personal opinion (with which others may disagree,
    > > > perhaps vehemently): don't use bsearch(). The binary search
    > > > is such a simple algorithm to implement that you might as well
    > > > just do it in-line where it's needed, and not pay performance
    > > > penalties for generality you don't need. (By the way, you may
    > > > be interested in an invention of mine: I'm calling it the "wheel.")

    > >
    > > bsearch() is most useful when you've already called qsort() on
    > > the array, because you already have the comparison function
    > > handy.

    >
    > ... except that the comparison functions for qsort() and
    > bsearch() receive different arguments. The qsort() comparator
    > gets pointers to two array elements, while the bsearch()
    > comparator gets one pointer to an array element and one
    > pointer to a "key"


    ... which is allowed to be of the same type as the array elements.

    > Unless the key happens to be the first
    > (or only) thing in the array element, you can't use the
    > same comparator for both qsort() and bsearch().


    Of course you can!
    For example,

    int intcmp(const void *p, const void *q)
    {
    int i = *(const int*)p, j = *(const int*)q;
    return (i < j)? -1: (i != j);
    }

    Another brain fart?

    -Arthur
    Arthur J. O'Dwyer, Aug 8, 2003
    #3
  4. Eric Sosman

    Eric Sosman Guest

    "Arthur J. O'Dwyer" wrote:
    >
    > On Fri, 8 Aug 2003, Eric Sosman wrote:
    > >
    > > ... except that the comparison functions for qsort() and
    > > bsearch() receive different arguments. The qsort() comparator
    > > gets pointers to two array elements, while the bsearch()
    > > comparator gets one pointer to an array element and one
    > > pointer to a "key"

    >
    > ... which is allowed to be of the same type as the array elements.


    .... which is exactly what I wrote in the very first phrase of
    the very next sentence.

    > > Unless the key happens to be the first
    > > (or only) thing in the array element, you can't use the
    > > same comparator for both qsort() and bsearch().

    >
    > Of course you can!
    > For example,
    >
    > int intcmp(const void *p, const void *q)
    > {
    > int i = *(const int*)p, j = *(const int*)q;
    > return (i < j)? -1: (i != j);
    > }
    >
    > Another brain fart?


    My brain has been known to emit noxious fumes from time
    to time, but I think this notion is fully digested. Example:

    struct item {
    double trouble;
    char *coal;
    ...
    int key;
    ...
    float ation;
    void *main;
    };

    The challenge: use qsort() to sort an array of these
    things on their `key' elements, and then use bsearch() with
    the same comparison functions to search the sorted array.
    This won't work:

    int key = 42;
    bsearch (&key, array, count, sizeof array[0],
    same_comparator_as_qsort);

    It *is* possible to re-use the comparator, at the cost
    of manufacturing a fake node simply to carry the key data:

    struct item fake;
    fake.key = 42;
    bsearch (&fake, ...);

    With a small struct like the one shown, this isn't bad --
    but with larger structs it can become inconvenient.

    --
    Eric Sosman, Aug 8, 2003
    #4
  5. Eric Sosman

    Al Bowers Guest

    Eric Sosman wrote:
    > Ben Pfaff wrote:
    >
    >>Eric Sosman <> writes:
    >>
    >>
    >>> Finally, a personal opinion (with which others may disagree,
    >>>perhaps vehemently): don't use bsearch(). The binary search
    >>>is such a simple algorithm to implement that you might as well
    >>>just do it in-line where it's needed, and not pay performance
    >>>penalties for generality you don't need. (By the way, you may
    >>>be interested in an invention of mine: I'm calling it the "wheel.")

    >>
    >>bsearch() is most useful when you've already called qsort() on
    >>the array, because you already have the comparison function
    >>handy.

    >
    >
    > ... except that the comparison functions for qsort() and
    > bsearch() receive different arguments. The qsort() comparator
    > gets pointers to two array elements, while the bsearch()
    > comparator gets one pointer to an array element and one
    > pointer to a "key." Unless the key happens to be the first
    > (or only) thing in the array element, you can't use the
    > same comparator for both qsort() and bsearch().
    >

    Nope; function qsort and bsearch can use the same comparator function.

    #include <stdio.h>
    #include <stdlib.h>

    typedef INTARRAY[6];

    void PrintINTARRAY(INTARRAY a);
    int cmp(const void *p, const void *q);

    int main(void)
    {
    INTARRAY iarr = { 6,4,9,3,1,10};
    int key = 9;

    puts("Unsorted");
    PrintINTARRAY(iarr);
    qsort(iarr,6,sizeof(int),cmp);
    puts("\nSorted");
    PrintINTARRAY(iarr);
    puts("\nLooking for integer 9 in array.....");
    if(bsearch(&key,iarr,6,sizeof(int),cmp))
    puts("\nThe integer 9 was found in the array");
    else puts("\nThe integer 9 was not found in the array");
    return 0;
    }

    void PrintINTARRAY(INTARRAY a)
    {
    int i, nelements = sizeof(INTARRAY)/sizeof(a);

    for(i = 0;i < nelements;i++)
    printf("%d\t",a);
    putchar('\n');
    return;
    }

    int cmp(const void *v1, const void *v2)
    {
    const int *i1 = v1, *i2 = v2;
    return (*i1 < *i2)? -1: (*i1 != *i2);
    }

    --
    Al Bowers
    Tampa, Fl USA
    mailto: (remove the x)
    http://www.geocities.com/abowers822/
    Al Bowers, Aug 8, 2003
    #5
  6. Eric Sosman

    Guest

    Eric Sosman <> wrote:
    >
    > The qsort() comparator
    > gets pointers to two array elements, while the bsearch()
    > comparator gets one pointer to an array element and one
    > pointer to a "key." Unless the key happens to be the first
    > (or only) thing in the array element, you can't use the
    > same comparator for both qsort() and bsearch().


    Not so, you just need to ensure that your key has the same type as your
    array elements. In fact, most pre-ANSI specs for bsearch() neglected to
    specify which comparison function parameter was the key, which pretty
    much forced you to do the same thing.

    -Larry Jones

    Why is it you always rip your pants on the day everyone has to
    demonstrate a math problem at the chalkboard? -- Calvin
    , Aug 8, 2003
    #6
  7. Eric Sosman

    Eric Sosman Guest

    wrote:
    >
    > Eric Sosman <> wrote:
    > >
    > > The qsort() comparator
    > > gets pointers to two array elements, while the bsearch()
    > > comparator gets one pointer to an array element and one
    > > pointer to a "key." Unless the key happens to be the first
    > > (or only) thing in the array element, you can't use the
    > > same comparator for both qsort() and bsearch().

    >
    > Not so, you just need to ensure that your key has the same type as your
    > array elements. In fact, most pre-ANSI specs for bsearch() neglected to
    > specify which comparison function parameter was the key, which pretty
    > much forced you to do the same thing.


    This is becoming spooky. Three count them three
    respondents (so far) have offered counterexamples to my
    contention that you can't, in general, use the same
    comparator for qsort() and bsearch():

    - Arthur J. O'Dwyer shows a comparison function
    that works on a pair of `int' objects,

    - Al Bowers demonstrates a complete program to
    qsort() and bsearch() an array of `int',

    - Larry Jones points out that everything works if
    the key and the array elements have the same
    type.

    The spooky part is that all three of these folks are
    right, as far as they go -- but that even though all three
    quoted the sentence beginning with "Unless," not one of
    them seems to have read it.

    YES, people! There ARE cases where you CAN use the
    same comparator for both qsort() and bsearch()! I admit
    it freely! You've got me! It's a fair cop.

    ... but it's a SPECIAL CASE, not a general property
    of the qsort() and bsearch() functions. It is SOMETIMES
    true that you can use the same comparison function with
    both qsort() and bsearch(), but NOT ALWAYS true.

    (By the way, I've made an improvement to my new
    "wheel" invention: the latest model is triangular instead
    of square, eliminating twenty-five percent of the jolts.)

    --
    Eric Sosman, Aug 8, 2003
    #7
  8. Eric Sosman

    Guest

    Rich Grise <> wrote:
    >
    > The challenge: use qsort() to sort an array of these
    > things on their `key' elements, and then use bsearch() with
    > the same comparison functions to search the sorted array.
    > This won't work:
    >
    > int key = 42;
    > bsearch (&key, array, count, sizeof array[0],
    > same_comparator_as_qsort);


    No, but this will:

    struct item keyitem;
    keyitem.key = 42;
    bsearch(&keyitem, array, count, sizeof array[0],
    same_comparator_as_qsort);

    -Larry Jones

    I never get to do anything fun. -- Calvin
    , Aug 9, 2003
    #8
  9. Eric Sosman

    Ben Pfaff Guest

    Eric Sosman <> writes:

    > The challenge: use qsort() to sort an array of these
    > things on their `key' elements, and then use bsearch() with
    > the same comparison functions to search the sorted array.
    > This won't work:
    >
    > int key = 42;
    > bsearch (&key, array, count, sizeof array[0],
    > same_comparator_as_qsort);
    >
    > It *is* possible to re-use the comparator, at the cost
    > of manufacturing a fake node simply to carry the key data:
    >
    > struct item fake;
    > fake.key = 42;
    > bsearch (&fake, ...);


    That's the way I've always used it, and that's the way I've
    always assumed it was meant to be used.
    Ben Pfaff, Aug 11, 2003
    #9
  10. Ben Pfaff wrote:

    > Eric Sosman <> writes:
    >
    >> It *is* possible to re-use the comparator, at the cost
    >> of manufacturing a fake node simply to carry the key data:
    >>
    >> struct item fake;
    >> fake.key = 42;
    >> bsearch (&fake, ...);

    >
    > That's the way I've always used it, and that's the way I've
    > always assumed it was meant to be used.


    Curiously, I seem to recall that it isn't the way I started off using it. At
    some point, I discovered that this was (what I now think of as) The Right
    Way To Use bsearch, but I seem to recall that as a newbie I used to just
    pass the key field itself into bsearch and hope like hell that it would
    work. (Often, amazingly, it did.)

    --
    Richard Heathfield :
    "Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
    C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
    K&R answers, C books, etc: http://users.powernet.co.uk/eton
    Richard Heathfield, Aug 12, 2003
    #10
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Artie Gold

    Re: bsearch in C

    Artie Gold, Aug 7, 2003, in forum: C Programming
    Replies:
    0
    Views:
    392
    Artie Gold
    Aug 7, 2003
  2. Richard Heathfield

    Re: bsearch in C

    Richard Heathfield, Aug 7, 2003, in forum: C Programming
    Replies:
    0
    Views:
    393
    Richard Heathfield
    Aug 7, 2003
  3. Ramprasad A Padmanabhan

    a small bsearch example

    Ramprasad A Padmanabhan, Oct 28, 2003, in forum: C Programming
    Replies:
    1
    Views:
    3,768
    Mark Gordon
    Oct 28, 2003
  4. Ramprasad A Padmanabhan

    bsearch script segfault

    Ramprasad A Padmanabhan, Dec 4, 2003, in forum: C Programming
    Replies:
    11
    Views:
    486
    Ramprasad A Padmanabhan
    Dec 5, 2003
  5. Angus Comber

    Struggling with bsearch - on a struct!

    Angus Comber, Feb 5, 2004, in forum: C Programming
    Replies:
    4
    Views:
    353
    Al Bowers
    Feb 6, 2004
Loading...

Share This Page