Algorithms in C - Robert Sedgewick

Discussion in 'C Programming' started by arnuld, Sep 8, 2009.

  1. arnuld

    arnuld Guest

    This code is from chapter 2 of Algorithms in C:

    Sequential Search:

    int search( int a[], int v, int l, int r)
    {
        int i;
        for(i = 1; i <= r; i++)
            if(v == a[i]) return i;

        return -1;
    }




    Binary Search:

    int search( int a[], int v, int l, int r )
    {
        while( r >= 1 )
        {
            int m = (l + r)/2;
            if(v == a[m]) return m;
            if(v < a[m]) r = m - 1;
            else l = m + 1;
        }

        return -1;
    }
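
    For reference, here is a minimal sketch of what both routines look like
    in idiomatic 0-based C. This is my reading of the intended semantics
    (search a[l..r] for v and return its index, or -1), not the book's
    exact code:

    int seq_search(const int a[], int v, int l, int r)
    {
        /* Scan a[l..r] left to right; first match wins. */
        for (int i = l; i <= r; i++)
            if (v == a[i])
                return i;
        return -1;
    }

    int bin_search(const int a[], int v, int l, int r)
    {
        /* a[l..r] must be sorted in ascending order. */
        while (l <= r) {
            int m = l + (r - l) / 2;   /* avoids overflow of l + r */
            if (v == a[m])
                return m;
            if (v < a[m])
                r = m - 1;
            else
                l = m + 1;
        }
        return -1;
    }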




    Now the author explains the average cost of S-Search as:

    (1 + 2 + 3 + .... + N)/N = (N + 1)/2

    and the worst-case cost of B-Search as:

    T_N <= T_{N/2} + 1 for N >= 2, with T_1 = 1




    I am going to ask a very simple question. By just looking at the code of
    both searches, any man with brains can simply conclude that B-Search, on
    a sorted input, is always cheaper as compared to S-Search. If it is so
    clear from the code, why do I have to go through the incomprehensible
    junk of N, T_N, C_N, etc.? The book is about Algorithms in C, but when I
    started to read, the author was teaching me a whole recurrence series
    (which of course is an obtuse thing to understand):

    C_N = C_{N-1} + N    for N >= 2, with C_1 = 1

    C_N = C_{N-2} + (N - 1) + N
        = C_{N-3} + (N - 2) + (N - 1) + N
        = ...
        = C_1 + 2 + ... + (N - 2) + (N - 1) + N
        = 1 + 2 + 3 + ... + (N - 2) + (N - 1) + N
        = N(N + 1)/2


    Is it necessary to understand whole exponential series and recurrences
    and permutations and combinations and logarithmic series before you even
    start to comprehend a search or a sort or any algorithm at all? If that
    is so, why would I use Sedgewick? I could just read Hall & Knight's
    "Higher Algebra".

    My experience says it's enough to know that N is less than N^2, which is
    less than 2^N, and that log N is the smallest of them all; if B-Search
    takes log N on a sorted input and S-Search takes N, then we can go with a
    B-Search based algorithm, provided you have estimated the cost of sorting
    first.

    So should I go with reading all the maths, or just go to some random
    chapter and start working on an algorithm in C?




    --

    www.lispmachine.wordpress.com
    my email is @ the above blog.
     
    arnuld, Sep 8, 2009
    #1

  2. arnuld

    Eric Sosman Guest

    arnuld wrote:
    > This code is from chapter 2 of Algorithms in C:
    >
    > Sequential Search:
    >
    > int search( int a[], int v, int l, int r)
    > {
    > int i;
    > for(i = 1; i <= r; i++)
    > if(v == a[i]) return i;
    >
    > return -1;
    > }


    I don't have the book, but this looks very un-C-like.
    Ordinarily, one would expect element [0] of an array to
    hold something useful, and might not expect element [r] to
    exist at all. Is this offered as real C code, or just as
    the first stage in a transliteration from a language with
    [1]-based arrays, possibly Fortran?

    > Binary Search:
    >
    > int search( int a[], int v, int l, int r )
    > {
    > while( r >= 1 )
    > {
    > int m = (l + r)/2;
    > if(v == a[m]) return m;
    > if(v < a[m]) r = m - 1;
    > else l = m + 1;
    > }
    >
    > return -1;
    > }


    The termination criterion looks completely wrong. Are
    you sure you've transcribed the code correctly?

    > Now the author explains the average cost of S-Search as:
    >
    > (1 + 2 + 3 + .... + N)/N = (N + 1)/2
    >
    > and the worst-case cost of B-Search as:
    >
    > T_N <= T_{N/2} + 1 for N >= 2, with T_1 = 1


    > I am going to ask a very simple question. By just looking at the code of
    > both searches, any man with brains can simply conclude that B-Search, on
    > a sorted input, is always cheaper as compared to S-Search.


    Yes, any man with brains could conclude that, if his brains
    didn't function very well. The sequential search will take more
    steps than the binary search, it's true -- but the "step" of one
    is simpler (cheaper) than the "step" of the other. Look at the
    sequential "step:" two comparisons and an addition. Look at
    the binary "step:" three comparisons, two additions, and a
    division.

    > Is it necessary to understand whole exponential series and recurrences
    > and permutations and combinations and logarithmic series before you even
    > start to comprehend a search or sorting or any algorithm at all ?


    No: You can choose to learn things by rote and not bother
    with understanding them. Carry your grimoire everywhere, and
    hope you never encounter a situation that's just a little bit
    different, something that isn't written down for you in exactly
    the form you encounter it.

    Or you can arm yourself with the ability to do a little
    analysis unaided, so you won't be thrown for a loop if things
    are just a little different from those you saw in the book. For
    example, consider applying the binary search to a sorted array
    containing some equal elements, the task being to find the low
    and high indices of the span of elements equal to `v', or to
    determine that `v' is not present. There are at least three
    obvious ways to approach the task; how will you choose one?
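
    One of those ways, as a sketch (the name equal_range and the inclusive
    lo/hi convention are mine, not the book's): run two biased binary
    searches, one that keeps moving left after a match and one that keeps
    moving right.

    /* Find the first and last indices of v in sorted a[lo..hi].
       Returns 0 and fills *first/*last, or -1 if v is absent. */
    int equal_range(const int a[], int v, int lo, int hi,
                    int *first, int *last)
    {
        int f = -1, g = -1;
        int left = lo, right = hi;
        while (left <= right) {                /* leftmost occurrence */
            int m = left + (right - left) / 2;
            if (a[m] < v) {
                left = m + 1;
            } else {
                if (a[m] == v) f = m;          /* remember it, go left */
                right = m - 1;
            }
        }
        if (f == -1) return -1;
        left = lo; right = hi;
        while (left <= right) {                /* rightmost occurrence */
            int m = left + (right - left) / 2;
            if (a[m] > v) {
                right = m - 1;
            } else {
                if (a[m] == v) g = m;          /* remember it, go right */
                left = m + 1;
            }
        }
        *first = f;
        *last = g;
        return 0;
    }

    Both passes stay O(log N); the analysis question is whether two biased
    passes beat one plain search followed by linear scans outward from the
    first hit.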

    > My experience says it's enough to know that N is less than N^2, which is
    > less than 2^N, and that log N is the smallest of them all; if B-Search
    > takes log N on a sorted input and S-Search takes N, then we can go with a
    > B-Search based algorithm, provided you have estimated the cost of sorting first.


    Have you ever noticed that practical Quicksort implementations
    often switch to insertion sort or something of the kind when the
    sub-files get short enough? Why would they abandon the O(N*logN)
    partition-exchange method in favor of an O(N*N) method? Are they
    just being perverse -- or is there something you've failed to
    notice?
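
    The pattern looks roughly like this -- a sketch, not any particular
    library's code, with a cutoff of 16 as a placeholder you would tune by
    measurement:

    #define CUTOFF 16   /* placeholder; typical tuned values are ~8..32 */

    static void insertion_sort(int a[], int lo, int hi)
    {
        for (int i = lo + 1; i <= hi; i++) {
            int v = a[i], j = i;
            while (j > lo && a[j - 1] > v) {
                a[j] = a[j - 1];       /* shift larger elements right */
                j--;
            }
            a[j] = v;
        }
    }

    static void quicksort(int a[], int lo, int hi)
    {
        if (hi - lo < CUTOFF) {        /* tiny sub-file: simple loop wins */
            insertion_sort(a, lo, hi);
            return;
        }
        int pivot = a[lo + (hi - lo) / 2];
        int i = lo, j = hi;
        while (i <= j) {               /* Hoare-style partition */
            while (a[i] < pivot) i++;
            while (a[j] > pivot) j--;
            if (i <= j) {
                int t = a[i]; a[i] = a[j]; a[j] = t;
                i++; j--;
            }
        }
        quicksort(a, lo, j);
        quicksort(a, i, hi);
    }

    The O(N*N) method wins on tiny sub-files because its step is so cheap --
    the same step-cost-versus-step-count trade-off as in the two searches.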

    > So should I go with reading all the maths, or just go to some random
    > chapter and start working on an algorithm in C?


    It's your choice: memorize, or comprehend. Each approach has
    its advantages and disadvantages.

    --
    Eric Sosman
     
    Eric Sosman, Sep 8, 2009
    #2

  3. arnuld wrote:
    >
    > [snip...] [snip...] [snip...]
    >
    > I am going to ask a very simple question. By just looking at the code of
    > both searches, any man with brains can simply conclude that B-Search, on
    > a sorted input, is always cheaper as compared to S-Search.


    But a binary search is *not* always cheaper than a sequential
    search. For just a few items, sequential search can be cheaper.
    There is a "breakpoint" where the binary search becomes cheaper.
    This may be somewhere between five and nine items.
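
    In code, the breakpoint idea might look like this sketch (BREAKPOINT = 8
    is a placeholder; the real value has to be measured on your machine):

    enum { BREAKPOINT = 8 };   /* placeholder; profile to find yours */

    int hybrid_search(const int a[], int v, int l, int r)
    {
        while (r - l >= BREAKPOINT) {      /* binary phase */
            int m = l + (r - l) / 2;
            if (v < a[m])      r = m - 1;
            else if (v > a[m]) l = m + 1;
            else               return m;
        }
        for (int i = l; i <= r; i++)       /* linear phase, few items left */
            if (a[i] == v) return i;
        return -1;
    }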


    --
    +----------------------------------------+
    | Charles and Francis Richmond |
    | |
    | plano dot net at aquaporin4 dot com |
    +----------------------------------------+
     
    Charles Richmond, Sep 8, 2009
    #3
  4. arnuld

    Default User Guest

    Eric Sosman wrote:

    > arnuld wrote:
    > > This code is from chapter 2 of Algorithms in C:
    > >
    > > Sequential Search:
    > >
    > > int search( int a[], int v, int l, int r)
    > > {
    > > int i;
    > > for(i = 1; i <= r; i++)
    > > if(v == a[i]) return i;
    > >
    > > return -1;
    > > }

    >
    > I don't have the book, but this looks very un-C-like.
    > Ordinarily, one would expect element [0] of an array to
    > hold something useful, and might not expect element [r] to
    > exist at all. Is this offered as real C code, or just as
    > the first stage in a transliteration from a language with
    > [1]-based arrays, possibly Fortran?


    As I recall, the original book was written in Pascal, and Sedgewick
    used 1-based arrays for the C version as well.

    The code is on his web site:

    <http://www.cs.princeton.edu/~rs/Algs3.c1-4/code.txt>



    Brian
     
    Default User, Sep 8, 2009
    #4
  5. On Sep 8, 8:43 am, Eric Sosman <> wrote:
    > arnuld wrote:
    >
    > > Binary Search:

    >
    > > int search( int a[], int v, int l, int r )
    > > {
    > >    while( r >= 1 )
    > >      {
    > >         int m = (l + r)/2;
    > >         if(v == a[m]) return m;
    > >         if(v < a[m]) r = m - 1;
    > >         else l = m + 1;
    > >       }

    >
    > >    return -1;
    > > }

    >
    >      The termination criterion looks completely wrong.  Are
    > you sure you've transcribed the code correctly?


    While I don't have that book on hand at the moment to verify the
    transcription, I can safely say that the code in it is generally quite
    poor, in terms of readability, style, and overall correctness. I spent
    more time fixing the code than learning from it, though otherwise the
    book is excellent and I still recommend it with a warning not to rely
    on the code.
     
    Julienne Walker, Sep 8, 2009
    #5
  6. arnuld

    James Kuyper Guest

    arnuld wrote:
    > This code is from chapter 2 of Algorithms in C:
    >
    > Sequential Search:
    >
    > int search( int a[], int v, int l, int r)
    > {
    > int i;
    > for(i = 1; i <= r; i++)
    > if(v == a[i]) return i;
    >
    > return -1;
    > }


    That looks like Fortran-inspired C code. I'm not personally familiar
    with this book; it might have some very useful information about
    algorithms. However, I recommend not paying too much attention to the C
    code; you're better off learning a more idiomatic C programming style.

    ....
    > Now the author explains the average cost of S-Search as:
    >
    > (1 + 2 + 3 + .... + N)/N = (N + 1)/2
    >
    > and the worst-case cost of B-Search as:
    >
    > T_N <= T_{N/2} + 1 for N >= 2, with T_1 = 1
    >
    >
    >
    >
    > I am going to ask a very simple question. By just looking at the code of
    > both searches, any man with brains can simply conclude that B-Search, on
    > a sorted input, is always cheaper as compared to S-Search.


    It is not always cheaper, and in some cases it can be more expensive.

    The first issue is the frequency with which items on the list are
    searched for. If some items in a list are searched for a lot more
    frequently than others, and the list is sorted in decreasing order by
    frequency of use, it's quite possible for the linear search to be as
    fast or even faster than a binary search. In order for this to be true,
    the chance that a given item will be searched for has to be very high
    for the first few items in the list, and that chance has to drop very
    rapidly with each of the first several items on the list. Lists with
    those characteristics are not commonplace, but neither are they as rare
    as you might think.
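
    To make that concrete, here is a small illustration of my own (not from
    the book): it computes the expected comparison count of a sequential
    search when accesses follow a sharply front-loaded geometric
    distribution, and compares it with the roughly log2(N) cost of a binary
    search:

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        enum { N = 1000000 };
        /* Item i is searched for with probability proportional to p^i,
           i.e. the first few items get almost all of the traffic. */
        double p = 0.5, weight = 1.0, total = 0.0, expected = 0.0;
        for (int i = 0; i < N && weight > 1e-18; i++) {
            expected += (i + 1) * weight;   /* i+1 comparisons to reach i */
            total    += weight;
            weight   *= p;
        }
        printf("sequential, skewed accesses: %.2f comparisons on average\n",
               expected / total);
        printf("binary search, worst case:   about %.0f comparisons\n",
               ceil(log2((double)N)));
        return 0;
    }

    With p = 0.5 the sequential search averages 2 comparisons against the
    binary search's 20, which is the point.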

    The second issue is the length of the list - binary search is a little
    more complicated to set up than a linear search, and each binary search
    step is a bit more complicated than the corresponding linear search
    step. It isn't much of a difference, but for sufficiently short lists
    it's a difference that can make a linear search as fast as, if not
    faster than, a binary search. How short is "sufficiently short"? That's
    highly implementation-dependent. However, I'd never even dream of using
    a binary search on a list with as few as 5 elements; the break-even
    point is usually quite a bit higher than that.

    The first case is really just an extension of the second. A list
    where, most of the time, only the first few items have to be searched
    through is effectively very similar to a list that has only a few items.
     
    James Kuyper, Sep 8, 2009
    #6
  7. arnuld

    Default User Guest

    Eric Sosman wrote:


    > Well, possibly -- I wouldn't know. But in the case at hand, it
    > seems more than likely that the O.P. has mis-transcribed "l" as "1",
    > don't you think? And I sort of wonder what other transcription
    > errors may have been introduced along with that one ...


    That's not the case.



    Brian
     
    Default User, Sep 9, 2009
    #7
  8. arnuld

    arnuld Guest

    > On Tue, 08 Sep 2009 08:43:12 -0400, Eric Sosman wrote:
    >> arnuld wrote:


    > I don't have the book, but this looks very un-C-like.
    > Ordinarily, one would expect element [0] of an array to hold something
    > useful, and might not expect element [r] to exist at all. Is this
    > offered as real C code, or just as the first stage in a transliteration
    > from a language with [1]-based arrays, possibly Fortran?


    I guess Pascal. I got that from accu.org.



    >> Binary Search:
    >>
    >> int search( int a[], int v, int l, int r ) {
    >> while( r >= 1 )
    >> {
    >> int m = (l + r)/2;
    >> if(v == a[m]) return m;
    >> if(v < a[m]) r = m - 1;
    >> else l = m + 1;
    >> }
    >>
    >> return -1;
    >> }



    > The termination criterion looks completely wrong. Are
    > you sure you've transcribed the code correctly?


    Well, the condition inside the while loop was typed incorrectly: it's
    (r >= l), with l (ell), not 1 (one). They are confusing to me all the
    time. If you look at the code I have posted in CLC and CLC++ in the last
    2 years, you will never see me using l as a variable name.




    > No: You can choose to learn things by rote and not bother
    > with understanding them. Carry your grimoire everywhere, and hope you
    > never encounter a situation that's just a little bit different,
    > something that isn't written down for you in exactly the form you
    > encounter it.


    > ...SNIP....


    > Have you ever noticed that practical Quicksort implementations
    > often switch to insertion sort or something of the kind when the
    > sub-files get short enough? Why would they abandon the O(N*logN)
    > partition-exchange method in favor of an O(N*N) method? Are they just
    > being perverse -- or is there something you've failed to notice?


    Yes, I failed to notice that. It's a new insight you have given me; I
    could never have deduced it myself.



    >> So should I go with reading all the maths, or just go to some random
    >> chapter and start working on an algorithm in C?


    > It's your choice: memorize, or comprehend. Each approach has
    > its advantages and disadvantages.


    Well, I am willing to buy a very basic book ("Higher Algebra" by Hall &
    Knight) and learn maths. The trouble is, I still don't get how math is
    connected with finding which kind of loop in C will take more CPU time;
    and if algebra does help in finding which piece of C code will make
    efficient use of the CPU, then where do I need to start in order to
    understand C^2, C_{2^n + 1}, or log N?

    Math always seemed incomprehensible to me. It looks like Newton's law of
    gravitation to me: before Newton discovered the law of gravitation,
    apples still fell to the ground; gravitation existed before Newton saw
    the apple falling. I feel the same about math: the fundamentals that
    math explains exist without math. I am not arguing, I am just trying to
    understand.






    --
    www.lispmachine.wordpress.com
    my email is @ the above blog.
     
    arnuld, Sep 9, 2009
    #8
  9. arnuld

    arnuld Guest

    > On Wed, 09 Sep 2009 04:49:25 +0000, arnuld wrote:
    > Math always seemed incomprehensible to me. It looks like Newton's law of
    > gravitation to me: before Newton discovered the law of gravitation,
    > apples still fell to the ground; gravitation existed before Newton saw
    > the apple falling. I feel the same about math: the fundamentals that
    > math explains exist without math. I am not arguing, I am just trying to
    > understand.



    What I exactly meant was: why shouldn't I understand those fundamentals
    directly, the ones that exist before the math? We are writing code, C or
    Lisp or whatever; we are doing programming, which is a different skill
    from maths.




    --
    www.lispmachine.wordpress.com
    my email is @ the above blog.
     
    arnuld, Sep 9, 2009
    #9
  10. Richard Heathfield <> writes:
    [...]
    > 1, 2, 3, 4, 5, 6, 7, 8, 9, ..... 1048573, 1048574, 1048575.
    >
    > To find the key 1 via binary search will take 20 comparisons, whereas
    > it will only take 1 comparison via sequential search.
    >
    > Nevertheless, *on average* binary search is much, much quicker. The
    > average sequential search will take half a million comparisons for
    > the above list, whereas a binary search on the above list (assuming
    > it is a straightforward sorted linear list, that is) will never take
    > more than 20 comparisons.


    The average *successful* search will take half a million comparisons.
    An unsuccessful search will take a million. The number of comparisons
    required for an average search depends on how many of them are
    successful (and on the distribution of the values successfully
    searched for).

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Sep 9, 2009
    #10
  11. arnuld

    Phil Carmody Guest

    arnuld <> writes:
    > This code is from chapter 2 of Algorithms in C:
    >
    > Sequential Search:
    >
    > int search( int a[], int v, int l, int r)
    > {
    > int i;
    > for(i = 1; i <= r; i++)


    Don't learn Algorithms in C.

    Learn algorithms. Implement them in C later.

    Don't learn Numerical Recipes in C either.

    Phil
    --
    Any true emperor never needs to wear clothes. -- Devany on r.a.s.f1
     
    Phil Carmody, Sep 9, 2009
    #11
  12. arnuld

    James Kuyper Guest

    arnuld wrote:
    ....
    > Well, I am willing to buy a very basic book ("Higher Algebra" by Hall &
    > Knight) and learn maths. The trouble is, I still don't get how math is
    > connected with finding which kind of loop in C will take more CPU time;
    > and if algebra does help in finding which piece of C code will make
    > efficient use of the CPU, then where do I need to start in order to
    > understand C^2, C_{2^n + 1}, or log N?


    It won't help you find that loop; it will help you identify it as such,
    once you've found it. Looking for an apple, without knowing how to
    identify it as an apple when you find it, is pretty difficult; the same
    is true of efficient algorithms.

    > Math always seemed incomprehensible to me. It looks like Newton's law of
    > gravitation to me: before Newton discovered the law of gravitation,
    > apples still fell to the ground; gravitation existed before Newton saw
    > the apple falling. I feel the same about math: the fundamentals that math
    > explains exist without math. I am not arguing, I am just trying to understand.


    Math provides a powerful language for describing how the apple falls; it
    is that language that allows an understanding of gravity that is
    sufficiently detailed and precise to allow the design of complicated
    systems that either depend upon or have to cope with the effects of
    gravity.

    Math provides an equally powerful language for describing the complexity
    of algorithms; you'll never get really good at managing the complexity
    of the algorithms you design without having mastered enough of that
    language to describe them.

    If you can't master that language (and I've known many people who had a
    lot of trouble with it), you will thereby restrict your career path to
    those roles where mastery of that language is not required. Those are
    generally lower paid and less interesting than the ones where it is
    required.
     
    James Kuyper, Sep 9, 2009
    #12
  13. arnuld

    arnuld Guest

    > On Wed, 09 Sep 2009 10:39:31 +0000, Richard Heathfield wrote:

    >> In <>, Keith Thompson wrote:
    >> The average *successful* search


    > Er, quite so. I should have pointed that out. Thank you.


    By that, do you mean the worst-case successful search (20th comparison) or
    a successful search which is not the worst case (e.g. the 10th, 15th, or
    even 3rd comparison)?




    --
    www.lispmachine.wordpress.com
    my email is @ the above blog.
     
    arnuld, Sep 9, 2009
    #13
  14. arnuld

    arnuld Guest

    > On Wed, 09 Sep 2009 06:22:32 -0400, James Kuyper wrote:

    > ... SNIP...


    > If you can't master that language (and I've known many people who had a
    > lot of trouble with it), you will thereby restrict your career path to
    > those roles where mastery of that language is not required.


    > Those are generally lower paid


    That is not a problem.


    > and less interesting than the ones where it is required.


    This is a very big problem. I was contemplating Knuth's volume 1 vs.
    Hall & Knight. I will go with Hall and Knight.








    --
    www.lispmachine.wordpress.com
    my email is @ the above blog.
     
    arnuld, Sep 9, 2009
    #14
  15. arnuld

    arnuld Guest

    > On Wed, 09 Sep 2009 06:22:35 +0000, Richard Heathfield wrote:

    > ..SNIP...


    > You rightly say that the fundamentals that mathematics explains exist
    > without mathematics. Of course they do. But mathematics has proved to be
    > a tremendously useful explanatory tool. And, by facilitating a clear
    > visualisation of those fundamentals, it can offer fresh insights.


    Actually, I have a "fear of maths" -- fear because I think I am incapable
    of having a prodigious brain like Knuth or Sawyer or Tesla. I am just an
    average guy with an average brain. All that algebra and integration takes
    the life out of me.


    NOTE: But you, Heathfield, once said, while replying to one of my posts,
    that anyone who can write quality code definitely has more than average
    intelligence.



    > Now, you say in your original article that anyone with brains can
    > conclude that binary search on a sorted input (by which I take it you
    > mean a sorted list) is always cheaper than a sequential search. Well,
    > nice try but it isn't true. Here's the sorted list:
    >
    > 1, 2, 3, 4, 5, 6, 7, 8, 9, ..... 1048573, 1048574, 1048575.
    >
    > To find the key 1 via binary search will take 20 comparisons, whereas it
    > will only take 1 comparison via sequential search.


    Gosh! Why did such a simple operation not come to *my* mind?





    --
    www.lispmachine.wordpress.com
    my email is @ the above blog.
     
    arnuld, Sep 9, 2009
    #15
  16. On 9 Sep, 09:52, Phil Carmody <> wrote:
    > arnuld <> writes:
    > > This code is from chapter 2 of Algorithms in C:

    >
    > > Sequential Search:

    >
    > > int search( int a[], int v, int l, int r)
    > > {
    > >   int i;
    > >   for(i = 1; i <= r; i++)

    >
    > Don't learn Algorithms in C.
    >
    > Learn algorithms. Implement them in C later.


    Yes, but some code is nice. I own the corresponding "Algorithms in
    Pascal" but I never bothered getting the C version.

    > Don't learn Numerical Recipes in C either.


    another CORTRAN (C written as Fortran) book
     
    Nick Keighley, Sep 9, 2009
    #16
  17. arnuld

    Eric Sosman Guest

    arnuld wrote:
    >> On Tue, 08 Sep 2009 08:43:12 -0400, Eric Sosman wrote:
    >>> arnuld wrote:
    >>> [...]
    >>> So should I go with reading all the maths, or just go to some random
    >>> chapter and start working on an algorithm in C?

    >
    >> It's your choice: memorize, or comprehend. Each approach has
    >> its advantages and disadvantages.

    >
    > Well, I am willing to buy a very basic book ("Higher Algebra" by Hall &
    > Knight) and learn maths. The trouble is, I still don't get how math is
    > connected with finding which kind of loop in C will take more CPU time;
    > and if algebra does help in finding which piece of C code will make
    > efficient use of the CPU, then where do I need to start in order to
    > understand C^2, C_{2^n + 1}, or log N?


    Most of the analysis you'll need to get along as a programmer
    won't be "higher," although some extremely deep questions occur
    now and then. For example, the simple-seeming problem of how to
    compute X to the N'th power for arbitrary large N is not (as far
    as I know) fully solved. For "practical" N, though, there's no
    difficulty -- even with only "lower" math you could always write
    yourself a program to search out the shortest path to a given
    non-huge N. (We'll suppose X is a big matrix or something, where
    multiplications cost enough to make the search worth while.)
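
    The practical workhorse is square-and-multiply, which needs only
    O(log N) multiplications (a sketch for scalars; the point is the same
    when X is a matrix). It is not always optimal: the binary method
    effectively uses 6 multiplications for x^15, while the addition chain
    x, x^2, x^3, x^6, x^12, x^15 needs only 5 -- exactly the open
    "shortest path" question:

    /* Square-and-multiply: x^n in O(log n) multiplications.
       A sketch only; overflow handling is omitted. */
    long power(long x, unsigned n)
    {
        long result = 1;
        while (n > 0) {
            if (n & 1u)        /* low bit of n set: fold x into result */
                result *= x;
            x *= x;            /* square x for the next bit of n */
            n >>= 1;
        }
        return result;
    }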

    Recursions of the kind that you illustrated earlier arise
    just about every time you turn around, because divide-and-conquer
    is such a common solution technique. You wind up with a program
    that solves a big problem by dividing it into smaller sub-problems,
    solving them (perhaps by dividing them even further), and somehow
    stitching the individual solutions together into a Grand Answer.
    Merge sorting, many kinds of search procedures, some kinds of
    image manipulation ... You'll run into the divide-and-conquer
    approach very very often, and the answer is *not* always just
    "O(N log N)," and you'd better have some understanding of why.
    (True story: I once found an O(N*N*logN) sort in actual use in
    production code -- yes, folks, *worse* than Bubblesort! It turned
    out that the sort routine itself was O(N logN) as one would expect,
    but silly programmer had given it a comparison routine that took
    O(N) time *per* *call* ...) As the programmer writing the call
    to that searching routine or sorting routine, you'd better have
    at least some notion that an O(N) comparison would be a Bad Thing.

    > Math always seemed incomprehensible to me. It looks like Newton's law of
    > gravitation to me: before Newton discovered the law of gravitation,
    > apples still fell to the ground; gravitation existed before Newton saw
    > the apple falling. I feel the same about math: the fundamentals that math
    > explains exist without math. I am not arguing, I am just trying to understand.


    One use of math is to improve your models of phenomena. You
    start with "Oh, that apple fell" and later "Oh, another apple
    fell" and after while you generalize to "Apples fall," which is
    your model of the world. You've made progress, because you can
    now predict the behavior of apples everywhere without needing to
    go watch each one as it splats. (Whether your predictions are
    accurate is another matter having to do with how effective your
    model is; at the moment, we're just concerned with the modeling
    process.)

    Okay, so you've now formed a description that applies (you
    hope) to all apples: They fall. But are there other questions
    you might want answered? How about "How hard does the apple
    land?" which might be of interest if you were thinking about
    having a little sit-down beneath the tree, and wondering whether
    your little sit-down might become your eternal lie-down. You
    could experiment, placing friends under apple trees of various
    heights and observing who survived and who didn't, and from the
    data you might come up with ranges of tree heights labeled
    "Safe," "Iffy," and "Send Flowers."

    But if you haven't got enough friends willing to risk their
    crania for the sake of your experiments, you're going to need
    another way to figure out how hard the apple hits. And that's
    where the mathematical model enters the picture. Newton is not
    immortal because he saw that apples fall (if indeed he did; the
    story may be apocryphal), but because he invented a mathematical
    model that described their falling. Thanks to Newton, you now
    need only two numbers to predict how hard an apple hits: Its
    mass, and the height it drops from. Now, you can assess the
    safety of any particular apple tree without sacrificing a whole
    bunch of ex-friends first. (Again, we focus on the making of
    the model and not on its accuracy: Newton's predictions would
    be pretty badly in error for an apple tree ten miles tall,
    since air resistance would come into play. But models are like
    that: Their predictions are "good enough" over "practical ranges"
    and the models keep on working until greater accuracy and/or
    wider ranges are needed.)

    Okay; that's my idea of one reason why mathematics can be a
    useful description of a natural phenomenon, even though the
    phenomenon occurs with no mathematicians in the vicinity. Take
    it or leave it. But I do not think you'll make it as a programmer
    unless you have at least *some* grasp of mathematics. You will
    *need* to be able to count sets and combinations of various kinds.
    You will *need* to make choices based on probabilistic arguments.
    You may or may not ever need to calculate a contour integral,
    but there's a nugget of mathematics you just can't do without.

    --
    Eric Sosman
     
    Eric Sosman, Sep 9, 2009
    #17
  18. arnuld

    Eric Sosman Guest

    Keith Thompson wrote:
    > Richard Heathfield <> writes:
    > [...]
    >> 1, 2, 3, 4, 5, 6, 7, 8, 9, ..... 1048573, 1048574, 1048575.
    >>
    >> To find the key 1 via binary search will take 20 comparisons, whereas
    >> it will only take 1 comparison via sequential search.
    >>
    >> Nevertheless, *on average* binary search is much, much quicker. The
    >> average sequential search will take half a million comparisons for
    >> the above list, whereas a binary search on the above list (assuming
    >> it is a straightforward sorted linear list, that is) will never take
    >> more than 20 comparisons.

    >
    > The average *successful* search will take half a million comparisons.
    > An unsuccessful search will take a million. The number of comparisons
    > required for an average search depends on how many of them are
    > successful (and on the distribution of the values successfully
    > searched for).


    Since the array being searched is sorted, the average for
    an unsuccessful search is also about half a million.

    Actually, given the array as shown both the successful
    and unsuccessful searches are O(1):

        if (x < 1 || x > 1048575) return NOT_FOUND;
        return array[x-1];

    :) (Nor is this an entirely whimsical quibble: Consider the
    code typically generated for a `switch' with a "dense" set of
    `case' labels ...)

    --
    Eric Sosman
     
    Eric Sosman, Sep 9, 2009
    #18
  19. arnuld

    Eric Sosman Guest

    arnuld wrote:
    >> On Wed, 09 Sep 2009 10:39:31 +0000, Richard Heathfield wrote:

    >
    >>> In <>, Keith Thompson wrote:
    >>> The average *successful* search

    >
    >> Er, quite so. I should have pointed that out. Thank you.

    >
    > By that, do you mean the worst-case successful search (20th comparison) or
    > a successful search which is not the worst case (e.g. the 10th, 15th, or
    > even 3rd comparison)?


    Aha! Here's a perfect example of the kind of "Why
    mathematics matters" issue you asked about earlier. No
    search, successful or unsuccessful, will take more than
    twenty comparisons. Some "lucky" successful searches
    will take fewer, by happening to hit on the desired key
    while the "active" interval is still relatively wide.
    One particularly lucky successful search will find its
    key on the very first comparison! How does the "luck
    factor" skew the average? Is the effect significant, or
    is it negligible? *That's* the kind of question that faces
    a programmer often and often again. Care to have a go?


    S

    P

    O

    I

    L

    E

    R



    S

    P

    A

    C

    E


    One way to tackle the problem is to set up one of those
    recursions you seem to dread. Let's work "forward," following
    the comparisons as they're made. The very first comparison in
    an array of N elements lands on the desired key with probability
    1/N. Otherwise (with probability 1-1/N) it misses, but leaves
    you with only (N-1)/2 array locations still to be searched.
    Letting C(N) be the average number of comparisons to find a
    key in an array of size N (we're assuming a successful search),
    we've got

    C(N) = 1 * (1/N) + C((N-1)/2) * (1 - 1/N)

    .... and if you can solve this recursion, you're all set. (The
    decision of whether to start with C(1)=0 or C(1)=1 depends on
    how sure you are that the key is actually present.)

    By the way, the recursion I've shown is not quite right,
    as it glosses over one crucial detail. Can you spot it? (Hint:
    is N even or odd?)

    Another way to attack the problem is to visualize the search
    as a binary tree. The root node is the first array slot you
    compare against. The left sub-tree is rooted at the slot you
    check second if the original comparison comes out <, and the right
    sub-tree starts at the second-compared slot if the first comparison
    comes out >, and so on through the third, ..., twentieth comparison.
    There are N nodes in the tree altogether. The number of comparisons
    you make to arrive at any particular node K is one plus K's distance
    from the root of the tree. So the average number of comparisons
    is the tree's "internal path length," divided by N, plus one. If
    you know something about the mathematics of trees, you are once
    again all set.
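
    If the recursion feels slippery, the average is also easy to check
    empirically. A sketch of my own: count the comparisons for every
    possible successful search of an implicit sorted array a[i] = i with
    N = 2^20 - 1 elements. It prints an average just over 19 against a
    worst case of 20, so the "luck factor" is real but nearly negligible:

    #include <stdio.h>

    /* Comparisons a binary search needs to find key k in the
       implicit sorted array a[i] = i of size n. */
    static long comparisons(long n, long k)
    {
        long lo = 0, hi = n - 1, count = 0;
        while (lo <= hi) {
            long m = lo + (hi - lo) / 2;
            count++;                 /* one three-way comparison */
            if (k == m) return count;
            if (k < m) hi = m - 1;
            else       lo = m + 1;
        }
        return count;                /* not reached for a present key */
    }

    int main(void)
    {
        long n = (1L << 20) - 1, total = 0;
        for (long k = 0; k < n; k++)
            total += comparisons(n, k);
        printf("average successful search: %.5f comparisons\n",
               (double)total / (double)n);
        return 0;
    }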

    --
    Eric Sosman
     
    Eric Sosman, Sep 9, 2009
    #19
  20. Eric Sosman <> writes:
    > Keith Thompson wrote:
    >> Richard Heathfield <> writes:
    >> [...]
    >>> 1, 2, 3, 4, 5, 6, 7, 8, 9, ..... 1048573, 1048574, 1048575.
    >>>
    >>> To find the key 1 via binary search will take 20 comparisons,
    >>> whereas it will only take 1 comparison via sequential search.
    >>>
    >>> Nevertheless, *on average* binary search is much, much quicker. The
    >>> average sequential search will take half a million comparisons for
    >>> the above list, whereas a binary search on the above list (assuming
    >>> it is a straightforward sorted linear list, that is) will never
    >>> take more than 20 comparisons.

    >>
    >> The average *successful* search will take half a million comparisons.
    >> An unsuccessful search will take a million. The number of comparisons
    >> required for an average search depends on how many of them are
    >> successful (and on the distribution of the values successfully
    >> searched for).

    >
    > Since the array being searched is sorted, the average for
    > an unsuccessful search is also about half a million.
    >
    > Actually, given the array as shown both the successful
    > and unsuccessful searches are O(1):

    [snip]

    Conclusion: The cost of a search depends critically on the
    assumptions you can make about both the key and the contents of
    the array.

    Another case we both missed: if we know the array is sorted but
    we can't assume the values are contiguous (i.e., we really have to
    search), then a search for an element outside the range of stored
    values can be O(1); if the stored values range from 1 to 1 million,
    it doesn't take long to determine that 1 billion doesn't appear.
    On the other hand, pre-checking the upper bound imposes a non-zero
    cost on *all* searches, which might not be worth it; we might save
    time overall by searching all 1 million elements for the value 1
    billion if that makes most searches very slightly faster.

    Even linear searches can be non-trivial.
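
    A sketch of that trade-off (the helper name is mine): two extra
    comparisons up front buy an O(1) rejection of out-of-range keys, and
    only the actual mix of queries can tell whether they pay for
    themselves:

    /* Sequential search over sorted a[0..n-1] with a range pre-check.
       The pre-check taxes every call, successful or not. */
    int search_checked(const int a[], int n, int v)
    {
        if (n == 0 || v < a[0] || v > a[n - 1])
            return -1;                 /* cannot possibly be present */
        for (int i = 0; i < n; i++) {
            if (a[i] == v) return i;
            if (a[i] > v) return -1;   /* sorted: we have passed it */
        }
        return -1;
    }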

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Sep 9, 2009
    #20
