Re: Binary Search

Discussion in 'Java' started by Gene, Apr 8, 2011.

1. GeneGuest

This is wrong. It depends on how long comparisons take. If comparing two elements requires 1 minute, and the list has 20 elements, then the binary search runs in no more than 5 minutes plus a few microseconds, and the linear search runs in no more than 20 minutes plus a few microseconds. So you clearly want the binary search.

Gene, Apr 8, 2011

2. LewGuest

On 04/07/2011 11:29 PM, Gene wrote:
> This is wrong. It depends on how long comparisons take. If comparing two elements requires 1 minute, and the list has 20 elements, then the binary search runs in no more than 5 minutes plus a few microseconds, and the linear search runs in no more than 20 minutes plus a few microseconds. So you clearly want the binary search.

What is wrong? Your message shows up with no context, disconnected from any

Please do not email Usenet posts to
"". It's nonsensical because Usenet
forums are not part of Google Groups, and posting that way doesn't work very
well. Do not set a separate "reply-to". Just post to the newsgroup.

--
Lew
Honi soit qui mal y pense.

Lew, Apr 8, 2011

3. Eric SosmanGuest

On 4/7/2011 11:29 PM, Gene wrote:
> This is wrong. It depends on how long comparisons take. If comparing two elements requires 1 minute, and the list has 20 elements, then the binary search runs in no more than 5 minutes plus a few microseconds, and the linear search runs in no more than 20 minutes plus a few microseconds. So you clearly want the binary search.

There's not much to go on, but I guess you're comparing a binary
search against a linear search in a twenty-element array. Okay, fine:
The binary search takes <=5 minutes and the linear search takes <=20.
But you can't even *start* to use binary search until you've sorted
the list, which takes about 20*lg(20/2) ~= 66+ comparisons, slightly
over an hour. By the time you've completed *one* binary search you've
spent 71 minutes; the linear search has spent 20. After two searches
binary has used 76 minutes and linear has used 40. After three, binary
is up to 81 and linear is at 60. Four searches, and linear is still
faster: 86 vs. 80. It's not until the fifth search that binary finally
overcomes its startup cost and starts to pull ahead.

"Well, *of course* there will be billyuns and billyuns of searches,"
you say. Perhaps, but you left that "given" out, didn't you? You also
left out the "given" that the search keys are not equiprobable; in
fact, one single key accounts for 90% of all searches, another for 9%,
and the remaining eighteen for just 1% -- and the linear list uses
move-to-front. (It's right there in the third paragraph of the problem
statement, right after the part about the billyuns.) So in fact the
linear list takes one-and-a-fraction comparisons per search; let's be
generous and say it averages 90 seconds -- *far* faster than a five-
minute binary search.

In other words: Cost comparisons without context are silly.

--
Eric Sosman
d

Eric Sosman, Apr 8, 2011
4. Lars EnderinGuest

2011-04-08 06:01, Lew skrev:
> On 04/07/2011 11:29 PM, Gene wrote:
>> This is wrong. It depends on how long comparisons take. If comparing
>> two elements requires 1 minute, and the list has 20 elements, then the
>> binary search runs in no more than 5 minutes plus a few microseconds,
>> and the linear search runs in no more than 20 minutes plus a few
>> microseconds. So you clearly want the binary search.

>
> What is wrong? Your message shows up with no context, disconnected from
> any thread, so it's impossible to know about what you speak.
>
> Please do not email Usenet posts to
> "". It's nonsensical because
> Usenet forums are not part of Google Groups, and posting that way
> doesn't work very well. Do not set a separate "reply-to". Just post to
> the newsgroup.

Gene used a new interface to Google Groups, which allegedly looks better
in GG, but clobbers "References". He should revert to the older
interface when interacting with Usenet.

Lars Enderin, Apr 8, 2011
5. GeneGuest

On Apr 8, 12:46 am, Eric Sosman <> wrote:
> On 4/7/2011 11:29 PM, Gene wrote:
>
> > This is wrong.  It depends on how long comparisons take.  If comparing two elements requires 1 minute, and the list has 20 elements, then thebinarysearchruns in no more than 5 minutes plus a few microseconds, and the linearsearchruns in no more than 20 minutes plus a few microseconds.  So you clearly want thebinarysearch.

>
>      There's not much to go on, but I guess you're comparing abinarysearchagainst a linearsearchin a twenty-element array.  Okay, fine:
> Thebinarysearchtakes <=5 minutes and the linearsearchtakes <=20.
> But you can't even *start* to usebinarysearchuntil you've sorted
> the list, which takes about 20*lg(20/2) ~= 66+ comparisons, slightly
> over an hour.  By the time you've completed *one*binarysearchyou've
> spent 71 minutes; the linearsearchhas spent 20.  After two searchesbinaryhas used 76 minutes and linear has used 40.  After three,binary
> is up to 81 and linear is at 60.  Four searches, and linear is still
> faster: 86 vs. 80.  It's not until the fifthsearchthatbinaryfinally
> overcomes its startup cost and starts to pull ahead.
>
>      "Well, *of course* there will be billyuns and billyuns of searches,"
> you say.  Perhaps, but you left that "given" out, didn't you?  You also
> left out the "given" that thesearchkeys are not equiprobable; in
> fact, one single key accounts for 90% of all searches, another for 9%,
> and the remaining eighteen for just 1% -- and the linear list uses
> move-to-front.  (It's right there in the third paragraph of the problem
> statement, right after the part about the billyuns.)  So in fact the
> linear list takes one-and-a-fraction comparisons persearch; let's be
> generous and say it averages 90 seconds -- *far* faster than a five-
> minutebinarysearch.

Friend, clearly you are an angry person. I'm sorry the interface
I mistakenly used did not quote Lew's remark. I was just trying
to make the point that asymptotic complexity _can_ matter for small N.
I run into needs to search a pre-sorted table all the time: Keyword
lookup in language processors, checking magic numbers, the inverse
CDF
method of generating random numbers from empirical distributions,
finding a
prime approximate multiple of a hash table size in order to expand it,
etc, etc. In these cases, binary search has always turned out to be
both
faster than linear search and so easy to code that using a linear
search
for momentary convenience would have been lazy and un-craftsmanlike.

Gene, Apr 9, 2011