Whither binary search?

N

Neil Cerutti

I can't find the mystic search glob that will find a binary
search function in the Python documentation.

Am I searching in vain, or do I need better search-fu?
 
N

Neil Cerutti

bisect...

That doesn't tell me if an item doesn't exist in the sequence
though, does it? Maybe I'm being dense.

My guess nobody keeps their sequences sorted in Python. ;)
 
F

Fredrik Lundh

Neil said:
That doesn't tell me if an item doesn't exist in the sequence
though, does it? Maybe I'm being dense.

I guess you could use something like

import bisect

def check(list, item):
try:
return list[bisect.bisect_left(list, item)] == item
except IndexError:
return False

but unless your lists are really large, or your objects are expensive to
compare, you could probably just use "in" instead.
> My guess nobody keeps their sequences sorted in Python.

well, people tend to use dictionaries when they need to look things up
quickly...

</F>
 
S

skip

Neil> That doesn't tell me if an item doesn't exist in the sequence
Neil> though, does it? Maybe I'm being dense.

Neil> My guess nobody keeps their sequences sorted in Python. ;)

Sorry, I was on the phone as I replied, and was only typing with my left
hand, so my response was intentionally very brief. ;-)

If you use bisect to do your insertions, your lists remain sorted. And as
for searching, all you have to do is look to the left or the right of the
insertion point (depending on which search method you call) to determine if
the item you're searching for is in the list.

Skip
 
J

John Machin

Fredrik said:
Neil said:
That doesn't tell me if an item doesn't exist in the sequence
though, does it? Maybe I'm being dense.

I guess you could use something like

import bisect

def check(list, item):
try:
return list[bisect.bisect_left(list, item)] == item
except IndexError:
return False

but unless your lists are really large, or your objects are expensive to
compare, you could probably just use "in" instead.
My guess nobody keeps their sequences sorted in Python.

well, people tend to use dictionaries when they need to look things up
quickly...

.... like those paper dictionaries with the words in alphabetical order
:)

A common use case for looking things up in a sorted list is an
in-memory table of dates and rates (interest rates, insurance premium
rates, fee rates). In a big batch job, this sure beats doing "select
rate from table where ? between low_date and high_date" each time you
need a rate.

Cheers,
John
 
S

Sion Arrowsmith

John Machin said:
... like those paper dictionaries with the words in alphabetical order
:)

.... where you'll notice that the really big ones are divided up into
buckets (which just happen to be keyed on initial letter).

Truth is that humans are lot better than computers at general
insertion sort, and its lookup equivalent, whereas computers are much
better at calculating hashes. So we each use a dictionary
implementation that plays to our strengths.
 
N

Neil Cerutti

... where you'll notice that the really big ones are divided up
into buckets (which just happen to be keyed on initial letter).

Truth is that humans are lot better than computers at general
insertion sort, and its lookup equivalent, whereas computers
are much better at calculating hashes. So we each use a
dictionary implementation that plays to our strengths.

Thanks all, for the help.

In the end, perhaps prophetically, it turned out my data wasn't
really fully sorted after all. I put together a short function to
translate my raw data into a dictionary, and that was the end of
my problem.
 
J

John Machin

Sion said:
... where you'll notice that the really big ones are divided up into
buckets (which just happen to be keyed on initial letter).

And languages whose scripts don't have letters use other bucket keys
like pronunciation or (radical, stroke_count). In any case, the
buckets are printed in key order. The bucket index, if printed, is
itself sorted.
Truth is that humans are lot better than computers at general
insertion sort, and its lookup equivalent, whereas computers are much
better at calculating hashes. So we each use a dictionary
implementation that plays to our strengths.

Hashes are fine for simplistic applications.
| >>> hash('initialise') == hash('initialize')
| False
but those two are not very far apart in a sorted list.

Cheers,
John
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,754
Messages
2,569,528
Members
45,000
Latest member
MurrayKeync

Latest Threads

Top