binary search

T

Travis Vitek

Hello,

I'm trying to do simple implementation of 'binary search' algorithm. After
having readhttp://en.wikipedia.org/wiki/Binary_searchI wrote this and got
stuck:


static int b_search(const double data[], double value, double low, double
high)
{
int mean;

if (high < low)
return -1;

mean = (low + high) / 2;

if (data[mean] > value)
return b_search(data, value, low, mean - 1);
else if (data[mean] < value)
return b_search(data, value, mean + 1, high);
else
return mean;

}

int main(void)
{
b_search(power, -97.0, -100.0, -85.0);

return 0;

}

It's compiled correctly, but I doubt I'm passing array as parameter in a
right way (what is more correct?). And second, considering that values in
array are negative, 'mean' is calculated wrong ((-100 + (-85)) / 2 = -92.5),
or I'm not understanding the algorithm?

Well, you've got the right idea. The problem is that `low' and `high'
are not the low and high values in the array, but the indexes that
bound the part of the array array you are searching. This should get
you started...

static int b_search(const double data[], double value, int low, int
high);

int main(void)
{
int where;

where = b_search(power, -97.0, 0, MAX);
return 0;
}

There are other errors in your code, but I'll let you find them.

Travis
 
C

CBFalconer

Roman said:
I'm trying to do simple implementation of 'binary search' algorithm.
After having read http://en.wikipedia.org/wiki/Binary_search I
wrote this and got stuck:

#include <stdio.h>
#include <stdlib.h>

static const double power[MAX] = {
-85.0, -86.0, -87.0, -88.0, -89.0, -90.0, -91.0, -92.0,
-93.0, -94.0, -95.0, -96.0, -97.0, -98.0, -99.0, -100.0
};

To start with, you have never defined MAX. Didn't your compiler
complain?
static int b_search(const double data[], double value,
double low,
double high)
{

You haven't defined the meaning of your parameters very well, and
the result has bitten you. low and high should be indices into the
power array, specifying what to search.
int mean;

if (high < low)
return -1;

mean = (low + high) / 2;

if (data[mean] > value)
return b_search(data, value, low, mean - 1);
else if (data[mean] < value)
return b_search(data, value, mean + 1, high);
else
return mean;

This should be "return data[mean];".
} /* b_search */

I suggest you always put a clear delimiter between functions. I
use a blank line, a comment line, and another blank line. The
comment is always "/*--------------------*/". I also comment the
final '}' of the function with its name, as above. Makes searching
the source for something much easier.
int main(void)
{
b_search(power, -97.0, -100.0, -85.0);
^ ...... ^
I think these want to be 0 and 15, and the parameter types should
be size_t.
return 0;
}

It's compiled correctly, but I doubt I'm passing array as parameter
in a right way (what is more correct?). And second, considering
that values in array are negative, 'mean' is calculated wrong
((-100 + (-85)) / 2 = -92.5), or I'm not understanding the algorithm?

The latter, causing the first.
 
K

Kaz Kylheku

This is my last version, and I tried a non-recursive algorithm:

#include <stdio.h>
#include <stdint.h>

#define MAX 16

/* array of power values */
static const double power[MAX] = {
-100.0, -99.0, -98.0, -97.0, -96.0, -95.0, -94.0, -93.0,
-92.0, -91.0, -90.0, -89.0, -88.0, -87.0, -86.0, -85.0

};

static double b_search(const double data[], double value)
{
uint16_t low = 0;
uint16_t high = MAX - 1;
uint16_t mean;

while (low <= high) {
mean = (low + high) / 2;
if (data[mean] > value)
high = mean - 1;
else if (data[mean] < value)
low = mean + 1;
else
return data[mean];
}
return -1.0;

}

int main(void)
{
double res = b_search(power, -86.0);
return 0;

}

Is it fine now?

Not really.

Your function makes unwarranted assumptions about the size of the
array which is passed to it. It so happens that the call in main
passes in the power array, which has MAX elements. But the function
allows any arbitrary array to be passed, which might have fewer or
more elements, creating an opportunity for error. If the array is
passed as an argument, then the size should also be passed as an
additional argument.

Another problem is that you have an exact equality test on floating
point numbers. What if you are searching for 3.0, but the data
contains 2.999999999999997? Maybe that's a match, maybe not. Floating
point numbers are inexact, and numerical algorithms based on floating
point numbers produce (when well-behaved!) only approximations to the
ideal mathematical results.

Third, it's better to return the position of the item in the array,
rather than the value. If the caller knows the position, it can
retrieve the value at that position easily. In fact, the caller
already knows the value; it's what is being searched for. Don't return
useless information to the caller. By returning a value, you have
created an ambiguity, also: what if the caller wants to search for the
value -1.0? If the value is found, the result is -1.0. If it is not
found, the result is also -1.0. If you return the index, this isn't a
problem, because array indices are non-negative, and so the -1 return
value cannot be misunderstood as a successful search.

Fourth, don't use the uint16_t type for array indexing; a plain int
will do the job. A uint16_t is used in situations when you need a type
that is exactly 16 bits wide; this is not such a situation. And this
type might not be available.

Also, you might want to write the binary search in a more generic way,
so that it can be applied to arrays regardless of their element type.
A function which can only search arrays whose element type is double
is not as useful as one which can handle arrays of any type. Take a
look at the standard C functions bsearch and qsort for an idea of one
possible way of doing it. Another way is to write a macro that
expands to a function. One of the parameters to the macro is the type.
Something along these lines:

#define MAKE_BINSEARCH_FUNCTION(NAME, TYPE, LT_TEST, EQ_TEST) \
int NAME(TYPE array[], int size, TYPE value) \
{ \
int low = 0, high = size - 1; \
while (low <= high) { \
int mid = (low + high) / 2; \
TYPE element = array[mid]; \
if (LT_TEST) \
low = mid + 1; \
else if (EQ_TEST) \
return mid; \
else \
high = mid - 1; \
} \
return -1; \
}

Now the above macro can write functions for you:

MAKE_BINSEARCH_FUNCTION(binsearch_int, int,
element < value, element == value)

This second example reveals a weakness: strcmp is invoked twice. Maybe
a good compiler will optimize that for us.

MAKE_BINSEARCH_FUNCTION(binsearch_str, char *,
strcmp(element, value) < 0,
strcmp(element, value) == 0)

How about one over an array of doubles, where equality means that the
two values are close within an epsilon of 0.000001:

#include <math.h>

MAKE_BINSEARCH_FUNCTION(binsearch_dbl_epsilon, double,
element < value - 0.000001,
element >= value - 0.000001 &&
element <= value + 0.000001)

This also reveals a weakness: our macro is not flexible enough to
allow the epsilon to be a parameter to the generated function. If you
want a different epsilon, you have to generate a different function.

Test (for string variant):

int main(void)
{
char *str_array[] = {
"apple", "bee", "xylophone"
};

int s1 = binsearch_str(str_array, 3, "apple");
int s2 = binsearch_str(str_array, 3, "bee");
int s3 = binsearch_str(str_array, 3, "xylophone");
int s4 = binsearch_str(str_array, 3, "aardvark");
int s5 = binsearch_str(str_array, 3, "cruft");
int s6 = binsearch_str(str_array, 3, "zebra");

printf("%d %d %d %d %d %d\n", s1, s2, s3, s4, s5, s6);

return 0;
}

Output:

0 1 2 -1 -1 -1


One final comment: a useful variation on binary search returns, if the
element is not found, the array index where the element should be
inserted, such that the sorted order is maintained. Adjusting the
algorithm for this requirement is a little bit tricky!

You have to think about the interface a little bit, too. Of course,
non-negative return values already indicate a successful search. You
can return the additional value by means of an ``int *'' parameter.
Or, you can encode it as a negative return value. For instance the
return value -1 can mean that the search value can be inserted before
position [0] of the array; -2 can mean that the value should be
inserted before position 1 and so on. The caller simply has to check
the return value for negativity, and then to obtain the insertion
position, flip the sign and subtract one.
 
R

Roman Mashak

Hello,

I'm trying to do simple implementation of 'binary search' algorithm. After
having read http://en.wikipedia.org/wiki/Binary_search I wrote this and got
stuck:

#include <stdio.h>
#include <stdlib.h>

static const double power[MAX] = {
-85.0, -86.0, -87.0, -88.0, -89.0, -90.0, -91.0, -92.0,
-93.0, -94.0, -95.0, -96.0, -97.0, -98.0, -99.0, -100.0
};

static int b_search(const double data[], double value, double low, double
high)
{
int mean;

if (high < low)
return -1;

mean = (low + high) / 2;

if (data[mean] > value)
return b_search(data, value, low, mean - 1);
else if (data[mean] < value)
return b_search(data, value, mean + 1, high);
else
return mean;
}

int main(void)
{
b_search(power, -97.0, -100.0, -85.0);

return 0;
}

It's compiled correctly, but I doubt I'm passing array as parameter in a
right way (what is more correct?). And second, considering that values in
array are negative, 'mean' is calculated wrong ((-100 + (-85)) / 2 = -92.5),
or I'm not understanding the algorithm?

Please clarify these points. Thanks.

With best regards, Roman Mashak. E-mail: (e-mail address removed)
 
R

Roman Mashak

Hello, CBFalconer!
You wrote on Thu, 29 Nov 2007 00:13:00 -0500:

[skip]
Thank you for your suggestions.

??>> int main(void)
??>> {
??>> b_search(power, -97.0, -100.0, -85.0);
C> ^ ...... ^
C> I think these want to be 0 and 15, and the parameter types should
C> be size_t.

Do you mean 'low' and 'high' parameters should have 'size_t' type?

??>> return 0;
??>> }
??>>
??>> It's compiled correctly, but I doubt I'm passing array as parameter
??>> in a right way (what is more correct?). And second, considering
??>> that values in array are negative, 'mean' is calculated wrong
??>> ((-100 + (-85)) / 2 = -92.5), or I'm not understanding the algorithm?

C> The latter, causing the first.

This is my last version, and I tried a non-recursive algorithm:

#include <stdio.h>
#include <stdint.h>

#define MAX 16

/* array of power values */
static const double power[MAX] = {
-100.0, -99.0, -98.0, -97.0, -96.0, -95.0, -94.0, -93.0,
-92.0, -91.0, -90.0, -89.0, -88.0, -87.0, -86.0, -85.0
};

static double b_search(const double data[], double value)
{
uint16_t low = 0;
uint16_t high = MAX - 1;
uint16_t mean;

while (low <= high) {
mean = (low + high) / 2;
if (data[mean] > value)
high = mean - 1;
else if (data[mean] < value)
low = mean + 1;
else
return data[mean];
}
return -1.0;
}

int main(void)
{
double res = b_search(power, -86.0);
return 0;
}

Is it fine now?

With best regards, Roman Mashak. E-mail: (e-mail address removed)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,755
Messages
2,569,535
Members
45,007
Latest member
obedient dusk

Latest Threads

Top