faster solution

kizzienova · Apr 6, 2009

Dear all,

The following piece of code (a search algorithm) has to be evaluated
for more than 5e9 times in my program. Therefore, I'd like to obtain
te fastest implementationpossible. Can anyone help me improving this
piece of code?
Thanks,
Kizzie

int search (int ascnd, int ju, int jl, double xi, double *xpp, int P)
{
int jm, j;
while (ju-jl > 1)
{ /* if we are not yet done */
jm = (ju+jl) >> 1; /* compute a midpoint */
if (xi >= xpp[jm] == ascnd) {jl = jm; /* and replace
either the lower limit */ }
else {ju = jm; /* or the upper limit, as appropriate
*/ }
} /* repeat until the test condition is satisfied. */
if (xi == xpp[1]) j = 1;
else if (xi == xpp[P]) j = P-1;
else j = jl;
return j;
}

kizzienova · Apr 6, 2009

Dear Eric,

Thanks for your help!

Indeed, as you said, it is a binary search algorithm (based on the one
in Numerical recipes in C).
The aim of the code is to search the location of a set of points xi in
an array xpp with length P. The values of xi are scattered over the
array xpp for every call of the function. The array xpp stays the same
for every point xi (so for every call of the search function).
Over the time, the xi values change. I tried to speedup the code by
utilizing former information about the position of the points xi in
previous time steps, but it seems the overhead by extra calculations
cancelled out the faster convergence of the search algorithm.

cheers,
Kizzie

user923005 · Apr 6, 2009

Dear Eric,

Thanks for your help!

Indeed, as you said, it is a binary search algorithm (based on the one
in Numerical recipes in C).
The aim of the code is to search the location of a set of points xi in
an array xpp with length P. The values of xi are scattered over the
array xpp for every call of the function. The array xpp stays the same
for every point xi (so for every call of the search function).
Over the time, the xi values change. I tried to speedup the code by
utilizing former information about the position of the points xi in
previous time steps, but it seems the overhead by extra calculations
cancelled out the faster convergence of the search algorithm.

If it is an equality search, you could switch to a hash table which
would be faster: O(1) for a single search.
If it is a range search, then any solution will be at best: O(log(n))
for any search.

lndresnick · Apr 6, 2009

Dear all,

The following piece of code (a search algorithm) has to be evaluated
for more than 5e9 times in my program. Therefore, I'd like to obtain
te fastest implementationpossible. Can anyone help me improving this
piece of code?
Thanks,
Kizzie

int search (int ascnd, int ju, int jl, double xi, double *xpp, int P)
{
int jm, j;
while (ju-jl > 1)
{ /* if we are not yet done */
jm = (ju+jl) >> 1; /* compute a midpoint */
if (xi >= xpp[jm] == ascnd) {jl = jm; /* and replace
either the lower limit */ }
else {ju = jm; /* or the upper limit, as appropriate
*/ }
} /* repeat until the test condition is satisfied. */
if (xi == xpp[1]) j = 1;
else if (xi == xpp[P]) j = P-1;
else j = jl;
return j;

}

The discussions with Eric on improving the algorithm (if possible) are
no doubt the best way to go. But have you tried the "cheap" way to
get better performance -- turning your compiler's optimization
settings to maximum?

BTW, I'd parenthesize this for clarity:
xi >= xpp[jm] == ascnd

-David

Bartc · Apr 6, 2009

Dear all,

The following piece of code (a search algorithm) has to be evaluated
for more than 5e9 times in my program. Therefore, I'd like to obtain
te fastest implementationpossible. Can anyone help me improving this
piece of code?
Thanks,
Kizzie

int search (int ascnd, int ju, int jl, double xi, double *xpp, int P)
{
int jm, j;
while (ju-jl > 1)
{ /* if we are not yet done */
jm = (ju+jl) >> 1; /* compute a midpoint */
if (xi >= xpp[jm] == ascnd) {jl = jm; /* and replace
either the lower limit */ }
else {ju = jm; /* or the upper limit, as appropriate
*/ }
} /* repeat until the test condition is satisfied. */
if (xi == xpp[1]) j = 1;
else if (xi == xpp[P]) j = P-1;
else j = jl;
return j;
}

You might need to look at the bigger picture.

Is it really necessary to call this 5 billion times?

You say it's a search, what sort of size data is being searched? (As it
might be worth changing the search method)

And, what sort of factor of speedup do you need? Is the bottleneck in this
routine? (Try calling it twice each time if practical, if it's much slower
then it probably is here)

user923005 · Apr 6, 2009

Eric Sosman said:

<snip>

Note that whether it is possible to obtain the wrong answer that
quickly depends heavily on your background. Physicists are unlikely
to be able to be wrong that fast (one microjiffy is 3e-35 seconds),
but computer scientists do have a fighting chance of being wrong in
a few times 4 nanoseconds (although the precise value depends on
the computer's clock).

Quantum physicists are even less likely than ordinary physicists to
be wrong quickly enough, but at least they can be sure of being
wrong about *one* thing (or, alternatively, another thing).

The quantum physicists can be simultaneously right and wrong.

A Bayesian theorist will become more and more right, as you give him
new information, but he will never be totally right (or wrong) until
after the event has passed.

A mathematician can fill up Hilbert's hotel with an infinity of wrong
answers, and then double the amount of wrong answers any time he
pleases.

CBFalconer · Apr 6, 2009

Eric said:
.... snip ...

Okay, you've answered part of what I asked. The other missing
piece is a description of *exactly* what the search function is
supposed to do. Yes, it's mostly a binary search -- but what are
those mysterious last few lines all about? What it the parameter
P supposed to signify (and are the comparisons to xpp[1] and
xpp[P] valid, or are they off-by-one errors)?

You also need to know the magnitude and characteristics of the lot
to be searched. This appears to be fixed for the run of the
program, and so those answers are likely to help selecting between
a binary search and a hashed store. The hash is O(1), and slightly
more complex. The binary is O(logN).

Experimenting with hashlib may give useful insight. See:

<http://cbfalconer.home.att.net/download/hashlib.zip>

user923005 · Apr 6, 2009

The quantum physicists can be simultaneously right and wrong.

A Bayesian theorist will become more and more right, as you give him
new information, but he will never be totally right (or wrong) until
after the event has passed.

A mathematician can fill up Hilbert's hotel with an infinity of wrong
answers, and then double the amount of wrong answers any time he
pleases.

I should mention that for the mathematician, despite doubling the
number of wrong answers, the total count won't actually get any
larger.

Francois Grieu · Apr 7, 2009

(e-mail address removed) wrote

> The following piece of code (a search algorithm) has to be evaluated
> for more than 5e9 times in my program. Therefore, I'd like to obtain
> the fastest implementation possible.
>
> int search (int ascnd, int ju, int jl, double xi, double *xpp, int P)
> {
> int jm, j;
> while (ju-jl > 1)
> { /* if we are not yet done */
> jm = (ju+jl) >> 1; /* compute a midpoint */
> if (xi >= xpp[jm] == ascnd) {jl = jm; /* and replace
> either the lower limit */ }
> else {ju = jm; /* or the upper limit, as appropriate
> */ }
> } /* repeat until the test condition is satisfied. */
> if (xi == xpp[1]) j = 1;
> else if (xi == xpp[P]) j = P-1;
> else j = jl;
> return j;
> }
>
> (this) is a binary search algorithm (based on the one
in Numerical recipes in C).
The aim of the code is to search the location of a set of points
xi in an array xpp with length P.

I guess (based on reminiscence that NRiC is a lasy translation from
Pascal) that the array is assumed to have P *USED* entries at
index 1 to P, which makes it an array of length P+1 from a C standpoint.
That, or the xi == xpp[P] fragment gives undefined behavior.

The array is probably sorted in ascending or descending order,
according to ascnd being 1 or 0.

I guessit is assumed that, when ascnd is 1
- on entry, 1<=jl<ju<=P, and xi is in range xpp[jl] to xpp[ju-1]
inclusive.
- at then end of the while loop, either xpp[jl]==xi, or
xpp[jl]<xi<xpp[jl+1].
Things are quite a bit less nice if the two ugly tests that
follow strike to make the result something else than jl, or if
ascnd is 0, but I think that on output we still have xi within
xpp[j] and xpp[j+1] inclusive.

Ideas for big optimizations:
- if the xi vary in such a way that the output j vary slowly, and jl/ju
do not already account for that, it can be put to good use by
choosing the first two jm as j-k and j+k clipped to range [jl+1..ju-2],
for some appropriate small k, and j the last result, which can be kept
by making j static.
- use float, or some integer type, instead of double, if that's precise
enough for the job
- if xpp[n] is somewhat linear, it could be put to use to compute a
better jm as a function of jl, ju, xi, xpp[jl], xpp[ju-1].

Ideas for clearer code and simultaneously small optimizations
- state the conditions on input precisely.
- get rid of ascnd if it is constant, or test it once at the beginning
and have two different branches of code; notice that when ascnd is 0,
the thest if (xi <= xpp[jm]) jl = jm is probably more sound,
though the result will be different in some cases.
- now state the while loop invariant and output condition.
- get rid of the mess at the end of the while loop, which should now
be much easier.

François Grieu

Eric Sosman · Apr 7, 2009

Richard said:
[...]
Quantum physicists are even less likely than ordinary physicists to
be wrong quickly enough, but at least they can be sure of being
wrong about *one* thing (or, alternatively, another thing).

As I understand it, quantum physicists cannot be either
right or wrong, but can only exist in a superposition of
rightness and wrongness. Even their vaunted ability to
calculate a wave function is largely hypothetical, giving
a pleasing uncertainty to their estimates of uncertainty.
(Vroomfondel was a philosopher, but he'd probably switched
majors from quantum physics.)

It all happens at transluminal speeds, though, which
makes it off-topic (or, rather, beyond-topic) for c.l.c.

Guest · Apr 7, 2009

(e-mail address removed) wrote

in Numerical recipes in C).
The aim of the code is to search the location of a set of points
xi in an array xpp with length P.

Click to expand...

I guess (based on reminiscence that NRiC is a lasy translation from
Pascal) that the array is assumed to have P *USED* entries at
index 1 to P, which makes it an array of length P+1 from a C standpoint.
That, or the xi == xpp[P] fragment gives undefined behavior.

Fortran, I believe.
Pascal could specify both the top and bottom of the range of an array
index

<snip>

Keith Thompson · Apr 7, 2009

Cross said:
Is it the non-availability of the return value?

Please quote enough context for your followup to make sense without
seeing the original article. The C code in question is:

int main(int argc, char **argv) { ; }

The failure to return a status is a bug in C90, but not in C99, where
falling off the end of main() is equivalent to executing a "return 0;"
statement.

(Specifically, in C90 it returns an undefined termination status to
the host environment; I suppose one could argue whether this is a
"bug".)

Barry Schwarz · Apr 8, 2009

Dear all,

The following piece of code (a search algorithm) has to be evaluated
for more than 5e9 times in my program. Therefore, I'd like to obtain
te fastest implementationpossible. Can anyone help me improving this
piece of code?
Thanks,
Kizzie

int search (int ascnd, int ju, int jl, double xi, double *xpp, int P)
{
int jm, j;
while (ju-jl > 1)
{ /* if we are not yet done */
jm = (ju+jl) >> 1; /* compute a midpoint */
if (xi >= xpp[jm] == ascnd) {jl = jm; /* and replace
either the lower limit */ }
else {ju = jm; /* or the upper limit, as appropriate
*/ }
} /* repeat until the test condition is satisfied. */
if (xi == xpp[1]) j = 1;

xi and xpp[1] are independent of the while loop above.

else if (xi == xpp[P]) j = P-1;

xi and xpp[P] are also independent of the while loop.

else j = jl;
return j;
}

You can avoid the while loop completely for the cases where either of
the two above if statements evaluate to true.

Try something along the lines of
int search(...){...
if (xi == xpp[1]) return 1;
if (xi == xpp[P]) return P-1;
while (...)
{your while code}
return jl;

Richard Bos · Apr 8, 2009

Keith Thompson said:
Please quote enough context for your followup to make sense without
seeing the original article. The C code in question is:

int main(int argc, char **argv) { ; }

The failure to return a status is a bug in C90, but not in C99, where
falling off the end of main() is equivalent to executing a "return 0;"
statement.

(Specifically, in C90 it returns an undefined termination status to
the host environment; I suppose one could argue whether this is a
"bug".)

When the whole purpose of the program is to return a successful status?
I'll say it is! But in C99, I don't see that there is a bug, as such, in
this program; there is, of course, a style error, in that it is silly to
use the argumented form of main() when you don't use the arguments. I'd
write it like this

int main(void) { return 0; }

for completeness and correctness under both Standards. Normally I'd
spread even a function as small as this over four lines, but given the
special nature of the program, I'd rather use one in this case.

Richard

Filter sober in c++ don't pass test	0	Dec 2, 2023
Help with code	0	Jun 12, 2022
visual studio compilling issues with solaris c code	1	Oct 14, 2009
threaded dla, _very_ naive, interest test 0...	3	Jun 10, 2014
Is there in memory tree structure faster than std::map?	11	Aug 12, 2012
Optimization Problem	10	Sep 25, 2007
javaFX Ultimate tic tac toe issues	0	Nov 28, 2016
generate De Bruijn sequence memory and string vs lists	0	Jan 23, 2014

faster solution

kizzienova

kizzienova

user923005

lndresnick

Bartc

user923005

CBFalconer

user923005

Francois Grieu

Eric Sosman

Guest

Keith Thompson

Barry Schwarz

Richard Bos

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads