What the fastest algorithm for find max/min in an array of integers?

E

Eugeny Myunster

I know, only simple one:

#include <stdio.h>

int main()
{
int min=0,max=0,i,arr[12];
for(i=0;i<12;i++)
arr=rand()%31-10;
for(i=0;i<12;i++)
printf("%d\t",arr);
printf("\n");
for(i=0;i<12;i++)
{
if(arr[max]<arr)
max=i;
}
for(i=0;i<12;i++)
{
if(arr[min]>arr)
min=i;
}
printf("\nmax is: %d\n",arr[max]);
printf("\nmin is: %d\n",arr[min]);

return 0;
}
 
R

Richard Tobin

Eugeny Myunster said:
I know, only simple one:

You're not going to get it enormously faster. Unless you store the
data in a special way (sorted for example), you're going to have to
look at all the elements.

Some small changes that might improve speed:

- you can start the loop at 1 rather than zero. No point comparing
arr[0] against itself;
- store the min and max values in variables, rather than doing
an array reference to retrieve them each time;
- calculate both in one loop through the array;
- if it's less than the minimum, it won't be more than the maximum.

Do you really need to make it faster? Profile your whole program
to see if this is important.

-- Richard
 
K

Keith Thompson

Don Bruder said:
Eugeny Myunster said:
I know, only simple one:
[snip]

Quicksort the array, then grab the first (min) and last (max) values?

Um, no. Sorting the array is typically O(N*log N), whereas finding
the min and max in a linear search is O(N).

(A side note: I'm guessing that you mean qsort. Remember that the
standard library routine qsort doesn't necessarily use the quicksort
algorithm.)
 
R

Richard Tobin

Keith Thompson said:
Um, no. Sorting the array is typically O(N*log N), whereas finding
the min and max in a linear search is O(N).

.... as is, perhaps surprisingly, finding the median.

-- Richard
 
U

user923005

I know, only simple one:

#include <stdio.h>

int main()
{
        int min=0,max=0,i,arr[12];
        for(i=0;i<12;i++)
                arr=rand()%31-10;
        for(i=0;i<12;i++)
                printf("%d\t",arr);
        printf("\n");        
        for(i=0;i<12;i++)
        {
                if(arr[max]<arr)
                        max=i;
        }
        for(i=0;i<12;i++)
        {
                if(arr[min]>arr)
                        min=i;
        }
        printf("\nmax is: %d\n",arr[max]);            
        printf("\nmin is: %d\n",arr[min]);    

        return 0;



}- Hide quoted text -

- Show quoted text -


I'm not sure why the post I sent from my news server account never
arrived (I sent it hours ago). Here is a repost:


#include <stdlib.h>
/*
"Introduction to Algorithms"
Cormen, Leiserson, Rivest
pp. 186,187
ISBN: 0-07-013143-0

Simultaneous min & max
using only 3*N/2 comparisons

Written by Dann Corbit
9/25/2007
Donated to the public domain
*/
#ifdef e_type_LONG_DOUBLE
typedef long double e_type;
#elif defined(e_type_DOUBLE)
typedef double e_type;
#elif defined(e_type_FLOAT)
typedef float e_type;
#elif defined(e_type_UNSIGNED_LONG_LONG)
typedef unsigned long long e_type;
#elif defined(e_type_LONG_LONG)
typedef long long e_type;
#elif defined(e_type_UNSIGNED_LONG)
typedef unsigned long e_type;
#elif defined(e_type_LONG)
typedef long e_type;
#elif defined(e_type_UNSIGNED)
typedef unsigned e_type;
#elif defined(e_type_INT)
typedef int e_type;
#elif defined(e_type_SHORT)
typedef short e_type;
#elif defined(e_type_UNSIGNED_SHORT)
typedef unsigned e_type;
#elif defined(e_type_CHAR)
typedef char e_type;
#elif defined(e_type_UNSIGNED_CHAR)
typedef unsigned char e_type;
#elif defined (__cplusplus)
template < class e_type > // works with stl string class etc...
#endif
void minmax(
e_type * a, // input array
size_t arr_size, // array length
e_type * min_e, // smallest thing found
e_type * max_e // biggest thing found
)
{
e_type min_et;
e_type max_et;
size_t i,
n;
if (arr_size % 2) {
min_et = a[0];
max_et = a[0];
n = 1;
} else {
if (a[0] > a[1]) {
max_et = a[0];
min_et = a[1];
} else {
min_et = a[0];
max_et = a[1];
}
n = 2;
}
for (i = n; i < arr_size; i += 2) {

if (a > a[i + 1]) {
max_et = max_et > a ? max_et : a;
min_et = min_et < a[i + 1] ? min_et : a[i + 1];
} else {
max_et = max_et > a[i + 1] ? max_et : a[i + 1];
min_et = min_et < a ? min_et : a;
}
}
*min_e = min_et;
*max_e = max_et;
}

#if defined( UNIT_TEST ) && (defined (e_type_DOUBLE) || defined(
__cplusplus))
#include <stdio.h>

char string[32767];
double foo[32767];
int main(void)
{
size_t i = 0;
double dmin,
dmax;
while (fgets(string, sizeof string, stdin)) {
foo[i++] = atof(string);
if (i > 32766)
break;
}
minmax(foo, i, &dmin, &dmax);
printf("min=%f, max=%f\n", dmin, dmax);
return 0;
}
#endif
/*
C:\tmp>cl /W4 /Ox /DUNIT_TEST /De_type_DOUBLE minmax.c
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762
for
80x86
Copyright (C) Microsoft Corporation. All rights reserved.

minmax.c
Microsoft (R) Incremental Linker Version 8.00.50727.762
Copyright (C) Microsoft Corporation. All rights reserved.

/out:minmax.exe
minmax.obj

C:\tmp>dir n*.txt
Volume in drive C has no label.
Volume Serial Number is 0890-87CA

Directory of C:\tmp

04/08/2008 04:13 PM 47,206,979 n.txt
1 File(s) 47,206,979 bytes
0 Dir(s) 4,932,272,128 bytes free

C:\tmp>minmax < n.txt
min=0.000043, max=20920.000000

*/
 
F

flydream

It seems that this code can not improve the performance. because the
comparison was not reduced.

I know, only simple one:
#include <stdio.h>
int main()
{
int min=0,max=0,i,arr[12];
for(i=0;i<12;i++)
arr=rand()%31-10;
for(i=0;i<12;i++)
printf("%d\t",arr);
printf("\n");
for(i=0;i<12;i++)
{
if(arr[max]<arr)
max=i;
}
for(i=0;i<12;i++)
{
if(arr[min]>arr)
min=i;
}
printf("\nmax is: %d\n",arr[max]);
printf("\nmin is: %d\n",arr[min]);

return 0;
}- Hide quoted text -
- Show quoted text -

I'm not sure why the post I sent from my news server account never
arrived (I sent it hours ago). Here is a repost:

#include <stdlib.h>
/*
"Introduction to Algorithms"
Cormen, Leiserson, Rivest
pp. 186,187
ISBN: 0-07-013143-0

Simultaneous min & max
using only 3*N/2 comparisons

Written by Dann Corbit
9/25/2007
Donated to the public domain
*/
#ifdef e_type_LONG_DOUBLE
typedef long double e_type;
#elif defined(e_type_DOUBLE)
typedef double e_type;
#elif defined(e_type_FLOAT)
typedef float e_type;
#elif defined(e_type_UNSIGNED_LONG_LONG)
typedef unsigned long long e_type;
#elif defined(e_type_LONG_LONG)
typedef long long e_type;
#elif defined(e_type_UNSIGNED_LONG)
typedef unsigned long e_type;
#elif defined(e_type_LONG)
typedef long e_type;
#elif defined(e_type_UNSIGNED)
typedef unsigned e_type;
#elif defined(e_type_INT)
typedef int e_type;
#elif defined(e_type_SHORT)
typedef short e_type;
#elif defined(e_type_UNSIGNED_SHORT)
typedef unsigned e_type;
#elif defined(e_type_CHAR)
typedef char e_type;
#elif defined(e_type_UNSIGNED_CHAR)
typedef unsigned char e_type;
#elif defined (__cplusplus)
template < class e_type > // works with stl string class etc...
#endif
void minmax(
e_type * a, // input array
size_t arr_size, // array length
e_type * min_e, // smallest thing found
e_type * max_e // biggest thing found
)
{
e_type min_et;
e_type max_et;
size_t i,
n;
if (arr_size % 2) {
min_et = a[0];
max_et = a[0];
n = 1;
} else {
if (a[0] > a[1]) {
max_et = a[0];
min_et = a[1];
} else {
min_et = a[0];
max_et = a[1];
}
n = 2;
}
for (i = n; i < arr_size; i += 2) {

if (a > a[i + 1]) {
max_et = max_et > a ? max_et : a;
min_et = min_et < a[i + 1] ? min_et : a[i + 1];
} else {
max_et = max_et > a[i + 1] ? max_et : a[i + 1];
min_et = min_et < a ? min_et : a;
}
}
*min_e = min_et;
*max_e = max_et;

}

#if defined( UNIT_TEST ) && (defined (e_type_DOUBLE) || defined(
__cplusplus))
#include <stdio.h>

char string[32767];
double foo[32767];
int main(void)
{
size_t i = 0;
double dmin,
dmax;
while (fgets(string, sizeof string, stdin)) {
foo[i++] = atof(string);
if (i > 32766)
break;
}
minmax(foo, i, &dmin, &dmax);
printf("min=%f, max=%f\n", dmin, dmax);
return 0;}

#endif
/*
C:\tmp>cl /W4 /Ox /DUNIT_TEST /De_type_DOUBLE minmax.c
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762
for
80x86
Copyright (C) Microsoft Corporation. All rights reserved.

minmax.c
Microsoft (R) Incremental Linker Version 8.00.50727.762
Copyright (C) Microsoft Corporation. All rights reserved.

/out:minmax.exe
minmax.obj

C:\tmp>dir n*.txt
Volume in drive C has no label.
Volume Serial Number is 0890-87CA

Directory of C:\tmp

04/08/2008 04:13 PM 47,206,979 n.txt
1 File(s) 47,206,979 bytes
0 Dir(s) 4,932,272,128 bytes free

C:\tmp>minmax < n.txt
min=0.000043, max=20920.000000

*/
 
U

user923005

It seems that this code can not improve the performance. because the
comparison was not reduced.

2*N verse 3*N/2
Not a big improvement, but an improvement.
 
W

Willem

user923005 wrote:
)> It seems that this code can not improve the performance. because the
)> comparison was not reduced.
)
) 2*N verse 3*N/2
) Not a big improvement, but an improvement.

Ah, the good old tournament algorithm.


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
 
J

James Harris

I know, only simple one: ....
for(i=0;i<12;i++)
{
if(arr[max]<arr)
max=i;
}
for(i=0;i<12;i++)
{
if(arr[min]>arr)
min=i;
}


In addition to the comments of others try counting down as it can
translate to more efficient machine code. If the array has at least
one item initialise max and min to the first item such as the
following to replace your code above.

max = min = arr[0];
for (i = 12; --i;;) {
if (arr > max) max = arr;
if (arr < min) min = arr;
}

--
 
J

James Harris

user923005 wrote:

)> It seems that this code can not improve the performance. because the
)> comparison was not reduced.
)
) 2*N verse 3*N/2
) Not a big improvement, but an improvement.

Ah, the good old tournament algorithm.

I'm not sure that is a tournament algorithm. What it seems to do is to
pass through the array comparing adjacent elements. Call these
elements A and B. Then if we know A > B we only need to:

compare A with max
compare B with min

The inter-element compare avoids

comparing A with min and
comparing B with max

so where a basic solution would do four comparisons the algorithm only
carries out three.

This would gain little and may cost more in mechanism on hardware
where the comparisons are single- or half-cycle but it would be good
if the comparisons were expensive.

--
 
B

Ben Bacarisse

James Harris said:
I know, only simple one: ...
for(i=0;i<12;i++)
{
if(arr[max]<arr)
max=i;
}
for(i=0;i<12;i++)
{
if(arr[min]>arr)
min=i;
}


In addition to the comments of others try counting down as it can
translate to more efficient machine code. If the array has at least
one item initialise max and min to the first item such as the
following to replace your code above.

max = min = arr[0];
for (i = 12; --i;;) {
if (arr > max) max = arr;
if (arr < min) min = arr;
}


It is common, when the first element is used like this, to make the loop:

if (arr > max)
max = arr;
else if (arr < min)
min = arr;

since no element that is > max can also be < min.
 
J

James Harris

....
max = min = arr[0];
for (i = 12; --i;;) {
if (arr > max) max = arr;
if (arr < min) min = arr;
}


It is common, when the first element is used like this, to make the loop:

if (arr > max)
max = arr;
else if (arr < min)
min = arr;

since no element that is > max can also be < min.


Interesting idea and it makes sense. I'm not sure it will be faster,
though. In particular there is no dependable pattern to the first
branch. On a modern CPU the branch prediction may have trouble
guessing which way to go. Where it guesses wrong there can be a costly
delay while the pipeline catches up again.

Without going in to esoteric stuff like cache prefetching here is a
resonably fast version of code which could come from my C, above, on a
modern x86 CPU. The cmov instruction avoids the need for any branch at
all. As you can see the conditional statements

if (arr > max) max = arr;
if (arr < min) min = arr;

can be rendered into just four instructions which should take about
half a cycle each.

;Find min and max of an array
;Uses
; ebx = array index
; ebp = copy of the element
; ecx = min
; edx = max

mov ecx, [arr] ;min = arr.0
mov edx, ecx ;max = arr.0
mov ebx, 11 ;i = 11

next_iter:
mov ebp, [arr + ebx * 4] ;Fetch this element

;Carry out the min and max updates
; if (arr < min) min = arr;
; if (arr > max) max = arr;
cmp ecx, ebp ;if min > element
cmovg ecx, ebp ;min = element
cmp edx, ebp ;if max < element
cmovl edx, ebp ;max = element

dec ebx
jnz next_iter

--
 
U

user923005

James Harris said:
I know, only simple one: ...
        for(i=0;i<12;i++)
        {
                if(arr[max]<arr)
                        max=i;
        }
        for(i=0;i<12;i++)
        {
                if(arr[min]>arr)
                        min=i;
        }

In addition to the comments of others try counting down as it can
translate to more efficient machine code. If the array has at least
one item initialise max and min to the first item such as the
following to replace your code above.
max = min = arr[0];
for (i = 12; --i;;) {
  if (arr > max) max = arr;
  if (arr < min) min = arr;
}


It is common, when the first element is used like this, to make the loop:

   if (arr > max)
       max = arr;
   else if (arr < min)
       min = arr;

since no element that is > max can also be < min.


Branches can be expensive, so here is a truly awful trick to {possibly
- YMMV} speed things up a hair:

#include <stdlib.h>
/*
"Introduction to Algorithms"
Cormen, Leiserson, Rivest
pp. 186,187
ISBN: 0-07-013143-0

Simultaneous min & max
using only 3*N/2 comparisons

Written by Dann Corbit
9/25/2007
Donated to the public domain
*/
#ifdef e_type_LONG_DOUBLE
typedef long double e_type;
#elif defined(e_type_DOUBLE)
typedef double e_type;
#elif defined(e_type_FLOAT)
typedef float e_type;
#elif defined(e_type_UNSIGNED_LONG_LONG)
typedef unsigned long long e_type;
#elif defined(e_type_LONG_LONG)
typedef long long e_type;
#elif defined(e_type_UNSIGNED_LONG)
typedef unsigned long e_type;
#elif defined(e_type_LONG)
typedef long e_type;
#elif defined(e_type_UNSIGNED)
typedef unsigned e_type;
#elif defined(e_type_INT)
typedef int e_type;
#elif defined(e_type_SHORT)
typedef short e_type;
#elif defined(e_type_UNSIGNED_SHORT)
typedef unsigned e_type;
#elif defined(e_type_CHAR)
typedef char e_type;
#elif defined(e_type_UNSIGNED_CHAR)
typedef unsigned char e_type;
#elif defined (__cplusplus)
template < class e_type > // works with stl string class etc...
#endif

/*
Computation of min and max using pairwise comparison.
*/
void minmax(
e_type * a, // input array
size_t arr_size, // array length
e_type * min_e, // smallest thing found
e_type * max_e // biggest thing found
)
{
e_type min_et;
e_type max_et;
size_t i,
n;
if (arr_size % 2) {
min_et = a[0];
max_et = a[0];
n = 1;
} else {
if (a[0] > a[1]) {
max_et = a[0];
min_et = a[1];
} else {
min_et = a[0];
max_et = a[1];
}
n = 2;
}
for (i = n; i < arr_size; i += 2) {

if (a > a[i + 1]) {
max_et = max_et > a ? max_et : a;
min_et = min_et < a[i + 1] ? min_et : a[i + 1];
} else {
max_et = max_et > a[i + 1] ? max_et : a[i + 1];
min_et = min_et < a ? min_et : a;
}
}
*min_e = min_et;
*max_e = max_et;
}

void minmaxb(
e_type * a, // input array
size_t arr_size, // array length
e_type * min_e, // smallest thing found
e_type * max_e // biggest thing found
)
{
e_type Max[2], Min[2];
size_t i;
Max[1] = Min[1] = a[0];
for (i = 1; i < arr_size; i++) {
Max[(a > Max[1])] = a;
Min[(a < Min[1])] = a;
}
*min_e = Min[1];
*max_e = Max[1];
}


#if defined( UNIT_TEST ) && (defined (e_type_DOUBLE) ||
defined( __cplusplus))
#include <stdio.h>
#include <time.h>

char string[32767];
double foo[32767];
int main(void)
{
size_t i = 0;
size_t index;
double dmin=0,
dmax=0;
clock_t start,
end;
start = clock();
while (fgets(string, sizeof string, stdin)) {
foo[i++] = atof(string);
if (i > 32766)
break;
}
end = clock();
printf("Reading the data took %f seconds\n", (end - start) * 1.0 /
CLOCKS_PER_SEC);
start = clock();
for (index = 0; index < 10000; index++)
minmax(foo, i, &dmin, &dmax);
end = clock();
printf("10000 finds of min and max: min=%f, max=%f, time = %f\n",
dmin, dmax, (end - start) * 1.0 / CLOCKS_PER_SEC);
start = clock();
for (index = 0; index < 10000; index++)
minmaxb(foo, i, &dmin, &dmax);
end = clock();
printf("10000 finds of min and max: min=%f, max=%f, time = %f\n",
dmin, dmax, (end - start) * 1.0 / CLOCKS_PER_SEC);
return 0;
}
#endif
/*

Seems like a lot of fuss for: (1.906 - 1.797)/10000 = 0.0000109 second

C:\tmp>dir n*.txt
Volume in drive C has no label.
Volume Serial Number is 0890-87CA

Directory of C:\tmp

04/08/2008 04:13 PM 47,206,979 n.txt
1 File(s) 47,206,979 bytes
0 Dir(s) 4,932,272,128 bytes free
C:\tmp>cl /W4 /Ox /DUNIT_TEST /De_type_DOUBLE minmax.c
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762
for 80x86
Copyright (C) Microsoft Corporation. All rights reserved.

minmax.c
Microsoft (R) Incremental Linker Version 8.00.50727.762
Copyright (C) Microsoft Corporation. All rights reserved.

/out:minmax.exe
minmax.obj

C:\tmp>minmax < n.txt
Reading the data took 0.031000 seconds
10000 finds of min and max: min=0.000043, max=20920.000000, time =
1.906000
10000 finds of min and max: min=0.000043, max=20920.000000, time =
1.797000
*/
 
H

Harald van Dijk

There is a cheesy thing which can be done if branch misprediction is
expensive.

int Max[2]={0}, Min[2] = {0};
for (i = 0; i < arr_size; i++) {
Max[(arr > Max[1])] = arr;
Min[(arr < Max[1])] = arr;


I think you meant to write

Min[(arr < Min[1])] = arr;
}

Max[0] and Min[0] contain "who cares what" and Max[1] and Min[1] contain
maximum and minimum.
 
B

Ben Bacarisse

Dann Corbit said:
if (arr > max)
max = arr;
else if (arr < min)
min = arr;

There is a cheesy thing which can be done if branch misprediction is
expensive.

Of course there will be data sets for which the branch prediction will
always be correct (unless it is a very unusual system).
int Max[2]={0}, Min[2] = {0};
for (i = 0; i < arr_size; i++) {
Max[(arr > Max[1])] = arr;
Min[(arr < Max[1])] = arr;
}


Cheesy indeed!

So much for writing code that is as clear as possible and communicates the
algorithm best.

It seems strange to me that, with computer far faster than I ever
imagined, we should still be contemplating altering source code to
squeeze performance from programs. I suppose it will never stop,
since so many people want code that is as fast as possible, rather
than fast enough.
 
J

James Harris

....
It seems strange to me that, with computer far faster than I ever
imagined, we should still be contemplating altering source code to
squeeze performance from programs. I suppose it will never stop,
since so many people want code that is as fast as possible, rather
than fast enough.

Agreed. And yet despite the amazing power of modern machines there's a
detach from user experience in many cases. I suspect a small part of
this is user interface and the rest is plain old inefficiency. Maybe
the seeking for speed is happening in the wrong places. Not sure,
though. It is a /big/ disconnect.

--
 
U

user923005

...


Agreed. And yet despite the amazing power of modern machines there's a
detach from user experience in many cases. I suspect a small part of
this is user interface and the rest is plain old inefficiency. Maybe
the seeking for speed is happening in the wrong places. Not sure,
though. It is a /big/ disconnect.

I agree with both sentiments. The awful gyrations in the code samples
I provided produce such a small improvement it would be virtually
impossible to measure on a single pass over the data (reading the data
is *by far* the slowest step anyway). Performance optimizations
should occur if and only if the code does not meet specification. If
the code does not meet spec, before anything else, it should be
profiled. After profiling, the thing that should be attacked {if
possible} is the fundamental underlying algorithm.

On the other, other hand, it is a good idea to write code that uses
appropriate and fast algorithms. If we write code intially that uses
the best choice of algorithm we are far less likely to have
performance problems that require all the other performance
enhancement steps.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,012
Latest member
RoxanneDzm

Latest Threads

Top