Second largest


ravi

Can anybody tell me a method to

Use only N + O(log n) comparisons to find the second largest (or
smallest) element in a list of N elements.

Thanks in advance
 

Mark Bluemel

ravi said:
Can anybody tell me a method to

Use only N + O(log n) comparisons to find the second largest (or
smallest) element in a list of N elements.

Does this look like alt.do.my.homework?

First point - neither of the questions you have asked is specific to C
programming.

Second point - your homework assignments are intended to make you think,
do research and master the subject. It's not just about getting someone
to answer the questions for you.
 

abdo.haji.ali

Can anybody tell me a method to

Use only N + O(log n) comparisons to find the second largest (or
smallest) element in a list of N elements.

What ideas do you have in mind so far? What sorting algorithms are you
familiar with?
Did you try them?
Ideally, a binary search tree could do the trick (Note however that in
some cases it might be O(n)!)

Abdo Haji-Ali
Programmer
In|Framez
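
For reference, here is a minimal sketch (not from any post in this
thread) of the classic tournament method, which is the textbook way to
meet the stated bound: pair the elements off in rounds to find the
maximum with N - 1 comparisons; the champion faced only about
ceil(log2 N) opponents, so the runner-up is the largest of those,
found with O(log N) further comparisons. The function name and the
fixed array sizes below are illustrative choices.

#include <stdio.h>

#define MAXN 1024    /* illustrative bound on n */
#define MAXLOG 11    /* enough for ceil(log2(MAXN)) victims */

/* Second largest of a[0..n-1], 2 <= n <= MAXN, by tournament:
 * (n - 1) comparisons to crown the champion, then at most
 * ceil(log2 n) - 1 among the elements it beat directly. */
static int second_largest(const int a[], int n)
{
    static int cur[MAXN], nxt[MAXN];  /* survivors of each round */
    static int beat[MAXN][MAXLOG];    /* who each survivor beat  */
    static int nbeat[MAXN];
    int i, m, k, champ;

    for (i = 0; i < n; i++) {
        cur[i] = i;
        nbeat[i] = 0;
    }
    for (m = n; m > 1; ) {            /* play one round of matches */
        int mn = 0;
        for (i = 0; i + 1 < m; i += 2) {
            int win = (a[cur[i]] >= a[cur[i+1]]) ? i : i + 1;
            int los = (win == i) ? i + 1 : i;
            beat[cur[win]][nbeat[cur[win]]++] = cur[los];
            nxt[mn++] = cur[win];
        }
        if (m % 2)                    /* odd survivor gets a bye */
            nxt[mn++] = cur[m - 1];
        for (i = 0; i < mn; i++)
            cur[i] = nxt[i];
        m = mn;
    }
    champ = cur[0];
    /* The runner-up lost only to the champion, so it is one of the
     * champion's direct victims. */
    k = beat[champ][0];
    for (i = 1; i < nbeat[champ]; i++)
        if (a[beat[champ][i]] > a[k])
            k = beat[champ][i];
    return a[k];
}

int main(void)
{
    int a[] = { 3, 41, 9, 26, 15, 8, 30, 12 };
    printf("%d\n", second_largest(a, 8));  /* prints 30 */
    return 0;
}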
 

Richard Tobin

ravi said:
Can anybody tell me a method to

Use only N + O(log n) comparisons to find the second largest (or
smallest) element in a list of N elements.

Do you mean N + O(log N)? If not, what is n?

-- Richard
 

Richard Tobin

Use only N + O(log n) comparisons to find the second largest (or
smallest) element in a list of N elements.
Your notation is incorrect. You might mean O(N+log N), but that would
be O(N). Or N+log N, which is not correct. Or maybe approximately N +
k*log N.

What's wrong with N + O(log N) (assuming the n should be N)? It has a
perfectly well-defined meaning, that if you subtract N from the number
of comparisons, the result is O(log N).

-- Richard
 

Thad Smith

ravi said:
Can anybody tell me a method to

Use only N + O(log n) comparisons to find the second largest (or
smallest) element in a list of N elements.

Your notation is incorrect. You might mean O(N+log N), but that would
be O(N). Or N+log N, which is not correct. Or maybe approximately N +
k*log N.
 

Kelsey Bjarnason

Your notation is incorrect. You might mean O(N+log N), but that would
be O(N). Or N+log N, which is not correct. Or maybe approximately N +
k*log N.

What's wrong with N + O(log N) (assuming the n should be N)? It has a
perfectly well-defined meaning, that if you subtract N from the number
of comparisons, the result is O(log N).

Convention is to count only the most significant factor; thus an algorithm
which runs in O(N) + 1000 time is an O(N) algorithm; the constant is
discarded.

In the description above, the function should run in N + O(log N) time.
As N increases, runtime increases in proportion to it, with log N adding a
small and ever less significant portion of the runtime; thus the O(log N)
is discarded and the algorithm is found to be O(N).
 

CBFalconer

Kelsey said:
... snip ...

Convention is to count only the most significant factor; thus an
algorithm which runs in O(N) + 1000 time is an O(N) algorithm; the
constant is discarded.

In the description above, the function should run in N + O(log N)
time. As N increases, runtime increases in proportion to it, with
log N adding a small and ever less significant portion of the
runtime; thus the O(log N) is discarded and the algorithm is
found to be O(N).

It's not a convention, it's reality. The technique is intended to
describe execution time required for large values of N. For large
enough N the value of log N is negligible, compared to N, and is
thus discarded.
 

Army1987

Convention is to count only the most significant factor; thus an algorithm
which runs in O(N) + 1000 time is an O(N) algorithm; the constant is
discarded.

In the description above, the function should run in N + O(log N) time.
As N increases, runtime increases in proportion to it, with log N adding a
small and ever less significant portion of the runtime; thus the O(log N)
is discarded and the algorithm is found to be O(N).

Yes, time = N + O(log N) is an abuse of notation; strictly it should be
time - N = O(log N). But it is nothing strange: we write, for example,
sin(x) = x - x**3 / 6 + O(x**5) to mean that sin(x) - x + x**3 / 6 is
O(x**5). The usage is not uncommon in general (though it is the first
time I've seen it for run times).
 

Richard Bos

CBFalconer said:
It's not a convention, it's reality. The technique is intended to
describe execution time required for large values of N. For large
enough N the value of log N is negligible, compared to N, and is
thus discarded.

No, it's convention. Reality is that a function which is X*O(N) +
Y*O(log N) tends to O(N) or to O(log N), depending on the relative sizes
of X and Y, for reasonable N. O(N) only dominates the time for _all_ X
if you go to Microsoft-sales-figures values of N. For most of us, such
an algorithm is O(N) if X is large relative to Y, and O(log N) if X is
small compared to Y. Assuming that N will always tend to be large enough
to dominate both X and Y, that _is_ mere convention.
(For example: even though it is by convention O(N), an algorithm which
runs through an array once to determine the maximum and minimum values,
and then performs an O(log N) operation quickly using those values, can
be faster, for all data sets which occur in reality, than an algorithm
which is O(log N) overall but runs slower because it doesn't use the
min/max values.)

Richard
 

Army1987

Can anybody tell me a method to

Use only N + O(log n) comparisons to find the second largest (or
smallest) element in a list of N elements.
If the absolute value of the elements isn't too large, you can
use

if (a - b - abs(a - b))
    /* a is smaller than b */;
else
    /* a is greater than or equal to b */;

so you can do anything with 0 comparisons. You can add

for (i = 0; i < N - 1; i++) continue;

to compare i with N - 1 exactly N times, so the total number of
comparisons is N. Now 0 is O(log N), of course. (Yes, the compiler is
allowed to optimize it away, but what it does is OT here -- only the
behavior of the abstract machine, which compares i with N - 1 exactly N
times, is relevant. If this is a problem, you can always declare i as
volatile...)
:)
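
A compilable form of the trick, for the curious (a sketch; note the
caveat, raised later in the thread, that a - b can overflow):

#include <stdio.h>
#include <stdlib.h>  /* abs */

int main(void)
{
    int a = 3, b = 7;
    if (a - b - abs(a - b))  /* nonzero exactly when a < b */
        puts("a is smaller than b");
    else
        puts("a is greater than or equal to b");
    return 0;
}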
 

Richard Tobin

What's wrong with N + O(log N) (assuming the n should be N)? It has a
perfectly well-defined meaning, that if you subtract N from the number
of comparisons, the result is O(log N).
Convention is to count only the most significant factor; thus an algorithm
which runs in O(N) + 1000 time is an O(N) algorithm; the constant is
discarded.

In the description above, the function should run in N + O(log N) time.
As N increases, runtime increases in proportion to it, with log N adding a
small and ever less significant portion of the runtime; thus the O(log N)
is discarded and the algorithm is found to be O(N).

I'm not sure whether you're disagreeing with me or not...

An algorithm that does N + O(log N) comparisons does indeed do O(N)
comparisons, but the former statement is more precise. There are
algorithms that do O(N) comparisons but more than N + O(log N), for
example one that does 2N.

-- Richard
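
To make that last point concrete, the obvious single-pass scan below is
such an algorithm: it can spend two comparisons on each element, roughly
2N in total, so it is O(N) but not N + O(log N). (A sketch; the function
name is an illustrative choice.)

/* Up to 2 comparisons per element after the first pair: about 2N
 * in the worst case (e.g. descending input). */
static int second_largest_scan(const int a[], int n)  /* n >= 2 */
{
    int hi = (a[0] >= a[1]) ? a[0] : a[1];
    int lo = (a[0] >= a[1]) ? a[1] : a[0];
    int i;

    for (i = 2; i < n; i++) {
        if (a[i] > hi) {         /* first comparison  */
            lo = hi;
            hi = a[i];
        } else if (a[i] > lo) {  /* second comparison */
            lo = a[i];
        }
    }
    return lo;
}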
 

Spoon

Army1987 said:
if (a - b - abs(a - b))
    /* a is smaller than b */;
else
    /* a is greater than or equal to b */;

so you can do anything with 0 comparisons.

How do you suppose abs() is defined?
 

Richard Heathfield

Spoon said:
How do you suppose abs() is defined?

As a function that takes ints, not elements (whereas the OP specifically
asked for elements, not necessarily ints, from which we may assume that
the OP is after a general solution).

Another objection to the above is the behaviour when the true value of
a - b falls outside the range of int, i.e. below INT_MIN or above INT_MAX.
 

Army1987

How do you suppose abs() is defined?
I don't think anything in the Standard forbids it to be defined
like

#include <math.h>

int abs(int _n)
{
    /* |n| == sqrt(n * n); promote before multiplying to avoid
       int overflow */
    long double _x = (long double)_n * _n;
    return (int)nearbyintl(sqrtl(_x));
}

provided that long double has enough precision. I don't think it
is actually implemented like that anywhere, but you can always
write your own function which does that. :)
 

Eric Sosman

Kelsey Bjarnason wrote on 08/02/07 18:57:
Convention is to count only the most significant factor; thus an algorithm
which runs in O(N) + 1000 time is an O(N) algorithm; the constant is
discarded.

In the description above, the function should run in N + O(log N) time.
As N increases, runtime increases in proportion to it, with log N adding a
small and ever less significant portion of the runtime; thus the O(log N)
is discarded and the algorithm is found to be O(N).

<off-topic>

There's no such convention. Big-O expresses *any*
upper bound, at any point in an expression. (Obviously,
it's silly to include in the expression any terms of
lower order than the Big-O term.)

Do you think Knuth uses Big-O incorrectly when he
writes (for example)

H_n = ln n + gamma + O(1/n)

? This says that the nth harmonic number is roughly
equal to the natural log of n, plus a constant, and
that the approximation error is no worse than C/n for
n > N. Both C and N are unknown, but we know the error
eventually becomes and remains less than a bound that
diminishes as the inverse of n.
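
(A quick numerical check of that claim -- a sketch, with the cutoff
values and the hard-coded Euler constant as illustrative choices: the
error times n settles near 1/2, comfortably within the O(1/n) bound.)

#include <stdio.h>
#include <math.h>

int main(void)
{
    const double euler_gamma = 0.5772156649015329;
    double h = 0.0;    /* running harmonic number H_n */
    int n;

    for (n = 1; n <= 1000000; n++) {
        h += 1.0 / n;
        if (n == 10 || n == 1000 || n == 1000000) {
            double err = h - (log(n) + euler_gamma);
            printf("n = %7d  error = %.3e  n * error = %.4f\n",
                   n, err, n * err);
        }
    }
    return 0;
}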

He *could* instead have written

H_n = ln n + O(1)

... which says the nth harmonic number is roughly equal
to the natural log of n, with an approximation error
no worse (asymptotically) than a constant. This is true,
but less informative than the original.

If he had followed your suggestion, he would have
told us even less:

H_n = O(ln n)

... which says the nth harmonic number is approximately
proportional to the log of n. It might be a million
times the log of n, or it might be the log of n plus
a million; all we're told is that it's approximately
proportional, with an unknown proportionality constant
and unknown contributions from lower-order terms.

He might even have written any of

H_n = O(n)

H_n = O(n^2)

H_n = O(e^n)

... because Big-O expresses only an upper bound. A
value that is bounded by a logarithmic function is
obviously also bounded by linear, quadratic, and
exponential functions. But all these "improvements"
communicate less information than the original.

Back to programming: Can you explain why Quicksort
is usually said to be faster than Heapsort, even though
both have asymptotic average running times of O(n log n)?
If you cannot, what reason do you have to prefer one
over the other?[*]

[*] Assuming for simplicity's sake that your choice
is based solely on expected running time; there are, of
course, many other things an engineer must consider in
an actual situation.

</off-topic>
 

Kelsey Bjarnason

Convention is to count only the most significant factor; thus an algorithm
which runs in O(N) + 1000 time is an O(N) algorithm; the constant is
discarded.

In the description above, the function should run in N + O(log N) time.
As N increases, runtime increases in proportion to it, with log N adding a
small and ever less significant portion of the runtime; thus the O(log N)
is discarded and the algorithm is found to be O(N).

I'm not sure whether you're disagreeing with me or not...

An algorithm that does N + O(log N) comparisons does indeed do O(N)
comparisons, but the former statement is more precise. There are
algorithms that do O(N) comparisons but more than N + O(log N), for
example one that does 2N.

Sure, but notationally, N + O(log N) is incorrect by convention, as -
again, by convention - the most significant term defines the order, not
the lesser terms.

Using base 10 values, consider N versus log N:

      N    log N
     10        1
    100        2
   1000        3
  10000        4
 100000        5
1000000        6

If a function is of order N + log(N), then as N increases from 10 to a
million, runtime increases from 10 to a million - but the log part only
increases from 1 to 6. The N matters, as it means the runtime grows in a
particular and significant manner; the log N is negligible, since an
overhead of 6 time units in a runtime of a million units is essentially
irrelevant, and it becomes even more so as N increases further.

Thus an N + O(log N) algorithm is properly described as O(N), since the N
swamps the log N into total insignificance.
 

CBFalconer

Richard said:
... snip ...


No, it's convention. Reality is that a function which is X*O(N) +
Y*O(log N) tends to O(N) or to O(log N), depending on the relative sizes
of X and Y, for reasonable N. O(N) only dominates the time for _all_ X
...

Are you deliberately trying to create confusion? Above I specified
"large values". These are not "expected values" nor "mean values",
etc. They are of unlimited size. If you don't like a size, go
bigger. In the particular example, for large N, log N becomes
negligible when compared to N.
 

Richard Tobin

Kelsey Bjarnason said:
Sure, but notationally, N + O(log N) is incorrect by convention,

No, you're just unfamiliar with the usage. N + O(log N) tells
you that the excess over N is O(log N).

If a function is order N + log(N)

That would be O(N + log(N)), which is not what we are talking about.

O(N + log(N)) doesn't mean anything different from O(N), but N + O(log N)
means something different and more precise.

-- Richard
 
