Second largest


ravi

Can anybody tell me a method to

Use only N + O(log n) comparisons to find the second largest (or
smallest) element in a list of N elements.

Thanks in advance
 

Mark Bluemel

ravi said:
Can anybody tell me a method to

Use only N + O(log n) comparisons to find the second largest (or
smallest) element in a list of N elements.

Does this look like alt.do.my.homework?

First point - neither of the questions you have asked is specific to C
programming.

Second point - your homework assignments are intended to make you think,
do research and master the subject. It's not just about getting someone
to answer the questions for you.
 

abdo.haji.ali

Can anybody tell me a method to

Use only N + O(log n) comparisons to find the second largest (or
smallest) element in a list of N elements.

What ideas do you have in mind so far? What sorting algorithms are you
familiar with?
Did you try them?
Ideally, a binary search tree could do the trick (Note however that in
some cases it might be O(n)!)

Abdo Haji-Ali
Programmer
In|Framez
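
For reference, here is a minimal sketch (not from any post in this
thread) of the classic tournament method, which is the textbook way to
meet the stated bound: pair the elements off in rounds to find the
maximum with N - 1 comparisons; the champion faced only about
ceil(log2 N) opponents, so the runner-up is the largest of those,
found with O(log N) further comparisons. The function name and the
fixed array sizes below are illustrative choices.

#include <stdio.h>

#define MAXN 1024    /* illustrative bound on n */
#define MAXLOG 11    /* enough for ceil(log2(MAXN)) victims */

/* Second largest of a[0..n-1], 2 <= n <= MAXN, by tournament:
 * (n - 1) comparisons to crown the champion, then at most
 * ceil(log2 n) - 1 among the elements it beat directly. */
static int second_largest(const int a[], int n)
{
    static int cur[MAXN], nxt[MAXN];  /* survivors of each round */
    static int beat[MAXN][MAXLOG];    /* who each survivor beat  */
    static int nbeat[MAXN];
    int i, m, k, champ;

    for (i = 0; i < n; i++) {
        cur[i] = i;
        nbeat[i] = 0;
    }
    for (m = n; m > 1; ) {            /* play one round of matches */
        int mn = 0;
        for (i = 0; i + 1 < m; i += 2) {
            int win = (a[cur[i]] >= a[cur[i+1]]) ? i : i + 1;
            int los = (win == i) ? i + 1 : i;
            beat[cur[win]][nbeat[cur[win]]++] = cur[los];
            nxt[mn++] = cur[win];
        }
        if (m % 2)                    /* odd survivor gets a bye */
            nxt[mn++] = cur[m - 1];
        for (i = 0; i < mn; i++)
            cur[i] = nxt[i];
        m = mn;
    }
    champ = cur[0];
    /* The runner-up lost only to the champion, so it is one of the
     * champion's direct victims. */
    k = beat[champ][0];
    for (i = 1; i < nbeat[champ]; i++)
        if (a[beat[champ][i]] > a[k])
            k = beat[champ][i];
    return a[k];
}

int main(void)
{
    int a[] = { 3, 41, 9, 26, 15, 8, 30, 12 };
    printf("%d\n", second_largest(a, 8));  /* prints 30 */
    return 0;
}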
 

Richard Tobin

ravi said:
Can anybody tell me a method to

Use only N + O(log n) comparisons to find the second largest (or
smallest) element in a list of N elements.

Do you mean N + O(log N)? If not, what is n?

-- Richard
 

Richard Tobin

Use only N + O(log n) comparisons to find the second largest (or
smallest) element in a list of N elements.
Your notation is incorrect. You might mean O(N+log N), but that would
be O(N). Or N+log N, which is not correct. Or maybe approximately N +
k*log N.

What's wrong with N + O(log N) (assuming the n should be N)? It has a
perfectly well-defined meaning, that if you subtract N from the number
of comparisons, the result is O(log N).

-- Richard
 

Thad Smith

ravi said:
Can anybody tell me a method to

Use only N + O(log n) comparisons to find the second largest (or
smallest) element in a list of N elements.

Your notation is incorrect. You might mean O(N+log N), but that would
be O(N). Or N+log N, which is not correct. Or maybe approximately N +
k*log N.
 

Kelsey Bjarnason

Your notation is incorrect. You might mean O(N+log N), but that would
be O(N). Or N+log N, which is not correct. Or maybe approximately N +
k*log N.

What's wrong with N + O(log N) (assuming the n should be N)? It has a
perfectly well-defined meaning, that if you subtract N from the number
of comparisons, the result is O(log N).

Convention is to count only the most significant factor; thus an algorithm
which runs in O(N) + 1000 time is an O(N) algorithm; the constant is
discarded.

In the description above, the function should run in N + O(log N) time.
As N increases, runtime increases in proportion to it, with log N adding a
small and ever less significant portion of the runtime; thus the O(log N)
is discarded and the algorithm is found to be O(N).
 

CBFalconer

Kelsey said:
... snip ...

Convention is to count only the most significant factor; thus an
algorithm which runs in O(N) + 1000 time is an O(N) algorithm; the
constant is discarded.

In the description above, the function should run in N + O(log N)
time. As N increases, runtime increases in proportion to it, with
log N adding a small and ever less significant portion of the
runtime; thus the O(log N) is discarded and the algorithm is
found to be O(N).

It's not a convention, it's reality. The technique is intended to
describe execution time required for large values of N. For large
enough N the value of log N is negligible, compared to N, and is
thus discarded.
 

Army1987

Convention is to count only the most significant factor; thus an algorithm
which runs in O(N) + 1000 time is an O(N) algorithm; the constant is
discarded.

In the description above, the function should run in N + O(log N) time.
As N increases, runtime increases in proportion to it, with log N adding a
small and ever less significant portion of the runtime; thus the O(log N)
is discarded and the algorithm is found to be O(N).

Yes, time = N + O(log N) is an abuse of notation; strictly it should be
time - N = O(log N). But it is nothing strange: we write, for example,
sin(x) = x - x**3 / 6 + O(x**5) to mean that sin(x) - x + x**3 / 6 is
O(x**5). The usage is not uncommon in general (though it is the first
time I've seen it for run times).
 

Richard Bos

CBFalconer said:
It's not a convention, it's reality. The technique is intended to
describe execution time required for large values of N. For large
enough N the value of log N is negligible, compared to N, and is
thus discarded.

No, it's convention. Reality is that a function which is X*O(N) +
Y*O(log N) tends to O(N) or to O(log N), depending on the relative sizes
of X and Y, for reasonable N. O(N) only dominates the time for _all_ X
if you go to Microsoft-sales-figures values of N. For most of us, such
an algorithm is O(N) if X is large relative to Y, and O(log N) if X is
small compared to Y. Assuming that N will always tend to be large enough
to dominate both X and Y, that _is_ mere convention.
(For example: even though it is by convention O(N), an algorithm which
runs through an array once to determine the maximum and minimum values,
and then performs an O(log N) operation quickly using those values, can
be faster, for all data sets which occur in reality, than an algorithm
which is O(log N) overall but runs slower because it doesn't use the
min/max values.)

Richard
 

Army1987

Can anybody tell me a method to

Use only N + O(log n) comparisons to find the second largest (or
smallest) element in a list of N elements.
If the absolute value of the elements isn't too large, you can
use

if (a - b - abs(a - b))
    /* a is smaller than b */;
else
    /* a is greater than or equal to b */;

so you can do anything with 0 comparisons. You can add

for (i = 0; i < N - 1; i++) continue;

to compare i with N - 1 exactly N times, so the total number of
comparisons is N. Now 0 is O(log N), of course. (Yes, the compiler is
allowed to optimize it away, but what it does is OT here -- only the
behavior of the abstract machine, which compares i with N - 1 exactly N
times, is relevant. If this is a problem, you can always declare i as
volatile...)
:)
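
A compilable form of the trick, for the curious (a sketch; note the
caveat, raised later in the thread, that a - b can overflow):

#include <stdio.h>
#include <stdlib.h>  /* abs */

int main(void)
{
    int a = 3, b = 7;
    if (a - b - abs(a - b))  /* nonzero exactly when a < b */
        puts("a is smaller than b");
    else
        puts("a is greater than or equal to b");
    return 0;
}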
 

Richard Tobin

What's wrong with N + O(log N) (assuming the n should be N)? It has a
perfectly well-defined meaning, that if you subtract N from the number
of comparisons, the result is O(log N).
Convention is to count only the most significant factor; thus an algorithm
which runs in O(N) + 1000 time is an O(N) algorithm; the constant is
discarded.

In the description above, the function should run in N + O(log N) time.
As N increases, runtime increases in proportion to it, with log N adding a
small and ever less significant portion of the runtime; thus the O(log N)
is discarded and the algorithm is found to be O(N).

I'm not sure whether you're disagreeing with me or not...

An algorithm that does N + O(log N) comparisons does indeed do O(N)
comparisons, but the former statement is more precise. There are
algorithms that do O(N) comparisons but more than N + O(log N), for
example one that does 2N.

-- Richard
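
To make that last point concrete, the obvious single-pass scan below is
such an algorithm: it can spend two comparisons on each element, roughly
2N in total, so it is O(N) but not N + O(log N). (A sketch; the function
name is an illustrative choice.)

/* Up to 2 comparisons per element after the first pair: about 2N
 * in the worst case (e.g. descending input). */
static int second_largest_scan(const int a[], int n)  /* n >= 2 */
{
    int hi = (a[0] >= a[1]) ? a[0] : a[1];
    int lo = (a[0] >= a[1]) ? a[1] : a[0];
    int i;

    for (i = 2; i < n; i++) {
        if (a[i] > hi) {         /* first comparison  */
            lo = hi;
            hi = a[i];
        } else if (a[i] > lo) {  /* second comparison */
            lo = a[i];
        }
    }
    return lo;
}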
 

Spoon

Army1987 said:
if (a - b - abs(a - b))
    /* a is smaller than b */;
else
    /* a is greater than or equal to b */;

so you can do anything with 0 comparisons.

How do you suppose abs() is defined?
 

Richard Heathfield

Spoon said:
How do you suppose abs() is defined?

As a function that takes ints, not elements (whereas the OP specifically
asked for elements, not necessarily ints, from which we may assume that
the OP is after a general solution).

Another objection to the above is the behaviour when the true value of
a - b falls outside the range of int, i.e. below INT_MIN or above INT_MAX.
 

Army1987

How do you suppose abs() is defined?
I don't think anything in the Standard forbids it to be defined
like

#include <math.h>

int abs(int _n)
{
    /* |n| == sqrt(n * n); promote before multiplying to avoid
       int overflow */
    long double _x = (long double)_n * _n;
    return (int)nearbyintl(sqrtl(_x));
}

provided that long double has enough precision. I don't think it
is actually implemented like that anywhere, but you can always
write your own function which does that. :)
 

Eric Sosman

Kelsey Bjarnason wrote on 08/02/07 18:57:
Convention is to count only the most significant factor; thus an algorithm
which runs in O(N) + 1000 time is an O(N) algorithm; the constant is
discarded.

In the description above, the function should run in N + O(log N) time.
As N increases, runtime increases in proportion to it, with log N adding a
small and ever less significant portion of the runtime; thus the O(log N)
is discarded and the algorithm is found to be O(N).

<off-topic>

There's no such convention. Big-O expresses *any*
upper bound, at any point in an expression. (Obviously,
it's silly to include in the expression any terms of
lower order than the Big-O term.)

Do you think Knuth uses Big-O incorrectly when he
writes (for example)

H_n = ln n + gamma + O(1/n)

? This says that the nth harmonic number is roughly
equal to the natural log of n, plus a constant, and
that the approximation error is no worse than C/n for
n > N. Both C and N are unknown, but we know the error
eventually becomes and remains less than a bound that
diminishes as the inverse of n.
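
(A quick numerical check of that claim -- a sketch, with the cutoff
values and the hard-coded Euler constant as illustrative choices: the
error times n settles near 1/2, comfortably within the O(1/n) bound.)

#include <stdio.h>
#include <math.h>

int main(void)
{
    const double euler_gamma = 0.5772156649015329;
    double h = 0.0;    /* running harmonic number H_n */
    int n;

    for (n = 1; n <= 1000000; n++) {
        h += 1.0 / n;
        if (n == 10 || n == 1000 || n == 1000000) {
            double err = h - (log(n) + euler_gamma);
            printf("n = %7d  error = %.3e  n * error = %.4f\n",
                   n, err, n * err);
        }
    }
    return 0;
}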

He *could* instead have written

H_n = ln n + O(1)

... which says the nth harmonic number is roughly equal
to the natural log of n, with an approximation error
no worse (asymptotically) than a constant. This is true,
but less informative than the original.

If he had followed your suggestion, he would have
told us even less:

H_n = O(ln n)

... which says the nth harmonic number is approximately
proportional to the log of n. It might be a million
times the log of n, or it might be the log of n plus
a million; all we're told is that it's approximately
proportional, with an unknown proportionality constant
and unknown contributions from lower-order terms.

He might even have written any of

H_n = O(n)

H_n = O(n^2)

H_n = O(e^n)

... because Big-O expresses only an upper bound. A
value that is bounded by a logarithmic function is
obviously also bounded by linear, quadratic, and
exponential functions. But all these "improvements"
communicate less information than the original.

Back to programming: Can you explain why Quicksort
is usually said to be faster than Heapsort, even though
both have asymptotic average running times of O(n log n)?
If you cannot, what reason do you have to prefer one
over the other?[*]

[*] Assuming for simplicity's sake that your choice
is based solely on expected running time; there are, of
course, many other things an engineer must consider in
an actual situation.

</off-topic>
 

Kelsey Bjarnason

Convention is to count only the most significant factor; thus an algorithm
which runs in O(N) + 1000 time is an O(N) algorithm; the constant is
discarded.

In the description above, the function should run in N + O(log N) time.
As N increases, runtime increases in proportion to it, with log N adding a
small and ever less significant portion of the runtime; thus the O(log N)
is discarded and the algorithm is found to be O(N).

I'm not sure whether you're disagreeing with me or not...

An algorithm that does N + O(log N) comparisons does indeed do O(N)
comparisons, but the former statement is more precise. There are
algorithms that do O(N) comparisons but more than N + O(log N), for
example one that does 2N.

Sure, but notationally, N + O(log N) is incorrect by convention, as -
again, by convention - the most significant term defines the order, not
the lesser terms.

Using base 10 values, consider N versus log N:

      N    log N
     10        1
    100        2
   1000        3
  10000        4
 100000        5
1000000        6

If a function is of order N + log(N), then as N increases from 10 to a
million, runtime increases from 10 to a million - but the log part only
increases from 1 to 6. The N matters, as it means the runtime grows in a
particular and significant manner; the log N is negligible, since an
overhead of 6 time units in a runtime of a million units is essentially
irrelevant, and it becomes even more so as N increases further.

Thus an N + O(log N) algorithm is properly described as O(N), since the N
swamps the log N into total insignificance.
 

CBFalconer

Richard said:
... snip ...


No, it's convention. Reality is that a function which is X*O(N) +
Y*O(log N) tends to O(N) or to O(log N), depending on the relative sizes
of X and Y, for reasonable N. O(N) only dominates the time for _all_ X
...

Are you deliberately trying to create confusion? Above I specified
"large values". These are not "expected values" nor "mean values",
etc. They are of unlimited size. If you don't like a size, go
bigger. In the particular example, for large N, log N becomes
negligible when compared to N.
 

Richard Tobin

Kelsey Bjarnason said:
Sure, but notationally, N + O(log N) is incorrect by convention,

No, you're just unfamiliar with the usage. N + O(log N) tells
you that the excess over N is O(log N).

If a function is order N + log(N)

That would be O(N + log(N)), which is not what we are talking about.

O(N + log(N)) doesn't mean anything different from O(N), but N + O(log N)
means something different and more precise.

-- Richard
 
