Finding Repeated and Missing Numbers

H

Harsha Srinath

Athough this might not be directly relayed to C syntax, I thought some
of you may find this an interesting problem :-

1.One number, in an array integers from 1 to 1,000,000 is present
twice. How can you determine which one? Can you think of a way to do
it using little extra memory?

2. How would you find the only missing element in that array? Can you
think of a way to do it while iterating through the array only once. Is
overflow a problem in the solution? Why not?

For both the problems, the solution I can think of is to sort
the elements and then compare consecutive elements (in O(nlogn) time).
Summing the numbers didn't seem of much use. Creating another such
array and then using the elements as indices may not be that neat of an
approach from a memory perspective, I felt.

Any other ways, folks?

Thanks,
Harsha
 
R

Richard Harter

Athough this might not be directly relayed to C syntax, I thought some
of you may find this an interesting problem :-

1.One number, in an array integers from 1 to 1,000,000 is present
twice. How can you determine which one? Can you think of a way to do
it using little extra memory?

2. How would you find the only missing element in that array? Can you
think of a way to do it while iterating through the array only once. Is
overflow a problem in the solution? Why not?

For both the problems, the solution I can think of is to sort
the elements and then compare consecutive elements (in O(nlogn) time).
Summing the numbers didn't seem of much use. Creating another such
array and then using the elements as indices may not be that neat of an
approach from a memory perspective, I felt.

Any other ways, folks?

OT. That said, there is missing information. If array one has
1000001 entries and array two has 999999 entries (as implied by the
description) then summing works. The sum in case one is 500005000000
[n*(n+1)/2] plus the extra element and in case two minus the missing
element. The catch is that the summation may trigger overflows. Your
task is write the code for multiple precision integer arithmetic.



Richard Harter, (e-mail address removed)
http://home.tiac.net/~cri, http://www.varinoma.com
Save the Earth now!!
It's the only planet with chocolate.
 
H

Harsha Srinath

1.One number, in an array integers from 1 to 1,000,000 is present
twice. How can you determine which one? Can you think of a way to do
it using little extra memory?

2. How would you find the only missing element in that array? Can you
think of a way to do it while iterating through the array only once. Is
overflow a problem in the solution? Why not?

For both the problems, the solution I can think of is to sort
the elements and then compare consecutive elements (in O(nlogn) time).
Summing the numbers didn't seem of much use. Creating another such
array and then using the elements as indices may not be that neat of an
approach from a memory perspective, I felt.

Any other ways, folks?

OT. That said, there is missing information. If array one has
1000001 entries and array two has 999999 entries (as implied by the
description) then summing works. The sum in case one is 500005000000
[n*(n+1)/2] plus the extra element and in case two minus the missing
element. The catch is that the summation may trigger overflows. Your
task is write the code for multiple precision integer arithmetic.

Sorry. You are right, there are 1,000,001 elements and 999,999 entires
in the two cases respectively.

And if not summation, any other way??

Thanks,
Harsha Srinath
 
P

Peter Nilsson

Harsha said:
Athough this might not be directly relayed to C syntax, I thought some
of you may find this an interesting problem :-

1.One number, in an array integers from 1 to 1,000,000 is present
twice. How can you determine which one? Can you think of a way to do
it using little extra memory?

unsigned long i, missing, sum, Sum = 1000000ul / 2 * 1000001ul;
for (i = 0, sum = 0; i < 1000001; i++)
sum += array;
missing = sum - Sum;
2. How would you find the only missing element in that array? Can you
think of a way to do it while iterating through the array only once.

unsigned long i, missing, sum, Sum = 1000000ul / 2 * 1000001ul;
for (i = 0, sum = 0; i < 999999; i++)
sum += array;
missing = Sum - sum;
Is overflow a problem in the solution? Why not?

Summation works fine; if you're using unsigned arithmetic, then you
can't (generally) overflow. The real problem is determining why you
have such arrays in the first place. ;)
 
C

Chris Torek

(I had to restore some of the quotes here.)

[and again, but with one number missing]

(We just had this puzzle posted.)

OT. That said, there is missing information. If array one has
1000001 entries and array two has 999999 entries (as implied by the
description) then summing works. ... The catch is that the summation
may trigger overflows. Your task is write the code for multiple
precision integer arithmetic.

Actually, you need not go that far. C99's "long long" is guaranteed
to hold the result without overflow (since as you noted the sum is
(1000000*1000001)/2 or 500000500000, and LLONG_MAX must be at least
9223372036854775807, which is considerably larger). But even C99
is not required:

Sorry. You are right, there are 1,000,001 elements and 999,999 entires
in the two cases respectively.

And if not summation, any other way??

Use "unsigned" and the properties of Galois fields. Unsigned
arithmetic implements GF(2-sup-k), where k is the number of bits
in the unsigned integer type.

unsigned sum;
size_t i;

for (sum = 0, i = 0; i < 1000001; i++)
sum += arr;
sum -= (1000000U * 1000001U / 2U);
printf("the missing number is %u\n", sum);

If addition is not to your liking, you can use subtraction (which
is of course just addition in disguise), or xor.
 
J

John L

Harsha Srinath said:
Athough this might not be directly relayed to C syntax, I thought some
of you may find this an interesting problem :-

1.One number, in an array integers from 1 to 1,000,000 is present
twice. How can you determine which one? Can you think of a way to do
it using little extra memory?

The phrase "using little extra memory" makes me wonder if
the OP's friend who set this puzzle has in mind using
an array of a million bits (or a million of any data type)
and setting each bit as the corresponding number is seen.
If a bit is already set to one, then that number has been
seen before. At the end, any remaining zero bits represent
missing numbers.
 
P

pete

Harsha Srinath wrote:
And if not summation, any other way??

The XOR of all of the integers from 1 to 999999 is zero.

/* BEGIN new.c */

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

#define EXTRA 998000
#define MISSING 999000
#define SIZE 1000000

int main(void)
{
long unsigned *array;
long unsigned index;
long unsigned result;
const long unsigned size = SIZE;

assert(SIZE == 1000000);
array = malloc((SIZE + 1) * sizeof *array);
if (array == NULL) {
puts("I'm tired");
exit(EXIT_FAILURE);
}
for (index = 0; index != SIZE; ++index) {
array[index] = index + 1;
}
array[index] = EXTRA;
result = 0;
for (index = 0; index != SIZE + 1; ++index) {
result ^= array[index];
}
free(array);
printf("The extra number is %lu\n", result ^ SIZE);
array = malloc((SIZE - 1) * sizeof *array);
if (array == NULL) {
puts("I'm tired");
exit(EXIT_FAILURE);
}
for (index = 0; index != SIZE - 1; ++index) {
array[index] = index + 1;
}
if (size > MISSING) {
array[MISSING - 1] = SIZE;
}
result = 0;
for (index = 0; index != SIZE - 1; ++index) {
result ^= array[index];
}
free(array);
printf("The missing number is %lu\n", result ^ SIZE);
return 0;
}

/* END new.c */
 
H

Harsha Srinath

John said:
The phrase "using little extra memory" makes me wonder if
the OP's friend who set this puzzle has in mind using
an array of a million bits (or a million of any data type)
and setting each bit as the corresponding number is seen.
If a bit is already set to one, then that number has been
seen before. At the end, any remaining zero bits represent
missing numbers.

Yup. Thats exactly what made me also think of an possible alternative
to the summation method.
 
C

CBFalconer

Richard said:
Harsha Srinath said:
Athough this might not be directly relayed to C syntax, I thought
some of you may find this an interesting problem :-

1.One number, in an array integers from 1 to 1,000,000 is present
twice. How can you determine which one? Can you think of a way
to do it using little extra memory?

2. How would you find the only missing element in that array? Can
you think of a way to do it while iterating through the array
only once. Is overflow a problem in the solution? Why not?

For both the problems, the solution I can think of is to sort
the elements and then compare consecutive elements (in O(nlogn)
time). Summing the numbers didn't seem of much use. Creating
another such array and then using the elements as indices may not
be that neat of an approach from a memory perspective, I felt.

Any other ways, folks?

OT. That said, there is missing information. If array one has
1000001 entries and array two has 999999 entries (as implied by
the description) then summing works. The sum in case one is
500005000000 [n*(n+1)/2] plus the extra element and in case two
minus the missing element. The catch is that the summation may
trigger overflows. Your task is write the code for multiple
precision integer arithmetic.

As long as M is greater than 1000000 you can use modulo M
arithmetic throughout and avoid all overflows. To put it on-topic,
this is fairly automatic if using C unsigned longs.
 
L

Lawrence Kirby

(I had to restore some of the quotes here.)

And if not summation, any other way??

Use "unsigned" and the properties of Galois fields. Unsigned
arithmetic implements GF(2-sup-k), where k is the number of bits
in the unsigned integer type.

unsigned sum;
size_t i;

for (sum = 0, i = 0; i < 1000001; i++)
sum += arr;
sum -= (1000000U * 1000001U / 2U);
printf("the missing number is %u\n", sum);

If addition is not to your liking, you can use subtraction (which
is of course just addition in disguise), or xor.


However these are all forms of summation.

Lawrence
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,020
Latest member
GenesisGai

Latest Threads

Top