Generating unique random numbers

L

lallous

Hello,

This code works fine when 'size' is less than 32768 however when size is
bigger this function never returns.
Can't find out why?!
If I break into the code I can see that 'i' is 32768....
void MakeRandomArray(unsigned long **a, unsigned long size)
{
unsigned long *data = new unsigned long [size];
double sizef = (double)(size - 1);
char *map = new char[size];
unsigned long n, i;

memset(data, 0, size * sizeof(unsigned long));
memset(map, 0, size * sizeof(char));

srand((unsigned) time(NULL));

for (i=0;i<size;i++)
{
for (;;)
{
double f = ((double) rand()) / RAND_MAX;
n = (unsigned long)(f * sizef);
if (map[n])
continue;
data = n;
map[n] = 1;
break;
}
}
delete [] map;
*a = data;
}
 
D

David B. Held

lallous said:
This code works fine when 'size' is less than 32768 however
when size is bigger this function never returns.
Can't find out why?!

It would seem that the most likely problem is here:
[...]
double f = ((double) rand()) / RAND_MAX;
n = (unsigned long)(f * sizef);
if (map[n])
continue;
[...]

Perhaps your RNG doesn't give sufficient resolution after
floating-point conversions to cover your domain. One way
to test this is to write a loop which tries to obtain each
number from 0 to sizef, and maybe displays the number of
attempts to get it. That way, you get a better idea of the
coverage your RNG is providing.

Dave
 
L

lallous

David B. Held said:
lallous said:
This code works fine when 'size' is less than 32768 however
when size is bigger this function never returns.
Can't find out why?!

It would seem that the most likely problem is here:
[...]
double f = ((double) rand()) / RAND_MAX;
n = (unsigned long)(f * sizef);
if (map[n])
continue;
[...]

Perhaps your RNG doesn't give sufficient resolution after
floating-point conversions to cover your domain. One way
to test this is to write a loop which tries to obtain each
number from 0 to sizef, and maybe displays the number of
attempts to get it. That way, you get a better idea of the
coverage your RNG is providing.

Dave
Wrote some function:
void SearchForZeroAndSize(unsigned long size)
{
double sizef = (double)(size);
unsigned long n;

srand((unsigned)time(NULL));

unsigned long ntries(0);
bool nozero = true, nosize = true;

while (nozero || nosize)
{
double f = ((double) rand()) / RAND_MAX;
f = f * sizef;
n = (unsigned long)(f);
if (n==0 && nozero)
{
printf("found zero after %ld tries\n", ntries);
nozero = false;
}
else if (n==size && nosize)
{
printf("found 'size' after %ld tries\n", ntries);
nosize = false;
}
ntries++;
}
}

Output:
found 'size' after 9902 tries
found zero after 44910 tries



Here is some more info:

unsigned long i, biggest = 0;
for (i=0; i < size; i++)
{
for (;;)
{
bool flag = false;
double f = ((double) rand()) / RAND_MAX;
f = f * sizef;
n = (unsigned long)(f);

if (n > biggest)
{
biggest = n;
if (biggest == 99999)
n = biggest; // useless code, but just to put a breakpoint
printf("biggest so far: %ld\n", biggest);
}

if (flag) // when the function hangs, put a BP here and adjust flag to
TRUE
{
unsigned long filled(0), unfilled(0);
for (unsigned long j=0;j<size;j++)
{
if (!map[j])
{
unfilled++;
//printf("%d is not filled!\n", j);
}
else
filled++;
if (j % 1000 == 0)
{
//printf("filled: %ld unfilled:%ld\n", filled, unfilled);
}
}
printf("filled: %ld unfilled:%ld\n", filled, unfilled);
}
if (map[n])
continue;
data = n;
map[n] = 1;
break;
}
}

The output goes:

biggest so far: 97978
biggest so far: 98790
biggest so far: 99510
biggest so far: 99718
biggest so far: 99916
biggest so far: 99995
biggest so far: 99999
filled: 32768 unfilled:67232

This asserts that I am getting random numbers above 32768...

p.s: I am using VC6++
 
R

Rob Williscroft

lallous wrote in
Hello,

This code works fine when 'size' is less than 32768 however when size is
bigger this function never returns.
Can't find out why?!

Because 32768 is RAND_MAX on your platform so at some point
your map ends up with RAND_MAX entries set to 1 and your inner
loop never will get passed "if (map[n]) continue;" as n is always
one of the previously set values.

Note that the scaling you do with sizef distributes every value
in [0, RAND_MAX] to a distinct value in [0, size - 1], so when
size > RAND_MAX there are simply some values that will never
appear.
void MakeRandomArray(unsigned long **a, unsigned long size)
{
unsigned long *data = new unsigned long [size];
double sizef = (double)(size - 1);
char *map = new char[size];
unsigned long n, i;

memset(data, 0, size * sizeof(unsigned long));
memset(map, 0, size * sizeof(char));

srand((unsigned) time(NULL));

for (i=0;i<size;i++)
{
for (;;)
{
double f = ((double) rand()) / RAND_MAX;
n = (unsigned long)(f * sizef);
if (map[n])
continue;
data = n;
map[n] = 1;
break;
}
}
delete [] map;
*a = data;
}


Rob.
 
P

Peter van Merkerk

This code works fine when 'size' is less than 32768 however when size
is
bigger this function never returns.
Can't find out why?!
If I break into the code I can see that 'i' is 32768....
void MakeRandomArray(unsigned long **a, unsigned long size)
{
unsigned long *data = new unsigned long [size];
double sizef = (double)(size - 1);
char *map = new char[size];
unsigned long n, i;

memset(data, 0, size * sizeof(unsigned long));
memset(map, 0, size * sizeof(char));

srand((unsigned) time(NULL));

for (i=0;i<size;i++)
{
for (;;)
{
double f = ((double) rand()) / RAND_MAX;
n = (unsigned long)(f * sizef);
if (map[n])
continue;
data = n;
map[n] = 1;
break;
}
}
delete [] map;
*a = data;
}


Hmm, look more like C than C++ code to me.

Anyway the most likely reason why this function never returns is that
the rand() function only generates 32768 unique numbers (size >
RAND_MAX), so after 32768 unique numbers have been generated this
function can no longer find new unique numbers. You probably want to use
a different random generator
(http://en.wikipedia.org/wiki/Mersenne_Twister), as the standard ones
are often not particulary good.

Your algorithm is not particulary efficient. A faster way would be to
fill a vector with unique numbers and then shuffle it:

#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;

vector<int> MakeRandomArray(int size)
{
vector<int> v(size);

for(int i = 0; i < v.size(); ++i)
{
v = i;
}

random_shuffle(v.begin(), v.end());
return v;
}

int main()
{
cout << "Enter size: ";
int size;
cin >> size;
vector<int> v = MakeRandomArray(size);

for(int i = 0; i < v.size(); ++i)
{
cout << v <<" ";
}
return 0;
}
 
L

lallous

Rob Williscroft said:
lallous wrote in
Hello,

This code works fine when 'size' is less than 32768 however when size is
bigger this function never returns.
Can't find out why?!

Because 32768 is RAND_MAX on your platform so at some point
your map ends up with RAND_MAX entries set to 1 and your inner
loop never will get passed "if (map[n]) continue;" as n is always
one of the previously set values.

Note that the scaling you do with sizef distributes every value
in [0, RAND_MAX] to a distinct value in [0, size - 1], so when
size > RAND_MAX there are simply some values that will never
appear.

Thanks Rob, that makes sense.
Also thanks to Peter, STL and random_shuffle looks nice.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Similar Threads


Members online

No members online now.

Forum statistics

Threads
473,755
Messages
2,569,537
Members
45,022
Latest member
MaybelleMa

Latest Threads

Top