Using rand(), how do I avoid repeats?

Discussion in 'Perl Misc' started by Scott Stark, Jul 21, 2004.

  1. Scott Stark

    Scott Stark Guest

    I'm using rand() to generate a series of random numbers. Is there any
    way to prevent the same number from being used more than once? All I
    can think of is keeping a list of used numbers, and having some kind
    of "if used, next" statement, which seems lame. (Theoretically that
    could keep going forever, if it randomly kept selecting the same
    numbers.) Any thoughts?

    thanks
    Scott
    Scott Stark, Jul 21, 2004
    #1

  2. Scott Stark

    thundergnat Guest

    Scott Stark wrote:

    > I'm using rand() to generate a series of random numbers. Is there any
    > way to prevent the same number from being used more than once? All I
    > can think of is keeping a list of used numbers, and having some kind
    > of "if used, next" statement, which seems lame. (Theoretically that
    > could keep going forever, if it randomly kept selecting the same
    > numbers.) Any thoughts?
    >
    > thanks
    > Scott


    Sounds like you want some variation of a shuffle. Take a look at
    Algorithm::Numerical::Shuffle on CPAN.

    Generate an array of numbers, shuffle it, then shift off how many items
    you need.
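
    For instance, something roughly like this (an untested sketch using
    List::Util's shuffle; Algorithm::Numerical::Shuffle could be dropped in
    the same way):

        use List::Util qw(shuffle);

        my @deck  = shuffle(1 .. 100);      # your range, in random order
        my @picks = splice(@deck, 0, 5);    # take as many as you need -- no repeats possible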
    thundergnat, Jul 21, 2004
    #2

  3. >>>>> "Scott" == Scott Stark <> writes:

    Scott> I'm using rand() to generate a series of random numbers. Is
    Scott> there any way to prevent the same number from being used more
    Scott> than once? All I can think of is keeping a list of used
    Scott> numbers, and having some kind of "if used, next" statement,
    Scott> which seems lame. (Theoretically that could keep going forever,
    Scott> if it randomly kept selecting the same numbers.) Any thoughts?

    You probably need to deal from a deck then.

    use List::Util qw(shuffle); # might need this from CPAN
    my @shuffled;

    while (1) {   # "while (time passes)" -- loop as long as you need numbers
        @shuffled = shuffle(1..100) unless @shuffled;
        my $number = shift @shuffled;

        # ... use $number here ...
    }

    print "Just another Perl hacker,"; # the original

    --
    Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
    <> <URL:http://www.stonehenge.com/merlyn/>
    Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
    See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
    Randal L. Schwartz, Jul 21, 2004
    #3
  4. (Scott Stark) wrote in message news:<>...
    > I'm using rand() to generate a series of random numbers. Is there any
    > way to prevent the same number from being used more than once? All I
    > can think of is keeping a list of used numbers, and having some kind
    > of "if used, next" statement, which seems lame. (Theoretically that
    > could keep going forever, if it randomly kept selecting the same
    > numbers.) Any thoughts?


    Keeping a hash of 'used' numbers could indeed be lame in that
    performance would degrade as the percentage of numbers used became
    higher and higher.

    A better approach might be to start with an array that has an
    entry for every number in the acceptable range of random numbers,
    where the index and the value are identical to begin with. Here is an
    example representation after this array has been created:

    $random[1] = 1;
    $random[2] = 2;
    $random[3] = 3;
    #etcetera
    $random[16] = 16; #range max.

    The first time, you choose from a range of 1 to 16; let's say 3 is the
    random number. Then you splice a couple of array slices together, so
    that the 'higher' half moves down into the spot where you just
    eliminated one of your possibilities:

    $random[1] = 1;
    $random[2] = 2;

    #$random[3] = 3;

    $random[3] = 4;
    $random[4] = 5;
    #etcetera
    $random[15] = 16;

    The next time you will choose a random number from 1 to 15: the new
    maximum. If it just happens to return 15, for instance, you would
    then get the value specified by that array index: 16.

    There are a few gotchas, such as being sure not to try to create more
    random numbers than are in the original range, and managing the array
    minimum and maximum values as you continue to slice and splice.
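
    In code, the idea might look roughly like this (an illustrative sketch
    only, not the exact slicing described above):

        my @random = (1 .. 16);               # one entry per number in the range
        my $i      = int rand @random;        # random index into what is left
        my $pick   = splice(@random, $i, 1);  # remove it; higher entries shift down
        # next time, int rand @random automatically reflects the new, smaller maximum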

    Happy coding!
    NY Tennessean, Jul 22, 2004
    #4
  5. (Scott Stark) wrote in message news:<>...
    > I'm using rand() to generate a series of random numbers. Is there any
    > way to prevent the same number from being used more than once? All I
    > can think of is keeping a list of used numbers, and having some kind
    > of "if used, next" statement, which seems lame. (Theoretically that
    > could keep going forever, if it randomly kept selecting the same
    > numbers.) Any thoughts?


    You've got a couple of answers suggesting you use modules that allow
    you to "shuffle" an array, and I can't disagree: modules are usually
    the way to go as they have been used and tested more than any home
    grown solution.

    But still I wanted to share with you the home-grown code I've rolled
    since my original suggestion. Beware, it's a little clunky. It could
    stand some perfecting but "works" for me, at least for the dozen or so
    times I tested it:

    sub uniq_random_int_in ($$$) {
        #based on random_int_in in perlfaq4, data manipulation
        my($min, $max, $qty) = @_;
        # Assumes that the three arguments are integers themselves!
        # Assumes $min < $max
        # Assumes $qty <= $max - $min + 1
        my @set, @uniq;
        for (my $i=0; $i <= $max - $min; $i++) {
            $set[$i] = $i;
        }
        for (my $i=0; $i < $qty; $i++) {
            my $rand = int rand($max - $min - $i + 1);
            $uniq[$i] = $set[$rand] + $min;
            @set = (@set[0..$rand-1], @set[$rand+1..scalar(@set)-1]);
            print $uniq[$i], "\n"; #display only, can comment out
        }
        return @uniq;
    }

    @uniq_rands = uniq_random_int_in(11,20,5); # for testing
    NY Tennessean, Jul 22, 2004
    #5
  6. Scott Stark

    Guest

    (Scott Stark) wrote:
    > I'm using rand() to generate a series of random numbers. Is there any
    > way to prevent the same number from being used more than once?


    A lot of ways.

    > All I can think of is keeping a list of used numbers, and having some
    > kind of "if used, next" statement, which seems lame.


    Well, I'd keep a hash of used numbers. And "redo" would likely make more
    sense than "next". If your numbers are floats, you may have some problems
    with the numeric and string representations not being identical, but I
    highly doubt that will be an issue. It is not lame at all, unless you are
    sampling very densely among all possible values. (If you are sampling very
    densely, i.e. you will need a high fraction of all possible values before
    you are done, then I would just shuffle the list of all of those possible
    values.)
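
    Something along these lines, say (a rough sketch, assuming integers drawn
    from a large range):

        my %used;
        my @picks;
        for (1 .. 10) {
            my $n = int rand 1_000_000;
            redo if $used{$n}++;    # seen before: redo this pass with a new number
            push @picks, $n;
        }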

    > (Theoretically that
    > could keep going forever, if it randomly kept selecting the same
    > numbers.) Any thoughts?


    If your random number generator degenerates thus, then in that case you are
    pretty much screwed regardless.

    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    Usenet Newsgroup Service $9.95/Month 30GB
    , Jul 22, 2004
    #6
  7. Scott Stark

    Juha Laiho Guest

    said:
    > (Scott Stark) wrote:
    >> All I can think of is keeping a list of used numbers, and having some
    >> kind of "if used, next" statement, which seems lame.

    ....
    >> (Theoretically that could keep going forever, if it randomly kept
    >> selecting the same numbers.) Any thoughts?

    >
    >If your random number generator degenerates thus, then in that case you
    >are pretty much screwed regardless.


    Not so -- think of a situation where your number space is f.ex.
    1000 numbers, and you don't want repetitions. You'll end up with a
    significant and increasing chance of retries for each new number.
    For the last number, there's only a 1/1000 chance that the random
    algorithm comes up with the desired number. I made a small script
    to test this; even with a range of 3 (creating a randomised list of
    the numbers 1 to 3) I got 12 "false guesses" in a short test while
    generating the list. With larger ranges the number of false guesses got
    significantly worse -- with a range of just 1000, the worst results I got
    had over 3,000,000 false guesses. Results with fewer than 1,000,000 false
    guesses seemed to be rare.

    So, for a list of random numbers without repetitions, the shuffled list
    algorithms are definitely better.
    --
    Wolf a.k.a. Juha Laiho Espoo, Finland
    (GC 3.0) GIT d- s+: a C++ ULSH++++$ P++@ L+++ E- W+$@ N++ !K w !O !M V
    PS(+) PE Y+ PGP(+) t- 5 !X R !tv b+ !DI D G e+ h---- r+++ y++++
    "...cancel my subscription to the resurrection!" (Jim Morrison)
    Juha Laiho, Jul 22, 2004
    #7
  8. Scott Stark

    Anno Siegel Guest

    NY Tennessean <> wrote in comp.lang.perl.misc:
    > (Scott Stark) wrote in message
    > news:<>...
    > > I'm using rand() to generate a series of random numbers. Is there any
    > > way to prevent the same number from being used more than once? All I
    > > can think of is keeping a list of used numbers, and having some kind
    > > of "if used, next" statement, which seems lame. (Theoretically that
    > > could keep going forever, if it randomly kept selecting the same
    > > numbers.) Any thoughts?

    >
    > You've got a couple of answers suggesting you use modules that allow
    > you to "shuffle" an array, and I can't disagree: modules are usually
    > the way to go as they have been used and tested more than any home
    > grown solution.
    >
    > But still I wanted to share with you the home-grown code I've rolled
    > since my original suggestion. Beware, it's a little clunky. It could
    > stand some perfecting but "works" for me, at least for the dozen or so
    > times I tested it:


    Yes, it does what it should, and yes, it *is* a little clunky.

    > sub uniq_random_int_in ($$$) {

    ^^^^^
    Don't give a function a prototype unless there is a distinct advantage.
    Normally, perl subs are not prototyped.

    > #based on random_int_in in perlfaq4, data manipulation
    > my($min, $max, $qty) = @_;
    > # Assumes that the three arguments are integers themselves!
    > # Assumes $min < $max
    > # Assumes $qty <= $max - $min + 1
    > my @set, @uniq;


    This doesn't declare @uniq. "my (@set, @uniq)" is what you need. You
    didn't run this under strict, did you?

    > for (my $i=0; $i <= $max - $min; $i++) {
    > $set[$i] = $i;
    > }


    That could be more simply written

    @set = 0 .. $max - $min;

    But what you really want in @set is the set of possible results, so
    you can make that

    @set = $min .. $max;

    > for (my $i=0; $i < $qty; $i++) {
    > my $rand = int rand($max - $min - $i + 1);


    The number "$max - $min - $i + 1" is exactly the number of elements
    left in @set in the $i-th step. You start with $max - $min + 1 elements
    in the 0-th step, and you lose one element each time $i increases. So

    my $rand = int rand @set;

    does the same thing. The selection process is now even clearer:
    Select a random number between 0 and the number of elements left
    (exclusive). Pick that element.

    > $uniq[$i] = $set[$rand] + $min;


    When @set contains the numbers $min .. $max directly, you don't need
    the addition. $set[$rand] is the wanted random number.

    > @set = (@set[0..$rand-1], @set[$rand+1..scalar(@set)-1]);

    ^^^^^^
    "@set" is already in scalar context, no need for "scalar".

    But Perl's splice() function is what you really want here:

    splice( @set, $rand, 1);

    This returns the value(s) that are spliced out, so you can find
    the random index and move the indicated element from @set to @uniq
    in one step:

    push @uniq, splice( @set, rand @set, 1);

    > print $uniq[$i], "\n"; #display only, can comment out
    > }
    > return @uniq;
    > }
    >
    > @uniq_rands = uniq_random_int_in(11,20,5); # for testing


    Putting it all together, this is how I would write it:

    sub uniq_random_int_in {
        my($min, $max, $qty) = @_;
        my @set = ( $min .. $max);
        map splice( @set, rand @set, 1), 1 .. $qty;
    }
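
    Called just like the original, for example:

        my @uniq_rands = uniq_random_int_in(11, 20, 5);
        print "@uniq_rands\n";   # five distinct integers between 11 and 20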

    Anno
    Anno Siegel, Jul 22, 2004
    #8
  9. wrote in message news:<20040721233346.458$>...
    > (Scott Stark) wrote:
    > > I'm using rand() to generate a series of random numbers. Is there any
    > > way to prevent the same number from being used more than once?

    >
    > A lot of ways.
    >
    > > All I can think of is keeping a list of used numbers, and having some
    > > kind of "if used, next" statement, which seems lame.

    >
    > Well, I'd keep a hash of used numbers.


    There may be more than one way to do it, but this is definitely worse
    than:

    1) the above-stated idea of using a module to shuffle an array
    2) the "idea" behind my home-rolled solution (the implementation could
    definitely be improved)

    As the hash of used numbers grows, the chances increase that, to generate
    each random number, rand() will needlessly be called more than once,
    maybe even several times. I've experienced this. I wrote a similar
    program at my first programming job, for a bingo supply company.
    Random bingo grids, and as I recall the rows and/or columns had to add up
    to the same number?! Well, I kept an array of used combinations and
    the darn thing took longer to run than it did to write! Never again
    would I do that. Maybe that's why I took such an interest in this
    problem.
    NY Tennessean, Jul 22, 2004
    #9
  10. On 2004-07-21, Scott Stark <> wrote:
    > I'm using rand() to generate a series of random numbers. Is there any
    > way to prevent the same number from being used more than once? All I
    > can think of is keeping a list of used numbers, and having some kind
    > of "if used, next" statement, which seems lame. (Theoretically that
    > could keep going forever, if it randomly kept selecting the same
    > numbers.) Any thoughts?


    Several solutions have been presented already. Here's one I haven't
    seen posted in this thread yet: a "lazy sparse shuffle".

    my %int_pool;
    my $drawn = 0;    # how many numbers have been handed out so far
    use constant MAX_UNIQUE_INT => 2**32-1;

    sub get_unique_int () {
        my $n = $drawn++;                                      # index of this draw
        my $r = $n + int rand(MAX_UNIQUE_INT+1 - $n);          # random slot in the undrawn tail
        my $o = (exists $int_pool{$r} ? $int_pool{$r} : $r);   # value currently sitting in slot $r
        $int_pool{$r} = (exists $int_pool{$n} ? $int_pool{$n} : $n);  # move slot $n's value into slot $r
        return $o;
    }

    This method works like the hash solutions already posted, except
    that it has a guaranteed upper bound on execution time, even if the
    number space does get densely sampled. On the other hand, it only
    works for sample spaces that can be easily enumerated.
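
    Usage is just repeated calls, for example:

        my @ids = map { get_unique_int() } 1 .. 5;  # five distinct values in 0 .. MAX_UNIQUE_INT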

    --
    Ilmari Karonen
    If replying by e-mail, please replace ".invalid" with ".net" in address.
    Ilmari Karonen, Jul 22, 2004
    #10
  11. Scott Stark

    Guest

    Juha Laiho <> wrote:
    > said:
    > > (Scott Stark) wrote:
    > >> All I can think of is keeping a list of used numbers, and having some
    > >> kind of "if used, next" statement, which seems lame.

    > ...
    > >> (Theoretically that could keep going forever, if it randomly kept
    > >> selecting the same numbers.) Any thoughts?

    > >
    > >If your random number generator degenerates thus, then in that case you
    > >are pretty much screwed regardless.

    >
    > Not so -- think of a situation where your number space is f.ex.
    > 1000 numbers, and you don't want repetitions. You'll end up with a
    > significant and increasing chance of retries for each new number.


    Increasing, but for the most part trivial.

    > For the last number, there's only a 1/1000 chance that the random
    > algorithm comes up with the desired number. I made a small script
    > to test this; even with a range of 3 (creating a randomised list of
    > the numbers 1 to 3) I got 12 "false guesses" in a short test while
    > generating the list. With larger ranges the number of false guesses got
    > significantly worse -- with a range of just 1000, the worst results I
    > got had over 3,000,000 false guesses. Results with fewer than
    > 1,000,000 false guesses seemed to be rare.


    You clearly did something wrong. It rarely takes over 15,000 tries to get
    all numbers from 1 to 1,000. The odds against needing 1,000,000 tries are
    beyond astronomical.
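
    A quick way to check, roughly (an untested sketch):

        my %seen;
        my $tries = 0;
        until (keys %seen == 1_000) {
            $tries++;
            $seen{ 1 + int rand 1_000 }++;
        }
        print "$tries tries to collect all of 1..1000\n";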

    If your random number generator is so defective that you can't get the
    numbers you need from it using the hash method, then neither can you trust
    its use in your shuffling procedure.

    >
    > So, for a list of random numbers without repetitions, the shuffled list
    > algorithms are definitely better.


    That is true as long as it is possible to enumerate all possible numbers
    within the domain and you want to sample a large fraction of that domain
    (as I said myself in the paragraph you snipped).

    If we are talking about general methods, I'd rather have a method that
    works in general and is slightly suboptimal in a special case, than a
    method that fails in general but works in one special case.

    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    Usenet Newsgroup Service $9.95/Month 30GB
    , Jul 24, 2004
    #11
  12. Scott Stark

    Juha Laiho Guest

    said:
    >Juha Laiho <> wrote:
    >> with a range of just 1000, the worst results I got had over 3,000,000
    >> false guesses. Results with fewer than 1,000,000 false guesses seemed
    >> to be rare.

    >
    >You clearly did something wrong. It rarely takes over 15,000 tries to get
    >all numbers from 1 to 1,000. The odds against needing 1,000,000 tries are
    >beyond astronomical.


    Thanks for the correction; my previous test snippet apparently had
    some dire problem -- now I seem to get repetitions in the range of 5000
    when generating the numbers 1..1000 -- so it looks like I can agree with
    your results.

    Apologies for criticizing your method on false grounds.
    --
    Wolf a.k.a. Juha Laiho Espoo, Finland
    (GC 3.0) GIT d- s+: a C++ ULSH++++$ P++@ L+++ E- W+$@ N++ !K w !O !M V
    PS(+) PE Y+ PGP(+) t- 5 !X R !tv b+ !DI D G e+ h---- r+++ y++++
    "...cancel my subscription to the resurrection!" (Jim Morrison)
    Juha Laiho, Jul 25, 2004
    #12
  13. Scott Stark

    Joe Smith Guest

    Juha Laiho wrote:

    > Not so -- think of a situation where your number space is f.ex.
    > 1000 numbers, and you don't want repetitions. You'll end up with a
    > significant and increasing chance of retries for each new number.
    > For the last number, there's only a 1/1000 chance that the random
    > algorithm comes up with the desired number.


    If you have 1000 numbers available, and have used 999 of them,
    then the last number is *NOT* random! Using rand() to return
    the one and only possible number is not the way to do things.
    Use a shuffle algorithm instead.
    -Joe
    Joe Smith, Jul 25, 2004
    #13
