[SUMMARY] Not So Random (#173)

M

Matthew Moss

Apologies for the late summary... I'm moving to a new place today,
tomorrow, this weekend... and start classes on Monday. Also, look for
the next quiz to show up this Saturday, not tomorrow.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

[Procedural generation][1] is a valuable technique for computer
graphics and games that relies on algorithms to produce content on the
fly. Use of procedural techniques have increased in recent years, due
to the large volume of content that must be created... too large or
unwieldy to be completed by hand in a short duration. But procedural
generation is not new. Early computer games such as _Elite_ and
_Rescue on Fractalus_ used algorithms, rather than preformed data, to
represent large worlds on machines with limited storage.

On one hand, CPU cycles are traded for storage limitations. On the
other, CPU cycles are exchanged for human and schedule limitations.
Essentially, procedural generation is a way to swap CPU time for other
limited resources. But if the same algorithms are run repeatedly,
won't everything look the same? An algorithm, given the same input, by
definition produces the same output.

That's where pseudorandom numbers can help, both to provide variety
and maintain repeatability. A random sequence starting with a seed
value of 42 will -- when put through a particular algorithm -- always
generate the same output. Yet, start the sequence with seed value 43,
and the same algorithm will generate new output. We can generate a
forest of randomly generated trees, each looking similar but unique,
and not have the trees randomly change every frame.

Do we need more than one generator? It depends on your algorithms and
how you generate the content. It could work, but it may be easier to
keep some things separate. One random number generator for the forest.
Another generator for the city buildings. Another one for people. That
way, I can avoid generating information for the forest if it happens
to go offscreen, because the PRNGs for the buildings and the people
are independent of the forest PRNG.

Generally, for task such as this, you don't usually need a generator
such as the Mersenne Twister. A [linear congruential generator][2],
while not really very random, is quite simple, fast and often
sufficient. For such a task, Ruby code of a few lines would have been
completely sufficient.

But I wanted to see how we could convince Ruby's `rand` to do what we
wanted, without writing a new PRNG. I'm glad I asked, because after
playing with closures and continuations, I conceived only one solution
myself... I knew there had to be another, better way.

But first, let's look at the first solution from _Frederick Cheung_,
which implements the only idea I had.

class Random
def initialize(*rand_args)
@rand_args = rand_args
ignored, @start_seed = srand, srand
@sequence = 0
end

def next
srand(@start_seed)
@sequence.times { rand *@rand_args}
@sequence += 1
rand *@rand_args
end

def reset
@sequence = 0
end
end

At first, I thought Frederick was the only one who implemented a
solution that would allow `rand` to produce random floats between zero
and one (i.e. what you get when you call `rand` with no parameters). I
consider this important behavior. However, it turns out that passing
zero to `rand` induces the same behavior, so props to everyone who
provided a default value of zero.

This implementation is straightforward. Each time we call `next` to
get the next random number in the sequence, we restart the sequence,
calling `rand` a number of times equal to `@sequence`, which is
incremented each call. This is very simple and guarantees we always
get back to the correct point, whether or not other `Random` objects
have generated numbers.

(Note that while I did mention concurrent in the quiz description, I
did not mean to imply multithreaded, so for our purposes here, this is
sufficient.)

I found this line in the initializer a little curious:

ignored, @start_seed = srand, srand

`srand` returns the previous seed. Calling it twice in a row will
first set the seed, then return that and set it again to a new value.
But the saved seed is what is used in calls to `next`. It's a nice
trick to get the (time-based) seed generated by `srand`.

While simple, this solution is highly inefficient. Procedural
generation can require a lot of random numbers, and this PRNG
algorithm is quadratic. (That is, `O(n^2)`.) Very quickly, this
generator would become painfully slow.

One possible improvement, provided in a few solutions, is caching.
Since we're generating the same exact data, we can save our work and
reuse it later. In our contrived situation, This isn't as bad as it
sounds. Although memory use must be considered, it's not unlikely that
a particular `Random` object -- a particular sequence -- will always
generate the same quantity of numbers every time. If we're not
uprooting trees in that forest, it's likely we need to continue
generating the same number of trees. Of course, in typical situations,
you wouldn't force yourself into this quiz's confines, but if a small
cache and a one-time generation is sufficient, you could trade a
little memory for CPU time, if need be.

To finish up, let's look at _brabuhr_'s clean, compact solution.

require 'slave'

class Random
def initialize(ceiling, seed = nil)
@ceiling = ceiling
@seed = seed || rand(2112)
@slave = Slave::new{
lambda {|reset|
if reset
srand(@seed)
else
rand(@ceiling)
end
}
}
self.reset
end

def next
@slave.object.call(false)
end

def reset
@slave.object.call(true)
end
end

The [Slave][3] library combines a forked Ruby process with DRb
(Distributed Ruby) and creates a client-server relationship between
the two processes. This relationship is created in the call to
`Slave::new`, which passes the following block to the slave:

lambda { |reset|
if reset
srand(@seed)
else
rand(@ceiling)
end
}

Calling this block with a boolean (as done in the `next` and `reset`
methods) simply invokes either `srand` or `rand`. But because the
slave object is a forked Ruby process, it contains a separate built-in
seed for those methods. Each `Random` instance will create such a
slave, thereby creating independent seeds that cannot interfere with
one another. In my own experience, I am so used to threading (i.e.
shared state and address space), the idea of using a forked process
(i.e. independent state and address space) didn't cross my mind.

Using Slave may have some overhead from the interprocess
communication. However, it is a constant-time operation: it should
easily beat the first `Random` implementation for performance for
typical use.



[1]: http://en.wikipedia.org/wiki/Procedural_generation
[2]: http://en.wikipedia.org/wiki/Linear_congruential_generator
[3]: http://codeforpeople.com/lib/ruby/slave/slave-1.2.1/
 
F

Frederick Cheung

I found this line in the initializer a little curious:

ignored, @start_seed = srand, srand

`srand` returns the previous seed. Calling it twice in a row will
first set the seed, then return that and set it again to a new value.
But the saved seed is what is used in calls to `next`. It's a nice
trick to get the (time-based) seed generated by `srand`.

It's worth noting that (when available) srand with no arguments will
try and use things like /dev/random to seed itself so you can get
yourself a nice random seed.

Fred
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,012
Latest member
RoxanneDzm

Latest Threads

Top