Mersenne Twister -- A Revised C++ Implementation

  • Thread starter Scott Robert Ladd
  • Start date
S

Scott Robert Ladd

I've posted my revised C++ implementation of the Mersenne Twister at:

http://www.coyotegulch.com/libcoyote/TwistedRoad/TwistedRoad.html

This is "free-as-in-liberty" and "free-as-in-beer" code.

The Mersenne Twister is a "random number" generator invented by Makoto
Matsumoto and Takuji Nishimura; their website includes numerous
implementations of the algorithm.

Essentially, the Mersenne Twister is a very large linear-feedback shift
register. The algorithm operates on a 19,937 bit seed, stored in an
624-element array of 32-bit unsigned integers. The value 2^19937-1 is a
Mersenne prime; the technique for manipulating the seed is based on an
older "twisting" algorithm -- hence the name "Mersenne Twister".

An appealing aspect of the Mersenne Twister is its use of binary
operations -- as opposed to time-consuming multiplication -- for
generating numbers. The algorithm also has a very long period, and good
granularity. It is both fast and effective for non-cryptographic
applications.
 
T

tom_usenet

I've posted my revised C++ implementation of the Mersenne Twister at:

http://www.coyotegulch.com/libcoyote/TwistedRoad/TwistedRoad.html

This is "free-as-in-liberty" and "free-as-in-beer" code.

The Mersenne Twister is a "random number" generator invented by Makoto
Matsumoto and Takuji Nishimura; their website includes numerous
implementations of the algorithm.

Essentially, the Mersenne Twister is a very large linear-feedback shift
register. The algorithm operates on a 19,937 bit seed, stored in an
624-element array of 32-bit unsigned integers. The value 2^19937-1 is a
Mersenne prime; the technique for manipulating the seed is based on an
older "twisting" algorithm -- hence the name "Mersenne Twister".

An appealing aspect of the Mersenne Twister is its use of binary
operations -- as opposed to time-consuming multiplication -- for
generating numbers. The algorithm also has a very long period, and good
granularity. It is both fast and effective for non-cryptographic
applications.

There is a mersenne twister in the new standard library extension
draft. It started out it boost - www.boost.org

Tom

C++ FAQ: http://www.parashift.com/c++-faq-lite/
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
 
S

Scott Robert Ladd

On Mon, 05 Jan 2004 13:31:04 GMT, Scott Robert Ladd
There is a mersenne twister in the new standard library extension
draft. It started out it boost - www.boost.org

The library extension draft isn't an official part of any available
compiler. While it is possible to download and install the Boost
libraries, I know many people who have problems with the Boost licensing
and design.

Diversity is good; I wish Boost well.
 
C

Carsten Hansen

Scott Robert Ladd said:
I've posted my revised C++ implementation of the Mersenne Twister at:

http://www.coyotegulch.com/libcoyote/TwistedRoad/TwistedRoad.html

This is "free-as-in-liberty" and "free-as-in-beer" code.

The Mersenne Twister is a "random number" generator invented by Makoto
Matsumoto and Takuji Nishimura; their website includes numerous
implementations of the algorithm.

Essentially, the Mersenne Twister is a very large linear-feedback shift
register. The algorithm operates on a 19,937 bit seed, stored in an
624-element array of 32-bit unsigned integers. The value 2^19937-1 is a
Mersenne prime; the technique for manipulating the seed is based on an
older "twisting" algorithm -- hence the name "Mersenne Twister".

An appealing aspect of the Mersenne Twister is its use of binary
operations -- as opposed to time-consuming multiplication -- for
generating numbers. The algorithm also has a very long period, and good
granularity. It is both fast and effective for non-cryptographic
applications.

Your web page is full of errors.

The C/C++ Standard does not define rand.
Multiplication is not slow on today's processors. Bit shifting is on x86.
The Mersenne Twister is relative slow because it does
four table lookups
five shifts
eight bitwise operations
for each number.

Carsten Hansen
 
R

Ron Natalie

Carsten Hansen said:
The C/C++ Standard does not define rand.

It does define rand(). It doesn't however mandate the implementation
he includes in his document. The function that appears in the C standard
is not normative and exists just to show a possible impelementation. They
are for illustration purposes only (mostly showing that given the same starting
seed the requirement that the same pseudo-random sequence is generated).
 
C

Carsten Hansen

Ron Natalie said:
It does define rand(). It doesn't however mandate the implementation
he includes in his document. The function that appears in the C standard
is not normative and exists just to show a possible impelementation. They
are for illustration purposes only (mostly showing that given the same starting
seed the requirement that the same pseudo-random sequence is generated).

You are of course correct.

I objected to Scott's claim on his web site that
"Standard C (and thus Standard C++) explicitly defines the following linear
congruential generator for implementing the rand and srand functions:"

Carsten Hansen
 
S

Scott Robert Ladd

It does define rand(). It doesn't however mandate the implementation
he includes in his document. The function that appears in the C standard
is not normative and exists just to show a possible impelementation. They
are for illustration purposes only (mostly showing that given the same
starting seed the requirement that the same pseudo-random sequence is
generated).

No mandate exists, but I have seen compilers that use the "reference"
implementation. Unlike a function like sin, which should return consistent
results across platforms, rand() is quite variable -- and its use of
global values makes it unsuitable for many applications.
 
S

Scott Robert Ladd

I objected to Scott's claim on his web site that
"Standard C (and thus Standard C++) explicitly defines the following linear
congruential generator for implementing the rand and srand functions:"

Explicit it *not* a synonym for mandated. rand() is one of the every few
fucntions for which the standard "suggests" a specific implementation.
Don't make a suggestion if you don't want people to follow it... ;)
 
S

Scott Robert Ladd

The C/C++ Standard does not define rand.

Yes it does; however, that is debated elsewhere in this thread.
Multiplication is not slow on today's processors. Bit shifting is on x86.
The Mersenne Twister is relative slow because it does
four table lookups
five shifts
eight bitwise operations
for each number.

Okay, I can write a one-line generator function that is vastly faster than
the Mersenne Twister -- of course, it won't be a very good generator (like
the one suggested in the C standard), but it will be fast.

The "minimal standard", as suggested by Knuth and others, involve many
operations; in general, Mersenne Twister is as faster or faster than any
other generator that has similar statistical properties. And a whole lot
of people seem to agree; you can find the cites in the article.

Two "errors", one of which isn't and the other a platform dependence.
Doesn't sound very "full of errors" to me. But thanks for the pointers.
 
C

Carsten Hansen

Scott Robert Ladd said:
Yes it does; however, that is debated elsewhere in this thread.


Okay, I can write a one-line generator function that is vastly faster than
the Mersenne Twister -- of course, it won't be a very good generator (like
the one suggested in the C standard), but it will be fast.

The "minimal standard", as suggested by Knuth and others, involve many
operations; in general, Mersenne Twister is as faster or faster than any
other generator that has similar statistical properties. And a whole lot
of people seem to agree; you can find the cites in the article.

Two "errors", one of which isn't and the other a platform dependence.
Doesn't sound very "full of errors" to me. But thanks for the pointers.

Your claims about "a" and "m" in LCM, "a and m can take on only a very few
values" and "m most certainly being a prime", are false.
You claim about best LCM for 32-bit numbers is inconsistent. Is it a=16807
and m = 2147483647 or a=42871 and m=69621.
Knuth calls the first "adequate but less outstanding". Its result from the
spectral test is far below other 32-bit LCMs.

The random number generator Knuth has on his web site involves two table
lookups and one bitwise operation. Hence it is much faster than the Mersenne
Twister.

Marsaglia has many high quality random number generators involving far less
operations than the Mersenne Twister.
Here is a quote about the Mersenne Twister by Marsaglia: "But it requires an
elaborate C program and is slower than many RNGs that do as well in tests,
have comparable or longer periods and require only a few lines of code."

The Mersenne Twister claim to fame is mostly because of its cute name.

Your code show that you don't understand random number generation.
In get_rand_range you basically map [0, 4294963695] onto the range. That
cannot be done uniformly. That is part of the C FAQ.

Carsten Hansen
 
S

Scott Robert Ladd

Your claims about "a" and "m" in LCM, "a and m can take on only a very few
values" and "m most certainly being a prime", are false.

Indeed, a and m can take on any value, but only very specific values
produce a useful random number generator. I could have made that clearer
in the article, and will likely make a small revision tonight.
The random number generator Knuth has on his web site involves two table
lookups and one bitwise operation. Hence it is much faster than the Mersenne
Twister.

And a fine generator it is; I've never made any claim that the Mersenne
Twister is "the best", and, in fact, I provide links to sites that
compare many algorithms. The Mersenne Twister has many adherents and
supporters, most of whom have far more credentials than you or I. The
algorithm performs excellently on the most demanding statistical tests.

You are welcome, of course, to use any algorithm you wish. Have fun.
Marsaglia has many high quality random number generators involving far less
operations than the Mersenne Twister.
Here is a quote about the Mersenne Twister by Marsaglia: "But it requires an
elaborate C program and is slower than many RNGs that do as well in tests,
have comparable or longer periods and require only a few lines of code."

I has not seen this quote, though I certainly won't deny its existence. It
may or may not be valid; I have respect for Mr. Marsaglia, but he is only
one voice of many. And in my opinion, the Mersenne Twister does *not*
require "an elaborate C program."
The Mersenne Twister claim to fame is mostly because of its cute name.

Ah -- so in your opinion, good algorithms must have dull and academic
sounding names?
Your code show that you don't understand random number generation.

Hmmm... and your comments here clearly show that you can't read, or that
you lead an active fantasy life. As in...
In get_rand_range you basically map [0, 4294963695] onto the range. That
cannot be done uniformly. That is part of the C FAQ.

Where do I say that the results are uniform? Since I never claim that the
result is uniform, you accuse me of a non-existent error.

I must point out that the given technique is a common one used for
obtaining a "random" value within a given range; you'll find it in many
text books, papers, and articles. It is imperfect; if you have divined a
superior technique, please illuminate us.
 
S

Stephen Howe

Your code show that you don't understand random number generation.
In get_rand_range you basically map [0, 4294963695] onto the range. That
cannot be done uniformly.

I must have overlooked it. Please point out where he claims uniformity.

Thank you

Stephen Howe
 
C

Carsten Hansen

Scott Robert Ladd said:
Indeed, a and m can take on any value, but only very specific values
produce a useful random number generator. I could have made that clearer
in the article, and will likely make a small revision tonight.


And a fine generator it is; I've never made any claim that the Mersenne
Twister is "the best", and, in fact, I provide links to sites that
compare many algorithms. The Mersenne Twister has many adherents and
supporters, most of whom have far more credentials than you or I. The
algorithm performs excellently on the most demanding statistical tests.

You are welcome, of course, to use any algorithm you wish. Have fun.


I has not seen this quote, though I certainly won't deny its existence. It
may or may not be valid; I have respect for Mr. Marsaglia, but he is only
one voice of many. And in my opinion, the Mersenne Twister does *not*
require "an elaborate C program."

You can do your own search on Google.
Of course, insinuating that I made up the quote, shows something about your
ethics.
Ah -- so in your opinion, good algorithms must have dull and academic
sounding names?

You logic is flawed. I expressed no such opinion.
Your code show that you don't understand random number generation.

Hmmm... and your comments here clearly show that you can't read, or that
you lead an active fantasy life. As in...
In get_rand_range you basically map [0, 4294963695] onto the range. That
cannot be done uniformly. That is part of the C FAQ.

Where do I say that the results are uniform? Since I never claim that the
result is uniform, you accuse me of a non-existent error.

I must point out that the given technique is a common one used for
obtaining a "random" value within a given range; you'll find it in many
text books, papers, and articles. It is imperfect; if you have divined a
superior technique, please illuminate us.

A random number generator that does not generate uniformly distributed
numbers is of very questionable use. If that is what you are looking for,
you have definitely found it.
Koenig and Moo's "Accelerated C++" contains a better way of generating
numbers from a range. That is an introductory text. Maybe it is too advanced
for you.

Carsten Hansen
 
G

galathaea

:
: I've posted my revised C++ implementation of the Mersenne Twister at:
:
: http://www.coyotegulch.com/libcoyote/TwistedRoad/TwistedRoad.html
:
: This is "free-as-in-liberty" and "free-as-in-beer" code.

Just a few comments about your site and code:

First, I wanted to thank you for taking the time to put this on the
internet. I believe making educational materials freely available is an
extremely important undertaking, and anyone who spends some free time to
publish an article explaining anything gets a thumbs up from me. The style
of your article is relaxed and an easy read, which is an additional
accomplishment. However, there are some technical inaccuracies that should
be pointed out.

First, you state:
"In this article, I'm concerned with the needs of stochastic algorithms,
games, and simulations. For those applications, a useful PRNG produces a
sequence of values such that every number in a given range has an equal
chance of being produced. This is what numericists refer to as a uniform
deviate."

And yet your code:

//--------------------------------------------------------------------------
-
// Obtain a psuedorandom integer in the range [lo,hi]
inline mtprng::int_type mtprng::get_rand_range(mtprng::int_type lo,
mtprng::int_type hi)
{
// Local working storage
real_type range = hi - lo + 1.0;

// Use real value to caluclate range
return lo + int_type(floor(range * get_rand_real2()));
}

displays the common error of nonuniform range mapping. This discrepancy
should probably be corrected (either your stated goals or your
implementation).

You also state, later, concerning the sample c implementation in the
standard:
"That algorithm is a slight elaboration on the basic linear congruential
algorithm, in that it uses a long (32-bit signed integer) for the seed, but
returns only a positive int (16-bit signed integer)."

That particular statement should be qualified (or perhaps less qualified).
The seed bit size is not explicit to that implementation, and in fact
compiling that algorithm in various environments will give different bit
sizes for the seed. You shouldn't stress the types at all (in fact srand
takes an int, and an int on the target you seem to be developing on --
VC++ -- is actually the same size as a long, both 32 bit -- it is only the
taken modulus which restricts the output range).

Also, you state:
"The ANSI rand function returns values between 0 and 32,767."

which is similarly incorrect, unless you qualify it by stressing that you
are referring to the sample implementation, as the range required by the ISO
/ IEC standard only defines RAND_MAX as the upper bound.

There are some more nitpicks I might state, but they are similar in scope
and reasoning (like your use of the unsigned long as 32 bit type without any
conditional compilation). However, a more blanket suggestion might be to
just take all the VC++ specific stuff out of your assumptions (and
#ifdef's!), as it is mostly unneeded and where you need a specific bit size,
either use some library like boost's or insert your own conditional
compilation to pick the appropriate type in a more general manner.

Finally, I just wanted to ask if you have considered using template
metaprogramming to ensure the unrolling of your statically-sized loops and
avoid the need for a counter and its manipulation / testing at runtime.
Although profiling and such should always be checked prior and during
optimisation, this is certainly code where one of the main goals is speed.

Just some comments. Again I thank you for your time and effort.
 
S

Scott Robert Ladd

//--------------------------------------------------------------------------
// Obtain a psuedorandom integer in the range [lo,hi]
inline mtprng::int_type mtprng::get_rand_range(mtprng::int_type lo,
mtprng::int_type hi)

displays the common error of nonuniform range mapping. This discrepancy
should probably be corrected (either your stated goals or your
implementation).

I'm aware that this one routine does not fall within the definition of
"uniform"; however, given the usual range of numbers I need, say [0..100),
the statistical anomaly is very small. The modulus operation is quite
common in this context, and I never really considered doing anything more
sophisticated, given the limited impact on my applications.

Be that as it may, I've already modified the function (internally) to use
something a bit more sophisticated that does maintain uniformity. I'll
post a revised version of the code in a few days.
You also state, later, concerning the sample c implementation in the
standard:
"That algorithm is a slight elaboration on the basic linear congruential
algorithm, in that it uses a long (32-bit signed integer) for the seed, but
returns only a positive int (16-bit signed integer)."

That particular statement should be qualified (or perhaps less qualified).

You are very correct; in fact, I may remove the entire discussion of the
"suggested" C implementation, given that almost no current compilers use
it.
Also, you state:
"The ANSI rand function returns values between 0 and 32,767."
which is similarly incorrect, unless you qualify it by stressing that you
are referring to the sample implementation, as the range required by the ISO
/ IEC standard only defines RAND_MAX as the upper bound.

I should have stated that 32,767 is an *example* of a maximum range --
although it is the most common value for RAND_MAX on small systems (in my
experience).
There are some more nitpicks I might state, but they are similar in scope
and reasoning (like your use of the unsigned long as 32 bit type without any
conditional compilation). However, a more blanket suggestion might be to
just take all the VC++ specific stuff out of your assumptions (and
#ifdef's!), as it is mostly unneeded and where you need a specific bit size,
either use some library like boost's or insert your own conditional
compilation to pick the appropriate type in a more general manner.

I much prefer C99 or Fortran 95, where I can portably and explicitly
request a specific bit size without the need for little tricks. For
example, when working in C99, I use int32_t from stdint.h. I note that the
reference implementation uses unsigned long, as do many published
algorithms -- but that doesn't mean I shouldn't be smarter ;)

The VC++-specific #ifdefs work around a bug in older Visual C++ compilers
that do not support the assignment of values to member constants within
the class definitions. I don't see any point in removing this, since many
people still use older tools. Perhaps I should make it non-compiler
specific with a compile-time definition; frankly, I do so little coding on
Windows anymore, I'd forgotten it was there.
Finally, I just wanted to ask if you have considered using template
metaprogramming to ensure the unrolling of your statically-sized loops and
avoid the need for a counter and its manipulation / testing at runtime.
Although profiling and such should always be checked prior and during
optimisation, this is certainly code where one of the main goals is speed.

I've done my share of template metaprogramming (DSP, linear algebra), and
find that its efficacies are often offset by the obscurity of the syntax.
In the case of the Mersenne Twister, some of my programs use it to
generate trillions of numbers -- and profiling shows that the Twister is a
very minor contributor to execution time.

I'll consider the issue, though.
Just some comments. Again I thank you for your time and effort.

I appreciate the thoughtful comments.
 
O

osmium

Scott said:
You are very correct; in fact, I may remove the entire discussion of the
"suggested" C implementation, given that almost no current compilers use
it.

I much prefer that you leave the code there and correct the weasel words.
It's code that a lot of people would like to see.
The VC++-specific #ifdefs work around a bug in older Visual C++ compilers
that do not support the assignment of values to member constants within
the class definitions. I don't see any point in removing this, since many
people still use older tools. Perhaps I should make it non-compiler
specific with a compile-time definition; frankly, I do so little coding on
Windows anymore, I'd forgotten it was there.

I would have really appreciated a comment as to what the #ifdef was all
about. BFTSLK loses more and more meaning as time goes on.
 
J

jeffc

Carsten Hansen said:
You can do your own search on Google.
Of course, insinuating that I made up the quote, shows something about your

You logic is flawed. I expressed no such opinion.

His logic was flawed, as was yours when you said he insinuated you made up
the quote. Isn't it time you stopped bickering? Your arguments are
becoming weaker and more desperate - the sign that it's over.
 
C

Chris Theis

Carsten Hansen said:
[SNIP]> >

A random number generator that does not generate uniformly distributed
numbers is of very questionable use. If that is what you are looking for,
you have definitely found it.
Koenig and Moo's "Accelerated C++" contains a better way of generating
numbers from a range. That is an introductory text. Maybe it is too advanced
for you.

Sit back and relax. There is no need for both of you to get personal here.
If you want to be knitpicking then the statement that a RNG that does not
generate uniformly distributed numbers is of questionable use, is simply not
correct in that form. Such a RNG is certainly useless as a basis to obtain
arbitrarily specified distributions but especially in the field of Monte
Carlo simulations RNGs producing non-uniform distributions are wanted.
Although of course one must be aware which distribution is obtained and
control over this distribution is mandatory.

Cheers
Chris
 
S

Scott Robert Ladd

His logic was flawed, as was yours when you said he insinuated you made up
the quote. Isn't it time you stopped bickering? Your arguments are
becoming weaker and more desperate - the sign that it's over.

I never insinuated that he made up the quote! I simply said I'd never seen
the quote -- two totally different assertions.
 
J

jeffc

Scott Robert Ladd said:
I never insinuated that he made up the quote! I simply said I'd never seen
the quote -- two totally different assertions.

That's what I just said.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,744
Messages
2,569,484
Members
44,903
Latest member
orderPeak8CBDGummies

Latest Threads

Top