Re: Random 64 bit number in stdc

Discussion in 'C Programming' started by James Dow Allen, Aug 19, 2013.

1. James Dow AllenGuest

I've polished up my own random library. If you're
very masochistic^H^H^H^H^H^H^H^H^H^H^H eager to be a "beta tester",
e-mail me. PRNG isn't novel; it's based on Marsaglia etc.
Only "novelty" is avoiding unnecessary waste of random bits.

Here's a summary of the external specification:

/* Seed or reseed the random generator */
void rand_seed(int seed);

/* Return cnt random bits. 0 < cnt < 33 */
uint32_t rand_bits(unsigned int cnt);

/* Return random uniform 0 < x < 1 */
double rand_prob(void);

/* Return 1 with probability prob */
int rand_decide(double prob);

/* Return random uniform 0 <= x < range <= 65535 */
unsigned int rand_index(unsigned short range);

/* Return a uniform random variable from `minv' to `maxv' */
double rand_unif(double minv, double maxv);

/* Return Poisson random variable with mean and variance = ln(e_to_p) */
int rand_poisson(double e_to_p);

/* Return gaussian random variable */
double rand_gaussian(double meanv, double sdevv);

/* Set a random point on the surface of a N-dim hypersphere */
void rand_surfpoint(double coord[], double radius, int n);

/* Set a random point in interior of a N-dim hypersphere */
double rand_inball(double coord[], double radius, int n);

struct walkerent {
int x1, x2;
double prob1;
};

/* Given histogram wp[0...(n-1)], prepare for rand_walker() */
void rand_setwalker(struct walkerent *wp, unsigned short n);

/* Return 0 <= x < n according to set distribution */
int rand_walker(struct walkerent *wp, unsigned short n);

but abandoned that when I saw makefile complexity.
.... In a program which really only needs printf! ::whack::
(Am I the only one who prefers simplicity and standaloneness?)

James

James Dow Allen, Aug 19, 2013

2. James Dow AllenGuest

Rosario1903 <> might have writ, in
news::

> On Mon, 19 Aug 2013 04:19:39 +0000 (UTC), James Dow Allen wrote:
>
>>I've polished up my own random library.

> why not
> void rand_seed(uint32_t* seed, uint32_t size);
> where seed is one array of unsigned of size: size

Your point is valid. I'll offer four excuses:

(1) Big seed may be needed for cryptography and casinos.
Small seed is good enough for simulations, which is my background.

(2) I did not find a good write-up on seeding for Marsaglia nor
Mersenne Twister. Correctness (i.e. copying faithfully so as not to
accidentally destroy randomness) is more important than functionality !

(3) My added value, if any, isn't the generator. It's the methods for
conserving bits, implementing Walker algorithm, etc.

(4) I wrote this for my own use and amusement. If it has any future at
all, that future will come *with the help* of other C programmers.

James

James Dow Allen, Aug 19, 2013

3. Eric SosmanGuest

On 8/19/2013 12:19 AM, James Dow Allen wrote:
> I've polished up my own random library. If you're
> very masochistic^H^H^H^H^H^H^H^H^H^H^H eager to be a "beta tester",
> e-mail me. PRNG isn't novel; it's based on Marsaglia etc.
> Only "novelty" is avoiding unnecessary waste of random bits.
>
> Here's a summary of the external specification:
>
> /* Seed or reseed the random generator */
> void rand_seed(int seed);

That's not much entropy to seed a generator with lots
of internal state.

> /* Return random uniform 0 < x < 1 */
> double rand_prob(void);

That's peculiar: [0,1) is what I think most people expect.

> /* Return random uniform 0 <= x < range <= 65535 */
> unsigned int rand_index(unsigned short range);

The range restriction seems odd.

I'm not saying any of these design choices are "wrong,"
just that they're surprising.

--
Eric Sosman
d

Eric Sosman, Aug 19, 2013
4. James Dow AllenGuest

On Monday, August 19, 2013 8:19:56 PM UTC+7, Eric Sosman wrote:
> On 8/19/2013 12:19 AM, James Dow Allen wrote:
> > void rand_seed(int seed);

>
> That's not much entropy to seed a generator with lots
> of internal state.

Answered in my previous post, more or less. Marsaglia's
or MTwister's numbers are very "random" just always one
of the same (up to 4 billion different) random sequences. ::whack::
If someone points me to a good write-up for seeding Marsaglia
*without worry that pathological case will degrade the PRNG*,
I will support it.

> > /* Return random uniform 0 < x < 1 */
> > double rand_prob(void);

> That's peculiar: [0,1) is what I think most people expect.

I want the mean value to be 0.5 exactly. (And I like symmetry.)
Anyway, the measure of (p==0 exactly) event would be small anyway
(2^-31).

> > /* Return random uniform 0 <= x < range <= 65535 */
> > unsigned int rand_index(unsigned short range);

> The range restriction seems odd.

Do you *ever* use random *non-power-of-two* indexes from
a larger range? A *non-factorable* range?
.... And rand_bits() is available if power of two.

I had larger range implemented but its value seemed outweighed
by added coding complexity. (I do like to think that if my
code has any virtues at all, *simplicity* is one of them.)

James

James Dow Allen, Aug 19, 2013
5. Eric SosmanGuest

On 8/19/2013 9:38 AM, James Dow Allen wrote:
> On Monday, August 19, 2013 8:19:56 PM UTC+7, Eric Sosman wrote:
>> On 8/19/2013 12:19 AM, James Dow Allen wrote:
>>> [...]
>>> /* Return random uniform 0 <= x < range <= 65535 */
>>> unsigned int rand_index(unsigned short range);

>> The range restriction seems odd.

>
> Do you *ever* use random *non-power-of-two* indexes from
> a larger range? A *non-factorable* range?
> ... And rand_bits() is available if power of two.

"Indices," maybe not. "Values," certainly.

> I had larger range implemented but its value seemed outweighed
> by added coding complexity. (I do like to think that if my
> code has any virtues at all, *simplicity* is one of them.)

Tastes vary (that's all this discussion is about, really),
but a two-line rejection loop doesn't seem complex to me.

--
Eric Sosman
d

Eric Sosman, Aug 19, 2013
6. James Dow AllenGuest

On Monday, August 19, 2013 8:51:49 PM UTC+7, Eric Sosman wrote:
> Tastes vary (that's all this discussion is about, really),
> but a two-line rejection loop doesn't seem complex to me.

Seven lines altogether, even without bit conservation,
since a completely new case is needed.
It was easier to just add them (as I've now done),
than to defend the indefensible.

James

James Dow Allen, Aug 19, 2013
7. Keith ThompsonGuest

James Dow Allen <> writes:
> I've polished up my own random library. If you're
> very masochistic^H^H^H^H^H^H^H^H^H^H^H eager to be a "beta tester",
> e-mail me. PRNG isn't novel; it's based on Marsaglia etc.
> Only "novelty" is avoiding unnecessary waste of random bits.
>
> Here's a summary of the external specification:
>
> /* Seed or reseed the random generator */
> void rand_seed(int seed);

If you want to have a specified number of bits in the seed, I suggest
using the typedefs from <stdint.h>, noting that the exact-width types
are not guaranteed to exist.

I note that the standard srand() takes an unsigned int seed argument,
which seems more convenient. Does yours work correctly with negative
seeds?

> /* Return cnt random bits. 0 < cnt < 33 */
> uint32_t rand_bits(unsigned int cnt);

I suggest spelling out "count".

[...]

--
Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Keith Thompson, Aug 19, 2013
8. Eric SosmanGuest

On 8/19/2013 12:59 PM, James Dow Allen wrote:
> On Monday, August 19, 2013 8:51:49 PM UTC+7, Eric Sosman wrote:
>> Tastes vary (that's all this discussion is about, really),
>> but a two-line rejection loop doesn't seem complex to me.

>
> Seven lines altogether, even without bit conservation,
> since a completely new case is needed.
> It was easier to just add them (as I've now done),
> than to defend the indefensible.

Mine (based on a different PRNG, with a typedef and some
macros that shouldn't baffle anyone):

/**
* Returns a non-negative pseudo-random value strictly less
* than its argument, which must be in the range 1 through
* MSRANDMAX-MSRANDMIN+1, inclusive. Returned values are
* exactly equiprobable.
*/
msrand_t
msrandint(msrand_t limit)
{
msrand_t divisor, value;
divisor = (MSRANDMAX - MSRANDMIN + 1) / limit;
while ( (value = (msrand() - MSRANDMIN) / divisor) >= limit )
;
return value;
}

The loop could be micro-optimized a little, if anyone cared
(I never did, since even in the worst case the expected number
of divisions per call is slightly less than two).

--
Eric Sosman
d

Eric Sosman, Aug 19, 2013
9. Eric SosmanGuest

On 8/19/2013 3:26 PM, Eric Sosman wrote:
> [...]
> msrand_t
> msrandint(msrand_t limit)
> {
> msrand_t divisor, value;
> divisor = (MSRANDMAX - MSRANDMIN + 1) / limit;
> while ( (value = (msrand() - MSRANDMIN) / divisor) >= limit )
> ;
> return value;
> }
>
> The loop could be micro-optimized a little, if anyone cared
> (I never did, since even in the worst case the expected number
> of divisions per call is slightly less than two).

Sorry: That should be "divisions per call *in the loop*,"
hence "slightly less than three all told."

--
Eric Sosman
d

Eric Sosman, Aug 19, 2013
10. James Dow AllenGuest

On Tuesday, August 20, 2013 2:26:24 AM UTC+7, Eric Sosman wrote:

> > Seven lines altogether, even without bit conservation,

> ...
> while ( (value = (msrand() - MSRANDMIN) / divisor) >= limit )
> ;

I do NOT quibble with your fine code, but just seek to clarify
my "7 lines": In my older age, I'll often use {} when ;
could suffice. (Two of my 7 were together only "do {} ".
Similarly, I also spend two lines for the "if (range > OTHER) {}"

But much more importantly, your code will spend 31(?) random
bits to pick a value in 0...14 while (slightly less than) 4

James

James Dow Allen, Aug 20, 2013
11. James Dow AllenGuest

Eric Sosman <> might have writ, in
news:kutrh1\$c7a\$:

> ...
> msrand_t divisor, value;
> divisor = (MSRANDMAX - MSRANDMIN + 1) / limit;

Eric, am I correct that, typically, msrand_t will be uint32_t
but for the code to work MSRANDMAX will be only 0x7fffffff ?

In practice I usually do such to make arithmetic easier,
but for my "offical" random library, I (masochistically?)
use 32-bit generators.

James

James Dow Allen, Aug 20, 2013
12. Eric SosmanGuest

On 8/19/2013 10:15 PM, James Dow Allen wrote:
> On Tuesday, August 20, 2013 2:26:24 AM UTC+7, Eric Sosman wrote:
>
>>> Seven lines altogether, even without bit conservation,

>
>> ...
>> while ( (value = (msrand() - MSRANDMIN) / divisor) >= limit )
>> ;

>
> I do NOT quibble with your fine code, but just seek to clarify
> my "7 lines": In my older age, I'll often use {} when ;
> could suffice. (Two of my 7 were together only "do {} ".
> Similarly, I also spend two lines for the "if (range > OTHER) {}"
> that has to be added.
>
> But much more importantly, your code will spend 31(?) random
> bits to pick a value in 0...14 while (slightly less than) 4

As I mentioned, the underlying PRNG is different from
yours. (It's Park & Miller's "Minimal Standard Random Number
Generator," a congruential algorithm with prime modulus.)
Each call to msrand() yields a result in the range 1 through
0x7fffffE, that is, just a smidgen less than 31 bits' worth.

I've never given any thought to economizing bit use with
this generator. It would certainly complicate things -- that
leftover 0.9999999986564 of a bit would be awkward in the
extreme! There certainly are settings, though, were parsimony
it would be undesirable to consume more bits than absolutely
necessary.

--
Eric Sosman
d

Eric Sosman, Aug 20, 2013
13. Eric SosmanGuest

On 8/20/2013 12:52 AM, James Dow Allen wrote:
> Eric Sosman <> might have writ, in
> news:kutrh1\$c7a\$:
>
>> ...
>> msrand_t divisor, value;
>> divisor = (MSRANDMAX - MSRANDMIN + 1) / limit;

>
> Eric, am I correct that, typically, msrand_t will be uint32_t
> but for the code to work MSRANDMAX will be only 0x7fffffff ?

#define MSRANDMIN 1 /* smallest value ever produced */
#define MSRANDMAX 2147483646 /* largest value (2**31 - 2) */

/* Define a signed integral type which can express all the generated
* values, along with negative quantities as low as -47664652 which
* may appear as intermediate values in the computations.
*/
#if (INT_MIN <= -47664652 && INT_MAX >= MSRANDMAX)
typedef int msrand_t;
#elif (LONG_MIN <= -47664652 && LONG_MAX >= MSRANDMAX)
typedef long msrand_t;
#else
/* Shouldn't happen */
# error "No suitable integral type for msrand_t exists!"
#endif

In hindsight, using a signed type for msrand_t was a blunder
on my part. The generator uses negative values internally, but
I shouldn't have let that dictate the interface. Nowadays I
guess uint_least31_t (!) would be a good choice -- but the choice
was made in the days before C99.

--
Eric Sosman
d

Eric Sosman, Aug 20, 2013
14. Öö TiibGuest

On Monday, 19 August 2013 16:38:37 UTC+3, James Dow Allen wrote:
> On Monday, August 19, 2013 8:19:56 PM UTC+7, Eric Sosman wrote:
> > On 8/19/2013 12:19 AM, James Dow Allen wrote:
> > > void rand_seed(int seed);

> >
> > That's not much entropy to seed a generator with lots
> > of internal state.

>
> Answered in my previous post, more or less. Marsaglia's
> or MTwister's numbers are very "random" just always one
> of the same (up to 4 billion different) random sequences. ::whack::
> If someone points me to a good write-up for seeding Marsaglia
> *without worry that pathological case will degrade the PRNG*,
> I will support it.
>
> > > /* Return random uniform 0 < x < 1 */
> > > double rand_prob(void);

> > That's peculiar: [0,1) is what I think most people expect.

>
> I want the mean value to be 0.5 exactly. (And I like symmetry.)
> Anyway, the measure of (p==0 exactly) event would be small anyway
> (2^-31).

That might be very useful for cases when we want a coin flipping
algorithm that has one coin from 2147483648 to stand on edge.
People need such thing rarely and more often they want p<0.5 and
p>=0.5 to have exactly equal probability (also sort of symmetry).

Öö Tiib, Aug 21, 2013
15. James Dow AllenGuest

Öö Tiib <> might have writ, in news:70280474-0cab-45cd-89bb-e1e4c5427111

> On Monday, 19 August 2013 16:38:37 UTC+3, James Dow Allen wrote:
>> I want the mean value to be 0.5 exactly. (And I like symmetry.)
>> Anyway, the measure of (p==0 exactly) event would be small anyway
>> (2^-31).

>
> That might be very useful for cases when we want a coin flipping
> algorithm that has one coin from 2147483648 to stand on edge.
> People need such thing rarely and more often they want p<0.5 and
> p>=0.5 to have exactly equal probability (also sort of symmetry).

FWIW, my code doesn't produce 0.5 exactly, in addition to not producing
0 or 1 exactly. p < 0.5 and p > 0.5 do have equal probabilities regardless
of whether or where you place the "or equal."

What is the best way to position, say, ten samples evenly in [0,1]? I think
{0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95}
is an appropriate answer. Frankly I cannot imagine why anyone would
disagree with this!

James

James Dow Allen, Aug 21, 2013
16. Eric SosmanGuest

On 8/21/2013 7:35 AM, James Dow Allen wrote:
> Öö Tiib <> might have writ, in news:70280474-0cab-45cd-89bb-e1e4c5427111
>
>> On Monday, 19 August 2013 16:38:37 UTC+3, James Dow Allen wrote:
>>> I want the mean value to be 0.5 exactly. (And I like symmetry.)
>>> Anyway, the measure of (p==0 exactly) event would be small anyway
>>> (2^-31).

>>
>> That might be very useful for cases when we want a coin flipping
>> algorithm that has one coin from 2147483648 to stand on edge.
>> People need such thing rarely and more often they want p<0.5 and
>> p>=0.5 to have exactly equal probability (also sort of symmetry).

>
> FWIW, my code doesn't produce 0.5 exactly, in addition to not producing
> 0 or 1 exactly. p < 0.5 and p > 0.5 do have equal probabilities regardless
> of whether or where you place the "or equal."
>
> What is the best way to position, say, ten samples evenly in [0,1]? I think
> {0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95}
> is an appropriate answer. Frankly I cannot imagine why anyone would
> disagree with this!

Looks suitably random, too. ;-)

--
Eric Sosman
d

Eric Sosman, Aug 21, 2013
17. James Dow AllenGuest

Eric Sosman <> might have writ, in
news:kv2btm\$ea7\$:
> Looks suitably random, too. ;-)

I'm curious what this means. Surely you don't misunderstand the context?

James

James Dow Allen, Aug 21, 2013
18. Eric SosmanGuest

On 8/21/2013 9:08 AM, James Dow Allen wrote:
> Eric Sosman <> might have writ, in
> news:kv2btm\$ea7\$:
>> Looks suitably random, too. ;-)

>
> I'm curious what this means. Surely you don't misunderstand the context?

you wrote

>>> What is the best way to position, say, ten samples evenly in [0,1]?

I think
>>> {0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95}
>>> is an appropriate answer. Frankly I cannot imagine why anyone would
>>> disagree with this!

Frankly, I cannot imagine why anyone would offer this collection
as a "random" sample.

Besides which, it is obvious that the *right* ten values to
use are

0.013047-
0.067468+
0.160295+
0.283302+
0.425563-
0.574437+
0.716698-
0.839705-
0.932532-
0.986953+

.... and this should be so transparently clear to even the feeblest
intellect that I cannot imagine why anyone would disagree! ;-)

http://en.wikipedia.org/wiki/Legendre_polynomials#Shifted_Legendre_polynomials

--
Eric Sosman
d

Eric Sosman, Aug 21, 2013
19. James Dow AllenGuest

Eric Sosman <> might have writ, in
news:kv2hlq\$d26\$:

> On 8/21/2013 9:08 AM, James Dow Allen wrote:
>> Eric Sosman <> might have writ, in
>> news:kv2btm\$ea7\$:
>>> Looks suitably random, too. ;-)

>>
>> I'm curious what this means. Surely you don't misunderstand the
>> context?

>
> you wrote
>
> >>> What is the best way to position, say, ten samples evenly in
> >>> [0,1]?

What does "evenly" mean?

> >>> I think
> >>> {0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95}
> >>> is an appropriate answer. Frankly I cannot imagine why anyone
> >>> would disagree with this!

>
> Frankly, I cannot imagine why anyone would offer this collection
> as a "random" sample.

Until now I'd never thought you were stupid. Unless yours is an
elaborate joke, apparently the feeling wasn't reciprocated.

Pay attention:

The rand_prob() we are discussing always returns one of 2,147,483,648
different values. AFAIK essentially all such routines function
similarly: which of those 2147483648 values is returned on a given call
is random; however the set of 2147483648 possibilities is fixed.

To understand rand_prob() we want to know about that set of 2147483648
distinct values. But that is a large number of values to enumerate.