Skewed Random Number

shuchalle · Sep 19, 2005

Hello,

I am writing a program in Java and have the following requirements.

We have a large set of data points whose values range from 100 to 1500.

We need to select 10% of the data points at random. So if there are
40000 data points, we need to select 4000 of them at random.

Now you say: well, that's easy. Well, here is the twist.

We need to "skew" the randomness so that more points are selected
toward the high end, near 1500, and fewer toward the low end of the
spectrum, near 100. All in all, though, 10% of the data points (4000
out of 40000) should still be selected.

We could use some sort of "logarithmic skewage", if there is such a
word.

Any clever ideas or hints would be much appreciated!

Regards,

AZXML

Oliver Wong · Sep 19, 2005

Umm... what's the problem exactly? You seem to be under the assumption
that all random distributions are uniform; that's not the case.

I don't know what kind of distribution you want, but the Poisson
distribution, the Beta distribution with (A=1; B=3), the Gamma
distribution with (k=1; theta=2), the exponential distribution, and
many others all have the property that one end of the spectrum is more
likely than the other.
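
As a rough illustration of the exponential case, here is a minimal sketch that measures an exponentially distributed distance downward from the top of the range, so values near 1500 come up most often. The class name, the helper sampleNearHigh, and the "scale" parameter are made up for this example; which distribution actually fits the requirement is still an open question.

```java
import java.util.Random;

class ExponentialSkew {
    // Hypothetical helper: returns a value in [lo, hi] whose density rises toward hi.
    // "scale" is an assumed tuning knob: the mean distance below hi before truncation.
    static double sampleNearHigh(Random rng, double lo, double hi, double scale) {
        while (true) {
            double u = rng.nextDouble();                  // uniform in [0,1)
            double distance = -scale * Math.log(1.0 - u); // exponential; 1-u is in (0,1], so the log is finite
            double value = hi - distance;                 // count the distance downward from hi
            if (value >= lo) {
                return value;                             // draws that fall below lo are retried
            }
        }
    }

    public static void main(String[] args) {
        Random rng = new Random();
        for (int i = 0; i < 5; i++) {
            System.out.println(sampleNearHigh(rng, 100, 1500, 400));
        }
    }
}
```

A smaller scale packs the values more tightly against 1500; a larger one flattens the distribution toward uniform.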

Why don't you take a look at
http://en.wikipedia.org/wiki/Category:Continuous_distributions

- Oliver

Thomas Fritsch · Sep 19, 2005

A simple method for generating a random number that favors large values a
bit could be:

    double x = Math.random(); // uniformly distributed in [0,1)
    x = Math.pow(x, 0.9);     // skewed toward 1, still in [0,1)
    x = 1400 * x + 100;       // skewed toward 1500, in [100,1500)

The smaller the exponent, the stronger the skew toward large values.
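
The snippet above generates new numbers, while the question asks for a selection from existing points. One way to bridge the two, as a sketch only: sort the data and use the skewed number as an index into the sorted array, repeating until enough distinct points are picked. The class name, the helper selectSkewed, and the exponent 0.5 are assumptions made for this example, not part of the original suggestion.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Random;
import java.util.Set;

class SkewedSelection {
    // Hypothetical helper: picks k distinct points from data, favoring larger values.
    // exponent < 1 pushes picks toward the high end; exponent == 1 is uniform.
    static int[] selectSkewed(int[] data, int k, double exponent, Random rng) {
        if (k < 0 || k > data.length) {
            throw new IllegalArgumentException("k must be between 0 and data.length");
        }
        int[] sorted = data.clone();
        Arrays.sort(sorted); // ascending, so high indices hold high values
        Set<Integer> chosen = new LinkedHashSet<>();
        while (chosen.size() < k) {
            double x = Math.pow(rng.nextDouble(), exponent); // skewed toward 1
            // Clamp in case rounding pushes the product up to sorted.length.
            int index = Math.min((int) (x * sorted.length), sorted.length - 1);
            chosen.add(index); // an index that was already picked is ignored
        }
        int[] result = new int[k];
        int i = 0;
        for (int index : chosen) {
            result[i++] = sorted[index];
        }
        return result;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        int[] data = new int[40000];
        for (int i = 0; i < data.length; i++) {
            data[i] = 100 + rng.nextInt(1401); // made-up test data in 100..1500
        }
        int[] picked = selectSkewed(data, data.length / 10, 0.5, rng);
        System.out.println("picked " + picked.length + " points, mean "
                + Arrays.stream(picked).average().orElse(0));
    }
}
```

Retrying on duplicates is cheap when only 10% of the points are wanted; it would get slow if k approached the size of the data set.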

Roedy Green · Sep 19, 2005

You need a course in elementary probability and statistics.

Here are some hints.

See http://mindprod.com/jgloss/randomnumbers.htmls

Here is how nextGaussian works to produce a normal, bell-shaped
distribution:

    synchronized public double nextGaussian() {
        if (haveNextNextGaussian) {
            haveNextNextGaussian = false;
            return nextNextGaussian;
        } else {
            double v1, v2, s;
            do {
                v1 = 2 * nextDouble() - 1; // between -1.0 and 1.0
                v2 = 2 * nextDouble() - 1; // between -1.0 and 1.0
                s = v1 * v1 + v2 * v2;
            } while (s >= 1 || s == 0);
            double multiplier = Math.sqrt(-2 * Math.log(s)/s);
            nextNextGaussian = v2 * multiplier;
            haveNextNextGaussian = true;
            return v1 * multiplier;
        }
    }

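It works by taking pairs of uniform random doubles until the point
(v1, v2) lands inside the unit circle, then rescaling that pair into
two normally distributed values. One is returned and the other is
saved for the next call.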

Another common distribution is called Poisson.

You need to be more precise about just how the elements are skewed
more toward the high end before you can come up with a formula to skew
them.

Here is the general idea of how you can do this.

1. Scale your random number 0..1 over a more interesting domain of a
function with a simple multiplication.

2. Crank it through some non-linear formula, e.g. x squared, sqrt,
exp, log, log base n, x^n, a polynomial, a Chebyshev polynomial,
a parabola, ... Using exp(x), for example, will result in points
being dense at the low end and sparse at the high end.

3. Scale it back into a suitable range with a multiplication.

Different formulae will give you different skewings. If you don't
have a particular mathematical model you need, just pick a formula
that satisfies you intuitively. Graph the function and the
distribution.
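
The three steps above can be sketched roughly as follows. The class name, the helper skewed, and the chosen domains (1..20 for log, 0..3 for exp) are assumptions made for illustration; the sketch also adds an offset alongside the multiplication so the domain need not start at 0, and it assumes the function is strictly increasing on the chosen domain.

```java
import java.util.Random;
import java.util.function.DoubleUnaryOperator;

class TransformSkew {
    // Hypothetical helper following the three steps above.
    // Assumes f is strictly increasing on [domainLo, domainHi].
    static double skewed(Random rng, DoubleUnaryOperator f,
                         double domainLo, double domainHi,
                         double outLo, double outHi) {
        // Step 1: spread the uniform number over the chosen domain of f.
        double t = domainLo + (domainHi - domainLo) * rng.nextDouble();
        // Step 2: push it through the non-linear function.
        double y = f.applyAsDouble(t);
        // Step 3: rescale f's output range onto [outLo, outHi].
        double fLo = f.applyAsDouble(domainLo);
        double fHi = f.applyAsDouble(domainHi);
        return outLo + (outHi - outLo) * (y - fLo) / (fHi - fLo);
    }

    public static void main(String[] args) {
        Random rng = new Random();
        int n = 100000;
        int highLog = 0;
        int highExp = 0;
        for (int i = 0; i < n; i++) {
            if (skewed(rng, Math::log, 1, 20, 100, 1500) > 800) highLog++;
            if (skewed(rng, Math::exp, 0, 3, 100, 1500) > 800) highExp++;
        }
        System.out.println("log: share above 800 = " + (double) highLog / n);
        System.out.println("exp: share above 800 = " + (double) highExp / n);
    }
}
```

With log the results bunch up near 1500, and with exp they bunch up near 100, so log is the one that matches the "more points near the high end" requirement. Widening the domain strengthens the effect in both cases.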
