Distribution

prince.pangeni

Hi all,
I am doing a simulation project using Python. In my project, I want
to use some sort of distribution to generate requests to a server.
The requests should follow two distributions: one for the request arrival
rate (should be Poisson) and another for the request mix (i.e. out of the
total requests defined by the request arrival rate, how many requests are
of which type).
Example: Suppose the request rate is 90 req/sec (generated using a
Poisson distribution) at time t and we have 3 types of requests (i.e.
r1, r2, r3). The request mix distribution output should be similar to:
{r1 : 50 , r2 : 30 , r3 : 10} (i.e. out of 90 requests, 50 are of type
r1, 30 are of type r2 and 10 are of type r3).
As I am new to the Python distribution module, I am not getting how to
code this situation. Please help me out with this.

Thanks in advance

Prince
 
Peter Otten

prince.pangeni said:
Hi all,
I am doing a simulation project using Python. In my project, I want
to use some sort of distribution to generate requests to a server.
The requests should follow two distributions: one for the request arrival
rate (should be Poisson) and another for the request mix (i.e. out of the
total requests defined by the request arrival rate, how many requests are
of which type).
Example: Suppose the request rate is 90 req/sec (generated using a
Poisson distribution) at time t and we have 3 types of requests (i.e.
r1, r2, r3). The request mix distribution output should be similar to:
{r1 : 50 , r2 : 30 , r3 : 10} (i.e. out of 90 requests, 50 are of type
r1, 30 are of type r2 and 10 are of type r3).
As I am new to the Python distribution module, I am not getting how to
code this situation. Please help me out with this.

You don't say what distribution module you're talking of, and I guess I'm
not the only one who'd need to know that detail.

However, with sufficient resolution and duration the naive approach sketched
below might be good enough.

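    # untested
    import random

    DURATION = 3600    # run for one hour (seconds)
    RATE = 90          # requests/sec
    RESOLUTION = 1000  # time slots per second, i.e. one msec each

    # placeholders for the three request types
    r1, r2, r3 = "r1", "r2", "r3"
    requests = [r1]*50 + [r2]*30 + [r3]*10
    time_slots = [0]*(RESOLUTION*DURATION)

    def issue_request_at(request, time):
        pass  # placeholder: hook your simulation in here

    # drop every request into a randomly chosen millisecond slot
    for _ in range(DURATION*RATE):
        time_slots[random.randrange(len(time_slots))] += 1

    # replay the slots in time order, picking a request type for each hit
    for time, count in enumerate(time_slots):
        for _ in range(count):
            issue_request_at(random.choice(requests), time)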
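A lighter-weight variant of the same naive idea, offered as a sketch rather than a tested replacement: instead of keeping one counter per millisecond, draw one uniform timestamp per request and sort them. The function name `scatter_requests` and the string labels for the request types are placeholders made up for this example.

```python
import random


def scatter_requests(duration, rate, mix, rng):
    """Return a time-sorted list of (time, request_type) pairs.

    duration: seconds to simulate
    rate:     average requests per second
    mix:      dict mapping request type -> relative weight
    rng:      a random.Random instance
    """
    kinds = list(mix)
    weights = [mix[k] for k in kinds]
    n = int(duration * rate)  # fixed total number of requests
    # Uniform timestamps over the whole run, in chronological order.
    times = sorted(rng.uniform(0.0, duration) for _ in range(n))
    # One weighted draw per request to decide its type.
    picked = rng.choices(kinds, weights=weights, k=n)
    return list(zip(times, picked))


rng = random.Random(42)  # seeded for repeatability
schedule = scatter_requests(10, 90, {"r1": 50, "r2": 30, "r3": 10}, rng)
print(len(schedule), schedule[0])
```

Like the slot-based version, this fixes the total number of requests in advance, so it only approximates a Poisson arrival process.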
 
Robert Kern

What is a distribution? That term already means something in Python
jargon, and it doesn't match the rest of your use case.

So what do you mean by “distribution”? Maybe we can find a less
confusing term.

Judging from the context, he means a probability distribution.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
 
Robert Kern

Hi all,
I am doing a simulation project using Python. In my project, I want
to use some sort of distribution to generate requests to a server.
The requests should follow two distributions: one for the request arrival
rate (should be Poisson) and another for the request mix (i.e. out of the
total requests defined by the request arrival rate, how many requests are
of which type).
Example: Suppose the request rate is 90 req/sec (generated using a
Poisson distribution)

Just a note on terminology to be sure we're clear: a Poisson *distribution*
models the number of arrivals in a given time period if the events are from a
Poisson *process* with a given mean rate. To model the inter-event arrival
times, you use an exponential distribution. If you want to handle events
individually in your simulation, you will need to use the exponential
distribution to figure out the exact times for each. If you are handling all of
the events in each second "in bulk" without regard to the exact times or
ordering within that second, then you can use a Poisson distribution.
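For the "in bulk" case, here is a minimal sketch, assuming one-second batches. The helper names `arrivals_in_one_second` and `request_mix` and the string labels are invented for illustration. The count is obtained by adding up exponential gaps until one second is used up, which gives a Poisson-distributed count; the mix is a weighted draw per request.

```python
import random
from collections import Counter


def arrivals_in_one_second(avg_rate, rng):
    """Count Poisson-process arrivals in one second by summing exponential gaps."""
    count = 0
    t = rng.expovariate(avg_rate)  # time of the first arrival
    while t < 1.0:
        count += 1
        t += rng.expovariate(avg_rate)
    return count


def request_mix(total, weights, rng):
    """Split `total` requests among the types according to relative weights."""
    kinds = list(weights)
    drawn = rng.choices(kinds, weights=[weights[k] for k in kinds], k=total)
    tally = Counter(drawn)
    return {k: tally[k] for k in kinds}


rng = random.Random(2012)  # seeded for repeatability
weights = {"r1": 50, "r2": 30, "r3": 10}
for second in range(3):
    n = arrivals_in_one_second(90.0, rng)
    print(second, n, request_mix(n, weights, rng))
```

Each printed dict has the shape asked for in the question, with counts that vary from second to second around the 50/30/10 proportions.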

at time t and we have 3 types of requests (i.e.
r1, r2, r3). The request mix distribution output should be similar to:
{r1 : 50 , r2 : 30 , r3 : 10} (i.e. out of 90 requests, 50 are of type
r1, 30 are of type r2 and 10 are of type r3).
As I am new to the Python distribution module, I am not getting how to
code this situation. Please help me out with this.

I am going to assume that you want to handle each event independently. A basic
strategy is to keep a time variable starting at 0 and use a while loop until the
time reaches the end of the simulation time. Increment it using a draw from the
exponential distribution each loop. Each iteration of the loop is an event. To
determine the kind of event, you will need to draw from a weighted discrete
distribution. What you want to do here is to do a cumulative sum of the weights,
draw a uniform number from 0 to the total sum, then use bisect to find the item
that matches.

    import bisect
    import random


    # Use a seeded PRNG for repeatability. Use the methods on the Random
    # object rather than the functions in the random module.
    prng = random.Random(1234567890)

    avg_rate = 90.0 # reqs/sec

    # Relative weights of the three request kinds (indices 0, 1, 2).
    kind_weights = [50.0, 30.0, 10.0]
    kind_cumsum = [sum(kind_weights[:i+1]) for i in range(len(kind_weights))]
    kind_max = kind_cumsum[-1]

    max_time = 10.0 # sec
    events = [] # (t, kind)
    # Draw the first arrival time too, so that no event is forced at t = 0.
    t = prng.expovariate(avg_rate) # sec
    while t < max_time:
        u = prng.uniform(0.0, kind_max)
        # Index of the first cumulative weight that is >= u.
        kind = bisect.bisect_left(kind_cumsum, u)
        events.append((t, kind))
        t += prng.expovariate(avg_rate)


 
Dennis Lee Bieber

Judging from the context, he means a probability distribution.
Furthermore, he explicitly mentions Poisson later in the text.

That should just be a case of finding the mathematical definition of
the distribution and coding something using random (since the random
module has gaussian/normal, but not Poisson -- maybe numpy/simpy?) to
obtain the time variant.
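One way to code this with only the random module is Knuth's multiplication method, sketched here as an illustration. `poisson_variate` is a name made up for this example, not a standard-library function.

```python
import math
import random


def poisson_variate(lam, rng):
    """Draw one Poisson(lam) variate using Knuth's multiplication method.

    Multiply uniform random numbers together until the product drops to
    exp(-lam) or below; the number of extra factors needed is the result.
    Intended for moderate lam: for lam of several hundred and up,
    exp(-lam) underflows and the results become unreliable.
    """
    limit = math.exp(-lam)
    k = 0
    p = rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


rng = random.Random(90)  # seeded for repeatability
samples = [poisson_variate(90.0, rng) for _ in range(5)]
print(samples)  # five request counts for five one-second intervals
```

The running time grows linearly with lam, which is acceptable at 90 requests per second.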

I suspect the real question is how to handle the weighted selection
of r1-r3. And for such a relatively short sample set (90 entries), the
chunked list shown previously, with randint to index into it is viable.

The alternative, more generalized (in a way), would be to use a
sorted list of the probabilities (or sums)... (pseudo-code)

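    import random

    numR1, numR2, numR3 = 50, 30, 10   # example counts from the question
    r1, r2, r3 = "r1", "r2", "r3"      # placeholder request types

    probs = [ (numR1, r1),
              (numR1 + numR2, r2),
              (numR1 + numR2 + numR3, r3) ]
    # obviously one is unlikely to actually create constants for the numRs;
    # more likely to create them in line
    probs.sort()

    # randint needs both bounds and includes them: 1..90 here
    ran = random.randint(1, numR1 + numR2 + numR3)
    for (p, r) in probs:
        if ran > p: continue
        theR = r
        break   # stop at the first cumulative sum that covers ran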
 
Laurent Claessens

On 20/03/2012 12:21, Ben Finney wrote:
I guess scipy is also available in plain python (didn't check), but the
following works with Sage :

----------------------------------------------------------------------
| Sage Version 4.8, Release Date: 2012-01-20 |
| Type notebook() for the GUI, and license() for information. |
----------------------------------------------------------------------
sage: from scipy import stats
sage: X=stats.poisson.rvs
sage: X(4)
5
sage: X(4)
2
sage: X(4)
3


Hope it helps
Laurent
 
