Why does this choke?

S Kemplay

Hi all,

I wrote a script to choose random dates for a statistics assignment.
I only need to choose 30 dates from one year with no leap years and it works
fine. However I tested with different numbers of dates. It hangs from 450 up.
I only need 30 dates but it would be good to know why it hangs. (My coding
probably has something to do with it :))


import random

def getmonth():
    month = random.randint(1,12)
    return month

def getday(month, leaps):
    thirtyones = [1,3,5,7,8,10,12]
    thirties = [4,6,9,11]
    if month in thirtyones:
        day = random.randint(1,31)
    elif month in thirties:
        day = random.randint(1,30)
    else:
        if leaps == 1: leap = random.randint(1,4)
        else: leap = 1
        if leap in [2,3,4]:
            day = random.randint(1,29)
        else:
            day = random.randint(1,28)
    return day

def getdates(n, leaps):
    dates = []
    i = 0
    while i < n:
        month = getmonth()
        day = getday(month, leaps)
        if (day, month) in dates:
            continue
        i += 1
        dates.append((day, month))
    return dates


Thanks

Sean Kemplay
 
Dennis Lee Bieber

S Kemplay fed this fish to the penguins on Friday 07 November 2003
03:29 am:
> I wrote a script to choose random dates for a statistics assignment.
> I only need to choose 30 dates from one year with no leap years and it

"... with no leap years..." so why do you have that mess for leap year
February length?

> works fine. However I tested with different numbers of dates. It hangs
> from 450 up. I only need 30 dates but it would be good to know why it
> hangs. (My coding probably has something to do with it :))

You don't show the main program invocation... However... How much time
did you let the program run? Your code is asking for UNIQUE random
dates -- no duplicates (which doesn't seem to be something I'd expect
in a statistical test). As a result, you have this growing list of
dates that has to be compared each time you generate a new date. For a
short list this may not be noticeable, but for longer lists it takes
time!

Oh! That's it -- you are generating 450 UNIQUE DATES, but there are a
maximum of 366 possible (since you don't keep the year)... The loop
will NEVER END since once the first 366 are used, it can not generate a
valid unique date.
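
To make that limit concrete, here is a minimal sketch (not from the original post) of the same kind of rejection loop with a guard in front of it. The names possible_dates and unique_dates are made up for illustration, and the sketch assumes a non-leap year, so only 365 distinct (day, month) pairs exist. It keeps the month-then-day draw of the original script.

```python
import random

# Days per month for a non-leap year (assumption: Feb 29 is excluded).
DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

def possible_dates():
    """Return every (day, month) pair of a non-leap year."""
    return [(day, month)
            for month, length in enumerate(DAYS_IN_MONTH, start=1)
            for day in range(1, length + 1)]

def unique_dates(n):
    """Pick n distinct random dates, refusing requests that can never finish."""
    limit = len(possible_dates())
    if n > limit:
        # Without this check the loop below would spin forever.
        raise ValueError("asked for %d unique dates, only %d exist" % (n, limit))
    dates = []
    while len(dates) < n:
        month = random.randint(1, 12)
        day = random.randint(1, DAYS_IN_MONTH[month - 1])
        if (day, month) not in dates:
            dates.append((day, month))
    return dates
```

With this guard, unique_dates(30) returns 30 distinct pairs, while unique_dates(450) raises ValueError straight away instead of hanging.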

You are also generating an awful lot of random numbers... Try the
following (watch out for line wraps)

-=-=-=-=-=-=-=-=-

import random

# start with MARCH to avoid the confusion of 365 vs 366 (leap)
MONTHS = [ 'Mar', 'Apr', 'May',
           'Jun', 'Jul', 'Aug',
           'Sep', 'Oct', 'Nov',
           'Dec', 'Jan', 'Feb' ]

# a list of the start day for each month
LENGTH = [   1,  32,  62,
            93, 123, 154,
           185, 215, 246,
           276, 307, 338 ]

def RandomDate():
    # generate a random date index within a normal year
    dayindex = random.randint(1, 365)
    # check for a leap year date (though your requirements
    # say "no leap years", and without having a year attached
    # this just confuses statistics)
    if random.randint(1, 4) == 4:
        dayindex += 1

    monthindex = 0
    while monthindex < 11:
        # scan LENGTH to find which month
        if dayindex < LENGTH[monthindex]: break
        monthindex += 1

    monthindex -= 1 # adjust for last loop

    month = MONTHS[monthindex]
    day = dayindex - LENGTH[monthindex]

    return (month, day)


if __name__ == "__main__":
    dates = []
    while len(dates) < 450: # if the limit is >365, do not use UNIQUE
        aDate = RandomDate()
        dates.append(aDate) # replace this with next two for UNIQUE
        # if aDate not in dates:
        #     dates.append(aDate)


    print(dates)



Dennis Lee Bieber

Dennis Lee Bieber fed this fish to the penguins on Friday 07 November
2003 08:55 am:

Obviously I wasn't fully awake when I wrote that -- a few bugs were
left... I think this one is more suited:

import random

# start with MARCH to avoid the confusion of 365 vs 366 (leap)
MONTHS = [ 'Mar', 'Apr', 'May',
           'Jun', 'Jul', 'Aug',
           'Sep', 'Oct', 'Nov',
           'Dec', 'Jan', 'Feb' ]

# a list of the start day for each month
LENGTH = [   1,  32,  62,
            93, 123, 154,
           185, 215, 246,
           276, 307, 338 ]

def RandomDate():
    # generate a random date index within a year
    if random.randint(1, 4) == 4:
        dayindex = random.randint(1, 366)
    else:
        dayindex = random.randint(1, 365)

    monthindex = 0
    while monthindex < 12:
        # scan LENGTH to find which month
        if dayindex < LENGTH[monthindex]: break
        monthindex += 1

    monthindex -= 1 # adjust for last loop

    month = MONTHS[monthindex]
    day = dayindex - LENGTH[monthindex] + 1

    return (month, day)


if __name__ == "__main__":
    dates = []
    while len(dates) < 450: # if the limit is >365, do not use UNIQUE
        aDate = RandomDate()
        dates.append(aDate) # replace this with next two for UNIQUE
        # if aDate not in dates:
        #     dates.append(aDate)


    print(dates)
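
As a side note, the scan over the start-day table can also be written with the standard bisect module. This is only a sketch of an alternative lookup, not what the post uses; START is the same table as LENGTH under a different (made-up) name, and date_from_index is a hypothetical helper.

```python
import bisect

MONTHS = ['Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug',
          'Sep', 'Oct', 'Nov', 'Dec', 'Jan', 'Feb']

# First day index of each month, counting from 1 March.
START = [1, 32, 62, 93, 123, 154, 185, 215, 246, 276, 307, 338]

def date_from_index(dayindex):
    """Map a day index (1..366, counted from 1 March) to (month, day)."""
    # bisect_right counts how many month starts are <= dayindex;
    # subtracting 1 gives the position of the month containing it.
    monthindex = bisect.bisect_right(START, dayindex) - 1
    return (MONTHS[monthindex], dayindex - START[monthindex] + 1)
```

For example, date_from_index(1) gives ('Mar', 1), date_from_index(338) gives ('Feb', 1), and date_from_index(366) gives ('Feb', 29), which is why starting the table at March keeps the leap day at the very end.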

S Kemplay

Hi Dennis,

Thanks for your advice,

I really like the way you handled the dates and can see your script generates
fewer random numbers for each date generated.

It still has to check whether the date is in the list (for unique dates). I
guess breaking a big list down into shorter lists would help?

Sean Kemplay
 
Dennis Lee Bieber

S Kemplay fed this fish to the penguins on Saturday 08 November 2003
21:23 pm:

> It still has to check whether the date is in the list (for unique
> dates). I guess breaking a big list down into shorter lists would
> help?

Probably not -- you'd still have to make sure you didn't have common
dates crossing any lists.

For unique dates, you have two concerns: 1) you can not ask for more
dates than exist in a year (I'd limit it to 365, as 366 could mean the
last date needed is Feb 29, which occurs one fourth as often as any
other); 2) the time needed to determine uniqueness.
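
The "one fourth as often" figure can be checked with a little exact arithmetic. The sketch below assumes the draw used by the RandomDate() posted earlier in the thread: one time in four the day index comes from 1..366, otherwise from 1..365. The variable names are only illustrative.

```python
from fractions import Fraction

# Feb 29 is only reachable when the 1-in-4 "leap" branch is taken
# and the index 366 is drawn from 1..366.
p_feb29 = Fraction(1, 4) * Fraction(1, 366)

# Any other date can come from either branch.
p_other = Fraction(3, 4) * Fraction(1, 365) + Fraction(1, 4) * Fraction(1, 366)

ratio = p_feb29 / p_other
print(p_feb29)       # 1/1464
print(ratio)         # 365/1463
print(float(ratio))  # roughly 0.2495, i.e. about one fourth
```

Since Feb 29 turns up only once in 1464 draws on average, a loop that waits for all 366 dates would typically spend most of its time waiting for that single date.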

For the latter, and only if you will never run with duplicate dates,
changing from a list to a dictionary might be faster -- the dates, as
tuples, are hashable, and can become keys into a dictionary for rapid
look-up.

import random

# start with MARCH to avoid the confusion of 365 vs 366 (leap)
MONTHS = [ 'Mar', 'Apr', 'May',
           'Jun', 'Jul', 'Aug',
           'Sep', 'Oct', 'Nov',
           'Dec', 'Jan', 'Feb' ]

# a list of the start day for each month
LENGTH = [   1,  32,  62,
            93, 123, 154,
           185, 215, 246,
           276, 307, 338 ]

def RandomDate():
    # generate a random date index within a year
    if random.randint(1, 4) == 4:
        dayindex = random.randint(1, 366)
    else:
        dayindex = random.randint(1, 365)

    monthindex = 0
    while monthindex < 12:
        # scan LENGTH to find which month
        if dayindex < LENGTH[monthindex]: break
        monthindex += 1

    monthindex -= 1 # adjust for last loop

    month = MONTHS[monthindex]
    day = dayindex - LENGTH[monthindex] + 1

    return (month, day)


if __name__ == "__main__":
    dates = {}
    while len(dates) < 365: # keep the limit at 365 or below for UNIQUE dates
        aDate = RandomDate()
        if aDate in dates:
            dates[aDate] += 1 # just for fun, keep count of duplicates
        else:
            dates[aDate] = 1

    print(dates)
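
If the dates must be unique and leap years really are out of scope, another option is to skip the uniqueness check altogether and draw without replacement from a list of every date. This is a different technique from the dictionary loop in the post, offered only as a sketch; ALL_DATES and sample_unique_dates are hypothetical names, and a non-leap year is assumed.

```python
import random

# Month names and lengths for a non-leap year (assumption: no Feb 29).
DAYS_IN_MONTH = [('Jan', 31), ('Feb', 28), ('Mar', 31), ('Apr', 30),
                 ('May', 31), ('Jun', 30), ('Jul', 31), ('Aug', 31),
                 ('Sep', 30), ('Oct', 31), ('Nov', 30), ('Dec', 31)]

# Every (month, day) pair of the year, 365 in total.
ALL_DATES = [(month, day)
             for month, length in DAYS_IN_MONTH
             for day in range(1, length + 1)]

def sample_unique_dates(n):
    """Return n distinct random dates, each equally likely.

    random.sample raises ValueError when n is larger than the
    population, so a request for 450 dates fails instead of hanging.
    """
    return random.sample(ALL_DATES, n)
```

Calling sample_unique_dates(30) covers the assignment's case, and every date has the same chance of being picked, which the month-first approach in the original script does not guarantee.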


