WIWA
Hi,
I'm writing an application to analyse my Apache access_log file. In
the script below (which is based on an example I found in 'How to
think as a Programmer') I want to count the number of hits per
hour. I know it is not the best algorithm, but it works for now.
I see something strange: when the access_log is fairly small (roughly
fewer than 4000 entries), my script works fine. Above this limit, I
get the following error:
Traceback (most recent call last):
  File "hits_per_uur.py", line 18, in
    lijst.append(int(datum[1]))
IndexError: list index out of range
Question: do lists have a size limit? Does anyone know how I can change
this simple script so that it works for more entries as well?
------------------------------------------------------------------------------
import sys

def inbucket(lijst, low, high):
    # Count the values in lijst that fall in the range low <= value < high.
    count = 0
    for num in lijst:
        if low <= num < high:
            count = count + 1
    return count

f = open('c:/summary.txt', 'a', 1)
f.write("Hits per uur" + "\n")
lijst = []
data = sys.stdin.readlines()
for line in data:
    words = line.split()
    # The full date looks like [16/Sep/2003:05:22:57 +0200]; words[3] is the
    # part before the space, and splitting it on ':' puts the hour in datum[1].
    datum = words[3].split(':')
    lijst.append(int(datum[1]))
numbuckets = 24
buckets = [0] * numbuckets
bucketwidth = 24 // numbuckets
for i in range(numbuckets):
    low = i * bucketwidth
    high = low + bucketwidth
    buckets[i] = inbucket(lijst, low, high)  # store the count for bucket i
for num in range(len(buckets)):
    output = str(num) + "\t" + str(buckets[num]) + "\n"
    f.write(output)
f.close()
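
For comparison, here is a minimal sketch of the same count that skips any line whose fourth field does not split into a timestamp. It rests on an assumption I have not verified against the real log: that the IndexError comes from an odd line somewhere past the first few thousand entries (datum[1] fails whenever words[3] contains no ':'), and not from a limit on list size. The function name hits_per_hour and the sample lines are made up for illustration.

```python
def hits_per_hour(lines):
    """Return (buckets, skipped): hits per hour 0-23 and the number of
    lines that could not be parsed."""
    buckets = [0] * 24
    skipped = 0
    for line in lines:
        words = line.split()
        # The timestamp is expected in the 4th field: [16/Sep/2003:05:22:57
        if len(words) < 4:
            skipped += 1
            continue
        datum = words[3].split(':')
        # Without a ':' in the field there is no datum[1] to read.
        if len(datum) < 2 or not datum[1].isdigit():
            skipped += 1
            continue
        hour = int(datum[1])
        if 0 <= hour < 24:
            buckets[hour] += 1
        else:
            skipped += 1
    return buckets, skipped


# Hypothetical sample data: three normal entries and one malformed line.
sample = [
    '127.0.0.1 - - [16/Sep/2003:05:22:57 +0200] "GET / HTTP/1.0" 200 1234',
    '127.0.0.1 - - [16/Sep/2003:05:40:01 +0200] "GET /a HTTP/1.0" 200 99',
    '127.0.0.1 - - [16/Sep/2003:23:59:59 +0200] "GET /b HTTP/1.0" 404 0',
    'this line is not a normal log entry',
]
buckets, skipped = hits_per_hour(sample)
for hour, count in enumerate(buckets):
    if count:
        print(str(hour) + "\t" + str(count))
print("skipped: " + str(skipped))
```

Counting directly into buckets[hour] also avoids rescanning the whole list once per bucket, and the skipped total shows how many lines did not match the expected layout.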