Cropping log files

Kamus of Kadizhar

Yet another possible newbie question.

I'm trying to figure out how best to crop a log file. I have a file
that grows infinitely in length. I want to keep only the last n entries
in the file.

I've been looking for some sort of ASCII-database or log file management
Python module that lets me ask: how many records are in the file? and
then say: delete the first nr - n records.

No joy.

I don't want to suck the whole file into memory if I can help it, but I
can't help thinking that doing a

nr = 0
for line in file(logfile):
    nr += 1

to count the number of lines, then reading the file again, discarding
the first nr - n records, writing the rest to a temp file, and then
renaming the files, is not the most efficient way to go.
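
Spelled out, the whole dance would be something like this (a rough
sketch; crop() is just a name I made up for it):

import os

def crop(logfile, n):
    # First pass: count the records.
    nr = 0
    for line in open(logfile):
        nr += 1
    # Second pass: copy everything after the first nr - n records.
    skip = max(nr - n, 0)
    out = open(logfile + '.tmp', 'w')
    for i, line in enumerate(open(logfile)):
        if i >= skip:
            out.write(line)
    out.close()
    os.rename(logfile + '.tmp', logfile)  # atomic rename on POSIX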

Not only that, but what happens if the logfile is written to while I'm
doing all of this - I may lose log file entries....

I found FLAD, but even that seems to be overkill for what I need. So,
are there any modules out there for log file management?

The logging module lets me log events, but there aren't any tools for
managing the log files created in the way I need.... The
RotatingFileHandler rotates logs in the sense of logrotate, but what I
need is to keep only the last n records in one file.

-Kamus
 
Duncan Booth

Kamus of Kadizhar said:
The logging module lets me log events, but there aren't any tools for
managing the log files created in the way I need.... The
RotatingFileHandler rotates logs in the sense of logrotate, but what I
need is to keep only the last n records in one file.

Keeping a fixed number of records in one file is difficult if the records
aren't a fixed length, which is why most logging systems limit the number
of records that are kept by the size of the data, rather than by the number
of records.

If you can modify your requirements to say that you need to keep 'at least
n records', then the rotating file logger will do that: just make sure that
the total size kept across all the files is at least n times the largest
record you might log. Of course, the records will then be split across
multiple files.
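
Something along these lines, for instance (the 512-byte record bound is
just an assumption; substitute whatever your largest record really is):

import logging, logging.handlers

N = 1000           # 'at least n records'
MAX_RECORD = 512   # assumed upper bound on one formatted record

handler = logging.handlers.RotatingFileHandler(
    'app.log',
    maxBytes=N * MAX_RECORD,  # each file holds at least N records when full
    backupCount=1)            # keep app.log plus one rotated file, app.log.1

logger = logging.getLogger('app')
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info('once N records exist, at least the last N always survive')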

If you really need to keep n records in one file, then the best thing is
probably to write your own handler for the logging module and store the log
records in a fixed length format in the file.

i.e. you choose a size larger than your largest log record and pad each log
record to that length with nulls. Then you just write out to the file as
though it were a circular buffer. You need some way of finding the first
record again; one way is to always ensure you have at least one empty
record in the file. When you first open the file, you scan to the first
empty record; then, each time you write to the file, you write the new
record followed by a new empty record, and skip back so the empty record
will be overwritten next time. When you reach record number n+1 in the file
you simply wrap back to the start. You'll also want to write code to read a
logfile starting just after the empty record.
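
A rough sketch of that scheme (RECORD_SIZE is the assumed upper bound on
one record; the file holds n record slots plus the one empty marker slot):

import os

RECORD_SIZE = 256            # assumed upper bound on one log record
EMPTY = '\0' * RECORD_SIZE   # an all-null slot marks the write position

class CircularLog:
    def __init__(self, path, n):
        self.slots = n + 1   # n records plus one empty marker slot
        if not os.path.exists(path):
            f = open(path, 'wb')
            f.write(EMPTY * self.slots)  # pre-allocate the whole file
            f.close()
        self.f = open(path, 'r+b')
        self.pos = self._find_empty()

    def _find_empty(self):
        # Scan to the first empty record, as described above.
        for i in range(self.slots):
            self.f.seek(i * RECORD_SIZE)
            if self.f.read(RECORD_SIZE) == EMPTY:
                return i
        return 0

    def write(self, record):
        record = record[:RECORD_SIZE]
        record = record + '\0' * (RECORD_SIZE - len(record))  # pad with nulls
        self.f.seek(self.pos * RECORD_SIZE)
        self.f.write(record)                     # overwrite the empty slot
        self.pos = (self.pos + 1) % self.slots   # wrap past the last slot
        self.f.seek(self.pos * RECORD_SIZE)
        self.f.write(EMPTY)                      # write the new empty marker
        self.f.flush()

    def read_all(self):
        # Records run from just after the empty slot around to just before it.
        records = []
        for i in range(1, self.slots):
            self.f.seek(((self.pos + i) % self.slots) * RECORD_SIZE)
            data = self.f.read(RECORD_SIZE).rstrip('\0')
            if data:
                records.append(data)
        return records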

If you can't set a hard upper bound on the size of a record, you will need
a more complicated file structure; or, if that 'n' doesn't absolutely have
to be set in stone, simply allow occasional long records to wrap over into
the next record and handle that case in both the writing and reading code.

The NTEventLogHandler (for those of us on Windows systems) does pretty much
this last option: it uses event log files which are a fixed length and wrap
around, but individual records can span more than one block in the file, so
the number of entries in a logfile is actually an upper limit.
 
Kamus of Kadizhar

Duncan said:
If you can modify your requirements to say that you need to keep 'at least
n records', then the rotating file logger will do that: just make sure that
the total size kept across all the files is at least n times the largest
record you might log. Of course, the records will then be split across
multiple files.

You know, that just might work. I need more-or-less n records; the
number is not critical.

So, if I allow the rotating logger a max file size that accommodates
approx. n/4 records, tell it to keep older files to a depth of 6, then
read older log files until I get some number of records greater than n....
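
Something like this, I suppose (a rough sketch against
RotatingFileHandler's naming scheme, where app.log is current and
app.log.1 ... app.log.6 get older as the suffix grows):

import os

def last_n_records(basename, depth, n):
    lines = []
    # Oldest rotated file first, so the newest lines end up last.
    for i in range(depth, 0, -1):
        name = '%s.%d' % (basename, i)
        if os.path.exists(name):
            lines.extend(open(name).readlines())
    lines.extend(open(basename).readlines())
    return lines[-n:]

tail_lines = last_n_records('app.log', 6, 1000)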

I just have bad experience with letting log files get to infinite
length... :)
 
Colin Brown

....
I'm trying to figure out how best to crop a log file. I have a file
that grows infinitely in length. I want to keep only the last n entries
in the file.
....

An alternative solution may be to write date-based files and delete files
older than some limit.
Here is an example (class HourlyLog) of date-based hourly logging with
day-based cleanup:

------------------------------------------------------
import glob, os, time

def Now():  # UTC time now as a string
    return time.asctime(time.gmtime())

def DD_HH():  # 2 digit UTC day & hour
    return time.strftime("%d_%H", time.gmtime())

def Hhmmss():  # "hh:mm:ss" string
    return time.strftime("%H:%M:%S", time.gmtime())

def DaysAgo(number):  # time 'number' days ago
    return time.time() - (86400 * number)

class Hour:
    def __init__(self):
        self.hour = time.gmtime()[3]

    def change(self):  # has the hour rolled over since the last call?
        hour = time.gmtime()[3]
        if hour != self.hour:
            self.hour = hour
            return 1
        else:
            return 0

# filename must contain '##_##' for the 2 digit day and hour fields
def FileHourLog(f, s):  # append string to hourly logfile
    # (nb: win32 does lf -> crlf, hence the newline normalisation below)
    name = DD_HH().join(f.split('##_##', 1))  # substitute day_hour stamp
    try:
        ff = open(name, 'a')
    except IOError:
        ff = open(name, 'w')
    ff.write(''.join(['-----< ', Now(), ' -----', '\n',
                      s.replace('\r\n', '\r').replace('\r', '\n'), '\n']))
    ff.close()

def FileBefore(f, at):  # return true if file modification time is before 'at'
    try:
        if os.stat(f)[8] < at:
            return 1
        else:
            return 0
    except OSError:
        return 0  # may not exist in multi-threaded app.

def FileTidy(files, at):  # delete files older than 'at'
    for name in files:
        try:
            if FileBefore(name, at):
                os.remove(name)
        except OSError:
            pass  # may not exist in multi-threaded app.

def FilesMatching(path, pattern):  # return a list of files matching pattern.
    return glob.glob(os.path.join(path, pattern))

class HourlyLog:  # filename as per FileHourLog; number of days to keep
    def __init__(self, f, n):
        self.hour = Hour()
        self.f = f
        self.n = n

    def log(self, s):
        FileHourLog(self.f, s)
        if self.hour.change():
            [path, name] = os.path.split(self.f)
            pattern = name.replace('#', '?')  # glob pattern for stamped files
            folder = FilesMatching(path, pattern)
            FileTidy(folder, DaysAgo(self.n))
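
Usage is then just (made-up filename; '##_##' becomes the day_hour stamp,
and files older than 7 days get removed on the hour change):

log = HourlyLog('myapp_##_##.log', 7)
log.log('something happened at ' + Hhmmss())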
 
Robin Munn

Kamus of Kadizhar said:
Yet another possible newbie question.

I'm trying to figure out how best to crop a log file. I have a file
that grows infinitely in length. I want to keep only the last n entries
in the file.
[snip]

I don't want to suck the whole file into memory if I can help it, but I
can't help thinking that doing a

nr = 0
for line in file(logfile):
    nr += 1

to count the number of lines, then reading the file again, discarding
the first nr - n records, writing the rest to a temp file, and then
renaming the files, is not the most efficient way to go.

Are you in a Unix environment here? If so, why not let the 'tail'
command do the work for you? Something like this:

import os

# Close any open file objects pointing to the log file
long_log_filename = '/var/log/lotsoflines.log'
short_log_filename = '/var/log/lotsoflines.log.truncated'
lines_to_keep = 250
cmd = 'tail -%(lines)d %(long)s > %(short)s && mv %(short)s %(long)s' % {
    'lines': lines_to_keep,
    'short': short_log_filename,
    'long': long_log_filename,
}
os.system(cmd)
# Now reopen your file objects

Or if you really want to do it in pure Python, then have a look at the
source for 'tail' and see how it finds the last N lines. In most of my
experiments, I've found that 'tail' grabs the last 10 lines out of a
1-gigabyte log file within milliseconds, while 'wc -l' takes a lot
longer to count the whole file. That's not very scientific evidence, of
course, but why don't you try it and see for yourself?
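
If you want to experiment, the core of the trick looks something like
this (a rough sketch, not the real 'tail' source): seek to the end, then
read fixed-size chunks backwards until enough newlines have turned up.

def tail(filename, n, chunksize=4096):
    f = open(filename, 'rb')
    f.seek(0, 2)           # whence=2: seek relative to the end of the file
    end = f.tell()
    pos = end
    data = ''
    # n+1 newlines guarantee n complete lines, even if the first
    # chunk boundary lands in the middle of a line.
    while pos > 0 and data.count('\n') < n + 1:
        pos = max(0, pos - chunksize)
        f.seek(pos)
        data = f.read(end - pos)
    f.close()
    return data.splitlines(True)[-n:]   # keep the line endings

This never reads more than a few chunks for typical line lengths, no
matter how big the file is.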
 
Gerrit Holl

Robin said:
Or if you really want to do it in pure Python, then have a look at the
source for 'tail' and see how it finds the last N lines. In most of my
experiments, I've found that 'tail' grabs the last 10 lines out of a
1-gigabyte log file within milliseconds, while 'wc -l' takes a lot
longer to count the whole file. That's not very scientific evidence, of
course, but why don't you try it and see for yourself?

Maybe it .seeks() to .getsize()?

Gerrit.
 
