Hi,
I have an application which writes log files. If the log file
size is greater than, let's say, 1 MB, the application creates a new log
file with a sequence number. The log file names look like
mylogfile_mmddyy_1.txt, mylogfile_mmddyy_2.txt, ... without an upper
limit.
Don't do that.
Use a time stamp and a naming convention that follows a canonical
sort order, e.g. mylogfile_yyyy-mm-dd_hh-mm-ss.txt. The people who have to
service your application will greatly appreciate it. Furthermore, you should
prefer UTC time stamps for logging to avoid confusion with daylight saving time.
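A minimal sketch of how such a name could be rendered, assuming standard
C++ and nothing Windows-specific. The helper name makeLogFileName is made
up for illustration. Note that std::gmtime uses a shared internal buffer,
so a multi-threaded logger would need a thread-safe variant instead.

```cpp
#include <ctime>
#include <string>

// Hypothetical helper: renders "mylogfile_yyyy-mm-dd_hh-mm-ss.txt" in UTC.
// Because every field is zero-padded and ordered from year down to second,
// plain string comparison of two names equals chronological comparison.
std::string makeLogFileName(std::time_t t)
{
    const std::tm* utc = std::gmtime(&t);   // UTC, so no daylight saving jumps
    if (utc == nullptr)
        return std::string();               // time stamp not representable

    char buf[64];
    std::strftime(buf, sizeof buf, "mylogfile_%Y-%m-%d_%H-%M-%S.txt", utc);
    return buf;
}
```

Calling makeLogFileName(std::time(nullptr)) yields the name for the
current moment.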
Now the problem is: if my application gets restarted, I need to know
what the largest sequence number of my log files is.
Either always create a new log when the application gets restarted, or
drop the size limit and use a time limit instead. I would
recommend the latter. If your application is under heavy load, the files
grow larger. What's bad about that?
From the service point of view it is a big advantage to have a
deterministic relation between the file name (in fact something like a
primary key) and the content. And it is even better if the canonical
ordering of the file names corresponds to the logical order of the files.
I am thinking of
a loop from 1 to, say, 100000 that checks whether the file exists. The first
number for which it does not gives me the max sequence number I need.
From that you can see how bad the idea is. Everyone who searches for a
certain entry has to do the same loop, whether program or human.
In fact, in this case you have absolutely no advantage over putting all
logs of a day into a single file.
But this method looks
very awkward. Is there another way to do this (get the max number for a
series of similar files)?
No. And since most file systems do not maintain a defined sort order,
there is no cheaper solution in general. You could scan the entire
directory content, but that has the same order of complexity.
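If you are stuck with the sequence numbers, the directory scan could look
roughly like the sketch below. It assumes C++17 (std::filesystem) and that
the caller passes the fixed part of the name, e.g. "mylogfile_mmddyy_" with
the actual date filled in. The function names sequenceOf and maxSequence
are made up for illustration. It is still one pass over all entries.

```cpp
#include <cstddef>
#include <filesystem>
#include <string>

// Returns N for a name of the form "<prefix>N.txt", or -1 if the name
// does not match. At most 9 digits are accepted to avoid overflow.
long sequenceOf(const std::string& name, const std::string& prefix)
{
    const std::string suffix = ".txt";
    if (name.size() <= prefix.size() + suffix.size())
        return -1;                                   // no room for a digit
    if (name.compare(0, prefix.size(), prefix) != 0)
        return -1;
    if (name.compare(name.size() - suffix.size(), suffix.size(), suffix) != 0)
        return -1;

    const std::size_t first = prefix.size();
    const std::size_t last = name.size() - suffix.size();
    if (last - first > 9)
        return -1;

    long n = 0;
    for (std::size_t i = first; i < last; ++i) {
        if (name[i] < '0' || name[i] > '9')
            return -1;
        n = n * 10 + (name[i] - '0');
    }
    return n;
}

// Scans the directory once and returns the largest sequence number found,
// or 0 if no file matches. Cost is linear in the number of entries.
long maxSequence(const std::filesystem::path& dir, const std::string& prefix)
{
    long best = 0;
    for (const auto& entry : std::filesystem::directory_iterator(dir)) {
        const long n = sequenceOf(entry.path().filename().string(), prefix);
        if (n > best)
            best = n;
    }
    return best;
}
```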
My application runs on the Windows platform but does not use MFC
functions very much.
That makes no difference here.
Using rotating logs with a fixed time slice is straightforward to
implement, even in case of application restarts. You could use a
simple and fast hash function on the time stamp that controls log file
switches. Every time the hash changes, a virtual method that switches the
log could be invoked. Only this method implements the full rendering of
the file name scheme.
This makes it very easy to implement different cycle times with good
performance, e.g. once per week, once per day, or once per hour.
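A rough sketch of this idea, under the assumption that time_t counts
seconds since the epoch (true on Windows and POSIX, but not guaranteed by
the standard). The "hash" here is simply the index of the time slice. The
class names RotatingLog and NamedLog are made up for illustration.

```cpp
#include <ctime>
#include <string>

// Base class: decides WHEN to switch and knows nothing about file names.
class RotatingLog
{
public:
    explicit RotatingLog(long long periodSeconds)
        : period_(periodSeconds), slice_(-1) {}
    virtual ~RotatingLog() = default;

    // Call before each write. Costs one division and one comparison.
    void checkSwitch(std::time_t now)
    {
        const long long slice = static_cast<long long>(now) / period_;
        if (slice != slice_) {
            slice_ = slice;
            // Pass the START of the slice, so that a restart within the
            // same slice leads to the same file name again.
            switchLog(static_cast<std::time_t>(slice * period_));
        }
    }

protected:
    virtual void switchLog(std::time_t sliceStart) = 0;

private:
    long long period_;   // length of one time slice in seconds
    long long slice_;    // index of the current slice, -1 before first use
};

// Only this class knows the file name scheme.
class NamedLog : public RotatingLog
{
public:
    using RotatingLog::RotatingLog;
    std::string currentName;

protected:
    void switchLog(std::time_t sliceStart) override
    {
        char buf[64] = "";
        if (const std::tm* utc = std::gmtime(&sliceStart))
            std::strftime(buf, sizeof buf,
                          "mylogfile_%Y-%m-%d_%H-%M-%S.txt", utc);
        currentName = buf;   // a real logger would reopen its file here
    }
};
```

With a period of 3600 the log switches hourly, with 86400 daily. Since the
slices are counted from the epoch, a weekly period (604800) switches on
Thursdays at 00:00 UTC. Shifting that would need an offset added before
the division.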
And if you are even smarter, you could add functionality that cleans up
old logs automatically once they exceed a configured age. This prevents
the common issue of full volumes.
Again, a fixed relation between the file name and the content is helpful.
All you have to do is calculate the file name that corresponds to now
minus a configured period and delete all files in the folder whose names
compare less than this name and which match the pattern of your log files,
e.g. mylogfile_*.txt. You neither have to touch their content nor
parse the names.
Unfortunately this will always be O(n), so it should not be invoked too
often (e.g. once a day).
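The cleanup could be sketched like this, assuming C++17 (std::filesystem)
and the name scheme mylogfile_yyyy-mm-dd_hh-mm-ss.txt from above. The
function names isExpired and removeExpiredLogs are made up, and cutoffName
is the file name rendered for "now minus the configured period".

```cpp
#include <cstddef>
#include <filesystem>
#include <string>
#include <system_error>
#include <vector>

// True if the name matches mylogfile_*.txt and sorts before the cutoff name.
// Only a string comparison is needed; the time stamp is never parsed.
bool isExpired(const std::string& name, const std::string& cutoffName)
{
    const std::string prefix = "mylogfile_";
    const std::string suffix = ".txt";
    if (name.size() < prefix.size() + suffix.size())
        return false;
    if (name.compare(0, prefix.size(), prefix) != 0)
        return false;
    if (name.compare(name.size() - suffix.size(), suffix.size(), suffix) != 0)
        return false;
    return name < cutoffName;
}

// Deletes all expired log files in dir and returns how many were removed.
// Visits every directory entry once, so the cost is O(n).
std::size_t removeExpiredLogs(const std::filesystem::path& dir,
                              const std::string& cutoffName)
{
    // Collect first, delete afterwards, so the directory is not modified
    // while it is being iterated.
    std::vector<std::filesystem::path> victims;
    for (const auto& entry : std::filesystem::directory_iterator(dir)) {
        if (entry.is_regular_file() &&
            isExpired(entry.path().filename().string(), cutoffName))
            victims.push_back(entry.path());
    }

    std::size_t removed = 0;
    for (const auto& path : victims) {
        std::error_code ec;                       // ignore files still in use
        if (std::filesystem::remove(path, ec))
            ++removed;
    }
    return removed;
}
```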
Marcel