Hi Alex,
Anybody catches any other ways to improve my program (attached), you are
most welcome. Help me learn, that is one of the objectives of this
newsgroup, right? Or is it all about exchanging the next to impossible
solution to the never to happen unreal world problems?
I don't know what a concordance table is, and I haven't
looked a lot into your program, but anyway here are some
things I noticed at a glance:
| #! usr/bin/env python
| # Merge log files to autolog file
| import os
| import fileinput
| #top='C:\\Documents and Settings\\avanderspek\\My Documents\\CiDRAdata\\Syncrude\\CSL\\August2010'
| top='C:\\Users\\ZDoor\\Documents\\CiDRA\\Syncrude\CSL\\August2010'
If you have backslashes in strings, you might want to use
"raw strings". Instead of "c:\\Users\\ZDoor" you'd write
r"c:\Users\ZDoor" (notice the r in front of the string).
| i,j,k=0,0,0
| date={}
I suggest to use more spacing to make the code more
readable. Have a look at
http://www.python.org/dev/peps/pep-0008/
for more formatting (and other) tips.
| fps=0.3048
| tab='\t'
|
| bt='-999.25'+'\t''-999.25'+'\t''-999.25'+'\t''-999.25'+'\t'+'-999.25'
If these numbers are always the same, you should use
something like
NUMBER = "-999.25"
COLUMNS = 5
bt = "\t".join(COLUMNS * [NUMBER])
(with better naming, of course).
Why don't you use `tab` here?
I _highly_ recommend to use longer (unabbreviated) names.
| al='Status'+'\t'+'State'+'\t'+'-999.25'
|
| for root,dirs,files in os.walk(top):
| #Build a concordance table of days on which data was collected
| for name in files:
| ext=name.split('.',1)[1]
There's a function `splitext` in `os.path`.
| if ext=='txt':
| dat=name.split('_')[1].split('y')[1]
| if dat in date.keys():
You can just write `if dat in date` (in Python versions >=
2.2, I think).
| date[dat]+=1
| else:
| date[dat]=1
| print 'Concordance table of days:'
| print date
| print 'List of files processed:'
| #Build a list of same day filenames, 5 max for a profile meter,skip first and last days
| for f in sorted(date.keys())[2:-1]:
| logs=[]
| for name in files:
| ext=name.split('.')[1]
| if ext=='txt':
| dat=name.split('_')[1].split('y')[1]
I guess I'd move the parsing stuff (`x.split(s)
` etc.)
into small functions with meaningful names. After that I'd
probably notice there's much redundancy and refactor them. 
| if dat==f:
| logs.append(os.path.join(root,name))
| #Open the files and read line by line
| datsec=False
| lines=[[] for i in range(5)]
One thing to watch out for: The above is different from
`[[]] * 5` which uses the _same_ empty list for all entries.
Probably the semantics you chose is correct.
| fn=0
| for line in fileinput.input(logs):
| if line.split()[0]=='DataID':
| datsec=True
| ln=0
| if datsec:
| lines[fn].append(line.split())
| ln+=1
| if ln==10255:
This looks like a "magic number" and should be turned into a
constant.
| datsec=False
| fileinput.nextfile()
| fn+=1
| print fileinput.filename().rsplit('\\',1)[1]
| fileinput.close()
| aut='000_AutoLog'+f+'.log'
| out=os.path.join(root,aut)
| alf=open(out,'w')
| alf.write('Timestamp (mm/dd/yyyy hh:mm:ss) VF 1 VF 2 VF 3 VF 4 VF 5 Q 1 Q 2 Q 3 Q 4 Q 5 Status State Metric Band Temperature 1 Band Temperature 2 Band Temperature 3 Band Temperature 4 Band Temperature 5 SPL 1 SPL 2 SPL 3 SPL 4 SPL 5'+'\n')
| for wn in range(1,10255,1):
You don't need to write the step argument if it's 1.
| for i in range(5):
| lines[wn][2]=str(float(lines[wn][2])/fps)
| tp=lines[0][wn][0]+' '+lines[0][wn][1]
| vf=tab.join([lines[wn][2] for i in range(5)])
| vq=tab.join([lines[wn][3] for i in range(5)])
| vs=tab.join([lines[wn][4] for i in range(5)])
| #sf=tab.join([lines[wn][5] for i in range(5)])
| #sq=tab.join([lines[wn][6] for i in range(5)])
| #ss=tab.join([lines[wn][7] for i in range(5)])
Maybe use an extra function?
def choose_a_better_name():
return tab.join([lines[index][wn][2] for index in range(5)])
Moreover, the repetition of this line looks as if you wanted
to put the right hand sides of the assignments in a list,
instead of assigning to distinct names (`vf` etc.).
By the way, you use the number 5 a lot. I guess this should
be a constant, too.
| alf.write(tp+'\t'+vf+'\t'+vq+'\t'+al+'\t'+bt+'\t'+vs+'\n')
Suggestion: Use
tab.join([tp, vf, vq, al, bt, vs]) + "\n"
Again, not using distinct variables would have an advantage
here.
| alf.close()
| print "Done"
Stefan