s99999999s2003
Hi,
I wrote some code to compare two files. One is the base file; the other
is a file I got from somewhere. I need to compare this file against the
base.
E.g. base file:
abc
def
ghi
E.g. another file:
abc
def
ghi
jkl
After the comparison, the base file should be overwritten with "jkl". Both
files also tend to grow beyond 20 MB.
Here is my code, using difflib:
    import difflib
    import re

    # I want to get rid of the "+ " marker in the difflib output.
    pat = re.compile(r'^\+ ')

    def difference(filename, basename):
        with open(basename) as base:
            a = base.readlines()
        with open(filename) as other:
            b = other.readlines()
        d = difflib.Differ()
        diff = list(d.compare(a, b))
        if len(diff) > 0:
            # Write a new base file holding only the added lines.
            with open(basename, "w") as o:
                for i in diff:
                    if pat.search(i):
                        o.write(i[2:])  # drop the leading "+ "
        with open(basename) as g:
            return g.readlines()
Whenever the two files get very large, the comparison becomes very slow.
Any advice on speeding things up? I thought of dropping readlines() and
comparing line by line instead. Would that be a better way?
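To make the line-by-line idea concrete, here is a rough sketch of what I have in mind. It does not use difflib at all: it assumes I only care about lines that never appear anywhere in the base file, so line order and duplicates are ignored, which is not the same thing as a real diff. The name new_lines is just a placeholder for illustration.

```python
import os
import tempfile


def new_lines(filename, basename):
    """Return the lines of `filename` that occur nowhere in `basename`.

    Hypothetical helper: set membership replaces difflib here, so this
    only works if position and repetition of lines do not matter.
    """
    with open(basename) as base:
        seen = set(base)              # one pass over the base file
    added = []
    with open(filename) as other:
        for line in other:            # stream the other file line by line
            if line not in seen:
                added.append(line)
    return added


if __name__ == "__main__":
    # Recreate the small example files from above in a temp directory.
    with tempfile.TemporaryDirectory() as tmp:
        base_path = os.path.join(tmp, "base.txt")
        other_path = os.path.join(tmp, "other.txt")
        with open(base_path, "w") as f:
            f.write("abc\ndef\nghi\n")
        with open(other_path, "w") as f:
            f.write("abc\ndef\nghi\njkl\n")
        print(new_lines(other_path, base_path))  # ['jkl\n']
```

This still keeps every distinct base line in memory, but each lookup is a hash check rather than a sequence alignment, so I would expect it to scale better than Differ on 20 MB files. I have not measured it.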
Thanks.