ndiff

Bryan · Jul 24, 2003

i tried using ndiff and Differ.compare today from the difflib module. i
have three questions.

1. both ndiff and Differ.compare return all the lines including lines that
are the same in both files, not just the diffs. is the convention to take
the output and then filter out lines that contain a space as the first
character to just get the diffs? it seems strange to me that the output is
not just the deltas and a lot of wasted filtering (especially if the file is
very large) to get the diff you wanted in the first place. isn't there a
better way?

2. i also tried passing IS_LINE_JUNK and IS_CHARACTER_JUNK, but there was
no difference in the output even though i changed some whitespace in the
file. i then wrote my own junk functions and again, there was no
difference in the output even though i returned 1 to filter out some lines.
can someone show an example of using IS_LINE_JUNK and IS_CHARACTER_JUNK
that produces different output than when not using them?

3. is there a simple method that just returns true or false whether two
files are different or not? i was hoping that ndiff/compare would return an
empty list if there was no difference, but that's not the case. i ended up
using a simple: if file1.read() == file2.read(): but there must be a smarter
faster way.

thanks,

bryan
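
On question 2, a hedged sketch (modern Python 3, not the 2.3-era module Bryan was using): the junk functions do not filter anything out of the result. They only tell the matcher which elements it should avoid synchronizing on, so for many inputs the output is identical with or without them. The first half below reuses the SequenceMatcher example from the difflib documentation to show junk changing which match is preferred; the second half checks that ndiff output built with a line-junk function still carries every line of both inputs. The sample strings and lists are made up for illustration.

```python
import difflib

# The stock predicates only classify their argument; they remove nothing.
blank_is_junk = difflib.IS_LINE_JUNK("\n")        # blank line
text_is_junk = difflib.IS_LINE_JUNK("hello\n")    # ordinary line
space_is_junk = difflib.IS_CHARACTER_JUNK(" ")    # space or tab

# Junk steers matching: with spaces treated as junk, the matcher will not
# anchor on the leading " " and picks a different longest match.
plain = difflib.SequenceMatcher(None, " abcd", "abcd abcd")
spaces_junk = difflib.SequenceMatcher(lambda ch: ch == " ", " abcd", "abcd abcd")
plain_match = plain.find_longest_match(0, 5, 0, 9)
junk_match = spaces_junk.find_longest_match(0, 5, 0, 9)

# ndiff never drops lines, junk or not: both inputs can be rebuilt from it.
old = ["one\n", "\n", "two\n"]
new = ["one\n", "two\n", "\n", "three\n"]
delta = list(difflib.ndiff(old, new, linejunk=difflib.IS_LINE_JUNK))
restored_old = list(difflib.restore(delta, 1))
restored_new = list(difflib.restore(delta, 2))
```

So a junk function that returns 1 for some lines is not expected to make those lines disappear from the delta; at most it changes how the remaining lines are paired up.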

Ian Bicking · Jul 24, 2003

> 3. is there a simple method that just returns true or false whether two
> files are different or not? i was hoping that ndiff/compare would return an
> empty list if there was no difference, but that's not the case. i ended up
> using a simple: if file1.read() == file2.read(): but there must be a smarter
> faster way.

Maybe something like:

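def areDifferent(file1, file2):
    # Compare two open files chunk by chunk; return True if they differ.
    while 1:
        data1, data2 = file1.read(1000), file2.read(1000)
        if not data1 and not data2:
            # Both files reached end-of-file without a mismatch.
            return False
        if data1 != data2:
            return True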

You still have to go through the entire file if you really want to be
sure. If you use filenames, of course, you can take some shortcuts:

import os

def filesDiffer(filename1, filename2):
    # Files of different sizes cannot be equal, so skip reading them.
    if os.stat(filename1).st_size != os.stat(filename2).st_size:
        return True
    else:
        return areDifferent(open(filename1, 'rb'), open(filename2, 'rb'))

You could also try a quick comparison from somewhere not at the
beginning (using .seek(pos)), if you think it is likely that files will
have common headers. But you'd still have to scan the entire file to be
sure.

Ian
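
For question 3, the standard library's filecmp module already packages the size-check-then-compare idea. The sketch below assumes modern Python 3; files_differ and the _write helper are hypothetical names used only for this demo, and the temporary files are throwaway sample data.

```python
import filecmp
import os
import tempfile


def files_differ(path1, path2):
    """Return True if the two files have different contents."""
    # shallow=False makes filecmp compare contents rather than trusting
    # matching os.stat() signatures (type, size, mtime).
    return not filecmp.cmp(path1, path2, shallow=False)


def _write(directory, name, data):
    # Hypothetical helper for this demo only: create a small binary file.
    path = os.path.join(directory, name)
    with open(path, "wb") as handle:
        handle.write(data)
    return path


with tempfile.TemporaryDirectory() as tmp:
    a = _write(tmp, "a.txt", b"spam\neggs\n")
    b = _write(tmp, "b.txt", b"spam\neggs\n")
    c = _write(tmp, "c.txt", b"spam\nham!\n")  # same size, different bytes
    same_result = files_differ(a, b)
    changed_result = files_differ(a, c)
```

Files a and c have the same size, so the size shortcut cannot decide and the contents have to be read, which is the case Ian describes.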

Raymond Hettinger · Jul 25, 2003

> 1. both ndiff and Differ.compare return all the lines including lines that
> are the same in both files, not just the diffs. is the convention to take
> the output and then filter out lines that contain a space as the first
> character to just get the diffs? it seems strange to me that the output is
> not just the deltas and a lot of wasted filtering (especially if the file is
> very large) to get the diff you wanted in the first place. isn't there a
> better way?

The new difflib.py in Py2.3 has two new functions, context_diff()
and unified_diff(). The new functions and an exposed underlying
method strip away the commonalities, leaving only the changes
and, if desired, some context.

Raymond Hettinger
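
A minimal sketch of that suggestion, assuming modern Python 3 and made-up sample lines. Passing n=0 asks unified_diff() for no context lines at all, and identical inputs produce an empty result, which also gives a yes/no answer for in-memory line lists. The underlying method is presumably SequenceMatcher.get_grouped_opcodes(), shown at the end; that identification is an assumption, since the post does not name it.

```python
import difflib

old = ["alpha\n", "beta\n", "gamma\n", "delta\n"]
new = ["alpha\n", "BETA\n", "gamma\n", "delta\n"]

# n=0 requests zero lines of context, so only changed lines
# (plus the file and hunk headers) are produced.
changes = list(difflib.unified_diff(old, new,
                                    fromfile="old.txt", tofile="new.txt", n=0))

# Keep just the removed/added lines, skipping the '---'/'+++' headers.
changed_lines = [line for line in changes
                 if line.startswith(("-", "+"))
                 and not line.startswith(("---", "+++"))]

# Identical inputs yield nothing at all, not even headers.
no_changes = list(difflib.unified_diff(old, old))

# The grouped opcodes describe the same changes as index ranges.
matcher = difflib.SequenceMatcher(None, old, new)
changed_tags = [tag
                for group in matcher.get_grouped_opcodes(0)
                for tag, i1, i2, j1, j2 in group
                if tag != "equal"]
```

Here changed_lines holds only the two lines that differ, so no post-filtering of unchanged lines is needed.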
