check if files are the same on Windows

Beliavsky

A crude way to check if two files are the same on Windows is to look
at the output of the "fc" function of cmd.exe, for example

from os import popen

def files_same(f1, f2):
    cmnd = "fc " + f1 + " " + f2
    return "no differences" in popen(cmnd).read()

This is needlessly slow, because one can stop comparing two files
after the first difference is detected. How should one check that
files are the same in Python? The files are plain text.
 
Marc 'BlackJack' Rintsch

Beliavsky said:
[…] How should one check that files are the same in Python? The files
are plain text.

Take a look at the `filecmp` module. Pay attention to the shallow
argument of `filecmp.cmp()` and the default value!

Ciao,
Marc 'BlackJack' Rintsch
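As a minimal sketch of the `filecmp` suggestion above: the function name `files_same` is borrowed from the original question and is not part of the module. With the default `shallow=True`, `filecmp.cmp()` reports two files as equal when their `os.stat()` signatures (file type, size, modification time) match, without reading the contents, so passing `shallow=False` is probably what is wanted here.

```python
import filecmp

def files_same(path1, path2):
    """Return True if the two files have identical contents."""
    # shallow=False forces a comparison of the file contents.
    # The default (shallow=True) accepts files whose os.stat()
    # signatures match without reading them at all.
    return filecmp.cmp(path1, path2, shallow=False)
```

A call such as `files_same("a.txt", "b.txt")` (hypothetical file names) returns True or False.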
 
Shane Geiger

In the unix world, 'fc' would be like diff.

"""
Python example of checksumming files with MD5.

Uses the hashlib module (Python 2.5 and later), which replaces the
older md5 module.
"""

import glob
import hashlib
import os

def get_md5(fname):
    # Read in binary mode so that newline translation on Windows
    # cannot change the bytes being hashed.
    with open(fname, "rb") as f:
        contents = f.read()
    return (fname, hashlib.md5(contents).hexdigest())

for f in glob.glob('*'):
    if os.path.isfile(f):  # skip directories, which cannot be opened
        print(get_md5(f))


--
Shane Geiger
IT Director
National Council on Economic Education
(e-mail address removed) | 402-438-8958 | http://www.ncee.net

Leading the Campaign for Economic and Financial Literacy
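To connect the MD5 approach above back to the original question, here is a rough sketch that compares two files by digest. The names `file_digest` and `files_same_by_hash` and the block size are made up for illustration. Note that hashing always reads both files completely, so it does not stop at the first difference; it is mainly useful when the same file has to be compared against many others and the digests can be stored. Equal digests also only indicate equal contents up to hash collisions.

```python
import hashlib

def file_digest(path, blocksize=65536):
    """Return the hex MD5 digest of a file, reading it block by block."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it
        # returns b"" at end of file.
        for block in iter(lambda: f.read(blocksize), b""):
            md5.update(block)
    return md5.hexdigest()

def files_same_by_hash(path1, path2):
    """Return True if both files hash to the same MD5 digest."""
    return file_digest(path1) == file_digest(path2)
```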
 
kyosohma

You can also use Python's file "read" method to read a block of each
file in a loop in binary mode.

Something like:

file1 = open(path1, 'rb')
file2 = open(path2, 'rb')

bytes1 = file1.read(blocksize)
bytes2 = file2.read(blocksize)

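And then just compare bytes1 and bytes2 to see if there is a difference.
If so, break out of the loop: the files differ. If both reads come back
empty, you have reached the end of both files without finding a
difference. I saw this concept in the book Programming Python, 3rd Ed.,
by Lutz.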

Have fun!

Mike
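The block-by-block idea above could be filled out roughly as follows. This is only a sketch: the function name `files_same_blockwise` and the 1 MB default block size are assumptions, not something taken from the thread or the book. It returns as soon as the first differing block is found, which addresses the original complaint about `fc`.

```python
def files_same_blockwise(path1, path2, blocksize=1024 * 1024):
    """Return True if the two files have identical bytes."""
    with open(path1, "rb") as file1, open(path2, "rb") as file2:
        while True:
            bytes1 = file1.read(blocksize)
            bytes2 = file2.read(blocksize)
            if bytes1 != bytes2:
                # First difference found (this also covers one file
                # ending before the other), so stop reading here.
                return False
            if not bytes1:
                # Both reads returned b"": end of both files reached
                # without finding a difference.
                return True
```

Opening in binary mode matters on Windows even for plain text files, since text mode would translate line endings before the comparison.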
 
