Script to compare two directory structures

Generic Usenet Account

We had a need to compare two directory structures to see if they are
identical (meaning if they have the same structure, same contents and
same versions of files). I wrote a shell script for this purpose
(posted to the comp.sources.d newsgroup). It works, but given my
scant knowledge of scripting, it is rather crude. I am looking for
something more professional and robust, perhaps using perl. Any help
would be appreciated.

Thanks,
Bhta
 
Ed Morton

Generic said:
We had a need to compare two directory structures to see if they are
identical (meaning if they have the same structure, same contents and
same versions of files). I wrote a shell script for this purpose
(posted to the comp.sources.d newsgroup). It works, but given my
scant knowledge of scripting, it is rather crude. I am looking for
something more professional and robust, perhaps using perl. Any help
would be appreciated.

Thanks,
Bhta

diff -r dir1 dir2

If that doesn't work for you, explain why.

Ed.
 
Generic Usenet Account

Ed Morton said:
diff -r dir1 dir2

If that doesn't work for you, explain why.

Ed.

Sheepishly, I must admit that you are absolutely right. Thanks for
pointing this out. You made me realize that I wasted a couple of
hours of precious time on a wild goose chase.
 
Generic Usenet Account

Ed Morton said:
diff -r dir1 dir2

If that doesn't work for you, explain why.

Ed.

There is at least one situation where my extremely crude script works
but diff -r dir1 dir2 does not: the script still works when the two
directory structures are not visible on the same system, e.g. when one
directory is on the build server, the other is on the test server, and
there is no cross-mounting between the two.

Bhta
 
Kenny McCormack

Ed Morton said:
diff -r dir1 dir2

If that doesn't work for you, explain why.

Ed.

From what I can tell, "diff -r" actually runs the 'diff' comparison on
every file that it encounters. Strictly speaking, that is not what most
people have in mind when they think of comparing directory structures.

What I am getting at is that most people intuitively expect such a
comparison to work at the time-stamp level, not at the file-contents
level. The implications of this are:
1) Comparing byte for byte is time-intensive (and usually
unnecessary).
2) Especially on non-Unix systems, performing the 'diff'
functionality on binary files is not exactly a well-defined
operation.

Therefore, it would still be nice to have a Unix-based util that is
more akin to the sorts of things that many of us are familiar with on,
e.g., the MS Windows platform.
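
One way to get closer to a metadata-first comparison is sketched below. It is only an illustration, written in Python rather than as a shell utility, and it is not any poster's script. It relies on the standard-library filecmp.dircmp class, which by default compares files "shallowly": where the os.stat() signatures (type, size, modification time) match, the files are treated as equal without reading them. The function name compare_trees and its return shape are assumptions made for this example.

```python
import filecmp
import os


def compare_trees(left, right, prefix=""):
    """Recursively compare two directory trees.

    Returns three sorted lists of paths relative to the tree roots:
    names only under `left`, names only under `right`, and names present
    in both that differ (or could not be compared).
    """
    # ignore=[] disables filecmp's default ignore list (RCS, CVS, ...),
    # so every directory entry takes part in the comparison.
    cmp = filecmp.dircmp(left, right, ignore=[])

    left_only = [os.path.join(prefix, n) for n in cmp.left_only]
    right_only = [os.path.join(prefix, n) for n in cmp.right_only]
    # diff_files: regular files judged different.
    # funny_files: files that could not be compared.
    # common_funny: same name on both sides but different types.
    differing = [
        os.path.join(prefix, n)
        for n in cmp.diff_files + cmp.funny_files + cmp.common_funny
    ]

    # Descend into directories that exist on both sides.
    for name in cmp.common_dirs:
        sub_left, sub_right, sub_diff = compare_trees(
            os.path.join(left, name),
            os.path.join(right, name),
            os.path.join(prefix, name),
        )
        left_only += sub_left
        right_only += sub_right
        differing += sub_diff

    return sorted(left_only), sorted(right_only), sorted(differing)
```

Calling compare_trees("dir1", "dir2") and finding all three lists empty would indicate that the trees match under this shallow comparison. Note that when two files have the same size but different time-stamps, filecmp still falls back to reading their contents, so this is a stat-first comparison rather than a purely time-stamp-based one.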
 
Josef Moellers

Jim said:
I am not familiar with utilities to compare directory structures on the
MS Windows platform, but were I to write a program to compare directory
structures (not file contents), I would use the File::Find module to
recursively get file names in each directory tree, the stat function to
get the modification times, and hashes to save and compare the
resulting data.

As for the hashes (as in checksums), I'd use the Digest::MD5 module:

Use File::Find to build a (Perl) hash that maps each pathname to its
node information, constructing the latter from stat() and md5_hex().
If the number of entries is huge and memory is tight, you could instead
write this information for each tree to a file and then compare the
contents of the two files.
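
The same idea can be sketched in Python for readers without Perl at hand. This is a rough illustration under stated assumptions, not a port of any existing script: os.walk stands in for File::Find, os.lstat for stat(), and hashlib.md5 for Digest::MD5's md5_hex(). The names Entry, scan_tree, compare_scans and write_manifest, as well as the tab-separated manifest layout, are made up for this example.

```python
import hashlib
import os
import stat
from collections import namedtuple

# Hypothetical record type holding the per-path "node information".
Entry = namedtuple("Entry", "kind size mtime md5")


def md5_hex(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks to bound memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan_tree(root):
    """Map every path below root (relative to root) to an Entry."""
    entries = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)  # lstat describes symlinks instead of following them
            if stat.S_ISREG(st.st_mode):
                entry = Entry("file", st.st_size, int(st.st_mtime), md5_hex(full))
            elif stat.S_ISDIR(st.st_mode):
                entry = Entry("dir", None, None, None)
            else:
                entry = Entry("other", None, None, None)  # symlinks, devices, ...
            entries[os.path.relpath(full, root)] = entry
    return entries


def compare_scans(left, right, check_mtime=False):
    """Return (left_only, right_only, differing) as sorted lists of paths."""
    left_only = sorted(left.keys() - right.keys())
    right_only = sorted(right.keys() - left.keys())
    differing = []
    for rel in sorted(left.keys() & right.keys()):
        a, b = left[rel], right[rel]
        same = (a.kind, a.size, a.md5) == (b.kind, b.size, b.md5)
        if check_mtime:  # optionally require identical modification times too
            same = same and a.mtime == b.mtime
        if not same:
            differing.append(rel)
    return left_only, right_only, differing


def write_manifest(entries, out_path):
    """Write one sorted, tab-separated line per path.

    Assumes path names contain no tabs or newlines and are encodable
    as UTF-8.
    """
    with open(out_path, "w", encoding="utf-8") as out:
        for rel in sorted(entries):
            e = entries[rel]
            out.write(f"{e.kind}\t{e.size}\t{e.mtime}\t{e.md5}\t{rel}\n")
```

For the file-based variant, each tree could be scanned separately, on different machines if necessary, written out with write_manifest, and the two manifests then compared with an ordinary diff. In that case the mtime column would have to be left out of the manifest or ignored whenever differing time-stamps should not count as differences.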

Josef
 
