Comparing two files for equality

Edgardo Hames

Hi everybody,

After reading "Refactoring for PL/SQL Developers" in the January
issue of Oracle Magazine, I tried to implement a program that
compares two files for equality. I came up with the following trivial
solution:

puts (IO.readlines file[0]) == (IO.readlines file[1])

Have you got any other suggestions?

Kind Regards,
Ed
 
Joel VanderWerf

Edgardo said:
Hi everybody,

After reading the "Refactoring for PL/SQL Developers" in the January
issue of the Oracle magazine, I tried to implement a program that
compares two files for equality. I came up with the following trivial
solution

puts (IO.readlines file[0]) == (IO.readlines file[1])

Have you got any other suggestions?

Is this cheating?

require 'fileutils'
p FileUtils.cmp(file[0], file[1])
 
Florian Gross

Edgardo said:
Hi everybody,
Moin.

I tried to implement a program that
compares two files for equality. I came up with the following trivial
solution

puts (IO.readlines file[0]) == (IO.readlines file[1])

Have you got any other suggestions?

Not much different:

def files_equal?(*files)
  files.map do |file|
    File.size file
  end.uniq.size <= 1 and
  files.map do |file|
    File.read file
  end.uniq.size <= 1
end

This ought to be slightly faster in the average case, since files of
different sizes are rejected without reading their contents.

Other optimizations would be reading the files line-wise in parallel and
bailing out as soon as one of the lines differs.
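
A minimal sketch of that early-exit idea, reading fixed-size chunks rather than lines so that binary files work too. The name same_contents? and the 64 KiB default for chunk_size are assumptions for illustration, not something from this thread:

```ruby
# Hypothetical helper: compare two files chunk by chunk and stop at the
# first chunk that differs.
def same_contents?(path_a, path_b, chunk_size: 64 * 1024)
  # Files of different sizes can never be equal, so skip reading them.
  return false unless File.size(path_a) == File.size(path_b)

  File.open(path_a, 'rb') do |a|
    File.open(path_b, 'rb') do |b|
      # read returns nil at end of file; the sizes match, so both
      # files reach the end on the same iteration.
      while (chunk_a = a.read(chunk_size))
        return false unless chunk_a == b.read(chunk_size)
      end
    end
  end
  true
end
```

Usage would look like same_contents?(file[0], file[1]). The block form of File.open closes both files even when the method returns early.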
 
Anders Engström

Edgardo said:
Hi everybody,
Moin.

I tried to implement a program that
compares two files for equality. I came up with the following trivial
solution

puts (IO.readlines file[0]) == (IO.readlines file[1])

Have you got any other suggestions?

Not much different:

def files_equal?(*files)
  files.map do |file|
    File.size file
  end.uniq.size <= 1 and
  files.map do |file|
    File.read file
  end.uniq.size <= 1
end

This ought to be slightly faster in the average case.

Other optimizations would be reading the files line-wise in parallel and
bailing out as soon as one of the lines differs.

Another alternative - probably slow as h*ll, but...

require 'digest/md5'

def files_equals?(*files)
  files.map do |file|
    Digest::MD5.hexdigest(File.read(file))
  end.uniq.size == 1
end
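
A possible variant of the digest approach, sketched under the assumption that the files may be too large to read into memory at once. Digest::MD5.file is part of Ruby's standard digest library and hashes the file as a stream; the method name same_digest? is made up for this example:

```ruby
require 'digest/md5'

# Hypothetical helper: true when every given file has the same MD5 digest.
def same_digest?(*paths)
  paths.map do |path|
    # Digest::MD5.file reads the file in pieces instead of loading it whole.
    Digest::MD5.file(path).hexdigest
  end.uniq.size <= 1
end
```

Every file is still read in full, so this does not bail out early, and matching digests make equal contents very likely rather than strictly proven.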

//Anders
 
Andreas Semt

Florian said:
def files_equal?(*files)
  files.map do |file|
    File.size file
  end.uniq.size <= 1 and
  files.map do |file|
    File.read file
  end.uniq.size <= 1
end

Could anybody please explain the "end.uniq.size" line?
Thanks!

Greetings,
Andreas
 
Nicholas Van Weerdenburg

end closes the files.map do ... end expression, which returns an array.

# this is equivalent.
a = files.map do |file|
  ...
end
a.uniq

# Since everything in ruby is an expression, you can do:
files.map do |file|
  ...
end.uniq

# making the expression explicit...
(files.map do |file|
  ....
end).uniq
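
A small runnable illustration of the same chaining, using string lengths in place of file sizes. The helper name all_same_length? and the sample words are made up for this example:

```ruby
# Hypothetical helper: map ... end evaluates to an array, so uniq and size
# can be called on it directly.
def all_same_length?(words)
  words.map do |word|
    word.length
  end.uniq.size <= 1
end

all_same_length?(%w[alpha gamma delta])  # => true, every word has 5 letters
all_same_length?(%w[alpha be])           # => false, lengths 5 and 2
```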

Regards,
Nick
 
Andreas Semt

Nicholas said:
end is the end of an expression that returns an array.

# this is equivalent.
a=files.map do |file|
...
end
a.uniq

# Since everything in ruby is an expression, you can do:
files.map do |file|
...
end.uniq

# making the expression explicit...
(files.map do |file|
....
end).uniq

Regards,
Nick

Thanks Nick!

Nice Ruby code.
It's always a pleasure to see a solution by Florian.

Greetings,
Andreas
 
Ilmari Heikkinen

Hello,

I came up with this:

[snip]

could also be written as (not thread-safe):

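# Assumes this is defined inside class File and that SIZE (a chunk size
# in bytes) comes from the snipped code above.
def <=>(f)
  result = size <=> f.size
  (result = read(SIZE) <=> f.read(SIZE)) until result != 0 or eof?
  result
end

# File.open is used because a bare open inside a method defined on IO
# resolves to IO.open, which expects a file descriptor, not a file name.
def IO.compare_file(fn1, fn2)
  File.open(fn1) { |f1|
    File.open(fn2) { |f2| f1 <=> f2 } }
end

This version would be quite thread-safe.
Maybe I'm worrying too much...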

one other solution:
FileUtils.compare_file(fn1, fn2)


cheers
 
Austin Ziegler

Have you got any other suggestions?

% gem install diff-lcs
% ldiff file1 file2

;)

(Except that in doing this I discovered a bug in Diff::LCS. Expect a
bugfix when I figure it out.)

-austin
 
Edgardo Hames

% gem install diff-lcs
% ldiff file1 file2

;)

(Except that in doing this I discovered a bug in Diff::LCS. Expect a
bugfix when I figure it out.)

Wow! I hope that I come up with some more questions that help you all
find bugs in your programs.

Working for better Ruby apps ;)
Ed
 
Martin DeMello

Florian Gross said:
Other optimizations would be reading the files line-wise in parallel and
bailing out as soon as one of the lines differs.

To which a nice interface would be

IO.zip('file1', 'file2') {|a, b|
  return false if a != b
}

return true
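
IO.zip is not part of Ruby's IO class, so here is one way such an interface might be sketched as a plain method. The names zip_lines and same_lines? are hypothetical:

```ruby
# Hypothetical stand-in for the IO.zip idea: yields one line from each
# file at a time, with nil for a file that has already ended.
def zip_lines(path_a, path_b)
  File.open(path_a) do |a|
    File.open(path_b) do |b|
      loop do
        line_a = a.gets
        line_b = b.gets
        break if line_a.nil? && line_b.nil?
        yield line_a, line_b
      end
    end
  end
end

# Compares line by line and bails out at the first difference.
def same_lines?(path_a, path_b)
  zip_lines(path_a, path_b) { |a, b| return false if a != b }
  true
end
```

The return inside the block leaves same_lines? immediately, and the block form of File.open still closes both files on the way out.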

martin
 
