Comparing two files for equality

Discussion in 'Ruby' started by Edgardo Hames, Jan 12, 2005.

  1. Hi everybody,

    After reading the "Refactoring for PL/SQL Developers" in the January
    issue of the Oracle magazine, I tried to implement a program that
    compares two files for equality. I came up with the following trivial
    solution

    puts (IO.readlines file[0]) == (IO.readlines file[1])

    Have you got any other suggestions?

    Kind Regards,
    Ed
    --
    Alcohol is the anesthesia by which we endure the operation of life.
    -- George Bernard Shaw
     
    Edgardo Hames, Jan 12, 2005
    #1

  2. Edgardo Hames wrote:
    > Hi everybody,
    >
    > After reading the "Refactoring for PL/SQL Developers" in the January
    > issue of the Oracle magazine, I tried to implement a program that
    > compares two files for equality. I came up with the following trivial
    > solution
    >
    > puts (IO.readlines file[0]) == (IO.readlines file[1])
    >
    > Have you got any other suggestions?


    Is this cheating?

    require 'fileutils'
    p FileUtils.cmp(file[0], file[1])
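
For anyone who wants to try this without real input files, here is a minimal sketch. The `temp_with` helper and the sample strings are made up for illustration; `FileUtils.cmp` is the standard library's alias for `FileUtils.compare_file`.

```ruby
require 'fileutils'
require 'tempfile'

# Hypothetical helper for this demo: writes content to a temp file and
# returns the Tempfile so it stays alive while we compare.
def temp_with(content)
  file = Tempfile.new('cmp_demo')
  file.binmode
  file.write(content)
  file.flush
  file
end

a = temp_with("same\ncontent\n")
b = temp_with("same\ncontent\n")
c = temp_with("other\ncontent\n")

p FileUtils.cmp(a.path, b.path)          # a and b hold identical bytes
p FileUtils.compare_file(a.path, c.path) # a and c differ
```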
     
    Joel VanderWerf, Jan 12, 2005
    #2

  3. Edgardo Hames wrote:

    > Hi everybody,


    Moin.

    > I tried to implement a program that
    > compares two files for equality. I came up with the following trivial
    > solution
    >
    > puts (IO.readlines file[0]) == (IO.readlines file[1])
    >
    > Have you got any other suggestions?


    Not much different:

    def files_equal?(*files)
      files.map do |file|
        File.size file
      end.uniq.size <= 1 and
      files.map do |file|
        File.read file
      end.uniq.size <= 1
    end

    This ought to be slightly faster in the average case.

    Other optimizations would be reading the files line-wise in parallel and
    bailing out as soon as one of the lines differs.
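
A rough sketch of that early-exit idea for two files follows. Note that it reads fixed-size binary chunks rather than lines, which is a swapped-in variation; `same_contents?` and `CHUNK_SIZE` are names invented for this example.

```ruby
CHUNK_SIZE = 64 * 1024 # bytes per read; an arbitrary choice for this sketch

# Returns true if both paths hold byte-identical contents.
def same_contents?(path_a, path_b)
  # Cheap check first: files of different sizes can never be equal.
  return false unless File.size(path_a) == File.size(path_b)

  File.open(path_a, 'rb') do |a|
    File.open(path_b, 'rb') do |b|
      loop do
        chunk_a = a.read(CHUNK_SIZE)
        chunk_b = b.read(CHUNK_SIZE)
        return false unless chunk_a == chunk_b # bail out at the first difference
        return true if chunk_a.nil?            # read returns nil at EOF on both
      end
    end
  end
end
```

Only one chunk per file is held in memory at a time, so this should also cope with files too large to slurp with File.read.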
     
    Florian Gross, Jan 12, 2005
    #3
  4. On Wed, Jan 12, 2005 at 09:11:17PM +0900, Florian Gross wrote:
    > Edgardo Hames wrote:
    >
    > >Hi everybody,

    >
    > Moin.
    >
    > >I tried to implement a program that
    > >compares two files for equality. I came up with the following trivial
    > >solution
    > >
    > >puts (IO.readlines file[0]) == (IO.readlines file[1])
    > >
    > >Have you got any other suggestions?

    >
    > Not much different:
    >
    > def files_equal?(*files)
    > files.map do |file|
    > File.size file
    > end.uniq.size <= 1 and
    > files.map do |file|
    > File.read file
    > end.uniq.size <= 1
    > end
    >
    > This ought to be slightly faster in the average case.
    >
    > Other optimizations would be reading the files line-wise in parallel and
    > bailing out as soon as one of the lines differs.
    >


    Another alternative - probably slow as h*ll, but...

    require 'digest/md5'

    def files_equals?(*files)
      files.map do |file|
        Digest::MD5.hexdigest(File.read(file))
      end.uniq.size == 1
    end
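
A variation on the same idea, as a sketch: `Digest::MD5.file` feeds the file to the digest in pieces instead of loading it whole with File.read. The method name `same_digest?` is made up here. Keep in mind that equal MD5 digests are strong evidence of equal contents, not proof, since collisions can be constructed.

```ruby
require 'digest'

# True when every given path yields the same MD5 hex digest.
# With zero or one path the answer is trivially true.
def same_digest?(*paths)
  paths.map { |path| Digest::MD5.file(path).hexdigest }.uniq.size <= 1
end
```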

    //Anders

    --
    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
    Anders Engström
    http://www.gnejs.net PGP-Key: ED010E7F
    [Your mind is like an umbrella. It doesn't work unless you open it.]
     
    Anders Engström, Jan 12, 2005
    #4
  5. Florian Gross wrote:
    >
    > def files_equal?(*files)
    > files.map do |file|
    > File.size file
    > end.uniq.size <= 1 and
    > files.map do |file|
    > File.read file
    > end.uniq.size <= 1
    > end
    >


    Could anybody please explain the "end.uniq.size" line?
    Thanks!

    Greetings,
    Andreas

    > This ought to be slightly faster in the average case.
    >
    > Other optimizations would be reading the files line-wise in parallel and
    > bailing out as soon as one of the lines differs.
    >
     
    Andreas Semt, Jan 12, 2005
    #5
  6. The end closes an expression that returns an array.

    # this is equivalent.
    a = files.map do |file|
      ...
    end
    a.uniq

    # Since everything in ruby is an expression, you can do:
    files.map do |file|
      ...
    end.uniq

    # making the expression explicit...
    (files.map do |file|
      ....
    end).uniq
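
To make the three forms concrete, here is a small runnable sketch. The `words` array and the variable names are made up for illustration; all three forms produce the same array.

```ruby
words = %w[apple avocado banana]

# Form 1: keep the array returned by map, then call uniq on it.
initials = words.map do |word|
  word[0]
end
form1 = initials.uniq

# Form 2: call uniq directly on the value of the do...end expression.
form2 = words.map do |word|
  word[0]
end.uniq

# Form 3: the same thing with the expression made explicit by parentheses.
form3 = (words.map do |word|
  word[0]
end).uniq

p form1, form2, form3
```

In files_equal?, the trailing .size <= 1 then asks whether at most one distinct value is left, that is, whether all the files produced the same result.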

    Regards,
    Nick

    --
    Nicholas Van Weerdenburg
     
    Nicholas Van Weerdenburg, Jan 12, 2005
    #6
  7. Nicholas Van Weerdenburg wrote:

    > end is the end of an expression that returns an array.
    >
    > # this is equivalent.
    > a=files.map do |file|
    > ...
    > end
    > a.uniq
    >
    > # Since everything in ruby is an expression, you can do:
    > files.map do |file|
    > ...
    > end.uniq
    >
    > # making the expression explicit...
    > (files.map do |file|
    > ....
    > end).uniq
    >
    > Regards,
    > Nick
    >


    Thanks Nick!

    Nice Ruby code.
    It's always a pleasure to see a solution by Florian.

    Greetings,
    Andreas



     
    Andreas Semt, Jan 12, 2005
    #7
  8. Andreas Semt wrote:

    > Nice Ruby code.
    > It's always a pleasure to see a solution by Florian.


    Thank you. :)
     
    Florian Gross, Jan 12, 2005
    #8
  9. Hello,

    On 12.1.2005, at 21:56, georgesawyer wrote:

    > I came up with this:
    >
    > [snip]


    could also be written as (not thread-safe):

    def <=>(f)
      result = size <=> f.size
      (result = read(SIZE) <=> f.read(SIZE)) until result != 0 or eof?
      result
    end

    # File.open rather than a bare open: inside a method defined on IO,
    # open would resolve to IO.open, which expects a file descriptor.
    def IO.compare_file(fn1, fn2)
      File.open(fn1) { |f1|
        File.open(fn2) { |f2| f1 <=> f2 }
      }
    end

    would be quite thread-safe.
    Maybe I'm worrying too much...

    One other solution:

    require 'fileutils'
    FileUtils.compare_file(fn1, fn2)


    cheers
     
    Ilmari Heikkinen, Jan 12, 2005
    #9
  10. On Wed, 12 Jan 2005 13:26:53 +0900, Edgardo Hames wrote:
    > Have you got any other suggestions?


    % gem install diff-lcs
    % ldiff file1 file2

    ;)

    (Except that in doing this I discovered a bug in Diff::LCS. Expect a
    bugfix when I figure it out.)

    -austin
    --
    Austin Ziegler
     
    Austin Ziegler, Jan 12, 2005
    #10
  11. On Thu, 13 Jan 2005 06:42:45 +0900, Austin Ziegler wrote:
    > On Wed, 12 Jan 2005 13:26:53 +0900, Edgardo Hames wrote:
    > > Have you got any other suggestions?

    >
    > % gem install diff-lcs
    > % ldiff file1 file2
    >
    > ;)
    >
    > (Except that in doing this I discovered a bug in Diff::LCS. Expect a
    > bugfix when I figure it out.)
    >


    Wow! I hope that I come up with some more questions that help you all
    find bugs in your programs.

    Working for better Ruby apps ;)
    Ed
    --
    Alcohol is the anesthesia by which we endure the operation of life.
    -- George Bernard Shaw
     
    Edgardo Hames, Jan 12, 2005
    #11
  12. Florian Gross wrote:

    > Other optimizations would be reading the files line-wise in parallel and
    > bailing out as soon as one of the lines differs.


    To which a nice interface would be

    IO.zip('file1', 'file2') {|a, b|
      return false if a != b
    }

    return true
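
IO.zip is a wished-for interface here, not something Ruby's IO class provides. One way it could be approximated is sketched below; `each_line_pair` and `same_lines?` are hypothetical names, and nil stands in for a line once one file runs out before the other.

```ruby
# Hypothetical helper (not part of IO): yields corresponding lines of two
# files in lockstep until both are exhausted.
def each_line_pair(path_a, path_b)
  File.open(path_a, 'rb') do |a|
    File.open(path_b, 'rb') do |b|
      loop do
        line_a = a.gets
        line_b = b.gets
        break if line_a.nil? && line_b.nil? # both files at EOF
        yield line_a, line_b
      end
    end
  end
end

# Line-wise comparison that stops at the first differing pair.
def same_lines?(path_a, path_b)
  each_line_pair(path_a, path_b) do |a, b|
    return false if a != b
  end
  true
end
```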

    martin
     
    Martin DeMello, Jan 18, 2005
    #12
