compare 2 text files - check for difference - Please help

Discussion in 'Ruby' started by Mmcolli00 Mom, Dec 9, 2008.

  1. Hi. I want to take two files that are supposed to be identical, then look
    for any differences between the two.

    Text1.txt   Text2.txt
    aaaaaa      aaaaab

    For example, comparing the above text files would report that 'b' was
    found to be dissimilar. I have tried new gems such as Diff::LCS,
    however this did not work. I am not sure what I am
    doing wrong. Can you please point me in the right direction? Thanks.
    MC
    --
    Posted via http://www.ruby-forum.com/.
    Mmcolli00 Mom, Dec 9, 2008
    #1

  2. Mmcolli00 Mom wrote:
    > I have tried new gems
    > such as Diff::LCS, however this did not work. I am not sure what I am
    > doing wrong.


    "This did not work" is not a useful problem description.

    Post your code (preferably a minimal test program), and post exactly
    what you see when you run it.

    An alternative approach might be to write the data to two temporary
    files (if not already in files) and run diff -u file1 file2 (e.g. using
    system() or IO.popen)
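
    A minimal sketch of that alternative, assuming a Unix-like system with diff on the PATH. The helper name unified_diff is hypothetical, not something from this thread:

```ruby
require 'tempfile'

# Hypothetical helper: writes both strings to temporary files, runs the
# system `diff -u` on them and returns its output ("" when identical).
def unified_diff(text1, text2)
  Tempfile.create('left') do |a|
    Tempfile.create('right') do |b|
      a.write(text1)
      a.flush
      b.write(text2)
      b.flush
      # The array form bypasses the shell, so the paths need no quoting.
      IO.popen(['diff', '-u', a.path, b.path]) { |io| io.read }
    end
  end
end

puts unified_diff("aaaaaa\n", "aaaaab\n")
```

    If the data is already in two files, the temporary files can be skipped and the two paths passed to diff directly.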
    Brian Candler, Dec 9, 2008
    #2

  3. require 'diff/lcs/Array'

    file = File.open("1.txt",'r')
    @row = {}
    file.each_line do |@line|
    key, val = @line.chomp.split(",",0)
    @row[key] = val
    end

    file2 = File.open("2.txt",'r')
    @row2 = {}
    file2.each_line do |@line2|
    key2, val2 = @line2.chomp.split(",",0)
    @row2[key2] = val2
    end


    puts diffs = Diff::LCS.diff(@line,@line2)

    Test file contents:
    1.txt   2.txt
    1,2     1,2,3

    Output when the files differ as above:
    #<Diff::LCS::Change:0x2e5d95c>
    #<Diff::LCS::Change:0x2e5da60>
    #<Diff::LCS::Change:0x2e5d90c>

    Output when the files are identical:
    #<Diff::LCS::Change:0x2e5da88>

    I want to be able to say that 2.txt contained '3' therefore the file was
    not the same on that line. Thanks. MC
    Mmcolli00 Mom, Dec 9, 2008
    #3
  4. Re: compare 2 text files - check for difference - Please hel

    Are you wanting to get the actual differences, or just know if they are
    different?

    -- Josh
    http://iammrjoshua.com



    Joshua Abbott, Dec 9, 2008
    #4
  5. Re: compare 2 text files - check for difference - Please hel

    I want to get the difference. I need the difference to prove what is
    changing in some files and not in others. I'll run it routinely to check
    file content.

    Mmcolli00 Mom, Dec 9, 2008
    #5
  6. Re: compare 2 text files - check for difference - Please hel

    I see.

    I've never used Diff::LCS personally, but I can tell that to get more
    info about what those diff objects contain you could try something like
    this:

    diffs = Diff::LCS.diff(@line,@line2)

    puts diffs.map(&:inspect)

    That will convert the contents of each object to a string and print it,
    so you can see what's there.

    -- Josh
    http://iammrjoshua.com

    Mmcolli00 Mom wrote:
    > I want to get the difference. I need the difference to prove what is
    > changing in some files and not in others. I'll run it routinely to check
    > file content.


    Joshua Abbott, Dec 9, 2008
    #6
  7. Mmcolli00 Mom wrote:
    > require 'diff/lcs/Array'
    >
    > file = File.open("1.txt",'r')
    > @row = {}
    > file.each_line do |@line|
    > key, val = @line.chomp.split(",",0)
    > @row[key] = val
    > end
    >
    > file2 = File.open("2.txt",'r')
    > @row2 = {}
    > file2.each_line do |@line2|
    > key2, val2 = @line2.chomp.split(",",0)
    > @row2[key2] = val2
    > end
    >
    >
    > puts diffs = Diff::LCS.diff(@line,@line2)


    You are using a very odd way of iterating, which is explicitly forbidden
    in ruby 1.9. If you write

    foo.each do |@bar| ...

    then for each element of foo, the instance variable @bar is set to that
    element. So the net result here is that after the loops have finished,
    @line remains set to the last line of 1.txt, and @line2 remains set to
    the last line of 2.txt. You can demonstrate this by adding

    p @line, @line2

    just before calling Diff::LCS.

    (In ruby 1.9, block parameters must be local variables, and are always
    local to the block - they always drop out of scope afterwards)

    So, your test program simplifies to the following:

    require 'rubygems'
    require 'diff/lcs/array'

    @line = "1,2\n"
    @line2 = "1,2,3\n"
    puts diffs = Diff::LCS.diff(@line,@line2)

    The question is, what do you expect this to do?

    If you replace 'puts' with 'p' in the last line, you get the following
    more detailed output:

    [[#<Diff::LCS::Change:0xb7be613c @element=",", @action="+",
    @position=3>, #<Diff::LCS::Change:0xb7be60d8 @element="3", @action="+",
    @position=4>], [#<Diff::LCS::Change:0xb7be5fac @element="", @action="-",
    @position=4>]]

    My guess is that Diff::LCS is treating the string as a sequence of
    bytes. The first change is [add "," at pos 3, add "3" at pos 4], which
    is correct. The second change is strange, as it seems to say ["remove
    nothing from pos 4"]

    However, since all the examples in the README show two arrays being
    passed, and here you're passing in two strings, I'm not sure this is
    even a supported way of working with this library.

    Your code also builds two hashes, @row and @row2, but doesn't seem to
    use them at all. Were you trying to do something with them?

    Finally, your use of split may not behave the way you expect:

    irb(main):002:0> key2, val2 = "1,2,3\n".chomp.split(",",0)
    => ["1", "2", "3"]
    irb(main):003:0> key2
    => "1"
    irb(main):004:0> val2
    => "2"

    That is, you're ignoring everything after the second field.
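
    If the intent was to keep the rest of the line together with the value, one option (an assumption about what the code was meant to do) is to pass a limit of 2 to split. The helper name key_and_rest is made up for this sketch:

```ruby
# Hypothetical helper: splits a "key,value" line at the first comma only,
# so any further commas stay inside the second element.
def key_and_rest(line)
  line.chomp.split(",", 2)
end

key, val = key_and_rest("1,2,3\n")
p key   # "1"
p val   # "2,3"
```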

    I can strongly recommend playing around with expressions in irb, and
    adding snippets of "p ..expression.." within your code, to get a feel
    for what's happening.

    HTH,

    Brian.
    Brian Candler, Dec 9, 2008
    #7
  8. > I want to be able to say that 2.txt contained '3' therefore the file was
    > not the same on that line. Thanks. MC


    Then perhaps you want to feed in the lines as arrays:

    require 'rubygems'
    require 'diff/lcs/array'

    lines1 = lines2 = nil
    File.open("1.txt") { |f| lines1 = f.readlines }
    File.open("2.txt") { |f| lines2 = f.readlines }

    p diffs = Diff::LCS.diff(lines1, lines2)

    This gives me the following output:

    [[#<Diff::LCS::Change:0xb7c17fc0 @element="1,2\n", @action="-",
    @position=0>, #<Diff::LCS::Change:0xb7c180d8 @element="1,2,3\n",
    @action="+", @position=0>]]

    That is:
    - there was a single change (first element of the array)
    - this change had two parts (two elements to inner array)
    - remove "1,2\n" at pos 0, i.e. the first line
    - add "1,2,3\n" at pos 0

    It gets more interesting if you do other changes. For example, if 1.txt
    contains

    1,2
    3,4
    5,6
    7,8
    9,10
    11,12
    13,14
    15,16

    and 2.txt contains

    1,2
    3,4,5
    9,9,9
    5,6
    7,8
    9,10
    11,12
    13,14
    15,16
    17,18

    Then the output becomes:(*)

    [[#<Diff::LCS::Change:0xb7c19a3c @action="-", @element="3,4\n",
    @position=1>,
    #<Diff::LCS::Change:0xb7c199d8 @action="+", @element="3,4,5\n",
    @position=1>,
    #<Diff::LCS::Change:0xb7c19988 @action="+", @element="9,9,9\n",
    @position=2>],
    [#<Diff::LCS::Change:0xb7c19578 @action="+", @element="17,18\n",
    @position=9>]]

    That is: the first change bundle is to remove the 3,4\n at the second line
    (#1, counting from zero), add 3,4,5\n, and add 9,9,9\n. The second change
    bundle is to add 17,18\n at the tenth line.

    HTH,

    Brian.

    (*) You can also change p to pp, and add 'require "pp"' to the top of
    the file, to get alternative pretty-print formatting.
    Brian Candler, Dec 9, 2008
    #8
  9. I just used this on my txt files, however I can't get the script to
    output only the lines that are dissimilar. Right now it shows everything.

    p sdiff = lines1.sdiff(lines2)

    I want to output the following if sdiff returns the @action=!
    [#<Diff::LCS::ContextChange:24309500 @action=! positions=0,0
    elements="1,2\n","1,2,3\n">

    Do you know if there is a way to output only the dissimilarities, i.e. the
    entries where sdiff reports @action=!? Right now, it shows every line,
    similar or dissimilar, beginning with #<Diff:


    #************************************************************
    require 'rubygems'
    require 'diff/lcs/array'

    lines1 = lines2 = nil
    File.open("1.txt") { |f| lines1 = f.readlines }
    File.open("2.txt") { |f| lines2 = f.readlines }

    p sdiff = lines1.sdiff(lines2)

    if sdiff =~ /@action!/
    then puts sdiff
    end

    Mmcolli00 Mom, Dec 10, 2008
    #9
  10. Mmcolli00 Mom wrote:
    > I just used this on my txt files, however I can't get the script to
    > output only the lines that are dissimilar. Right now it shows everything.
    >
    > p sdiff = lines1.sdiff(lines2)
    >
    > I want to output the following if sdiff returns the @action=!
    > [#<Diff::LCS::ContextChange:24309500 @action=! positions=0,0
    > elements="1,2\n","1,2,3\n">


    I don't understand.

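    If the two files are identical, Diff::LCS.diff should give you an empty
    array. Is that not the case? You can test for an empty array using
    diffs.empty? Note that sdiff behaves differently: it returns one entry per
    line, including the unchanged ones (action "="), so it is not empty for
    identical non-empty files.

    If the two files are not identical, diff will give you a series of elements
    telling you which chunks are different, and for each chunk what is
    different. This is similar to the "diff" command at the shell.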
    Brian Candler, Dec 10, 2008
    #10
  11. I have a huge XML file that I read in after a submit routine. The
    routine fails on specific XML elements. I want to be able to find every
    element that was changed by the routine after the second submit
    routine.

    I can already check whether the files are different. I just want to
    pinpoint the value that is different. I have noticed that if the context
    has changed then the output will show @action=! and then the value, such as

    [#<Diff::LCS::ContextChange:24309500 @action=! positions=0,0
    elements="1,2\n","1,2,3\n">

    My problem is that one line can be so long that I can't see exactly which
    element is dissimilar. So I wanted to break each line up and then search
    for anything returning '@action=!'.
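
    One way to narrow a long changed line down to the differing piece is to split the old and new line into fields and compare them position by position. The sketch below uses only the standard library. field_differences is a hypothetical helper, and the comma separator is an assumption taken from the sample data in this thread; real XML would need a different way of splitting.

```ruby
# Hypothetical helper: splits two lines on a separator and returns
# [index, old_field, new_field] for every position where they differ.
# A field missing on one side shows up as nil.
def field_differences(old_line, new_line, sep = ",")
  old_fields = old_line.chomp.split(sep, -1)
  new_fields = new_line.chomp.split(sep, -1)
  count = [old_fields.size, new_fields.size].max
  all = (0...count).map { |i| [i, old_fields[i], new_fields[i]] }
  all.reject { |_, a, b| a == b }
end

p field_differences("1,2\n", "1,2,3\n")   # [[2, nil, "3"]]
```

    This could be applied to the old and new elements of each entry whose action is "!".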

    Mmcolli00 Mom, Dec 10, 2008
    #11
  12. If anything, do you know how to get this into a new text file?

    require 'rubygems'
    require 'diff/lcs/array'

    lines1 = lines2 = nil
    File.open("xml1.txt") { |f| lines1 = f.readlines}
    File.open("xml2.txt") { |f| lines2 = f.readlines }

    diffs = Diff::LCS.diff(lines1, lines2)
    sdiff = Diff::LCS.sdiff(lines1,lines2)

    p sdiff = Diff::LCS.sdiff(lines1, lines2)

    File.open('log.txt', 'w') do |f1|
    f1.puts Diff::LCS.sdiff(lines1, lines2)
    f1.close
    end

    (This is what it writes to log.txt. It doesn't show @action and the
    elements the way p sdiff does.)

    #<Diff::LCS::ContextChange:0x2e2c424>
    #<Diff::LCS::ContextChange:0x2e2c30c>
    #<Diff::LCS::ContextChange:0x2e2c1f4>
    #<Diff::LCS::ContextChange:0x2e2c0b4>...

    Mmcolli00 Mom, Dec 10, 2008
    #12
