Ruby Golf: Object Diff

John W. Long

The problem:

When you are debugging a program that uses large objects and a test fails because
an object differs from the expected one, it is sometimes hard to discern
the difference between the objects from the output of Test::Unit.

The goal of this hole is to create a method that will output the differences
of two objects in an intelligent manner. Something similar in concept to
this:
#<TestObject1:0x1 ... @b="b", ... @d="d", ...>
#<TestObject1:0x2 ... @b="", ... @d="", ...>

You are free to use whatever method you might choose. Program size does not
matter. Creativity and one-upmanship are encouraged. We will probably post
the best solution on the wiki.
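
For readers who want a feel for the core idea before reading the test case, here is a minimal sketch. It assumes a current Ruby, where instance_variables returns symbols and instance_variable_get is available. The names ivar_differences and Sample are made up for illustration and are not part of any solution posted in this thread.

```ruby
# Hypothetical helper (not from the thread): collects the instance variables
# whose values differ between two objects, keyed by variable name.
def ivar_differences(left, right)
  names = (left.instance_variables | right.instance_variables).sort
  names.each_with_object({}) do |name, diff|
    a = left.instance_variable_defined?(name) ? left.instance_variable_get(name) : :missing
    b = right.instance_variable_defined?(name) ? right.instance_variable_get(name) : :missing
    diff[name] = [a, b] unless a == b
  end
end

# Illustrative class, assumed only for this example.
class Sample
  def initialize(a, b)
    @a = a
    @b = b
  end
end

p ivar_differences(Sample.new('a', 'b'), Sample.new('a', ''))
```

Only the differing variable (@b here) ends up in the result; turning that into the "..." notation above is the actual challenge of the hole.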

The following test case is a guideline and may be changed if you have better
ideas for how such a method should work:

<TestCase>
class TestObject1
attr_accessor :a, :b, :c, :d, :e
def initialize(a, b, c, d, e = nil)
@a, @b, @c, @d, @e = a, b, c, d, e
end
end
class TestObject2
attr_accessor :a, :b, :c, :d, :f, :g
def initialize(a, b, c, d, f, g)
@a, @b, @c, @d, @f, @g = a, b, c, d, f, g
end
end

class TC_ObjectDiff < Test::Unit::TestCase
def test_to_s
object = TestObject1.new('a', 'b', 'c', 'd', [1, 2, 3, 4])
def object.__id__
0
end
assert_equal('#<TestObject1:0x0 @a="a", @b="b", @c="c", @d="d", @e=[1, 2, 3, 4]>',
ObjectDiff.object_to_s(object))
end
def test_compare
object1 = TestObject1.new('a', 'b', 'c', 'd')
def object1.__id__
1
end
object2 = TestObject1.new('a', '', 'c', '')
def object2.__id__
2
end
string1, string2 = ObjectDiff.compare(object1, object2)
puts [string1, string2]
assert_equal('#<TestObject1:0x1 ... @b="b", ... @d="d", ...>', string1)
assert_equal('#<TestObject1:0x2 ... @b="", ... @d="", ...>', string2)

complexObject1 = TestObject1.new('1', {:a => 1, :b => 2, :j => 3}, [1, 2,
3], object1)
def complexObject1.__id__
3
end
complexObject2 = TestObject1.new('1', {:a => 2, :b => object1, :j => 3},
[0, 2], object2)
def complexObject2.__id__
4
end
string1, string2 = ObjectDiff.compare(complexObject1, complexObject2)
puts [string1, string2]
assert_equal('#<TestObject1:0x3 ... @b={:a=>1, :b=>2, ...}, @c=[1, ... 3], @d=#<TestObject1:0x1 ... @b="b", ... @d="d", ...>, ...>', string1)
assert_equal('#<TestObject1:0x4 ... @b={:a=>2, :b=>#<TestObject1:0x284a3a8 @c="c", @b="b", @e=nil, @a="a", @d="d">, ...}, @c=[0, ...], @d=#<TestObject1:0x2 ... @b="", ... @d="", ...>, ...>', string2)
end
def test_compare_different_classes
object1 = TestObject1.new('a', 'b', 'c', 'd')
def object1.__id__
5
end
object2 = TestObject2.new('a', '', 'c', '', 'f', 'g')
def object2.__id__
6
end

string1, string2 = ObjectDiff.compare(object1, object2)
puts [string1, string2]
assert_equal('#<TestObject1:0x5 ... @b="b", ... @d="d", @e=nil>', string1)
assert_equal('#<TestObject2:0x6 ... @b="", ... @d="", @f="f", @g="g">',
string2)
end
def test_compare_array
a = [1, 2, 3, 4, 5]
b = [1, 3, 4, 4, 5, 6]
string_a, string_b = ObjectDiff.compare_array(a, b)
puts [string_a, string_b]
assert_equal('[... 2, 3, ... ...]', string_a)
assert_equal('[... 3, 4, ... ... 6]', string_b)
end
def test_compare_hash
a = {
:a => 1,
:b => 2,
:c => 3,
:d => 4
}
b = {
:j => 2,
:b => 3,
:c => 3,
:d => 5
}
string_a, string_b = ObjectDiff.compare_hash(a, b)
puts [string_a, string_b]
assert_equal('{:a=>1, :b=>2, ... :d=>4}', string_a)
assert_equal('{ :b=>3, ... :d=>5, :j=>2}', string_b)
end
end
</TestCase>



Here is my solution:



<MySolution>
class ObjectDiff
class Node
attr_reader :object, :nodes

def initialize(value)
@object = value
@nodes = {}
variables = @object.instance_variables
variables.each { |key|
variable = @object.instance_eval(key)
@nodes[key] = Node.new(variable)
}
end

def to_s
return @object.inspect if [String, Integer, NilClass, Array, Hash,
Fixnum, Integer, Bignum].index(@object.class)
string = s_begin(self)
@nodes.each { |key, node|
string << "#{key}=#{node.to_s}, "
}
s_end(string)
end

def compare_with(node)
return ::ObjectDiff::compare_hash(@object, node.object) if
@object.instance_of?(Hash) and node.object.instance_of?(Hash)
return ::ObjectDiff::compare_array(@object, node.object) if
@object.instance_of?(Array) and node.object.instance_of?(Array)
return [@object.inspect, node.object.inspect] if [String, Integer,
NilClass, Array, Hash, Fixnum, Integer, Bignum].index(@object.class)

keys = []
@nodes.each_key { |key|
keys << key
}
node.nodes.each_key { |key|
keys << key
}
keys.uniq!
keys.sort!

string1 = s_begin(self)
string2 = s_begin(node)

keys.each { |key|
node1 = @nodes[key]
node2 = node.nodes[key]
if @nodes.has_key?(key) and node.nodes.has_key?(key)
if node1.to_s == node2.to_s
string1 << '... '
string2 << '... '
else
s1, s2 = node1.compare_with(node2)
string1 << "#{key}=#{s1}, "
string2 << "#{key}=#{s2}, "
end
else
if @nodes.has_key?(key)
string1 << "#{key}=#{node1.to_s}, "
end
if node.nodes.has_key?(key)
string2 << "#{key}=#{node2.to_s}, "
end
end
}

string1 = s_end(string1)
string2 = s_end(string2)

[string1, string2]
end

def s_begin(node)
"#<#{node.object.class.name}:0x#{format('%x', node.object.__id__)} "
end

def s_end(string)
string.chomp!(' ')
string.chomp!(',')
string << '>'
end
end

def self.object_to_s(object)
Node.new(object).to_s
end

def self.compare(object1, object2)
node1 = Node.new(object1)
node2 = Node.new(object2)
node1.compare_with(node2)
end

def self.compare_array(array1, array2)
if array1.size < array2.size
a = array2
b = array1
flipped = true
else
a = array1
b = array2
flipped = false
end

string1 = '['
string2 = '['

for i in 0...a.size
if i < b.size
node1 = Node.new(a[i])
node2 = Node.new(b[i])
if node1.to_s == node2.to_s
string1 << '... '
string2 << '... '
else
string_a, string_b = compare(a[i], b[i])
string1 << "#{string_a}, "
string2 << "#{string_b}, "
end
else
string1 << Node.new(a[i]).to_s
end
end

string1.strip!
string1.chomp!(',')
string2.strip!
string2.chomp!(',')

string1 << ']'
string2 << ']'

if flipped
[string2, string1]
else
[string1, string2]
end
end

def self.compare_hash(hash1, hash2)
keys = []
hash1.each_key { |key|
keys << key
}
hash2.each_key { |key|
keys << key
}
keys.uniq!
keys.sort!{ |a, b|
a = a.inspect if a.is_a?(Symbol)
b = b.inspect if b.is_a?(Symbol)
a <=> b
}

string1 = '{'
string2 = '{'

keys.each { |key|
node1 = Node.new(hash1[key])
node2 = Node.new(hash2[key])
if hash1.has_key?(key) and hash2.has_key?(key)
if node1.to_s == node2.to_s
string1 << '... '
string2 << '... '
else
s1, s2 = node1.compare_with(node2)
string1 << "#{key.inspect}=>#{s1}, "
string2 << "#{key.inspect}=>#{s2}, "
end
else
if hash1.has_key?(key) and not hash2.has_key?(key)
append = "#{key.inspect}=>#{node1.to_s}, "
string1 << append
string2 << (' ' * append.size)
else
append = "#{key.inspect}=>#{node2.to_s}, "
string2 << append
string1 << (' ' * append.size)
end
end
}
string1.strip!
string1.chomp!(',')
string2.strip!
string2.chomp!(',')

string1 << '}'
string2 << '}'

[string1, string2]
end
end
</MySolution>



Feel free to improve upon my code or create your own. Remember this is for
posterity, so be honest...

___________________
John Long
www.wiseheartdesign.com
 
Mauricio Fernández

assert_equal('[... 2, 3, ... ...]', string_a)
assert_equal('[... 3, 4, ... ... 6]', string_b)

In arrays, '...' is used for the first element if equal to that of the
other array

Not sure what you mean.

I assumed '...' meant "element equal to the one in the other array
(at the same index)". I didn't read the following right, and believed
spaces were used instead of '...' if the match was on the first item.
[kept that]
assert_equal('{:a=>1, :b=>2, ... :d=>4}', string_a)
assert_equal('{ :b=>3, ... :d=>5, :j=>2}', string_b)

... which is not the case for hashes: spaces instead???
[changed to use '...' instead, as in arrays]

No spaces are used when the element doesn't exist in the other object. For
example:

OK, I see it now. Implementing this is however quite complicated (ie.
not as easy as what I'm doing now :)
a = { :a => 1, :b => 2, :c => 3}
b = { :b => 2, :c => 3}

so ['{ :a => 1, ... ...}', '{ ... ...}'] would be the output of
compare for hashes. When printed on separate lines this helps to determine
what keys are missing.

The same should be true when comparing classes with different instance
variables. Although to my chagrin my test case doesn't show this and neither
does my code.

This can get quite ugly if the iv. refers to an object whose text
representation is "big" and you have line wrapping.
Again I think sorting by key.inspect may be the best solution.
ok

Overall a good, clean solution. I like the way you're using object_to_s to
do most of the grunt work. Two thumbs up. I hope you plan to post your code
again when it is ready for general use.

I can clear it a bit if you want. Note that I didn't try to make it
especially clean or anything, just short (or rather not too long).
A couple of things I haven't figured out:

1. What does ruby use when outputting the object id for object.inspect?
Either I can't get the formatting right, or it's not the object id. (Compare
object_to_s with inspect.)

It uses the address of the object:

static VALUE
rb_obj_inspect(obj)
VALUE obj;
{
...
if (rb_inspecting_p(obj)) {
str = rb_str_new(0, strlen(c)+10+16+1); /* 10:tags 16:addr 1:nul */
sprintf(RSTRING(str)->ptr, "#<%s:0x%lx ...>", c, obj);
RSTRING(str)->len = strlen(RSTRING(str)->ptr);
return str;
}
...
str = rb_str_new(0, strlen(c)+6+16+1); /* 6:tags 16:addr 1:nul */

2. How do you handle fringe classes that are descendants of Array, Hash,
String, etc... intelligently? For instance say A < String and has an
instance variable. How do you print this and note the difference? (Or should
you even bother?)

perhaps introducing a notation like
<MyArrayClass:0x12345678 [... 1, 2, ..., 3] @a=1>

--
_ _
| |__ __ _| |_ ___ _ __ ___ __ _ _ __
| '_ \ / _` | __/ __| '_ ` _ \ / _` | '_ \
| |_) | (_| | |_\__ \ | | | | | (_| | | | |
|_.__/ \__,_|\__|___/_| |_| |_|\__,_|_| |_|
Running Debian GNU/Linux Sid (unstable)
batsman dot geo at yahoo dot com

Beeping is cute, if you are in the office ;)
-- Alan Cox
 
John W. Long

Hi Mauricio,
OK, I see it now. Implementing this is however quite complicated (ie.
not as easy as what I'm doing now :)

Which may explain why your code was so clean :). I'd like to see what would
happen if you did add this functionality. I may try and do this myself if I
have time this weekend and see how easy it is. But I agree the problem does
get much more complicated when you add the spaces.
a = { :a => 1, :b => 2, :c => 3}
b = { :b => 2, :c => 3}

so ['{ :a => 1, ... ...}', '{ ... ...}'] would be the output of
This can get quite ugly if the iv. refers to an object whose text
representation is "big" and you have line wrapping.

True, but this would be a case where turning linewrapping on would really
help. The number of spaces by the way is determined by the length of the
ivname => iv combination so when the lines are printed in parallel it really
helps.
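
To make the padding rule concrete, here is a rough sketch of one way it could work. It is not the code from either solution in this thread: padded_hash_diff is a made-up name, and sorting keys by their inspect string is an assumption borrowed from the discussion above. Entries that are equal on both sides become '...', and an entry missing on one side is replaced by as many spaces as the entry takes up on the other side.

```ruby
# Hypothetical sketch: renders two hashes so that, printed on separate lines,
# corresponding entries line up and missing keys show as blank space.
def padded_hash_diff(h1, h2)
  keys = (h1.keys | h2.keys).sort_by { |key| key.inspect }
  return ['{}', '{}'] if keys.empty?

  rows = keys.map do |key|
    if h1.key?(key) && h2.key?(key) && h1[key] == h2[key]
      ['...', '...']
    else
      s1 = h1.key?(key) ? "#{key.inspect}=>#{h1[key].inspect}," : ''
      s2 = h2.key?(key) ? "#{key.inspect}=>#{h2[key].inspect}," : ''
      width = [s1.size, s2.size].max
      # Pad the shorter (or absent) entry so both columns have equal width.
      [s1.ljust(width), s2.ljust(width)]
    end
  end

  # One output string per hash: join the cells, drop trailing padding/comma.
  rows.transpose.map { |cells| "{#{cells.join(' ').rstrip.chomp(',')}}" }
end

puts padded_hash_diff({ :a => 1, :b => 2, :c => 3 }, { :b => 2, :c => 3 })
```

The sketch only compares values with ==, so nested objects are shown whole rather than diffed recursively.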
It uses the address of the object:

static VALUE
rb_obj_inspect(obj)
VALUE obj;
{
...
if (rb_inspecting_p(obj)) {
str = rb_str_new(0, strlen(c)+10+16+1); /* 10:tags 16:addr 1:nul */
sprintf(RSTRING(str)->ptr, "#<%s:0x%lx ...>", c, obj);
RSTRING(str)->len = strlen(RSTRING(str)->ptr);
return str;
}
...
str = rb_str_new(0, strlen(c)+6+16+1); /* 6:tags 16:addr 1:nul */

That makes sense. But it begs the question: how do you get the address of an
object in ruby?
2. How do you handle fringe classes that are descendants of Array, Hash,
String, etc... intelligently? For instance say A < String and has an
instance variable. How do you print this and note the difference? (Or should
you even bother?)

perhaps introducing a notation like
<MyArrayClass:0x12345678 [... 1, 2, ..., 3] @a=1>

So you would use the above whenever the object was a kind_of?(Array) but not
an instance_of?(Array) ?
 
Mauricio Fernández

Hi Mauricio,


Which may explain why your code was so clean :). I'd like to see what would
happen if you did add this functionality. I may try and do this myself if I
have time this weekend and see how easy it is. But I agree the problem does
get much more complicated when you add the spaces.

Not *much* more. I estimated I'd need 10 lines, it took 12:

batsman@tux-chan:/tmp$ diff -u objectdiff.good.rb objectdiff.rb | unexpand | expand -t 2
--- objectdiff.good.rb 2003-07-19 22:47:56.000000000 +0200
+++ objectdiff.rb 2003-07-19 23:00:31.000000000 +0200
@@ -59,7 +59,19 @@
when Hash
keys = obj.keys.sort{|a,b| a.to_s <=> b.to_s}
r = "{"
+ count = 0
+ keys2 = other.keys.sort{|a,b| a.to_s <=> b.to_s} if Hash === other
keys.each_with_index do |k, i|
+ if Hash === other
+ keys2.each_with_index do |key,n|
+ if key.to_s >= k.to_s
+ keys2 = keys2[n+1..-1]
+ break
+ end
+ r << " " * (object_to_s(other[key]).size + 4 +
+ key.inspect.size)
+ end
+ end
if Hash === other and other.key?(k) and obj[k] == other[k]
r << (i == 0? '': ' ') + "..."
elsif Hash === other and other.key?(k)
@@ -175,7 +187,7 @@
string_a, string_b = ObjectDiff.compare_hash(a, b)
puts [string_a, string_b]
assert_equal('{:a=>1, :b=>2, ... :d=>4}', string_a)
- assert_equal('{:b=>3, ... :d=>5, :j=>2}', string_b)
+ assert_equal('{ :b=>3, ... :d=>5, :j=>2}', string_b)
end
end

That makes sense. But it begs the question: how do you get the address of an
object in ruby?

It's impossible AFAIK within plain Ruby. You can however write an
extension to do so, by doing:

(some .c file, name doesn't matter)
#include <ruby.h>

static
VALUE
get_address(VALUE class, VALUE obj)
{
return INT2NUM((long)obj);
}

void
Init_GetAddress()
{
rb_define_method(rb_mKernel, "get_address", get_address, 1);
}

extconf.rb:
require 'mkmf'

create_makefile('GetAddress')


and then within Ruby:
require 'GetAddress' # have to adjust path before and/or install extension

a = ""
p get_address(a)
2. How do you handle fringe classes that are descendants of Array, Hash,
String, etc... intelligently? For instance say A < String and has an
instance variable. How do you print this and note the difference? (Or should
you even bother?)

perhaps introducing a notation like
<MyArrayClass:0x12345678 [... 1, 2, ..., 3] @a=1>

So you would use the above whenever the object was a kind_of?(Array) but not
an instance_of?(Array) ?

yes. I wouldn't expect that to happen very often, however.
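
As an illustration only, the proposed notation could be produced along these lines. The method name describe_array_subclass is invented for this sketch, it uses __id__ rather than the real address, and it prints the leading '#<' that inspect uses. It does no diffing, it just shows the layout: class and id, then the elements, then the instance variables.

```ruby
# Hypothetical sketch of the proposed notation for Array subclasses.
def describe_array_subclass(object)
  # Plain arrays (and anything that is not an Array) keep their usual inspect.
  return object.inspect unless object.kind_of?(Array) && !object.instance_of?(Array)

  ivars = object.instance_variables.sort.map do |name|
    "#{name}=#{object.instance_variable_get(name).inspect}"
  end
  header = format('%s:0x%x', object.class.name, object.__id__)
  # to_a converts the subclass instance to a plain Array for printing.
  "#<#{[header, object.to_a.inspect, *ivars].join(' ')}>"
end

# Illustrative subclass, assumed only for this example.
class MyArrayClass < Array
  attr_accessor :a
end

list = MyArrayClass.new([1, 2, 3])
list.a = 1
puts describe_array_subclass(list)
```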

--
_ _
| |__ __ _| |_ ___ _ __ ___ __ _ _ __
| '_ \ / _` | __/ __| '_ ` _ \ / _` | '_ \
| |_) | (_| | |_\__ \ | | | | | (_| | | | |
|_.__/ \__,_|\__|___/_| |_| |_|\__,_|_| |_|
Running Debian GNU/Linux Sid (unstable)
batsman dot geo at yahoo dot com

And 1.1.81 is officially BugFree(tm), so if you receive any bug-reports
on it, you know they are just evil lies.
-- Linus Torvalds
 
John W. Long

A couple of things I haven't figured out:
I think we need a pure ruby solution here. At least for this hole. :)

A regular expression would do the trick:

irb(main):003:0> o = Object.new
=> #<Object:0x277c828>
irb(main):005:0> r = /:(0x.*?)[\s>]/
=> /:(0x.*?)[\s>]/
irb(main):007:0> o.inspect =~ r
=> 8
irb(main):008:0> $1
=> "0x277c828"
irb(main):009:0> o.instance_eval('@a=1')
=> 1
irb(main):010:0> o.inspect
=> "#<Object:0x277c828 @a=1>"
irb(main):011:0> o.inspect =~ r
=> 8
irb(main):012:0> $1
=> "0x277c828"

Seems to work pretty well, although I have no idea how we would get that
value for things like arrays and strings.
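
One possible way around that, offered as a sketch rather than a tested answer for the Ruby of this thread: current Ruby versions (2.0 and later, as far as I know) let you bind Kernel's generic inspect to any object, which bypasses Array#inspect and String#inspect and yields the default "#<Class:0x...>" form. The same regular expression can then be applied. The names ADDRESS_PATTERN and inspect_address are made up here.

```ruby
# Same pattern as in the irb session above.
ADDRESS_PATTERN = /:(0x.*?)[\s>]/

# Hypothetical helper: returns the hex value shown in the default inspect
# output, even for classes that override #inspect.
def inspect_address(object)
  # Kernel#inspect is the generic implementation; binding it to the object
  # skips any class-specific override.
  default = Kernel.instance_method(:inspect).bind(object).call
  default[ADDRESS_PATTERN, 1]
end

p inspect_address(Object.new)
p inspect_address([1, 2, 3])
p inspect_address('text')
```

This still goes through the inspect string, so it is the value Ruby chooses to print there, which may or may not be a usable memory address.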
 
ts

"M" == Mauricio Fernández <Mauricio> writes:

M> did you cut & paste my code, or is my training to think like you
M> bringing its first results? (the code is char-per-char identical in
M> everything but 'inpf' :)

cut & paste

M> I meant impossible if #__id__, #object_id & #id are redefined.

See the discussion about #class

Guy Decoux

p.s. :

svg% ruby -e 'def __id__() end'
-e:1: warning: redefining `__id__' may cause serious problem
svg%
 
