Overloading Array Subtraction operator

Discussion in 'Ruby' started by Nicko, Jun 10, 2007.

  1. Nicko

    Nicko Guest

    Hi,

    I have two arrays of hashes, and I'd like to subtract one from the
    other to find the elements that differ between them, e.g.:

    -----------
    array1 = Array.new
    array2 = Array.new

    tmp = {:name => "fred", :phone => "545334"}
    array1.push(tmp)
    tmp2 = tmp.dup

    array2.push(tmp2)
    tmp3 = {:name => "stan", :phone => "hehe"}
    array1.push(tmp3)

    arraydiff = array1 - array2
    --------------

    What methods would I have to overload to accomplish this task? I
    could not find an example like this anywhere!

    Nicko
    Nicko, Jun 10, 2007
    #1

  2. Erwin Abbott

    Erwin Abbott Guest

    The - operator compares objects by their ID, so they aren't removed
    unless they are instances of the same object. They may have the same
    value, but be separate instances like this example. You can accomplish
    what you want like this:

    array1.select{|x| !array2.include? x}
    # => [{:name=>"stan", :phone=>"hehe"}]

    Array#include? compares using == so they are compared by value, not by
    their #object_id.

    [:a, :b, :c].object_id # => 2711200
    [:a, :b, :c].object_id # => 2690960 ... a new instance, same value
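
    A minimal sketch of the value-based difference described above, written
    as a plain helper method instead of an operator. The method name
    difference_by_value is made up for illustration; the only library calls
    are Array#reject and Array#include?, and the latter compares with ==.

```ruby
# Hypothetical helper (not part of Ruby): returns the elements of `left`
# that have no ==-equal counterpart in `right`.
def difference_by_value(left, right)
  left.reject { |item| right.include?(item) }
end

fred = { :name => "fred", :phone => "545334" }
stan = { :name => "stan", :phone => "hehe" }

array1 = [fred, stan]
array2 = [fred.dup] # same content as fred, but a different object

diff = difference_by_value(array1, array2)
p diff # only the "stan" hash is left
```

    Note that this scans array2 once per element of array1, so it is
    quadratic; that is fine for short lists but worth remembering for long
    ones.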

    Regards,
    Erwin
    Erwin Abbott, Jun 10, 2007
    #2

  3. Nicko

    Nicko Guest

    Wow!
    Thank you both!

    I ended up with

    class SuperArray < Array
      def -(other)
        self.select { |x| !other.include?(x) }
      end
    end

    and it works great :) I can optimise it later :)

    Nicko

    On Jun 10, 5:56 pm, "Erwin Abbott" wrote:
    > The - operator compares objects by their ID, so they aren't removed
    > unless they are instances of the same object. They may have the same
    > value, but be separate instances like this example. You can accomplish
    > what you want like this:
    >
    > array1.select{|x| !array2.include? x}
    > # => [{:name=>"stan", :phone=>"hehe"}]
    >
    > Array#include? compares using == so they are compared by value, not by
    > their #object_id.
    >
    > [:a, :b, :c].object_id # => 2711200
    > [:a, :b, :c].object_id # => 2690960 ... a new instance, same value
    >
    > Regards,
    > Erwin
    Nicko, Jun 10, 2007
    #3
  4. Robert Klemme

    On 10.06.2007 10:25, Nicko wrote:
    > Wow!
    > Thank you both!
    >
    > I ended up with
    >
    > class SuperArray < Array
    > def -(other)
    > self.select{|x| !other.include? x}
    > end
    > end
    >
    > and it works great :) I can optimise it later :)


    It is usually not such a good idea to inherit from core classes like
    Array and Hash. Here are two healthier approaches.

    1. wrap Array with a class that represents the concept (which one btw?)
    your Array is used for. Then implement #- (and all the other methods).

    2. wrap Hash with a class that represents the concept (which one btw?)
    your Hash is used for. Then implement #==, #hash and #eql? accordingly.
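
    A rough sketch of what approach 2 could look like. The class name
    Contact and its attributes reader are assumptions made for this
    example (the thread has not yet said what the hashes represent); the
    point is only that ==, eql? and hash are defined in terms of content.

```ruby
# Hypothetical value object wrapping a Hash of attributes.
class Contact
  attr_reader :attributes

  def initialize(attributes)
    @attributes = attributes.dup.freeze
  end

  # Two contacts are equal when their attribute hashes are equal.
  def ==(other)
    other.is_a?(Contact) && attributes == other.attributes
  end
  alias eql? ==

  # Equal objects must return equal hash codes. Hashing the key-sorted
  # pairs makes the code independent of insertion order and does not
  # depend on how Hash#hash itself is implemented.
  def hash
    attributes.sort_by { |key, _| key.to_s }.hash
  end
end

list1 = [Contact.new({ :name => "fred", :phone => "545334" }),
         Contact.new({ :name => "stan", :phone => "hehe" })]
list2 = [Contact.new({ :phone => "545334", :name => "fred" })]

remaining = list1 - list2
p remaining.map { |c| c.attributes[:name] }
```

    With these three methods in place the built-in Array#- can be used
    unchanged, so there is no need to subclass Array at all.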

    The basic reason why your code does not work as you would like it to
    work is that Hash does not implement #eql? and #hash in a way that
    considers Hash content (for the reasons please search the archives, the
    topic has come up frequently). Note:

    irb(main):037:0> h={:foo=>:bar}
    => {:foo=>:bar}
    irb(main):038:0> h == h.dup
    => true
    irb(main):039:0> h.eql? h.dup
    => false
    irb(main):040:0> h.hash == h.dup.hash
    => false
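
    The session above reflects the Ruby that was current in 2007. For
    readers on a newer interpreter: in Ruby 1.9 and later, Hash#eql? and
    Hash#hash do take the content into account, so the same checks come
    out differently and the original array1 - array2 behaves the way Nicko
    expected. A small sketch to try, assuming such a Ruby version:

```ruby
h = { :foo => :bar }

same_value     = (h == h.dup)
same_eql       = h.eql?(h.dup)
same_hash_code = (h.hash == h.dup.hash)

array1 = [{ :name => "fred", :phone => "545334" },
          { :name => "stan", :phone => "hehe" }]
array2 = [{ :name => "fred", :phone => "545334" }]

# Array#- treats two elements as the same when they agree on hash/eql?.
arraydiff = array1 - array2

puts same_value, same_eql, same_hash_code
p arraydiff
```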

    Kind regards

    robert
    Robert Klemme, Jun 10, 2007
    #4
  5. Gregory Seidman

    On Sun, Jun 10, 2007 at 07:50:35PM +0900, Robert Klemme wrote:
    [...]
    > It is usually not such a good idea to inherit base classes like Array
    > and Hash.

    [...]

    That is an interesting statement. I don't think I agree with it, but I'd
    like to hear your reasoning behind it.

    > Kind regards
    > robert

    --Greg
    Gregory Seidman, Jun 10, 2007
    #5
  6. Yossef Mendelssohn

    On Jun 10, 2:56 am, "Erwin Abbott" wrote:
    > The - operator compares objects by their ID, so they aren't removed
    > unless they are instances of the same object.


    Is that so? Then why does this work?

    irb(main):001:0> %w{a b c} - %w{b}
    => ["a", "c"]

    And any number of similar examples.

    --
    -yossef
    Yossef Mendelssohn, Jun 10, 2007
    #6
  7. Erwin Abbott

    Erwin Abbott Guest

    On 6/10/07, Yossef Mendelssohn wrote:
    > Is that so? Then why does this work?
    >
    > irb(main):001:0> %w{a b c} - %w{b}
    > => ["a", "c"]
    >


    Yes, I responded hastily. The rdocs for Array#- don't say how objects
    are compared so I made a bad assumption. I only meant to convey it
    wasn't being done by comparing values.

    Thanks for pointing that out.
    Erwin Abbott, Jun 10, 2007
    #7
  8. Erwin Abbott

    Erwin Abbott Guest

    On 6/10/07, Erwin Abbott wrote:
    > Yes, I responded hastily. The rdocs for Array#- don't say how objects
    > are compared so I made a bad assumption. I only meant to convey it
    > wasn't being done by comparing values.


    ... at least with the array of Hashes, Hash#hash is used and not
    Hash#== or some value based comparison. I'm not sure how it's done
    with Strings or Fixnums, I'd have to check the source code probably.
    Check it out with the profiler:

    $ ruby -rprofile -e '[{:a=>3}] - [{:b=>0,:a=>0}]'
      %   cumulative   self              self     total
     time   seconds   seconds    calls  ms/call  ms/call  name
      0.00     0.00      0.00        2     0.00     0.00  Kernel.hash
      0.00     0.00      0.00        1     0.00     0.00  Array#-
      0.00     0.01      0.00        1     0.00    10.00  #toplevel

    $ ruby -rprofile -e '%w[a b c] - %w[b d e f]'
      %   cumulative   self              self     total
     time   seconds   seconds    calls  ms/call  ms/call  name
      0.00     0.00      0.00        1     0.00     0.00  Array#-
      0.00     0.01      0.00        1     0.00    10.00  #toplevel

    $ ruby -rprofile -e '[1,2,3] - [0,3,5]'
      %   cumulative   self              self     total
     time   seconds   seconds    calls  ms/call  ms/call  name
      0.00     0.00      0.00        1     0.00     0.00  Array#-
      0.00     0.01      0.00        1     0.00    10.00  #toplevel

    Regards,

    Erwin
    Erwin Abbott, Jun 10, 2007
    #8
  9. Nicko

    Nicko Guest

    On Jun 10, 8:46 pm, Robert Klemme wrote:
    >
    > It is usually not such a good idea to inherit base classes like Array
    > and Hash. Here are two more healthy approaches.
    >


    The code is meant to get two lists of files, one from a USB stick
    and one from a network share, put them in hashes (filename, size
    and MD5 hash), and then produce a list of the files that are in one
    list but not in the other.

    If the hashes are the same, they won't be the same instance because
    they were generated separately.

    Why is inheriting from Array not a healthy approach?

    Sorry, I'm a Ruby newbie.

    Thanks for the info below; it just seems like overkill for what I
    am doing.

    Nicko

    > 1. wrap Array with a class that represents the concept (which one btw?)
    > your Array is used for. Then implement #- (and all the other methods).
    >
    > 2. wrap Hash with a class that represents the concept (which one btw?)
    > your Hash is used for. Then implement #==, #hash and #eql? accordingly.
    >
    > The basic reason why your code does not work as you would like it to
    > work is that Hash does not implement #eql? and #hash in a way that
    > considers Hash content (for the reasons please search the archives, the
    > topic has come up frequently). Note:
    >
    > irb(main):037:0> h={:foo=>:bar}
    > => {:foo=>:bar}
    > irb(main):038:0> h == h.dup
    > => true
    > irb(main):039:0> h.eql? h.dup
    > => false
    > irb(main):040:0> h.hash == h.dup.hash
    > => false
    >
    > Kind regards
    >
    > robert
    Nicko, Jun 11, 2007
    #9
  10. Robert Klemme

    On 10.06.2007 17:38, Gregory Seidman wrote:
    > On Sun, Jun 10, 2007 at 07:50:35PM +0900, Robert Klemme wrote:
    > [...]
    >> It is usually not such a good idea to inherit base classes like Array
    >> and Hash.

    > [...]
    >
    > That is an interesting statement. I don't think I agree with it, but I'd
    > like to hear your reasoning behind it.


    This has been discussed numerous times, even here. On a conceptual
    level, more often than not a user-defined class XYZ /is not/ an
    Array but /uses/ an Array (for storing something). More practically, by
    inheriting from Array you conveniently publish all the methods you might
    consider useful, but you also publish methods that allow direct Array
    manipulation, which is especially bad if you want to enforce some
    additional constraints (e.g. a certain element order). While you can
    /unpublish/ methods in Ruby, IMHO it is less error-prone to explicitly
    define the methods that you want to allow on your class. (Just consider
    that a new version of Ruby becomes available which adds methods to Array
    that you do not want to be available to your clients, but which by
    default /are/ available unless you change your code as well. If you use
    delegation, you do not have to do anything about it in this case.)
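
    To make the delegation argument concrete, here is one possible sketch
    using the standard Forwardable module. The class SortedList and its
    add method are invented for this example; it keeps a sorted Array
    internally and publishes only a handful of read-only methods, so
    callers cannot break the ordering by calling push or reverse! directly.

```ruby
require "forwardable"

# Hypothetical wrapper: /uses/ an Array and keeps it sorted.
class SortedList
  extend Forwardable
  include Enumerable

  # Only these Array methods are published to clients.
  def_delegators :@items, :each, :size, :empty?, :[]

  def initialize(items = [])
    @items = items.sort
  end

  # The only way to add an element; the order constraint is kept.
  def add(item)
    @items = (@items + [item]).sort
    self
  end

  # Value-based difference, returned as another SortedList.
  def -(other)
    SortedList.new(@items.reject { |item| other.include?(item) })
  end
end

list = SortedList.new([3, 1, 2]).add(0)
diff = list - SortedList.new([2])

p diff.to_a
p list.respond_to?(:push)
```

    If a later Ruby adds new mutating methods to Array, a class written
    this way does not pick them up, which is the maintenance point made
    above.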

    If you disagree then you might be sharing a camp with Bertrand Meyer,
    whom I regard highly for his book OOSC, where he also promotes
    implementation inheritance (which you find in Eiffel). Note though that
    in Eiffel you have more options to control visibility of methods and
    inheritance than in Ruby and the compiler will catch many mistakes you
    can make in this area.

    Kind regards

    robert
    Robert Klemme, Jun 11, 2007
    #10
  11. Robert Klemme

    On 11.06.2007 03:25, Nicko wrote:
    > On Jun 10, 8:46 pm, Robert Klemme wrote:
    >> It is usually not such a good idea to inherit base classes like Array
    >> and Hash. Here are two more healthy approaches.

    >
    > The code is meant to be getting two lists of files, one on a usb stick
    > and one on a network share, putting them in hashes (for filename, size
    > and md5 hash) and now i want a list of the files that are in one list
    > but not on the other.


    Why then don't you just subtract key arrays (assuming that your keys
    are file names)? Or are size and MD5 important for your comparison? In
    that case I'd probably do this:

    FileInfo = Struct.new :file_name, :size, :md5

    If you put instances of this class in an Array or Set, your subtraction
    logic will work.
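
    A short sketch of both suggestions, with made-up file names, sizes
    and checksums standing in for the real USB and network listings. It
    works because Struct defines ==, eql? and hash in terms of the member
    values, which is what Array#- needs.

```ruby
FileInfo = Struct.new(:file_name, :size, :md5)

# Sample data, invented for illustration only.
usb = [FileInfo.new("a.txt", 10, "aaa"),
       FileInfo.new("b.txt", 20, "bbb")]
share = [FileInfo.new("a.txt", 10, "aaa"),
         FileInfo.new("b.txt", 20, "changed")]

# Compare on name, size and MD5 together.
only_on_usb = usb - share

# Compare on file name alone by subtracting the key arrays.
names_only_on_usb = usb.map { |f| f.file_name } - share.map { |f| f.file_name }

p only_on_usb.map { |f| f.file_name }
p names_only_on_usb
```

    Here b.txt shows up in the first result because its checksum differs,
    while the name-only comparison reports no difference.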

    > If the hashes are the same, they won't be the same instance because
    > they were generated seperately.
    >
    > Why is inheriting from Array not a healthy approach?


    See my other reply.

    Kind regards

    robert
    Robert Klemme, Jun 11, 2007
    #11
