Subclassing Hash to enforce value uniqueness ala key uniqueness.

Discussion in 'Ruby' started by Adam Gardner, Nov 18, 2008.

  1. Adam Gardner

    Adam Gardner Guest

    First of all, hello to everyone. This is my first message to this list.

    I find myself needing something very much like a Hash, but with unique
    one-to-one mapping in both directions, both key-to-value, and
    value-to-key. Now, I've been looking around a bit on Google, then
    Rubyforge, the RAA, and the archives of this list, and I'm a bit
    surprised I wasn't able to find a prior implimentation of this. Time to
    impliment my own, right?

    Well, before I got too far into the process, I though I would ask for
    people's thoughts here. First off, here is the effect I would like to
    see. I'll call my subclass OneToOne for now, though I'm open to better
    names.

    >> oto = OneToOne[:a=>1, :b=>2, :c=>3]

    => {:a=>1, :b=>2, :c=>3} # or on 1.8.x, {:b=>2, :c=>3, :a=>1}
    >> oto[:d] = 1

    => 1
    >> oto

    => {:b=>2,:c=>3,:d=>1}
    >> OneToOne[:a=>1,:b=>1,:c=>1]

    => {:c=>1}
    ...
    etc

    Here is some code which begins to implement such a class:

    class OneToOne < Hash

    def self.[](*args)
    super(*args).invert.invert
    end

    def initialize(*args)
    super(*args)
    self.fix
    end

    def []=(key,val)
    self.delete_if {|k,v| v == val}
    super(key,val)
    end

    def store(key,val)
    self.delete_if {|k,v| v == val}
    super(key,val)
    end

    def key(val) #this is just a nicety to arrange for code to run in 1.9
    and 1.8
    if Hash.method_defined?:)key)
    super(val)
    else
    self.index(val)
    end
    end

    def fix
    self.replace(self.invert.invert)
    end
    end

    This code works to a certain degree, but I see two problems going
    forward. First of all, Hash is a built-in in ruby which is written (as
    far as I am aware) entirely in C. As such, there is no 'core' method
    which can be overridden to affect change in all the methods which build
    on it. At least, implementing []= did not give me store, and
    implimenting store did not give me []=. Neither gives me merge. How many
    other methods will I therefore have to override to ensure value
    uniqueness? It seems to me, every method which could modify the contents
    of the hash in some way (other than simply removing entries). Even then,
    could I guarantee no two keys would have the same value, with the same
    level of certainty I have that no two values have the same key?

    The second problem is efficiency: It seems to me that this could
    probably be done much more efficiently, especially if implemented in C.
    It doesn't affect my much for what I need it for right now, but if I
    want to reuse it later (or if someone else wants to use it), it could
    matter a great deal.

    (Incidentally: searching for information about this has been somewhat
    hampered by the fact that the phrase 'hash value' is very ambiguous
    between the a value contained in a hash and paired with a key, or value
    returned by a hashing algorithm such as Object#hash or MD5. It doesn't
    help that most of other terms I can think of to search for, such as
    bidirectional or unique or one-to-one, are more common when talking
    about the latter than the former. Now I understand why several other
    languages choose the term 'Dictionary'.)

    Anyway, if someone has either a) a prior implementation of this, or b)
    some suggestions on improving my implimentation, I'd love to hear it.

    Thanks for your time,
    - Adam Gardner
     
    Adam Gardner, Nov 18, 2008
    #1
    1. Advertising

  2. Adam Gardner

    Trans Guest

    On Nov 17, 11:33=A0pm, Adam Gardner <> wrote:
    > Anyway, if someone has either a) a prior implementation of this, or b)
    > some suggestions on improving my implimentation, I'd love to hear it.


    You may be able to save yourself some time with a little meta
    programming. Eg.

    class HashSet < Hash

    Hash.instance_methods(false).each do |m|
    define_method(m, *a, &b)
    r =3D super
    fix
    r
    end
    end

    def fix
    self.replace(self.invert.invert)
    end

    end

    You'll want to improve upon that code of course, probably limit it to
    only certain methods. But you get the idea.

    T.
     
    Trans, Nov 18, 2008
    #2
    1. Advertising

  3. Adam Gardner wrote:
    > The second problem is efficiency: It seems to me that this could
    > probably be done much more efficiently, especially if implemented in C.


    Every time you add a value you iterate over all the other values to check
    whether the value is already there. This makes adding an element O(n). Having
    adding to a datastructure be an O(n) operation is usually a bad idea. Here's
    how I'd probably do it (untested):

    class OneToOne
    def initialize()
    @key_value = {}
    @value_key = {}
    end

    def [](k)
    @key_value[k]
    end

    def []=(k,v)
    if @key_value.has_key?(k)
    @value_key.delete(@key_value[k])
    end
    if @value_key.has_key(?v)
    @key_value.delete(@value_key[v])
    end
    @key_value[k] = v
    @value_key[v] = k
    end

    def to_hash
    @key_value
    end
    def invert
    @value_key
    end
    ...
    end


    HTH,
    Sebastian
    --
    Jabber:
    ICQ: 205544826
     
    Sebastian Hungerecker, Nov 18, 2008
    #3
  4. On 18.11.2008 09:37, Sebastian Hungerecker wrote:
    > Adam Gardner wrote:
    >> The second problem is efficiency: It seems to me that this could
    >> probably be done much more efficiently, especially if implemented in C.

    >
    > Every time you add a value you iterate over all the other values to check
    > whether the value is already there. This makes adding an element O(n). Having
    > adding to a datastructure be an O(n) operation is usually a bad idea. Here's
    > how I'd probably do it (untested):


    <snip>good example</snip>

    Right, my main point would be: why subclass? You inherit a lot of
    methods that you need to watch out for. Rather, as Basti showed,
    implement a new class with all the necessary methods.

    Kind regards

    robert


    PS: Sorry for the nicknaming. :)
     
    Robert Klemme, Nov 18, 2008
    #4
  5. Adam Gardner

    James Coglan Guest

    [Note: parts of this message were removed to make it a legal post.]

    >
    > Every time you add a value you iterate over all the other values to check
    > whether the value is already there. This makes adding an element O(n).
    > Having
    > adding to a datastructure be an O(n) operation is usually a bad idea.
    > Here's
    > how I'd probably do it (untested):
    >
    > # example (elided)




    This seems like a good idea, I'd have suggested the same. This is exactly
    what Set in the standard library does to ensure uniqueness, it puts the
    members as keys in a hash and lets Ruby sort out the hashtable as
    efficiently as possible. So a pair of mutually inverted hash tables is
    probably the way to go.
     
    James Coglan, Nov 18, 2008
    #5
  6. Sebastian Hungerecker wrote:
    > =A0 def to_hash
    > =A0 =A0 @key_value
    > =A0 end


    This should return @key_value.dup=20

    > =A0 def invert
    > =A0 =A0 @value_key
    > =A0 end


    And this should either return @value_key.dup or another instance of OneToOn=
    e=20
    where the both hashes are switched.

    HTH,
    Sebastian
    =2D-=20
    Jabber:
    ICQ: 205544826
     
    Sebastian Hungerecker, Nov 19, 2008
    #6
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Paul Tomblin
    Replies:
    3
    Views:
    4,456
    Paul Tomblin
    Dec 8, 2006
  2. rp
    Replies:
    1
    Views:
    584
    red floyd
    Nov 10, 2011
  3. Une bévue
    Replies:
    5
    Views:
    176
    Une bévue
    Aug 10, 2006
  4. Antonio Quinonez
    Replies:
    2
    Views:
    195
    Antonio Quinonez
    Aug 14, 2003
  5. Ralf Baerwaldt
    Replies:
    1
    Views:
    150
    Paul Lalli
    Jul 20, 2004
Loading...

Share This Page