equals and hascode

Discussion in 'Java' started by mark jason, Dec 18, 2010.

  1. mark jason

    mark jason Guest

    hi
    I am using colt matrix package from cern for some matrix manipulations
    in my application.To provide a more convenient interface for client ,I
    decided to override a matrix class (let's say BaseMatrix) as below

    class MyMatrix extends cern.somecoltpackage.BaseMatrix {
    public SomeValue easierInterface1(){
    //return something...
    }
    public SomeOtherValue easierInterface2(){
    //return somethingelse...
    }
    ....
    }

    However,while running FindBugs over the code,I am getting this warning

    MyMatrix.java:1 MyMatrix inherits equals and uses Object.hashCode()

    I went through the source code of the superclass in cern package and
    couldn't find any implementation of equals() or hashCode().What should
    I do about this warning?Should I ignore it?Or should I implement these
    methods?Please advise.
    thanks,
    mark.
     
    mark jason, Dec 18, 2010
    #1
    1. Advertising

  2. mark jason <> wrote:
    > class MyMatrix extends cern.somecoltpackage.BaseMatrix { [...] }
    > However,while running FindBugs over the code,I am getting this warning
    > MyMatrix.java:1 MyMatrix inherits equals and uses Object.hashCode()
    > I went through the source code of the superclass in cern package and
    > couldn't find any implementation of equals() or hashCode().


    Could you try to run FindBugs over cern.somecoltpackage.BaseMatrix ?

    I don't know "FindBugs", but the wording appears to imply a bug
    already in the super class (or even higher up).
     
    Andreas Leitgeb, Dec 18, 2010
    #2
    1. Advertising

  3. mark jason

    mark jason Guest

    On Dec 18, 2:12 pm, Andreas Leitgeb <>
    wrote:

    > Could you try to run FindBugs over cern.somecoltpackage.BaseMatrix ?
    >
    > I don't know "FindBugs", but the wording appears to imply a bug
    > already in the super class (or even higher up).


    same warning there,
    /src/cern/colt/matrix/DoubleMatrix2D.java:284
    cern.colt.matrix.DoubleMatrix2D defines equals and uses
    Object.hashCode()
     
    mark jason, Dec 18, 2010
    #3
  4. mark jason

    Aeris Guest

    mark jason wrote:

    > MyMatrix.java:1 MyMatrix inherits equals and uses Object.hashCode()


    One of the (transitive) superclass of your class redefined «void
    equals(Object)» but not «int hashCode()»
    This is potentially wrong because Java doc say «o1.equals(o2) =>
    o1.hashCode() == o2.hashCode()» and can lead to strange behavior (bad sort
    on generic algorithm, wrong result of equals, breaked HashMap or HashSet…)

    A «good» implementation of equals is

    public boolean equals(Object obj) {
    if(this == obj) {
    return true;
    }

    if (obj == null || !(obj instanceof MyClass)) {
    return false;
    }

    return this.hashCode() == obj.hashCode();
    }
     
    Aeris, Dec 18, 2010
    #4
  5. mark jason

    mark jason Guest

    On Dec 18, 4:04 pm, Aeris <> wrote:
    > mark jason wrote:
    > > MyMatrix.java:1 MyMatrix inherits equals and uses Object.hashCode()

    >
    > One of the (transitive) superclass of your class redefined «void
    > equals(Object)» but not «int hashCode()»


    The inheritance hierachy is

    MyMatrix extends DenseDoubleMatrix2D extends DoubleMatrix2D .......

    DenseDoubleMatrix2D does not define equals()
    DoubleMatrix2D redefines equals() as below but not hashCode()

    public boolean equals(Object obj) {
    if (this == obj) return true;
    if (obj == null) return false;
    if (!(obj instanceof DoubleMatrix2D)) return false;
    return cern.colt.matrix.linalg.Property.DEFAULT.equals(this,
    (DoubleMatrix2D) obj);
    }
     
    mark jason, Dec 18, 2010
    #5
  6. mark jason

    Aeris Guest

    mark jason wrote:

    > DoubleMatrix2D redefines equals() as below but not hashCode()


    Here is the mistake
    And by inheritance, your custom class is also «bugged»
     
    Aeris, Dec 18, 2010
    #6
  7. mark jason

    Aeris Guest

    Chris Uppal wrote:

    > except in the extremely
    > rare cases where some class's instances' states are entirely reflected in
    > their hashCode()s.


    Does it shouldn't be the case?

    A good practice is to write immutable object and in case of mutable object,
    I agree with you that hashcode could throw a exception to avoid collection
    or map problem.

    And Java doc say:
    > Whenever it is invoked on the same object more than once during an

    execution of a Java application, the hashCode method must consistently
    return the same integer

    > If two objects are equal according to the equals(Object) method, then

    calling the hashCode method on each of the two objects must produce the same
    integer result.

    > However, the programmer should be aware that producing distinct integer

    results for unequal objects may improve the performance of hashtables.

    So, in the most of the cases (immutable object), hashcode and equals MAY be
    equivalent and hashCode could be computed «statically» in constuctors and my
    equals code is good.
    In minor cases (immutable object), hashCode + equals mix is a non-sense
    concept, which could lead to very very serious problems with all generic
    classes (collection, map, set, generic algo…) and MUST be avoided and
    disable by an exception.
     
    Aeris, Dec 18, 2010
    #7
  8. mark jason

    Lew Guest

    mark jason wrote:
    > The inheritance hierachy is
    >
    > MyMatrix extends DenseDoubleMatrix2D extends DoubleMatrix2D .......
    >
    > DenseDoubleMatrix2D does not define equals()
    > DoubleMatrix2D redefines equals() as below but not hashCode()
    >
    > public boolean equals(Object obj) {
    > if (this == obj) return true;
    > if (obj == null) return false;
    > if (!(obj instanceof DoubleMatrix2D)) return false;


    Note that this code effectively checks twice whether 'obj == null'.

    > return cern.colt.matrix.linalg.Property.DEFAULT.equals(this,
    > (DoubleMatrix2D) obj);
    > }


    Others have pointed out that the "colt" library itself has the bug. This
    means that its matrix classes will behave strangely in hashed collections such
    as 'HashSet'.

    Read and study /Effective Java/ (2nd ed.) by Joshua Bloch for the rationale
    and best practices around 'hashCode' and 'equals'. This book is /de rigueur/
    for anyone wishing to describe themselves as a Java programmer.

    --
    Lew
    Ceci n'est pas une pipe.
     
    Lew, Dec 18, 2010
    #8
  9. mark jason

    Lew Guest

    On 12/18/2010 07:15 AM,
    > Aeris wrote:
    >
    >> A �good� implementation of equals is
    >>
    >> public boolean equals(Object obj) {
    >> if(this == obj) {
    >> return true;
    >> }
    >>
    >> if (obj == null || !(obj instanceof MyClass)) {
    >> return false;
    >> }
    >>
    >> return this.hashCode() == obj.hashCode();
    >> }


    Chris Uppal wrote:
    > That is a very, very,

    very, very, very, very
    > /bad/ implementation of equals(), except in the extremely


    no exceptions

    > rare cases where some class's instances' states are entirely reflected in their
    > hashCode()s. I.e. that each unique (up to whatever notion of equality is
    > appropriate for that class) object has a unique hash code. That is hardly ever
    > going to be true.


    And you should never count on it.

    Plus it checks twice whether 'obj == null'.

    --
    Lew
    Ceci n'est pas une pipe.
     
    Lew, Dec 18, 2010
    #9
  10. mark jason

    mark jason Guest

    On Dec 18, 6:12 pm, Lew <> wrote:

    > Others have pointed out that the "colt" library itself has the bug.  This
    > means that its matrix classes will behave strangely in hashed collections such
    > as 'HashSet'.



    thanks everyone for the replies,so if I still want to subclass the
    class from colt package,should I implement the method as given below?
    regards,
    mark
    @Override
    public int hashCode() {
    throw new UnsupportedOperationException();
    }
     
    mark jason, Dec 18, 2010
    #10
  11. mark jason

    Eric Sosman Guest

    On 12/18/2010 6:04 AM, Aeris wrote:
    > mark jason wrote:
    >
    >> MyMatrix.java:1 MyMatrix inherits equals and uses Object.hashCode()

    >
    > One of the (transitive) superclass of your class redefined «void
    > equals(Object)» but not «int hashCode()»
    > This is potentially wrong because Java doc say «o1.equals(o2) =>
    > o1.hashCode() == o2.hashCode()» and can lead to strange behavior (bad sort
    > on generic algorithm, wrong result of equals, breaked HashMap or HashSet…)
    >
    > A «good» implementation of equals is
    >
    > public boolean equals(Object obj) {
    > if(this == obj) {
    > return true;
    > }
    >
    > if (obj == null || !(obj instanceof MyClass)) {
    > return false;
    > }
    >
    > return this.hashCode() == obj.hashCode();
    > }


    This is "«good»" in the sense that it satisfies the requirement
    that equals() agree with hashCode(), but it does so only in a trivial
    sense. If this implementation were "«good enough»", then it could
    have been a `final' method of the Object class -- but it's not, meaning
    that other classes are expected to have their own ideas about equals(),
    notions that take into account more information than this one does.

    For example, imagine that Long used this implementation for
    equals(). Then for any Long instance `val', there would be
    4294967295 other Long instances `valx' such that

    val1.equals(valx) && val1.longValue() != valx.longValue()

    .... leading to interesting consequences like

    HashMap<Long,String> map = new HashMap<Long,String>();
    map.put(1L, "one");
    map.put(0x100000000L, "four gig");
    assert (map.size() == 1);

    .... which many people would find surprising -- and disappointing.

    --
    Eric Sosman
    lid
     
    Eric Sosman, Dec 18, 2010
    #11
  12. mark jason

    Aeris Guest

    Chris Uppal wrote:

    > The reason I say that it could never have been a requirement that equality
    > of hash codes be enough for certainty that the objects are equal(), is
    > that there are only a few possible values for hashCode() -- about 4
    > billion (2 ** 32).


    This is where I fail
    I have not considered that hashcode restrict the «equals» possibility to
    2**32 cases…
     
    Aeris, Dec 18, 2010
    #12
  13. mark jason

    Eric Sosman Guest

    On 12/18/2010 9:16 AM, Chris Uppal wrote:
    > Aeris wrote:
    >> Chris Uppal wrote:
    >>
    >>> except in the extremely
    >>> rare cases where some class's instances' states are entirely reflected
    >>> in their hashCode()s.

    >>
    >> Does it shouldn't be the case?

    >
    > I'm not really sure what you are asking here ? If you are asking whether it's
    > a good idea for an object's state to be entirely reflected in its hash code,
    > then my answer is that you wouldn't ever normally have the option (see below).
    >
    >> A good practice is to write immutable object and in case of mutable
    >> object, I agree with you that hashcode could throw a exception to avoid
    >> collection or map problem.

    >
    > I didn't recommend that /every/ mutable object should throw from hashCode(); my
    > point is only that it's a useful design /option/, and possibly a good one in
    > the case of the Colt library Matrix classes.
    >
    > The idea is that mutability is only bad /if/ it affects the object's idea of
    > what it is equal to (and therefore also its hashCode()). Most objects are
    > either immutable or don't have that "changing-equality" property. In either
    > case they can go into hashed collections without problems.
    >
    > The problems only come when the object has "changing-equality" AND it actually
    > changes (or is remotely likely to do so) while it is in the collection. It may
    > be that the collections the code uses are -- by design -- only used to hold
    > objects which have "settled down" and will never change again. Depending on
    > circumstances, it might, or might not, be worth the effort of adding code to
    > make it /impossible/ for the object to change once it had been added to the
    > collection. (For instance, making an immutable copy of the object and putting
    > that in the collection instead, or putting a boolean
    > neverChangeAgainEverEverEver
    > flag in the object.)
    >
    > However, if the object has "changing-equality" and is normally mutable over its
    > whole lifecycle, then it /may/ make sense to add safety by throwing from
    > hashCode().


    Note that this won't help with objects in non-hashed keyed
    collections like TreeSet, which are based on compareTo() (or on a
    Comparator) and probably don't use hashCode() at all.

    Also, although the neverChangeAgainEverEverEver flag may sound
    plausible, I'm not at all sure it's practical. The question arises:
    When does it get set? At the first call to hashCode() or equals() or
    compareTo() or some other instance method that *might* be called when
    some collection gets a reference to the object? What if you put the
    object in an ordinary List and then call Collections.sort() on the
    List; should the object thereby lose its ability to mutate? If so,
    why; if not, why not? I fear the management of the flag might turn
    out to be more complicated than it's worth, giving rise to extra
    methods like testEqualityWithoutFreezing(). And then you need to tell
    the programmers when to use testEqualityWithoutFreezing() instead of
    equals() and vice versa ... and isn't this at least as brittle as
    telling them not to mutate an object while it's in a Map?

    I do not foresee a day when programming languages that are powerful
    enough to be useful will also be so safe they require no discipline on
    the part of the programmers. See also "With Folded Hands" by Jack
    Williamson.

    --
    Eric Sosman
    lid
     
    Eric Sosman, Dec 18, 2010
    #13
  14. On Sat, 18 Dec 2010 11:23:22 -0500, Eric Sosman wrote:

    > Also, although the neverChangeAgainEverEverEver flag may sound
    > plausible, I'm not at all sure it's practical. The question arises:
    > When does it get set?
    >

    As an alternative approach, how about building the 'immutable' class to
    contain the all the 'immutable' attributes. These are declared as
    protected and the only public methods are getters and equals(). Now
    extend it with a subclass that contains all the methods and constructors
    that alter the protected attributes and a freeze() method that calculates
    the hash attribute and returns a reference to the 'immutable' class.

    I think this would work: you'd put the object returned by freeze() into
    hashed collections, etc. where it can't be changed because it has no
    methods that apply changes and its attributes are protected. If you
    retain a reference to the subclass you can continue to modify this
    instance and even update the collection by deleting the superclass
    instance and replacing it with a new instance returned by the subclass's
    freeze() method.

    This idea is purely a thought experiment. Is it simply too arcane to be
    worth the implementation and maintenance effort and/or has it got some
    other flaw I didn't spot?


    --
    martin@ | Martin Gregorie
    gregorie. | Essex, UK
    org |
     
    Martin Gregorie, Dec 18, 2010
    #14
  15. Martin Gregorie <> wrote:
    > As an alternative approach, how about building the 'immutable' class to
    > contain the all the 'immutable' attributes. These are declared as
    > protected and the only public methods are getters and equals(). Now
    > extend it with a subclass that contains all the methods and constructors
    > that alter the protected attributes and a freeze() method that calculates
    > the hash attribute and returns a reference to the 'immutable' class.


    I wonder, whether you meant returning an ImmutableBaseClass typed
    reference to the mutable instance, or if you meant to create a new
    ImmutableBaseClass-typed instance initialized to the same relevant
    member values.

    Either way would be unsafe. One could still add a mutable instance
    directly into the HashMap (bypassing freeze()) and then modify it.

    You could do with an abstract base class, from which you derive a
    mutable and an immutable subclass and have your mutable subclass's
    freeze() create an immutable-subclass-typed snapshot instance.
     
    Andreas Leitgeb, Dec 18, 2010
    #15
  16. mark jason

    markspace Guest

    On 12/18/2010 5:47 AM, mark jason wrote:
    > On Dec 18, 6:12 pm, Lew<> wrote:
    >
    >> Others have pointed out that the "colt" library itself has the bug. This
    >> means that its matrix classes will behave strangely in hashed collections such
    >> as 'HashSet'.

    >
    >
    > thanks everyone for the replies,so if I still want to subclass the
    > class from colt package,should I implement the method as given below?



    I think a good fix would be to contact the Cern-Colt people and get them
    to fix their bug.

    Failing that, grab the source code and fix it yourself. The problem
    with that is they might assume that hashcode() provides identity
    semantics and you'll break their code if you do change hashcode().

    We'd have know exactly what
    cern.colt.matrix.linalg.Property.DEFAULT.equals does to figure out
    what's the right way to go here.


    > regards,
    > mark
    > @Override
    > public int hashCode() {
    > throw new UnsupportedOperationException();
    > }
     
    markspace, Dec 18, 2010
    #16
  17. "Patricia Shanahan" <> wrote in message
    news:...
    > Aeris wrote:
    >> Chris Uppal wrote:
    >>
    >>> The reason I say that it could never have been a requirement that
    >>> equality
    >>> of hash codes be enough for certainty that the objects are equal(), is
    >>> that there are only a few possible values for hashCode() -- about 4
    >>> billion (2 ** 32).

    >>
    >> This is where I fail
    >> I have not considered that hashcode restrict the «equals» possibility to
    >> 2**32 cases…
    >>

    >
    > Your equals method would work if, and only if, hashCode implements a
    > "perfect hash". Even for types with fewer than 2**32 cases, depending on
    > having a perfect hash is often not practical.


    Or not optimal. A class consisting of a four character string would
    probably not make a perfect hash of them, even though that's easy enough,
    because the normal String hash tends to have better bucket-distribution
    characteristics.
     
    Mike Schilling, Dec 18, 2010
    #17
  18. On Sat, 18 Dec 2010 17:41:52 +0000, Andreas Leitgeb wrote:

    > I wonder, whether you meant returning an ImmutableBaseClass typed
    > reference to the mutable instance,
    >

    That's what I meant.

    > Either way would be unsafe. One could still add a mutable instance
    > directly into the HashMap (bypassing freeze()) and then modify it.
    >

    Good point. As implied, I thought I must have missed something, so thanks
    for pointing it out.

    > You could do with an abstract base class, from which you derive a
    > mutable and an immutable subclass and have your mutable subclass's
    > freeze() create an immutable-subclass-typed snapshot instance.
    >

    Neat.

    Am I right to suppose the immutable subclass's getHash() would return a
    hash value while the mutable subclass's getHash() would throw an
    exception to prevent it being added to a hash collection?


    --
    martin@ | Martin Gregorie
    gregorie. | Essex, UK
    org |
     
    Martin Gregorie, Dec 18, 2010
    #18
  19. Martin Gregorie <> wrote:
    > Am I right to suppose the immutable subclass's getHash() would return a
    > hash value while the mutable subclass's getHash() would throw an
    > exception to prevent it being added to a hash collection?


    Throwing an exception is one way.

    Defining hashCode() to return some constant like that
    public int hashCode() { return 42; }
    would be ok, too. (actual value doesn't matter)

    In the former case, you get a runtime exception on trying to add
    it to a HashMap, in the latter case it will work, but always place
    all the mutable instances into the same bucket. Bad performance,
    but consistent/correct (regardless of how equals() is defined).
     
    Andreas Leitgeb, Dec 18, 2010
    #19
  20. mark jason

    Lew Guest

    Aeris wrote:
    > So, in the most of the cases (immutable object), hashcode and equals MAY be
    > equivalent and hashCode could be computed «statically» in constuctors and my
    > equals code is good.


    Actually, 'hashCode()' and 'equals()' will not be equivalent in general,
    because unequal objects always risk having the same hash code. So if by "my
    equals code" you mean

    'return this.hashCode() == obj.hashCode();'

    this is never good. Ever. You cannot and should not assume that distinct by
    'equals()' implies distinct hash codes. Bad, very bad. Very, very bad.
    Very, very, very bad. Very, very, very, very, ...

    > In minor cases (immutable object), hashCode + equals mix is a non-sense
    > concept, which could lead to very very serious problems with all generic
    > classes (collection, map, set, generic algo…) and MUST be avoided and
    > disable by an exception.


    It is possible to use mutable objects in collections as long as they don't
    mutate whilst in the collection. It's just risky. But it is done all the
    time, e.g., in JPA entities.

    --
    Lew
    Ceci n'est pas une pipe.
     
    Lew, Dec 19, 2010
    #20
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Gregory A. Swarthout

    equals and hashCode

    Gregory A. Swarthout, Dec 19, 2003, in forum: Java
    Replies:
    2
    Views:
    370
    Silvio Bierman
    Dec 20, 2003
  2. Franck

    equals() and persitence

    Franck, Jan 27, 2004, in forum: Java
    Replies:
    8
    Views:
    442
    Anton Spaans
    Jan 28, 2004
  3. Edward A Thompson
    Replies:
    4
    Views:
    558
    Tony Morris
    Feb 11, 2004
  4. Mike Schilling
    Replies:
    11
    Views:
    1,479
    Mike Schilling
    Jun 12, 2004
  5. Tom Dyess

    equals and hashCode

    Tom Dyess, Jun 19, 2005, in forum: Java
    Replies:
    28
    Views:
    938
    Virgil Green
    Jun 23, 2005
Loading...

Share This Page