hashCode

Discussion in 'Java' started by bob smith, Aug 10, 2012.

  1. bob smith

    bob smith Guest

    Is it always technically correct to override the hashCode function like so:

    @Override
    public int hashCode() {
    return 1;
    }

    Would it be potentially better if that was Object's implementation?
     
    bob smith, Aug 10, 2012
    #1
    1. Advertising

  2. bob smith

    Arne Vajhøj Guest

    On 8/10/2012 11:47 AM, bob smith wrote:
    > Is it always technically correct to override the hashCode function like so:
    >
    > @Override
    > public int hashCode() {
    > return 1;
    > }


    It meets the minimum contractual obligation for that method.

    But performance of anything using the hash capability (like HashMap<>)
    would be very bad.

    > Would it be potentially better if that was Object's implementation?


    It has a different behavior that may not be valid if you override equals.

    Arne
     
    Arne Vajhøj, Aug 10, 2012
    #2
    1. Advertising

  3. bob smith

    Eric Sosman Guest

    On 8/10/2012 11:47 AM, bob smith wrote:
    > Is it always technically correct to override the hashCode function like so:
    >
    > @Override
    > public int hashCode() {
    > return 1;
    > }
    >
    > Would it be potentially better if that was Object's implementation?


    Define "better."

    --
    Eric Sosman
    d
     
    Eric Sosman, Aug 10, 2012
    #3
  4. bob smith

    markspace Guest

    On 8/10/2012 9:13 AM, Arne Vajhøj wrote:
    > On 8/10/2012 11:47 AM, bob smith wrote:
    >> Is it always technically correct to override the hashCode function
    >> like so:
    >>
    >> @Override
    >> public int hashCode() {
    >> return 1;
    >> }

    >
    > It meets the minimum contractual obligation for that method.
    >
    > But performance of anything using the hash capability (like HashMap<>)
    > would be very bad.
    >
    >> Would it be potentially better if that was Object's implementation?

    >
    > It has a different behavior that may not be valid if you override equals.



    I think at this point we are doing Bob's homework for him.
     
    markspace, Aug 10, 2012
    #4
  5. bob smith

    Arne Vajhøj Guest

    On 8/10/2012 1:13 PM, markspace wrote:
    > On 8/10/2012 9:13 AM, Arne Vajhøj wrote:
    >> On 8/10/2012 11:47 AM, bob smith wrote:
    >>> Is it always technically correct to override the hashCode function
    >>> like so:
    >>>
    >>> @Override
    >>> public int hashCode() {
    >>> return 1;
    >>> }

    >>
    >> It meets the minimum contractual obligation for that method.
    >>
    >> But performance of anything using the hash capability (like HashMap<>)
    >> would be very bad.
    >>
    >>> Would it be potentially better if that was Object's implementation?

    >>
    >> It has a different behavior that may not be valid if you override equals.

    >
    > I think at this point we are doing Bob's homework for him.


    Could be.

    But I think the question whether returning a constant from
    hashCode is a valid Java question for understanding the
    contract for that method.

    And I am pretty sure that I have seen other similar
    examples (just with 42 as constant).

    Arne
     
    Arne Vajhøj, Aug 10, 2012
    #5
  6. bob smith

    Roedy Green Guest

    On Fri, 10 Aug 2012 08:47:12 -0700 (PDT), bob smith
    <> wrote, quoted or indirectly quoted someone
    who said :

    > @Override
    > public int hashCode() {
    > return 1;
    > }


    that's about the worst possible hashCode function.

    See http://mindprod.com/jgloss/hashcode.html for tips on how to write
    good ones.
    --
    Roedy Green Canadian Mind Products http://mindprod.com
    A new scientific truth does not triumph by convincing its opponents and making them see the light,
    but rather because its opponents eventually die, and a new generation grows up that is familiar with it.
    ~ Max Planck 1858-04-23 1947-10-04
     
    Roedy Green, Aug 10, 2012
    #6
  7. bob smith

    Lew Guest

    Roedy Green wrote:
    > bob smith wrote, quoted or indirectly quoted someone
    > who said :
    >
    >> @Override
    > > public int hashCode() {
    > > return 1;
    > > }

    >
    > that's about the worst possible hashCode function.


    Normally that's correct, but it's conceivable that one might do it for
    some hackish reason. In most situations where one might do such
    an override as this, one would do better not to override hashCode().

    > See http://mindprod.com/jgloss/hashcode.html for tips on how to write
    > good ones.


    The default of assembling it via the mix-in of attribute hash codes
    using the Knuth constants is usually good enough.

    h(object) = Sum(i ∈ 0.. n-1) of ( attribute * pow(31, n - 1 - i) );

    or

    public static int calculateHash(Foo arg) {
    int h = 0;

    for ( each attribute of Foo that contributes to 'equals()' )
    {
    h = 31 * h + attribute.hashCode();
    }
    return h;
    }

    http://en.wikipedia.org/wiki/Hash_function
    http://en.wikipedia.org/wiki/Java_hashCode()
    http://en.wikipedia.org/wiki/Java_hashCode()#Java

    --
    Lew
     
    Lew, Aug 10, 2012
    #7
  8. bob smith

    bob smith Guest

    On Friday, August 10, 2012 11:34:28 AM UTC-5, Eric Sosman wrote:
    > On 8/10/2012 11:47 AM, bob smith wrote:
    >
    > > Is it always technically correct to override the hashCode function like so:

    >
    > >

    >
    > > @Override

    >
    > > public int hashCode() {

    >
    > > return 1;

    >
    > > }

    >
    > >

    >
    > > Would it be potentially better if that was Object's implementation?

    >
    >
    >
    > Define "better."
    >
    >
    >
    > --
    >
    > Eric Sosman
    >
    > d


    Better in the sense that you would never HAVE to override hashCode.

    Now, there are cases where you HAVE to override it, or your code is very broken.
     
    bob smith, Aug 10, 2012
    #8
  9. bob smith

    Lew Guest

    bob smith wrote:
    > Eric Sosman wrote:
    >> bob smith wrote:
    >>> Is it always technically correct to override the hashCode function like so:


    It complies with the contract for 'hashCode()'. Is that all it takes to be correct?

    >>> Would it be potentially better if that was Object's implementation?


    Would what be better if what were Object's implementation of what?

    >> Define "better."


    > Better in the sense that you would never HAVE to override hashCode.
    >
    > Now, there are cases where you HAVE to override it, or your code is very broken.


    No.

    No matter what 'Object''s 'hashCode()' implementation were, it would need to
    be overridden when you want value equality instead of object identity for
    'equals()'.

    See Joshua Bloch's seminal work /Effective Java/, which has items that pertain to this.

    Bottom line: 'hashCode()', 'equals()', and when present, 'compareTo()' must be consistent.

    'toString()' should be consistent with those.

    As long as 'hashCode()' fulfills the contract, your code will work - functionally. But a bad
    'hashCode()' could and likely will noticeably affect performance. There is more to correctness
    than mere functional conformance.

    --
    Lew
     
    Lew, Aug 10, 2012
    #9
  10. bob smith

    Arne Vajhøj Guest

    On 8/10/2012 3:17 PM, Roedy Green wrote:
    > On Fri, 10 Aug 2012 08:47:12 -0700 (PDT), bob smith
    > <> wrote, quoted or indirectly quoted someone
    > who said :
    >
    >> @Override
    >> public int hashCode() {
    >> return 1;
    >> }

    >
    > that's about the worst possible hashCode function.


    Yes, but the posted asked "Is it always technically correct to ..."
    not whether is was "best possible".

    Arne
     
    Arne Vajhøj, Aug 11, 2012
    #10
  11. bob smith

    Arne Vajhøj Guest

    On 8/10/2012 6:22 PM, bob smith wrote:
    > On Friday, August 10, 2012 11:34:28 AM UTC-5, Eric Sosman wrote:
    >> On 8/10/2012 11:47 AM, bob smith wrote:
    >>> Is it always technically correct to override the hashCode function like so:
    >>> @Override
    >>> public int hashCode() {
    >>> return 1;
    >>> }
    >>> Would it be potentially better if that was Object's implementation?

    >>
    >> Define "better."

    >
    > Better in the sense that you would never HAVE to override hashCode.
    >
    > Now, there are cases where you HAVE to override it, or your code is very broken.


    It is not broken.

    It will perform poorly in many cases.

    Arne
     
    Arne Vajhøj, Aug 11, 2012
    #11
  12. bob smith

    Arne Vajhøj Guest

    On 8/10/2012 6:32 PM, Lew wrote:
    > bob smith wrote:
    >> Now, there are cases where you HAVE to override it, or your code is very broken.

    >
    > No.


    > As long as 'hashCode()' fulfills the contract, your code will work - functionally. But a bad
    > 'hashCode()' could and likely will noticeably affect performance. There is more to correctness
    > than mere functional conformance.


    If the code per specs is guaranteed to work then it is correct.

    Good (or just decent) performance is not necessary for code to
    be correct.

    At least not in the traditional programming terminology.

    In plain English maybe.

    Arne
     
    Arne Vajhøj, Aug 11, 2012
    #12
  13. bob smith

    Roedy Green Guest

    On Fri, 10 Aug 2012 12:45:07 -0700 (PDT), Lew <>
    wrote, quoted or indirectly quoted someone who said :

    > h =3D 31 * h + attribute.hashCode();
    > }

    In my essay I recommend XOR which is an inherentely faster operation
    than multiply. I wonder which actually works out better. If you had a
    large number of fields, the multiply effect could fall off the left
    hand end. It is the algorithm used for String which could have very
    long strings, so Sun must have thought of that.
    --
    Roedy Green Canadian Mind Products http://mindprod.com
    A new scientific truth does not triumph by convincing its opponents and making them see the light,
    but rather because its opponents eventually die, and a new generation grows up that is familiar with it.
    ~ Max Planck 1858-04-23 1947-10-04
     
    Roedy Green, Aug 11, 2012
    #13
  14. bob smith

    Eric Sosman Guest

    On 8/10/2012 6:22 PM, bob smith wrote:
    [... many blank lines removed for legibility's sake ...]
    > On Friday, August 10, 2012 11:34:28 AM UTC-5, Eric Sosman wrote:
    >> On 8/10/2012 11:47 AM, bob smith wrote:
    >>
    >>> Is it always technically correct to override the hashCode function like so:
    >>>
    >>> @Override
    >>> public int hashCode() {
    >>> return 1;
    >>> }
    >>>
    >>> Would it be potentially better if that was Object's implementation?

    >>
    >> Define "better."

    >
    > Better in the sense that you would never HAVE to override hashCode.
    >
    > Now, there are cases where you HAVE to override it, or your code is very broken.


    I cannot think of a case where you HAVE to override hashCode(),
    except as a consequence of other choices that you didn't HAVE to
    make. You don't HAVE to invent classes where distinct instances
    are considered equal, and even if you do you don't HAVE to put those
    instances in HashMaps or HashSets or whatever.

    But that's a bit specious: All it says is that you don't HAVE
    to override hashCode() because you don't HAVE to use things that
    call it. It's like "You don't HAVE to pay taxes, because you don't
    HAVE to live outside prison." So, let's take it as a given that
    you will often need to write classes that override equals() and
    hashCode() -- I imagine you understand that they go together.

    Okay: Then returning a constant 1 (or 42 or 0 or whatever)
    would in fact satisfy the letter of the law regarding hashCode():
    Whenever x.equals(y) is true, x.hashCode() == y.hashCode(). In
    your example this would be trivially true because x,y,z,... all
    have the same hashCode() value, whether they're equal or not --
    You have lived up to the letter of the law.

    Of course, such a hashCode() would make all those hash-based
    containers pretty much useless: They would work in the sense that
    they would get the Right Answer, but they'd be abominably slow,
    with expected performance of O(N) instead of O(1). See
    <http://www.cs.rice.edu/~scrosby/hash/CrosbyWallach_UsenixSec2003/>
    for a survey of some denial-of-service attacks that work by driving
    hash tables from O(1) to O(N), resulting in catastrophic failure
    of the attacked system.

    In other words, the letter of the law on hashCode() is a bare
    minimum that guarantees correct functioning, but it is not enough
    to guarantee usability. Why isn't the law more specific? Because
    nobody knows how to write "hashCode() must be correct *and* usable"
    in terms that would cover all the classes all the Java programmers
    have dreamed up and will dream up. Your hashCode() meets the bare
    minimum requirement, but is not "usable." The actual hashCode()
    provided by Object also meets the bare minimum requirement, and *is*
    usable as it stands, until (and unless; you don't HAVE to) you
    choose to implement other equals() semantics, and a hashCode() to
    match them.


    --
    Eric Sosman
    d
     
    Eric Sosman, Aug 11, 2012
    #14
  15. bob smith

    Eric Sosman Guest

    On 8/10/2012 7:25 PM, Arne Vajhøj wrote:
    > On 8/10/2012 3:17 PM, Roedy Green wrote:
    >> On Fri, 10 Aug 2012 08:47:12 -0700 (PDT), bob smith
    >> <> wrote, quoted or indirectly quoted someone
    >> who said :
    >>
    >>> @Override
    >>> public int hashCode() {
    >>> return 1;
    >>> }

    >>
    >> that's about the worst possible hashCode function.

    >
    > Yes, but the posted asked "Is it always technically correct to ..."
    > not whether is was "best possible".


    He also asked whether it would "be potentially better."

    --
    Eric Sosman
    d
     
    Eric Sosman, Aug 11, 2012
    #15
  16. bob smith

    Jan Burse Guest

    bob smith schrieb:
    > Is it always technically correct to override the hashCode function like so:
    >
    > @Override
    > public int hashCode() {
    > return 1;
    > }
    >
    > Would it be potentially better if that was Object's implementation?
    >


    Maybe it would make sense to spell out what the contract
    for hashCode() is. Well the contract is simply, the
    following invariant should hold:

    /* invariant that should hold */
    if a.equals(b) then a.hashCode()==b.hashCode()

    It should be noted that this does not imply:

    /* not implied and thus not required by the invariant */
    if a.hashCode()==b.hashCode() then a.equals(b)

    It is also quite unlikely that a hashCode() would satisfy
    the later, although the closer it comes to the later, the
    better it works for HashMap, etc..

    The default objects implementation of hashCode() matches
    the default objects impementation of equals(). The default
    objcts implementation of equals() is ==. And the default
    objects implementation of hashCode() is
    System.identityHashCode().

    The System identity hash code is stored in the object
    and generated by the system. It does not change during
    GC although the internal object address might change
    during GC. It is only 32bit although internal object
    addresses might by 64bit with a corresponding JVM.

    Returning a constant, any constant c not only 1, would be
    technically correct correct for the default implementation
    of the class object. Since it trivially satisfies the
    invariant:

    if a.equals(b) then c==c

    is trivially true, since c==c is true. But it is not
    better. Since you would get very degenerated HashMaps,
    etc..

    You need to override the hashhCode() when there is danger
    that the invariant is not anymore satisified. This is
    not the case when equals() is not overridden. So overriding
    hashCode() just for fun when equals() is not overriden,
    usually doesn't make sense. It will probably only slow
    down the hashCode() calculation. So the following:

    hashCode() = sum attr_i * c^i

    Is not necessary. But it would be a possible way to go
    when equals() were overriden in the following way:

    equals(other) = and_i attr_i.equals(other.attr_i)

    The above happens when you turn your object into a container
    of other objects irrespective of the own object identity.
    But beware if the container contains itself somewhere. This
    is why we find in the code for Hashtable the following
    complication:

    public synchronized int hashCode() {
    /*
    * This code detects the recursion caused by computing the hash code
    * of a self-referential hash table and prevents the stack overflow
    * that would otherwise result. This allows certain 1.1-era
    * applets with self-referential hash tables to work. This code
    * abuses the loadFactor field to do double-duty as a hashCode
    * in progress flag, so as not to worsen the space performance.
    * A negative load factor indicates that hash code computation is
    * in progress.
    */

    Interestingly it will return a constant for the object when
    it detects a loop. Maybe one could do better... Dunno

    Bye
     
    Jan Burse, Aug 11, 2012
    #16
  17. bob smith

    Jan Burse Guest

    Jan Burse schrieb:
    > during GC. It is only 32bit although internal object
    > addresses might by 64bit with a corresponding JVM.


    Typically even less bits, since the same space
    is used for some object flags.
     
    Jan Burse, Aug 11, 2012
    #17
  18. bob smith

    Arne Vajhøj Guest

    On 8/11/2012 8:00 AM, Eric Sosman wrote:
    > On 8/10/2012 7:25 PM, Arne Vajhøj wrote:
    >> On 8/10/2012 3:17 PM, Roedy Green wrote:
    >>> On Fri, 10 Aug 2012 08:47:12 -0700 (PDT), bob smith
    >>> <> wrote, quoted or indirectly quoted someone
    >>> who said :
    >>>
    >>>> @Override
    >>>> public int hashCode() {
    >>>> return 1;
    >>>> }
    >>>
    >>> that's about the worst possible hashCode function.

    >>
    >> Yes, but the posted asked "Is it always technically correct to ..."
    >> not whether is was "best possible".

    >
    > He also asked whether it would "be potentially better."


    "better to use Object hashCode" which again should bring the
    correctness question before the performance question.

    Arne
     
    Arne Vajhøj, Aug 11, 2012
    #18
  19. bob smith

    Joerg Meier Guest

    On Sat, 11 Aug 2012 04:54:09 -0700, Roedy Green wrote:

    > On Fri, 10 Aug 2012 12:45:07 -0700 (PDT), Lew <>
    > wrote, quoted or indirectly quoted someone who said :
    >> h =3D 31 * h + attribute.hashCode();
    >> }

    > In my essay I recommend XOR which is an inherentely faster operation
    > than multiply.


    Hasn't that been wrong since about the invention of the 80386 processor
    family ? Pretty sure by now MUL and XOR both take one cycle and that's it.


    Liebe Gruesse,
    Joerg

    --
    Ich lese meine Emails nicht, replies to Email bleiben also leider
    ungelesen.
     
    Joerg Meier, Aug 11, 2012
    #19
  20. bob smith

    Jan Burse Guest

    Roedy Green schrieb:
    > If you had a
    > large number of fields, the multiply effect could fall off the left
    > hand end.


    Actually this does not happen, since you multiply with 31,
    which is 1+2+4+8+16. So that:

    a*31+b = a*16+a*8+a*4+a*2+a+b

    So for a HashMap that uses an index = hash & (2^n - 1) (which is
    the same as hash mod 2^n), the impact of a will be still seen,
    even when it occurs at the very left hand side.

    There is some Microsoft C# HashMap implementation which does
    not use mod 2^n, but instead some primes. In case the
    implementation choses 31 as the designated prime, all
    information but for the first field will be lost. But since
    mod 2^32 is also applied, this might not be completely true.

    For 2^n I don't know exactly how the impact could be
    described. I guess in a HashMap with index = hash mod 2^1 the
    hash amounts to a parity bit, since the sum in a+b acts like
    an xor on the first right hand bit. But 2^n with n>1 the
    31 multiplication is a little more crude.
     
    Jan Burse, Aug 11, 2012
    #20
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Roedy Green

    hashCode for byte[]

    Roedy Green, Aug 22, 2003, in forum: Java
    Replies:
    1
    Views:
    434
    Dale King
    Aug 22, 2003
  2. Marco
    Replies:
    10
    Views:
    779
  3. Gregory A. Swarthout

    equals and hashCode

    Gregory A. Swarthout, Dec 19, 2003, in forum: Java
    Replies:
    2
    Views:
    359
    Silvio Bierman
    Dec 20, 2003
  4. kelvSYC

    Designing hashCode() methods

    kelvSYC, Dec 23, 2003, in forum: Java
    Replies:
    1
    Views:
    387
    Ulrich Stern
    Dec 24, 2003
  5. Dimitri Pissarenko

    Hashcode of primitive types

    Dimitri Pissarenko, Jan 29, 2004, in forum: Java
    Replies:
    5
    Views:
    5,902
    Hylander
    Jan 29, 2004
Loading...

Share This Page