Improving hashCode() to match equals()

Discussion in 'Java' started by Marco, Dec 18, 2003.

  1. Marco

    Marco Guest

    In my NamedBitField class I define equals() to mean that 2
    NamedBitFields are equal if all 3 of their fields are equal.

    Any suggestions for improving my hashCode() definition? I
    can improve its performance by caching the computed hashCode
    in a transient field, sure, but can I improve the _way_ it's
    computed?


    class NamedBitField {

    String name; // name of bit field
    int startIndex; // where it starts
    int length; // how many bits it occupies

    pubic boolean equals(Object obj) {
    if (this == obj)
    return true;
    if ( obj == null || this.getClass() != obj.getClass() )
    return false;
    NamedBitField other = (NamedBitField) obj;
    return
    this.name.equals(other.name) &&
    this.startIndex == other.startIndex &&
    this.length == other.length;
    }

    public int hashCode() {

    // to make this symmetric with equals() I'd prefer to
    // involve all 3 fields (name, startIndex and length)
    // in the computation of the hash. But how?
    //
    // Here I'm simply reusing String.hashCode(), but the
    // hashed String at least incorporates the other
    // fields
    return ((name + startIndex) + length).hashCode();
    }
    }


    Marco
    ----------------------------------------------------
    Please remove digits from e-mail address (tr/0-9//d)
    Marco, Dec 18, 2003
    #1
    1. Advertising

  2. Marco

    VisionSet Guest

    "Marco" <5et.u9k> wrote in message
    news:t2lEb.557$...
    > In my NamedBitField class I define equals() to mean that 2
    > NamedBitFields are equal if all 3 of their fields are equal.
    >
    > Any suggestions for improving my hashCode() definition? I
    > can improve its performance by caching the computed hashCode
    > in a transient field, sure, but can I improve the _way_ it's
    > computed?
    >
    >
    > class NamedBitField {
    >
    > String name; // name of bit field
    > int startIndex; // where it starts
    > int length; // how many bits it occupies
    >
    > pubic boolean equals(Object obj) {
    > if (this == obj)
    > return true;
    > if ( obj == null || this.getClass() != obj.getClass() )
    > return false;
    > NamedBitField other = (NamedBitField) obj;
    > return
    > this.name.equals(other.name) &&
    > this.startIndex == other.startIndex &&
    > this.length == other.length;
    > }
    >
    > public int hashCode() {
    >
    > // to make this symmetric with equals() I'd prefer to
    > // involve all 3 fields (name, startIndex and length)
    > // in the computation of the hash. But how?
    > //
    > // Here I'm simply reusing String.hashCode(), but the
    > // hashed String at least incorporates the other
    > // fields
    > return ((name + startIndex) + length).hashCode();
    > }
    > }


    public int hashCode() {

    int const = 17;
    int h = 13;
    h = h * startIndex + const;
    h = h * length + const;
    h = h * name.hashCode() + cost;

    return h;
    }

    Something like that, I believe, where 17 & 13 are different prime numbers,
    but it isn't that important. It doesn't matter if h overflows.

    --
    Mike W
    VisionSet, Dec 18, 2003
    #2
    1. Advertising

  3. Marco

    Chris Smith Guest

    VisionSet wrote:
    > public int hashCode() {
    >
    > int const = 17;
    > int h = 13;
    > h = h * startIndex + const;
    > h = h * length + const;
    > h = h * name.hashCode() + cost;
    >
    > return h;
    > }
    >
    > Something like that, I believe, where 17 & 13 are different prime numbers,
    > but it isn't that important. It doesn't matter if h overflows.


    Sure, something like that. Except that, IIRC, const is an unused
    reserved word in Java.

    --
    www.designacourse.com
    The Easiest Way to Train Anyone... Anywhere.

    Chris Smith - Lead Software Developer/Technical Trainer
    MindIQ Corporation
    Chris Smith, Dec 18, 2003
    #3
  4. Marco

    Marco Guest

    Chris Smith wrote:
    > VisionSet wrote:
    >
    >>public int hashCode() {
    >>
    >> int const = 17;
    >> int h = 13;
    >> h = h * startIndex + const;
    >> h = h * length + const;
    >> h = h * name.hashCode() + cost;
    >>
    >> return h;
    >>}
    >>
    >>Something like that, I believe, where 17 & 13 are different prime numbers,
    >>but it isn't that important. It doesn't matter if h overflows.


    That's great, thanks. This algorithm makes the following 2
    objects would return different hash codes despite their
    similarity...

    p.name = "foobar"; p.startIndex = 2; p.length = 5;
    q.name = "foobar"; q.startIndex = 5; q.length = 2;

    I can see now that adding 17 ('const' variable) to the end
    of each cumulative multiplication is essential. Looks fast
    too. I may even benchmark it on my little 486 :)

    I did have crazy thoughts about implementing hashCode() in
    C and calling it through JNI. I guess the overhead in
    dynamically linking and then marshalling arguments to a
    native stack just isn't worth it for such a small piece of
    code though.

    > IIRC, const is an unused reserved word in Java.


    Thanks Chris. You're right. const is reserved.

    Marco
    ----------------------------------------------------
    Please remove digits from e-mail address (tr/0-9//d)
    Marco, Dec 18, 2003
    #4
  5. Marco

    Thinkit Guest

    "VisionSet" <> wrote in message news:<nalEb.5824$>...
    > "Marco" <5et.u9k> wrote in message
    > news:t2lEb.557$...
    > > In my NamedBitField class I define equals() to mean that 2
    > > NamedBitFields are equal if all 3 of their fields are equal.
    > >
    > > Any suggestions for improving my hashCode() definition? I
    > > can improve its performance by caching the computed hashCode
    > > in a transient field, sure, but can I improve the _way_ it's
    > > computed?
    > >
    > >
    > > class NamedBitField {
    > >
    > > String name; // name of bit field
    > > int startIndex; // where it starts
    > > int length; // how many bits it occupies
    > >
    > > pubic boolean equals(Object obj) {
    > > if (this == obj)
    > > return true;
    > > if ( obj == null || this.getClass() != obj.getClass() )
    > > return false;
    > > NamedBitField other = (NamedBitField) obj;
    > > return
    > > this.name.equals(other.name) &&
    > > this.startIndex == other.startIndex &&
    > > this.length == other.length;
    > > }
    > >
    > > public int hashCode() {
    > >
    > > // to make this symmetric with equals() I'd prefer to
    > > // involve all 3 fields (name, startIndex and length)
    > > // in the computation of the hash. But how?
    > > //
    > > // Here I'm simply reusing String.hashCode(), but the
    > > // hashed String at least incorporates the other
    > > // fields
    > > return ((name + startIndex) + length).hashCode();
    > > }
    > > }

    >
    > public int hashCode() {
    >
    > int const = 17;
    > int h = 13;
    > h = h * startIndex + const;
    > h = h * length + const;
    > h = h * name.hashCode() + cost;
    >
    > return h;
    > }
    >
    > Something like that, I believe, where 17 & 13 are different prime numbers,
    > but it isn't that important. It doesn't matter if h overflows.


    You should really use 0x11 and 0xD, respectively. Unless you are
    inputting bowling scores, there's no reason for decimal!
    Thinkit, Dec 19, 2003
    #5
  6. Hi Marco,
    I might be off the beam here but I noticed this piece of code in your
    class:

    >> return ((name + startIndex) + length).hashCode();


    What occured to me is that (if I read it correctly) you have a name of
    "Fred", a startIndex = 1 and length = 12, would that produce the same
    result as "Fred", 11 and 2 ?

    If so would

    String final delim = ":";
    return (name + delim + startIndex + delim + length).hashCode();

    be a better solution because it separates the fields so that the hashCode()
    function can see the difference ?


    --
    cio
    Derek
    Derek Clarkson, Dec 19, 2003
    #6
  7. Marco

    Marco Guest

    Derek Clarkson wrote:
    >>>return ((name + startIndex) + length).hashCode();

    >
    > If so would
    >
    > String final delim = ":";
    > return (name + delim + startIndex + delim + length).hashCode();
    >
    > be a better solution because it separates the fields so that the hashCode()
    > function can see the difference ?


    Thanks Derek. You're right. Concatenating the numeric
    fields eliminated their distinctiveness somewhat...

    name = "Fred";
    startIndex = 1; // if 11 then same hash code
    length = 12; // if 2 then same hash code
    // calls "Fred112".hashCode()
    ((name + startIndex) + length).hashCode();

    Instead of pushing my object's state through
    String.hashCode(), I'm now incorporating it into a series
    of cumulative multiplications involving prime numbers as
    recommended by an earlier post in this thread.

    Marco
    ----------------------------------------------------
    Please remove digits from e-mail address (tr/0-9//d)
    Marco, Dec 19, 2003
    #7
  8. Marco

    Alex Hunsley Guest

    Thinkit wrote:

    > "VisionSet" <> wrote in message news:<nalEb.5824$>...
    >
    >>"Marco" <5et.u9k> wrote in message
    >>news:t2lEb.557$...
    >>
    >>>In my NamedBitField class I define equals() to mean that 2
    >>>NamedBitFields are equal if all 3 of their fields are equal.
    >>>
    >>>Any suggestions for improving my hashCode() definition? I
    >>>can improve its performance by caching the computed hashCode
    >>>in a transient field, sure, but can I improve the _way_ it's
    >>>computed?
    >>>
    >>>
    >>>class NamedBitField {
    >>>
    >>> String name; // name of bit field
    >>> int startIndex; // where it starts
    >>> int length; // how many bits it occupies
    >>>
    >>> pubic boolean equals(Object obj) {
    >>> if (this == obj)
    >>> return true;
    >>> if ( obj == null || this.getClass() != obj.getClass() )
    >>> return false;
    >>> NamedBitField other = (NamedBitField) obj;
    >>> return
    >>> this.name.equals(other.name) &&
    >>> this.startIndex == other.startIndex &&
    >>> this.length == other.length;
    >>> }
    >>>
    >>> public int hashCode() {
    >>>
    >>> // to make this symmetric with equals() I'd prefer to
    >>> // involve all 3 fields (name, startIndex and length)
    >>> // in the computation of the hash. But how?
    >>> //
    >>> // Here I'm simply reusing String.hashCode(), but the
    >>> // hashed String at least incorporates the other
    >>> // fields
    >>> return ((name + startIndex) + length).hashCode();
    >>> }
    >>>}

    >>
    >>public int hashCode() {
    >>
    >> int const = 17;
    >> int h = 13;
    >> h = h * startIndex + const;
    >> h = h * length + const;
    >> h = h * name.hashCode() + cost;
    >>
    >> return h;
    >>}
    >>
    >>Something like that, I believe, where 17 & 13 are different prime numbers,
    >>but it isn't that important. It doesn't matter if h overflows.

    >
    >
    > You should really use 0x11 and 0xD, respectively. Unless you are
    > inputting bowling scores, there's no reason for decimal!


    Why difference, at all, does it make whether he writes 13 or 0xD? It's
    all the same to the bytecode that you end up with.
    Also, there actually *is* a reason to use decimal - other coders who
    come along later will more readily recognise 13 and 17 as prime numbers,
    as opposed to 0x11 and 0xD. I don't know of many people that make a
    habit of knowing what prime numbers are in hex.
    (And, yes, you could convert it in your head back to decimal, but that's
    just extra cognitive load and obfuscation.)

    Ah, hold on, I've come across you before! You're the dude who thinks hex
    is magically better than decimal somehow, and is on a mission to convert
    the world!

    http://makeashorterlink.com/?C2D351AD6

    alex
    Alex Hunsley, Dec 19, 2003
    #8
  9. Marco

    Alex Hunsley Guest

    Alex Hunsley wrote:

    > Thinkit wrote:
    >
    >> "VisionSet" <> wrote in message
    >> news:<nalEb.5824$>...
    >>
    >>> "Marco" <5et.u9k> wrote in message
    >>> news:t2lEb.557$...
    >>>
    >>>> In my NamedBitField class I define equals() to mean that 2
    >>>> NamedBitFields are equal if all 3 of their fields are equal.
    >>>>
    >>>> Any suggestions for improving my hashCode() definition? I
    >>>> can improve its performance by caching the computed hashCode
    >>>> in a transient field, sure, but can I improve the _way_ it's
    >>>> computed?
    >>>>
    >>>>
    >>>> class NamedBitField {
    >>>>
    >>>> String name; // name of bit field
    >>>> int startIndex; // where it starts
    >>>> int length; // how many bits it occupies
    >>>>
    >>>> pubic boolean equals(Object obj) {
    >>>> if (this == obj)
    >>>> return true;
    >>>> if ( obj == null || this.getClass() != obj.getClass() )
    >>>> return false;
    >>>> NamedBitField other = (NamedBitField) obj;
    >>>> return
    >>>> this.name.equals(other.name) &&
    >>>> this.startIndex == other.startIndex &&
    >>>> this.length == other.length;
    >>>> }
    >>>>
    >>>> public int hashCode() {
    >>>>
    >>>> // to make this symmetric with equals() I'd prefer to
    >>>> // involve all 3 fields (name, startIndex and length)
    >>>> // in the computation of the hash. But how?
    >>>> //
    >>>> // Here I'm simply reusing String.hashCode(), but the
    >>>> // hashed String at least incorporates the other
    >>>> // fields
    >>>> return ((name + startIndex) + length).hashCode();
    >>>> }
    >>>> }
    >>>
    >>>
    >>> public int hashCode() {
    >>>
    >>> int const = 17;
    >>> int h = 13;
    >>> h = h * startIndex + const;
    >>> h = h * length + const;
    >>> h = h * name.hashCode() + cost;
    >>>
    >>> return h;
    >>> }
    >>>
    >>> Something like that, I believe, where 17 & 13 are different prime
    >>> numbers,
    >>> but it isn't that important. It doesn't matter if h overflows.

    >>
    >>
    >>
    >> You should really use 0x11 and 0xD, respectively. Unless you are
    >> inputting bowling scores, there's no reason for decimal!

    >
    >
    > Why difference, at all, does it make whether he writes 13 or 0xD? It's
    > all the same to the bytecode that you end up with.
    > Also, there actually *is* a reason to use decimal - other coders who
    > come along later will more readily recognise 13 and 17 as prime numbers,
    > as opposed to 0x11 and 0xD. I don't know of many people that make a
    > habit of knowing what prime numbers are in hex.
    > (And, yes, you could convert it in your head back to decimal, but that's
    > just extra cognitive load and obfuscation.)
    >
    > Ah, hold on, I've come across you before! You're the dude who thinks hex
    > is magically better than decimal somehow, and is on a mission to convert
    > the world!
    >
    > http://makeashorterlink.com/?C2D351AD6
    >
    > alex


    Sorry, make that

    http://makeashorterlink.com/?P2C412AD6

    alex
    Alex Hunsley, Dec 19, 2003
    #9
  10. (Thinkit) wrote in message news:<>...
    > "VisionSet" <> wrote in message news:<nalEb.5824$>...


    > > Something like that, I believe, where 17 & 13 are different prime numbers,
    > > but it isn't that important. It doesn't matter if h overflows.

    >
    > You should really use 0x11 and 0xD, respectively. Unless you are
    > inputting bowling scores, there's no reason for decimal!


    ???
    What an odd comment. As a long time aficionado of good programming
    style, this is the first time I've ever heard this particular rule of
    thumb. What's inherently better about using hex literals over decimal
    literals in code? Personally, if I saw 0xD in that code, it would not
    be immediately obvious that it was chosen for being prime.

    In fact, for purely mathematical usage, I would say you should always
    use decimal. I only use hex when dealing with low-level data concepts
    such as bit masks and hardware flags; i.e. situations where knowing
    the bit pattern of the data is more important than knowing its numeric
    value.
    Karl von Laudermann, Dec 19, 2003
    #10
  11. Marco

    John Guest

    John, Jan 17, 2004
    #11
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Gregory A. Swarthout

    equals and hashCode

    Gregory A. Swarthout, Dec 19, 2003, in forum: Java
    Replies:
    2
    Views:
    332
    Silvio Bierman
    Dec 20, 2003
  2. Mike Schilling
    Replies:
    11
    Views:
    1,367
    Mike Schilling
    Jun 12, 2004
  3. Tom Dyess

    equals and hashCode

    Tom Dyess, Jun 19, 2005, in forum: Java
    Replies:
    28
    Views:
    883
    Virgil Green
    Jun 23, 2005
  4. Replies:
    6
    Views:
    452
    Alex Hunsley
    Apr 11, 2006
  5. Danno
    Replies:
    29
    Views:
    863
    Jeffrey Schwab
    Sep 16, 2006
Loading...

Share This Page