In 'HashMap.put', "if (e.hash == hash && eq(k, e.key))" ?

R

Red Orchid

The 'HashMap.put' has the code "if (e.hash == hash && eq(k, e.key))".
Do you think that the 'e.hash == hash' is required ?

I think that the "if (k == e.key || k.equals(e.key))" is proper.

For further details,
the following codes are parts of 'HashMap'.

<code>
public V put(K key, V value) {
K k = maskNull(key);
int hash = hash(k);
int i = indexFor(hash, table.length);

for (Entry<K,V> e = table; e != null; e = e.next) {
if (e.hash == hash && eq(k, e.key)) { // #1
V oldValue = e.value;
e.value = value;
e.recordAccess(this);
return oldValue;
}
}
...

static int indexFor(int h, int length) {
return h & (length-1); // #2
}

static boolean eq(Object x, Object y) {
return x == y || x.equals(y); // #3
}

</code>

The linked list of the 'table' is heterogeneous with
hash value. That is, when hash value of 'E0' is 5
and one of 'E1' is 6, both 'E0' and 'E1' can exist in
the same bucket 'table' because of the above #2
( 00 & 10 = 00, 01 & 10 = 00).

But, it do not justify the code "if (e.hash == hash" of #1.

Because, ...

a) The functions of 'Map' puts/gets the value 'V' that is
mapped by the key 'K', not by the hash value.

b) The execution time of "k == e.key" by #3 is not large
than one of 'e.hash == hash'.

Therefore, I think,
The code "if (k == e.key || k.equals(e.key))" is proper
instead of #1. #1 is overabundance.

I think that if there is overabundance in a library,
it means that the library is not optimised.

What is your opinion ?
Thanks.
 
H

Hendrik Maryns

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Red Orchid schreef:
The 'HashMap.put' has the code "if (e.hash == hash && eq(k, e.key))".
Do you think that the 'e.hash == hash' is required ?

I think that the "if (k == e.key || k.equals(e.key))" is proper.

For further details,
the following codes are parts of 'HashMap'.

<code>
public V put(K key, V value) {
K k = maskNull(key);
int hash = hash(k);
int i = indexFor(hash, table.length);

for (Entry<K,V> e = table; e != null; e = e.next) {
if (e.hash == hash && eq(k, e.key)) { // #1
V oldValue = e.value;
e.value = value;
e.recordAccess(this);
return oldValue;
}
}
...

static int indexFor(int h, int length) {
return h & (length-1); // #2
}

static boolean eq(Object x, Object y) {
return x == y || x.equals(y); // #3
}

</code>

The linked list of the 'table' is heterogeneous with
hash value. That is, when hash value of 'E0' is 5
and one of 'E1' is 6, both 'E0' and 'E1' can exist in
the same bucket 'table' because of the above #2
( 00 & 10 = 00, 01 & 10 = 00).

But, it do not justify the code "if (e.hash == hash" of #1.

Because, ...

a) The functions of 'Map' puts/gets the value 'V' that is
mapped by the key 'K', not by the hash value.

b) The execution time of "k == e.key" by #3 is not large
than one of 'e.hash == hash'.

Therefore, I think,
The code "if (k == e.key || k.equals(e.key))" is proper
instead of #1. #1 is overabundance.

I think that if there is overabundance in a library,
it means that the library is not optimised.

What is your opinion ?


I definitely agree that the code could be much clearer, but in a hurry
(and after some debugging experience with disappearing values), I am
pretty sure I can tell you that this test is to make sure that different
keys that give the same hash code can still be stored, that is what the
linked list is for.

H.

- --
Hendrik Maryns

==================
www.lieverleven.be
http://aouw.org
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (GNU/Linux)

iD8DBQFD3gRZe+7xMGD3itQRAptGAJ9s2jUN+3Mcbs0bzg8hzGcd+LYSswCfRHOd
RHJX056IXaeDjH+IKJfr3J8=
=IAPn
-----END PGP SIGNATURE-----
 
G

Gordon Beaton

The 'HashMap.put' has the code "if (e.hash == hash && eq(k,
e.key))". Do you think that the 'e.hash == hash' is required ?

The extra test is an optimization for a common case, since it is a
fast operation compared to invoking equals() on each of the keys.

Each bucket in the hastable can potentially store many different
objects with different keys and different hash values. There is no
point in checking the key for equality unless you know the hash value
is correct.
I think that the "if (k == e.key || k.equals(e.key))" is proper.

k == e.key implies k.equals(e.key), so both tests aren't strictly
necessary.

However k.equals(e.key) does not imply k == e.key, so you need to
check with equals() regardless for those times when a different key
(with the same value) is used.

Your test for identity (k = e.key) might give a slight performance
advantage when identical (not just equal) keys are being reused often,
but likely less than the hash comparison above.

/gordon
 
Z

zero

Gordon Beaton said:
The extra test is an optimization for a common case, since it is a
fast operation compared to invoking equals() on each of the keys.

Remember that that && operator shortcuts - it doesn't evaluate the second
operand if the first is already false. So, provided the elements have good
hashing algorithms, this can indeed have quite some influence on
performance.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,744
Messages
2,569,484
Members
44,903
Latest member
orderPeak8CBDGummies

Latest Threads

Top