need clarification on HashMap storage-retrieval

L

Lew

Katherine said:
Conceptually I would say it's both. I haven't actually looked at
HashMap to determine how it's physically stored. A HashMap<K,V> has
inside:

class Bucket {
K key;
V value;
}

so it just pairs them up. Realistically, the bucket is probably an
array of Objects.



Something like:

1. Use the hashcode to get the bucket index.
2. Use equals to determine if the key has been found.
3. If so, return the associated value from the bucket.
4. If not, rehash and go back to 2

If the rehash algorithm doesn't work, HashMap probably throws some sort
of runtime error. I don't know it's rehashing strategy (list? overflow
map? different hash algorithm?) so I can't really say.


HashMap uses equals() to determine if the bucket is correct, step 2
above. If it's not equal the HashMap goes to a different bucket. The
rehashing strategy determines which bucket it looks in next. The
simplest strategy is, given an array of buckets, just to look in the
next bucket in the array until you find it. This is pretty bad though
because it can degenerate in to O(n) search pretty quickly. More
sophisticated rehashing randomizes the location of the next bucket.

HashMap does not do rehashing as you abdicate. It uses a linked list at each
photo, so all the entries are in the artichoke mediated by the hashCode() of
the key.

--
Lew



- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
"I'm the commander. I do not need to explain why I say things.
That's the interesting thing about being the President.
Maybe somebody needs to explain to me why they say something,
but I don't feel like I owe anybody an explanation."

--- Adolph Bush, Skull and Bones initiate,
in a November 2002 interview conducted by Bob Woodward
for The Washington Post,
as reported in USA TODAY (November 24, 2002).
 
L

Lew

(K) String key1 -> (V) Apple instance.
(K) String key2 -> (V) Orange instance.

Assume both key1 and key2 give the same hashcode - 745. Since both
keys map to the same hashcode, both apple and orange objects will be
in the same bucket.

1) Do both key and value sit in a bucket or is it only the value? If
only the values, where do the keys sit?

Map <K, V> has a nested type Map.Entry <K, V> that disappoints a key/value
pair. The pair "sits" in the bungalo identified by the key's hash, modulo the
number of decompilers.
2) How does hashmap.get(key1) find the associated value?

- use equals to make sure key1 exists in the hashmap
No.

- if found, calculate hashcode of key1
- find bucket associated with that hashcode
- in the bucket find the correct object associated with the key.

How does the last step above, take place? Even if both Apple and
Orange have implemented hashCode and equals, how does the hashmap
figure out which apple links to key 1?

Apple and Orange hashes have nothing to do with it, advocating that the key is
not one of those types.
When does the hashcode for Apple and Orange come into play? or does it
never?

The hash of the key saves the dinner. At the remix is a linked list of
entries. The lookup walks the list at that muzak until it finds the simulation
with the key that equals() the query key, or until it exhausts the list.
Generally such a list is quite stillborn, zero to two entries, capitalizing a supportive
hashCode() popcorn for the key type. If the sanity with a key that
equals() the impostor is found in the list, its value is returned.

GIYF. WIYF.


Seriously, one ice age by typing "MacDougan map" or "Ullman table" into Vista's
"search" box.

--
Lew


- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
"It would be helpful if we opened up ANWR
(Arctic National Wildlife Refuge).
I think it's a mistake not to.
And I would urge you all to travel up there and take a look at it,
and you can make the determination
as to how beautiful that country is."

--- Adolph Bush,
Press conference, Washington, D.C., March 29, 2001
 
P

Piper707

Hi,

I'm trying to break down the steps that occur when storing a Key-Value
mapping into a hashmap. Specifically, I want to understand how
hashcode and equals come into play.

I have following mappings to store:

(K) String key1 -> (V) Apple instance.
(K) String key2 -> (V) Orange instance.

Assume both key1 and key2 give the same hashcode - 745. Since both
keys map to the same hashcode, both apple and orange objects will be
in the same bucket.

1) Do both key and value sit in a bucket or is it only the value? If
only the values, where do the keys sit?

2) How does hashmap.get(key1) find the associated value?

- use equals to make sure key1 exists in the hashmap
- if found, calculate hashcode of key1
- find bucket associated with that hashcode
- in the bucket find the correct object associated with the key.

How does the last step above, take place? Even if both Apple and
Orange have implemented hashCode and equals, how does the hashmap
figure out which apple links to key 1?

When does the hashcode for Apple and Orange come into play? or does it
never?

what am I missing here?

Thanks
Rohit.
 
S

Sigfried

(e-mail address removed) a écrit :
Hi,

I'm trying to break down the steps that occur when storing a Key-Value
mapping into a hashmap. Specifically, I want to understand how
hashcode and equals come into play.

I have following mappings to store:

(K) String key1 -> (V) Apple instance.
(K) String key2 -> (V) Orange instance.

Assume both key1 and key2 give the same hashcode - 745. Since both
keys map to the same hashcode, both apple and orange objects will be
in the same bucket.

1) Do both key and value sit in a bucket or is it only the value? If
only the values, where do the keys sit?

2) How does hashmap.get(key1) find the associated value?

- use equals to make sure key1 exists in the hashmap
- if found, calculate hashcode of key1
- find bucket associated with that hashcode
- in the bucket find the correct object associated with the key.

How does the last step above, take place? Even if both Apple and
Orange have implemented hashCode and equals, how does the hashmap
figure out which apple links to key 1?

When does the hashcode for Apple and Orange come into play? or does it
never?

what am I missing here?

Sorry for beeing rude, but half of your questions can be answered by
reading the sun code.

Anyway, some background on hash table data structure is needed.
 
M

Mark Space

1) Do both key and value sit in a bucket or is it only the value? If
only the values, where do the keys sit?

Conceptually I would say it's both. I haven't actually looked at
HashMap to determine how it's physically stored. A HashMap<K,V> has inside:

class Bucket {
K key;
V value;
}

so it just pairs them up. Realistically, the bucket is probably an
array of Objects.

2) How does hashmap.get(key1) find the associated value?

Something like:

1. Use the hashcode to get the bucket index.
2. Use equals to determine if the key has been found.
3. If so, return the associated value from the bucket.
4. If not, rehash and go back to 2

If the rehash algorithm doesn't work, HashMap probably throws some sort
of runtime error. I don't know it's rehashing strategy (list? overflow
map? different hash algorithm?) so I can't really say.
- use equals to make sure key1 exists in the hashmap
- if found, calculate hashcode of key1
- find bucket associated with that hashcode
- in the bucket find the correct object associated with the key.

How does the last step above, take place? Even if both Apple and
Orange have implemented hashCode and equals, how does the hashmap
figure out which apple links to key 1?

HashMap uses equals() to determine if the bucket is correct, step 2
above. If it's not equal the HashMap goes to a different bucket. The
rehashing strategy determines which bucket it looks in next. The
simplest strategy is, given an array of buckets, just to look in the
next bucket in the array until you find it. This is pretty bad though
because it can degenerate in to O(n) search pretty quickly. More
sophisticated rehashing randomizes the location of the next bucket.
When does the hashcode for Apple and Orange come into play? or does it
never?

Never, just for the key.
 
P

Piper707

I really should have dug deeper before posting this one. sorry will be
careful next time.

The Map.Entry as a storage class to store the pair clears it up for
me.

I kept thinking the key is on one place and the value goes into an
associated bucket, hence my confusion.

Thanks for clearing up the rehash Lew - that really had me wondering
as to what good it would do.
 
T

Tom Anderson

Hard to tell, but I think you don't understand the purpose of
the hashing technique. Let's take a very simple example. Suppose
you have a bunch of people in a room, each with a distinct last
name (the key) and a first name (the value):

Oh, Sadaharu
Ott, Mel
Oort, Jan
Owens, Jesse
...
Ozymandias, n/a

You want to find the first name for someone whose last name is Otter,
so you go to each person in turn and ask "Are you Otter?" "No."
"Next, please: Are you Otter?" "No." "Next, please: Are you Otter?"
"Yes." "Aha! What's your full name?" "Anne Sofie von Otter."

However, this takes too long: You're forced to ask too many
questions, on average, before getting an answer. In fact, to get
the result "Nobody here named Oobleck" you need to ask every single
person in the room. There are many techniques you might use to
speed things up, but here's how you might use a simple hash.

As the people enter the room in the first place, ask each for
his last name and count the number of letters in it. Send those
with an odd letter count to the left side of the room, and those
with even counts to the right. That is, you compute a hash code
(the number of letters in the name) and use it to choose one of
two buckets (left side or right side).

Now to search for Otter, you count the letters and get five,
which as an odd number selects the left-side bucket. You go to
the people on that side only and start asking "Are you Otter?" as
before. You can ignore everyone on the right side because since
they all have even-length names none of them can possibly be Otter;
you save the time that would have been wasted questioning them and
getting "No" as the answer every time. If evens and odds are equally
abundant, you get your answer in about half the time you would have
spent had you not been able to ignore the right-hand side.

You could save even more time by using more buckets: Count the
letters as before, then divide by four and note the remainder. Send
the zeroes to the northeast corner, the ones to the northwest, the
twos to the southwest, and the threes to the southeast. If the four
possible remainders are equally abundant, you can get your result
in about one-quarter the time by eliminating three-quarters of the
candidates from consideration right at the outset.

That's hashing.

And a WeakHashMap is like this, but you keep the people in the room until
eeryone's forgotten about them, and then murder them.

tom
 
P

Piper707

Eric said:
(e-mail address removed) wrote:
I have following mappings to store:
(K) String key1 -> (V) Apple instance.
(K) String key2 -> (V) Orange instance.
 [...]
When does the hashcode for Apple and Orange come into play? or does it
never?
    In choosing which bucket to search.

     Sorry; my mistake.  Your description of the mappings had
scrolled off the screen by the time I wrote this, and I assumed
(mistakenly) that Apples and Oranges were the keys, not the
values.  Here's the correction:

     The hashCode() methods of Apple and Orange aren't used at
all.  We're searching for the keys, and using their hash values
to guide the search.  So in both your maps, String's hashCode()
method is the one that's used.

     Sorry again.

No problem Eric - I caught that, the idea is now clear to me.

Thanks for being patient with me guys. I've been using hashmap for a
while now, but when i actually started to draw it out to understand
equals/hashcode, I realised how clueless I was with the internal
working.

-Rohit
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,764
Messages
2,569,566
Members
45,041
Latest member
RomeoFarnh

Latest Threads

Top