Arrays as key in a HashMap

trippy · Feb 17, 2006

Hendrik Maryns said:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
NotDashEscaped: You need GnuPG to verify this message

trippy schreef:

John C. Bollinger said:

trippy wrote:
I want to use arrays as keys in a HashMap.

But a key has to be unique.

1 key to 1 value
And your point is?

There is nothing inherently unworkable with arrays being map keys;
arrays are objects, after all. You might as well argue that Strings
cannot be used as keys because they (often) contain multiple characters.

Click to expand...

But arrays have more than one thing in them by definition.

Okay, say using this idea, you keep a list of users and their
information. Now, you have multiple people accessing the same user's
information because thisguy[0] will give you the same value as thisguy
[4].

I guess that's my point. 1 key to 1 value. With the arrays you'll have
multiple keys to values.

I guess if you want that, whatever but I think it's a bad idea. Why
don't you keep things in a plain old ArrayList if you're gonna do that.

Click to expand...

Still don?t see your point. What?s the difference if I only want some
objects to be stored as a list, between ArrayList and array?

Actually, the difference would be between ArrayList and Map (probably
HashMap) ArrayList wouldn't require the 1 key to 1 value like a map
would.

Only for
reading, I see no point in using a List.

H.

--
trippy
mhm31x9 Smeeter#29 WSD#30
sTaRShInE_mOOnBeAm aT HoTmAil dOt CoM

NP: "I Am The Law" -- Anthrax

"Now, technology's getting better all the time and that's fine,
but most of the time all you need is a stick of gum, a pocketknife,
and a smile."

-- Robert Redford "Spy Game"

Chris Smith · Feb 17, 2006

tom fredriksen said:
My point is to advise about possible dangers and mystical bugs by doing
it that way. A complex key like that is just begging for trouble along
the way. KISS.

Not to mention the idiomatic breakdown.

[Warning: I'm going to make up terminology here -- strong identity, weak
identity, and confused identity -- because I don't think there is widely
accepted terminology in use throughout the Java programming language to
describe these concepts.]

A key should be a key. It ought to be just as complex as it needs to be
for the application. If I need to look up records that match an array,
then I ought to use an array for a key.

The problem here, as I see it, is that HashMap doesn't work well (or at
a minimum, is dangerous) with keys that have confused identities. An
object with a confused identity is an object whose logical identity may
change throughout its lifetime; loosely speaking, an object that
overrides equals and hashCode in such a way that they may change through
its lifetime. I advocate avoiding objects with confused identities
whenever possible... but certain APIs such as the main Collections API
interfaces require confused identity.

Note that arrays themselves work just fine as keys. It's just necessary
to first recognize that they have strong identity; changing the contents
doesn't change who they are, and the same contents don't make two
different arrays equal to each other. Similarly, objects with weak
identity make fine HashMap keys, regardless of their complexity, so long
as they represent the information that you really want to use as a key.
Instances of the List or Map interfaces, though, do not work well as
keys unless they are immutable, since mutable collections are required
to have confused identities.

--
www.designacourse.com
The Easiest Way To Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation

tom fredriksen · Feb 17, 2006

Chris said:
A key should be a key. It ought to be just as complex as it needs to be
for the application.

The problem here, as I see it, is that HashMap doesn't work well (or at
a minimum, is dangerous) with keys that have confused identities.

Well, it seems like you have a different view of it than I have.

I prefer to keep things as minimalistic and as simple as possible, thats
the way I have found to keep the bug count close to zero and a high
degree of autonomy and maintainability on the servers I have created. If
you have a different approach, please share. I would love to hear other
ways of doing it.

/tom

John C. Bollinger · Feb 18, 2006

Roedy said:
You can't use arrays literally as keys.

Sure you can.

You need sensible equals and
hashCode for them.

Object provides equals() and hashCode() methods that meet all the
technical requirements.

Absolute identity is probably not what you want.

It is not uncommon for me to use Maps where object identity is exactly
the right choice for my keys -- such a map associates values with
specific key objects (object semantics). On the other hand, it may be
more typical in general to use keys that have value semantics, such as
Strings and Integers. In that case the map can be considered as
associating values with whatever things the key objects /represent/; as
a result, you can retrieve each value with any equivalent representation
of the underlying word, number, or whatever that constitutes its logical
key. Both uses have their places.

So you must embed you array in a class that lets you override those
two methods.

If you want value semantics, then that's probably the most
straightforward way to go about it. For what it's worth, it appears
that the OP does want value semantics, and I agree that embedding his
arrays in wrapper objects would be the approach I would consider first.

Chris Smith · Feb 18, 2006

tom fredriksen said:
Well, it seems like you have a different view of it than I have.

I prefer to keep things as minimalistic and as simple as possible, thats
the way I have found to keep the bug count close to zero and a high
degree of autonomy and maintainability on the servers I have created. If
you have a different approach, please share. I would love to hear other
ways of doing it.

The point that any necessary complexity can be encapsulated behind the
contract for Object.equals and Object.hashCode. Somewhat recently, for
example, I wrote some code (I'm leaving out some details to prevent this
post from getting too long) where I needed to keep a Map<Graph,Strategy>
where Graph is a directed graph in the conventional sense, and Strategy
encapsulates a series of transformations that may be performed on that
graph. A user would look up a graph and be given certain options for
transformation. The result of the transformation might then be looked
up again, resulting in a few further options.

Now, I don't think too many people would consider a digraph to be a
minimalistic data structure... but it's what I needed to look up in the
structure. The Graph class was immutable, and tied to Strategy so that
graphs were manipulated by Strategy objects that return a different
Graph as a result. Graph encapsulated the lookup simply by being able
to calculate a consistent hash code for any graph, and being able to
compare itself against another class for equality. Implementing those
things wasn't exactly trivial (though not really hard, either), but once
it was done, the HashMap data structure worked fine, of course.

--
www.designacourse.com
The Easiest Way To Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation

tom fredriksen · Feb 18, 2006

Chris said:
The point that any necessary complexity can be encapsulated behind the
contract for Object.equals and Object.hashCode.

Just because you have a contract about an interface, and it can be kept
constant, does not mean that the code behind it needs to be as
complicated as you can manage, or want to, create it. Thats the key to
low complexity/high maintainability code.

Somewhat recently, for
example, I wrote some code (I'm leaving out some details to prevent this
post from getting too long) where I needed to keep a Map<Graph,Strategy>
where Graph is a directed graph in the conventional sense, and Strategy
encapsulates a series of transformations that may be performed on that
graph.

I am not in a position to say if that was the best way to do it. But of
course, its an intriguing way of doing it. But does it solve the low
complexity/high maintainability equation? Have you needed to change
things because of bugs or any other unforeseen design-/problems?

/tom

Roedy Green · Feb 18, 2006

It is not uncommon for me to use Maps where object identity is exactly
the right choice for my keys -- such a map associates values with
specific key objects (object semantics)

If you index by identity, you need the object itself to feed in as the
retrieval key. So what's the point of looking it up if you already
have it?

You would not use this to find the objects matched by identity
themselves,, but to find other unrelated associated objects, that the
key object does not link to, only by Map membership.

It would be like insisting on canonical interned strings for all your
HashMap lookup keys.

Roedy Green · Feb 18, 2006

Similarly, objects with weak
identity make fine HashMap keys, regardless of their complexity, so long
as they represent the information that you really want to use as a key.

That is a mindbending idea, using an array as an effectively immutable
and unique key for HashMap lookup even though the values of the
elements themselves change.

If you do this, you had better write a paragraph of explanation with
circles and arrows.

Roedy Green · Feb 18, 2006

That is a mindbending idea, using an array as an effectively immutable
and unique key for HashMap lookup even though the values of the
elements themselves change.

If you do this, you had better write a paragraph of explanation with
circles and arrows.

If I were implementing that I think I would put the array in a wrapper
object even if just to help maintenance programmers understand that
the object itself has its own identity that is preserved even though
the elements change.

I would also put in the hashCode and equals methods even if they just
called super with an explanation that was intentional. You DON'T want
them looking at the individual array elements.

Otherwise someone will be tempted to "fix" your code by adding a
"proper" equals method.

Chris Smith · Feb 18, 2006

tom fredriksen said:
Just because you have a contract about an interface, and it can be kept
constant, does not mean that the code behind it needs to be as
complicated as you can manage, or want to, create it. Thats the key to
low complexity/high maintainability code.

Can you solve that problem with less complexity, or do you just move the
complexity elsewhere? Somewhere, you've got to have code that compares
two graphs for equality. The hash code is, of course, an optimization
to avoid a linear search... but it's an even more welcome optimization
when the graphs are so expensive to compare.

I am not in a position to say if that was the best way to do it. But of
course, its an intriguing way of doing it. But does it solve the low
complexity/high maintainability equation? Have you needed to change
things because of bugs or any other unforeseen design-/problems?

Since I wrote the code two weeks ago and it's not used in production, it
doesn't really mean much that I haven't seen any bugs. But no, I
haven't seen any bugs.

--
www.designacourse.com
The Easiest Way To Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation

Chris Smith · Feb 18, 2006

Roedy said:
That is a mindbending idea, using an array as an effectively immutable
and unique key for HashMap lookup even though the values of the
elements themselves change.

If you do this, you had better write a paragraph of explanation with
circles and arrows.

It's definitely not a common way to use a plain array. The point is
that it works fine, and the complexity of the object's state makes it no
more likely to fail. Explanation is necessary, I agree... though I
don't see the utility of circles or arrows.

--
www.designacourse.com
The Easiest Way To Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation

Ian Pilcher · Feb 18, 2006

Roedy said:
That is a mindbending idea, using an array as an effectively immutable
and unique key for HashMap lookup even though the values of the
elements themselves change.

If you do this, you had better write a paragraph of explanation with
circles and arrows.

Or use an IdentityHashMap.

Roedy Green · Feb 19, 2006

though I
don't see the utility of circles or arrows.

It was a reference to an Arlo Guthrie song, Alice's restaurant.

James McGill · Feb 19, 2006

It was a reference to an Arlo Guthrie song, Alice's restaurant.

They are getting younger every day, Roedy. I feel your pain

Mike Schilling · Feb 19, 2006

James McGill said:
They are getting younger every day, Roedy. I feel your pain

I made a poster to display on the door of my office: A snarling Dick Cheney,
captioned "Seven hunters, two game wardens, and a cow". Many people have
complimented me on how funny it is, and then asked "Why a cow?"

tom fredriksen · Feb 19, 2006

Chris said:
Can you solve that problem with less complexity, or do you just move the
complexity elsewhere? Somewhere, you've got to have code that compares
two graphs for equality. The hash code is, of course, an optimization
to avoid a linear search... but it's an even more welcome optimization
when the graphs are so expensive to compare.

Of course somewhere you have to have the logic to perform the operations
you need done. And using equals() to initiate a comparison of two graphs
is as good as any.

But, using a hash code of the graph as a unique identifier of the graph
is not the way to go, as you then have to take into consideration many
factors that are quite complex, such as

- the complexity of the graph,
- a correct algorithm to produce a representative hash code for the
type and complexity of the graph,
- accuracy of the representation,
- collision rate of the hash code
- distribution properties of the hash code, when considering the
accuracy, collision rate etc

I would try to look for another piece of simple information about a
graph that can be used as a unique identifier.

/tom

Oliver Wong · Feb 20, 2006

Roedy Green said:
If you index by identity, you need the object itself to feed in as the
retrieval key. So what's the point of looking it up if you already
have it?

You would not use this to find the objects matched by identity
themselves,, but to find other unrelated associated objects, that the
key object does not link to, only by Map membership.

It would be like insisting on canonical interned strings for all your
HashMap lookup keys.

One case where I've used object identity as a key for maps was for
annotating trees for a compiler. The source code for the nodes of tree were
generated, so I didn't want to edit their source code and add in attributes
that way. So I make the nodes the keys in my map, and add annotations for
symbol-table construction and type checking as values.

Two nodes, representing the token "x" might be exactly equal in all
their attributes, but may be conceptually considered different, due to their
position in the tree. One such "x" may be a local variable, while another
one is a method parameter. So in cases like these, you really want identity,
and not just equality.

- Oliver

Magic number in Boolean	15	Mar 10, 2006
[Slightly OT] Which algorithm for permutations	0	Mar 3, 2006
Extending java.util.Set	11	Feb 15, 2006
Lisp-style list	9	Feb 21, 2006
Declaring local variables inside loop	37	Feb 22, 2006
Avoiding failsafe behaviour of iterators	13	Feb 15, 2006
WeakHashMap for values	3	Jun 8, 2006
Strategy or Iterator?	28	Mar 2, 2006

Arrays as key in a HashMap

trippy

Chris Smith

tom fredriksen

John C. Bollinger

Chris Smith

tom fredriksen

Roedy Green

Roedy Green

Roedy Green

Chris Smith

Chris Smith

Ian Pilcher

Roedy Green

James McGill

Mike Schilling

tom fredriksen

Oliver Wong

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads