Using a lot of Maps

S

ses

So I find myself using rather a lot of maps in my latest piece of work
to represent nested objects e.g:

HashMap<Integer, HashMap<Integer,String>>

Where for example the first and second integers are attributes and
string is some value that an Integer-Integer combination corresponds
to. I only ever need to really put and get the String value based on
the Integer attributes, but it seems a more effective tree like
structure than using a custom data structure and having to iterate
over it.

My question is, should I be worried about using maps in this manner,
would a custom data structure be better? I think I'm becoming too used
to using them.
 
T

Tom Anderson

So I find myself using rather a lot of maps in my latest piece of work
to represent nested objects e.g:

HashMap<Integer, HashMap<Integer,String>>

Where for example the first and second integers are attributes and
string is some value that an Integer-Integer combination corresponds
to. I only ever need to really put and get the String value based on
the Integer attributes, but it seems a more effective tree like
structure than using a custom data structure and having to iterate
over it.

My question is, should I be worried about using maps in this manner,
would a custom data structure be better? I think I'm becoming too used
to using them.

That kind of collection-of-collection structure is okay for two or maybe
three levels, but after that, IME, it's usually worth reifying the objects
that are hiding inside it.

Could you give us a bit of an example of your data, and what you're doing
with it?

tom
 
M

markspace

So I find myself using rather a lot of maps in my latest piece of work
to represent nested objects e.g:

HashMap<Integer, HashMap<Integer,String>>

Where for example the first and second integers are attributes and
string is some value that an Integer-Integer combination corresponds
to. I only ever need to really put and get the String value based on
the Integer attributes, but it seems a more effective tree like
structure than using a custom data structure and having to iterate
over it.

My question is, should I be worried about using maps in this manner,
would a custom data structure be better? I think I'm becoming too used
to using them.


Possibly. I think (and this is just offf the top of my head) that one
problem with this approach is that you might end up with a lot of deeply
nested Maps:

HashMap<.., HashMap<.., HashMap<.., HashMap<..,..>>>> ....

And that's going to turn what could have been a single lookup O(1) into
a linear search O(n). It might be better to find a representation of
the key and value that allows you to look up a deeply nested value in
one go.

HashMap<Integer,Heirarchy> cache = ....

where Hierarchy is something like (code not compiled or tested):

class Hierarchy {
private final ArrayList<Object> hierarchy;
private int hash;

public Hierarchy( Object... parents) {
hierarchy = new ArrayList<Object>( parents.length );
for( Object o : parents) {
hierarchy.add( o );
}
}

@Override
public int hashCode() {
if( hash == 0 ) {
for( Object o : hierarchy ) {
hash = hash * 53 + o.hashCode();
}
}
return hash;
}

public List<Object> getHierarchy() {
return hierarchy.clone();
}

@Override
public boolean equals( Object ...
should override this
didn't need it for the example
}

This will let you make an entire hierarchy into one object, so it can be
looked-up in one go. I think this helps on both reads and writes, since
you don't have to search for a place to put it into a nest of Maps, or
make new Maps on the fly to hold new children.

As Tom said though I'd like to hear more about your use model for this
application. I don't think I've ever considered doing what you did, so
I'm unsure what sort of things one could or should do with it.
 
L

Lew

ses said:
So I find myself using rather a lot of maps in my latest piece of work
to represent nested objects e.g:

HashMap<Integer, HashMap<Integer,String>>

The generic argument inside the angle brackets should be 'Map', not
'HashMap'. If you use such nested structures, it should be similar
to:

Map <Integer, Map <Integer, String>> nestled = new HashMap <Integer,
Map said:
Where for example the first and second integers are attributes and
string is some value that an Integer-Integer combination corresponds
to. I only ever need to really put and get the String value based on

Your key is what? Not Integer. Not Integer looks up Integer. It's
Pair <Integer, Integer>.

IIUC.
the Integer attributes, but it seems a more effective tree like
structure than using a custom data structure and having to iterate
over it.

My question is, should I be worried about using maps in this manner,
would a custom data structure be better? I think I'm becoming too used
to using them.

What you show *is* a custom data structure.

That said, figure out what you're actually trying to accomplish.
Based on the verbal description, I read your requirement as to
implement a mapping from a pair of integers to a string, thus:
{ (Integer, Integer) -> String }.

What you implemented was { (Integer) -> { (Integer) -> String }}.

If your verbal description really does describe a mapping from an
integer pair to a string, implement it as such.

class Pair <T, U> { final T first; final U second; }
public class Foo
{
Map <Pair <Integer, Integer>, String> nestles = new HashMap <Pair
<Integer, Integer>, String> ();
// etc.
}

If I misinterpreted the requirement description, please clarify.
 
M

markspace

@Override
public int hashCode() {
if( hash == 0 ) {
for( Object o : hierarchy ) {
hash = hash * 53 + o.hashCode();
}
}
return hash;
}


There's a mistake above: I shouldn't use hash to accumulate the hash
code. Another thread could see a non-zero value that will be changed
later. Gotta use a temporary variable there.

@Override
public int hashCode() {
if( hash == 0 ) {
int x;
for( Object o : hierarchy ) {
x = x * 53 + o.hashCode();
}
hash = x;
}
return hash;
}
 
J

Jan Burse

Patricia said:
Remember to override equals and hashCode in terms of the equals and hash
code methods for first and second. Also, if the fields are final, which
is good especially in a hash structure key, they need to be initialized.
In this case I suggest a constructor.

Patricia

And if you use TreeMap as an instance of Map, instead of HashMap,
you would either need to implement the comparable interface or
provide an additonal comparator.

Pitty Java doesn't have Pair or somesuch in the java.util.* path.

Bye
 
J

Jan Burse

Jan said:
Pitty Java doesn't have Pair or somesuch in the java.util.* path.

On the other hand tree map has the advantage that one can
provide an extra comparator when constructing it. So one
could also go with new Object[]{.,.} instead of Pair. Just
covering the array in the comparator.

Best Regards
 
L

Lew

markspace said:
There's a mistake above:  I shouldn't use hash to accumulate the hash
code.  Another thread could see a non-zero value that will be changed
later.  Gotta use a temporary variable there.

   @Override
   public int hashCode() {
     if( hash == 0 ) {
       int x;
       for( Object o : hierarchy ) {
         x = x * 53 + o.hashCode();
       }
       hash = x;
     }
     return hash;
   }

If your goal is thread safety, you haven't achieved it with this.

Suppose Thread A comes along and sees '0' as the hash, calculates a
non-zero value and uses it. Some time later, Thread B comes along and
sees the object. It might see all the values in hierarchy that Thread
A saw, and still see a 0 value. Or, it might see different values in
the 'hierarchy' and see the calculated, non-zero value for 'hash' from
the old values. Or it might see some of the same values in
'hierarchy' but not others that have changed, and it might or might
not see 0 in 'hash'. Absent some synchronization, all bets are off.
 
M

markspace

Or, it might see different values in
the 'hierarchy' and see the calculated,


Nope, the class as written is immutable, and therefore thread safe.

Absent some synchronization, all bets are off.


The "final" keyword synchronizes the writes in the constructor with all
reads that follow the constructor. The JLS specifies that the final
keyword creates a "freeze action" at the end of the constructor, which
synchronizes all writes in the constructor with all other possible reads
in the JVM.

The write of "hash" is idempotent and therefore safe. It's a race
condition, but not a data race.

If I had allowed "hierarchy" to be modified after the constructor, then
yes, I'd have to use some additional form of synchronization. However,
as the class stands, it's thread safe.

C.f. _Java Concurrency in Practice_ by Brain Goetz.
 
A

Arne Vajhøj

Nope, the class as written is immutable, and therefore thread safe.

Technically it is not immutable as hashCode changes its state.
The "final" keyword synchronizes the writes in the constructor with all
reads that follow the constructor. The JLS specifies that the final
keyword creates a "freeze action" at the end of the constructor, which
synchronizes all writes in the constructor with all other possible reads
in the JVM.

The write of "hash" is idempotent and therefore safe. It's a race
condition, but not a data race.

If I had allowed "hierarchy" to be modified after the constructor, then
yes, I'd have to use some additional form of synchronization. However,
as the class stands, it's thread safe.

Safe, but not very elegant, because you have a test that
will make the calculation execute somewhere between one and
number of threads times.

Arne
 
M

markspace

Technically it is not immutable as hashCode changes its state.


Hmm, String is immutable and does the same thing.

Safe, but not very elegant, because you have a test that
will make the calculation execute somewhere between one and
number of threads times.


I must disagree. The code is basically the same as String and in modern
systems, with their massive memory usages, is highly unlikely to execute
more than once. The lack of synchronization on that method is worth it
in the long run -- the more threads, the more worthwhile it becomes.

"Elegant" is in the eye of the beholder but I think you are picking at
nits here. The code is perfectly fine and imo elegant.

In summary, read this:

<http://en.wikipedia.org/wiki/Parkinson's_Law_of_Triviality>
 
M

markspace

All hail Brain Goetz! And Brain Lea and Brain Bloch.


I for one welcome our new brainy... naw, that's overdone.

I guess any pedantic post correcting someone has to contain at least one
really ridiculous and obvious spelling error, doesn't it?
 
A

Arne Vajhøj

Hmm, String is immutable and does the same thing.

Hmm.

So immutable then just means immutable in the documented
state.

That is flexible usage of English.
I must disagree. The code is basically the same as String and in modern
systems, with their massive memory usages, is highly unlikely to execute
more than once. The lack of synchronization on that method is worth it
in the long run -- the more threads, the more worthwhile it becomes.

"Elegant" is in the eye of the beholder but I think you are picking at
nits here. The code is perfectly fine and imo elegant.

The fact that other code use the technique does not make it
elegant.

To some extent elegant is a matter of taste. But having
some code executed 1-N times depending on cache usage
pattern should not be considered elegant by anyone.

Given that String must by far be the class most used
with hashCode, then this type of code may be justified.

But that does still not make it elegant.

Arne

PS: Oh - and don't expect the "problem" to become
less probably on newer systems - the cache coherency
of x86/x86-64 is rather unique, RISC systems does not
have it, and Intel is talking about dropping it for
far out future systems
(http://www.itworld.com/hardware/128338/intel-1000-core-processor-possible).
 
M

markspace

That is flexible usage of English.


Well it's focusing on the important parts -- the externally visible
behavior. And by "externally visible" I mean visible to any thread in
the system, for any reason, no matter how convoluted its execution path.

The fact that other code use the technique does not make it
elegant.


I think we are going to have to agree to disagree on this.

PS: Oh - and don't expect the "problem" to become
less probably on newer systems - the cache coherency
of x86/x86-64 is rather unique, RISC systems does not
have it, and Intel is talking about dropping it for
far out future systems
(http://www.itworld.com/hardware/128338/intel-1000-core-processor-possible).


That's interesting. I'm vaguely familiar with message passing systems.
Message passing is not new, I did work briefly on one lo these many
years ago.

My understanding is that there are two types of synchronization --
shared memory, and message passing. Java's monitors are a shared memory
design. I'm not sure what it would take to get a monitor in Java to run
on a message passing system. And the rest of the Java memory model
might be impossible to port.

Anyway, good info, but unlikely to affect the Java world, which is
shared memory only afaict.
 
S

Screamin' Lord Byron

There's a mistake above: I shouldn't use hash to accumulate the hash
code. Another thread could see a non-zero value that will be changed
later. Gotta use a temporary variable there.

@Override
public int hashCode() {
if( hash == 0 ) {
int x;
for( Object o : hierarchy ) {
x = x * 53 + o.hashCode();
}
hash = x;
}
return hash;
}

I must admit that I don't really get this code. What should be the
initial value of x? Obviously this is a typo, as x isn't even
initialized and this won't compile. However, if x is initialized to hash
then x = x * 53 + o.hashCode() doesn't make any sense to me (because
hash==0). What am I missing here?
 
R

Roedy Green

My question is, should I be worried about using maps in this manner,
would a custom data structure be better? I think I'm becoming too used
to using them.

You might use an ArrayList for faster iterating. There is no point in
creating a new structure unless you have measured the performance of
the canned options and found them wanting, and it is obvious you can
make a major saving.
 
L

Lew

Screamin' Lord Byron said:
I must admit that I don't really get this code. What should be the
initial value of x? Obviously this is a typo, as x isn't even
initialized and this won't compile. However, if x is initialized to hash
then x = x * 53 + o.hashCode() doesn't make any sense to me (because
hash==0). What am I missing here?

The additive term. 'o.hashCode()' is unlikely to be 0 (though if it
is 'null' the whole thing blows up). If it is, then it rightly
contributes zero to the result. So at each iteration through the loop
after the first, 'x' will also be non-zero, most likely.

I learned '31' as the coefficient to use for 32-bit hashes, though,
back in my Knuth-reading days. Is '53' better?
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,743
Messages
2,569,478
Members
44,899
Latest member
RodneyMcAu

Latest Threads

Top