Using java.util.map

  • Thread starter Christopher Benson-Manica
  • Start date
C

Christopher Benson-Manica

I sincerely hope I'm missing something about how java.util.map can be
used, but it seems that there is no way to update a single value
without traversing the map twice, at least not cleanly. As an
example, consider this code taken from the Sun autoboxing example:

public static void main(String[] args) {
Map<String, Integer> m = new TreeMap<String, Integer>();
for (String word : args) {
Integer freq = m.get(word);
m.put(word, (freq == null ? 1 : freq + 1));
}
}

Notice how both get() and put() are invoked. Clearly this means that
the map is traversed twice - once to find the value, and once to
figure out where to put a new value (which happens to be the same
place!). In C++ (where I come from), maps contain objects of type
pair<key, value>, and one can therefore retrieve the pair<> in
question and update the value with only one traversal of the map. Is
there any way to avoid two traversals in Java short of writing a kludgy
wrapper object such as

class WrapsAnInt {
private int wrapped;

public WrapsAnInt( int w ) { wrapped = w; }
public void setWrapped( int w ) { wrapped = w; }
public int getWrapped() { return wrapped; }
}

and put *those* in the map? If not, why didn't Java simply introduce
the concept of a pair<> like C++ did and fix the problem?
 
R

Robert Klemme

I sincerely hope I'm missing something about how java.util.map can be
used, but it seems that there is no way to update a single value
without traversing the map twice, at least not cleanly. As an
example, consider this code taken from the Sun autoboxing example:

public static void main(String[] args) {
Map<String, Integer> m = new TreeMap<String, Integer>();
for (String word : args) {
Integer freq = m.get(word);
m.put(word, (freq == null ? 1 : freq + 1));
}
}

Notice how both get() and put() are invoked. Clearly this means that
the map is traversed twice - once to find the value, and once to

Not the whole map is traversed twice. There are two lookups which take
O(log n) for the TreeMap and O(1) for the HashMap.
figure out where to put a new value (which happens to be the same
place!). In C++ (where I come from), maps contain objects of type
pair<key, value>, and one can therefore retrieve the pair<> in
question and update the value with only one traversal of the map. Is
there any way to avoid two traversals in Java short of writing a kludgy
wrapper object such as

class WrapsAnInt {
private int wrapped;

public WrapsAnInt( int w ) { wrapped = w; }
public void setWrapped( int w ) { wrapped = w; }
public int getWrapped() { return wrapped; }
}

If you're worried about performance writing a mutable Integer is the
best solution. Note that this class then also should inherit
java.lang.Number.

Also, use a HashMap which does faster lookups.
and put *those* in the map? If not, why didn't Java simply introduce
the concept of a pair<> like C++ did and fix the problem?

Java != C++

Cheers

robert
 
T

Thomas Weidenfeller

F'up set.
I sincerely hope I'm missing something about how java.util.map can be
used, but it seems that there is no way to update a single value
without traversing the map twice, at least not cleanly.

a) You don't traverse the map twice (traversing means systematically
visiting every entry). You perform a lookup and an insertion. Only if
these operations degenerate each element in the map would have been visited.

b) You assume that the C++ way is clean (with the need to explicitly
create an extra pair<K,T> object for each entry), and the Java way is dirty.

I claim the C++ way is dirty, because it exposes implementation interna.

c) You claim there is an equivalent of the get() method in C++ for maps,
which return a pair<K,T>. To the best of my knowledge, there isn't. You
get pairs when you iterate over a C++ map (which is not done in the Java
example), or you get references to values when you use the index
operator[], but nor pairs.

d) You can get the equivalent of (artificially created) pairs when you
treat a Map as a Set and iterate over the set - just like iterating over
a C++ map.
Notice how both get() and put() are invoked. Clearly this means that
the map is traversed twice

Clearly it doesn't mean that.
Is
there any way to avoid two traversals in Java short of writing a kludgy
wrapper object such as

That kludge is in fact a re-creation of the C++ kludge. You are adding
and put *those* in the map? If not, why didn't Java simply introduce
the concept of a pair<> like C++ did and fix the problem?

If you like C++'s way better, stay with C++.

/Thomas
 
P

Patricia Shanahan

Robert Klemme wrote:
....
If you're worried about performance writing a mutable Integer is the
best solution. Note that this class then also should inherit
java.lang.Number.

Why inherit from java.lang.Number?

I had a situation like this, and wrote the following class, inside my
Counter class which has the HashMap:

private static class Count {
private int val = 0;

private void increment() {
val++;
}

private int get() {
return val;
}
}

It does everything the surrounding class needs it to do, and not one
thing more. You seem to saying that I should have implemented several
public methods that the surrounding class will never need, and no other
class can call except through reflection?

Patricia
 
R

Robert Klemme

Robert Klemme wrote:
...

Why inherit from java.lang.Number?

I had a situation like this, and wrote the following class, inside my
Counter class which has the HashMap:

private static class Count {
private int val = 0;

private void increment() {
val++;
}

private int get() {
return val;
}
}

It does everything the surrounding class needs it to do, and not one
thing more. You seem to saying that I should have implemented several
public methods that the surrounding class will never need, and no other
class can call except through reflection?

Of course you can do it this way. But if you frequently (i.e. in
different classes) need a mutable Integer / Float then IMHO it's better
to provide a more general implementation. And if that inherits
java.lang.Number (and implements methods as needed) then it can also be
used in contexts that deal with Number so it's more generally usable.
This should probably be in the standard lib anyway.

My 0.02EUR

Kind regards

robert
 
C

Christopher Benson-Manica

Thomas Weidenfeller said:
F'up set.

I have no idea why, and I don't read that group - I'm programming in
Java, and I just want to understand the Java way of doing this. I'm
setting them back to c.l.j.p, but if you really feel this discussion
belongs elsewhere, by all means set followup-to on your reply.
a) You don't traverse the map twice (traversing means systematically
visiting every entry). You perform a lookup and an insertion. Only if
these operations degenerate each element in the map would have been visited.

You're right, "traverse" was not the right word.
b) You assume that the C++ way is clean (with the need to explicitly
create an extra pair<K,T> object for each entry), and the Java way is dirty.

I'm not saying Java is dirty, I'm just saying it seems to require two
lookups unless the programmer takes countermeasures.
I claim the C++ way is dirty, because it exposes implementation interna.

It's certainly an arguable point.
c) You claim there is an equivalent of the get() method in C++ for maps,
which return a pair<K,T>. To the best of my knowledge, there isn't.

find( const key_type& x ) returns an iterator to the pair with key x,
if it exists.
Clearly it doesn't mean that.

But the lookup is performed twice, yes?
That kludge is in fact a re-creation of the C++ kludge. You are adding
another level of indirection. Which is what the C++ pair<K,T> provides
in the first place in C++.

Yes, it's exactly a recreation of the C++ kludge, and I'm just
astonished that apparently the desire for it is small enough that
isn't built into the java.util.map...
 
C

Christopher Benson-Manica

Robert Klemme said:
Not the whole map is traversed twice. There are two lookups which take
O(log n) for the TreeMap and O(1) for the HashMap.

Right, I definitely misspoke.
If you're worried about performance writing a mutable Integer is the
best solution.
Yes...

Java != C++

Clearly :) But surely programmers have similar desires in both
languages, and I'm only curious why Java did not provide for this
specific desire.
 
P

Patricia Shanahan

Robert said:
Of course you can do it this way. But if you frequently (i.e. in
different classes) need a mutable Integer / Float then IMHO it's better
to provide a more general implementation. And if that inherits
java.lang.Number (and implements methods as needed) then it can also be
used in contexts that deal with Number so it's more generally usable.
This should probably be in the standard lib anyway.

Of course, if I found I generally needed a mutable integer, I might well
at some point refactor my code to use it instead of my Count class.
However, so far, I've had no need for a mutable integer.

Patricia
 
L

Lee Fesperman

Patricia said:
Of course, if I found I generally needed a mutable integer, I might well
at some point refactor my code to use it instead of my Count class.
However, so far, I've had no need for a mutable integer.

I agree. Despite my attitude when I first started with Java, I've never found a need for
it. Later, I did implement a set of Mutable classes (integer, double, string) for our
ORDBMS. We use Java methods for Stored Procedures, and the Mutable classes were intended
to support standard semantics for Stored Procedures. Even though it is a pure Java
database system, we do provide an ODBC driver where this capability is more important.
 
T

Thomas Weidenfeller

Christopher said:
I have no idea why,

You are doing language advocacy in a non-advocacy group. Since you don't
want to move the discussion to an advocacy group I don't see a point to
continue the discussion.
 
C

Christopher Benson-Manica

Thomas Weidenfeller said:
You are doing language advocacy in a non-advocacy group. Since you don't
want to move the discussion to an advocacy group I don't see a point to
continue the discussion.

What language am I advocating?
 
R

Rogan Dawes

Christopher said:
I sincerely hope I'm missing something about how java.util.map can be
used, but it seems that there is no way to update a single value
without traversing the map twice, at least not cleanly. As an
example, consider this code taken from the Sun autoboxing example:

public static void main(String[] args) {
Map<String, Integer> m = new TreeMap<String, Integer>();
for (String word : args) {
Integer freq = m.get(word);
m.put(word, (freq == null ? 1 : freq + 1));
}
}

Notice how both get() and put() are invoked. Clearly this means that
the map is traversed twice - once to find the value, and once to
figure out where to put a new value (which happens to be the same
place!). In C++ (where I come from), maps contain objects of type
pair<key, value>, and one can therefore retrieve the pair<> in
question and update the value with only one traversal of the map. Is
there any way to avoid two traversals in Java short of writing a kludgy
wrapper object such as

class WrapsAnInt {
private int wrapped;

public WrapsAnInt( int w ) { wrapped = w; }
public void setWrapped( int w ) { wrapped = w; }
public int getWrapped() { return wrapped; }
}

and put *those* in the map? If not, why didn't Java simply introduce
the concept of a pair<> like C++ did and fix the problem?

I notice that noone mentioned Map.Entry, which is one way of achieving
what Christopher is looking for.

http://java.sun.com/j2se/1.5.0/docs/api/java/util/Map.html#entrySet()

However, the Map interface does not provide an optimized way of looking
up the entry in question, but rather expects the user to iterate over
all the entries until they find the one they are looking for.

e.g.

Map.Entry<K,V> getEntry(Object key)

might be a useful method. Obviously I am not considering the additional
complexity that this adds to all implementations, etc.

Rogan
 
C

Christopher Benson-Manica

Rogan Dawes said:
Map.Entry<K,V> getEntry(Object key)
might be a useful method. Obviously I am not considering the additional
complexity that this adds to all implementations, etc.

What just occurred to me is that implementing getEntry() for a HashMap
would be non-trivial, if I understand how HashMap works correctly. It
seems like it would have been a fairly simple addition to TreeMap
however...
 
R

Robert Klemme

What just occurred to me is that implementing getEntry() for a HashMap
would be non-trivial, if I understand how HashMap works correctly. It
seems like it would have been a fairly simple addition to TreeMap
however...

HashMap.java:

Entry getEntry(Object key) {
Object k = maskNull(key);
int hash = hash(k);
int i = indexFor(hash, table.length);
Entry e = table;
while (e != null && !(e.hash == hash && eq(k, e.key)))
e = e.next;
return e;
}

IOW, the method exists already (JDK 1.4 and later) - it's just not public.

Kind regards

robert
 
C

Christopher Benson-Manica

(Continuing a thread from comp.lang.java.programmer...)
F'up set.

(I still have no idea why, but since I'd rather continue the
conversation than debate whether or not my comments constitute
"advocacy", I've superseded the previous post and directed the
conversation to your group of choice.)
a) You don't traverse the map twice (traversing means systematically
visiting every entry). You perform a lookup and an insertion. Only if
these operations degenerate each element in the map would have been visited.

You're right, "traverse" was not the right word.
b) You assume that the C++ way is clean (with the need to explicitly
create an extra pair<K,T> object for each entry), and the Java way is dirty.

I'm not saying Java is dirty, I'm just saying it seems to require two
lookups unless the programmer takes countermeasures.
I claim the C++ way is dirty, because it exposes implementation interna.

It's certainly an arguable point.
c) You claim there is an equivalent of the get() method in C++ for maps,
which return a pair<K,T>. To the best of my knowledge, there isn't.

find( const key_type& x ) returns an iterator to the pair with key x,
if it exists.
Clearly it doesn't mean that.

But the lookup is performed twice, yes?
That kludge is in fact a re-creation of the C++ kludge. You are adding
another level of indirection. Which is what the C++ pair<K,T> provides
in the first place in C++.

Yes, it's exactly a recreation of the C++ kludge, and I'm just
astonished that apparently the desire for it is small enough that
isn't built into the java.util.map...
 
P

Patricia Shanahan

Robert said:
What just occurred to me is that implementing getEntry() for a HashMap
would be non-trivial, if I understand how HashMap works correctly. It
seems like it would have been a fairly simple addition to TreeMap
however...

HashMap.java:

Entry getEntry(Object key) {
Object k = maskNull(key);
int hash = hash(k);
int i = indexFor(hash, table.length);
Entry e = table;
while (e != null && !(e.hash == hash && eq(k, e.key)))
e = e.next;
return e;
}

IOW, the method exists already (JDK 1.4 and later) - it's just not public.

Kind regards

robert


I don't really understand the attraction of using Entry.

The cost of the second lookup is likely to be very low. The HashMap code
is in the instruction cache. The relevant portions of the hash table are
in the data cache. The same branches will be taken and not-taken as in
the original lookup.

The cost of creating a new Integer object each time a count increments,
and garbage collecting the old one, may be greater than the cost of the
second lookup. Using Entry does nothing about that.

The mutable counter solution seems simple, clean, and obvious to me.

Patricia
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,009
Latest member
GidgetGamb

Latest Threads

Top