ConcurrentHashMap - concurrency between clear() and iterator onmap.entrySet()

R

Rakesh

I have a concurrentHashMap with the following use case.

Map<K, V> map = new ConcurrentHashMap<K, V>()

Thread 1:

A: Iterable<Entry<K,V>> it = map.entrySet() ;
// walks through the iterator

Thread 2:

B: map.clear();
C: map = functionThatRefillsConcurrentMap();


I was reading the thread-safety guarantees in javadocs but could not
find a definitive answer for this.
(using JDK 6 update 12 on Ubuntu 8.04 ).

What is the concurrency guarantee that I could expect in the above
mentioned scenario .

Specifically -
* Thread 1 executes A
* Walks through the iterator
* Thread 2 executes B
* At this time - When Thread 1 checks for iterator.hasNext() or
iterator.next() - would that crash with a race condition /
deterministically conclude that there are no elements in the map and
exit gracefully.
 
K

Knute Johnson

Rakesh said:
I have a concurrentHashMap with the following use case.

Map<K, V> map = new ConcurrentHashMap<K, V>()

Thread 1:

A: Iterable<Entry<K,V>> it = map.entrySet() ;
// walks through the iterator

Thread 2:

B: map.clear();
C: map = functionThatRefillsConcurrentMap();


I was reading the thread-safety guarantees in javadocs but could not
find a definitive answer for this.
(using JDK 6 update 12 on Ubuntu 8.04 ).

What is the concurrency guarantee that I could expect in the above
mentioned scenario .

Specifically -
* Thread 1 executes A
* Walks through the iterator
* Thread 2 executes B
* At this time - When Thread 1 checks for iterator.hasNext() or
iterator.next() - would that crash with a race condition /
deterministically conclude that there are no elements in the map and
exit gracefully.

From the docs;

"The "Concurrent" prefix used with some classes in this package is a
shorthand indicating several differences from similar "synchronized"
classes. For example java.util.Hashtable and
Collections.synchronizedMap(new HashMap()) are synchronized. But
ConcurrentHashMap is "concurrent". A concurrent collection is
thread-safe, but not governed by a single exclusion lock. In the
particular case of ConcurrentHashMap, it safely permits any number of
concurrent reads as well as a tunable number of concurrent writes.
"Synchronized" classes can be useful when you need to prevent all access
to a collection via a single lock, at the expense of poorer scalability.
In other cases in which multiple threads are expected to access a common
collection, "concurrent" versions are normally preferable. And
unsynchronized collections are preferable when either collections are
unshared, or are accessible only when holding other locks.

Most concurrent Collection implementations (including most Queues) also
differ from the usual java.util conventions in that their Iterators
provide weakly consistent rather than fast-fail traversal. A weakly
consistent iterator is thread-safe, but does not necessarily freeze the
collection while iterating, so it may (or may not) reflect any updates
since the iterator was created."

When thread B clears the map, I think thread A will see no elements and
act accordingly. It should not throw a ConcurrentModificationException.
 
T

Tom Anderson

Rakesh said:
What is the concurrency guarantee that I could expect in the above
mentioned scenario .

Specifically -
* Thread 1 executes A
* Walks through the iterator
* Thread 2 executes B
* At this time - When Thread 1 checks for iterator.hasNext() or
iterator.next() - would that crash with a race condition /
deterministically conclude that there are no elements in the map and
exit gracefully.

From the docs;

["]Most concurrent Collection implementations (including most Queues)
also differ from the usual java.util conventions in that their Iterators
provide weakly consistent rather than fast-fail traversal. A weakly
consistent iterator is thread-safe, but does not necessarily freeze the
collection while iterating, so it may (or may not) reflect any updates
since the iterator was created."

When thread B clears the map, I think thread A will see no elements and
act accordingly. It should not throw a ConcurrentModificationException.

I didn't know about the modified semantics of concurrent iterators. I'm a
bit surprised, but i can see why it's like that.

Anyway, i thought i'd try it:



import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.Iterator;
public class ConcurrentIteration {
public static void main(String... args) throws InterruptedException {
int n = Integer.parseInt(args[0]);
final Map<Integer, Integer> map = new ConcurrentHashMap<Integer, Integer>();
for (int i = 0; i < n; ++i) {
map.put(i, i);
}
Thread t = new Thread(new Runnable() {public void run() {
map.clear();
System.out.println(map);
}});
Iterator<Integer> it = map.keySet().iterator();
for (int i = 0; i < (n - 1); ++i) {
it.next();
}
t.start();
t.join();
System.out.println(it.next());
}
}



I tried this with a few different values of n, and in all cases, the
iteration finished as if the clear hadn't happened. So, no
ConcurrentModificationException, but also no early termination.

tom
 
T

Tom Anderson

Tom Anderson said:
Rakesh wrote:

What is the concurrency guarantee that I could expect in the above
mentioned scenario .

Specifically -
* Thread 1 executes A
* Walks through the iterator
* Thread 2 executes B
* At this time - When Thread 1 checks for iterator.hasNext() or
iterator.next() - would that crash with a race condition /
deterministically conclude that there are no elements in the map and
exit gracefully.

From the docs;

["]Most concurrent Collection implementations (including most Queues)
also differ from the usual java.util conventions in that their Iterators
provide weakly consistent rather than fast-fail traversal. A weakly
consistent iterator is thread-safe, but does not necessarily freeze the
collection while iterating, so it may (or may not) reflect any updates
since the iterator was created."

When thread B clears the map, I think thread A will see no elements and
act accordingly. It should not throw a ConcurrentModificationException.

I didn't know about the modified semantics of concurrent iterators. I'm a
bit surprised, but i can see why it's like that.

Anyway, i thought i'd try it:

I tried this with a few different values of n, and in all cases, the
iteration finished as if the clear hadn't happened. So, no
ConcurrentModificationException, but also no early termination.

You don't start the thread that clears the collection until after you've
traversed the iterator--there only concurrent access is the last item.

Very good point. Thinking about it, the way i usually write iterators
invovles getting the next item before it's asked for, and keeping it in
the iterator, so if whoever wrote ConcurrentHashMap's iterator did the
same, you'd expect my test to always behave as i observed, regardless of
how the iterator and map really interact.
I used the following revision, which often failed to iterate the whole list.
Uncommenting the print statement about line 27 slows the main thread down
enough to give the clearing thread an opportunity. However, even slowing
down the gap between hasNext and next, the iterator always correctly
reported a potential next item. That is, clear discards iterated items that
have not yet been seen, so when the last it.next() occurs in your loop, the
final item is already waiting in the iterator and immune to the clear.

You're quite right. Here's yet another variant, similar to yours, but
retaining deterministic timing:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.Iterator;
public class ConcurrentIteration {
public static void main(String... args) throws InterruptedException {
int n = Integer.parseInt(args[0]);
final Map<Integer, Integer> map = new ConcurrentHashMap<Integer, Integer>();
for (int i = 0; i < n; ++i) {
map.put(i, i);
}
Thread t = new Thread(new Runnable() {public void run() {
map.clear();
System.out.println(map);
}});
Iterator<Integer> it = map.keySet().iterator();
it.next();
t.start();
t.join();
int i = 1;
try {
for (; i < (n - 1); ++i) {
if (!it.hasNext()) throw new RuntimeException("no next element");
it.next();
}
if (!it.hasNext()) throw new RuntimeException("no next element");
System.out.println(it.next());
}
catch (RuntimeException e) {
System.err.println("*** i = " + i);
throw e;
}
}
}

This always fails with the 'no next element' exception, never a
NoSuchElementException or anything else (including a successful
termination!). That supports two of your conclusions, namely that the
relationship between hasNext and next is preserved, and that the actions
of the clearing thread do become visible to the iterating thread. The
number printed in the catch block is always 2 or 3; the number appears to
be a consistent function of the size of the map. That suggests that the
iterator caches the next element, or occasionally the next two elements
(?!), but no more.

I don't really get why the fail-fast semantics were abandoned for the
concurrent map. Would it just have been too slow? I can't see how you'd do
it without a bit of synchronisation (probably a volatile read) on every
call to next, which could be quite a performance hit. But according to
this:

http://www.ibm.com/developerworks/java/library/j-jtp08223/

next does full-blown synchronisation anyway!

tom
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Staff online

Members online

Forum statistics

Threads
473,731
Messages
2,569,432
Members
44,832
Latest member
GlennSmall

Latest Threads

Top