Thread safe access to elements of a growing array without synchronization

A

Adam Warner

Hi all,

I have a class with this field:

private volatile int[] typeSpecificPos=Const.intArray0;

where Const.intArray0 is simply a new int[0] singleton. As the array grows
empty slots are filled with -1. An element remains constant once it is
filled with a non-negative value (and this is the path that must be fast).

The field is volatile so any thread is guaranteed to always reference the
latest adjusted array assigned to typeSpecificPos.

Is the non-synchronized method below completely thread safe? I'll paste it
in and then explain each step of my approach:

public int getTypeSpecificSlotPos(int type) {
int typedPos;
try {typedPos=typeSpecificPos[type];} catch (ArrayIndexOutOfBoundsException e1) {
synchronized (this) {
try {typedPos=typeSpecificPos[type];} catch (ArrayIndexOutOfBoundsException e2) {
typedPos=TypeEqOffsetMap.getOrCreateOffset(type, this);
int oldLen=typeSpecificPos.length, newLen=Misc.ndxBelowPow2Lim(type);
int[] newArray=new int[newLen];
System.arraycopy(typeSpecificPos, 0, newArray, 0, oldLen);
for (int i=oldLen; i<newLen; ++i) {newArray=-1;}
newArray[type]=typedPos;
typeSpecificPos=newArray;
return typedPos;
}
if (typedPos>=0) return typedPos;
typedPos=TypeEqOffsetMap.getOrCreateOffset(type, this);
typeSpecificPos[type]=typedPos;
return typedPos;
}
}
if (typedPos>=0) return typedPos;
synchronized (this) {
typedPos=typeSpecificPos[type];
if (typedPos>=0) return typedPos;
typedPos=TypeEqOffsetMap.getOrCreateOffset(type, this);
typeSpecificPos[type]=typedPos;
return typedPos;
}
}

First I try to extract an int from the array. I either obtain the int or
the array turns out to be too small. If the array is of sufficient size
then I check whether the element is >=0. If it is I return the value. This
is the fast unsynchronized path.

If the element is negative then I synchronize upon the object containing
the field. Once synchronized I again check whether another thread has
already set the element. If it has I return the non-negative value.
Only one thread succeeds in setting the non-negative value of the element.

If the array was originally too small then I synchronize upon the object
containing the field and again try to read the element. If the element
still cannot be read I grow the array, set the element and assign the new
array to typeSpecificPos before returning the non-negative element.

Upon the second check if the array was of sufficient size then the array
must have been resized in another thread. I check whether the element is
non-negative for safety because I'm unsure whether a different thread with
a different offset that is no longer out of bounds could end up within
this path.

I hope I haven't overlooked anything that could render this approach
non-thread safe!

Regards,
Adam
 
O

Owen Jacobson

Hi all,

I have a class with this field:

private volatile int[] typeSpecificPos=Const.intArray0;

where Const.intArray0 is simply a new int[0] singleton. As the array
grows empty slots are filled with -1. An element remains constant once
it is filled with a non-negative value (and this is the path that must
be fast).

Possible issues:
1. The array is replaced entirely with a new, larger array immediately
after the current thread accesses it.

2. An item in the array is replaced immediately after the current thread
accesses it.

If it's safe for a thread to believe a value hasn't been set yet when it
has, then you only *need* to synchronize setting new values. If it's not
safe for a thread to "miss the boat" on a newly-added then you need to
serialize not only access to the array but also use of the results, which
has to happen at a higher level and you may want to consider whether
threads are even the right approach for this part of the program.

You need to synchronize adding new values either way.
 
C

Chris Uppal

Adam said:
Is the non-synchronized method below completely thread safe? I'll paste it
in and then explain each step of my approach:

No. Or, more accurately, I haven't checked through the code in detail, but I
think it's a safe bet that it contains invalid assumptions.

Consider this:

static volatile int data = new int[32];

That happens in a thread-safe manner as the class is initialised (it requires
special provision in the language spec to ensure that it /is/ thread-safe).
The array contains all zeros.

Now in one thread:

data[0] = 666;

And then /later/ in another thread:

int first = data[0].

The variable 'first' is /not/ guaranteed to be set to 666. Note that we are
not talking about the possibility that thread2 might execute before thread1.
The issue is that the value written to the memory backing data[] as seen by
thread1 is not guaranteed to be propagated through the (potentially different)
memory backing the same array as seen by thread2.

Roughly speaking, "volatile" is useful only for primitive types[*], everything
else requires synchronisation.

([*] It can also be useful for immutable objects too, but that requires very
special care.)

-- chris
 
A

Adam Warner

Hi all,

I have a class with this field:

private volatile int[] typeSpecificPos=Const.intArray0;

where Const.intArray0 is simply a new int[0] singleton. As the array
grows empty slots are filled with -1. An element remains constant once
it is filled with a non-negative value (and this is the path that must
be fast).

Possible issues:
1. The array is replaced entirely with a new, larger array immediately
after the current thread accesses it.

2. An item in the array is replaced immediately after the current thread
accesses it.

If it's safe for a thread to believe a value hasn't been set yet when it
has, then you only *need* to synchronize setting new values.

Ah ha! I don't need to synchronize anything nor even set the field as
volatile. Obtaining a value of -1 or an out of bounds exception is
harmless. It just means the offset will be looked up via a slower
O(n) route (instead of getting it from what this is--an O(1) cache). Even
if a new array replaces an old array that in the meantime had offsets
added to it all the cache has to do is look up those offsets again.
Volatile is not needed because it doesn't matter if an old array is looked
up. Everything will eventually sync to a larger array with non-negative
offsets even with high contention.

Synchronization is needed accessing the actual slots obtained via the
previous slot offset lookup [getTypeSpecificSlotPos(type)]:

private volatile Item[] slots=Const.ItemArray0;

/** getSlot is thread safe but unsynchronized with setSlot (so it could
* return an older value while setSlot is building a new array). */
public Item getSlot(Item key) {
int type=key.type();
int pos=getTypeSpecificSlotPos(type);
try {return slots[pos];} catch (ArrayIndexOutOfBoundsException e) {
return null;
}
}

/** setSlot is thread safe and synchronized so other updates are not lost
* while a new array is being built. */
public synchronized void setSlot(Item key, Item value) {
int type=key.type();
int pos=getTypeSpecificSlotPos(type);
try {slots[pos]=value;} catch (ArrayIndexOutOfBoundsException e) {
Item[] newArray=new Item[pos+1];
System.arraycopy(slots, 0, newArray, 0, slots.length);
newArray[pos]=value;
slots=newArray;
}
}

Since getSlot may already return an older value must slots be volatile?
(i.e. if slots is not volatile will setSlot always see the latest array
reference?)

Regards,
Adam
 
A

Adam Warner

Adam said:
Is the non-synchronized method below completely thread safe? I'll paste it
in and then explain each step of my approach:

No. Or, more accurately, I haven't checked through the code in detail, but I
think it's a safe bet that it contains invalid assumptions.

Consider this:

static volatile int data = new int[32];

That happens in a thread-safe manner as the class is initialised (it requires
special provision in the language spec to ensure that it /is/ thread-safe).
The array contains all zeros.

Now in one thread:

data[0] = 666;

And then /later/ in another thread:

int first = data[0].

The variable 'first' is /not/ guaranteed to be set to 666. Note that we are
not talking about the possibility that thread2 might execute before thread1.
The issue is that the value written to the memory backing data[] as seen by
thread1 is not guaranteed to be propagated through the (potentially different)
memory backing the same array as seen by thread2.

Roughly speaking, "volatile" is useful only for primitive types[*], everything
else requires synchronisation.

([*] It can also be useful for immutable objects too, but that requires very
special care.)

Are these correct implications:

A consistent view across threads of data's elements require data
to be accessed as...

synchronized (data) {
....
}

....for all reads and writes to the array 'data'?

But if I'm prepared to accept the transitory reading of stale data I only
need to synchronize upon writes.

Is it too much to hope for zero overhead synchronization of arrays when
running upon architectures with strong memory models that already provide
a thread consistent view of shared memory?

Regards,
Adam
 
T

Thomas Hawtin

The variable 'first' is /not/ guaranteed to be set to 666. Note that we are
not talking about the possibility that thread2 might execute before thread1.
The issue is that the value written to the memory backing data[] as seen by
thread1 is not guaranteed to be propagated through the (potentially different)
memory backing the same array as seen by thread2.

Roughly speaking, "volatile" is useful only for primitive types[*], everything
else requires synchronisation.

That's true of ye olde pre-1.5 Java. The new Java Memory Model (JMM) in
1.5 (and in practice 1.4) makes volatile useful. Volatile writes
"happen-before" reads of the written value. So any write before the
volatile write in the writing thread is visible to any read after the
volatile read in the reading thread. Clear?
([*] It can also be useful for immutable objects too, but that requires very
special care.)

In the old memory model, theoretically (and demonstrated practically on
some DEC Alphas), uninitialised immutable objects are possible.


So, to the code. It looks to me as if it is ok in the new JMM, but not
the old. As Chris says, the -1 are not necessarily visible to the
reading thread, leaving the initial 0s accessible. In the new JMM, the
volatile access ensures that there is a happens-before relationship in
there. The synchronized is irrelevant, as the read is not in a
synchronized block.

An easy fix for the old JMM, is to change the values by 1, so that 0
represents an uninitialised slot. That is always a good strategy.

If we use that fix, then we can ditch the volatile altogether. In the
new JMM, reading a volatile is relatively expensive. More expensive than
writing and almost as expensive as an uncontended lock acquisition. Any
non-thread local heap values in your registers will need to be dropped.

You should see a further speed up if you move all the slow path code out
of the method. Try blocks and unnecessary exceptions probably aren't
going to help either.

private int[] typeSpecificPos = Const.EMPTY_INT_ARRAY;

public int getTypeSpecificSlotPos(int type) {
int[] typeSpecificPos = this.typeSpecificPos;
if (type < typeSpecificPos.length) {
int typedPos = typeSpecificPos[type] - 1;
if (typedPos != -1) {
return typedPos;
}
}
return getTypeSpecificSlotPosSlowPath(type);
}

As always, I have to comment on style: It does not help to have multiple
statements on one line, multiple declarations on one line, long methods
or removing spaces.

Tom Hawtin
 
A

Adam Warner

The variable 'first' is /not/ guaranteed to be set to 666. Note that
we are not talking about the possibility that thread2 might execute
before thread1. The issue is that the value written to the memory
backing data[] as seen by thread1 is not guaranteed to be propagated
through the (potentially different) memory backing the same array as
seen by thread2.

Roughly speaking, "volatile" is useful only for primitive types[*],
everything else requires synchronisation.

That's true of ye olde pre-1.5 Java. The new Java Memory Model (JMM) in
1.5 (and in practice 1.4) makes volatile useful. Volatile writes
"happen-before" reads of the written value. So any write before the
volatile write in the writing thread is visible to any read after the
volatile read in the reading thread. Clear?
([*] It can also be useful for immutable objects too, but that requires
very special care.)

In the old memory model, theoretically (and demonstrated practically on
some DEC Alphas), uninitialised immutable objects are possible.


So, to the code. It looks to me as if it is ok in the new JMM, but not
the old. As Chris says, the -1 are not necessarily visible to the
reading thread, leaving the initial 0s accessible. In the new JMM, the
volatile access ensures that there is a happens-before relationship in
there. The synchronized is irrelevant, as the read is not in a
synchronized block.

Thank you for these clarifications Thomas. I am primarily interested in
the new memory model since the old memory model has "several serious
flaws" <http://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html>.

To reinforce my understanding: If the reference to an array is declared
volatile then different threads accessing array elements through this
volatile reference *will see a consistent view of memory*: "Under the new
memory model, writing to a volatile field has the same memory effect as a
monitor release, and reading from a volatile field has the same memory
effect as a monitor acquire. In effect, because the new memory model
places stricter constraints on reordering of volatile field accesses with
other field accesses, volatile or not, anything that was visible to thread
A when it writes to volatile field f becomes visible to thread B when it
reads f."

Fantastic.
An easy fix for the old JMM, is to change the values by 1, so that 0
represents an uninitialised slot. That is always a good strategy.

If we use that fix, then we can ditch the volatile altogether.

Since without the volatile designation another thread could see the
uninitialised array contents of 0 (which is also a cached value!).
Changing the values by 1 to render 0 an uncached value is clearly a good
strategy.
In the new JMM, reading a volatile is relatively expensive. More
expensive than writing and almost as expensive as an uncontended lock
acquisition. Any non-thread local heap values in your registers will
need to be dropped.

"Effectively, the semantics of volatile have been strengthened
substantially, almost to the level of synchronization. Each read or write
of a volatile field acts like 'half' a synchronization, for purposes of
visibility."
You should see a further speed up if you move all the slow path code out
of the method. Try blocks and unnecessary exceptions probably aren't
going to help either.

Catching an ArrayIndexOutOfBoundsException is part of the slow path. I
understand your advice to move all of the slow path code out of the method
as the fast path is more likely to inlined. But my approach guarantees
that there will be only one array bounds check within the fast path.
With your approach...
private int[] typeSpecificPos = Const.EMPTY_INT_ARRAY;

public int getTypeSpecificSlotPos(int type) {
int[] typeSpecificPos = this.typeSpecificPos;
if (type < typeSpecificPos.length) {
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ explicit array bounds check
int typedPos = typeSpecificPos[type] - 1;
^^^^^^^^^^^^^^^^^^^^^ implicit array bounds check
if (typedPos != -1) {
return typedPos;
}
}
return getTypeSpecificSlotPosSlowPath(type);
}

....I have to hope that the runtime will elide the second bounds check upon
typeSpecificPos. Is this an optimisation I should be able to rely upon?
As always, I have to comment on style: It does not help to have multiple
statements on one line, multiple declarations on one line, long methods
or removing spaces.

Yet one of the approaches below uses three times the amount of vertical
real estate. Let's just agree that reasonable people may disagree upon
some matters of style and all be right.

for (int i=oldLen; i<newLen; ++i) {newArray=-1;}

for (int i = oldLen; i < newLen; ++i) {
newArray = -1;
}

Regards,
Adam
 
C

Chris Uppal

Adam said:
Is it too much to hope for zero overhead synchronization of arrays when
running upon architectures with strong memory models that already provide
a thread consistent view of shared memory?

Probably ;-)

But I think Thomas's reformulation of your approach will work. (Assuming that
the array in intended to have logical write-once semantics per slot, which I
/think/ is the case here)

-- chris
 
C

Chris Uppal

Thomas Hawtin wrote:

[me:]
Roughly speaking, "volatile" is useful only for primitive types[*],
everything else requires synchronisation.

That's true of ye olde pre-1.5 Java. The new Java Memory Model (JMM) in
1.5 (and in practice 1.4) makes volatile useful. Volatile writes
"happen-before" reads of the written value. So any write before the
volatile write in the writing thread is visible to any read after the
volatile read in the reading thread. Clear?

It is undoubtedly time that I gritted my teeth and buckled down to working
properly through JLS3 chap 17. However, from my current sketchy understanding
of the spec, I am not getting that accessing the elements of the array pointed
to by a volatile variable[*] has any specific happens-before relationship
with anything. Do you have a reference (in the JLS or elsewhere) that talks
specifically about that case ? Or maybe you have followed the JLS wording in
more detail than I have. It's a bit odd that the JLS itself doesn't (that I
can find) mention this case at all -- which is particularly strange given that
the change (if real) has rather huge implications.

([*] or other non-volatile value transitively reachable via a volatile
reference.)

BTW, please don't be offended by my doubting you -- I know that you know your
stuff, but this seems to me to be a case were "extraordinary claims require
extraordinary proof" ;-)

-- chris
 
C

Chris Uppal

[I'm just picking up on one point here. See my reply to Thomas for more about
the effects (or not) on array access via a volatile reference.]

Adam said:
...I have to hope that the runtime will elide the second bounds check upon
typeSpecificPos. Is this an optimisation I should be able to rely upon?

Whether it elides it or not, the values used in the comparison are going to be
in L2 cache at worst (or equivalent on other architectures). What with
piplelining, etc, there's a fair chance that the second check will have zero
overhead.

-- chris
 
T

Thomas Hawtin

Chris said:
Thomas Hawtin wrote:

[me:]
Roughly speaking, "volatile" is useful only for primitive types[*],
everything else requires synchronisation.
That's true of ye olde pre-1.5 Java. The new Java Memory Model (JMM) in
1.5 (and in practice 1.4) makes volatile useful. Volatile writes
"happen-before" reads of the written value. So any write before the
volatile write in the writing thread is visible to any read after the
volatile read in the reading thread. Clear?

It is undoubtedly time that I gritted my teeth and buckled down to working
properly through JLS3 chap 17. However, from my current sketchy understanding
of the spec, I am not getting that accessing the elements of the array pointed
to by a volatile variable[*] has any specific happens-before relationship
with anything. Do you have a reference (in the JLS or elsewhere) that talks
specifically about that case ? Or maybe you have followed the JLS wording in
more detail than I have. It's a bit odd that the JLS itself doesn't (that I
can find) mention this case at all -- which is particularly strange given that
the change (if real) has rather huge implications.

You are correct for your example, but Adam's example was a bit
different. Reading a volatile reference of an array and then writing to
an element does not force the value to be written out immediately.

In the original example, if a valid value has not been read then it is
always double checked under synchronisation. For that to work, the array
needs to be initialised with an invalid values, -1. In the code the -1
values are written and only after that is the newly created array
written to the volatile. In order to read elements of the array, the
reading thread needs to read the volatile.

Essentially volatiles of any type work to protect data that was local to
the write thread before the volatile was written and does not change
after the write. In Adam's example volatile works to ensure that the
initial zeros are hidden, but other updates rely on the synchronisation.
It's like a one-time synchronise.

[What irritates me a little is that there is apparently no way to do a
plain read of a volatile with the old semantics. You can request a
spurious-failable compare-and-set with the old, through
weakCompareAndSet. But you can't cheaply read the value to be the CAS
on. Not that Sun's JVM implements weakCompareAndSet any differently to
compareAndSet.]


The important part of Chapter 17 is bullet 2 of 17.4.4 (p561) -

"A write to a volatile variable v synchronizes-with all subsequent
reads of v by any thread"

Compare to bullet 1 -

"An unlock action on monitor m synchronizes-with all subsequent
lock actions on m"

Tom Hawtin
 
T

Thomas Hawtin

Adam said:
To reinforce my understanding: If the reference to an array is declared
volatile then different threads accessing array elements through this
volatile reference *will see a consistent view of memory*: "Under the new

Well, they need not see any further writes to the array (hence the
necessity to double-check under synchronisation).
Catching an ArrayIndexOutOfBoundsException is part of the slow path. I
understand your advice to move all of the slow path code out of the method
as the fast path is more likely to inlined. But my approach guarantees
that there will be only one array bounds check within the fast path.
With your approach...

Array bounds checking is very, very, very cheap. Especially if the
length has already been read.

Perhaps more dubious in terms of performance of my subtracting one
before the comparison. That means the values needs to have arithmetic
performed on it before it can be checked, whereas comparison to zero can
probably be done as part of the load.

You could always measure it.
Yet one of the approaches below uses three times the amount of vertical
real estate. Let's just agree that reasonable people may disagree upon
some matters of style and all be right.

Until it comes to reading someone else's code. Sometimes standards do good.

Tom Hawtin
 
A

Adam Warner

Well, they need not see any further writes to the array (hence the
necessity to double-check under synchronisation).

I've just found strong evidence this is correct from the
java.util.concurrent.atomic package:

"The AtomicIntegerArray, AtomicLongArray, and AtomicReferenceArray classes
further extend atomic operation support to arrays of these types. These
classes are also notable in providing volatile access semantics for their
array elements, which is not supported for ordinary arrays."

Doug Lea made this comment about ordinary arrays in August 2003:
<http://www.cs.umd.edu/~pugh/java/memoryModel/archive/1554.html>

"I'd guess that only highly-memory-model-conscious programmers will ever
use these classes, so Hans's and Sarita comments and concerns about
ordinary Java arrays still apply."

Java's memory model remains so weak with respect to volatile arrays that
most programmers will write broken threaded code that only works because
it is running upon an architecture with a strong memory model. Those who
are "highly-memory-model-conscious programmers" will have to suffer the
higher overhead of, say, AtomicReferenceArray which is a generic class
that imposes at least one _extra level of indirection_ upon every element
lookup and a dynamic cast upon every element retrieved because of type
erasure.

Regards,
Adam
 
P

Patricia Shanahan

Adam said:
Hi all,

I have a class with this field:

private volatile int[] typeSpecificPos=Const.intArray0;

where Const.intArray0 is simply a new int[0] singleton. As the array grows
empty slots are filled with -1. An element remains constant once it is
filled with a non-negative value (and this is the path that must be fast).
....

Here is an alternative approach. Whether this is any good or not depends
on sizes, numbers of threads etc.

Give each thread its own fast cache array. When it can't find the data
it needs in its array, it executes a synchronized method that gets the
data and updates its cache.

In this design, the cache is not shared between threads. Each instance
is both read and written by a single thread, so it is obviously
consistent. The shared data is accessed only on the slow path, with
synchronization, also easy to make consistent.

Patricia
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,582
Members
45,062
Latest member
OrderKetozenseACV

Latest Threads

Top