Question whether a problem with race conditions exists in this case

S

Saxo

I have a class Node as displayed below that holds a new value and the
previous value. Which one applies is determined by the value of
useNewValue. In my application there is a list of such nodes where for
each node the current value is replaced with a new one. The new values
should become effective for the entire list all >at once< through an
atomic change of useNewValue in every node. For that purpose, when the
new value is set, for every node in the list the same AtomicBoolean
instance is passed on to setNewValue(...), which is stored in
useNewValueParam. When the commit is done, the value of this instance
of AtomicBoolean is changed to true (this way the thread doing the
commit does not have to enter the synchronized block as all the other
threads calling aNode.get) and thus the new value of every node
becomes visible at once to every thread calling aNode.get().

public class Node {

AtomicBoolean useNewValue = new AtomicBoolean(false);
private Object newValue = null;
private Object previousValue = null;
private Object lock = new Object();

public Object get() {
synchronized(lock) {
if(useNewValue.get()) // 1
return newValue; // 2
return previousValue; // 3
}
}

public void setNewValue(Object newValue, AtomicBoolean
useNewValueParam) {
synchronized(lock) {
if(this.useNewValue.get())
previousValue = this.newValue;
this.newValue = newValue;
// useNewValueParam is allways set to false when setNewValue is
called
this.useNewValue = useNewValueParam;
}
}
}

At the same time there is never more than one thread iterating over
the node list calling setNewValue. This is made sure through
serialization of threads that want to iterate over the list calling
setNewValue. Serialization of threads is a bit crude, but not relevant
at the moment for the discussion of the problem described here.

My question is now whether this approach is free of race conditions or
starvation issues if implemented as described above. I have some
doubts whether everything is fine here as useNewValue is changed by
the commit thread by reference without entering synchronized(lock)
{ ... }. So is everything still fine if a context switch happens
between line 1 and 2 or between line 1 and 3? It would be for sure if
the commit thread entered the synchronized(lock) { ... } block, but it
does not (changing to the new values all at once wouldn't be possible
otherwise).

I would be thankful if someone could look over this and tell me her/
his opinion.

Thanks al lot, Oliver
 
A

Andreas Leitgeb

Saxo said:
public class Node {
AtomicBoolean useNewValue = new AtomicBoolean(false);
private Object newValue = null;
private Object previousValue = null;
private Object lock = new Object();

I think, you would at least have to declare these volatile.
Otherwise, a different thread might e.g. still see an old
value in newValue, after the AtomicBoolean's value changes
to true.
 
L

Lew

I think, you would at least have to declare these volatile.

Nope, they use 'synchronized'. 'volatile' wouldn't work anyway, because these
have to change together, so they must use 'synchronized'.
 
D

Daniel Pitts

I have a class Node as displayed below that holds a new value and the
previous value. [snip]

public class Node {

AtomicBoolean useNewValue = new AtomicBoolean(false);
private Object newValue = null;
private Object previousValue = null;
private Object lock = new Object();

public Object get() {
synchronized(lock) {
if(useNewValue.get()) // 1
return newValue; // 2
return previousValue; // 3
}
}

public void setNewValue(Object newValue, AtomicBoolean
useNewValueParam) {
synchronized(lock) {
if(this.useNewValue.get())
previousValue = this.newValue;
this.newValue = newValue;
// useNewValueParam is allways set to false when setNewValue is
called
If it is always false, why pass it in at all? I don't believe this comment.
this.useNewValue = useNewValueParam;
}
}
}
Don't use tabs on usenet, it makes formatting bonkers. Use 2 to 4
spaces for indenting.

Since you have a lock here, you don't need to use an AtomicBoolean.
Even so, you're not using the AtomicBoolean as expected. The field for
it should be final, and you call "set(value)" on it, rather than replace
it with another AtomicBoolean instance. That defeats the purpose.
At the same time there is never more than one thread iterating over
the node list calling setNewValue. This is made sure through
serialization of threads that want to iterate over the list calling
setNewValue. Serialization of threads is a bit crude, but not relevant
at the moment for the discussion of the problem described here.
The threads are all serialized because of the synchronize(lock).
My question is now whether this approach is free of race conditions or
starvation issues if implemented as described above. I have some
doubts whether everything is fine here as useNewValue is changed by
the commit thread by reference without entering synchronized(lock)
{ ... }. So is everything still fine if a context switch happens
between line 1 and 2 or between line 1 and 3? It would be for sure if
the commit thread entered the synchronized(lock) { ... } block, but it
does not (changing to the new values all at once wouldn't be possible
otherwise).
synchronized(lock) prevents more than one thread from being inside *any*
synchronized(lock) block at *any* given time, so you will never have a
setter and getter running at the same time. This code will "work", but
is still not great quality.
I would be thankful if someone could look over this and tell me her/
his opinion.
Unrelated, it might be useful to use Generics here. I don't know much
of your use-case, but I'll demonstrate how it might be used. There are
actually two ways I would suggest doing this, depending on the frequency
of sets vs gets. If they are about equal, or sets are more frequent, I
suggest this class:

public class Node<V> {
private boolean useNew;
private V previousValue;
private V value;
private final Object lock = new Object();

public V getValue() {
synchronize(lock) {
return useNew ? value : previousValue;
}
}

public void setValue(V value, boolean useNew) {
synchronize(lock) {
previousValue = getValue();
this.value = value;
this.useNew = useNew;
}
}
}

If the sets are somewhat infrequent, I suggest the following class. It
will allow much faster access by multiple threads for getValue(). It
does have a small amount of overhead when setting the value.

public class Node<V> {
private volatile State<V> state = new State<V>(null, null, true);
private final Object setLock = new Object();

public V getValue() {
// volatile allows us to safely read a value.
return state.getValue();
}

public V setValue(V newValue, boolean useNew) {
// we need to serialize access here, otherwise we might get the
// wrong previous value.
synchronized(setLock) {
state = new State<V>(state.value, newValue, useNew);
}
}

// Note, this <V> is independent of the <V> in Node.
private static class State<V> {
private final V previousValue;
private final V newValue;
private volatile boolean V useNew;
public State(V previousValue, V newValue, boolean useNew) {
this.previousValue = previousValue;
this.newValue = newValue;
this.useNew = useNew;
}
public V getValue() {
return useNew ? newValue : previousValue;
}
}
}
 
D

Daniel Pitts

Nope, they use 'synchronized'. 'volatile' wouldn't work anyway, because these
have to change together, so they must use 'synchronized'.
There are other ways to atomically change multiple "fields". See my
other post.True if the only change was to add volatile and remove synchronized.
 
L

Lew

Saxo said:
I have a class Node as displayed below that holds a new value and the

First, good job with your SSCCE (http://sscce.org/). That was a well thought-
out and complete job.

Second, _never_ use TAB characters in code posts except in literals. To
indent, use two or four spaces per indent level. Two tends to fit better in
forum posts.
previous value. Which one applies is determined by the value of
useNewValue. In my application there is a list of such nodes where for
each node the current value is replaced with a new one. The new values
should become effective for the entire list all >at once< through an
atomic change of useNewValue in every node. For that purpose, when the
new value is set, for every node in the list the same AtomicBoolean
instance is passed on to setNewValue(...), which is stored in
useNewValueParam. When the commit is done, the value of this instance
of AtomicBoolean is changed to true (this way the thread doing the
commit does not have to enter the synchronized block as all the other
threads calling aNode.get) and thus the new value of every node
becomes visible at once to every thread calling aNode.get().

public class Node {

AtomicBoolean useNewValue = new AtomicBoolean(false);

A 'volatile boolean' should do here. Actually, a non-'volatile' variable will be just fine:

boolean useNewValue;

No need to set to 'false' what is already set to 'false'.

Why did you declare it package-private?
private Object newValue = null;

Why do you redundantly set this to 'null'?
private Object previousValue = null;
private Object lock = new Object();

Why not use the instance itself as the lock?
public Object get() {
synchronized(lock) {
if(useNewValue.get()) // 1
return newValue; // 2
return previousValue; // 3
}
}

public void setNewValue(Object newValue, AtomicBoolean
useNewValueParam) {

It is ridiculous to pass an 'AtomicBoolean'. You just need a 'boolean'.
synchronized(lock) {
if(this.useNewValue.get())
previousValue = this.newValue;
this.newValue = newValue;
// useNewValueParam is allways set to false when setNewValue is
called

That's patently untrue.

What the heck do you mean by this comment? I cannot make sense of it.
this.useNewValue = useNewValueParam;

You write a very confusing logic here. Is 'useNewValueParam' (terrible name, BTW) meant to affect the next time the accessor is called?

If so, you've pretty much thrown all your concurrency out the window. Any number of 'set' calls could occur, each flipping your boolean the other way, and you'll never know whether it's going to land butter side up.

If all you care about is that each reader see the correct current state whatever it may be after whatever number of setters, then fine.

Regardless, you need to document the heck out of what you intend to do withthis parameter. What the heck are you even trying to accomplish?
}
}
}

My question is now whether this approach is free of race conditions or
starvation issues if implemented as described above. I have some

You over-implemented it by a mile.

Use just one lock.
doubts whether everything is fine here as useNewValue is changed by
the commit thread by reference without entering synchronized(lock)
{ ... }. So is everything still fine if a context switch happens
between line 1 and 2 or between line 1 and 3? It would be for sure if
the commit thread entered the synchronized(lock) { ... } block, but it
does not (changing to the new values all at once wouldn't be possible
otherwise).

You have synchronized those lines. They are all in the same critical section. All accesses are covered by the same lock (plus several other unnecessary ones). You're good.

Except for how you coded it, that is.

Use the instance itself as the monitor, unless you really, really need a separate one. It would be kind to maintainers to comment your implementationchoice, and that in turn forces you to have a darned good reason.

Throwing a crapload of locks at a problem willy-nilly is superstitious programming. That is not effective. Don't do it.

*Think* about what you're doing. What references have you checked to improve your ability to reason about concurrent programming?

Seriously, which ones have you read?

I really am asking.

Brian Goetz's _Java Concurrency in Practice_ is an excellent resource.

You synchronize each block of reads (accesses) with the same monitor that you use for each block of writes (mutations). You have the same actors in each block, no more, no less. In your example, you have three. And insteadof 'Object', consider using generics.

public class Node<V> // off the top of my head, not even compiled
{
private V value;
private V previous;

private boolean useCurrent;

synchronized public V getValue()
{
return useCurrent? value : previous;
}

synchronized public void setValue(V value, boolean useCurrent)
{
previous = this.value;
this.value = value;
this.useCurrent = useCurrent;
}
}

This code will not behave the same as yours, but I presumptuously concludedthat your code did the wrong thing. Either way, you can see how simple you should make it. It should be quite straightforward to reason about how this code will work.
 
S

Saxo

I think, you would at least have to declare these volatile.
Otherwise, a different thread might e.g. still see an old
value in newValue, after the AtomicBoolean's value changes
to true.

Yes, good point. Thanks! The commit thread that changes useNewValue on
commit from false to true by reference from outside the synchronized
block does not enter the synchronized block. Therefore, the contents
of these values might still be cached for threads calling get().
 
S

Saxo

If it is always false, why pass it in at all? I don't believe this comment.>                    this.useNewValue = useNewValueParam;

It is passed in so that the visibility change from previousValue to
newValue can be done with a single atomic change of useNewValue by
reference from outside the synchronized block for all nodes in the
list at once. This is also the reason why boolean wouldn't work as not
accessible by reference and why it has to be aAtomicBoolean as the
commit thread changing the value of useNewValue does not enter the
synchronized block.

The threads are all serialized because of the synchronize(lock).

This is true for all threads calling aNode.get(), but not for the
single commit thread that changes useNewValue atomically by reference
from outside the synchronized block.
 
S

Saxo

It is ridiculous to pass an 'AtomicBoolean'.  You just need a 'boolean'..

Well, no. useNewValue is changed by reference. That's why it has to
been an object. When it is changed it is not protected by the
synchronized block since it is changed by reference from the commit
threas.
That's patently untrue.

What the heck do you mean by this comment?  I cannot make sense of it.

This way the single AtomicBoolean object is passed that will be
changed from false to true by reference, so that all new values become
visible for any thread calling aNode.get() "at once".
You write a very confusing logic here.  Is 'useNewValueParam' (terriblename, BTW) meant to affect the next time the accessor is called?

If so, you've pretty much thrown all your concurrency out the window.  Any number of 'set' calls could occur, each flipping your boolean the otherway, and you'll never know whether it's going to land butter side up.

Yes, that is why there is only a single commit thread at any time that
calls setNewValue and changes useNewValue. This is explained in my
initial post in the second block counting from below: "At the same
time there is never more than one thread iterating over the node list
calling setNewValue. This is made sure through serialization of
threads.

My initial post is quite a long read. I know ...
Throwing a crapload of locks at a problem willy-nilly is superstitious programming.  That is not effective.  Don't do it.

*Think* about what you're doing.  What references have you checked to improve your ability to reason about concurrent programming?

Seriously, which ones have you read?

I really am asking.

This is really funny and made me laugh. Thanks :).

Regards, Oliver
 
A

Andreas Leitgeb

Lew said:
Nope, they use 'synchronized'.

You're right on that... I didn't look at the rest carefully
enough.
'volatile' wouldn't work anyway, because these
have to change together, so they must use 'synchronized'.

I thought, that was the whole point of the code: first store
the new value in a separate var, then later do the switchover.
I even think, that the synchronized blocks would be redundant,
if the fields were volatile. Afterall, there wasn't any
check-and-set or similar combined operation involved.

From examples posted here during the last few years, I've come
to think, that one just cannot really understand concurrency
and memory-barriers. If one thinks he does, then he either
missed the hairier bits, or just doesn't (perhaps rightly so)
believe them.

Doesn't volatile imply: "really look at what's in the field,
rather than what is in local CPU's cache" ?
 
E

Eric Sosman

I have a class Node as displayed below that holds a new value and the
previous value. Which one applies is determined by the value of
useNewValue. In my application there is a list of such nodes where for
each node the current value is replaced with a new one. The new values
should become effective for the entire list all>at once< through an
atomic change of useNewValue in every node. For that purpose, when the
new value is set, for every node in the list the same AtomicBoolean
instance is passed on to setNewValue(...), which is stored in
useNewValueParam. When the commit is done, the value of this instance
of AtomicBoolean is changed to true (this way the thread doing the
commit does not have to enter the synchronized block as all the other
threads calling aNode.get) and thus the new value of every node
becomes visible at once to every thread calling aNode.get().

public class Node {

AtomicBoolean useNewValue = new AtomicBoolean(false);

Why isn't this `private'? Is something else going on that
you haven't told us about?
private Object newValue = null;
private Object previousValue = null;
private Object lock = new Object();

What does `lock' buy you? Why not just synchronize on the
Node itself?
public Object get() {
synchronized(lock) {
if(useNewValue.get()) // 1
return newValue; // 2
return previousValue; // 3
}
}

public void setNewValue(Object newValue, AtomicBoolean
useNewValueParam) {
synchronized(lock) {
if(this.useNewValue.get())
previousValue = this.newValue;
this.newValue = newValue;
// useNewValueParam is allways set to false when setNewValue is
called

I don't see why that would matter.
this.useNewValue = useNewValueParam;
}
}
}

At the same time there is never more than one thread iterating over
the node list calling setNewValue.

So `synchronized' is just to ensure that the single setNewValue()
caller doesn't overlap any get() callers, is that right? (Nothing
wrong with that; I'm just trying to test my understanding of what
you're up to.)
This is made sure through
serialization of threads that want to iterate over the list calling
setNewValue. Serialization of threads is a bit crude, but not relevant
at the moment for the discussion of the problem described here.

My question is now whether this approach is free of race conditions or
starvation issues if implemented as described above. I have some
doubts whether everything is fine here as useNewValue is changed by
the commit thread by reference without entering synchronized(lock)
{ ... }. So is everything still fine if a context switch happens
between line 1 and 2 or between line 1 and 3? It would be for sure if
the commit thread entered the synchronized(lock) { ... } block, but it
does not (changing to the new values all at once wouldn't be possible
otherwise).

When you change the shared AtomicBoolean, some threads executing
get() may have already observed the old value and may have decided
which value to return based on that now-outdated observation. So yes,
there's a race: It is possible for get() to return an old value after
the AtomicBoolean changes from true to false, or a new value after it
changes from false to true.

What's the larger problem you're trying to solve?
 
A

Andreas Leitgeb

Lew said:
First, good job with your SSCCE (http://sscce.org/). That was a well thought-
out and complete job.

But it seems like it was futile, as at least two regulars didn't understand
the reason behind the use of AtomicBoolean, despite the sccee.
Second, _never_ use TAB characters in code posts except in literals.

Why even "except in literals"? There's \t for (String-)literals.
A 'volatile boolean' should do here. Actually, a non-'volatile' variable will be just fine:
boolean useNewValue;

The initialization is probably only to avoid null-issues
before first setNewValue. After that, there's only *one*
AtomicBoolean instance for all relevant Node instances.
Why not use the instance itself as the lock?

That, however, is a good question...
 
S

Saxo

     Why isn't this `private'?  Is something else going on that
you haven't told us about?

I just forgot to set it private.
     What does `lock' buy you?  Why not just synchronize on the
Node itself?

The purpose is only to indicate that some more fine-gtrained locking
would be used for the real thing instead of doing a synchronized(this)
{ ... } thing.
     I don't see why that would matter.

You mean that comment "useNewValueParam is allways set to false when
setNewValue is called"? To make sure that after setting the new value
the previous value is still returned on a call to get() until
useNewValue is set to true. This is why

if(this.useNewValue.get())
previousValue = this.newValue;

is done to make sure that the previous value is still returned in case
useNewValue was true before it was replaced with the new AtomicBoolean
through

this.useNewValue = useNewValueParam;
     So `synchronized' is just to ensure that the single setNewValue()
caller doesn't overlap any get() callers, is that right?  (Nothing
wrong with that; I'm just trying to test my understanding of what
you're up to.)

Yes, exactly.

     What's the larger problem you're trying to solve?

Well, I'm basically thinking about how to implement a database commit.
I don't use a database. Instead, the data is on the heap in a simple
list. When the commit is done, all the new values in all the nodes in
my list become visible at once (by changing useNewValue as explained)
to any thread calling get(). This way I get around changing the values
in a big big synchronized block which would cause quite some lock
contention.

Also, if only the elements in my list at position 1 and 3 have
changed, I only need to call setNewValue on the nodes at these
positions. I don't have to lock the entire list. If the values for the
elements at position 2 and 4 have to be changed at the same time, the
commit could be done in parallel to increase throughput (because there
are no overlapping regions) instead of locking the entire list.
 
S

Saxo

     When you change the shared AtomicBoolean, some threads executing
get() may have already observed the old value and may have decided
which value to return based on that now-outdated observation.  So yes,
there's a race: It is possible for get() to return an old value after
the AtomicBoolean changes from true to false, or a new value after it
changes from false to true.

Oh shit, of course! I have thought about it. First though was to call

if(useNewValue.get()) {
// ...
}

that often till it returns the same value. But useNewValue could
change just thereafter. So you would have to call this infinitely
often till you are sure you get the right value. Obviously, this
doesn't work. Then I forgot about the issue.

Thanks for the hint. Now I have to think how to solve this ...

Chhers, Oliver
 
M

markspace

This way the single AtomicBoolean object is passed that will be
changed from false to true by reference, so that all new values
become visible for any thread calling aNode.get() "at once".

But there isn't a single AtomicBoolean, and it isn't passed in. It's
assigned as an instance variable initialization. There's one
AtomicBoolean per Node object.

This is really important, because it makes your whole code wrong. If
you want to pass something in, then pass it in.

public class Node {
private final AtomicBoolean useNewValue;

public Node( AtomicBoolean commonFlag ) {
useNewValue = commonFlag;
}

...

}
 
E

Eric Sosman

But there isn't a single AtomicBoolean, and it isn't passed in. It's
assigned as an instance variable initialization. There's one
AtomicBoolean per Node object.

The instance field gets changed in the setNewValue() call.
The original value is only there, as far as I can see, to cough
up a "false" until it's replaced.

I still wish I knew what he's trying to do, though.
 
S

Saxo

     The instance field gets changed in the setNewValue() call.
The original value is only there, as far as I can see, to cough
up a "false" until it's replaced.

Yes, right :).
     I still wish I knew what he's trying to do, though.

I just posted two replies to your last post to answer your questions.
You might have missed those due to a race condition ;-)). Have to go
to bed now, though. The sun has set some hours ago over here ...

Cheers, Oliver
 
E

Eric Sosman

[...]
What's the larger problem you're trying to solve?

Well, I'm basically thinking about how to implement a database commit.
I don't use a database. Instead, the data is on the heap in a simple
list. When the commit is done, all the new values in all the nodes in
my list become visible at once (by changing useNewValue as explained)
to any thread calling get(). This way I get around changing the values
in a big big synchronized block which would cause quite some lock
contention.

You'll still have the problem that threads T1 and T2 could get()
different values from the same Node, and both be working with those
conflicting values at the same time. An "instantaneous" change will
not avoid the issue -- so either (a) it better not be an issue, or
(b) you need The Mother Of All Locks around the whole shebang ...

Also, the problem ordinarily addressed by a "commit" mechanism
is somewhat different: You want to make changes to Nodes N1 and N2,
and you want to make sure an observer sees either the two old values
or the two new values, but never a mixture. Again, "instantaneous"
change won't solve the problem, not even in the simple case of just
one mutator and one observer:

Observer: Fetch the N1 value, then ...
Mutator: CHANGE THE WORLD INSTANTANEOUSLY
Observer: ... fetch the N2 value

For the observer's view of N1 and N2 to be consistent, it's not
enough that each get() be guarded against interference; you need
the pair of get()'s guarded as a unit. Very likely, you also need
the changes to N1 and N2 guarded as a unit.
 
M

markspace

Yes, right :).


That's not really good enough. If the instance value exists for any
time, then there's a chance that a thread will come in before you have a
chance to use it. It's a big trap for anyone who didn't design the
code. My way was much more clear and also bullet proof.
 
L

Lew

Saxo said:
Oh shit, of course! I have thought about it. First though was to call

if(useNewValue.get()) {
// ...
}

that often till it returns the same value. But useNewValue could
change just thereafter. So you would have to call this infinitely
often till you are sure you get the right value. Obviously, this
doesn't work. Then I forgot about the issue.

Thanks for the hint. Now I have to think how to solve this ...

Bearing in mind that you don't need, and shouldn't use, an 'AtomicBoolean' for
this in the first place.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,754
Messages
2,569,522
Members
44,995
Latest member
PinupduzSap

Latest Threads

Top