What in C++11 prohibits mutex operations from being reordered?

Alexander Terekhov · Apr 5, 2013

Michael Podolsky wrote:

[... m1.unlock() "sequenced before" m2.lock() reordering ...]

May be I am not completely following your idea. Is there any important/
significant difference between what you are proposing to do with
lock_but_if_blocked_unlock_another and between my code which looked
like:

Both are the same in the sense that reordering is happening in
deadlock-free manner (executing thread won't wait for m2 while holding
m1 locked).

The following can not be reordered (reordering in whatever manner may
not be observed):

m1.lock() "sequenced before" m2.lock()
m1.lock() "sequenced before" m2.unlock()
m1.unlock() "sequenced before" m2.unlock()

Reordering of m1.unlock() "sequenced before" m2.lock() may be observed
using (sane) trylock().

regards,
alexander.

?? Tiib · Apr 5, 2013

Why affecting observable behavior is problematic?

Because observable behavior is sacred. It can't be changed unless
there is undefined behavior. All programming languages are for
describing observable behavior and so are on one step above
assembly languages (that describe machine instructions). C++ is
not exception there.

Alexander Terekhov · Apr 5, 2013

Michael Podolsky wrote:
[...]

That is what you see in my last example. Ordered execution does not
deadlock. Reordered execution deadlocks. Both executions are allowed
for observable behavior of abstract machine.

Now if you agree with this, then probably your argument is not enough
to explain why mutex operations cannot be reordered.

With mutexes in the absence of trylock() the program is guaranteed to
behave SC (and obviously without injected deadlocks).

With trylock() in the game, reordering m1.unlock() -> m2.lock()
(overlapping of subsequent-as-coded critical regions) can be observed as
far as transition of mutex states is concerned but deadlocks still won't
be injected.

What is so hard to understand here?

regards,
alexander.

Bart van Ingen Schenau · Apr 5, 2013

Why affecting observable behavior is problematic?

Because that is the only thing in which an actual execution is NOT
allowed to deviate from what the abstract machine would do.
And adhering strictly to the semantics of the abstract machine would make
just about any optimization technique impossible.

"An instance of the abstract machine can thus have more than one
possible execution for a given program and a given input."

1.9p3

Yes, and the actual execution must correspond to the observable behavior
of one of those possible executions of the abstract machine.
That non-determinism is the price you pay for having the possibility of
concurrency.

Regards, Michael

Bart v Ingen Schenau

Bart van Ingen Schenau · Apr 5, 2013

On Apr 4, 1:53 pm, Bart van Ingen Schenau

It would be if you prove that according to the standard C++11 abstract
machine cannot deadlock, say, on two threads executing

thread 1:
m1.lock();
m1.unlock();
m2.lock();
m2.unlock();

thread 2:
m2.lock();
m2.unlock();
m1.lock();
m1.unlock();

Yes, I know, if this code is executed "sequentially" it cannot deadlock,
but I mean C++11 abstract machine

The C++ abstract machine executes the code faithfully as written without
any form of optimization being applied. This means that the abstract
machine can't ever deadlock on the code above because at no time does a
single thread hold both locks at the same time.

I am not completely sure you are right here, saying that non-reordered
code execution is what defines the execution of abstract machine. Here
is example of program which can not deadlock if executed in "ordered"
way, but might deadlock if executed with reordering. And I suppose you
should agree that BOTH executions are what the standard allows, i.e.
they should both conform to the abstract machine.

thread 1:
// atomic1 and atomic2 are initially 0 (zero)
atomic1.store(1, memory_order_relaxed);
atomic2.store(2, memory_order_relaxed);

thread2:
int a2 = atomic2.load(memory_order_seq_cst);
int a1 = atomic1.load(memory_order_seq_cst);
if(a2==2 && a1==0)
GO_AND_DEADLOCK_OUR_PROGRAM(); // never returns,
// blocks all threads ))

thread2 will execute GO_AND_DEADLOCK_OUR_PROGRAM() only if assignments
to atomic1 and atomic2 in thread1 were reordered by compiler (hardware
reordering aside). And such execution is allowed by standard!

You are wrong here. It does NOT take a reordering of the two stores in
thread 1 to have thread 2 deadlock. It is sufficient that the results
from those writes become visible to thread 2 out of order. This is a
possibility, even for the abstract machine.
Clause 1.10/5 notes that relaxed atomic operations do not provide
synchronization and clause 1.10/6 notes that modifications on different
atomic objects can become visible to other threads in inconsistent
orders. Together, this allows the deadlock to occur in the abstract
machine even without reordering the stores.

That is what you see in my last example. Ordered execution does not
deadlock. Reordered execution deadlocks. Both executions are allowed for
observable behavior of abstract machine.

Now if you agree with this, then probably your argument is not enough to
explain why mutex operations cannot be reordered.

I don't agree here, because non-reordered execution can also deadlock.
That is the trouble with relaxed memory order. Change three atomics with
relaxed memory order and six threads can see different orders of
modification all at the same time (and that is a valid execution of the
abstract machine).

Regards, Michael

Bart v Ingen Schenau

Alexander Terekhov · Apr 5, 2013

Bart van Ingen Schenau wrote:

[... modifying a bunch of atomics under relaxed memory order ...]

I don't agree here, because non-reordered execution can also deadlock.
That is the trouble with relaxed memory order. Change three atomics with
relaxed memory order and six threads can see different orders of
modification all at the same time (and that is a valid execution of the
abstract machine).

The key is whether compiler/implementation is allowed to come up with
six different sequences of modifications and choose the order randomly
at runtime - that way reordering may be observable even if all threads
share the same view of memory (run on a uniprocessor/same core... i.e.
'hardware' memory model is SC).

regards,
alexander.

Michael Podolsky · Apr 5, 2013

Because that is the only thing in which an actual execution is NOT
allowed to deviate from what the abstract machine would do.
And adhering strictly to the semantics of the abstract machine would make
just about any optimization technique impossible.

Yes, and the actual execution must correspond to the observable behavior
of one of those possible executions of the abstract machine.

Which means that if we change the observable behavior but still stay
inside the allowed SET of executions of NONDETERMINISTIC abstract
machine, we are all right.

The criteria of violation of the standard is not changing of
observable behavior from a confirming one to another, but(!) going out
of SET of possible executions of abstract machine.

Regards, Michael

Michael Podolsky · Apr 5, 2013

Because observable behavior is sacred. It can't be changed unless
there is undefined behavior. All programming languages are for
describing observable behavior and so are on one step above
assembly languages (that describe machine instructions). C++ is
not exception there.

Three is no THE "observable behavior". There is a "SET allowable
behaviors" (I am citinng 1.9p3). There is no problem to change
observable behavior of your program if you remain inside this set.

And btw this has nothing to "undefined behavior". All this set of
allowable behaviors is not undefined, i.e. is well-defined by the
standard.

Regards, Michael

Michael Podolsky · Apr 5, 2013

The following can not be reordered (reordering in whatever manner may
not be observed):

m1.lock() "sequenced before" m2.lock()
m1.lock() "sequenced before" m2.unlock()
m1.unlock() "sequenced before" m2.unlock()

Taking the same mutex twice??? Hmmm... Did you actually mean
"m2.lock()" in the second line?

Reordering of m1.unlock() "sequenced before" m2.lock() may be observed
using (sane) trylock().

BTW, I actually share your pity about what they did with trylock in C+
+.
Could not they do two flavors of trylock - one for CS and one (sane)
just for the case of people who want real real trylock and not smth
undeterministic.

Regards, Michael

Alexander Terekhov · Apr 5, 2013

Michael said:
Taking the same mutex twice??? Hmmm... Did you actually mean
"m2.lock()" in the second line?

I mean it as in

the following can not be reordered (reordering in whatever manner may
not be observed):

m1.lock() "sequenced before" m2.lock()

also, the following can not be reordered either:

m1.lock() "sequenced before" m2.unlock()

and, the following can not be reordered as well:

m1.unlock() "sequenced before" m2.unlock()

is it clear now?

regards,
alexander.

?? Tiib · Apr 5, 2013

Three is no THE "observable behavior". There is a "SET allowable
behaviors" (I am citing 1.9p3).

That is why I snipped it as irrelevant. On cases of unspecified
behaviors (something may or may not happen) standard tends to tell the
limits of allowable behaviors. (not so on case of undefined behaviors).

There is no problem to change
observable behavior of your program if you remain inside this set.

That is cite of ?1.9/3? Mine copy has such: "Certain other aspects
and operations of the abstract machine are described in this
International Standard as unspecified (for example, order of evaluation
of arguments to a function). Where possible, this International Standard
defines a set of allowable behaviors. These define the
nondeterministic aspects of the abstract machine. An instance of the
abstract machine can thus have more than one possible execution for a
given program and a given input."

And btw this has nothing to "undefined behavior". All this set of
allowable behaviors is not undefined, i.e. is well-defined by the
standard.

What set of allowable behaviors for what unspecified behavior you are
talking about? What paragraph?

Michael Podolsky · Apr 6, 2013

....
is it clear now?

Alexander, I agree with the ordering rules you wrote (they are
naturally follow from the fact that lock is "acquire" and unlock is
"release".

I just meant that in your example you are locking your mutex twice.
Just look at your code. m1 is locked twice. That I do not understand
if you really meant to write it.

Regards, Michael

Alexander Terekhov · Apr 6, 2013

Michael Podolsky wrote:
[...]

I just meant that in your example you are locking your mutex twice.
Just look at your code. m1 is locked twice. That I do not understand
if you really meant to write it.

Yeah, and m2 is doubly unlocked similarly... seriously, it is just a
list of paired "sequenced before" ops on m1 and m2 that may not be
observed as reordered. This list of three pairs is not supposed to be
"run" by the same thread line by line in whatever order if that is how
you see it.

regards,
alexander.

Bart van Ingen Schenau · Apr 6, 2013

Bart van Ingen Schenau wrote:

[... modifying a bunch of atomics under relaxed memory order ...]

I don't agree here, because non-reordered execution can also deadlock.
That is the trouble with relaxed memory order. Change three atomics
with relaxed memory order and six threads can see different orders of
modification all at the same time (and that is a valid execution of the
abstract machine).

Click to expand...

The key is whether compiler/implementation is allowed to come up with
six different sequences of modifications and choose the order randomly
at runtime - that way reordering may be observable even if all threads
share the same view of memory (run on a uniprocessor/same core... i.e.
'hardware' memory model is SC).

If you use knowledge that lies outside the realm covered by the standard
(all threads share the exact same view of the memory), then you can
indeed observe that certain optimizations have take place, such as
reordering certain statements.
This is similar to observing that the compiler has eliminated some code
by analyzing the runtime characteristics of a program.

Both can only be observed by using knowledge that is present about a
particular real implementation that is not available for the abstract
machine.

All of this is perfectly fine, as long as the observable behaviour (as
defined by the standard) corresponds to the observable behaviour of one
of the possible executions of the abstract machine.

regards,
alexander.

Bart v Ingen Schenau

?? Tiib · Apr 6, 2013

On Apr 5, 6:27?am, Bart van Ingen Schenau

Which means that if we change the observable behavior but still stay
inside the allowed SET of executions of NONDETERMINISTIC abstract
machine, we are all right.

Observable behavior and access to volatile objects are the only way how
one can decide if the behavior is valid (within set) or not.

The criteria of violation of the standard is not changing of
observable behavior from a confirming one to another, but(!) going out
of SET of possible executions of abstract machine.

That set of allowable behaviors only applies to unspecified behavior.
Standard is always explicit about each unspecified behavior. So tell us
what unspecified behavior you are talking about?

Mutex is synchronization object so compiler can not deduce if there will
be code for some other thread compiled or not (that may make misbehavior
of mutex observable. That makes it quite hard for a compiler to compile
it into something that deviates from what is required by standard.

Bart van Ingen Schenau · Apr 7, 2013

That set of allowable behaviors only applies to unspecified behavior.
Standard is always explicit about each unspecified behavior. So tell us
what unspecified behavior you are talking about?

My guess would be, the unspecified order in which modifications made to
objects become visible in other threads when no synchronization
primitives are used, and the unspecified order in which different threads
get to do their work.

Mutex is synchronization object so compiler can not deduce if there will
be code for some other thread compiled or not (that may make misbehavior
of mutex observable. That makes it quite hard for a compiler to compile
it into something that deviates from what is required by standard.

I don't understand what you try to say here.

Bart v Ingen Schenau

James Kanze · Apr 7, 2013

Why affecting observable behavior is problematic?

"An instance of the abstract machine can thus have more than one
possible execution for a given program and a given input."

1.9p3

Because the expression is a short-hand for "changing observable
behavior in a way that results in behavior which is not in the
set of possible behaviors, as defined by the standard." It's
certain that in an expression like f() + g(), where both f() and
g() contain observable behavior, the compiler can change the
order of these observable behaviors depending on the level of
optimization. On the other hand, it cannot interleave the
functions in a way that would introduce undefined behavior
(which is not present in the original code). It can only
interleave execution to the extent that is can prove that the
resulting behavior corresponds to a possible behavior without
the interleaving.

?? Tiib · Apr 7, 2013

My guess would be, the unspecified order in which modifications made to
objects become visible in other threads when no synchronization
primitives are used, and the unspecified order in which different threads
get to do their work.

Yes, but mutex is synchronization primitive and it is described how it helps
things to become visible in other threads in specified order.

I don't understand what you try to say here.

I meant that compiler may reorder mutexes or even optimize those out if
it can prove that such reordering or removal does not result with things becoming
observably desynchronized. I also meant that on common case (a case when mutex
synchronizes anything at all) such proof is hard to imagine.

Bart van Ingen Schenau · Apr 9, 2013

Yes, but mutex is synchronization primitive and it is described how it
helps things to become visible in other threads in specified order.

That is right, but the unspecified order in which thread do their work
means that even with synchronization by mutexes there are multiple
possible orders in which the observable side effects can occur.

I meant that compiler may reorder mutexes or even optimize those out if
it can prove that such reordering or removal does not result with things
becoming observably desynchronized. I also meant that on common case (a
case when mutex synchronizes anything at all) such proof is hard to
imagine.

I agree with that.

Bart v Ingen Schenau

Öö Tiib · Apr 9, 2013

That is right, but the unspecified order in which thread do their work
means that even with synchronization by mutexes there are multiple
possible orders in which the observable side effects can occur.

Yes, that mostly relies on fact that non-atomic side effect in one
thread is not observable in other thread without data race (undefined
behavior) unless the side effect "happens before" observing. The synchronization objects like mutexes are meant for to achieve that
"happens before" e.g. synchronization.

I understand that if such "happens before" is achieved then accessing
atomic objects or acquiring and releasing ownership of other mutexes
(not the one used for achieving "happens before" above) are also
observable side effects (there even can not be data races). That is
why I asked what unspecified behavior OP meant.

C++0x memory model and atomics, some questions	5	Sep 1, 2010
Subtle difference between C++0x MM and other MMs	25	Aug 24, 2008
Mutex entirely in ANSI-C possible, without I/O routines?	4	Nov 8, 2004
Mutex question / request	1	Jul 25, 2004
Synchronization Algorithm Verificator for C++0x	31	Aug 2, 2008
Locking objects in an array	25	May 5, 2009
Anything in the language to support better recognition of vector operations?	5	Nov 8, 2010
Suspected Memory Leak in Multithread queue implmenetation	2	Jan 15, 2007

What in C++11 prohibits mutex operations from being reordered?

Alexander Terekhov

?? Tiib

Alexander Terekhov

Bart van Ingen Schenau

Bart van Ingen Schenau

Alexander Terekhov

Michael Podolsky

Michael Podolsky

Michael Podolsky

Alexander Terekhov

?? Tiib

Michael Podolsky

Alexander Terekhov

Bart van Ingen Schenau

?? Tiib

Bart van Ingen Schenau

James Kanze

?? Tiib

Bart van Ingen Schenau

Öö Tiib

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads