What in C++11 prohibits mutex operations from being reordered?

Michael Podolsky · Apr 4, 2013

Probably your implementation is not correct. I think you should pass
std::memory_order_acq_rel to exchange() (which is a read-modify-write
operation) instead of just std::memory_order_acquire.

Yes, I see. But now I can cite the C++11 standard itself (I have just
found the exact place):

Note: For example, a call that acquires a mutex will perform
an acquire operation on the locations comprising the mutex.
Correspondingly, a call that releases the same mutex will
perform a release operation on those same locations.
As you see, there is no mention of std::memory_order_acq_rel.

Also, a simple
std::atomic_flag should be more appropriate to implement a "spinlock".

Hmmm... more appropriate? Why?

Regards, Michael

Michael Podolsky · Apr 4, 2013

A correction again:

Note: For example, a call that acquires a mutex will perform
an acquire operation on the locations comprising the mutex.
Correspondingly, a call that releases the same mutex will
perform a release operation on those same locations.

that was C++11 1.10p5

Luca Risolia · Apr 4, 2013

Hmmm... more appropriate? Why?

It's faster

Michael Podolsky · Apr 4, 2013

It's faster

You are kidding!

Luca Risolia · Apr 4, 2013

You are kidding!

No. To be more precise: std::atomic_flag is the only atomic type
required to be lock-free.
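
For readers following along, here is a minimal sketch of the kind of std::atomic_flag spinlock being suggested. The class name SpinLock and the helper count_with_spinlock are assumptions made for illustration and do not come from the thread. The lock uses acquire ordering and the unlock uses release ordering, matching the note quoted from 1.10p5.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical spinlock built on std::atomic_flag (names are illustrative).
class SpinLock {
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
public:
    void lock() {
        // test_and_set is a read-modify-write; acquire ordering is what
        // the quoted note describes for a call that acquires a mutex.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
    }
    void unlock() {
        // Release ordering for the call that releases the lock.
        flag_.clear(std::memory_order_release);
    }
};

// Increments a plain int from several threads, each increment under the lock.
int count_with_spinlock(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    SpinLock lock;
    int counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Calling count_with_spinlock(4, 10000) from main() should return the full 40000 if the lock provides mutual exclusion. Whether this is actually faster than std::atomic<bool> depends on the platform.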

Öö Tiib · Apr 4, 2013

Yep, this is very basic stuff as far as memory ordering in C++ goes. Sorry,
I do not see its relation to the problem under discussion.

So you find the above very basic: the relaxed ready.store can't become
visible in thread 2 before data=42. Yet for some reason you see no relation
to the mutex2 locking, which you somehow feel may happen and become visible
in thread 2 before data=42, like this:

void thread_1()
{
    // these lines can't be reordered in any way
    data=42;
    std::atomic_thread_fence(std::memory_order_release);
    mutex2.lock();
    // ...
    // ...
}

I have difficulty understanding why you think a relaxed atomic bool
access can be ordered better than a mutex lock.

Michael Podolsky · Apr 4, 2013

No. To be more precise: std::atomic_flag is the only atomic type
required to be lock-free.

This makes sense, I should agree.

Regards, Michael

Michael Podolsky · Apr 4, 2013

On Apr 4, Öö Tiib said:
So you find the above very basic: the relaxed ready.store can't become
visible in thread 2 before data=42. Yet for some reason you see no relation
to the mutex2 locking, which you somehow feel may happen and become visible
in thread 2 before data=42, like this:

void thread_1()
{
    // these lines can't be reordered in any way
    data=42;
    std::atomic_thread_fence(std::memory_order_release);
    mutex2.lock();
    // ...
    // ...
}

I have difficulty understanding why you think a relaxed atomic bool
access can be ordered better than a mutex lock.

I now see that you made an argument and intended to demonstrate with
your code that the reordering I am talking about can not happen.

I agree that "Relaxed ready.store can't become visible in thread 2
before data=42". I.e. I agree that in the code

1: data=42;
2: std::atomic_thread_fence(std::memory_order_release);
3: ready.store(true,std::memory_order_relaxed);

lines 1 and 3 must not be reordered by the compiler.
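
The three lines above can be turned into a small runnable sketch. The reader side (thread_2 with an acquire fence) and the driver run_fence_demo are assumptions added for illustration; only the writer side appears in the posts. With the release fence in the writer and the acquire fence in the reader, a reader that sees ready == true is also guaranteed to see data == 42.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int data = 0;                        // plain, non-atomic payload
std::atomic<bool> ready{false};

void thread_1() {
    data = 42;                                            // line 1
    std::atomic_thread_fence(std::memory_order_release);  // line 2
    ready.store(true, std::memory_order_relaxed);         // line 3
}

// Assumed reader: spins on the relaxed flag, then issues an acquire fence
// so that the write to data is visible before it is read.
int thread_2() {
    while (!ready.load(std::memory_order_relaxed)) {
        std::this_thread::yield();
    }
    std::atomic_thread_fence(std::memory_order_acquire);
    return data;
}

// Runs one writer and one reader and returns what the reader saw.
int run_fence_demo() {
    data = 0;
    ready.store(false);
    int seen = 0;
    std::thread reader([&] { seen = thread_2(); });
    std::thread writer(thread_1);
    writer.join();
    reader.join();
    return seen;
}
```

Calling run_fence_demo() should return 42. Note that the fence orders line 1 before line 3; it says nothing about what may happen to operations that come after line 3.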

And I do agree that in the code

1: data=42;
2: std::atomic_thread_fence(std::memory_order_release);
3: mutex2.lock();

lines 1 and 3 must not be reordered by the compiler.

But if we are talking about implementing mutexes, and if we decide
to use fences (as you are doing), our code will rather look like:

1: data=42;
2: std::atomic_thread_fence(std::memory_order_release);
3: mark mutex1 as unlocked
4: mutex2.lock();

Here line 1 is the last line of the code section protected by mutex1,
lines 2 and 3 are a pseudo-implementation of mutex1.unlock(), and
line 4 is a lock on mutex2.

What line 2 does is prevent data=42 from sinking out of the critical
section, so it is guaranteed that data=42 is done before mutex1 is
marked as unlocked (the latter may be relaxed, yet must be an atomic
operation).

So in the end, lines 3 and 4 may still be reordered; the fence on line 2
does not prohibit it. And that is equivalent to locking mutex2 before
unlocking mutex1.
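
To make the four numbered lines concrete, here is a minimal sketch of a fence-based lock. The class FenceLock, the variable names and the driver run_two_lock_demo are assumptions for illustration, not code from the thread. The comments mark where lines 2, 3 and 4 end up: the relaxed store that marks mutex1 as unlocked and the relaxed exchange inside mutex2.lock() touch different atomic objects, and the release fence sits before both of them.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical lock implemented with fences plus relaxed atomics.
class FenceLock {
    std::atomic<bool> locked_{false};
public:
    void lock() {
        // Relaxed read-modify-write; the acquire fence follows it.
        while (locked_.exchange(true, std::memory_order_relaxed)) {
            std::this_thread::yield();
        }
        std::atomic_thread_fence(std::memory_order_acquire);
    }
    void unlock() {
        std::atomic_thread_fence(std::memory_order_release);  // "line 2"
        locked_.store(false, std::memory_order_relaxed);      // "line 3"
    }
};

FenceLock mutex1, mutex2;
int guarded_by_1 = 0;   // protected by mutex1
int guarded_by_2 = 0;   // protected by mutex2

void worker() {
    mutex1.lock();
    ++guarded_by_1;     // "line 1": last statement of the mutex1 section
    mutex1.unlock();    // "lines 2 and 3"
    mutex2.lock();      // "line 4": nothing above orders it after line 3
    ++guarded_by_2;
    mutex2.unlock();
}

// Two threads run worker() repeatedly; returns the sum of both counters.
int run_two_lock_demo(int rounds) {
    guarded_by_1 = 0;
    guarded_by_2 = 0;
    auto loop = [rounds] { for (int i = 0; i < rounds; ++i) worker(); };
    std::thread a(loop), b(loop);
    a.join();
    b.join();
    return guarded_by_1 + guarded_by_2;
}
```

The sketch only demonstrates that such a lock provides mutual exclusion (run_two_lock_demo(5000) should return 20000). It cannot show whether a given compiler or CPU actually reorders lines 3 and 4; that is the open question of the thread.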

Regards, Michael

Michael Podolsky · Apr 4, 2013

On 04/04/2013 03:49, Michael Podolsky wrote:

No. Merely considering a while loop does not prevent any statement
reordering.

I meant, it is a 'while' loop, potentially infinite, which reads an
atomic variable on each iteration. This is why the unlock() operation
will never be reordered below this statement. Just my feeling, not a C++
rule.

I.e. we could reorder across one atomic operation, maybe across
two. ))) But not across an infinite number of atomic operations, even if
they are relaxed.

Regards, Michael

Alexander Terekhov · Apr 4, 2013

Öö Tiib wrote:
[...]

Oh. I did not understand that you meant 'pthread_mutex_trylock()'
by 'trylock()'. m.try_lock() is perhaps *not* meant to be implemented in
terms of m.try_lock_until(yesterday). I do not see that C++ thread
support library is so tightly bound to POSIX pthread design.

I don't think it's a good idea to contradict POSIX... it seems that

int main() {
    // C++ mutex that can fail spuriously for try/timed ops
    mutex m;
    return m.try_lock(); // or m.try_lock_until(yesterday);
}

may behave differently than

int main() {
    // POSIX mutex
    mutex m;
    return m.trylock(); // or m.timedlock(yesterday);
}

(under POSIX rules the program may not fail to acquire the mutex
provided that ctor/try{timed}lock() won't throw reporting
out-of-resources condition)

[...]

Ah, whatever the exact optimization may be, it is not true anyway
that 'a.unlock()' "sequenced before" 'b.lock()' may be observably
reordered. a.try_lock() may spuriously fail, so it is a bad observation tool,
and there is no a.trylock() in C++, so how do you observe that reordering?

By using an implementation that provides the extra guarantee that neither
try_lock() nor the timed versions can fail spuriously (basically providing
POSIX rules, which the C++ standard allows).

regards,
alexander.

Alexander Terekhov · Apr 4, 2013

Michael Podolsky wrote:
[...]

deadlock. Do you mean something in the following style:

m1.lock();
// m1 critical section
bool succeeded = m2.try_lock();
m1.unlock();

if(!succeeded)
    m2.lock();
// m2 critical section
m2.unlock();

m1.lock();
bool blocked = m2.lock_but_if_blocked_unlock_another(m1);
if (!blocked) m1.unlock();

would also do it (in effect allowing deadlock-free reordering on
compiler level).

regards,
alexander.

Jeff Flinn · Apr 4, 2013

SC semantics

What are SC semantics?

Thanks, Jeff

Michael Podolsky · Apr 4, 2013

What are SC semantics?

Thanks, Jeff

SC - Sequentially Consistent. More precisely it's SC DRF -
sequentially consistent data race free.

Michael Podolsky · Apr 4, 2013

Michael Podolsky wrote:

[...]

deadlock. Do you mean something in the following style:

m1.lock();
// m1 critical section
bool succeeded = m2.try_lock();
m1.unlock();

if(!succeeded)
    m2.lock();
// m2 critical section
m2.unlock();

m1.lock();
bool blocked = m2.lock_but_if_blocked_unlock_another(m1);
if (!blocked) m1.unlock();

would also do it (in effect allowing deadlock-free reordering on
compiler level).

regards,
alexander.

In these three lines I now see no option to reorder them at the compiler
level. Or do you mean reordering inside the body of the
lock_but_if_blocked_unlock_another function?

Regards, Michael

Alexander Terekhov · Apr 4, 2013

Michael Podolsky wrote:
[...]

In these three lines I now see no option to reorder them at the compiler
level. Or do you mean reordering inside the body of the
lock_but_if_blocked_unlock_another function?

I mean the transformation of

m1.lock();
m1.unlock();
m2.lock();

to

m1.lock();
bool blocked = m2.lock_but_if_blocked_unlock_another(m1);
if (!blocked) m1.unlock();

in effect reordering original to

m1.lock();
m2.lock();
m1.unlock();

but only on the fast path where deadlock can't occur.
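
For illustration only, here is one way such a helper might be written on top of std::mutex. The post names lock_but_if_blocked_unlock_another as a member function without giving a body; the free-function form, the parameter names and the demo function below are assumptions. The fast path keeps both mutexes held (the "reordered" order), while the slow path releases the held mutex before blocking, which restores the original order.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical helper: tries to take `wanted` while `held` is still locked.
// Returns true if it had to release `held` in order to block on `wanted`.
bool lock_but_if_blocked_unlock_another(std::mutex& wanted, std::mutex& held) {
    if (wanted.try_lock()) {
        return false;   // fast path: both mutexes are now held
    }
    held.unlock();      // slow path: give up `held` first, so no deadlock
    wanted.lock();
    return true;
}

// Mirrors the three lines from the post, then releases m2.
bool demo(std::mutex& m1, std::mutex& m2) {
    m1.lock();
    bool blocked = lock_but_if_blocked_unlock_another(m2, m1);
    if (!blocked) m1.unlock();
    // m2 is held here in either case
    m2.unlock();
    return blocked;
}
```

Because std::mutex::try_lock() is allowed to fail spuriously, the sketch may take the slow path even when m2 is free; that affects only which path runs, not correctness.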

regards,
alexander.

James Kanze · Apr 4, 2013

On Apr 3, 7:17 am, Öö Tiib wrote:

[...]
Yeah, sorry for being unclear. I do not get which exact part of the
puzzle you are missing. If it was not "sequenced before", then
maybe "synchronizes with" or "happens before"? Mutex access can
not be reordered.

Mutex access can be reordered under the exact same conditions as
anything else. It is *not* part of the observable behavior in
itself.

On the other hand, I've never heard of a compiler capable of
proving much of anything involving mutexes, and especially not
that changing their order doesn't affect observable behavior
(especially as it usually does).

Bart van Ingen Schenau · Apr 4, 2013

Hi Everyone,

My question is about the memory model in the context of C++11 standard.

2. I then should ask the question: which part of the standard prevents
lock() and unlock() operations on DIFFERENT mutexes from being reordered, i.e.
what prevents the following sequence:

m1.lock();
m1.unlock();
m2.lock();
m2.unlock();

to be compiled into:

m1.lock();
m2.lock();
m1.unlock();
m2.unlock();

which obviously must not be allowed because of potential deadlocks.

That would be 1.9/1.
The lock and unlock MAY be reordered, but only if the compiler can prove
that the execution of the reordered code yields the same observable
behavior as the non-reordered code would yield in the (concurrent)
abstract machine.
If the reordering introduces a deadlock that is not present in the
original program, then the observable behavior would change (non-
termination versus termination of the program).

If the mutex is implemented using volatile objects, then the reordering
is prohibited on the grounds that it would affect the order in which
volatile objects are accessed.

Regards,
Michael

Bart v Ingen Schenau

Michael Podolsky · Apr 5, 2013

Michael Podolsky wrote:

[...]

In these three lines I now see no option to reorder them at the compiler
level. Or do you mean reordering inside the body of the
lock_but_if_blocked_unlock_another function?

I mean the transformation of

m1.lock();
m1.unlock();
m2.lock();

to

m1.lock();
bool blocked = m2.lock_but_if_blocked_unlock_another(m1);
if (!blocked) m1.unlock();

in effect reordering original to

m1.lock();
m2.lock();
m1.unlock();

but only on the fast path where deadlock can't occur.

Maybe I am not completely following your idea. Is there any important or
significant difference between what you are proposing to do with
lock_but_if_blocked_unlock_another and my code, which looked
like:

m1.lock();
// m1 critical section
bool succeeded = m2.try_lock();
m1.unlock();
if(!succeeded)
    m2.lock();
// m2 critical section
m2.unlock();

Regards, Michael

Michael Podolsky · Apr 5, 2013

On Apr 4, 1:53 pm, Bart van Ingen Schenau wrote:

That would be 1.9/1.

It would be, if you could prove that, according to the standard, the C++11
abstract machine cannot deadlock on, say, two threads executing

thread 1:
m1.lock();
m1.unlock();
m2.lock();
m2.unlock();

thread 2:
m2.lock();
m2.unlock();
m1.lock();
m1.unlock();

Yes, I know, if this code is executed "sequentially" it cannot
deadlock, but I mean the C++11 abstract machine.
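
The two-thread program above can be written out as a small runnable sketch. The driver run_rounds and its return value are assumptions added for illustration. On the mainstream implementations I am aware of this finishes every time, but a run that terminates on one implementation proves nothing about what the abstract machine permits, which is exactly the question being asked.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

std::mutex m1, m2;

void thread_1() {
    m1.lock();
    m1.unlock();
    m2.lock();
    m2.unlock();
}

void thread_2() {
    m2.lock();
    m2.unlock();
    m1.lock();
    m1.unlock();
}

// Runs both threads `rounds` times; returns how many rounds completed.
// If the lock/unlock pairs were reordered into overlapping critical
// sections, a round could deadlock and this function would never return.
int run_rounds(int rounds) {
    int completed = 0;
    for (int i = 0; i < rounds; ++i) {
        std::thread a(thread_1);
        std::thread b(thread_2);
        a.join();
        b.join();
        ++completed;
    }
    return completed;
}
```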

The lock and unlock MAY be reordered, but only if the compiler can prove
that the execution of the reordered code yields the same observable
behavior as the non-reordered code would yield in the (concurrent)
abstract machine.

I am not completely sure you are right here in saying that non-reordered
code execution is what defines the execution of the abstract machine. Here
is an example of a program which cannot deadlock if executed in an "ordered"
way, but might deadlock if executed with reordering. And I suppose you
should agree that BOTH executions are what the standard allows, i.e.
they should both conform to the abstract machine.

thread 1:
// atomic1 and atomic2 are initially 0 (zero)
atomic1.store(1, memory_order_relaxed);
atomic2.store(2, memory_order_relaxed);

thread 2:
int a2 = atomic2.load(memory_order_seq_cst);
int a1 = atomic1.load(memory_order_seq_cst);
if(a2==2 && a1==0)
    GO_AND_DEADLOCK_OUR_PROGRAM(); // never returns,
                                   // blocks all threads

))

thread 2 will execute GO_AND_DEADLOCK_OUR_PROGRAM() only if the assignments
to atomic1 and atomic2 in thread 1 were reordered by the compiler (hardware
reordering aside). And such an execution is allowed by the standard!
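
A rough litmus-test sketch of this example follows. It counts how often the outcome a2==2 && a1==0 is observed instead of deadlocking; the Outcome struct and run_litmus are assumptions for illustration. The count depends entirely on the compiler and hardware and may well be zero on a given machine, which does not mean the outcome is forbidden.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct Outcome {
    int reordered;  // rounds in which a2 == 2 && a1 == 0 was observed
    int total;      // rounds executed
};

Outcome run_litmus(int rounds) {
    Outcome out{0, 0};
    for (int i = 0; i < rounds; ++i) {
        std::atomic<int> atomic1{0}, atomic2{0};
        int a1 = -1, a2 = -1;
        std::thread t1([&] {
            // Relaxed stores: no ordering is imposed between them.
            atomic1.store(1, std::memory_order_relaxed);
            atomic2.store(2, std::memory_order_relaxed);
        });
        std::thread t2([&] {
            a2 = atomic2.load(std::memory_order_seq_cst);
            a1 = atomic1.load(std::memory_order_seq_cst);
        });
        t1.join();
        t2.join();
        // Each load sees either the initial value or the stored one.
        assert((a1 == 0 || a1 == 1) && (a2 == 0 || a2 == 2));
        if (a2 == 2 && a1 == 0) ++out.reordered;
        ++out.total;
    }
    return out;
}
```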

If the reordering introduces a deadlock that is not present in the
original program, then the observable behavior would change (non-
termination versus termination of the program).

That is what you see in my last example. Ordered execution does not
deadlock. Reordered execution deadlocks. Both executions are allowed
as observable behavior of the abstract machine.

Now if you agree with this, then probably your argument is not enough
to explain why mutex operations cannot be reordered.

If the mutex is implemented using volatile objects, then the reordering
is prohibited on the grounds that it would affect the order in which
volatile objects are accessed.

Yes. And no, as hardware may still reorder accesses to volatiles. But
I am not talking about volatiles here. And mutexes are not supposed to
be based on volatile operations.

Regards, Michael

Michael Podolsky · Apr 5, 2013

On the other hand, I've never heard of a compiler capable of
proving much of anything involving mutexes, and especially not
that changing their order doesn't affect observable behavior
(especially as it usually does).

Why is affecting observable behavior problematic?

"An instance of the abstract machine can thus have more than one
possible execution for a given program and a given input."

1.9p3

Regards, Michael
