parallelism: how can I ensure data is seen by another thread or consumer?


Markus

Hello everyone.

Recently I stumbled upon an interesting problem related to thread-parallel
programming in C (and similarly C++). As an example, assume a simple "buffer"
array of size 8, e.g. with static lifetime and external linkage. One thread
fills the buffer, and the other in some way evaluates its contents, e.g.

do {
    synchronize_with_other_thread();
    for (int i = 0; i < 8; ++i) {
        sum += buffer[i];
    }
} while (sum < THRESHOLD);
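
For concreteness, the surrounding declarations this snippet assumes might look
roughly like the following (the THRESHOLD value is made up, and whether
buffer[] needs a volatile qualifier is exactly the question below):

extern void synchronize_with_other_thread(void);
#define THRESHOLD 1000   /* hypothetical */
int buffer[8];           /* static lifetime, external linkage */
int sum = 0;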

From what I understand, the language standard (C99) assumes a single serial
instruction stream. To ensure that each loop iteration retrieves current values
from buffer[], one has to make it volatile, which prevents a lot of useful
optimizations by the compiler. Otherwise the compiler would be _allowed_ to
keep the whole contents of buffer[] in the register file and operate on that.

It is clear that in practice various ways exist to avoid "volatile" for
buffer[] and allow the compiler to optimize more aggressively. In particular,
one could do the evaluation (here the sum) in a non-inline function, as
sketched below. For data with external linkage, any library call inside
"synchronize_with_other_thread()" will be sufficient too, as the compiler
cannot be sure that buffer[] is not affected by it. But both solutions rely on
the inabilities of compiler and linker, and would not be guaranteed to work
with an "omniscient" compiler that is allowed to perform inter-procedural
optimizations.
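
As a sketch of that out-of-line evaluation workaround (the name
evaluate_buffer and the file split are hypothetical, and the declarations
from the snippet above are reused): because evaluate_buffer() is defined in
another translation unit, a conventional compiler must assume the call reads
the current contents of buffer[].

/* consumer.c */
extern int  buffer[8];
extern int  sum;
extern void synchronize_with_other_thread(void);
extern int  evaluate_buffer(const int *buf, int n);

void consume(void)
{
    do {
        synchronize_with_other_thread();
        sum += evaluate_buffer(buffer, 8);
    } while (sum < THRESHOLD);
}

/* evaluate.c */
int evaluate_buffer(const int *buf, int n)
{
    int s = 0;
    for (int i = 0; i < n; ++i)
        s += buf[i];
    return s;
}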

Clearly related to this, I could not find out the actual meaning of casting a
pointer to volatile data to a non-volatile pointer. Or, applied to the
example, is there really a difference between the version above and

do {
    synchronize_with_other_thread();
    int *buffer_nonvolatile = (int *) buffer;  // buffer being qualified volatile
    for (int i = 0; i < 8; ++i) {
        sum += buffer_nonvolatile[i];
    }
} while (sum < THRESHOLD);

The problem I see here is that only the contents of buffer[] are volatile, not
its address. The address held in buffer_nonvolatile is certainly invariant
across the while-iterations (since buffer is static), and the semantics seem
identical to simply not declaring buffer[] volatile at all.

I'd really be glad if someone could comment on that, either correcting or
confirming my assumptions.

Best regards,
Markus
 

MisterE

do {
    synchronize_with_other_thread();
    int *buffer_nonvolatile = (int *) buffer;  // buffer being qualified volatile
    for (int i = 0; i < 8; ++i) {
        sum += buffer_nonvolatile[i];
    }
} while (sum < THRESHOLD);


To do that you need buffer to be a volatile pointer, not a pointer to a
volatile integer. That way the compiler will assign buffer's value to
buffer_nonvolatile each time through the loop, so it *could* change value and
the compiler won't optimise it away, right?

Wrong. A smart compiler will figure out that it can remember the last value of
the non-volatile pointer, and if it is the same value this time, it will reuse
the same sum values instead of fetching them again. So you really need buffer
to be a volatile pointer to a volatile integer, which means your
buffer_nonvolatile pointer does have to point to volatile data after all... in
other words your code is wrong, and if it weren't it would be pointless
anyway. You should just make buffer_nonvolatile a pointer to volatile. But I
guess this depends on what your sync is doing.

You do know that:
volatile int *foo;
declares a pointer to a volatile integer, not a volatile pointer.
For a volatile pointer to a normal integer you would write: int * volatile foo;
and for a volatile pointer to a volatile integer: volatile int * volatile foo;
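
Side by side, with made-up distinct names so all three forms can appear in one
snippet:

volatile int *p;            /* pointer to volatile int          */
int *volatile q;            /* volatile pointer to (plain) int  */
volatile int *volatile r;   /* volatile pointer to volatile int */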
 

Markus

Thanks for your reply, MisterE.
do {
    synchronize_with_other_thread();
    int *buffer_nonvolatile = (int *) buffer;  // buffer being qualified volatile
    for (int i = 0; i < 8; ++i) {
        sum += buffer_nonvolatile[i];
    }
} while (sum < THRESHOLD);


To do that you need buffer to be a volatile pointer, not a pointer to a
volatile integer. That way the compiler will assign buffer's value to
buffer_nonvolatile each time through the loop, so it *could* change value and
the compiler won't optimise it away, right?

Wrong. A smart compiler will figure out that it can remember the last value of
the non-volatile pointer, and if it is the same value this time, it will reuse
the same sum values instead of fetching them again. So you really need buffer
to be a volatile pointer to a volatile integer, which means your
buffer_nonvolatile pointer does have to point to volatile data after all... in
other words your code is wrong, and if it weren't it would be pointless
anyway. You should just make buffer_nonvolatile a pointer to volatile. But I
guess this depends on what your sync is doing.
The problem I see here is that only the contents 
of buffer[] are volatile, not its address.


Sorry if my explanation was not clear enough, but I think you did not
contradict but basically confirmed what I assumed (although I was not really
sure about it): there is no way of ensuring that current data is read from the
buffer other than declaring and accessing the data as volatile. It is not
possible to temporarily have non-volatile access to otherwise volatile data.


Perhaps I did not make it clear enough that my idea was to get rid of the
volatile-ness in some compute intensive kernel. Assume, e.g., buffer[] contains
data for an image, and a complex filter--having enough potential for
optimization by the compiler--should be applied to it. If you write parallel
applications, preventing aggressive compiler optimizations by having everything
volatile is most probably not what you want.
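
To make this concrete, here is a hypothetical filter kernel over a
volatile-qualified image buffer (the filter and all names are made up).
Because every access to a volatile object must be performed as written, the
compiler cannot keep the overlapping reads in registers or vectorize the loop:

#define WIDTH 1024                  /* hypothetical image width */
extern volatile int image[WIDTH];   /* shared, volatile-qualified image data */

void filter(int *out)
{
    /* 3-tap smoothing filter: image[i-1], image[i] and image[i+1] must each
     * be re-read from memory on every iteration because of the volatile
     * qualifier, even though consecutive iterations overlap. */
    for (int i = 1; i < WIDTH - 1; ++i)
        out[i] = (image[i - 1] + 2 * image[i] + image[i + 1]) / 4;
}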

You do know that:
volatile int *foo;
declares a pointer to a volatile integer, not a volatile pointer.
For a volatile pointer to a normal integer you would write: int * volatile foo;
and for a volatile pointer to a volatile integer: volatile int * volatile foo;

I had not really thought about it, although I know this distinction from the
context of the "const" qualifier. But thank you for the clarification.

Thanks,
Markus
 

Ben Bacarisse

Markus said:
Sorry if my explanation was not clear enough, but I think you did not
contradict but basically confirmed what I assumed (although I was not
really sure about it): there is no way of ensuring that current data is
read from the buffer other than declaring and accessing the data as
volatile. It is not possible to temporarily have non-volatile access
to otherwise volatile data.

Perhaps I did not make it clear enough that my idea was to get rid
of the volatile-ness in some compute intensive kernel. Assume, e.g.,
buffer[] contains data for an image, and a complex filter--having
enough potential for optimization by the compiler--should be applied
to it. If you write parallel applications, preventing aggressive
compiler optimizations by having everything volatile is most
probably not what you want.

Yes, and it is very unlikely that you have to resort to that sort of
thing. The trouble is that you won't find out here. Standard C has
nothing to say about concurrency, and what it has to say about volatile
is not enough for you to know what is and is not safe.

The system you are programming for must provide thread primitives, and
its documentation (or a Usenet group about it) is the only place where
you will find out what is and is not guaranteed. There will, most
likely, be some simple synchronisation primitive that will allow the
producer to put a pointer to a new frame into some queue where it can
be consumed by the filter without any interference.
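
As an illustration only (the platform has not been named; POSIX threads is
just one common choice), such a hand-off might look roughly like the
following, where all the names are made up and a real implementation would
also have to initialise the mutex and condition variable and handle a full
queue:

#include <pthread.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
    int            *frames[16];   /* pointers to completed frames */
    int             head, tail, count;
} frame_queue;

/* called by the producer after it has finished writing a frame */
void queue_push(frame_queue *q, int *frame)
{
    pthread_mutex_lock(&q->lock);
    q->frames[q->tail] = frame;
    q->tail = (q->tail + 1) % 16;
    q->count++;
    pthread_cond_signal(&q->nonempty);
    pthread_mutex_unlock(&q->lock);
}

/* called by the filter thread; blocks until a frame is available */
int *queue_pop(frame_queue *q)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->nonempty, &q->lock);
    int *frame = q->frames[q->head];
    q->head = (q->head + 1) % 16;
    q->count--;
    pthread_mutex_unlock(&q->lock);
    return frame;
}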

If there is not, then you need to build one. Ask, say, in
comp.programming.threads about building a semaphore from whatever
atomic memory operations your system provides.
 

Markus

Ben Bacarisse said:
Perhaps I did not make it clear enough that my idea was to get rid
of the volatile-ness in some compute intensive kernel. Assume, e.g.,
buffer[] contains data for an image, and a complex filter--having
enough potential for optimization by the compiler--should be applied
to it. If you write parallel applications, preventing aggressive
compiler optimizations by having everything volatile is most
probably not what you want.
Yes, and it is very unlikely that you have to resort to that sort of
thing. The trouble is that you won't find out here. Standard C has
nothing to say about concurrency, and what it has to say about volatile
is not enough for you to know what is and is not safe.
The system you are programming for must provide thread primitives, and
its documentation (or a Usenet group about it) is the only place where
you will find out what is and is not guaranteed. There will, most
likely, be some simple synchronisation primitive that will allow the
producer to put a pointer to a new frame into some queue where it can
be consumed by the filter without any interference.

Even with a real queue (what my example resembled was a single-entry queue),
the underlying problem does not change, at least if you "re-use" your buffer
storage instead of allocating fresh memory every time. My point is that the
consumer thread WILL get references to the SAME memory address sooner or
later. If that does not point to volatile data, there is no reason for the
consumer to assume that the referenced data has changed.

In practice, all this is usually not a concern, as the compiler will not
generate code that checks for and exploits that (although it would be allowed
to). But from a theoretical point of view, the buffer storage needs to be
qualified volatile IMHO.
If there is not, then you need to build one. Ask, say, in
comp.programming.threads about building a semaphore from whatever
atomic memory operations your system provides.

Sorry, my question was actually not about threading and synchronization (even
though volatile variables *are* actually sufficient for the synchronization
necessary in my little example). By the way, I thought comp.lang.c was a
better place for my problem than comp.programming.threads, because my problem
is not related to practical aspects of thread programming and is also
encountered in non-threaded code.


Perhaps I can boil all my quite long emails down to the following question:

According to the language standard (not from a practical view that exploits
the weaknesses of compilation): is it necessary or not for each and every
object that is changed by one thread and read by another after proper
synchronization to be qualified volatile, if one wants to ensure that the
second thread also sees the new contents?
 

Ben Bacarisse

Markus said:
Perhaps I can boil all my quite long emails down to the following question:

According to the language standard (not from a practical view that
exploits the weaknesses of compilation): is it necessary or not for each
and every object that is changed by one thread and read by another after
proper synchronization to be qualified volatile, if one wants to ensure
that the second thread also sees the new contents?

We may be talking past each other. The C standard says nothing about
concurrency and very little about volatile. In practice the two
concepts are separate and rarely interact: volatile is not enough to
implement even the simplest concurrency control[1] and it is rarely
required by C extensions that provide concurrency.

Whatever extension you are using that provides the concurrency must do
so with some basic set of primitives. These are what you need to
use. Qualifying shared arrays as volatile is unlikely to be
required. I can't be more specific because I don't know what you are
using, so I can only make general remarks.

Finally, do consider posting in comp.programming.threads. There are
helpful people there and some real experts on everything from
concurrent systems design to the lowest-level memory barrier issues.
For one thing, nothing else can be said from the point of view of
standard C (the topic here).

[1] It *may* be enough, but that would be an accident of one
particular compiler/target machine combination. Standard C guarantees
that accesses won't be optimised away, but nothing more. If standard
C ever embraces concurrency it will have to provide some sort of
guarantees about the memory model but I'd bet the house that it won't
do that via tightening the meaning of volatile -- it will most likely
borrow the work done in the C++ committee.
 

Markus

After some discussion on comp.programming.threads, I finally found the reason
for my incomprehension: Pthreads is not just "some" library that merely
provides unified access to system calls and platform-specific assembly.

This is also what Ben Bacarisse said:
[...] Whatever extension you are using that provides the concurrency must do
so with some basic set of primitives. These are what you need to use. [...]

The synchronization primitives, like mutexes and so on, do not only provide
temporal synchronization, but also "inform" the compiler that data might have
been changed. C does not provide such a concept itself, so the compiler and
the pthread library must agree on how to give that hint.

Locking a mutex is therefore more than a usual function call; it has
additional semantics that a "normal" C function could not provide. This was
the point I just didn't know, and it took me so long to understand.
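
A minimal sketch of what that means in practice, assuming POSIX threads (the
names publish and consume are made up): the shared buffer needs no volatile
qualifier, because the mutex calls both order the two threads and tell the
compiler that memory may have changed across them.

#include <pthread.h>

static int buffer[8];       /* note: NOT volatile */
static int data_ready;      /* guarded by the same mutex */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;

/* producer: fill the buffer, then publish it under the lock */
void publish(const int *src)
{
    pthread_mutex_lock(&lock);
    for (int i = 0; i < 8; ++i)
        buffer[i] = src[i];
    data_ready = 1;
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&lock);
}

/* consumer: wait for fresh data, then evaluate it; across the lock/unlock
 * calls the compiler must assume buffer[] may have changed */
int consume(void)
{
    int sum = 0;
    pthread_mutex_lock(&lock);
    while (!data_ready)
        pthread_cond_wait(&cond, &lock);
    for (int i = 0; i < 8; ++i)
        sum += buffer[i];
    data_ready = 0;
    pthread_mutex_unlock(&lock);
    return sum;
}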


Best regards and thanks for your help,
Markus
 

lawrence.jones

Ben Bacarisse said:
The C standard says nothing about concurrency

"Nothing" is a bit too strong -- the C Standard does say *something*
about concurrency in the guise of signal handlers, but not very much.
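
(For example, the one thing the Standard does guarantee in that area is
roughly this: a flag of type volatile sig_atomic_t may be written by a signal
handler and read by the interrupted program.)

#include <signal.h>

static volatile sig_atomic_t got_signal;   /* polled by the main loop */

static void handler(int sig)
{
    (void)sig;         /* unused */
    got_signal = 1;    /* essentially the only write to static storage a
                          portable async handler is allowed to make */
}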
If standard C ever embraces concurrency it will have to provide some
sort of guarantees about the memory model but I'd bet the house that it
won't do that via tightening the meaning of volatile -- it will most
likely borrow the work done in the C++ committee.

Thread support is a hot topic for C1X, so it's likely that the C
Standard *will* embrace concurrency in the not too distant future. And
yes, we are borrowing heavily from the work being done in C++.
 

Nick Keighley

"Nothing" is a bit too strong -- the C Standard does say *something*
about concurrency in the guise of signal handlers, but not very much.

Does the stuff about sequence points allow some sort of concurrency? As in,
the order of execution isn't completely specified between sequence points.

I'm probably babbling hopelessly

<snip>
 
