I already posted this to comp.programming.threads, but the group over
there seems to be arguing from a purely theoretical standpoint rather
than addressing the reality of compiler optimizations and processor
architecture. I was hoping for a more straightforward answer from
this group.
The theoretical standpoint is the best one though, since it is the
only way to write portable code.
You should be asking in a Windows programming group, since the
question is about Windows synchronization.
I'm worried that the main thread will read the value of _bRunning in
the test() function, and then when it comes time to do the assert,
simply re-use the existing value it has already read, rather than
going back to main memory to get the new value which was set by the
other thread. Is this a valid concern?
Yes, I think so. I would have thought that SetEvent would use release
memory semantics, and WaitForSingleObject acquire ones, in which case
it is fine as is. However, Microsoft don't document this...
Does _bRunning need to be tagged as volatile?
I can't find enough information to say...
My understanding from school and reading is that it does, but that
supposition blows so many holes in the way most MT programs I've seen
are written that it's scary.
volatile is not needed as long as critical sections or mutexes are
used. However, you aren't using them in your example.
For example, here's an even simpler one:
class B
{
public:
    B() : m_pList(NULL) {}
    void add( int n )
    {
        CSingleLock lock( &crit, TRUE ); // named, so the lock is held to end of scope
        if( m_pList == NULL )
            m_pList = new std::list<int>();
        m_pList->push_back( n );
    }
    int size()
    {
        CSingleLock lock( &crit, TRUE );
        if( m_pList == NULL )
            return 0;
        return (int)m_pList->size();
    }
private:
    CCriticalSection crit;
    std::list<int> * m_pList;
};
In this example, does m_pList need to be declared as
std::list<int> * volatile m_pList
No, since the critical section ensures that each thread/processor
reloads the relevant members from main memory, and writes them out
when releasing the section.
Suppose I have a dozen threads all competing to add items to this
list, without volatile, is there a possibility that one thread will
have a "cached" value of m_pList which is out of date?
Nope, the semantics of critical sections prevent that.
If the threads only called add(), it seems there is no way for the
NULL value of m_pList to be cached in a register. But what if the
threads also called size() prior to calling add()? Would that
introduce a possibility for the pointer value to be stored in a
register?
Given the relatively small set of registers on Intel processors, how
big an issue is volatile on that platform? What effect does locality
of reference have? If I reference the same variable 30 C lines apart
from each other does that make a difference from accessing it 2 lines
apart? Do non-inlined function-calls change this behavior?
volatile isn't sufficient on a multiprocessor machine: it stops the
compiler from caching the value in a register, but it doesn't stop
the compiler or the processor from reordering accesses around it.
Memory barriers provide the necessary ordering, and they are issued
inside the various synchronization functions.
What are the general guidelines for using volatile?
Use it with the Interlocked* functions when you just want single
atomic values. Also, I suspect it is useful when writing kernel device
driver type code.
I'm no expert though - the lot over in comp.programming.threads are,
or you might try a Windows group.
Tom