question re. usage of "static" within static member functions of aclass

Joshua Maurice · Sep 9, 2009

This isn't any more thread-safe than any of the previous versions. In C
++0x, the static version will be thread-safe, but this version still
won't be.

REH

Really? This is pretty thread-safe. Specifically, there will be
exactly 1 object creation of type "Data" for the entire program
(assuming the program doesn't die during static init). That is, there
will be exactly 1 object creation as long as there are no concurrent
calls to Data::instance before main if this is in the same static lib
as main, or as long as there are no concurrent calls to Data::instance
in the static init of the dll. These are generally safe assumptions as
non-trivial threads are usually not made in static init, but it's
still good to note such assumptions / requirements explicitly.

So, exactly what isn't thread-safe about this approach?

Also, what about C++0x will make this thread-safe? C++0x guarantees
that the initializer of a function local static is called exactly once

ourInstance = new Data ;

is not an initializer of a function local static, so the new rules of C
++0x play no part in guaranteeing its safety.

REH · Sep 9, 2009

Really? This is pretty thread-safe. Specifically, there will be
exactly 1 object creation of type "Data" for the entire program
(assuming the program doesn't die during static init). That is, there
will be exactly 1 object creation as long as there are no concurrent
calls to Data::instance before main if this is in the same static lib
as main, or as long as there are no concurrent calls to Data::instance
in the static init of the dll. These are generally safe assumptions as
non-trivial threads are usually not made in static init, but it's
still good to note such assumptions / requirements explicitly.

I didn't notice where James was initializing the static object with
instance(). I read it wrong. Although, I think since now the static
object is accessed outside the function, there is potential
initialization ordering issues if another static object utilizes
ourInstance.

So, exactly what isn't thread-safe about this approach?

Also, what about C++0x will make this thread-safe? C++0x guarantees
that the initializer of a function local static is called exactly once
on the first pass, even when called concurrently. In this case, the
relevant portion we care about:> ourInstance = new Data ;

is not an initializer of a function local static, so the new rules of C
++0x play no part in guaranteeing its safety.

Your arguing the wrong code. I said the example given by the first
poster would be thread-safe in C++0x.

REH

Chris M. Thomasson · Sep 10, 2009

James Kanze said:
news:[email protected]...

Click to expand...

During a code review, I found the following lines of code:

Click to expand...

[...]

Click to expand...

The "instance" method was implemented as follows:
Data* Data::instance()
{
static Data* model = new Data();
return model;
}
I have never come across a situation where a pointer was set
to static in such a case. Is this valid?

Click to expand...

Click to expand...

It's a singleton.

Click to expand...

And to answer the question, it's perfectly valid. A pointer is
an object, just like any other variable, and obeys the same
rules as other variables.

The storage that `model' points to will never be destroyed,
also it's not thread-safe.

Click to expand...

Not being destroyed is presumably the reason the code is written
this way. Most of the time, you don't want a singleton to be
destructed. In other word, it's a feature designed to avoid
pitfalls. As for thread-safety, it depends on the
implementation, it is thread safe---modulo bugs---in g++. (But
you're probably right for most implementations.)

You can get around static initialization and destruction ordering issues by
using a strongly thread-safe smart pointer to manage the singleton. The
pseudo-code would be something like the following pseudo-code:
_____________________________________________________________________
template<typename T>
static local_ptr<T>
once()
{
static global_ptr<T> g_instance;

local_ptr<T> lptr(g_instance);

if (! lptr)
{
static pthread_mutex_t g_mutex = PTHREAD_MUTEX_INITIALIZER;

pthread_mutex_lock(&g_mutex);

lptr = g_instance;

if (! lptr)
{
try
{
lptr.reset(new T());
}

catch (...)
{
pthread_mutex_unlock(&g_mutex);

throw;
}

g_instance = lptr;
}

pthread_mutex_unlock(&g_mutex);
}

return lptr;
}
_____________________________________________________________________

This is strongly thread-safe and will always work no matter how the static
ctor/dtor ordering comes out. The caveat, well, it's definitely not as
efficient as using a raw pointer and explicitly leaking the singleton.

Joshua Maurice · Sep 10, 2009

During a code review, I found the following lines of code:
[...]
The "instance" method was implemented as follows:
Data* Data::instance()
{
static Data* model = new Data();
return model;
}
I have never come across a situation where a pointer was set
to static in such a case. Is this valid?
It's a singleton.

Click to expand...

Click to expand...

And to answer the question, it's perfectly valid. A pointer is
an object, just like any other variable, and obeys the same
rules as other variables.

Click to expand...

Not being destroyed is presumably the reason the code is written
this way. Most of the time, you don't want a singleton to be
destructed. In other word, it's a feature designed to avoid
pitfalls. As for thread-safety, it depends on the
implementation, it is thread safe---modulo bugs---in g++. (But
you're probably right for most implementations.)

Click to expand...

You can get around static initialization and destruction ordering issues by
using a strongly thread-safe smart pointer to manage the singleton. The
pseudo-code would be something like the following pseudo-code:
_____________________________________________________________________
template<typename T>
static local_ptr<T>
once()
{
static global_ptr<T> g_instance;

local_ptr<T> lptr(g_instance);

if (! lptr)
{
static pthread_mutex_t g_mutex = PTHREAD_MUTEX_INITIALIZER;

pthread_mutex_lock(&g_mutex);

lptr = g_instance;

if (! lptr)
{
try
{
lptr.reset(new T());
}

catch (...)
{
pthread_mutex_unlock(&g_mutex);

throw;
}

g_instance = lptr;
}

pthread_mutex_unlock(&g_mutex);
}

return lptr;}

_____________________________________________________________________

This is strongly thread-safe and will always work no matter how the static
ctor/dtor ordering comes out. The caveat, well, it's definitely not as
efficient as using a raw pointer and explicitly leaking the singleton.

There are so many things wrong with that code sample, I don't even
know where to start. (Exaggeration. I do know where to start.)

Firstly and most importantly, you're using double checked locking,
which is broken in effectively all C++ implementations. Don't do that.
Please read, continue to re-read if you don't get it, this excellent
paper:
http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf

Next, you also have a race condition on the construction of the mutex
itself, in addition to the double checked locking of the singleton
instance construction. PTHREAD_MUTEX_INITIALIZER is entirely
equivalent to calling pthread_mutex_init, and thus is not thread-safe.
Repeat:

lptr.reset(new T());

is not properly guarded (double checked locking) and

static pthread_mutex_t g_mutex = PTHREAD_MUTEX_INITIALIZER;

is not properly guarded (simple race condition). Both are race
conditions when the first call of this function may be concurrent.

Also, users may want multiple singletons of the same type. One unit of
code makes a singleton of type T, and another piece of code entirely
separate will make another singleton of type T. Your code does not
support that. Instead, you will get exactly 1 instance of T across the
entire program. This is a significant limitation. You could
potentially get it around it with something like:
//header code
template <typename singleton_t, typename once_type>
singleton_t & getSingleton();

//code using the header to make a singeton in a hpp
class singleton_x_type { /* ... */ };
class once_type_for_singleton_X {};
singleton_x_type & getSingletonX()
{
return getSingleton<singleton_x_type, once_type_for_singleton_X>
();
}
If another piece of code wanted their own singleton of type X, then
they could define their own once_type and instantiate getSingleton on
that new once_type.

Lastly, RAII is your friend. Don't ever explicitly call
pthread_mutex_lock or pthread_mutex_unlock except in your portability
layer mutex wrapper class. This is not a correctness issue per se, but
vastly improves the chances that the code will be correct.

Your solution could be made correct by eager initializing it by adding
namespace { bool force_init =

nce<your_type>(), true); }
However, at that point, your fixed code is pretty much equivalent to
several examples already posted, aka:
T& getSingleton()
{ static T* x = new T;
return *x;
}
namespace { bool force_init = (getSingleton(), true); }

Chris M. Thomasson · Sep 10, 2009

You can get around static initialization and destruction ordering issues
by
using a strongly thread-safe smart pointer to manage the singleton. The
pseudo-code would be something like the following pseudo-code:
_____________________________________________________________________ [...]
_____________________________________________________________________

This is strongly thread-safe and will always work no matter how the
static
ctor/dtor ordering comes out. The caveat, well, it's definitely not as
efficient as using a raw pointer and explicitly leaking the singleton.

Click to expand...

There are so many things wrong with that code sample, I don't even
know where to start. (Exaggeration. I do know where to start.)

Actually, well... How much experience do you have wrt multi-threading
issues?

Firstly and most importantly, you're using double checked locking,
which is broken in effectively all C++ implementations.

I explicitly stated that one can:

"get around static initialization and destruction ordering issues by using a
strongly thread-safe smart pointer to manage the singleton"

Do you have any idea what I am writing about here? Go ahead and put it on
the self because this discussion can get advanced rather quickly!

Don't do that.

;^)

Na. Anyway, double-checked locking IS 100% workable, period. You just don't
know it yet. That is NOT my problem. Oh well, shi% happens.

Please read, continue to re-read if you don't get it, this excellent
paper:
http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf

What about it? Anyway, that paper is WELL known to me. Been there, done
that. Yawn.

Next, you also have a race condition on the construction of the mutex
itself,

;^/

Learn about POSIX:

http://www.opengroup.org/onlinepubs/7990989775/xsh/pthread_mutex_init.html

in addition to the double checked locking of the singleton
instance construction.

This is PERFECTLY fine. You just need to learn how to do it properly.

PTHREAD_MUTEX_INITIALIZER is entirely
equivalent to calling pthread_mutex_init, and thus is not thread-safe.

:^O

Learn about POSIX:

http://www.opengroup.org/onlinepubs/7990989775/xsh/pthread_mutex_init.html

Repeat:
is not properly guarded (double checked locking) and
Yawn.

is not properly guarded (simple race condition). Both are race
conditions when the first call of this function may be concurrent.

I am getting tired.

Also, users may want multiple singletons of the same type. One unit of
code makes a singleton of type T, and another piece of code entirely
separate will make another singleton of type T. Your code does not
support that. Instead, you will get exactly 1 instance of T across the
entire program. This is a significant limitation.

FINALLY! You actually make sense; congratulations!

:^D

You could
potentially get it around it with something like:
//header code
template <typename singleton_t, typename once_type>
singleton_t & getSingleton();

//code using the header to make a singeton in a hpp
class singleton_x_type { /* ... */ };
class once_type_for_singleton_X {};
singleton_x_type & getSingletonX()
{
return getSingleton<singleton_x_type, once_type_for_singleton_X>
();
}
If another piece of code wanted their own singleton of type X, then
they could define their own once_type and instantiate getSingleton on
that new once_type.
Sure.

Lastly, RAII is your friend. Don't ever explicitly call
pthread_mutex_lock or pthread_mutex_unlock except in your portability
layer mutex wrapper class.

Sure. Although, using try/catch is 100% perfectly fine. Whatever, this was
pseudo-code; lol. Anyway, here is example of RAII:

http://groups.google.com/group/comp.lang.c++/msg/63c3aecb9d0fbaa6

Blah, blah.

This is not a correctness issue per se, but
vastly improves the chances that the code will be correct.

Yes; RAII is very convenient.

Your solution could be made correct by eager initializing it by adding
namespace { bool force_init = nce<your_type>(), true); }

My pseudo-code IS correct. IMVHO, you obviously need some education, that's
all; there is absolutely NOTHING to be ashamed about. I personally LOVE to
earn __all__ about new things!!!

However, at that point, your fixed code is pretty much equivalent to
several examples already posted, aka:
T& getSingleton()
{ static T* x = new T;
return *x;
}
namespace { bool force_init = (getSingleton(), true); }

NO!

;^/

Chris M. Thomasson · Sep 10, 2009

Chris M. Thomasson said:
You can get around static initialization and destruction ordering
issues
by
using a strongly thread-safe smart pointer to manage the singleton. The
pseudo-code would be something like the following pseudo-code:
_____________________________________________________________________ [...]
_____________________________________________________________________

This is strongly thread-safe and will always work no matter how the
static
ctor/dtor ordering comes out. The caveat, well, it's definitely not as
efficient as using a raw pointer and explicitly leaking the singleton.

Click to expand...

Click to expand...

There are so many things wrong with that code sample, I don't even
know where to start. (Exaggeration. I do know where to start.)

Click to expand...

Actually, well... How much experience do you have wrt multi-threading
issues?

Firstly and most importantly, you're using double checked locking,
which is broken in effectively all C++ implementations.

Click to expand...

I explicitly stated that one can:

"get around static initialization and destruction ordering issues by using
a strongly thread-safe smart pointer to manage the singleton"

Do you have any idea what I am writing about here? Go ahead and put it on
the self because this discussion can get advanced rather quickly!

Ummm, that last sentence should read as:

Go ahead and put it on the SHELF because this discussion can get advanced
rather quickly!

[...]

Sorry about that.

Chris M. Thomasson · Sep 10, 2009

Sorry for being so darn harsh!

;^(....

James Kanze · Sep 10, 2009

On Sep 8, 11:39 pm, James Kanze <[email protected]> wrote:

[...]

Well, to be precise, the last version above isn't even
applicable, because the ctor is private. I've already
straightened all the points above, partly by myself, mostly
with Paavo, as you can see from all the posts between the one
you are quoting from me and the one I'm replying to. I suppose
you didn't receive them. Freakin' NNTP.

I saw that latter. The problem isn't that I don't see articles;
the problem is that I don't see them in the same order. So I
respond to one article before seeing the next.

[...]

Thank you for showing an example of your implementation, it
helps me confronting the different approaches and getting a
better grip on the subject.

Actually, I posted it because I thought it might intregue you.
It does depend on a somewhat subtle feature of C++: the fact
that Data:

urInstance is initialized twice, first to NULL, and
then with the initialization expression.

James Kanze · Sep 10, 2009

There are so many things wrong with that code sample, I don't
even know where to start. (Exaggeration. I do know where to
start.)

Firstly and most importantly, you're using double checked
locking, which is broken in effectively all C++
implementations. Don't do that. Please read, continue to
re-read if you don't get it, this excellent
paper:http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf

I doubt that Chris would have missed the issues addressed in
that paper. He doesn't tell us what local_ptr is, so it's hard
to say, but presumably, it takes steps to ensure the necessary
synchronization. (Double checked locking is possible, at least
on most platforms, provided you're not adverse to a little
assembler. It's not possible with just C++ and mutexes,
however, but I think it can be done under Posix using thread
local storage. Whether it's worth the bother is a different
question, given that much simpler solutions exist.)

Until we know what's behind local_ptr and global_ptr, however,
there's no way to be sure.

Next, you also have a race condition on the construction of
the mutex itself, in addition to the double checked locking of
the singleton instance construction. PTHREAD_MUTEX_INITIALIZER
is entirely equivalent to calling pthread_mutex_init, and thus
is not thread-safe.

No. PTHREAD_MUTEX_INITIALIZER is static initialization,
guaranteed to occur before any user written code is executed.

Joshua Maurice · Sep 10, 2009

On Sep 9 said:
On Sep 9 said:

You can get around static initialization and destruction ordering issues
by
using a strongly thread-safe smart pointer to manage the singleton. The
pseudo-code would be something like the following pseudo-code:
_____________________________________________________________________ [...]
_____________________________________________________________________
This is strongly thread-safe and will always work no matter how the
static
ctor/dtor ordering comes out. The caveat, well, it's definitely not as
efficient as using a raw pointer and explicitly leaking the singleton..

Click to expand...

There are so many things wrong with that code sample, I don't even
know where to start. (Exaggeration. I do know where to start.)

Click to expand...

Actually, well... How much experience do you have wrt multi-threading
issues?

Firstly and most importantly, you're using double checked locking,
which is broken in effectively all C++ implementations.

Click to expand...

I explicitly stated that one can:

"get around static initialization and destruction ordering issues by using a
strongly thread-safe smart pointer to manage the singleton"

Do you have any idea what I am writing about here? Go ahead and put it on
the self because this discussion can get advanced rather quickly!

Ok, so apparently your "strong thread-safe" pointer class has barrier
semantics on "operator!" and "reset". If that's the case, then yes, it
does make this double checked locking safe. I'm sorry here.

However, next time please emphasize that "my pointer class has barrier
semantics on operator! and reset". You should see the recent post
about what "thread-safe" means in terms of boost shared pointer. The
general result of that thread was several people 'knew' the definition
of thread-safe in the context of boost shared pointer, except that
each definition was different.

;^/

Learn about POSIX:

http://www.opengroup.org/onlinepubs/7990989775/xsh/pthread_mutex_init...

Quoting Open Group

In cases where default mutex attributes are appropriate, the macro PTHREAD_MUTEX_INITIALIZER can be used to initialise mutexes that are statically allocated. The effect is equivalent to dynamic initialisation by a call to pthread_mutex_init() with parameter attr specified as NULL, except that no error checks are performed.

Now, I admit this has always been vague to me. Let's see. It says that
"the effect is equivalent to dynamic initialization by a call to
pthread_mutex_init". Unfortunately, I'm used to the C++ world where
\dynamic initialization\ means something very specific. Under that
definition, using it as the initializor of a function local static
would be a race condition. However, it seems that C does not allow
function local statics to be initialized with anything but a constant
expression (as determined by comeau online), making the code what C++
calls \static initialization\. I am thus entirely wrong on this point.
I'm sorry.

And yes, I rarely use posix directly. I'm in a world where my code has
to work on windows and posix, and all windows mutexes require runtime
init, so I have to program to the lowest common denominator. I'm still
wrong, just explaining my background.

--
That leaves us with your code, which while being obfuscated, is also
correct, given a certain latitude in your favor aka the exact
definition of a "strongly thread-safe pointer class". If your pointer
class just uses pthread mutexes, then your example, once simplified of
redundant mutexes, is equivalent to

struct pthread_mutex_guard
{ pthread_mutex_guard(pthread_mutex_t & y) : x(y) { pthread_mutex_lock
(x); }
~pthread_mutex_guard() { pthread_mutex_unlock(x); }
pthread_mutex_t & x;
};
template <typename T>
T& once()
{
static T* t = 0;
{
static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_guard g(m);
if (0 == t)
t = new T;
}
return *t;
}

If your pointer class makes use of read and write memory barriers,
then it's equivalent to

struct pthread_mutex_guard
{ pthread_mutex_guard(pthread_mutex_t & y) : x(y) { pthread_mutex_lock
(x); }
~pthread_mutex_guard() { pthread_mutex_unlock(x); }
pthread_mutex_t & x;
};
template <typename T>
T& once()
{
static T* t = 0;
if (0 == t)
{
static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_guard g(m);
if (0 == t)
{
T* tmp = new T;
write_barrier(); /* platform specific */
t = tmp;
}
}
read_barrier(); /* platform specific */
return *t;
}

Either way, in this context of trying to demonstrate the proper way of
writing a singleton, one should not hide away crucial implementation
details as you did. It is not useful to other readers to say "here's
what it looks like, except for the important implementation details".
You used something that looked very much like the "common" double
checked locking anti-pattern, but with "hidden magic" which made it
correct. That will only help perpetuate the myth that the "common"
double checked locking anti-pattern is correct.

PS: Note that both of my solutions in this post are relatively devoid
of error checking. You should probably at least assert on the return
code of pthread functions.

PPS: All windows standard mutexes have runtime init, so you cannot do
either of these approaches without significant modifications, or
without rolling your own mutex built on some of the atomic primitives
like test and swap. See:
http://www.ddj.com/cpp/199203083?pgno=7
for a good discussion on how to do this.

Alternatively, use a namespace scope variable to force construction
before main (or before dlopen returns for static init of a dll) to
"guarantee" correctness as demonstrated in countless posts in this
thread.

Joshua Maurice · Sep 10, 2009

I doubt that Chris would have missed the issues addressed in
that paper. He doesn't tell us what local_ptr is, so it's hard
to say, but presumably, it takes steps to ensure the necessary
synchronization. (Double checked locking is possible, at least
on most platforms, provided you're not adverse to a little
assembler. It's not possible with just C++ and mutexes,
however, but I think it can be done under Posix using thread
local storage. Whether it's worth the bother is a different
question, given that much simpler solutions exist.)

Until we know what's behind local_ptr and global_ptr, however,
there's no way to be sure.

No. PTHREAD_MUTEX_INITIALIZER is static initialization,
guaranteed to occur before any user written code is executed.

Indeed. I should probably think a little more before I speak, err
type. I was having a bad day. Thank you James and I'm sorry again
Chris.

Francesco · Sep 10, 2009

[...]

And doesn't avoid the order of initialization issues the other
versions were designed to avoid.

Click to expand...

Well, to be precise, the last version above isn't even
applicable, because the ctor is private. I've already
straightened all the points above, partly by myself, mostly
with Paavo, as you can see from all the posts between the one
you are quoting from me and the one I'm replying to. I suppose
you didn't receive them. Freakin' NNTP.

Click to expand...

I saw that latter. The problem isn't that I don't see articles;
the problem is that I don't see them in the same order. So I
respond to one article before seeing the next.

Fine, no problem.

[...]

Thank you for showing an example of your implementation, it
helps me confronting the different approaches and getting a
better grip on the subject.

Click to expand...

Actually, I posted it because I thought it might intregue you.
It does depend on a somewhat subtle feature of C++: the fact
that Data:urInstance is initialized twice, first to NULL, and
then with the initialization expression.

Uh, well, I've read your code but I didn't think about the fact that
ourInstance had to be initialized before being compared to NULL - in
other words, I missed to catch the subtlety.

Can I rely on this default, hidden initialization for static members
of any type?

Something like this:
-------
class A {
static int data;
public:
static int get_data();
};

int A::data = get_data();

int A::get_data() {
if (data == 0) data = 42;
return data;
}
-------

Is the above guaranteed to work for any type?
(using the appropriate value on the rhs of ==, I mean)

Best regards,
Francesco

Chris M. Thomasson · Sep 10, 2009

James Kanze said:
You can get around static initialization and destruction
ordering issues by using a strongly thread-safe smart
pointer to manage the singleton. The pseudo-code would be
something like the following pseudo-code:
_____________________________________________________________________ [...]
_____________________________________________________________________
This is strongly thread-safe and will always work no matter
how the static ctor/dtor ordering comes out. The caveat,
well, it's definitely not as efficient as using a raw
pointer and explicitly leaking the singleton.

Click to expand...

Click to expand...

There are so many things wrong with that code sample, I don't
even know where to start. (Exaggeration. I do know where to
start.)

Click to expand...

Firstly and most importantly, you're using double checked
locking, which is broken in effectively all C++
implementations. Don't do that. Please read, continue to
re-read if you don't get it, this excellent
paper:http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf

Click to expand...

I doubt that Chris would have missed the issues addressed in
that paper. He doesn't tell us what local_ptr is, so it's hard
to say, but presumably, it takes steps to ensure the necessary
synchronization. (Double checked locking is possible, at least
on most platforms, provided you're not adverse to a little
assembler. It's not possible with just C++ and mutexes,
however, but I think it can be done under Posix using thread
local storage. Whether it's worth the bother is a different
question, given that much simpler solutions exist.)

Until we know what's behind local_ptr and global_ptr, however,
there's no way to be sure.

global/local_ptr was meant to represent a smart pointer class that can
achieve strong thread-safety. Here is a full blown implementation of such a
smart pointer:

http://atomic-ptr-plus.sourceforge.net

http://sourceforge.net/projects/atomic-ptr-plus/files
(download the atomic-ptr-plus package)

in this case, global_ptr == atomic_ptr.

BTW, any general purpose multi-threaded smart pointer that did not have
release semantics for storing into, and acquire semantics for loading from,
is basically busted by default.

;^)

You can also take a look at my experimental smart pointer here:

http://webpages.charter.net/appcore/vzoom/refcount

[...]

Chris M. Thomasson · Sep 10, 2009

On Sep 9 said:
On Sep 9 said:

You can get around static initialization and destruction ordering
issues
by
using a strongly thread-safe smart pointer to manage the singleton.
The
pseudo-code would be something like the following pseudo-code:
_____________________________________________________________________ [...]

_____________________________________________________________________

Click to expand...

This is strongly thread-safe and will always work no matter how the
static
ctor/dtor ordering comes out. The caveat, well, it's definitely not
as
efficient as using a raw pointer and explicitly leaking the
singleton.
There are so many things wrong with that code sample, I don't even
know where to start. (Exaggeration. I do know where to start.)

Click to expand...

Actually, well... How much experience do you have wrt multi-threading
issues?

Click to expand...

I explicitly stated that one can:

"get around static initialization and destruction ordering issues by
using a
strongly thread-safe smart pointer to manage the singleton"

Do you have any idea what I am writing about here? Go ahead and put it
on
the self because this discussion can get advanced rather quickly!

Click to expand...

Ok, so apparently your "strong thread-safe" pointer class has barrier
semantics on "operator!" and "reset".

Humm... Why would it need any memory barrier on `operator ! ()'? The load
from the global pointer to the local pointer already has to have implied
acquire semantics, so putting membar in `operator !' would be adding
needless overhead. As for `reset()', it has to have implied release
semantics.

If that's the case, then yes, it
does make this double checked locking safe. I'm sorry here.

However, next time please emphasize that "my pointer class has barrier
semantics on operator! and reset".

I thought it was a given. How could any general purpose multi-threaded smart
pointer class work without implied acquire on load and release on store
memory barrier semantics?

You should see the recent post
about what "thread-safe" means in terms of boost shared pointer.

You mean this one:

http://groups.google.com/group/comp.lang.c++/browse_frm/thread/f363f8a8010f9323

?

The
general result of that thread was several people 'knew' the definition
of thread-safe in the context of boost shared pointer, except that
each definition was different.

I just know that shared_ptr only provides the basic/normal thread-safety
level. I gave several examples in that discussion.

Quoting Open Group
In cases where default mutex attributes are appropriate, the macro
PTHREAD_MUTEX_INITIALIZER
can be used to initialise mutexes that are statically allocated. The
effect is equivalent to
dynamic initialisation by a call to pthread_mutex_init() with parameter
attr specified as
NULL, except that no error checks are performed.

Now, I admit this has always been vague to me. Let's see. It says that
"the effect is equivalent to dynamic initialization by a call to
pthread_mutex_init". Unfortunately, I'm used to the C++ world where
\dynamic initialization\ means something very specific. Under that
definition, using it as the initializor of a function local static
would be a race condition. However, it seems that C does not allow
function local statics to be initialized with anything but a constant
expression (as determined by comeau online), making the code what C++
calls \static initialization\. I am thus entirely wrong on this point.
I'm sorry.

And yes, I rarely use posix directly. I'm in a world where my code has
to work on windows and posix, and all windows mutexes require runtime
init, so I have to program to the lowest common denominator. I'm still
wrong, just explaining my background.

That's fine.

That leaves us with your code, which while being obfuscated, is also
correct, given a certain latitude in your favor aka the exact
definition of a "strongly thread-safe pointer class". If your pointer
class just uses pthread mutexes, then your example, once simplified of
redundant mutexes, is equivalent to

[...]

One point, a smart pointer class that is 100% lock-based is fine. However,
it can become a fairly significant bottleneck.

If your pointer class makes use of read and write memory barriers,
then it's equivalent to

[...]

FWIW, most systems, DEC Alpha aside for a moment, would have
`read_barrier()' defined to be a no-op because of implicit data-dependant
load barrier (e.g., Linux `read_barrier_depends()').

Either way, in this context of trying to demonstrate the proper way of
writing a singleton, one should not hide away crucial implementation
details as you did.

I did not want to bother showing an implementation of the smart pointer just
to get across that it has acquire/release on load/store respectively. Humm,
perhaps that was a mistake!

;^(

It is not useful to other readers to say "here's
what it looks like, except for the important implementation details".
You used something that looked very much like the "common" double
checked locking anti-pattern, but with "hidden magic" which made it
correct. That will only help perpetuate the myth that the "common"
double checked locking anti-pattern is correct.

Fair enough.

PS: Note that both of my solutions in this post are relatively devoid
of error checking. You should probably at least assert on the return
code of pthread functions.

Yes. Refer to one of my other posts in this thread:

http://groups.google.com/group/comp.lang.c++/msg/63c3aecb9d0fbaa6

Take note of the `PTHREAD_UNEXPECTED' macro.

;^)

PPS: All windows standard mutexes have runtime init, so you cannot do
either of these approaches without significant modifications, or
without rolling your own mutex built on some of the atomic primitives
like test and swap. See:
http://www.ddj.com/cpp/199203083?pgno=7
for a good discussion on how to do this.

Humm... One quote from that page has me a bit "uneasy":

"If no wait is necessary, locking a QLock requires just one atomic
instruction and no system calls"

The author fails to mention that along with the atomic instruction comes a
very nasty and performance destroying `#StoreLoad | #StoreStore' memory
barrier. This will be executed regardless of any contention.

Alternatively, use a namespace scope variable to force construction
before main (or before dlopen returns for static init of a dll) to
"guarantee" correctness as demonstrated in countless posts in this
thread.

If you are fine with leaking the singleton and are a disciplined programmer,
then it will work.

Chris M. Thomasson · Sep 10, 2009

Humm...

There is really no "guarantee" of correctness wrt this technique in the
presence of threads because it is not thread-safe in any way, shape or form.
It's not a general purpose construct unless there is explicit documentation
that documents all the caveats.

I can easily break the heck out of it. For instance, check this crap out:
_____________________________________________________________________
#include <pthread.h>
#include <sched.h>
#include <cstdio>

void yield()
{
for (unsigned i = 0; i < 10; ++i)
{
sched_yield();
}
}

template<typename T, unsigned T_id = 0>
class non_destroying_singleton
{
static T* g_instance;

public:
static T& instance();
};

template<typename T, unsigned T_id>
T& non_destroying_singleton<T, T_id>::instance()
{
yield();

if (! g_instance)
{
yield();

g_instance = new T();

yield();
}

yield();

return *g_instance;
}

template<typename T, unsigned T_id>
T* non_destroying_singleton<T, T_id>::g_instance =
&non_destroying_singleton<T, T_id>::instance();

struct foo
{
foo()
{
std:

rintf("(%p)->foo::foo()\n", (void*)this);
}

~foo()
{
std:

rintf("(%p)->foo::~foo()\n", (void*)this);
}
};

extern "C"
void* killer_thread(void*)
{
std:

uts("killer_thread");

non_destroying_singleton<foo>::instance();

return NULL;
}

struct killer
{
pthread_t m_tid[10];

killer()
{
std:

uts("killer::killer()");

for (unsigned i = 0; i < 10; ++i)
{
pthread_create(&m_tid, NULL, killer_thread, NULL);
}
}

~killer()
{
std:uts("killer::~killer()");

for (unsigned i = 0; i < 10; ++i)
{
pthread_join(m_tid, NULL);
}
}
};

static killer g_killer;

int
main()
{
std:uts("main");

return 0;
}
_____________________________________________________________________

Oh shi%! Massive race-condition here... `foo' can be created multiple times
and leaked all over the damn place. Here is the output I happened to get:
_____________________________________________________________________
killer::killer()
killer_thread
killer_thread
killer_thread
killer_thread
killer_thread
killer_thread
(0024B7B8)->foo::foo()
killer_thread
killer_thread
(0024B808)->foo::foo()
(0024B818)->foo::foo()
(0024B828)->foo::foo()
killer_thread
killer_thread
main
killer::~killer()
_____________________________________________________________________

KA-BOOOOOOM!!!!!!!!!!!

:^o

If you are fine with leaking the singleton and are a disciplined
programmer, then it will work.

Click to expand...

You need to be very disciplined.

;^)

Chris M. Thomasson · Sep 10, 2009

[...]

PPS: All windows standard mutexes have runtime init, so you cannot do
either of these approaches without significant modifications, or
without rolling your own mutex built on some of the atomic primitives
like test and swap. See:
http://www.ddj.com/cpp/199203083?pgno=7
for a good discussion on how to do this.

Alternatively, use a namespace scope variable to force construction
before main (or before dlopen returns for static init of a dll) to
"guarantee" correctness as demonstrated in countless posts in this
thread.

Windows has everything one needs in order to create a 100% correct DCL
pattern. The following code will work with Windows, modulo bugs of course
because I just hacked it together:
______________________________________________________________________
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#include <sstream>
#include <cassert>
#include <exception>
#include <cstdio>

class win_dcl_mutex
{
HANDLE m_mutex;

private:
static std::string prv_get_name()
{
std:

stringstream name;

name << "DCL_MUTEX_" << GetCurrentProcessId();

return name.str();
}

public:
win_dcl_mutex() throw()
: m_mutex(CreateMutex(NULL, TRUE, prv_get_name().c_str()))
{
if (! m_mutex)
{
assert(m_mutex);
std::unexpected();
}

else if (GetLastError() == ERROR_ALREADY_EXISTS)
{
if (WaitForSingleObject(m_mutex, INFINITE) !=
WAIT_OBJECT_0)
{
assert(m_mutex);
CloseHandle(m_mutex);
std::unexpected();
}
}
}

~win_dcl_mutex() throw()
{
if (! ReleaseMutex(m_mutex))
{
assert(m_mutex);
CloseHandle(m_mutex);
std::unexpected();
}

if (! CloseHandle(m_mutex))
{
assert(m_mutex);
std::unexpected();
}
}
};

template<typename T, unsigned T_id = 0>
class win_non_destroying_singleton
{
static T* prv_load(T* volatile* pptr)
{
T* ptr1 = *pptr;

if (! ptr1)
{
ptr1 = (T*)InterlockedCompareExchangePointer(
(void* volatile*)pptr, NULL, NULL);
}

if (ptr1)
{
// does `InterlockedCompareExchange()' peform a
// membar on failure? here is a funny work-around:

T* ptr2 = (T*)InterlockedExchangePointer(
(void* volatile*)pptr, ptr1);

if (ptr1 != ptr2)
{
assert(ptr1 == ptr2);
std::unexpected();
}
}

return ptr1;
}

static T* prv_store(T* volatile* pptr, T* ptr1)
{
assert(ptr1);

T* ptr2 = (T*)InterlockedExchangePointer(
(void* volatile*)pptr, ptr1);

if (ptr2)
{
assert(! ptr2);
std::unexpected();
}

return ptr1;
}

public:
static T& instance()
{
static T* g_instance = NULL;

T* local = prv_load(&g_instance);

if (! local)
{
win_dcl_mutex lock;

if (! (local = g_instance))
{
local = prv_store(&g_instance, new T());
}
}

return *local;
}
};

int
main()
{
{
int& x1 = win_non_destroying_singleton<int>::instance();
int& x2 = win_non_destroying_singleton<int>::instance();
}

return 0;
}

______________________________________________________________________

BTW, I had to code the `win_non_destroying_singleton<T>:

rv_load()'
function that way because Microsoft does not document whether or not
`InterlockedCompareExchange()' performs a memory barrier on the failure
case. If it did, then it could simply look like this:
______________________________________________________________________
static T* prv_load(T* volatile* pptr)
{
return (T*)InterlockedCompareExchangePointer(
(void* volatile*)pptr, NULL, NULL);
}
______________________________________________________________________

Also, if you are using a recent version of MSVC (e.g., I think it's 8 or
higher), then the class can look like:
______________________________________________________________________
template<typename T, unsigned T_id = 0>
struct win_non_destroying_singleton
{
static T& instance()
{
static T* volatile g_instance = NULL;

T* local = g_instance;

if (! local)
{
win_dcl_mutex lock;

if (! (local = g_instance))
{
local = g_instance = new T();
}
}

return *local;
}
};
______________________________________________________________________

This is because those versions of MSVC have acquire/release semantics for
loading/storing from/to volatile variables on certain platforms (e.g.,
PowerPC)...

Joshua Maurice · Sep 10, 2009

Windows has everything one needs in order to create a 100% correct DCL
pattern. The following code will work with Windows, modulo bugs of course
because I just hacked it together:

[snip probably correct impl]

I should probably just shut up and take my losses. I am still right
that there is no mutex on windows without runtime init, but your
workaround, which Boost also uses to do boost_once IIRC, is quite
good.

I forget which versions, but windows also have their own native
version of pthread_once, so you could also use that. Also, as you
mentioned, volatile would work, for whichever VS versions and
platforms make it into barriers.

James Kanze · Sep 10, 2009

On 10 Set, 13:04, James Kanze <[email protected]> wrote:

[...]

Uh, well, I've read your code but I didn't think about the
fact that ourInstance had to be initialized before being
compared to NULL - in other words, I missed to catch the
subtlety.

You missed the fact that the function used to initialize the
variable uses the previous value of the variable.

In fact, it's a clever trick, with both the positive and
negative implications of "clever". The suggestion else thread
of using a separate boolean variable to accomplish this achieves
exactly the same results, and is doubtlessly more readable.

Can I rely on this default, hidden initialization for static
members of any type?

For any variable. The standard describes the initialization of
objects with static lifetime as occuring in three phases: zero
initialization, static initialization and dynamic
initialization. In practice, there's no way to distinguish the
first two, since both occur before any code you write is
executed. (In practice, on most systems, the first two are done
by the system, when the program is loaded, either by copying an
image from disk or by zapping the block containing the variables
with 0's. This only works, of course, on systems where null
pointers and 0.0 are represented by all 0 bits, but that's the
vast majority of the systems today.`)

Something like this:
-------
class A {
static int data;
public:
static int get_data();
};

int A::data = get_data();

int A::get_data() {
if (data == 0) data = 42;
return data;
}
-------

Is the above guaranteed to work for any type?
(using the appropriate value on the rhs of ==, I mean)

Or using 0. Yes. For class types, it applies recursively,
until you reach a basic type which can be "zero initialized".
The one exception is references, since by definition, a
reference can't be zero initialized.

Jerry Coffin · Sep 11, 2009

[ ... ]

Firstly and most importantly, you're using double checked locking,
which is broken in effectively all C++ implementations. Don't do that.
Please read, continue to re-read if you don't get it, this excellent
paper:
http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf

This is a decent paper, but it should be kept in context. Up until
shortly before he (helped) write that paper, Andrei seems to have
thought that the 'volatile' keyword was sufficient to give assurances
necessary for multithreading (in essence that reading or writing a
volatile variable acted as a memory barrier).

That paper followed shortly after he realized that this just wasn't
so. The tone of the paper is unfortunate though -- it comes off as
basically saying there's a problem with double-checked locking, which
really isn't the case at all. The problem is that C++ (up through the
2003 standard) simply lacks memory barriers. Double-checked locking
is one example of code that _needs_ a memory barrier to work
correctly -- but it's only one example of many.

Chris, OTOH, has been playing with multi-threaded programming for a
while. He quite blithely takes for granted a fair number of things
that the current standard doesn't provide at all -- mostly because
without those, there's basically no way to discuss the areas that
interest him. A couple of those are objects with acquire and release
semantics -- i.e. with memory barriers in the right places. The
minute you add a few multithreading primitives like that, the
problems with DCLP disappear.

The real problem was never with the DCLP itself, but with attempting
to do multi-threaded programming without the tools necessary for the
job.

Joshua Maurice · Sep 11, 2009

[ ... ]

Firstly and most importantly, you're using double checked locking,
which is broken in effectively all C++ implementations. Don't do that.
Please read, continue to re-read if you don't get it, this excellent
paper:
http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf

Click to expand...

This is a decent paper, but it should be kept in context. Up until
shortly before he (helped) write that paper, Andrei seems to have
thought that the 'volatile' keyword was sufficient to give assurances
necessary for multithreading (in essence that reading or writing a
volatile variable acted as a memory barrier).

That paper followed shortly after he realized that this just wasn't
so. The tone of the paper is unfortunate though -- it comes off as
basically saying there's a problem with double-checked locking, which
really isn't the case at all. The problem is that C++ (up through the
2003 standard) simply lacks memory barriers.

Sorry what? There's something which resembling threading guarantees in
C++03?

Double-checked locking
is one example of code that _needs_ a memory barrier to work
correctly -- but it's only one example of many.

Chris, OTOH, has been playing with multi-threaded programming for a
while. He quite blithely takes for granted a fair number of things
that the current standard doesn't provide at all -- mostly because
without those, there's basically no way to discuss the areas that
interest him. A couple of those are objects with acquire and release
semantics -- i.e. with memory barriers in the right places. The
minute you add a few multithreading primitives like that, the
problems with DCLP disappear.

The real problem was never with the DCLP itself, but with attempting
to do multi-threaded programming without the tools necessary for the
job.

The problem is that most people I talk to at my work, they have
certain notions of how threading works. One of these notions is that
one can reason about threading by considering all possible
interleavings of instructions (which, of course, is incorrect). Sadly,
most of my colleagues still don't "get it". The proto-typical example
I use: "Suppose a thread does a write A and a write B in source code.
Physically after that processor core does both load instructions,
another thread may see write A and not write B, another thread may see
write B and not see write A, another thread may see both, and another
thread may see neither." If one was incorrectly taught threading,
which is the case I think for most university graduates nowadays, it
is very hard to break your old habits and understand how threading
really works.

Ex: One reply I just got was "What if it was:
x = a;
y = x;
Surely another thread cannot see the write to y and not see the write
to x?" (BTW, no, a thread may still see the write to y without seeing
the write to x.)

DCL is the most common application I see of this flawed reasoning. The
heart of DCL is that one relies on investigating the possible
interleavings of source code "instructions" to see if it will work
out. IMHO, as soon as you modify DCL to be correct in C or C++, it is
no longer DCL. You have fundamentally changed the basis of your belief
of its correctness when you add in proper synchronization. No longer
are you doing a single check outside of all synchronization, the
hallmark of DCL. It only superficially resembles DCL.

Now, arguably, your post Jerry and mine is merely a definition over
terms. I hope I've generated more light than heat, but I don't think
it will be useful to get into a discussion of whether DCL is by
definition broken or merely usually incorrectly implemented. Define it
however you want. I still think that DCL by definition is incorrect in
C and C++, and modifications to make it correct render it no longer
DCL.

problem compiling static library	7	May 27, 2014
destruction of static member	2	Dec 18, 2009
Declaring a static const member of a class.	4	Mar 2, 2012
Why are member functions allowed to access private data members of aclass?	19	Jun 27, 2010
A question about initilization of a static member	9	Mar 22, 2007
linkage of static and inline functions	5	Nov 27, 2007
memory usage of an object's member functions	1	Dec 30, 2004
template class and static member variables	5	Nov 3, 2006

question re. usage of "static" within static member functions of aclass

Joshua Maurice

REH

Chris M. Thomasson

Joshua Maurice

Chris M. Thomasson

Chris M. Thomasson

Chris M. Thomasson

James Kanze

James Kanze

Joshua Maurice

Joshua Maurice

Francesco

Chris M. Thomasson

Chris M. Thomasson

Chris M. Thomasson

Chris M. Thomasson

Joshua Maurice

James Kanze

Jerry Coffin

Joshua Maurice

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads