How to write a library with a static object?


Jeroen

Hi all,

I've got a question about writing a library. Let me characterize that
library by the following:

* there is a class A which is available to the user
* there is a class B that is used in several 'underwater operations'
* there is a list which stores objects of class B

There are several issues I'm not sure about:

* Is it best practice to put everything in hpp-files, so no cpp file is
used (#include is all you need to use the lib)?
* How do I get an instance of the list? It should always be available to
my library. Is it wise to have a class C in the hpp-file with static
member 'list<class_B> my_list;' which provides me my list? Or is there a
better way to do so...

Regards,

Jeroen
 

John Harrison

Jeroen said:
Hi all,

I've got a question about writing a library. Let me characterize that
library by the following:

* there is a class A which is available to the user
* there is a class B that is used in several 'underwater operations'
* there is a list which stores objects of class B

There are several issues I'm not sure about:

* Is it best practice to put everything in hpp-files, so no cpp file is
used (#include is all you need to use the lib)?

It's a possibility. The main issue, I guess, would be increased compile
time for your users if you have a lot of code.
* How do I get an instance of the list? It should always be available to
my library. Is it wise to have a class C in the hpp-file with static
member 'list<class_B> my_list;' which provides me my list? Or is there a
better way to do so...

The big issue here is the 'static initialisation order fiasco'. There's
no perfect answer, but making your list a static local variable inside a
global function is the best you can do. See here:

http://www.parashift.com/c++-faq-lite/ctors.html#faq-10.12
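
A minimal sketch of that idiom applied to the list from the question, assuming C++11 or later. The names `registry()` and `add_b()` are made up for illustration and `B` is only a stand-in for the library's internal class; nothing here comes from the FAQ verbatim.

```cpp
#include <cstddef>
#include <list>

// Hypothetical stand-in for the library's internal class B.
struct B {
    int blah;
};

// Construct on first use: the list is a static local variable, so it is
// built the first time registry() is called, however early that happens.
// 'inline' lets this definition live in a header included by many files.
inline std::list<B>& registry()
{
    static std::list<B> the_list;
    return the_list;
}

// Hypothetical helper that the rest of the library could use.
// Returns the number of stored objects after the insertion.
inline std::size_t add_b(int blah)
{
    registry().push_back(B{blah});
    return registry().size();
}
```

Library code would then always go through `registry()` instead of naming a global list directly, which is what removes the ordering problem.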
Regards,

Jeroen

john
 

Jeroen

John said:
It's a possibility. Main issue I guess would be increased compile time
for your users if you have a lot of code.



The big issue here is the 'static initialisation order fiasco'.
There's no perfect answer, but making your list a static local variable
inside a global function is the best you can do. See here:

http://www.parashift.com/c++-faq-lite/ctors.html#faq-10.12



john

OK, thanks. The link was very helpful to read. Let me try a better
construction: if I go back to my code (yet to write...), then I have:

* class A that's available to the user
* class B that is used 'underwater', but only if the user does
something with class A
* a list for objects of class B (only accessed if the user does
something with class A...)

So maybe I am safe if I put the list as a static member of class A?

#include <list>

class B {
    int blah;
};

class A {
private:
    static std::list<B> my_list;
    static void put_B_in_the_list(const B& b);
    static B& find_B_in_the_list(int blah);

    int more_blah;
};

Given the fact that the list is only accessed if the user of the library
uses class A, this should prevent the init fiasco described in the link
you gave?

Thanx again,

Jeroen
 

John Harrison

Jeroen said:
OK, thanks. The link was very helpful to read. Let me try a better
construction: if I go back to my code (yet to write...), then I have:

* class A that's available to the user
* class B that is used 'underwater', but only if the user does
something with class A
* a list for objects of class B (only accessed if the user does
something with class A...)

So maybe I am safe if I put the list as a static member of class A?

#include <list>

class B {
    int blah;
};

class A {
private:
    static std::list<B> my_list;
    static void put_B_in_the_list(const B& b);
    static B& find_B_in_the_list(int blah);

    int more_blah;
};

Given the fact that the list is only accessed if the user of the library
uses class A, this should prevent the init fiasco described in the link
you gave?

Thanx again,

Jeroen

No, if your user attempts to use class A in the construction of a global
object there is no guarantee that my_list will have been constructed.
You only get that guarantee by making my_list a static local variable in
a global function, as shown in the FAQ.
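
To make the scenario concrete, here is a hedged sketch (C++11 assumed) in which a user-defined global touches `A` while it is being constructed. `UserGlobal`, `count()` and the accessor `my_list()` are hypothetical names introduced for the example; only `A`, `B` and `my_list` come from the post above.

```cpp
#include <list>

struct B {
    int blah;
};

class A {
public:
    // Every A registers a B in the shared list.
    explicit A(int blah) { my_list().push_back(B{blah}); }

    static std::list<B>::size_type count() { return my_list().size(); }

private:
    // Defined in the class body, so implicitly inline.
    static std::list<B>& my_list()
    {
        static std::list<B> instance;  // built on the first call
        return instance;
    }
};

// Hypothetical user code: a global that uses A during its own construction.
struct UserGlobal {
    UserGlobal() : a(42) {}
    A a;
};

UserGlobal user_global;  // constructed before main(); the list already exists
```

Had `my_list` been a plain static data member defined in another source file, the constructor of `user_global` could have run before the list was constructed.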

john
 

Jeroen

John Harrison wrote:
No, if your user attempts to use class A in the construction of a global
object there is no guarantee that my_list will have been constructed.
You only get that guarantee by putting my_list as a static in a global
function as shown in the FAQ.

john

OK, thanks John. I found that also in the FAQ. I'm not really sure if a
global function in the hpp-file may cause a problem if that hpp-file is
included in multiple source files of a project, but I can switch to the
static member-function holding the static list-pointer as shown in 10.16
of the FAQ.

Jeroen
 

John Harrison

Jeroen said:
John Harrison wrote:



OK, thanks John. I found that also in the FAQ. I'm not really sure if a
global function in the hpp-file may cause a problem if that hpp-file is
included in multiple source files of a project, but I can switch to the
static member-function holding the static list-pointer as shown in 10.16
of the FAQ.

Jeroen

A global function would not be a problem if it was declared inline. The
same holds for a static member function: it must be (explicitly or
implicitly) declared inline.
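
A sketch of the three cases in a header-only layout. The file name `mylib.hpp`, the namespace `mylib` and the function names are assumptions made up for illustration.

```cpp
// mylib.hpp (hypothetical header-only layout)
#ifndef MYLIB_HPP
#define MYLIB_HPP

#include <list>

namespace mylib {

struct B {
    int blah;
};

// Free function defined in a header: needs an explicit 'inline'.
inline std::list<B>& global_list()
{
    static std::list<B> l;
    return l;
}

class C {
public:
    // Defined inside the class body: implicitly inline.
    static std::list<B>& list_a()
    {
        static std::list<B> l;
        return l;
    }

    // Only declared here; the definition below needs an explicit 'inline'.
    static std::list<B>& list_b();
};

inline std::list<B>& C::list_b()
{
    static std::list<B> l;
    return l;
}

}  // namespace mylib

#endif  // MYLIB_HPP
```

Because the functions are inline, every source file that includes the header shares the same static local object, so there is still exactly one list per function in the whole program.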

john
 

Adrian Hawryluk

John said:
No, if your user attempts to use class A in the construction of a global
object there is no guarantee that my_list will have been constructed.
You only get that guarantee by putting my_list as a static in a global
function as shown in the FAQ.

john

Excuse me, but does anyone know why this fiasco even exists? Why is
there not some dependency mechanism in place? Or something like the
static local object being constructed when the control flows to them (as
opposed to over them)?

One other thing. In that FAQ we have Fred defined as so:

// File x.cpp

#include "Fred.h"

Fred& x()
{
    static Fred* ans = new Fred();
    return *ans;
}


Why is it not defined like this:

// File x.cpp

#include "Fred.h"

Fred& x()
{
    static Fred ans;
    return ans;
}

It looks to me that the compilers are built pretty flimsily.


Adrian
 

John Harrison

Adrian said:
Excuse me, but does anyone know why this fiasco even exists? Why is
there not some dependency mechanism in place? Or something like the
static local object being constructed when the control flows to them (as
opposed to over them)?

It's a good question. One issue would be that with such a scheme the
order of initialisation would be unpredictable, and that wouldn't mesh
well with the one guarantee that the standard does give, which is that
definitions within a single file are initialised in the order that they
occur in that file. I know at times I found that certainty to be useful.
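
That single-file guarantee can be observed with a small sketch. `init_log` and `Tracer` are hypothetical names; the point is only that objects defined in one source file are initialised top to bottom.

```cpp
#include <vector>

// Defined first, so it is initialised before the objects below use it.
std::vector<int> init_log;

// Records its id when constructed.
struct Tracer {
    explicit Tracer(int id) { init_log.push_back(id); }
};

// Same source file, so these are initialised in the order written.
Tracer first(1);
Tracer second(2);
Tracer third(3);
```

If `init_log` lived in a different source file from the `Tracer` objects, nothing would guarantee it was constructed first, which is exactly the fiasco under discussion.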

There may be reasons why what you're suggesting isn't easy to achieve, I
don't know, but representatives of the major compiler writers work on
the C++ standards board so they wouldn't have decided on this fiasco
without good reason.

One other thing. In that FAQ we have Fred defined as so:

// File x.cpp

#include "Fred.h"

Fred& x()
{
    static Fred* ans = new Fred();
    return *ans;
}


Why is it not defined like this:

// File x.cpp

#include "Fred.h"

Fred& x()
{
    static Fred ans;
    return ans;
}

If you read FAQ 10.14 he explains why the pointer is used. Personally I
think the issue raised in that FAQ is more of a theoretical concern; I've
never been bitten by it in practice (unlike the initialisation order fiasco).
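
For readers without the FAQ at hand, a hedged sketch of the two variants side by side. As far as I understand it, the concern is about destruction order at program exit rather than construction. `x_leaky()`, `x_object()` and the `live` counter are names invented here for illustration.

```cpp
struct Fred {
    static int live;  // number of Fred objects currently alive
    Fred()  { ++live; }
    ~Fred() { --live; }
};

int Fred::live = 0;  // constant-initialised, so no ordering problem here

// Pointer variant: the object is never destroyed, so destructors of other
// static objects can still use it safely. The price is that ~Fred() never runs.
Fred& x_leaky()
{
    static Fred* ans = new Fred();
    return *ans;
}

// Object variant: destroyed automatically during program shutdown, so
// a static object destroyed later must not touch it any more.
Fred& x_object()
{
    static Fred ans;
    return ans;
}
```

Both variants construct the object on first use; they differ only in what happens when the program ends.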
 

Adrian Hawryluk

John said:
It's a good question. One issue would be that with such a scheme the
order of initialisation would be unpredictable, and that wouldn't mesh
well with the one guarantee that the standard does give, which is that
definitions within a single file are initialised in the order that they
occur in that file. I know at times I found that certainty to be useful.

I wasn't saying that the order of initialisation as they occur in the
source file should change. I meant that the order of initialisation in the
group of different source files should change based on the dependency of
one source file on another. It seems reasonable and doable. 'make' can do
it based on these dependencies (given that some compilers can spit
out dependencies based on the included files).

There may be reasons why what you're suggesting isn't easy to achieve, I
don't know, but representatives of the major compiler writers work on
the C++ standards board so they wouldn't have decided on this fiasco
without good reason.

Sounds like they're lazy. :)
If you read FAQ 10.14...

Hmm, should have read a little further. :)

...he explains why the pointer is used. Personally I
think the issue raised in that FAQ is more of a theoretical concern; I've
never been bitten by it in practice (unlike the initialisation order
fiasco).

This just sounds like another dependency issue. Argh, it's sooooo doable.

I noticed you didn't say anything about that, but left it in all the
same. :)


Adrian
 

John Harrison

Adrian said:
I wasn't saying that order of initialisation as they occur in the source
file should change. I meant that the order of initialisation in the group
of different source files should change based on the dependency of one
source file to another. It seems reasonable and doable. 'make' can do
it based on these dependencies (when given that some compilers can spit
out dependencies based on the included files).

That can't work. For one thing, dependencies only arise at run time; they
can't be worked out in advance. Constructors can contain if statements,
so a compiler or linker doesn't know which branch of the if statement
will be taken and therefore can't work out which globals are dependent
on which.

Secondly, it's perfectly possible for files to be mutually dependent.
One global in one file requires another global in a second file, but a
different global in that second file requires yet another global in the
first file.
Sounds like they're lazy. :)

I think you're underestimating the problem.

john
 

Adrian Hawryluk

John said:
That can't work, for one thing dependencies only occur at runtime, they
can't be worked out in advance. Possible for constructors to contain if
statements so a compiler or linker doesn't know which branch of the if
statement will be taken and so can't work out which globals are
dependent on which.

Secondly, it's perfectly possible for files to be mutually dependent.
One global in one file requires another global in a second file, but a
different global in that second file requires yet another global in the
first file.

Yeah, I had thought about this the night I posted it. Initialisation
would have to be worked out at compile time (very difficult, but not
impossible using path analysis) or run time (much easier to implement).
Mutual dependencies would be a problem if done on the file level, so
it would have to initialise in order of occurrence in the file first,
and then order of dependency second.

This may cause some objects in a file to be initialised out of order,
but that shouldn't be a problem since the order of initialisation is
based on need. Since it is more difficult to perform path analysis on
the code, this could be done by the following algorithm:

Init static object:
1. Put object at the end of the initialisation doubly linked list
2. Store insertion point marker
3. Init Object.
4. Object needs other Object that is not yet initialised, so insert
other in initialisation list at insertion marker.
5. Init other Object. Loop back to 4 if another dependency is found.

So you see, this is done for each object found in a file. Destruction
would be done in reverse order of the list. Everything is fixed up nicely.
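
A loose runtime sketch of the scheme described above, assuming C++11. It is not something any compiler does, and it replaces the linked-list bookkeeping with a simple "finished" sequence, which yields the same order: dependencies land before the objects that need them. `InitRegistry`, `declare()`, `require()` and `demo_order()` are all hypothetical names.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical runtime registry that initialises objects on demand.
class InitRegistry {
public:
    // Register an object by name together with the code that initialises it.
    void declare(const std::string& name, std::function<void(InitRegistry&)> init)
    {
        declared_.push_back(name);
        init_[name] = std::move(init);
    }

    // The 'has it been constructed yet?' check; initialises on demand.
    void require(const std::string& name)
    {
        if (started_[name]) return;  // already done, or currently in progress
        started_[name] = true;
        init_[name](*this);          // may call require() for its dependencies
        order_.push_back(name);      // dependencies have finished before this
    }

    // Walk the objects in declaration order, as a single file would.
    void init_all()
    {
        for (const std::string& n : declared_) require(n);
    }

    // Destruction would walk this sequence in reverse.
    const std::vector<std::string>& order() const { return order_; }

private:
    std::vector<std::string> declared_;
    std::map<std::string, std::function<void(InitRegistry&)>> init_;
    std::map<std::string, bool> started_;
    std::vector<std::string> order_;
};

// Hypothetical example: "a" is declared first but needs "b".
inline std::vector<std::string> demo_order()
{
    InitRegistry r;
    r.declare("a", [](InitRegistry& reg) { reg.require("b"); });
    r.declare("b", [](InitRegistry&) {});
    r.declare("c", [](InitRegistry&) {});
    r.init_all();
    return r.order();
}
```

Note what the sketch does not solve: if "a" needs "b" and "b" needs "a", the second request finds "a" marked as started and carries on with a half-built object.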
I think you're underestimating the problem.

No, I don't think so.


Adrian
 

John Harrison

Adrian said:
Yeah, I had thought about this the night I posted it. Initialisation
would have to be worked out at compile time (very difficult, but not
impossible using path analysis) or run time (much easier to implement).
Mutual dependencies would be a problem if done on the file level, so it
would have to initialise in order of occurrence in the file first, and
then order of dependency second.

This may cause some objects in a file to be initialised out of order,
but that shouldn't be a problem since the order of initialisation is
based on need. Since it is more difficult to perform path analysis on
the code, this could be done by the following algorithm:

Init static object:
1. Put object at the end of the initialisation doubly linked list
2. Store insertion point marker
3. Init Object.
4. Object needs other Object that is not yet initialised, so insert
other in initialisation list at insertion marker.
5. Init other Object. Loop back to 4 if another dependency is found.

So you see, this is done for each object found in a file. Destruction
would be done in reverse order of the list. Everything is fixed up nicely.



No, I don't think so.


Adrian

Global initialisation dependencies can only be done at run time. Imagine
a constructor which reads a file and then decides which path to go
depending on what it reads.

Something like what you propose could work (I don't follow it exactly)
although you've made one simplification. When a global object is being
constructed, a reference to another global that appears in the
initialisation list would be constructed before the original object,
whereas something that appears in the body of the constructor would be
constructed after the original object. Objects are considered
constructed when the body of the constructor is entered. But still
something like that could work.

But consider the cost. The compiler cannot know when it's compiling
function f in file A.cpp whether any other file might use function f
during the construction of a global object. So the compiler must put
'has it been constructed yet?' logic before every reference to a global
object. And note that global object doesn't just mean global variable,
it also means every global array element.

Checks would also have to be done on every pointer indirection, since a
pointer could be made to point to an uninitialised global variable. The
compiler cannot know when compiling function f which takes pointer p as
a parameter whether function f might be called during the initialisation
of a global with the p containing the address of another uninitialised
global.

I think this would be a completely unacceptable overhead.
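
To make the 'has it been constructed yet?' logic tangible, here is a hand-written sketch of such a check (C++11 assumed). `CheckedGlobal`, `Counter`, `g_counter` and `bump()` are hypothetical, and the `checks` counter exists only to show that every single access pays for the test.

```cpp
#include <new>

// Hypothetical wrapper around a global object with an explicit check.
// It has no constructor and no member initialisers on purpose: an object
// with static storage duration is zero-initialised before any dynamic
// initialisation runs, so 'ptr' starts as null and 'checks' as 0.
template <typename T>
struct CheckedGlobal {
    alignas(T) unsigned char storage[sizeof(T)];
    T* ptr;
    unsigned long checks;

    T& get()
    {
        ++checks;                     // count the tests, for illustration only
        if (!ptr)                     // 'has it been constructed yet?'
            ptr = new (storage) T();  // construct in place on first access
        return *ptr;
    }
};

struct Counter {
    int value = 0;
};

CheckedGlobal<Counter> g_counter;  // hypothetical global object

// Every use of the global goes through the check.
inline int bump()
{
    return ++g_counter.get().value;
}
```

The wrapped object is never destroyed in this sketch. The point is only that the test sits on every access path, which is the overhead being argued about.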

john
 

Adrian Hawryluk

John said:
Global initialisation dependencies can only be done at run time. Imagine
a constructor which reads a file and then decides which path to go
depending on what it reads.

Something like what you propose could work (I don't follow it exactly)
although you've made one simplification. When a global object is being
constructed, a reference to another global that appears in the
initialisation list would be constructed before the original object,
whereas something that appears in the body of the constructor would be
constructed after the original object. Objects are considered
constructed when the body of the constructor is entered. But still
something like that could work.

But consider the cost. The compiler cannot know when it's compiling
function f in file A.cpp whether any other file might use function f
during the construction of a global object. So the compiler must put
'has it been constructed yet?' logic before every reference to a global
object. And note that global object doesn't just mean global variable,
it also means every global array element.

Checks would also have to be done on every pointer indirection, since a
pointer could be made to point to an uninitialised global variable. The
compiler cannot know when compiling function f which takes pointer p as
a parameter whether function f might be called during the initialisation
of a global with the p containing the address of another uninitialised
global.

_Assuming_ that the compiler is stupid and cannot optimise, the cost is
still not that great in comparison to code that works cleanly.

However, compilers are not that stupid anymore. It should know enough
that an object 'A' is static (it is initialising it after all) or not.
And when you touch another static object from that constructor (or
member function that it calls), it would check a static bool that would
indicate if the programme is in initialisation mode or in running mode.
If in running mode, it could assume that everything is initialised and
not worry about using it. If in initialisation mode, it would determine
if that object is initialised and if not, initialise it if and only if
it is about to call a member function.

As for passing pointers and references around, that is not a concern
unless it is referencing a class that is instantiated in the static
space and only if a member function or friend is called on the reference
or pointer. In which case, if in running mode assume all is well,
otherwise it is in initialisation mode so it would have to determine if
the pointer points to the static space. If it is not, then assume all
is well (as this has been allocated in the heap or is some io object),
otherwise determine if that object is not initialised and initialise it
if it is not.

You say 'consider the cost'. Well, what is that cost right now when you
call:

obj_t& getObj()
{
    static obj_t* pObj = new obj_t();
    return *pObj;
}

// -- or --

obj_t& getObj()
{
    static obj_t obj;  // not 'obj_t obj();', which would declare a function
    return obj;
}

You think that is free? It is just the same as what I am describing as
that code requires the 'has it been constructed yet?' logic in it as
well. You are losing very little if anything in speed (_slight_
pointer speed reduction for pointers to classes that have been declared
in the static space) in running mode. There will be a little more
overhead in initialisation mode, but then initialisation is usually a
little slow to begin with and that overhead will be negligible in
comparison. However you are gaining assured initialisation. What
exactly is the cost of running down an initialisation bug? Or any bug
for that matter that is beyond the control of the programmer? A lot
higher than one that they are in control of, that is for sure.

Is it easy? Not like writing a Hello World programme, but writing good
programmes/algorithms rarely is. And since this is the foundation of
what we are coding on, then I don't give a flying f***. I say, "Do it
right!"

I think this would be a completely unacceptable overhead.

There are many kinds of overhead; running overhead is only a small
part. Maintenance is far greater. If a programme that only just works
is modified and then breaks because of some initialisation problem,
that could be a far worse overhead in terms of cost of maintenance.

However, if you really think it is, there is always execution code path
analysis which is used in optimisers. This too could be used to ensure
proper dependency initialisation. You said:

This may be partially true, but not entirely. Yeah you may not know
which execution path is going to be taken, but you can still setup a
dependency tree and use the methods that I have already described for
the rest 'indeterminable' paths, which would reduce the overhead that I
stated *by far*.

It can be done, it should be done. About my remark about
'representatives of the major compiler writers work on the C++ standards
board' being lazy. I'm probably wrong, they're not lazy, they just
don't want to take responsibility.


Adrian
 

John Harrison

Adrian said:
_Assuming_ that the compiler is stupid and cannot optimise, the cost is
still not that great in comparison to code that works cleanly.

Well I guess the point is that the compiler cannot make this kind of
optimisation. Only the linker, which sees the whole program, can.
Traditionally linkers are dumb. They have to cope with many different
languages so have a pretty simple model of how a program works. C++
specific optimisations are not part of traditional linkers.

It's true that more sophisticated linkers are now on the market. I don't
know of any that do the kind of optimisation that you're suggesting, but
it would be possible. In fact this discussion has increased my
understanding of what the C++ standard says and why. They have
deliberately left the door open for exactly what you are proposing. But
I don't know of any compiler/linker that actually does this kind of thing.
However, compilers are not that stupid anymore. It should know enough
that an object 'A' is static (it is initialising it after all) or not.
And when you touch another static object from that constructor (or
member function that it calls),

or global function that it calls. Almost no code can be assumed to not
require this kind of checking.

it would check a static bool that would
indicate if the programme is in initialisation mode or in running mode.
If in running mode, it could assume that everything is initialised and
not worry about using it. If in initialisation mode, it would determine
if that object is initialised and if not, initialise it if and only if
it about to call a member function.

As for passing pointers and references around, that is not a concern
unless it is referencing an class that is instantiated in the static
space and only if a member function or friend is called on the reference
or pointer.

Any form of pointer dereference would require a check. The compiler
cannot know if a particular class is instantiated in the static space
because it doesn't have access to the whole program, so it must include
the check on every pointer dereference. The linker could then remove the
check, but traditional linkers don't.

In which case, if in running mode assume all is well,
otherwise it is in initialisation mode so it would have to determine if
the pointer points to the static space. If it is not, then assume all
is well (as this has been allocated in the heap or is some io object),
otherwise determine if that object is not initialised and initialise it
if it is not.

You say 'consider the cost'. Well, what is that cost right now when you
call:

obj_t& getObj()
{
    static obj_t* pObj = new obj_t();
    return *pObj;
}

// -- or --

obj_t& getObj()
{
    static obj_t obj;  // not 'obj_t obj();', which would declare a function
    return obj;
}

You think that is free? It is just the same as what I am describing as
that code requires the 'has it been constructed yet?'

Yes it is, but you've chosen to accept that cost by wrapping your global
access in a function. This is a good situation, if you want the safety
and you're prepared to accept the cost, then there is a way that you can.

logic in it as
well. You are losing very little if anything in speed (_slight_
pointer speed reduction for pointers to classes that have been declared
in the static space) in running mode. There will be a little more
overhead in initialisation mode, but then initialisation is usually a
little slow to begin with and that overhead will be negligible in
comparison. However you are gaining assured initialisation. What
exactly is the cost of running down an initialisation bug? Or any bug
for that matter that is beyond the control of the programmer? A lot
higher than one that they are in control, that is for sure.

Is it easy? Not like writing a Hello World programme, but writing good
programmes/algorithms rarely are. And since this is the foundation of
what we are coding on, then I don't give a flying f***. I say, "Do it
right!"


There are lots of different overhead, running overhead is only a small
part. Maintenance is far greater. If a programme is modified which is
written on the edge of working and someone modifies it and it breaks
because of some initialisation problem, that could be far worse overhead
in terms of cost of maintenance.

However, if you really think it is, there is always execution code path
analysis which is used in optimisers. This too could be used to ensure
proper dependency initialisation. You said:


This may be partially true, but not entirely. Yeah you may not know
which execution path is going to be taken, but you can still setup a
dependency tree and use the methods that I have already described for
the rest 'indeterminable' paths, which would reduce the overhead that I
stated *by far*.

It can be done, it should be done. About my remark about
'representatives of the major compiler writers work on the C++ standards
board' being lazy. I'm probably wrong, they're not lazy, they just
don't want to take responsibility.

Your original question was 'why does this problem even exist?' I didn't
know the answer when you first asked. This discussion has meant I have a
much better understanding of why it is so. I guess we are going to have
to disagree on whether the current situation is a good thing or not.
Maybe a future revision of the standard will make your suggestions
compulsory.


john
 

Adrian Hawryluk

John said:
Well I guess the point is that the compiler cannot make this kind of
optimisation. Only the linker, which sees the whole program, can.
Traditionally linkers are dumb. They have to cope with many different
languages so have a pretty simple model of how a program works. C++
specific optimisations are not part of traditional linkers.

It's true that more sophisticated linkers are now on the market. I don't
know of any that do the kind of optimisation that you're suggesting, but
it would be possible. In fact this discussion has increased my
understanding of what the C++ standard says and why. They have
deliberately left the door open for exactly what you are proposing. But
I don't know of any compiler/linker that actually does this kind of thing.

True. But there are compilers that generate an executable. Are all of
these compilers executing a linker programme? If not, then the compiler
can do the smarts. Otherwise, perhaps you are right in that the linker
would have to become smarter, and/or the object format would have to
change to describe dependencies to simplify this work.
or global function that it calls. Almost no code can be assumed to not
require this kind of checking.

Yeah, I meant to say any function that it calls that operates on a
static object.
it would check a static bool that would

Any form of pointer dereference would require a check. The compiler
cannot know if a particular class is instantiated in the static space
because it doesn't have access to the whole program, so it must include
the check on every pointer dereference. The linker could then remove the
check, but traditional linkers don't.

You are right, this sounds like it would have to be done on the linker side.
In which case, if in running mode assume all is well,

Yes it is, but you've chosen to accept that cost by wrapping your global
access in a function. This is a good situation, if you want the safety
and you're prepared to accept the cost, then there is a way that you can.

No, I've chosen to accept that accessing a static should by default have
this type of semantics.
logic in it as

Your original question was 'why does this problem even exist' I didn't
know the answer when you first asked. This discussion has meant I have a
much better understanding of why it is so. I guess we are going to have
to disagree on whether the current situation is a good thing or not.
Maybe a future revision of the standard will make your suggestions
compulsory.

I definitely have a better grasp as well. It is not a trivial problem,
just a fairly difficult one, and one that most likely needs to be
deferred to the linker. The object file would also probably have to
somehow contain semantics of dependencies for this to be resolved fully.

Thanks for the discussion.


Adrian
 
