Can initialization of static class members be forced before main?

Discussion in 'C++' started by Carsten Fuchs, Apr 3, 2008.

  1. Dear group,

    is it possible to guarantee that a static member of a class that is in a different compilation unit
    than main(), is still initialized before main()?

    Details:
    My intention is to have "self registering" classes, by having them have a static member object named
    "typeinfo" whose constructor adds itself to a global list. I've googled, and the relevant quote from
    the standard seems to be:
    That is, whenever the initialization is deferred (as I can reproduce with VC++ 2005), my classes
    fail to register themselves and so they're missing from the global list.

    Is there a way to make sure that such initialization is not deferred (either for all nonlocal
    objects with static storage duration, or just for a set of selected ones), but completed before
    main() begins?
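A minimal sketch of the self-registration idiom described above, kept in a single translation unit. The names `TypeInfoT`, `GetRegisteredClasses` and `ButtonT` are hypothetical stand-ins, not the poster's actual code; only `WindowT::typeinfo` and `classname` appear later in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the "typeinfo" class from the post.
struct TypeInfoT
{
    std::string classname;

    explicit TypeInfoT(const std::string& name);
};

// The global list, constructed on first use so that it already exists
// when the first TypeInfoT constructor appends to it.
std::vector<const TypeInfoT*>& GetRegisteredClasses()
{
    static std::vector<const TypeInfoT*> list;
    return list;
}

// The constructor registers the object in the global list.
TypeInfoT::TypeInfoT(const std::string& name)
    : classname(name)
{
    GetRegisteredClasses().push_back(this);
}

class WindowT
{
public:
    static const TypeInfoT typeinfo;
};

class ButtonT
{
public:
    static const TypeInfoT typeinfo;
};

// Dynamic initialization of these static members performs the registration.
const TypeInfoT WindowT::typeinfo("WindowT");
const TypeInfoT ButtonT::typeinfo("ButtonT");

std::size_t CountRegisteredClasses()
{
    return GetRegisteredClasses().size();
}
```

Within one translation unit the members are initialized in definition order, so `WindowT` is registered first. The question in this thread is what happens when the definitions live in other translation units.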

    Thank you very much for your help!

    Best regards,
    Carsten
     
    Carsten Fuchs, Apr 3, 2008
    #1

  2. Your problem is not main() but the initialization order of the static
    objects. Your repository is static as well, and there is no guarantee that
    this global list is initialized before the classes that register with it.

    Either do the lifetime-management of this singleton (the static list) on
    your own by providing a getInstance() method or ensure that the insert
    to the list is done by a function in the same translation unit where the
    list is defined.


    Marcel
     
    Marcel Müller, Apr 3, 2008
    #2

  3. Hi Marcel,
    Oh, I'm already using a getInstance() method for accessing the list in order to avoid static
    initialization order problems, sorry for not having pointed that out in my earlier post.

    The problem is that some of my classes just aren't in that list after main() has begun.
    This is because they are in translation units of their own, and the compiler apparently deferred
    their initialization until before first use, rather than initializing all of them before main().

    In fact, when a class is missing in the list after main() has begun, and I then actually and
    explicitly *use* the static typeinfo member of that class, I can see how the constructor of that
    typeinfo member is run (because it adds itself to the global list and prints some debug output).

    Pseudocode:

    int main()
    {
        print(ListOfRegisteredClasses);   // Most are missing.

        // Explicitly use one of the typeinfos of a missing class:
        std::cout << WindowT::typeinfo.classname;

        // Verify that the ctor of WindowT::typeinfo has run:
        print(ListOfRegisteredClasses);   // WindowT is there now.

        return 0;
    }

    ("Using" the static typeinfo member is by the way not as easy as it seems, a real memory read or
    write operation must be involved, or otherwise the compiler is clever enough to just do nothing and
    to further defer the ctor call. I.e., (void)WindowT::typeinfo and similar statements like simple
    if-tests just do nothing.)

    In summary, what is actually needed is a way to have the compiler and/or linker initialize all
    nonlocal objects with static storage duration before main() begins, without deferring the
    initialization until before the first use...
    Ideas?

    Best,
    Carsten
     
    Carsten Fuchs, Apr 3, 2008
    #3
  4. Carsten Fuchs

    acehreli Guest

    Have a static object in main's translation unit, which explicitly
    calls those functions in the other translation units so that the other
    statics are constructed before main().

    int init_main_static()
    {
        // call other translation unit functions like blah_getInstance()
        return 0;
    }

    static const int main_static = init_main_static();

    int main()
    {
        /* ... */
    }

    Ali
     
    acehreli, Apr 3, 2008
    #4
  5. Hi,

    Oh, that's something different.

    In fact you won't get around an explicit list of all required classes,
    because what you expect is a list of all (or some) classes, including
    ones that may never be used. No compiler will ever be able to do that
    job for you.

    I'd like to ask why you need a class to be in the list if it has not
    been used so far. This implies an implicit dependency of the list (or of
    some other object that depends on the list) on the registered classes.
    Obviously this dependency exists in some magic way that the compiler
    does not know about.


    Marcel
     
    Marcel Müller, Apr 3, 2008
    #5
  6. Carsten Fuchs

    James Kanze Guest

    In practice, it is guaranteed, and a lot of code counts on it.
    At least, as long as everything is statically linked; static
    objects in dynamic linked modules will not be initialized before
    the module is loaded (for obvious reasons). At least on the
    systems I use (Solaris and Linux), they will be initialized
    before you return from the dlopen which loads the module,
    however. (But you'll have to count on system by system
    guarantees here. C++ doesn't know about dynamic linking, and
    Posix doesn't know about C++ and dynamic initialization of
    static objects.)
    I do similar things a lot. Basically, the registry has to use
    the singleton pattern, since you can't otherwise guarantee that
    it is constructed before the other static objects (even if all
    are constructed before main).
    To the best of my knowledge, there are no systems where the
    initialization is "deferred", in the sense above. In fact, I
    rather doubt that it is possible, since the "deferred"
    initialization imposes a topological sort---even in the case of
    cycles.
    In the scenario you described above, you have a problem even if
    all of the initialization occurs before entering main.
    Quality of implementation. No known compiler defers it, and no
    new compiler would dare to defer it, given the amount of code
    that would break.
     
    James Kanze, Apr 3, 2008
    #6
  7. Carsten Fuchs

    James Kanze Guest

    That's strange, because I've used this idiom with many
    compilers, and I can assure you that it works. Very well, in
    fact.

    Are you sure that everything is correctly linked in, and that
    you're not trying to do any funny business with dynamic linking?
    (I've also made it work with dynamic linking, but in those
    cases, it was done intentionally, to control dynamically what
    was available in the list, and to be able to add modules to the
    list without stopping the application.)
    It sounds like some sort of implicit dynamic loading is
    occurring.
    Most of my experience is under Unix systems, but the little I've
    done under Windows seems to work as well. Just make sure that
    the relevant code is statically linked.
     
    James Kanze, Apr 3, 2008
    #7
  8. On Windows, if the global list and the self-registering classes are in
    different libraries (and if they are DLLs, as James stated), code
    like this:

    AggregateEntity.cpp
    -----------------------
    int AggregateEntityobjinit = 0;

    namespace
    {
        DTSBaseEntity* createAggregateEntity()
        {
            return new AggregateEntity;
        }

        const bool registered =
            ObjFactory::Instance().Register("AggregateEntity", createAggregateEntity);
    }

    will fail to work, since every DLL has its own copy of ObjFactory. So
    you have to pass the singleton from the DLL to the main module:

    DLLEXPORT ObjFactory* initFactory()
    {
        // This will force the registration. (I think it's connected to
        // the last sentence in 3.6.2/1.)
        AggregateEntityobjinit = 0;
        return &ObjFactory::Instance();
    }

    This way you can get all classes registered. Note that this is only
    required if registration takes place in a DLL; you won't need it if it
    happens in a static library or in the executable itself.

    Hope it helps
    Hurcan Solter
     
    hurcan solter, Apr 4, 2008
    #8
  9. Ok, thx, that is of course what I'll have to resort to if nothing else works... I had hoped there is
    a way to avoid referring to the objects in the other translation units explicitly.

    Best,
    Carsten
     
    Carsten Fuchs, Apr 4, 2008
    #9
  10. Hi Marcel,
    Ok, I understand that, although please note that James Kanze said quite the opposite in another post
    in this thread.
    All I want is for the compiler to initialize whatever nonlocal static objects are there, without
    it figuring out which objects are (or appear to be) never used and thus saving or deferring the
    work of initialization.
    Well, each list element is a "typeinfo" about a class. A typeinfo contains the name string of the
    class, a unique and "stable" type number, a callback function pointer for creating an instance of
    its class, and other information which altogether represents the graph of an entire class
    inheritance hierarchy.

    I use this information very intensively, e.g. for:
    - being able to instantiate classes from class name string (*very* useful for embedded scripting
    language support),
    - being able to instantiate classes from type/class number (useful for serialization over network or
    to disk).
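A minimal sketch of how such a typeinfo can support both uses listed above, assuming a common base class. All names here (`BaseT`, `CreateFuncT`, `ByName`, `ByNumber`, `CreateByName`, `CreateByNumber`, and the type number 1) are hypothetical and not taken from the poster's code.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical common base class of all self-registering classes.
class BaseT
{
public:
    virtual ~BaseT() {}
    virtual std::string Name() const = 0;
};

typedef BaseT* (*CreateFuncT)();

// Name string, stable type number and creation callback, as described above.
struct TypeInfoT
{
    std::string classname;
    unsigned    typenr;
    CreateFuncT create;

    TypeInfoT(const std::string& name, unsigned nr, CreateFuncT func);
};

// Lookup tables, constructed on first use.
std::map<std::string, const TypeInfoT*>& ByName()
{
    static std::map<std::string, const TypeInfoT*> table;
    return table;
}

std::map<unsigned, const TypeInfoT*>& ByNumber()
{
    static std::map<unsigned, const TypeInfoT*> table;
    return table;
}

TypeInfoT::TypeInfoT(const std::string& name, unsigned nr, CreateFuncT func)
    : classname(name), typenr(nr), create(func)
{
    ByName()[classname] = this;
    ByNumber()[typenr]  = this;
}

class WindowT : public BaseT
{
public:
    std::string Name() const { return "WindowT"; }
    static BaseT* Create() { return new WindowT; }
    static const TypeInfoT typeinfo;
};

const TypeInfoT WindowT::typeinfo("WindowT", 1, WindowT::Create);

// Instantiate from a class name, e.g. one that comes from a script.
BaseT* CreateByName(const std::string& name)
{
    std::map<std::string, const TypeInfoT*>::const_iterator it = ByName().find(name);
    if (it == ByName().end()) return NULL;
    return it->second->create();
}

// Instantiate from a type number, e.g. one read from a network message.
BaseT* CreateByNumber(unsigned nr)
{
    std::map<unsigned, const TypeInfoT*>::const_iterator it = ByNumber().find(nr);
    if (it == ByNumber().end()) return NULL;
    return it->second->create();
}
```

Both lookups return NULL for an unknown class, which is also what the caller sees when a class never got registered.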

    Of course neither the compiler nor the executable can know in advance which classes will be
    instantiated, because that depends on the user-generated script and/or incoming network messages.

    The fact that my classes and thus their static "typeinfo" members end up somewhat isolated in their
    own translation units is just a (desired) side effect, and in a sense this means that the compiler's
    conclusion that the classes are never used is wrong.
    What I hoped for was a way to overcome this wrong conclusion, and to have the compiler just
    initialize all static objects that are there.

    Best,
    Carsten
     
    Carsten Fuchs, Apr 4, 2008
    #10
  11. Hi James,

    Well, I was surprised, too, because quite the contrary is true:
    The problems that I have described occur in a normal, statically linked Win32 executable.

    In a DLL, where I've implemented exactly the same idiom, it works without problems, everyone
    registers before the DLL begins (the DLL uses the very same code for the static "typeinfo" class
    members, just with another class hierarchy that registers itself to another list that is global only
    to the DLL).

    The same DLL is also compiled under Linux in a .so library, which also works.

    I've not yet compiled and tested the exe that causes the problems under Linux though, will do so soon.
    Indeed, but as mentioned above, this is a statically linked Win32 exe.
    Which is, as mentioned above, the very problem...

    It actually seems as if the VC++ 2005 compiler takes the freedom to defer the initialization of
    nonlocal objects with static storage duration in other translation units, just as is described in
    the standard.

    Best regards,
    Carsten
     
    Carsten Fuchs, Apr 4, 2008
    #11
  12. Hi,

    Yeah, I understand all that, and with DLLs and shared objects, that's my experience, too.
    That's also why I was so surprised to find it differently with a statically linked Win32 exe.
    Yes, that's clear.
    Hmm. I don't know enough in this regards, unfortunately.
    I don't understand this. Why?

    If every static class member was initialized, everyone's ctor would have run, and thus everyone would
    have registered itself with the global list. Then, the global list (which uses the singleton
    pattern) would be complete (contain all classes) when main() begins.
    Well, although I see your point, other posters in this thread have argued to the contrary. Both
    lines of argument (that the deferral does and does not happen) sound plausible to me...

    I guess I'll test the same code when built under Linux later, and I guess I'll also ask this at
    microsoft.public.vc.language

    Best,
    Carsten
     
    Carsten Fuchs, Apr 4, 2008
    #12
  13. Hi Hurcan,

    Well, I understand the additional issues when DLLs are involved, but the problem that I describe
    occurs with a statically linked executable, no DLLs involved. I've observed and tested this with
    VC++ 2005.

    Best,
    Carsten
     
    Carsten Fuchs, Apr 4, 2008
    #13
  14. Carsten Fuchs

    James Kanze Guest

    [...]
    Well, I don't have my full environment up and running under
    Windows at present, but I have recently installed enough to give
    it a quick try, and with Windows 2008 (that's the version I
    think I've got here), I couldn't reproduce the symptom.

    Are you sure you're telling the linker to incorporate all of
    your modules in the final binary? (I ask, because just putting
    them in a library isn't sufficient---by definition, components
    in a library are only included if they resolve an unresolved
    external. You have to link the .obj files themselves. And the
    reason it works with a DLL, of course, is because despite its
    name, a DLL isn't a library, but an object file.)

    FWIW: I compiled the following code attached below with
    cl -EHs -GR main.cc [A-Z]*.cc
    and it displays TypeOne when run, as expected. The only
    compiler I have here is VC++ 2008, but I've moved the code into
    my working partition on the shared file system, so it will show
    up Monday at work, where I have a VC++ 2005. In the meantime,
    you can experiment with it, and try to see what you are doing
    differently. (The first line of each file is an identical
    delimiter, so you shouldn't have any problem breaking it up into
    files.)

    /****************************************************************************/
    /* File:      Registry.hh */
    /* Author:    J. Kanze */
    /* Date:      04/04/2008 */
    /* Copyright (c) 2008 James Kanze */
    /* ------------------------------------------------------------------------ */
    //[email protected] Registry.hh

    #ifndef Registry_hh_20080404fClvsiRjn9XbKqb7dfNOiMwl
    #define Registry_hh_20080404fClvsiRjn9XbKqb7dfNOiMwl

    #include <map>
    #include <string>
    #include <typeinfo>
    #include "TypeIdWrap.hh"

    class RegisteredObject
    {
    public:
        virtual ~RegisteredObject() {}
        virtual std::string id() const = 0 ;

    protected:
        RegisteredObject( std::type_info const& id ) ;
    } ;

    class Registry
    {
    public:
        typedef std::map< TypeIdWrap, RegisteredObject* > Map ;

        static Registry& instance() ;
        void enrol( std::type_info const& key,
                    RegisteredObject& obj ) ;
        RegisteredObject* get( std::type_info const& key ) const ;
        Map::const_iterator begin() const ;
        Map::const_iterator end() const ;

    private:
        Registry() ;
        Registry( Registry const& other ) ;
        Registry& operator=( Registry const& ) ;

        Map myRegistry ;
    } ;
    #endif
    // Local Variables: --- for emacs
    // mode: c++ --- for emacs
    // tab-width: 8 --- for emacs
    // End: --- for emacs
    // vim: set ts=8 sw=4 filetype=cpp: --- for vim
    /****************************************************************************/
    /* File:      Registry.cc */
    /* Author:    J. Kanze */
    /* Date:      04/04/2008 */
    /* Copyright (c) 2008 James Kanze */
    /* ------------------------------------------------------------------------ */

    #include "Registry.hh"

    #include <assert.h>
    #include <stddef.h>     // NULL

    RegisteredObject::RegisteredObject(
        std::type_info const& id )
    {
        Registry::instance().enrol( id, *this ) ;
    }

    Registry&
    Registry::instance()
    {
        static Registry theOneAndOnly ;
        return theOneAndOnly ;
    }

    void
    Registry::enrol(
        std::type_info const& key,
        RegisteredObject& obj )
    {
        TypeIdWrap wrap( key ) ;
        assert( myRegistry.find( wrap ) == myRegistry.end() ) ;
        myRegistry.insert( Map::value_type( wrap, &obj ) ) ;
    }

    RegisteredObject*
    Registry::get(
        std::type_info const& key ) const
    {
        Map::const_iterator entry = myRegistry.find( TypeIdWrap( key ) ) ;
        return entry == myRegistry.end()
            ? NULL
            : entry->second ;
    }

    Registry::Map::const_iterator
    Registry::begin() const
    {
        return myRegistry.begin() ;
    }

    Registry::Map::const_iterator
    Registry::end() const
    {
        return myRegistry.end() ;
    }

    Registry::Registry()
    {
    }
    // Local Variables: --- for emacs
    // mode: c++ --- for emacs
    // tab-width: 8 --- for emacs
    // End: --- for emacs
    // vim: set ts=8 sw=4 filetype=cpp: --- for vim
    /****************************************************************************/
    /* File:      TypeIdWrap.hh */
    /* Author:    J. Kanze */
    /* Date:      04/04/2008 */
    /* Copyright (c) 2008 James Kanze */
    /* ------------------------------------------------------------------------ */
    //[email protected] TypeIdWrap.hh

    #ifndef TypeIdWrap_hh_20080404C8jumjuDkbecW3ml8ozcnAgh
    #define TypeIdWrap_hh_20080404C8jumjuDkbecW3ml8ozcnAgh

    #include <typeinfo>

    class TypeIdWrap
    {
    public:
        TypeIdWrap( std::type_info const& id ) ;
        bool operator<( TypeIdWrap const& other ) const ;

    private:
        std::type_info const* myId ;
    } ;
    #endif
    // Local Variables: --- for emacs
    // mode: c++ --- for emacs
    // tab-width: 8 --- for emacs
    // End: --- for emacs
    // vim: set ts=8 sw=4 filetype=cpp: --- for vim
    /****************************************************************************/
    /* File:      TypeIdWrap.cc */
    /* Author:    J. Kanze */
    /* Date:      04/04/2008 */
    /* Copyright (c) 2008 James Kanze */
    /* ------------------------------------------------------------------------ */

    #include "TypeIdWrap.hh"

    TypeIdWrap::TypeIdWrap(
        std::type_info const& id )
        : myId( &id )
    {
    }

    bool
    TypeIdWrap::operator<(
        TypeIdWrap const& other ) const
    {
        return myId->before( *other.myId ) ;
    }
    // Local Variables: --- for emacs
    // mode: c++ --- for emacs
    // tab-width: 8 --- for emacs
    // End: --- for emacs
    // vim: set ts=8 sw=4 filetype=cpp: --- for vim
    /****************************************************************************/
    /* File:      TypeOne.hh */
    /* Author:    J. Kanze */
    /* Date:      04/04/2008 */
    /* Copyright (c) 2008 James Kanze */
    /* ------------------------------------------------------------------------ */
    //[email protected] TypeOne.hh

    #ifndef TypeOne_hh_20080404tcojigCiecDjm3obshpJiinh
    #define TypeOne_hh_20080404tcojigCiecDjm3obshpJiinh

    #include "Registry.hh"

    class TypeOne : public RegisteredObject
    {
    public:
        TypeOne() ;
        virtual std::string id() const ;
    } ;
    #endif
    // Local Variables: --- for emacs
    // mode: c++ --- for emacs
    // tab-width: 8 --- for emacs
    // End: --- for emacs
    // vim: set ts=8 sw=4 filetype=cpp: --- for vim
    /****************************************************************************/
    /* File:      TypeOne.cc */
    /* Author:    J. Kanze */
    /* Date:      04/04/2008 */
    /* Copyright (c) 2008 James Kanze */
    /* ------------------------------------------------------------------------ */

    #include "TypeOne.hh"

    TypeOne::TypeOne()
        : RegisteredObject( typeid( TypeOne ) )
    {
    }

    std::string
    TypeOne::id() const
    {
        return "TypeOne" ;
    }

    TypeOne myInstance ;
    // Local Variables: --- for emacs
    // mode: c++ --- for emacs
    // tab-width: 8 --- for emacs
    // End: --- for emacs
    // vim: set ts=8 sw=4 filetype=cpp: --- for vim
    /****************************************************************************/
    /* File:      main.cc */
    /* Author:    J. Kanze */
    /* Date:      04/04/2008 */
    /* Copyright (c) 2008 James Kanze */
    /* ------------------------------------------------------------------------ */

    #include <iostream>
    #include <string>
    #include "Registry.hh"

    int
    main()
    {
        Registry const& r = Registry::instance() ;
        for ( Registry::Map::const_iterator i = r.begin() ;
              i != r.end() ;
              ++ i ) {
            std::cout << i->second->id() << std::endl ;
        }

        return 0 ;
    }
    // Local Variables: --- for emacs
    // mode: c++ --- for emacs
    // tab-width: 8 --- for emacs
    // End: --- for emacs
    // vim: set ts=8 sw=4 filetype=cpp: --- for vim
     
    James Kanze, Apr 4, 2008
    #14
  15. Hi James,

    first of all, thank you very much for your patience and help!

    I've thought about and experimented with this over the weekend: You are right! :) I put my .objs
    into a library, then used that for linking. When I didn't explicitly refer somewhere to the classes
    in these modules, they were not linked in at all and thus the constructor of their static members
    obviously cannot be run (especially not before main()).
    When I supply the .obj files directly as input to the linker, everything works as expected.

    That was the crucial tip!

    In hindsight, it sounds logical, but can you please tell me where or how I can learn more about such
    issues? I tried MSDN before (the LINK and LIB documentation), as well as the "Linkers and Loaders"
    book by Levine, but none of them gave me sufficient insight, or even a hint.

    I think that with the GNU linker, the command line option --whole-archive allows me to keep the
    object files in a library and still have them all included in the final binary, even though they're
    not immediately required for resolving something.
    Do you happen to know the equivalent for the Visual C++ linker? I've searched MSDN, but there seems
    to be no equivalent linker switch?
    Instead of "object file", you meant "(kind of) executable file", didn't you?
    The lib vs. obj treatment was the issue; I never suspected that they were treated differently (I
    thought that supplying a lib to the linker is just equivalent to enumerating the contained obj files)...

    Again, a thousand thanks for your help!! :)

    Best regards,
    Carsten
     
    Carsten Fuchs, Apr 7, 2008
    #15
  16. Hello all,

    I would like to conclude this thread with a short summary of (my understanding of) the results:

    As James explained, most(?) linkers (at least on Windows and Unix platforms) include all the symbols
    that are in object files (.obj, .o) into the executable. For static libraries (.lib, .a) this is
    true only for those that resolve otherwise unresolved externals.

    Such symbols of static libraries can however be "forced" into the executable with linker-specific
    means: The GNU linker has the --whole-archive command-line option for this purpose, the Visual C++
    linker supports #pragma comment(linker, "/INCLUDE ...") and the /INCLUDE command-line option
    (searching microsoft.public.vc.language for "linker /include force" yields plenty of related posts).
    /INCLUDE however requires a symbol name as its parameter (possibly mangled in C++), and thus is less
    "comfortable" and "stable" than --whole-archive.

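A rough sketch of the /INCLUDE approach mentioned above, shown in one file for brevity. The anchor symbol `ForceLink_WindowT` is hypothetical; in a real project it would be defined in the library's object file next to the self-registering class, and the pragma would sit in code that is always linked. The sketch assumes the usual MSVC convention that C symbols carry a leading underscore on 32-bit x86 and none on other targets; other compilers skip the pragma entirely.

```cpp
#include <cassert>

// Hypothetical anchor symbol. extern "C" avoids C++ name mangling, so the
// name passed to /INCLUDE stays predictable.
extern "C"
{
    int ForceLink_WindowT = 0;
}

#if defined(_MSC_VER)
  #if defined(_M_IX86)
    // 32-bit x86: C symbols are decorated with a leading underscore.
    #pragma comment(linker, "/INCLUDE:_ForceLink_WindowT")
  #else
    #pragma comment(linker, "/INCLUDE:ForceLink_WindowT")
  #endif
#endif

// Only here so that the sketch has something observable.
int ReadAnchor()
{
    return ForceLink_WindowT;
}
```

Forcing the anchor symbol into the link pulls in the whole object file that defines it, including any static objects whose constructors do the registration.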
    I can see only two portable solutions:

    a) Have each module have an Init() method that is called from a place that is known to be covered,
    e.g. at the beginning of main(). This seems to be the most portable and most reliable solution, but
    requires that the "self-registering classes" are explicitly enumerated once more in the place that
    calls all the Init() functions.

    b) Pass the .obj files to the linker individually, rather than having them combined in a .lib. This
    might not easily be feasible for everyone though, but the big plus over "a)" above seems to be that
    if the "library of self-registering classes" allows the user code to derive its own self-registering
    classes, and thus augment the library's set of self-registering classes, then this approach looks
    like the one that comes with the least confusion and clutter.
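Option a) could look roughly like the following sketch. The module namespaces, `InitAllModules` and `GetRegisteredClasses` are hypothetical names; in a real project each `Init()` would be defined in its own translation unit inside the static library, and the call from main()'s side is what makes the linker pull that object file in.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Global list, constructed on first use.
std::vector<std::string>& GetRegisteredClasses()
{
    static std::vector<std::string> list;
    return list;
}

// Normally in WindowT.cpp inside the library.
namespace WindowModule
{
    void Init() { GetRegisteredClasses().push_back("WindowT"); }
}

// Normally in ButtonT.cpp inside the library.
namespace ButtonModule
{
    void Init() { GetRegisteredClasses().push_back("ButtonT"); }
}

// The one place that enumerates all modules explicitly. Intended to be
// called at the beginning of main(); repeated calls do nothing.
void InitAllModules()
{
    static bool done = false;
    if (done) return;
    done = true;

    WindowModule::Init();
    ButtonModule::Init();
}
```

The cost is the one mentioned above: every new self-registering class needs another line in `InitAllModules()`.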

    Best regards,
    Carsten
     
    Carsten Fuchs, Apr 7, 2008
    #16
  17. Carsten Fuchs

    James Kanze Guest

    [...]
    It really should have occurred to me from the first; it seems to
    be a very common error. (And let me guess that you're a bit
    younger than I am.)
    The school of hard knocks?

    Seriously, I really don't know. The linker documentation when I
    was starting (e.g. for the Interdata 8/32) explained it clearly,
    but most of the documentation today seems to be concerned with
    which buttons to click on, rather than what you are actually
    doing. The result is that while it wouldn't occur to anyone of my
    generation to expect files in a library to show up in the
    executable unless they were explicitly mentioned, the problem
    crops up regularly with younger people.
    I haven't seen the "Linkers and Loaders" book by Levine, but that
    sort of surprises me. (The online table of contents gives a
    section "Purpose of Libraries".)
    It sounds like it would do it.
    Not off hand. I wasn't even aware of the GNU option. (Until
    recently, I didn't use the GNU linker, but rather the Sun one.)
    The old Intel linkers had options to force the inclusion of
    specific object files from a library, but I don't remember one
    for forcing them all, and of course, the Microsoft linker isn't
    the old Intel one.

    The solution I've usually adopted in such cases (when I wanted
    to just deliver a library, and have it work as usual for the
    client ) is to generate and compile a special source file
    programmatically, in the makefile; this file would contain an
    external reference to all of the target files, and would be
    referred to explicitly in the file which managed the map. Use
    of the map triggers the inclusion of the map's object file,
    inclusion of the map's object file triggers inclusion of this
    file, and inclusion of this file triggers inclusion of
    everything else.
    Well, it ends up being linked into a larger executable, and it
    doesn't have an entry point. But I think that under Windows,
    it does behave somewhat like an executable. Under Unix, a
    shared object (which isn't necessarily shared, but is
    dynamically loaded), is much more like an object file, but both
    have certain characteristics of object files---in particular,
    they are linked into an executable as a whole.
     
    James Kanze, Apr 8, 2008
    #17
  18. Carsten Fuchs

    James Kanze Guest

    Just a small precision. On all systems I've seen, a (true)
    library is nothing more than a collection of object files,
    possibly with an added index. The granularity of traditional
    linkers is the object file. (I believe the Microsoft linker
    can use a reduced granularity, incorporating some parts of an
    object file, and not others, but this definitely isn't the case
    for the linker I know best---Sun Solaris.) The traditional
    linker only incorporates object files into the executable. It
    treats libraries as a set of conditionally specified object
    files. Most traditional linkers also treat object files and
    libraries in the order they are specified, and once they have
    finished with a library, will not go back and reconsider it;
    Microsoft seems to be the exception here, and will recurse over
    the complete list of libraries several times.

    Dynamic linkers work a bit differently, since they normally only
    process one file at a time, and don't consider what might or
    might not have been defined elsewhere. (This is only partially
    true, at least under Unix. And there are a lot more differences
    in the ways dynamic linking works than in static linking.)
    You don't need an Init() function. Any reference to something
    in the file will do. I typically just create a table of
    addresses to the initialization objects, and refer to this table
    somewhere. And I'll use a shell script invoked from the
    makefile to generate the source code for this table, so the only
    "list" of the concerned files is in the makefile (which needs it
    anyway, in order to know which files to add to the library).
    The second also has the advantage that the files don't even have
    to have a publicly accessible variable. It's definitely the way
    to go if you're working at the application level. If you're
    delivering a library, however, it does add a complication for
    the client, who has to unpack your library into object files,
    and use wildcards in their link command.
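A sketch of the "table of addresses" idea, collapsed into one file. `RegistrarT`, `forceLinkTable` and `CountForcedModules` are hypothetical names; in the setup described above the table would be a generated source file, and each registrar object would be defined in its own source file inside the library.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical registration object, one per source file in the library.
struct RegistrarT
{
    const char* name;
};

// External references to the objects defined in the other source files.
extern RegistrarT windowRegistrar;
extern RegistrarT buttonRegistrar;

// The (generated) table of addresses. Declared extern first so that it
// has external linkage despite being const.
extern RegistrarT* const forceLinkTable[];
RegistrarT* const forceLinkTable[] = { &windowRegistrar, &buttonRegistrar };

const std::size_t forceLinkCount =
    sizeof(forceLinkTable) / sizeof(forceLinkTable[0]);

// Some code that is always linked in (e.g. the registry) refers to the
// table, which in turn drags in every object file named in it.
std::size_t CountForcedModules()
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < forceLinkCount; ++i)
    {
        if (forceLinkTable[i] != NULL) ++count;
    }
    return count;
}

// Normally these definitions live in WindowT.cc and ButtonT.cc.
RegistrarT windowRegistrar = { "WindowT" };
RegistrarT buttonRegistrar = { "ButtonT" };
```

Nothing needs to be called through the table; the unresolved external references alone are enough for the linker to include the object files from the library.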
     
    James Kanze, Apr 8, 2008
    #18