Initialization of unused global variables from static libraries.

Milan Radovich

Hello,

I have 2 files:

a.cc:
---------------------------------------
int main()
{
    return 0;
}
---------------------------------------

b.cc:
---------------------------------------
#include <iostream>

bool foo()
{
    std::cout << "foo" << std::endl;
    return true;
}

const bool bar = foo();
---------------------------------------

If I compile the application like this:

g++ -c a.cc b.cc
g++ a.o b.o -o a

then when the application is executed, it outputs the string "foo".

But if I create a static library from b.cc and then link the application
with that library, like this:

g++ -c a.cc b.cc
ar rcs libb.a b.o
g++ -L. -lb a.o -o a2

then when the application is run, it doesn't print anything.

After searching for the reason, I found that when linking with static
libraries, the linker omits code which is not reachable through any
execution path.

The questions are:

1) Is this correct behaviour according to the Standard?

2) Is there a pattern which forces the linker to add such code to an
executable?


A case where such an idiom is very useful is described in Chapter 8.3 of
Andrei Alexandrescu's "Modern C++ Design".
 
Öö Tiib

1) Is it a correct behaviour according to a Standard?

Yes. The Standard does not describe "modules"; "library" is used only in
the context of the standard libraries, and "linker" is mentioned only in
some remarks. The Standard does not describe how to build custom
libraries, how to interface with them, or what that means. It says
nothing about shared versus static custom libraries and so on. That is
all up to implementations.

2) Is there a pattern which forces the linker to add such code to an
executable?

Usually there are some relations between the modules that you link
together. When you call (or at least make the linker believe that you
might call) something from the translation unit where your bool bar is,
that translation unit gets linked in and bar gets initialized.
 
Milan Radovich

Usually there are some relations between the modules that you link
together. When you call (or at least make the linker believe that you
might call) something from the translation unit where your bool bar is,
that translation unit gets linked in and bar gets initialized.

The question is how this could be done before the static library is
written. That is, I want to write a.cc so that when libb.a is written and
linked with a.o, global variables from libb.a are linked in and
initialized when the application is run.

I don't want to change a.cc every time a new static library is added
to the application.
 
Paul Bibbings

Milan Radovich said:
But if I create a static library from b.cc and then link application
with that library, like that:

g++ -c a.cc b.cc
ar rcs libb.a b.o
g++ -L. -lb a.o -o a2

Although this has no effect in relation to your current issue, you will
need to change the ordering here in the second invocation of g++. As
described in the documentation for gcc, "The linker handles an archive
file by scanning through it for members which define symbols that have
so far been referenced but not defined." What this means is, you have
to be correct as to ordering in supplying libraries and object files to
g++. As you have it above, you are passing g++ libb.a (-lb) *before* it
has seen your object file, a.o. Were you to require any library code
from libb.a in a.o (though you don't in the example given) this would
not be linked in because it would not "have so far been referenced but
not defined" at this point. This needs to be:

g++ -L. a.o -o a2 -lb

(As I have mentioned, the difference is not apparent in the example as
given, but will be an issue in more complete code).

then when the application is run, it doesn't print anything.

After I searched for the reason why this happens, I found that
when linking with static libraries, linker omits code which is not
reachable through any execution path.

The question is:

1) Is it a correct behaviour according to a Standard?

It is correct in the sense that the Standard puts little to no
requirements on static libraries; indeed it hardly mentions them at
all. In effect, it is an implementation detail. The difference you are
experiencing is not an inconsistency either, since you are applying two
different build models and the Standard, as mentioned, puts no
requirements on their `effect' being the same.

2) Is there a pattern which forces the linker to add such code to an
executable?

If I understand it correctly, you can view a static library (an
`archive') as a collection of the object files from which it is
composed. The linker will make the code from any one of the archived
object files available *only* if your program requires the code. By
`required', I mean that your user code references a symbol in that
archive file that is defined there. Currently, your code does not
reference any symbol in libb.a:b.o. The `pattern', then, is to ensure
that it does.
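
The archive rule described above can be made concrete with a toy model. This is only a sketch under stated assumptions: the names ObjectFile, Input and toy_link are hypothetical, and a real linker tracks far more than symbol names. It shows that a plain object file is always linked, while an archive member is taken only if it defines a symbol that is referenced but still undefined at the point where the archive is scanned.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy model only: a real linker tracks far more than symbol names.
struct ObjectFile {
    std::string name;
    std::set<std::string> defines;     // symbols this object file defines
    std::set<std::string> references;  // symbols it needs from elsewhere
};

struct Input {
    bool is_archive;                  // false: a plain .o, always linked in
    std::vector<ObjectFile> members;  // exactly one member for a plain .o
};

// Processes the inputs left to right, as they appear on the command line,
// and returns the names of the object files that end up in the executable.
std::vector<std::string> toy_link(const std::vector<Input>& command_line)
{
    std::set<std::string> defined, undefined;
    std::vector<std::string> linked;

    auto add = [&](const ObjectFile& o) {
        linked.push_back(o.name);
        for (const auto& s : o.defines) {
            defined.insert(s);
            undefined.erase(s);
        }
        for (const auto& s : o.references)
            if (!defined.count(s)) undefined.insert(s);
    };

    for (const auto& in : command_line) {
        if (!in.is_archive) {
            assert(in.members.size() == 1);
            add(in.members.front());
            continue;
        }
        // Archive: keep pulling members that define a symbol which is
        // referenced but not yet defined, until nothing more is needed.
        std::set<std::string> taken;
        bool progress = true;
        while (progress) {
            progress = false;
            for (const auto& m : in.members) {
                if (taken.count(m.name)) continue;
                bool needed = false;
                for (const auto& s : m.defines)
                    if (undefined.count(s)) { needed = true; break; }
                if (needed) {
                    add(m);
                    taken.insert(m.name);
                    progress = true;
                }
            }
        }
    }
    return linked;
}
```

In this model, an a.o that defines only main never causes b.o to be taken from libb.a, whichever order the two are given in. Once a.o references a symbol that b.o defines, "a.o then libb.a" pulls b.o in, whereas "libb.a then a.o" does not.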

Note: this does not necessarily mean that your user code must reference
either bool foo() or const bool bar. It must, however, reference
/something/ in libb.a:b.o, or this won't be linked in. Indeed, why
should it? If you are not actually making any use of /any/ of the code
in libb.a:b.o, then what is the point of that code? Why would you want
const bool bar initialized and bool foo() called in doing so?

A case where such an idiom is very useful is described in Chapter 8.3 of
Andrei Alexandrescu's "Modern C++ Design".

I am not familiar with this work and hence am not aware of how this
idiom is used here. Perhaps if you give a little more context about
what it is this idiom is intended to achieve, more concrete help can be
given.

Regards

Paul Bibbings
 
Öö Tiib

The question is how this could be done before the static library is
written. That is, I want to write a.cc so that when libb.a is written and
linked with a.o, global variables from libb.a are linked in and
initialized when the application is run.

I don't want to change a.cc every time a new static library is added
to the application.

I see. However, what do such singletons do? They are not actively related
to anything already written; they are just statically linked? What
benefit do they provide to the main module? Currently the uncertainty
with lots of threads causes issues with some of Alexandrescu's idioms and
patterns, some of which were probably meant to work well in
single-threaded environments.
 
Paul Bibbings

Milan Radovich said:
The question is how this could be done before the static library is
written. That is, I want to write a.cc so that when libb.a is written and
linked with a.o, global variables from libb.a are linked in and
initialized when the application is run.

I don't want to change a.cc every time a new static library is added
to the application.

I am not sure that this quite makes sense to me. You appear to be
saying that you are writing a library that you are intending to link
into your program but that you don't want to have to change your program
code to actually make use of any of the library code you write.

As I have explained to some extent in a previous reply, as soon as your
user program code requires, in any direct or indirect way, code
incorporated into your library libb.a by the archived object file b.o,
then that object file will be linked in and the proper initialization
(of const bool bar, in this instance) will occur. If no such code is
required by your program, then this object code will not be linked in
and the initialization won't occur; but, of course, you're not using it in
this second scenario, so why should it?

As an example, extending your original code:

// file: b.cc
#include <iostream>

void baz() { }

bool foo()
{
    std::cout << "foo" << std::endl;
    return true;
}

const bool bar = foo();

// file: a.cc
void baz();

int main()
{
    baz();
    return 0;
}

22:07:07 Paul Bibbings@JIJOU
/cygdrive/d/CPPProjects/nano $rm a.o a2.exe b.o libb.a

22:07:28 Paul Bibbings@JIJOU
/cygdrive/d/CPPProjects/nano $g++ -c a.cc b.cc

22:07:46 Paul Bibbings@JIJOU
/cygdrive/d/CPPProjects/nano $ar rcs libb.a b.o

22:07:58 Paul Bibbings@JIJOU
/cygdrive/d/CPPProjects/nano $g++ -L. a.o -o a2 -lb

22:08:11 Paul Bibbings@JIJOU
/cygdrive/d/CPPProjects/nano $./a2
foo

Note that the added code does not directly make use of either bool
foo(), or const bool bar. It merely makes use of *a* symbol defined in
libb.a:b.o and that object file is linked in from the archive and
initialization of const bool bar occurs, calling bool foo().

To summarize, in the code as given *nothing* in a.cc makes use of any of
the code in b.cc which, in your second build process, is archived as
libb.a:b.o. Initialization of const bool bar does not occur *for this
reason*. Of course, in your actual implementation something /will/ make
use of some code in libb.a:b.o, or why are you building it? Further, it
will almost certainly use code in that object file that *requires* const
bool bar to be initialized, or else why is its initialization important
to you? In this event, then, this initialization will happen and it
will not require any further, specific `pattern' on your part other than
that you "use what your program requires."

Regards

Paul Bibbings
 
Balog Pal

Milan Radovich said:
then when the application is executed, it outputs the string "foo".

But if I create a static library from b.cc and then link the application
with that library, like this:

g++ -c a.cc b.cc
ar rcs libb.a b.o
g++ -L. -lb a.o -o a2

then when the application is run, it doesn't print anything.

1) Is this correct behaviour according to the Standard?

IIRC the mandated behavior is that objects in a TU that appear at namespace
scope will be initialized (in proper order relative to each other) sometime
before any code in that TU gets executed.

That makes both your observed behaviors correct.

2) Is there a pattern which forces the linker to add such code to an
executable?

Sure, call a function in that translation unit. Or use an
implementation-specific way to force an extern/import symbol in one of
your objects.
 
Milan Radovich

I am not familiar with this work and hence am not aware of how this
idiom is used here. Perhaps if you give a little more context about
what it is this idiom is intended to achieve, more concrete help can be
given.

Suppose there is a program which, given a file, performs some action
upon it, and different actions should be performed on different file types.
In order to make the program easily extensible, it has a hierarchy of file
types, where the program operates on an abstract class File by calling its
function Work. For each file type, a new class is created. It inherits from
File and overrides File's private virtual function DoWork (called by Work).

In order to be able to extend the program easily, there is some (singleton)
FilesRegistry which, given a file's descriptor, returns the handler for that
file. When a user wants to add a handler for a new file type, he writes his
handler and registers it with FilesRegistry by calling its static
RegisterHandler function.

Since we don't want the file handler writer to modify the main program
(which may even be non-modifiable, like a framework), it is enough
for the user to define a global variable in his handler's file and initialize
it with the return value of a call to the RegisterHandler function. Since
global variables should be initialized automatically, his handler would be
registered and everything would work.

But...

If, due to some quirk of the compilation environment, a static library has
to be built from the new file handler and only then linked with the main
program, such a clean solution becomes impossible and the user is
forced to modify the main code (by calling some function from his handler
or something like that).

This is undesirable, since modification of the main source code may
introduce bugs, makes the program less modular and complicates
the creation of new file handlers.

We want to avoid those problems.
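
The registration scheme described in this post might look roughly like the following. This is a minimal sketch, not the code from "Modern C++ Design". File, Work, DoWork, FilesRegistry and RegisterHandler come from the description above. Everything else (Create, Table, TextFile, MakeTextFile, and using a string key in place of a file descriptor) is an assumption made for illustration.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// The main program only ever sees this abstract interface.
class File {
public:
    virtual ~File() {}
    std::string Work() { return DoWork(); }
private:
    virtual std::string DoWork() = 0;
};

class FilesRegistry {
public:
    typedef std::function<std::unique_ptr<File>()> Factory;

    // Returns a value so that a global can be initialized with the call.
    static bool RegisterHandler(const std::string& type, Factory factory)
    {
        assert(factory);
        Table()[type] = std::move(factory);
        return true;
    }

    // Hypothetical lookup: returns a null pointer for unknown types.
    static std::unique_ptr<File> Create(const std::string& type)
    {
        auto it = Table().find(type);
        if (it == Table().end()) return std::unique_ptr<File>();
        return it->second();
    }

private:
    // Function-local static: the table exists before the first registration,
    // whatever order the handlers' globals are initialized in.
    static std::map<std::string, Factory>& Table()
    {
        static std::map<std::string, Factory> table;
        return table;
    }
};

// --- what a handler author would put in a separate .cc file ---
namespace {

class TextFile : public File {
    std::string DoWork() override { return "handled text file"; }
};

std::unique_ptr<File> MakeTextFile()
{
    return std::unique_ptr<File>(new TextFile);
}

// The registration happens as a side effect of initializing this global.
const bool registered = FilesRegistry::RegisterHandler("txt", MakeTextFile);

}
```

When the handler part is compiled into the same executable as an ordinary object file, FilesRegistry::Create("txt") finds the handler. When it sits in a static library and nothing references its object file, the global is never initialized and the lookup comes back empty, which is exactly the problem discussed in this thread.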
 
Milan Radovich

Depends on the tools. But in general, your code needs to reference
whatever it is that you want linked in.

This is what I want to avoid in order to decouple those two pieces of
code, see an example in my reply to Paul Bibbings.
 
Milan Radovich

Sure, call a function in that translation unit. Or use an
implementation-specific way to force an extern/import symbol in one of
your objects.

The code should be multiplatform and therefore very general, so
implementation-specific solutions are not suitable. Neither is a function
call, as I just explained in reply to other posts in the thread.
 
Fred Zwarts

Milan Radovich said:
The code should be multiplatform and therefore very general, so
implementation-specific solutions are not suitable. Neither is a
function call, as I just explained in reply to other posts in the
thread.

Libraries and linkers are platform specific, anyhow.
Maybe you can put the import of a symbol in the linker description,
so that you don't need to make your C++ code platform specific,
but only the build procedure, which is already platform specific.
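
As a sketch of what that could look like, assuming a GNU toolchain on Linux: the library exports one unmangled "anchor" symbol (libb_anchor is a hypothetical name), and the build procedure, not a.cc, tells the linker to treat that symbol as undefined so that b.o is pulled out of the archive. The linker options in the comments are GNU ld options; other linkers have their own equivalents, so treat the build lines as an assumption about your environment.

```cpp
// b.cc -- hypothetical variant of the library source from this thread.
#include <cassert>
#include <iostream>

namespace {

bool foo()
{
    std::cout << "foo" << std::endl;
    return true;
}

const bool bar = foo();  // the initialization we want to keep

}

// Hypothetical anchor symbol. extern "C" keeps the name unmangled so the
// build script can refer to it; nothing in a.cc needs to mention it.
extern "C" int libb_anchor()
{
    // Holds when called from main or later: bar is initialized before any
    // function of this translation unit is first used.
    assert(bar);
    return bar ? 1 : 0;
}

// Possible build lines with GNU ld (toolchain-specific, an assumption):
//   g++ -c a.cc b.cc
//   ar rcs libb.a b.o
//   g++ a.o -o a2 -Wl,-u,libb_anchor -L. -lb
// or, to pull in every member of the archive regardless of references:
//   g++ a.o -o a2 -L. -Wl,--whole-archive -lb -Wl,--no-whole-archive
```

The first variant needs one anchor per object file that must survive. The second needs no anchor at all but links every member of libb.a, used or not.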
 
Alf P. Steinbach

* Milan Radovich:
1) Is it a correct behaviour according to a Standard?

No, regardless of how the code is physically packaged.

§3.7.1/2 "If an object of static storage duration has initialization or a
destructor with side effects, it shall not be eliminated even if it appears to
be unused, except that a class object or its copy may be eliminated as specified
in 12.8".

§12.8 then talks about elimination of copy constructor calls, in §12.8/15, which
is not your case.

2) Is there a pattern which forces the linker to add such code to an
executable?

I'd try an object with a constructor.


Cheers & hth.,

- Alf
 
Alf P. Steinbach

* Pete Becker:
Alf said:
* Milan Radovich:

No, regardless of how the code is physically packaged.

It's a subtle point, but the object does not have to be constructed
because it's not part of the program. See "Phases of translation",
[lex.phases], paragraph 1, bullet 9. The second sentence is what matters
here: "Library components are linked to satisfy external references to
entities not defined in the current translation." If there are no
external references to the object, it doesn't have to be linked in (and
arguably should not be linked in).

Generally I wouldn't dream of contradicting you on the basic meaning of the
standard.

However §3.7.1/2 "If an object of static storage duration has initialization or
a destructor with side effects, it shall not be eliminated even if it appears to
be unused, except that a class object or its copy may be eliminated as specified
in 12.8" seems to much more clearly & directly /require/ the initialization.

Especially considering that §3.7.1/2 only makes a difference for an object that
is not referenced, or "unused" as it says, -- that that's what it's
specifically about.

If it doesn't apply to the "unused" objects it says that it applies to, then what?

My only guess for a way that your interpretation could be right would be the
weasel words "appears to" that are used in §3.7.1/2, but what could that refer
to other than appearances wrt. linking?


Cheers,

- Alf
 
Keith H Duggar

Alf said:
* Milan Radovich:
No, regardless of how the code is physically packaged.

It's a subtle point, but the object does not have to be constructed
because it's not part of the program. See "Phases of translation",
[lex.phases], paragraph 1, bullet 9. The second sentence is what matters
here: "Library components are linked to satisfy external references to
entities not defined in the current translation." If there are no
external references to the object, it doesn't have to be linked in (and
arguably should not be linked in).

Yikes! This is new to me. So from time to time I use an idiom
like this:

in file foolib/foo.cpp
<code>
namespace {

class FooGuard {
public:
    FooGuard() {
        // library initialization activity
    }
    ~FooGuard() {
        // library teardown activity
    }
};

FooGuard TheFooGuard;

}
</code>

Obviously TheFooGuard has external linkage but there will be no
"external references" to the object if I understand that term. And
if I understand your point correctly then this idiom is broken??

Thanks for the help!

KHD
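
One way to make such a guard independent of whether its object file happens to be linked in, offered here as an assumption and not as something proposed in the thread, is to construct it on first use from the library's own entry points. The names foolib, detail, guard, init_count and do_something are hypothetical. The counter exists only to make the effect observable.

```cpp
#include <cassert>

namespace foolib {
namespace detail {

int init_count = 0;  // demo-only counter so the effect is observable

class FooGuard {
public:
    FooGuard()  { ++init_count; /* library initialization activity */ }
    ~FooGuard() { /* library teardown activity */ }
};

// Constructed on the first call, destroyed at program exit.
FooGuard& guard()
{
    static FooGuard the_guard;
    return the_guard;
}

}  // namespace detail

// Hypothetical public entry point of the library.
int do_something(int x)
{
    detail::guard();  // ensures initialization has happened
    assert(detail::init_count == 1);
    return 2 * x;
}

}  // namespace foolib
```

The trade-off is that initialization now happens on the first call into the library, not before main, and every entry point has to go through guard(). It does not help the self-registration case, where nothing calls into the library at all.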
 
Fred Zwarts

Pete Becker said:
Alf said:
* Milan Radovich:

No, regardless of how the code is physically packaged.

It's a subtle point, but the object does not have to be constructed
because it's not part of the program. See "Phases of translation",
[lex.phases], paragraph 1, bullet 9. The second sentence is what
matters here: "Library components are linked to satisfy external
references to entities not defined in the current translation." If
there are no external references to the object, it doesn't have to be
linked in (and arguably should not be linked in).

Interesting. What is the definition of a library component?
Assuming that a library component is a translation unit:
If a library component is linked in, should it be linked in completely,
or is it also allowed to link in only those parts from a library component
that satisfy external references?
(Sorry for asking, but I don't have my copy of the standard here.)
 
Fred Zwarts

Pete Becker said:
"Unused" objects are objects that are part of the program but not
"used".

Does the standard make a difference between "unused" and "not referenced"?

It is quite possible for a variable to be used by the main program without
being referenced by it. I often use a technique in which a translation unit
other than the main program defines a static variable holding a factory.
The constructor adds the factory to a global list of factories. The main
program only references that global list. In this way the program can use
the factory without referencing it. (This makes it possible to add factories
without any change to the main program.) In this case the factory static
variable is used, but not referenced, by the main program. Adding such
variables, although the main program does not reference them, gives the
main program very different functionality.

I thought that this was exactly the reason why it says:
§3.7.1/2 "If an object of static storage duration has initialization or a
destructor with side effects, it shall not be eliminated even if it appears to
be unused ..."

The constructor of the factory class has a side effect (adding itself to a
list), so the object should not be eliminated. I don't understand why this
should not hold for library entities. (The term "object" is ambiguous when
talking about C++ and about compilers.)
 
Fred Zwarts

Pete Becker said:
It's not defined. The old Unix rule was that each source file gets
compiled into an object file, and each object file is a single entity:
either it gets linked in in its entirety, or it doesn't get linked in.
Some compilers produce object files with finer granularity: the linker
can pull in only the parts that are actually needed, and ignore other
parts of the object file.


Yes. <g>

If the term "library component" is not defined,
then the question could equally well be answered with No. :)

But, as C++ talks in terms of translation units, I think that the linker
should not eliminate parts of a unit, except if it can deduce that doing so
has no effect on the execution of the program. Even variables that are not
referenced can have effects on the execution of a program, as I said in
another post. Maybe the new standard should define this more clearly.
 
Alf P. Steinbach

* Pete Becker:
"Unused" objects are objects that are part of the program but not "used".

I can't find the qualification "are part of the program" in the standard.

In particular, the definition of "used" in §3.2/2 (the ODR) does not have that
qualification.

Chapter & verse, please?

static my_type m;

int main()
{
    return 0;
}

Although 'm' is unused, it is part of the program (here, because it's in
the same translation unit as main) and it has to be constructed and
destroyed.

Sorry, again, chapter & verse for the "part of the program" qualification would
be nice.

Also, chapter & verse for the special casing of the TU containing "main" would
be very nice, since the idea is so strongly tied to & fundamental in your POV.

I'm unable to find it anywhere.

But if you put 'm' into a library, it's not part of the program unless
there is an external reference to it in the program.

On the contrary, §2/1 notes that "A C++ program need [sic] not all be translated
at the same time".

The standard then goes on to explain in §2/2 that translation units (which
literally are parts of the program) can be put in e.g. libraries.

I.e., being in a library is not incompatible with being a "part of the program",
at least in the literal meaning -- but does the standard define the term?

That's the point of libraries: if you don't use it (in the colloquial
sense) it doesn't have to get linked in. If that weren't the case,
"Hello, world" would have to pull in the entire standard library.

Nope.

The linker only needs to pull in those things referenced directly or indirectly
by "main" or by static objects that under §3.7.1/2 can't be eliminated.


Cheers,

- Alf
 
Alf P. Steinbach

* Pete Becker:
Of course not. It's implicit in every part of the standard. The standard
doesn't need to say that the foo.obj on my hard drive is not part of
your program. It does need to say when and which objects and functions
in libraries are part of the program, and it does that.

I don't understand. The first sentence says that "Of course" I can't find it in
the standard, whereas the last sentence says that "it does [say which object and
functions ... are part of the program]", which apparently is self-contradictory.
Where does the standard say this?

Once an object
is part of the program, the rest of the standard applies to it.

Then §3.7.1/2 applies to those objects, but do you have a reference in the
standard for which objects /are/ "part of the program"?

For the response you're responding to I dug up an old posting by Steve Clamage
who also referred to this wooly concept-without-reference, essentially sharing
your POV. Then in a later thread Dave Abrahams implicitly adopted my POV here.
There was also one thread with Francis Glassborow, but unclear about his view.

But no matter the authority arguments (and you're definitely an authority, which
is why I'm taking your claim seriously); I think §3.7.1/2 is extremely clear,
that unused objects can't be eliminated when they have initialization or
destruction with side effect, and would be /meaningless/ under your POV where it
depends on the physical packaging of translation units, and where it depends on
whether an object is defined in the "main" function's TU.

Do you have a reference for the special-casing of the "main" function's TU?

I think if the standard's discussion of that special casing of the "main" TU was
found, then one could perhaps start to nest to find more info.


Cheers,

- Alf
 
Öö Tiib

* Pete Becker:

Of course not. It's implicit in every part of the standard. The standard
doesn't need to say that the foo.obj on my hard drive is not part of
your program. It does need to say when and which objects and functions
in libraries are part of the program, and it does that.

I don't understand. The first sentence says that "Of course" I can't find it in
the standard, whereas the last sentence says that "it does [say which object and
functions ... are part of the program]", which apparently is self-contradictory.
Where does the standard say this?

I think that the closest thing is in 2.1/9, which mentions what will be
taken from a library. But it feels like there you are back in round 1,
interpreting these vague words. To me it means that a library component
(probably a translation unit in the library?) is not linked into the program
unless it satisfies external references that are not yet resolved.
 
