Techniques to reduce executable size

Qu0ll · Mar 24, 2009

I come from a Java world where we have tools like ProGuard which analyze all
the components of an application and strip out classes and members that are
not being used. Is there an equivalent in C++ or does this happen
automatically? For example, if I use one or two classes in the Boost
library, do I get the entire library when I link my program? Similarly if I
use just part of the STL, do I get the whole thing or just those parts I
use?

--
And loving it,

-Qu0ll (Rare, not extinct)
_________________________________________________
(e-mail address removed)
[Replace the "SixFour" with numbers to email me]

Victor Bazarov · Mar 24, 2009

Qu0ll said:
I come from a Java world where we have tools like ProGuard which analyze
all the components of an application and strip out classes and members
that are not being used. Is there an equivalent in C++ or does this
happen automatically?

C++ has a principle, related mostly to templates, that you don't pay for
what you don't use. Compiler implementors also develop their tools with
reduced program size in mind; it's always one of the goals. For example,
linkers (the programs that tie different object modules together) can
perform some reduction by not linking modules from which no function is
used.

For example, if I use one or two classes in the
Boost library, do I get the entire library when I link my program?

Most likely not.

Similarly if I use just part of the STL, do I get the whole thing or
just those parts I use?

Only the parts that you use. That's one of the selling points of C++
templates.

The compiler/linker cannot always achieve the full reduction in machine
code, since which functions/modules are used can depend on the data the
program has to process. Theoretically, if you have an exhaustive set of
tests, you can run your program under a *coverage* tool to collect
coverage data and then manually remove the code that is never executed.

V

Qu0ll · Mar 24, 2009

Victor Bazarov said:
C++ has a concept, related mostly to templates, that you don't pay for
what you don't use. Compiler implementors develop their tools with
reduced program size in mind, of course, it's always one of the goals. For
example, linkers (the programs that tie different object modules together)
can perform some reduction by not linking modules from which no function
is used.

Most likely not.

Only the parts that you use. That's one of the selling points of C++
templates.

The overall reduction in machine code is not always possible by the
compiler/linker, since the use of functions/modules can be dependent on
the data the program has to process. Theoretically, if you have an
exhaustive set of tests, you can run your program under a *coverage* tool
to collect coverage data and then manually remove the code that is never
executed.

Thanks Victor for the prompt, comprehensive reply. So therefore there is no
need for a tool like ProGuard as the linker and the template mechanism do
this automatically. Does this apply to individual classes defined in the
same physical file? That is, will the linker only link in those that are
actually used even when they are defined in the same compilation unit as
some which are not used? In Java we have each class in a separate file but
this doesn't appear to be the way in C++.

Victor Bazarov · Mar 24, 2009

Qu0ll said:
Thanks Victor for the prompt, comprehensive reply. So therefore there
is no need for a tool like ProGuard as the linker and the template
mechanism do this automatically. Does this apply to individual classes
defined in the same physical file? That is, will the linker only link
in those that are actually used even when they are defined in the same
compilation unit as some which are not used? In Java we have each class
in a separate file but this doesn't appear to be the way in C++.

The trick with the templates is that they aren't really compiled
separately from your code. When used in your code, each template is
*instantiated* by the compiler, and the resulting code is added to the
program and shared between the modules (the linker should take care of
unifying the duplicates). Templates that aren't instantiated are only
compiled for the sake of a syntax check; no machine code is generated
for them.

V

James Kanze · Mar 24, 2009

I come from a Java world where we have tools like ProGuard
which analyze all the components of an application and strip
out classes and members that are not being used. Is there an
equivalent in C++ or does this happen automatically? For
example, if I use one or two classes in the Boost library, do
I get the entire library when I link my program? Similarly if
I use just part of the STL, do I get the whole thing or just
those parts I use?

I'm not quite sure I understand what the Java tool does; Java
only loads classes on an as needed basis, so you never have
something you don't use. In statically compiled languages
(thus, C++), the linker only pulls in the object files it needs
from a library. Beyond that, it's a question of how the library
files were made---for widely used general purpose libraries,
each function should generally be in a separate object file; for
application specific classes, on the other hand, it's more usual
to use one object file for the entire class, which means that
you get all of the functions for the class as soon as you use
any one of them.

Qu0ll · Mar 24, 2009

[...]

I'm not quite sure I understand what the Java tool does; Java
only loads classes on an as needed basis, so you never have
something you don't use.

ProGuard reduces the size of a JAR by removing classes and members not
required by the application or applet. This is particularly useful for
applets where the emphasis is on trying to keep the JARs as small as
possible to permit faster downloads.

In statically compiled languages
(thus, C++), the linker only pulls in the object files it needs
from a library. Beyond that, it's a question of how the library
files were made---for widely used general purpose libraries,
each function should generally be in a separate object file; for
application specific classes, on the other hand, it's more usual
to use one object file for the entire class, which means that
you get all of the functions for the class as soon as you use
any one of them.

Do you really mean that each function may be in a separate file? So a
single class spans several files?

James Kanze · Mar 25, 2009

[...]
In statically compiled languages
(thus, C++), the linker only pulls in the object files it needs
from a library. Beyond that, it's a question of how the library
files were made---for widely used general purpose libraries,
each function should generally be in a separate object file; for
application specific classes, on the other hand, it's more usual
to use one object file for the entire class, which means that
you get all of the functions for the class as soon as you use
any one of them.

Do you really mean that each function may be in a separate file? So a
single class spans several files?

Yes. From a QoI point of view, I would expect this in any
general purpose library.

There are exceptions, of course. There's no point in doing it
if the application is going to pick up all the functions anyway,
e.g. because they're virtual. And of course, templates are a
completely different problem---functions which aren't used won't
even be compiled, much less have an object file to be linked in
(although this varies somewhat, depending on the instantiation
strategy). But for general purpose libraries, the rule is one
function per source file for non-virtual non-template functions.

James Kanze · Mar 25, 2009

Qu0ll said:

James Kanze wrote: [...]
Do you really mean that each function may be in a separate file? So a
single class spans several files?

Yes, some linkers can't "dead-strip" unused objects within an
object file, only the entire object file's contents.

A linker can't strip unused objects from an object file and
still be conformant. Some can strip unused functions, but this
functionality isn't very widespread, and isn't available on
most machines. (It has more to do with the object file format
than the linker, I think. Not including a function which isn't
used, even when other things in the object file are used, is
fairly trivial to implement, IF the information concerning the
extent of the function is present in the object file.)

James Kanze · Mar 26, 2009

James said:

Qu0ll wrote:
James Kanze wrote:
[...]
Do you really mean that each function may be in a separate file? So a
single class spans several files?
Yes, some linkers can't "dead-strip" unused objects within an
object file, only the entire object file's contents.

A linker can't strip unused objects from an object file and
still be conformant.

OK, but it can strip objects which are never ultimately
referenced, and whose constructor has no side-effects which
modify any of the referenced objects.

Whose constructor or destructor has no side-effects which affect
observable behavior. Provided it can distinguish those objects
from ones whose constructor or destructor does have visible
side-effects.

Which is a good reason for avoiding static-duration objects
with side-effects which might not be used by a particular
program.

In general, why link in something that isn't used? (But to tell
the truth, I'm not too sure what you're saying we should avoid.)

Bart van Ingen Schenau · Mar 27, 2009

The Java jar files are probably more similar to dynamic-link libraries, so

I would say that a jar file is comparable to the result of passing all
object files and libraries, that make up an application, to the
librarian instead of the linker.

For Java, that is an acceptable method of packaging, because every
machine that supports Java must be able to execute, interpret or
compile the byte-code.
For C++, there is no such assumption that the machine running the
application has the capability to link object files together. At best,
the machine is able to dynamically load parts of the application (when
it supports DLL/so's). For that reason, the packaging method of jar
files is not suitable for C++ code and there are no tools for C++ that
do something similar to ProGuard.

Another problem is that
such stripping would depend on the application, so different applications
would not be able to share the same dynamic library any more.

And one of the prime reasons for using DLL's is that different
applications are able to share the same library, so you can save disk
(only one copy of the library is needed) and memory space (multiple
running applications sharing the same library instance).
This would be completely negated if you have application-specific
stripped-down DLL versions.

For jar-files, this is not a concern, because their contents are not
shared between applications.

hth
Paavo

Bart v Ingen Schenau
