automatically remove unused #includes from C source?

smachin1000

Hi All,

Does anyone know of a tool that can automatically analyze C source to
remove unused #includes?

Thanks,
Sean
 
Walter Roberson

smachin1000 said:
Does anyone know of a tool that can automatically analyze C source to
remove unused #includes?

Tricky.

A #define in include1.h might be used in a #define in include2.h that
might be used to build a type in include3.h that might be needed by a
function declaration brought in by include5.h that is #include'd by
include4.h, and the function name might be in a disguised array
initialization form in include6.h and the analyzer would have to
analyze your source to see whether you refer to that function directly
or if you use the array initialization...

In other words, such a tool would pretty much have to be a C compiler
itself, but one that kept track of all the "influences" that went
into building up every token, and figured out what wasn't used after all.
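
For instance, a made-up sketch of such a chain (every file and name
here is hypothetical):

/* include1.h */
#define BUFLEN 64

/* include2.h -- uses include1.h's macro */
#define TABLE_SIZE (BUFLEN * 2)

/* include3.h -- uses include2.h's macro to build a type */
typedef int table_t[TABLE_SIZE];

/* include5.h (#include'd by include4.h) -- declares a function
   whose prototype needs include3.h's type */
void init_table(table_t t);

/* include6.h -- the function's only appearance is inside an
   array initializer, so a naive text search for init_table()
   calls in the .c file would miss this use entirely */
void (*const handlers[])(table_t) = { init_table };

Whether include1.h is "unused" by a given source file then depends
on whether that file touches handlers[] or calls init_table()
directly -- exactly the sort of question only a full parse can answer.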

It might be easier just to start commenting out #include's and
seeing if any compile problems come up.
 
Roland Pibinger

Walter Roberson said:
It might be easier just to start commenting out #include's and
seeing if any compile problems come up.

Automate that and you have the requested tool!

Best wishes,
Roland Pibinger
 
Walter Roberson

Roland Pibinger said:
Automate that and you have the requested tool!

Including a particular file can end up changing the meaning of
something else, but the code might compile fine without it.

For example, you might have an include file that contained

#define _use_search_heuristics 1

Then the code might have

#if defined(_use_search_heuristics)
/* do it one way */
#else
/* do it a different way */
#endif

where the code is valid either way.

Thus in order to test whether any particular #include is really
needed by checking the compile results, you need to analyze the
compiled object, strip out symbol tables and debug information and
compile timestamps and so on, and compare the generated code.
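
A rough sketch of that kind of check, written in C just for
illustration (the tool names and flags here -- cc, objcopy -g -x to
strip debug information and local symbols, cmp -- assume a GNU-ish
toolchain, and some object formats embed timestamps this won't catch):

/* include_probe.c -- does removing one #include change the code? */
#include <stdio.h>
#include <stdlib.h>

/* Compile src to an object file, then strip debug information and
   local symbols so the comparison sees only the generated code. */
static int build_stripped(const char *src, const char *obj)
{
    char cmd[512];
    snprintf(cmd, sizeof cmd,
             "cc -c %s -o tmp.o && objcopy -g -x tmp.o %s", src, obj);
    return system(cmd);
}

int main(void)
{
    /* module.c is the original source; module_probe.c is assumed to
       be a copy of it with one #include commented out. */
    if (build_stripped("module.c", "base.o") != 0)
        return EXIT_FAILURE;            /* the baseline must compile */

    if (build_stripped("module_probe.c", "probe.o") != 0) {
        puts("compile broke: that #include is needed (directly or not)");
        return EXIT_SUCCESS;
    }

    if (system("cmp -s base.o probe.o") == 0)
        puts("object code identical: that #include looks removable");
    else
        puts("object code differs: that #include affects the code");
    return EXIT_SUCCESS;
}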
 
Roland Pibinger

Walter Roberson said:
Including a particular file can end up changing the meaning of
something else, but the code might compile fine without it.

For example, you might have an include file that contained

#define _use_search_heuristics 1

Then the code might have

#if defined(_use_search_heuristics)
/* do it one way */
#else
/* do it a different way */
#endif

where the code is valid either way.

You are right in theory. But that kind of include file dependency
(an include order dependency) is usually considered bad style.

Walter Roberson said:
Thus in order to test whether any particular #include is really
needed by checking the compile results, you need to analyze the
compiled object, strip out symbol tables and debug information and
compile timestamps and so on, and compare the generated code.

IMO, this is overdone. You have to test your application after code
changes anyway.

Best regards,
Roland Pibinger
 
Al Balmer

Roland Pibinger said:
You are right in theory. But that kind of include file dependency
(an include order dependency) is usually considered bad style.

No, he's right in practice. There's no guarantee that a body of
existing code will conform to your (or anyone's) rules of good style.
 
Walter Roberson

Roland Pibinger said:
You are right in theory. But that kind of include file dependency
(an include order dependency) is usually considered bad style.

It happens often in large projects with automake setups and
system dependencies. The included file that changes the meaning
of the rest is a "hints" file.

For example, on the OS I use most often, for a well-known
large project (perl, as I recall), the autoconfiguration step
detects that the OS has library entries and include entries
for a particular feature. Unfortunately, that particular feature
doesn't work very well in the OS -- broken -and- very inefficient.
So the OS hints file basically says, "Yes, I know you've detected
that, but don't use it." The large project then goes ahead and
compiles in the code that performs the task using more standardized
system calls instead of the newer, less-standardized API.

Roland Pibinger said:
IMO, this is overdone. You have to test your application after code
changes anyway.

Conformance tests can take 3 days per build, and if you
are checking whether a project with 1500 #includes (distributed
over the source) can survive deleting one particular include
out of one particular module, then you need up to pow(2,1500)
complete builds and conformance tests. Even if each *complete*
application conformance test took only 1 second, it'd take
10^444 CPU years to complete the testing. *Much* faster to break
it into chunks (e.g., by source file) and check to see whether
each chunk still produces the same code after removal of a
particular include: the timing then becomes proportional to
the sum of pow(2,includes_in_this_chunk) instead of the product
of those as would be the case with what you propose.
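
(Checking the arithmetic: pow(2,1500) seconds is about 10^451.5
seconds, and a year is roughly 3.15 * 10^7 -- call it 10^7.5 --
seconds, which is where the 10^444 CPU years figure comes from.)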
 
Walter Bright

Walter said:
Conformance tests can take 3 days per build, and if you
are checking whether a project with 1500 #includes (distributed
over the source) can survive deleting one particular include
out of one particular module, then you need up to pow(2,1500)
complete builds and conformance tests. Even if each *complete*
application conformance test took only 1 second, it'd take
10^444 CPU years to complete the testing. *Much* faster to break
it into chunks (e.g., by source file) and check to see whether
each chunk still produces the same code after removal of a
particular include: the timing then becomes proportional to
the sum of pow(2,includes_in_this_chunk) instead of the product
of those as would be the case with what you propose.

That is a good idea: selectively removing #include statements, and then
simply seeing if the resulting object code file changes.

Otherwise, a customized C compiler could absolutely tell if there were
any dependencies on a particular #include file.
 
Don Porges

Walter Roberson said:
Including a particular file can end up changing the meaning of
something else, but the code might compile fine without it.

For example, you might have an include file that contained

#define _use_search_heuristics 1

Then the code might have

#if defined(_use_search_heuristics)
/* do it one way */
#else
/* do it a different way */
#endif

where the code is valid either way.

Thus in order to test whether any particular #include is really
needed by checking the compile results, you need to analyze the
compiled object, strip out symbol tables and debug information and
compile timestamps and so on, and compare the generated code.

Then, analyze it to make sure you don't delete the #include of "seems_unused.h" in this:

seems_unused.h:
-----------------
#define MIGHT_NEED 1

somefile.c:
----------
#ifdef DEFINED_WITH_MINUS_D
int var = MIGHT_NEED;
#endif

-- so that next week, when somebody does gcc -DDEFINED_WITH_MINUS_D, the code still builds.
 
Neil

smachin1000 said:
Hi All,

Does anyone know of a tool that can automatically analyze C source to
remove unused #includes?

Thanks,
Sean

Doesn't PC-LINT give you a list of unused includes?
 
Roland Pibinger

Walter Roberson said:
Conformance tests can take 3 days per build, and if you
are checking whether a project with 1500 #includes (distributed
over the source) can survive deleting one particular include
out of one particular module, then you need up to pow(2,1500)
complete builds and conformance tests. Even if each *complete*
application conformance test took only 1 second, it'd take
10^444 CPU years to complete the testing.

That calculation is quite contrived. I wonder how you would make
changes in your code base besides removing an #include, to say
nothing of refactoring.

Walter Roberson said:
*Much* faster to break
it into chunks (e.g., by source file) and check to see whether
each chunk still produces the same code after removal of a
particular include:

... and if it compiles but produces different object code, then you
have found an include order dependency bug ;-)

Best regards,
Roland Pibinger
 
Roland Pibinger

Walter Bright said:
Otherwise, a customized C compiler could absolutely tell if there were
any dependencies on a particular #include file.

BTW, there is a huge demand for static code analysis tools in C and
C++ (also in a commercial sense). For most of those code analysis
tasks you need to have a fully-fledged (customized) compiler. So, if I
had that compiler ...

Best regards,
Roland Pibinger
 
Walter Bright

Roland said:
BTW, there is a huge demand for static code analysis tools in C and
C++ (also in a commercial sense). For most of those code analysis
tasks you need to have a fully-fledged (customized) compiler. So, if I
had that compiler ...

True, I've seen some amazingly high prices quoted for static code
analysis. There's nothing stopping someone from approaching Digital Mars
or other compiler vendors and offering to purchase a license for the
compiler to get into that business.
 
Roland Pibinger

Walter Bright said:
True, I've seen some amazingly high prices quoted for static code
analysis. There's nothing stopping someone from approaching Digital Mars
or other compiler vendors and offering to purchase a license for the
compiler to get into that business.

What is stopping you?
 
Ian Collins

Roland said:
BTW, there is a huge demand for static code analysis tools in C and
C++ (also in a commercial sense). For most of those code analysis
tasks you need to have a fully-fledged (customized) compiler. So, if I
had that compiler ...

Due to extensions, such a tool can only really be part of the compiler
suite.
 
Richard Heathfield

Roland Pibinger said:
What is stopping you?

I don't think Walter Bright needs to approach /anyone/ to purchase a licence
for the Digital Mars compiler. :)
 
CBFalconer

Walter said:
That is a good idea: selectively removing #include statements, and
then simply seeing if the resulting object code file changes.

Otherwise, a customized C compiler could absolutely tell if there
were any dependencies on a particular #include file.

Such an operation would need C99 specs; otherwise the use of
implicit int would foul the results. It might be enough to tell the
compiler to insist on prototypes.
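
A hypothetical illustration of that hazard under C90 rules:

/* If the math.h line is deleted, a C90 compiler still accepts the
   call -- sqrt() gets an implicit declaration returning int -- but
   the behaviour is then undefined, because the real sqrt() returns
   double.  A C99 compiler, or one told to insist on prototypes,
   rejects the call instead of miscompiling it. */
#include <stdio.h>
#include <math.h>   /* looks removable to a naive checker */

int main(void)
{
    printf("%f\n", sqrt(2.0));
    return 0;
}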
 
Walter Roberson

Roland Pibinger said:
That calculation is quite contrived.

Contrived? Well, yes, in the sense that any large project is likely
to have much *more* than 1500 #include statements. For example, I just
ran a count against the trn4 source (which is less than 1 megabyte
when gzip'd), and it has 1659 #include statements. openssl 0.9.7e
has 4679 #include statements (it's about 3 megabytes gzip'd).
Roland Pibinger said:
I wonder how you would make changes in your code base besides
removing an #include, to say nothing of refactoring.

You seem to have forgotten that you yourself proposed,
"Automate that and you have the requested tool!" in response to my
saying, "It might be easier just to start commenting out #include's".
When I indicated that it is more complex than that and that
comparing object code is necessary (not just looking for compile
errors), you said,
"You have to test your application after code changes anyway."

Taken in context, your remark about testing after code changes
must be considered to apply to the *automated* tool you proposed.
And the difficulty with automated tools along these lines is that they
are necessarily dumb: if removing #include file1.h gives you a
compile error, the tool cannot assume that file1.h is a -necessary-
dependency (an assumption that would let it test in linear time): it
would have to allow for the possibility that removing file1.h
only gave an error because of something in file2.h --- and yes,
there can be backwards dependencies, in which file1.h is needed to
complete something included -before- that point. Thus, in this kind
of automated tool that doesn't know how to parse the C code itself,
full dependency checking can only be done by trying every -possible-
combination of #include files, which is a 2^N process.

Do you feel that 1 second to "test your application after code changes"
is significantly longer than is realistic? It probably takes longer
than that just to compile and link the source each time.

Roland Pibinger said:
I wonder how you would make changes in your code base besides
removing an #include, to say nothing of refactoring.

I don't mechanically automate the code change and test process.

Roland Pibinger said:
... and if it compiles but produces different object code, then you
have found an include order dependency bug ;-)

Include order dependencies are not bugs unless the prevailing
development paradigm for the project has declared them to be so.

Once you get beyond standard C into POSIX or system dependencies,
it is *common* for #include files to be documented as being order
dependent upon something else. Better system developers hide
that by #include'ing the dependencies and ensuring, as far as
is reasonable, that each system include file has guards against
multiple inclusion, but that's a matter of Quality of Implementation,
not part of the standards.
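
For example, a careful system header hides its own ordering needs
along these lines (names invented for illustration):

/* sys_foo.h -- pulls in its own prerequisites and guards against
   multiple inclusion, so users never see the order dependency */
#ifndef SYS_FOO_H
#define SYS_FOO_H

#include <sys/types.h>   /* provides size_t, used below */

struct foo {
    size_t length;
};

#endif /* SYS_FOO_H */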


Still, it is true that in the case of multiple source files that
together have 1500 #includes, you would not need to do pow(2,1500)
application tests, if you are using a compiler that supports
independent compilation and later linking. If you do have independent
compilation, then within each source file it is a 2^N process
to find all the #include combinations that will compile, but most of
the combinations will not. Only the versions that will compile need
to go into the pool for experimental linkage; linkage experiments
would be the product of the number of eligible compilations for each
source. Only the linkages that survived would need to go on for testing.
The number of cases that will make it to testing is not possible to
estimate without statistical information about the probability that any
given #include might turn out to be unneeded.
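
To put invented numbers on the sum-versus-product difference: with
10 source files of 150 #includes each, the per-file compile
experiments total 10 * pow(2,150), about 1.4 * 10^46, whereas
exhaustive whole-application builds would be pow(2,1500), about
3.5 * 10^451 -- and since almost all of the per-file combinations
fail to compile at all, the pool that ever reaches the linkage
stage is far smaller still.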
 
