Tools for refactoring header files


Simon Cooke

Does anyone know of any tools for refactoring header files?

We're using a third party codebase at work, and pretty much every file
includes a 50Mb precompiled header file. I'm looking for a tool that will
let us figure out which header files are actually needed by each .cpp, and
allow us to break this up so that we're not including the world in each one.

Ideally, the same tool would also recognize where #includes can be replaced
with forward declarations, and even better, it'd automate the updates to the
code files.

Is there such a tool? Or am I SOL until some bright spark writes one?

Thanks,
Si
 

Phlip

Simon said:
Does anyone know of any tools for refactoring header files?

I can recall rumors that Lakos pointed out one in his book /Large Scale C++
Software Design/, and I can recall rumors it is not supported. Your Google
is as good as mine.

I would bet folks don't need one because, when they have "a third party
codebase at work," where "pretty much every file includes a 50Mb precompiled
header file," they tend to throw a technique called Pimpl at it:

http://www.gotw.ca/publications/mill04.htm
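
For anyone who hasn't met the idiom, here is a minimal sketch of Pimpl. The names (Widget, widget.h, widget.cpp) are hypothetical and not from the OP's codebase, and the two "files" are shown in one listing so it compiles as-is. The point is that the header only forward-declares Impl, so whatever heavy headers the implementation needs stay in the .cpp.

```cpp
// ---- widget.h (hypothetical header): no heavy includes needed here ----
#include <memory>
#include <string>

class Widget {
public:
    explicit Widget(std::string name);
    ~Widget();                              // defined where Impl is complete
    Widget(Widget&&) noexcept;
    Widget& operator=(Widget&&) noexcept;

    const std::string& name() const;
    void rename(std::string new_name);

private:
    struct Impl;                            // forward declaration only
    std::unique_ptr<Impl> impl_;
};

// ---- widget.cpp (hypothetical source): heavy headers would go here ----
#include <utility>

struct Widget::Impl {
    std::string name;
};

Widget::Widget(std::string name) : impl_(new Impl{std::move(name)}) {}
Widget::~Widget() = default;
Widget::Widget(Widget&&) noexcept = default;
Widget& Widget::operator=(Widget&&) noexcept = default;

const std::string& Widget::name() const { return impl_->name; }
void Widget::rename(std::string new_name) { impl_->name = std::move(new_name); }
```

Clients that include only the header part recompile when the public interface changes, not when Impl's members or its includes change.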

Get the above-mentioned book, and get /C++ Coding Standards/ by Sutter &
Alexandrescu.

Then hunt down whoever wrote this code and smack them with those books. C++
works because clean logical designs enable clean physical designs, and I
would bet your physical design is also questionable. Read /Working
Effectively with Legacy Code/ by Mike Feathers to get ahead of that problem.

Someone else might indeed know of a tool. I'm only posting because your post
went stale, and you didn't declare that Pimpl was your first line of attack.
It usually is. It's good for your situation because it shows how to clean
nearly everything out of a .h file without changing anything's logical
design. Long term, improving the logical design
 

Noah Roberts

I would bet folks don't need one because, when they have "a third party
codebase at work," where "pretty much every file includes a 50Mb precompiled
header file," they tend to throw a technique called Pimpl at it:

http://www.gotw.ca/publications/mill04.htm

Pimpl doesn't help the OP, who already knows the header needs
refactoring.

The only way to fix this problem is hours and hours of cut and paste
operations followed by compiling, followed by tracing where the errors
are coming from. Monolithic headers are just way too complex to wrap
your head around so you _have_ to break down and use the compiler as an
error tracing tool. It doesn't make a very good one and will spit out
difficult to interpret errors but it is the best you got in this case.

Took me over two solid straight days to do ours and there is still a
lot I never touched and let be until I need to pull it apart.

Good luck.
 

Simon Cooke

Noah Roberts said:
Pimpl doesn't help the OP, who already knows the header needs
refactoring.

The only way to fix this problem is hours and hours of cut and paste
operations followed by compiling, followed by tracing where the errors
are coming from. Monolithic headers are just way too complex to wrap
your head around so you _have_ to break down and use the compiler as an
error tracing tool. It doesn't make a very good one and will spit out
difficult to interpret errors but it is the best you got in this case.

Yeah - that's why I was hoping for a tool to assist with this - because
we're dealing with a codebase with millions of lines of code.

Took me over two solid straight days to do ours and there is still a
lot I never touched and let be until I need to pull it apart.

Good luck.

Thanks - I'll need it :)

Si
 

Phlip

Noah said:
Pimpl doesn't help the OP, who already knows the header needs
refactoring.

Correct. They didn't say "my partial build time is too long", they said they
needed the kind of fix that made me think they started with that problem.

The only way to fix this problem is hours and hours of cut and paste
operations followed by compiling, followed by tracing where the errors
are coming from. Monolithic headers are just way too complex to wrap
your head around so you _have_ to break down and use the compiler as an
error tracing tool. It doesn't make a very good one and will spit out
difficult to interpret errors but it is the best you got in this case.

Can't they do pimpl first, then get a little breathing room before doing
that?

Anecdote: I know a codebase where every high-level class has an Impl suffix,
and it inherits an abstract base class with an Inf suffix. Client classes
are expected to use only the Inf, following the Dependency Inversion
Principle, and Lakos-style short header files.

Except a few Inf classes inherit Impl classes. Ouch!
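
For readers who haven't seen that layout, a rough sketch of the intended Inf/Impl split. LoggerInf, LoggerImpl and make_logger are hypothetical names, not from the codebase in the anecdote, and both "files" are shown in one listing so it compiles as-is.

```cpp
#include <memory>
#include <string>

// ---- logger_inf.h (hypothetical): the only header clients include ----
class LoggerInf {
public:
    virtual ~LoggerInf() = default;
    virtual void log(const std::string& message) = 0;
    virtual int count() const = 0;
};

// Factory declared next to the interface, so clients never name LoggerImpl.
std::unique_ptr<LoggerInf> make_logger();

// ---- logger_impl.cpp (hypothetical): implementation details live here ----
class LoggerImpl : public LoggerInf {
public:
    void log(const std::string&) override { ++count_; }   // just counts calls
    int count() const override { return count_; }
private:
    int count_ = 0;
};

std::unique_ptr<LoggerInf> make_logger() {
    return std::unique_ptr<LoggerInf>(new LoggerImpl);
}
```

The dependency arrow points from Impl to Inf only. An Inf that inherits from an Impl drags the implementation's headers back into every client, which is what makes the exception above painful.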
 

Phlip

Roland said:
C++ lacks good refactoring tools, mostly due to the complicated
syntax. See also e.g.
http://www.artima.com/weblogs/viewpost.jsp?thread=11070 .

The word "refactoring" (according to the ISO Refactoring Standard) means
changing the code while passing all its unit tests. The odds of unit tests
here are very low; that's why I recommended the WELC book.

However, the OP asked about "refactoring" header files, which is a different
beast than general C++ refactoring. It typically requires static analysis
and reverse engineering of the header's dependency graph, followed by manual
changes. The analysis can be "fuzzy", whereas automated refactoring of
source must be so "sharp" that behavior absolutely never changes after a
refactor. C++ makes _that_ so hard that we might as well rely on manual
refactoring.

At a shot, I would run Doxygen on the code, take a vacation, come back, and
look at its Graphviz output. IIRC this output shows the header file graphs,
with circles and arrows. I would look for long sequences whose bases can be
Pimpl-ed out of other long sequences.
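
If Doxygen is more than you want to set up, the raw include graph can be approximated with a few lines of code. This is only a sketch under loud assumptions: scan_includes and write_dot are made-up names, the scan is purely textual (it does not evaluate #if/#ifdef and does not resolve include paths), and it is not how Doxygen itself works. It emits Graphviz DOT text that the dot tool can render.

```cpp
#include <istream>
#include <map>
#include <ostream>
#include <regex>
#include <set>
#include <sstream>
#include <string>

// File name -> set of headers named in its #include directives.
using IncludeGraph = std::map<std::string, std::set<std::string>>;

// Record every #include "x" or #include <x> in `source` as an edge from
// `file`. Textual only: conditional compilation is ignored, and a directive
// that starts a line inside a block comment is still counted.
void scan_includes(const std::string& file, std::istream& source,
                   IncludeGraph& graph) {
    static const std::regex directive(
        R"re(^\s*#\s*include\s*[<"]([^>"]+)[>"])re");
    std::string line;
    while (std::getline(source, line)) {
        std::smatch match;
        if (std::regex_search(line, match, directive)) {
            graph[file].insert(match[1].str());
        }
    }
}

// Write the graph as Graphviz DOT, one edge per include.
void write_dot(const IncludeGraph& graph, std::ostream& out) {
    out << "digraph includes {\n";
    for (const auto& entry : graph) {
        for (const auto& header : entry.second) {
            out << "  \"" << entry.first << "\" -> \"" << header << "\";\n";
        }
    }
    out << "}\n";
}
```

Call scan_includes once per source or header file (for example with a std::ifstream), then write_dot to std::cout and pipe the result into dot.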

This search...

http://www.google.com/search?q=c+++header+dependency+tool

...says for its second hit, "This tool scans c++ header files and source
files for #includes that ... It then generates the header and source files
for the entire dependency tree for ..."

So, as usual for modern engineering, it may all come down to the right
Google search expression ;-)
 

Noah Roberts

Phlip said:
Can't they do pimpl first, then get a little breathing room before doing
that?

Pimpl isn't always needed or desired. Better to start with other
refactors first. Get the class declarations and stuff separated for
one...

The way I approached the problem was to pull out classes into their own
headers and include that header where the declaration used to be. Get
the thing to compile by including whatever header is needed in that new
header to get things to compile. Then create a blank source file that
includes your new header and try to build an object...this will
probably reveal more things you depend on that didn't show up when
building it in line with others. Then look for ways to get rid of
headers through "class X;" directives and moving inline functions into
the source files. Then move on to the next class.
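
A rough sketch of what one class can look like after those steps, with hypothetical names (Order, Customer) and the three "files" shown in one listing so it compiles as-is. The header gets by on a "class X;" declaration because it only holds a pointer, and the formerly inline function that needed the full definition has moved to the source file.

```cpp
// ---- order.h (hypothetical): after the cleanup ----
class Customer;                      // replaces #include "customer.h"

class Order {
public:
    explicit Order(const Customer& buyer);
    int buyer_id() const;            // was inline; its body needs Customer's definition
private:
    const Customer* buyer_;          // pointers and references only need the declaration
};

// ---- customer.h (hypothetical) ----
class Customer {
public:
    explicit Customer(int id) : id_(id) {}
    int id() const { return id_; }
private:
    int id_;
};

// ---- order.cpp (hypothetical): the only place that needs the full Customer ----
// #include "order.h"
// #include "customer.h"
Order::Order(const Customer& buyer) : buyer_(&buyer) {}
int Order::buyer_id() const { return buyer_->id(); }
```

The "blank source file" check from the same paragraph is then a one-line .cpp containing only the include of the new header. If that file compiles on its own, the header is self-contained.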

Any time anything depends on the main file look for the reason and pull
it out into its own. The first class I tried ended up going down a
bunch of lines of dependencies that had to be pulled out and weeded
through. It was the worst. Took hours of sweat and frustration before
I was even able to compile the first time again but then the rest fell
into place much easier. Anything you depend on is going to be above
your declaration so starting at the top might be a smart move...I
started by trying to pull out the class I needed.

Then start getting rid of the includes in the main header by including
headers in the appropriate source files....find these by removing the
include and looking for what no longer compiles for whatever "reason"
the compiler spits up.

Get rid of the main header...

THEN start looking for ways to lower dependencies amongst the various
header files through actual code refactoring if need be. Until this
point nothing has actually been changed in the design or code at all
except for moving it around in files.
 

Noah Roberts

Roland said:
C++ lacks good refactoring tools, mostly due to the complicated
syntax. See also e.g.
http://www.artima.com/weblogs/viewpost.jsp?thread=11070 .

I have found Ref++ (for VS) to be fairly helpful. It offers the basic
refactors...rename, encapsulate, extract func, change sig, introduce
var, move up/down, extract super. It works most of the time. It can
be confused by the preproc (or the occasional phase of the moon/sun
spot error) but usually even figures that stuff out...unfortunately it
can decide to only apply a refactor in some places because of the
preproc so make sure all builds work after (we have several defines
based on product branches...) :p Reasonably priced too...

There appears to be one for xemacs that is much more pricy...haven't
tested it.
 

Phlip

So. C++ is the most difficult to parse language on Earth. Known fact.
However, it seems that there are quite a few projects that are already
doing it.

Are they refactoring the #include graph of header files?
 

Ira Baxter

Roland Pibinger said:
C++ lacks good refactoring tools, mostly due to the complicated
syntax.

It isn't the syntax, although the syntax defeats most standard parsing
engines (YACC, etc.). The solution there is relatively straightforward.
The hard part is the static semantics: figuring out what the syntax
says, and what every symbol means. Once you're past that,
you still have to deal with format and comment capture,
multiple dialects, preprocessor directives, and finally get around to
providing tools that can actually transform
the code without breaking it.
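
A small, well-known illustration of the point, not taken from DMS or any other tool: the statement "a * b;" cannot be classified without knowing what "a" names. The namespace and function names below are made up for the example.

```cpp
namespace a_is_a_type {
struct a {};
inline bool demo() {
    a * b;                // declaration: b is a pointer to a
    b = nullptr;
    return b == nullptr;
}
}  // namespace a_is_a_type

namespace a_is_a_variable {
inline int demo() {
    int a = 6, b = 7;
    a * b;                // expression: multiplies, result discarded (compilers may warn)
    return a * b;
}
}  // namespace a_is_a_variable
```

A tool that rewrites code has to resolve every such name the way a compiler would, across all the headers involved, before it can safely touch the statement.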

The DMS Software Reengineering Toolkit provides a C++ front
end with all the above capability, as well as transformation machinery
to transform and reproduce compilable text with the original comments
and indentation.
See http://www.semanticdesigns.com/Products/FrontEnds/CppFrontEnd.html

This would make a good foundation for an interactive
refactoring tool. (Why isn't it one? Well, it took us a while to teach
DMS about C++...)

You can read about massive transforms applied to C++ code by DMS in the
paper,
Re-engineering C++ Component Models Via Automatic Program Transformation,
at http://www.semanticdesigns.com/Company/Publications/
 

Ira Baxter

Simon Cooke said:
Does anyone know of any tools for refactoring header files?

We're using a third party codebase at work, and pretty much every file
includes a 50Mb precompiled header file. I'm looking for a tool that will
let us figure out which header files are actually needed by each .cpp, and
allow us to break this up so that we're not including the world in each one.

Ideally, the same tool would also recognize where #includes can be replaced
with forward declarations, and even better, it'd automate the updates to the
code files.

Is there such a tool? Or am I SOL until some bright spark writes one?

You're pretty much SOL at this point.

However, we could probably write one for you, based on our C++
program transformation tools.
See http://www.semanticdesigns.com/Products/FrontEnds/CppFrontEnd.html
There's nothing else like it on the planet :-}

How much is it costing your organization to live with the current problem?
 

Neeraj

We have recently applied our Jolt award winning dependency matrix based
approach to generic c/c++. It takes the output of Doxygen and creates a
matrix which you can then transform and partition to obtain Lakos style
levelization. You could use it to help you deal with this problem in
the following two different ways:

1. Filter out all inter-file dependencies except for "include" to see
which files are included by which other files.

2. Keep all dependencies but filter out the "include" dependency to see
which files really depend on each other.
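
To make the two views concrete, here is a rough sketch. It is not Lattix code, and every name in it (DepKind, Dependency, build_graph, levelize) is hypothetical. It assumes a list of file-to-file dependencies tagged with a kind, builds either the include-only view or the everything-but-include view, and assigns levels in the spirit of Lakos: level 1 for a file that depends on nothing in the set, otherwise one more than the highest level it depends on.

```cpp
#include <algorithm>
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

enum class DepKind { Include, Call, Inherit };

struct Dependency {
    std::string from;
    std::string to;
    DepKind kind;
};

using Graph = std::map<std::string, std::set<std::string>>;

// include_edges == true  -> view 1: keep only "include" dependencies.
// include_edges == false -> view 2: keep everything except "include".
Graph build_graph(const std::vector<Dependency>& deps, bool include_edges) {
    Graph graph;
    for (const auto& d : deps) {
        graph[d.from];                       // every file gets a node,
        graph[d.to];                         // even if its edges are filtered out
        if ((d.kind == DepKind::Include) == include_edges) {
            graph[d.from].insert(d.to);
        }
    }
    return graph;
}

// Depth-first level computation; throws if the dependencies form a cycle,
// since a cyclic graph cannot be levelized.
int level(const Graph& graph, const std::string& file,
          std::map<std::string, int>& levels, std::set<std::string>& visiting) {
    auto known = levels.find(file);
    if (known != levels.end()) return known->second;
    if (!visiting.insert(file).second) {
        throw std::runtime_error("cyclic dependency at " + file);
    }
    int result = 1;
    for (const auto& dep : graph.at(file)) {
        result = std::max(result, 1 + level(graph, dep, levels, visiting));
    }
    visiting.erase(file);
    levels[file] = result;
    return result;
}

std::map<std::string, int> levelize(const Graph& graph) {
    std::map<std::string, int> levels;
    std::set<std::string> visiting;
    for (const auto& node : graph) level(graph, node.first, levels, visiting);
    return levels;
}
```

Comparing the two graphs for the same file shows which includes carry no other dependency, which makes them candidates for removal.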

The goal of our approach (Lattix LDM) is architecture discovery and
control using inter-module dependencies. We have applied this approach
fairly successfully to Java and Microsoft C/C++ (where we use bsc
files).

The Doxygen based approach is currently in beta. If you are interested
please send us email (info-AT-lattix.com) and we will be glad to
provide you with more information and make a download available.

Neeraj Sangal
Lattix, Inc.
http://www.lattix.com
 
