C++ build systems

Joshua Maurice

(I'm sorry if this is too off topic, but I'm not sure where else to
ask. It does specifically relate to a very important aspect of using
C++, so I think it's (enough) on topic.)

I'm curious how all of you actually build your code. GNU Make, SCONS,
bjam, vcproj files, etc.?

Specifically, I'm very curious how close your build system is to my
"ideal" build system. My ideal build system has two requirements, fast
and correct. Correct means that a full build at any time produces the
same results as a build from a completely clean local view. Fast means
fast. When builds can take hours or more, I don't want to look at an
error, and wonder if it was because I didn't build from clean,
potentially wasting more hours as I do a sanity check rebuild.

All build systems can be made correct by having it first completely
clean out the local view, then building. However, this is also the
polar opposite of fast. I think it's fair to say that the only way to
achieve correct and fast is to have an incremental build system, a
build system that builds everything which is out of date, and does not
build anything already up to date. (Arguably, that definition is
biased. It implies the use of file timestamps, but there could exist
fast and correct build systems which do not rely upon file
timestamps.)

I think it's also fair to say to achieve fast, you need to have a
parallel build system, a build system that does the build steps in
parallel as much as possible.

Something like "ease of use" is a third aspect of my ideal build
system. A naive build system built upon GNU Make which requires
developers to manually specify header file dependencies may be
technically correct, but it would be ridiculously error prone and a
maintenance nightmare. I don't know a better way to describe what I
want than "idiot-proof". If you miss a header file dependency, the
next incremental build may produce correct results, and it may not. I
want a build system where the incremental build always will be correct
(where correct means equivalent to a build from completely clean) no
matter the input to the build system. (However, if the build system
itself changes, then requiring all developers to fully clean would be
acceptable.)

Lastly, I want it to be portable to basically anything with a C++
implementation. My company supports nearly every server and desktop
platform known to man (Ex: Z/OS, HPUX, AIX, Solaris, WIN, Linux,
several mainframes, and more), and several common Make-like tools
purport to not run on all of these systems. Preferably, I would like
to not be bound to the system specific shell either like Make.
(Although you could just require all developers to install sh and
force sh usage, at least for the makefiles.)

For example, for your build systems, how much of the above does it do?
Specific examples:
1- Does it correctly and automatically track header file
dependencies?
2- If a DLL changes, will you relink all dependent DLLs, transitive
and direct, or will you only relink direct dependent DLLs?
3- Can your developer inadvertently break the incremental build in
such a way that it succeeds on his local machine but fails on the
build machine because he missed specifying a dependency?
4- Is your (unit) test framework incorporated into the build, so that
you can run only the tests which depend upon a change?
5- Can your build system generate vcproj files on windows? (I
personally like the Visual Studios IDE, and it would be a shame to
lose this by going to a fast, correct build system.)

Thus far, I have not found a satisfactory answer to these questions in
the open source world. Thus, I've been developing my own answer for
the last couple of months, where I hope to achieve (almost) all of
these goals.
 

Richard

[Please do not mail me a copy of your followup]

Joshua Maurice <[email protected]> spake the secret code
I'm curious how all of you actually build your code. GNU Make, SCONS,
bjam, vcproj files, etc.?

NAnt scripts that drive Visual Studio (devenv.com) for Windows builds
and Xcode (xcodebuild) for Mac builds.
 

Boris Schaeling

Thus far, I have not found a satisfactory answer to these questions in
the open source world. Thus, I've been developing my own answer for
the last couple of months, where I hope to achieve (almost) all of
these goals.

I was also looking for a new build system a few months ago and went for
Boost.Build. It's portable, high-level and you can use it without all the
other Boost C++ libraries (or if you use the Boost C++ libraries it's
already available anyway). The biggest problem is the lack of up to date
documentation. When playing around with Boost.Build I started to write my
own (see http://www.highscore.de/cpp/boostbuild/). This should be enough
to get Boost.Build to work (at least I did :) but other build systems
provide more and better documentation.

Boris
 

James Kanze

I'm curious how all of you actually build your code. GNU Make,
SCONS, bjam, vcproj files, etc.?

I use GNU Make and some pretty hairy generic makefiles. It's
far from perfect, but I've yet to find anything really better,
and it seems to be about the most portable around.
Specifically, I'm very curious how close your build system is
to my "ideal" build system. My ideal build system has two
requirements, fast and correct. Correct means that a full
build at any time produces the same results as a build from a
completely clean local view. Fast means fast. When builds can
take hours or more, I don't want to look at an error, and
wonder if it was because I didn't build from clean,
potentially wasting more hours as I do a sanity check rebuild.
All build systems can be made correct by having it first
completely clean out the local view, then building. However,
this is also the polar opposite of fast. I think it's fair to
say that the only way to achieve correct and fast is to have
an incremental build system, a build system that builds
everything which is out of date, and does not build anything
already up to date.

The problem is that it is very, very difficult to do this and be
correct. Most build systems I've seen compromise somewhere, and
will occasionally rebuild things that aren't necessary.

(From what I've heard, Visual Age is the exception. But I've
never had the chance to try it, and from what I've heard, it
also compromises conformance somewhat to achieve this end.)
(Arguably, that definition is biased. It implies the use of
file timestamps, but there could exist fast and correct build
systems which do not rely upon file timestamps.)

I didn't see anything about file timestamps in it. In fact, my
reading of what you require pretty much means that anything
based on file timestamps cannot be used, since it will
definitely result in unnecessary recompilations (e.g. if you add
some comments to a header file). Most makes do a pretty good
job of being correct, but because they depend on file
timestamps, they occasionally recompile more than is necessary.
I think it's also fair to say to achieve fast, you need to
have a parallel build system, a build system that does the
build steps in parallel as much as possible.

GNU make (and many, probably most others) are capable of
partially parallelizing the build, starting compilations of
independent modules in parallel, for example. On single
systems, I've not found that this buys you a lot. Some build
systems (Sun's make, for example, and I think the make for
Clearcase) are capable of distributing the build---starting the
parallel compilations on different machines in the network; that
can be a big win, especially if you have a relatively large
network. On Unix based systems, at least, this can be simulated
relatively easily using shell scripts for the compiler, but the
simulation will generally only be a first approximation---IIUC,
Sun's make will search out machines with little activity, and
privilege them.
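
For reference, the parallelism itself is just a flag to GNU make;
something like the following (no project-specific setup assumed):

    # One job per core; pick a number that matches the machine.
    make -j4

    # Or let make throttle itself on the machine's load average
    # instead, which is friendlier on a shared build server.
    make -j -l 6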
Something like "ease of use" is a third aspect of my ideal
build system. A naive build system built upon GNU Make which
requires developers to manually specify header file
dependencies may be technically correct, but it would be
ridiculously error prone and a maintenance nightmare.

Which is why GNU make (and most other makes) have provisions for
automatically generating the dependencies.
I don't know a better way to describe what I want than
"idiot-proof". If you miss a header file dependency, the next
incremental build may produce correct results, and it may not.
I want a build system where the incremental build always will
be correct (where correct means equivalent to a build from
completely clean) no matter the input to the build system.
(However, if the build system itself changes, then requiring
all developers to fully clean would be acceptable.)

There are two ways of using automatic dependency checking with
GNU make. The first is to specify a special target (typically
"depends"), which is invoked to update the dependencies. The
second updates the dependencies automatically at each make. The
first depends on the user explicitly invoking the target when
any header file usage is changed, which while not as error prone
as requiring the programmer to specify the dependencies
directly, is still open to error. The second has (or had) a
distinct impact on build times.

I currently use the first, because I don't change header use
that often, and the impact on build times was very significant
when I tried the second. The place where I just worked,
however, recently implemented the second, and the impact on
build times seemed quite acceptable, at least on modern machines
with modern compilers, so I'll probably change, when I can find
the time. Note, however, that building the dependencies requires
some collaboration from the compiler; otherwise, you need a
separate run of the compiler and some more or less fancy shell
scripts. (This is what I currently use with VC++: I've not
found an option which would generate the dependencies otherwise,
so I use /E, with the output piped to a shell script. It works
well, but I suspect that trying to do it every time I compile
will impact build times considerably.)
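
(For what it's worth, with g++ or any compiler that understands -MMD,
the second scheme can be done almost entirely inside the makefile.
A rough, untested fragment, with generic variable names:

    # Emit a .d file next to each object as a side effect of compiling;
    # -MP adds dummy rules so a deleted header doesn't break the build.
    # (Recipe lines must be indented with a real tab.)
    SRCS := $(wildcard *.cpp)
    OBJS := $(SRCS:.cpp=.o)

    %.o: %.cpp
            $(CXX) $(CXXFLAGS) -MMD -MP -c $< -o $@

    # Pull the generated dependencies back in; the leading '-' keeps
    # make quiet on the first build, when no .d files exist yet.
    -include $(OBJS:.o=.d)

With VC++ I'd still be scripting around the preprocessor, as above.)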
Lastly, I want it to be portable to basically anything with a
C++ implementation. My company supports nearly every server
and desktop platform known to man (Ex: Z/OS, HPUX, AIX,
Solaris, WIN, Linux, several mainframes, and more), and
several common Make-like tools purport to not run on all of
these systems. Preferably, I would like to not be bound to the
system specific shell either like Make. (Although you could
just require all developers to install sh and force sh usage,
at least for the makefiles.)

GNU make, itself, isn't really bound to a shell. On the other
hand, the commands that you use to rebuild the code will be
interpreted by the shell if they contain any metacharacters;
under Unix, this will always be /bin/sh, unless you specify
otherwise, but on other platforms (Windows, at least), it uses
the environment variable SHELL. My own build scripts
(makefiles) make extensive use of the shell, particularly for
executing tests, so I do require a Unix-like shell. Globally,
this is probably the simplest solution anyway: require the
installation of a Unix like shell, and set the environment
variable to point to it.
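
Concretely, that amounts to a single line, either in the environment
or near the top of the makefile; roughly:

    # Insist on a Bourne-compatible shell for all recipes, whatever
    # the platform's default happens to be.
    SHELL := /bin/sh

(On Windows you would point it at wherever the sh.exe was installed
instead.)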
For example, for your build systems, how much of the above
does it do?
Specific examples:
1- Does it correctly and automatically track header file
dependencies?

Not automatically, but it can be programmed to do so.
2- If a DLL changes, will you relink all dependent DLLs, transitive
and direct, or will you only relink direct dependent DLLs?

I'm not sure. I avoid dynamic linking when it's not necessary.
3- Can your developer inadvertently break the incremental
build in such a way that it succeeds on his local machine but
fails on the build machine because he missed specifying a
dependency?

The developer doesn't specify the dependencies. On the other
hand, because they are generated by a specific target, and (at
least in my case) are platform dependent, it's quite easy for
them to be up to date on one platform, and not on another.
4- Is your (unit) test framework incorporated into the build,
so that you can run only the tests which depend upon a
change?

The unit test framework is part of the build procedure, with the
"install" target dependent on the "runtest" target (which only
succeeds if all of the unit tests pass). I'm not sure what you
mean with regards to the second part---I've not done so, but it
wouldn't be too difficult to modify the makefiles so that the
entire install procedure is skipped if nothing has been changed
in the component. Other than that, however, I don't want to be
able to install a component where other components can see and
use it without executing all of the unit (regression) tests on
it.
5- Can your build system generate vcproj files on windows? (I
personally like the Visual Studios IDE, and it would be a
shame to lose this by going to a fast, correct build system.)

No idea. I don't even know what a vcproj file is. (In theory,
it should be able to, but I don't know how. And with regards to
the Visual Studios IDE, I've used it a couple of times on my own
code, without any vcproj files in the project itself---I just
open a more or less dummy project, then indicate where it should
find the relevant files.)
Thus far, I have not found a satisfactory answer to these
questions in the open source world. Thus, I've been developing
my own answer for the last couple of months, where I hope to
achieve (almost) all of these goals.

Handling simple dependencies at the file level is pretty
trivial. Handling things like the automatic generation of
dependencies, and requiring a minimum of input from the
developers in each component, can rapidly become very
complicated.
 

James Kanze

I was also looking for a new build system a few months ago and
went for Boost.Build. It's portable, high-level and you can
use it without all the other Boost C++ libraries (or if you
use the Boost C++ libraries it's already available anyway).
The biggest problem is the lack of up to date documentation.
When playing around with Boost.Build I started to write my
own (see http://www.highscore.de/cpp/boostbuild/). This should
be enough to get Boost.Build to work (at least I did :) but
other build systems provide more and better documentation.

Have they fixed it since I tried it? I found it extremely
inaccurate; it seems to ignore some of the important include
dependencies entirely.
 

Boris Schaeling

Have they fixed it since I tried it? I found it extremely
inaccurate; it seems to ignore some of the important include
dependencies entirely.

Even if you tell me when you tried it, I'm not sure I can answer your
question. :) If I weren't reading the Boost.Build mailing list I would
probably think the project was abandoned, judging by the documentation (from
2007), the Wiki (from 2006) or the changelog (from 2007). The only
definitive source of information is the mailing list (so much for the
documentation :-/).

Boris
 

Klaus Rudolph

Joshua said:
I'm curious how all of you actually build your code. GNU Make, SCONS,
bjam, vcproj files, etc.?

I have used GNU make for years without any real problems.
Specifically, I'm very curious how close your build system is to my
"ideal" build system. My ideal build system has two requirements, fast
and correct.

make itself is fast. gcc/ld is fast enough for our projects. In addition
I use ccache in combination with gcc to speed up projects which use
lots of templates.
Correct means that a full build at any time produces the
same results as a build from a completely clean local view.

make simply looks at file timestamps. If the timestamps are correct, the
build will also be correct. Trouble can result from configuration
tools which put older files in place with an old modification date on the
file timestamp. But this is not a make problem; it must be solved
by your configuration management tools. Big trouble can also result from
having different clock times on the several hosts of a build farm. :)
Fast means
fast. When builds can take hours or more, I don't want to look at an
error, and wonder if it was because I didn't build from clean,
potentially wasting more hours as I do a sanity check rebuild.

What are we talking about? Compiling 100 files of 1000 lines each? On
which system and on which hardware? Compiling 10K files on a 386 will
not be really fast :)
I think it's also fair to say to achieve fast, you need to have a
parallel build system, a build system that does the build steps in
parallel as much as possible.

make can do builds in parallel and this works well in my environment.

Something like "ease of use" is a third aspect of my ideal build
system. A naive build system built upon GNU Make which requires
developers to manually specify header file dependencies may be
technically correct, but it would be ridiculously error prone and a
maintenance nightmare.

Header dependencies can also be handled by make. There is no need to
write handcrafted dep files. I have *ONE* generic makefile which
fulfills my needs in a lot of projects, also generating docs, tags and
deps, and doing an incremental link if needed. Project-specific parts are
included first, so the generic file stays "stable".
Lastly, I want it to be portable to basically anything with a C++
implementation. My company supports nearly every server and desktop
platform known to man (Ex: Z/OS, HPUX, AIX, Solaris, WIN, Linux,
several mainframes, and more), and several common Make-like tools
purport to not run on all of these systems. Preferably, I would like
to not be bound to the system specific shell either like Make.
(Although you could just require all developers to install sh and
force sh usage, at least for the makefiles.)

As far as I know, GNU make runs on most of the OS platforms you mentioned.

For example, for your build systems, how much of the above does it do?
Specific examples:
1- Does it correctly and automatically track header file
dependencies?

Yes.

2- If a DLL changes, will you relink all dependent DLLs, transitive
and direct, or will you only relink direct dependent DLLs?

We only link the files which must be linked. Incremental linking needs some
kind of configuration. Maybe you have a generic makefile which uses the
file system tree for deps, or maybe you have to configure some "SOURCE"
lines in your makefile.
3- Can your developer inadvertently break the incremental build in
such a way that it succeeds on his local machine but fails on the
build machine because he missed specifying a dependency?

If someone modifies a makefile, he can introduce any problem he wants :)
If you use the same makefile and the same sources, the result should be
the same. Maybe I didn't understand your question?!


4- Is your (unit) test framework incorporated into the build, so that
you can run only the tests which depend upon a change?

Yes, and you can only check in if the tests are successful. Nightly
builds also put their results automatically into the bug database.
5- Can your build system generate vcproj files on windows? (I
personally like the Visual Studios IDE, and it would be a shame to
lose this by going to a fast, correct build system.)

I have no need for any M$ IDE, and no fun manually configuring
projects. I have *one* generic makefile and no need for any kind of
configuration.

If I need something special, like cross compiling for multiple platforms,
linking against special libs (like for memory bug tracing), or
special compilations with special debug options, I need some additional
lines in my makefile. But as far as I know these things cannot be done
with M$ at all.

Also, the connection between configuration management and the build
system sometimes must be handled in a special way. I have no idea how M$
would handle such needs.
Thus far, I have not found a satisfactory answer to these questions in
the open source world. Thus, I've been developing my own answer for
the last couple of months, where I hope to achieve (almost) all of
these goals.

After many years of software development, my conclusion is that make can
do all the things I need. I have often seen companies with their "own" build
systems and a lot of trouble with them. But I have never seen a real
project which needs more than make can handle.
 

Richard

[Please do not mail me a copy of your followup]

"Boris Schaeling" <[email protected]> spake the secret code
I was also looking for a new build system a few months ago and went for
Boost.Build. It's portable, high-level and you can use it without all the
other Boost C++ libraries (or if you use the Boost C++ libraries it's
already available anyway). The biggest problem is the lack of up to date
documentation. When playing around with Boost.Build I started to write my
own (see http://www.highscore.de/cpp/boostbuild/). This should be enough
to get Boost.Build to work (at least I did :) but other build systems
provide more and better documentation.

Boris Schaeling wrote a tutorial about it recently:
<http://www.highscore.de/cpp/boostbuild/>
 

Joshua Maurice

I use GNU Make and some pretty hairy generic makefiles. It's
far from perfect, but I've yet to find anything really better,
and it seems to be about the most portable around.

I'm also of that position.

Thus far, my little side project has been to rewrite the portions of
GNU Make I care about. Rewrite because I felt it would be a good
learning exercise, and because I did not want to go over the halfway-
decent halfway-not-decent C code of GNU Make to add on a couple of
extensions I wanted. (Most importantly built-ins for cat and echo-to-
file.) I've gotten some pretty good results thus far, ~50x faster on
Linux running this not particularly contrived ~250,000 line long
makefile when all of the targets are up to date. (I'm working on
getting legal to let me open source it. It'll be a couple of months.)
(How is it that much faster? I don't know. I've been meaning to look
at why for a while now.)

The problem is that it is very, very difficult to do this and be
correct. Most build systems I've seen compromise somewhere, and
will occasionally rebuild things that aren't necessary.

(From what I've heard, Visual Age is the exception. But I've
never had the chance to try it, and from what I've heard, it
also compromises conformance somewhat to achieve this end.)

Indeed. I would argue that you would need to solve the Halting Problem
to actually get a system where you only recompile that which you need
to recompile. I don't know anything implementable-by-me offhand which
is better than to compare last modified file timestamps and build
dependent files (ala Make). That's what I meant, not literally
"recompile exactly and only those things which you need to". (Better
things are possible for some languages. I hear that Eclipse's Java
compiler is crazy (awesome), and only recompiles functions which are
out of date. Such a thing is not particularly practical in C++ due to
several reasons, such as the preprocessor and the "include file"
compilation model.)

I think it's fair to
say that the only way to achieve correct and fast is to have
an incremental build system, a build system that builds
everything which is out of date, and does not build anything
already up to date.
[...]
(Arguably, that definition is biased. It implies the use of
file timestamps, but there could exist fast and correct build
systems which do not rely upon file timestamps.)

I didn't see anything about file timestamps in it. In fact, my
reading of what you require pretty much means that anything
based on file timestamps cannot be used, since it will
definitely result in unnecessary recompilations (e.g. if you add
some comments to a header file). Most makes do a pretty good
job of being correct, but because they depend on file
timestamps, they occasionally recompile more than is necessary.

"Out of date" implies timestamps, I think. If you wanted to literally
only recompile that which you need to, you definitely could not rely
upon file timestamps.
 

Francesco S. Carta

"Out of date" implies timestamps, I think. If you wanted to literally
only recompile that which you need to, you definitely could not rely
upon file timestamps.

Interesting... what about an editor which can tell a comment when it
sees one, that keeps track of the source - that is, the non-
comments ;-) - and that allows you to save back the file with the old
timestamp _in case_ the source is left formally identical?

I mean, an editor that ignores not only comments but also changed
whitespace scattered among the token sequence.

Maybe such an editor already exists... does it?

I float over Code::Blocks, MinGW & AvgCPU by PoorGear.
Whatever it takes, it takes.

No juice added, sorry.

Cheers,
Francesco
____________________________
Francesco S. Carta, hobbyist
http://fscode.altervista.org
 

Jerry Coffin

[ ... ]
Indeed. I would argue that you would need to solve the Halting
Problem to actually get a system where you only recompile that
which you need to recompile. I don't know anything implementable-
by-me offhand which is better than to compare last modified file
timestamps and build dependent files (ala Make). That's what I
meant, not literally "recompile exactly and only those things
which you need to".

The semi-obvious next step is to save a pre-processed version of a
header (or a cryptographic hash of that). This makes it fairly easy
to screen out changes that only affected things like comments, so you
can avoid re-compiling a lot of dependents over a meaningless change.

You can also combine that with normal time-stamp comparison: normally
use a time-stamp comparison; if the time-stamp has changed, re-do the
cryptographic hash of the pre-processed version of the file. If that
has NOT changed, you automatically touch the header's timestamp back
to the previous date (and log what you did, of course). That avoids
rereading and rehashing the header every time you do a build.
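
A rough shell sketch of that scheme, assuming gcc for the preprocessing,
sha1sum for the hash, the same include flags as the real compile, and a
.hashcache directory primed on an earlier run (all names illustrative):

    # Run for a header whose timestamp is newer than the last check.
    new=$(g++ -x c++ -E -P $CXXFLAGS foo.h | sha1sum | cut -d' ' -f1)
    old=$(cat .hashcache/foo.h.sha1 2>/dev/null)

    if [ "$new" = "$old" ]; then
        # Comment/whitespace-only change: restore the recorded timestamp
        # so make no longer considers the dependents out of date.
        touch -r .hashcache/foo.h.stamp foo.h
        echo "foo.h: cosmetic change only, timestamp restored" >> build.log
    else
        # Real change: record the new hash and the new timestamp.
        echo "$new" > .hashcache/foo.h.sha1
        touch -r foo.h .hashcache/foo.h.stamp
    fi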
 

ld

After many years of software development, my conclusion is that make can
do all the things I need. I have often seen companies with their "own" build
systems and a lot of trouble with them. But I have never seen a real
project which needs more than make can handle.

agree.

a+, ld.
 

Joshua Maurice

[ ... ]
Indeed. I would argue that you would need to solve the Halting
Problem to actually get a system where you only recompile that
which you need to recompile. I don't know anything implementable-
by-me offhand which is better than to compare last modified file
timestamps and build dependent files (ala Make). That's what I
 meant, not literally "recompile exactly and only those things
which you need to".

The semi-obvious next step is to save a pre-processed version of a
header (or a cryptographic hash of that). This makes it fairly easy
to screen out changes that only affected things like comments, so you
can avoid re-compiling a lot of dependents over a meaningless change.

I really don't like the idea of having the correctness of my build
depend on a hash. One time, it's going to fail, and it's going to be
non-obvious why it fails. Though, someone once tried to argue, asking
what's the relative chance of a cosmic particle twiddling a bit on
your machine vs a cryptographic hash collision for a code change. I
suppose he has a point, but using a hash to determine "out of date"
still rubs me the wrong way.
 

Jerry Coffin


[ ... ]
I really don't like the idea of having the correctness of my build
depend on a hash. One time, it's going to fail, and it's going to be
non-obvious why it fails.

The "it's going to fail" is pessimistic to an unwarranted degree. In
reality, even if every programmer on earth converted to using it, the
chances of it failing before you die are so minuscule they're well
below consideration.
Though, someone once tried to argue, asking
what's the relative chance of a cosmic particle twiddling a bit on
your machine vs a cryptographic hash collision for a code change. I
suppose he has a point, but using a hash to determine "out of date"
still rubs me the wrong way.

There are all sorts of things that could lead to a random failure,
and as chip geometries shrink, the chances of quite a few of the
others grow. Pick a few clichés for unlikely happenings (e.g. winning
the lottery, being struck by lightning) and you're more likely to
have them happen in the same second than see a collision with a
decent cryptographic hash...
 

James Kanze

Indeed. I would argue that you would need to solve the Halting
Problem to actually get a system where you only recompile that
which you need to recompile.

Not at all. As I said, Visual Age does (or did) it. For other
reasons, Visual Age wasn't fully conformant, but I think those
reasons are fixable.

What you do need is a partial compiler more or less built into
the editor, which parses enough of the file (header or source)
each time you save it to update what definitions it depends on,
and which compares all of your definitions against saved copies.
You then maintain the list of definitions and last modification
dates somewhere (in a database?), and use it, rather than
timestamps.

It's not really fundamentally different from using timestamps,
except for the granularity.
I don't know anything implementable-by-me offhand which is
better than to compare last modified file timestamps and build
dependent files (ala Make).

Well, it's obviously more difficult to implement, since you need
to implement a large part of a C++ compiler, and somehow
integrate it into your editor. Definitely not a week-end
project:).
"Out of date" implies timestamps, I think.

Maybe. But timestamps of what? Files, or something with a
finer granularity.
 

James Kanze

[ ... ]
I really don't like the idea of having the correctness of my
build depend on a hash. One time, it's going to fail, and
it's going to be non-obvious why it fails.
The "it's going to fail" is pessimistic to an unwarranted
degree. In reality, even if every programmer on earth
converted to using it, the chances of it failing before you
die are so minuscule they're well below consideration.

Note that most compilers will use some sort of pseudo-random or
hashed values for other things as well (e.g. the mangled name of
an anonymous namespace), counting on their being different as
well. So the "it's going to fail" should be just as applicable
here.

But of course, in this case, you don't need to use the hash; all
you need to do is to store a previous copy of the file (or the
definitions in it) somewhere; compare the definitions in the
changed version of the file, and if they are different, update
the time stamp in the base where you stored it.
 

Joe Smith

I don't have any suggestions, but I have general comments on your ideal
system.
Correct means that a full build at any time produces the
same results as a build from a completely clean local view.

Be warned that on some platforms, there is no way you would get bit-for-bit
identical executables both ways, due to timestamps embedded in the
executable.

However, one can obviously get identical machine code, which is probably
all that is important to you.
Fast means fast. When builds can take hours or more, I don't want
to look at an error, and wonder if it was because I didn't build
from clean, potentially wasting more hours as I do a sanity check
rebuild.

True. Unfortunately long build times come from a variety of causes, one of
significant note being the lack of export support in compilers, which
means substantially larger and more complex header files. That can partially
be mitigated with the use of precompiled headers. Unfortunately, that method
is not very cross-platform compatible.
All build systems can be made correct by having it first completely
clean out the local view, then building. However, this is also the
polar opposite of fast. I think it's fair to say that the only way to
achieve correct and fast is to have an incremental build system, a
build system that builds everything which is out of date, and does not
build anything already up to date.

Agreed, although in theory a utility like ccache can be used in conjunction
with a wipe-it-all-and-rebuild system, by recognizing compilations where
nothing has changed and giving the same result. That is not very
cross-platform compatible, and it just adds complexity, so we will not
consider it further.
(Arguably, that definition is biased. It implies the use of file
timestamps, but there could exist fast and correct build systems
which do not rely upon file timestamps.)

There are only two types of systems for identifying whether a file is out of
date. Timestamp-based systems are one. Content-based systems are another.
With content-based systems, the build system maintains a copy of the source
file and the dependencies (headers) that resulted in the current output file.
The system rebuilds the source only if the current file's contents are
different from the stored copy.

In theory, by doing things like ignoring comments during the comparison, one
can avoid rebuilding some files that have changed, but not in a way that
could affect the output. However, that may well take more time than it
saves, since it would mean parsing every file every time the build command
is run.

In general, good cryptographic hashes can be used to speed up the
comparison. Nevertheless, content-based systems do require reading and
comparing the content of every file every time the build system is run, which
is why Make uses timestamps.
I think it's also fair to say to achieve fast, you need to have a
parallel build system, a build system that does the build steps in
parallel as much as possible.

Fair enough. This really is only very useful if you have a multicore
machine, or some sort of compile farm, or if you expect there to be a
mixture of IO bound and CPU bound tasks that can be run at the same time.

GNU Make supports this within a single system (one needs some additional
tools to support the compile farm case).
Something like "ease of use" is a third aspect of my ideal build
system. A naive build system built upon GNU Make which requires
developers to manually specify header file dependencies may be
technically correct, but it would be ridiculously error prone and a
maintenance nightmare.

Very true.
I don't know a better way to describe what I
want than "idiot-proof". If you miss a header file dependency, the
next incremental build may produce correct results, and it may not. I
want a build system where the incremental build always will be correct
(where correct means equivalent to a build from completely clean) no
matter the input to the build system. (However, if the build system
itself changes, then requiring all developers to fully clean would be
acceptable.)

This is quite possible. In fact, GNU Make can even automatically do this,
with the correct rules.
The key to this is that the build system needs to keep some information
about the headers each main source file depends on.

While I'm not sure about Visual C++'s compiler, I know that GCC has an
option to run the preprocessor on a file and print a list of all the
included files, direct and indirect. (One can get this information from a
VC++ preprocessor run, but I'm not sure it has a convenient option for
outputting just this information.)

That provides a way to determine the headers needed by the source. So
when building from a clean state this gets done on every source file. The
build system stores this. In later passes, when checking whether a file is up
to date, the system knows to check the file and all the headers it uses. If
any have changed, the file must be rebuilt, and the build system must also
re-run the preprocessor to get the new list of required headers (since it
may have changed) and store the new information.
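
(Concretely, with GCC the option I have in mind is the -M family; for
example, with made-up file names:

    $ g++ -MM widget.cpp
    widget.o: widget.cpp widget.h util/strings.h util/config.h

The build system can capture that line, store it, and splice it into its
dependency graph on the next run.)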
Lastly, I want it to be portable to basically anything with a C++
implementation. My company supports nearly every server and desktop
platform known to man (Ex: Z/OS, HPUX, AIX, Solaris, WIN, Linux,
several mainframes, and more), and several common Make-like tools
purport to not run on all of these systems. Preferably, I would like
to not be bound to the system specific shell either like Make.
(Although you could just require all developers to install sh and
force sh usage, at least for the makefiles.)

This is the only part that GNU Make cannot do as well as would be desired.
But it is still highly desirable.
2- If a DLL changes, will you relink all dependent DLLs, transitive
and direct, or will you only relink direct dependent DLLs?

This does not make sense to me. Are you talking about the static DLL link
stubs? The DLLs themselves are linked dynamically.
4- Is your (unit) test framework incorporated into the build, so that
you can run only the tests which depend upon a change?

I've never heard of any build system that has any support for that at a
fine-grained scale, only at a fairly high level (such as only running the
tests for a library that has changed, not the other libraries that are part
of the superproject).
5- Can your build system generate vcproj files on windows? (I
personally like the Visual Studios IDE, and it would be a shame to
lose this by going to a fast, correct build system.)

Here you presumably would like full integration with VC++, by which I mean
you can use it not only to edit the source files, but also click on the build
project button and have it run this build system, and use the debugger on the
built files. That is definitely theoretically possible. VC++ has
always been modular enough to theoretically support that, although in the
past at least, this could only be done through writing a DLL for the build
system you desired. I think now this might be possible through the textual
configuration files.
 

Joshua Maurice

This does not make sense to me. Are you talking about the static DLL link
stubs? The DLLs themselves are linked dynamically.

Let's consider the compile-time-linked shared object, e.g. gcc -L and -l,
not dlopen.

libAA.so depends on libBB.so, which depends on libCC.so. libAA.so does not
directly use any symbols from libCC.so. If I change a cpp file which
is linked into libCC.so, this can change the symbols libCC.so exports,
and thus could make libBB.so no longer "work" if I removed a symbol
which libBB.so was using from libCC.so. Thus, when I change that cpp
file in libCC.so, I need to relink, aka rebuild, libCC.so, and then
rebuild libBB.so. However, we do not need to rebuild libAA.so.

A simple Make-like system would have libAA.so normally (not order-
only) depend on libBB.so, which would normally depend on libCC.so.
Thus, when I change libCC.so in that simple build system, I will
relink all shared objects downstream, even though this is not
technically required.

I'm thinking you might be able to pull off not rebuilding all
downstream shared objects by caching results of nm. I'm not quite sure
if it's practical though. I work on this in my spare time when I feel
like it, so it could be a while. I'm thinking that removing a symbol
will be ok. That is easily catchable and you can do the correct
rebuilds. However, suppose you have a link line
-lAA -lBB -lCC -lDD
Suppose you add a symbol to AA, and this symbol is also exported from
BB. This means that if the library was rebuilt, it would find the
symbol from AA, but without a rebuild it would use the symbol from BB.
(At least, I think. I'm not the most familiar with such intricacies.)
This would be slightly more annoying to catch, especially depending on
the exact details of the linker. (Again, frankly, I don't know enough
at the moment.) This also sounds like an ODR violation, so I could
just punt on the undefined behavior.

If possible, I'd like to do this. When my shared object dependencies
get like ~20 levels deep, it could save some time to not relink
everything downstream.
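
A sketch of what the nm caching might look like in makefile terms
(completely untested; the .abi suffix, variable names, and library names
are just made up for illustration):

    # Per shared object, keep an "interface" file that is rewritten only
    # when the set of exported symbols actually changes.
    libCC.so.abi: libCC.so
            nm -D --defined-only $< | awk '{print $$NF}' | sort > $@.tmp
            cmp -s $@.tmp $@ || cp $@.tmp $@
            rm -f $@.tmp

    # Downstream libraries depend on the interface file rather than on
    # libCC.so itself, so a relink of libCC.so that leaves its exports
    # unchanged stops rippling here.
    libBB.so: $(BB_OBJS) libCC.so.abi
            $(CXX) -shared -o $@ $(BB_OBJS) -L. -lCC

It wouldn't catch the -lAA/-lBB shadowing case above, but for the plain
added/removed symbol case it seems workable.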
I've never heard of any build system that has any support for that at a
fine-grained scale, only at a fairly high level (such as only running the
tests for a library that has changed, not the other libraries that are part
of the superproject).

I'm not sure how useful it would be. However, when the tests take ~10
hours to run on the build machine, I'd still like to look for a way to
speed that up.
 

Jorgen Grahn

There are all sorts of things that could lead to a random failure,
and as chip geometries shrink, the chances of quite a few of the
others grow. Pick a few clichés for unlikely happenings (e.g. winning
the lottery, being struck by lightning) and you're more likely to
have them happen in the same second than see a collision with a
decent cryptographic hash...

Nitpick: it doesn't even have to be cryptographic: I can take md5sum,
and modify it so that if its input starts with

Please note: my hash is d3b07384d113edec49eaa6238ad5ff00

it will return d3b07384d113edec49eaa6238ad5ff00. That's not a
cryptographic hash, but it's just as useful as one for this purpose.

/Jorgen
 
