No unanswered question

  • Thread starter Alf P. Steinbach /Usenet

Keith H Duggar

A good insight. Makefiles are part of the source code.

No, not unless the clients /want/ them to be "part of the
source code". Make is a rather general purpose dependency
analysis tool. The dependencies are specified as files; so
if you want one or more makefiles to be in the prereqs of
one or more targets then put them there! The complaint or
"insight" as you call would instead be properly directed
at the dependency generation (whether manual or automatic)
tool not the dependency analysis tool (Make).
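
For concreteness, a minimal sketch in GNU Make (target and file
names are placeholders; recipe lines must start with a tab):

  # Listing the makefile itself as a prerequisite means that editing
  # the build rules forces the affected objects to be rebuilt.
  # $(MAKEFILE_LIST) expands to the makefiles read so far.
  CXX      ?= g++
  CXXFLAGS ?= -O2 -Wall

  %.o: %.cpp $(MAKEFILE_LIST)
          $(CXX) $(CXXFLAGS) -c $< -o $@

  app: main.o util.o
          $(CXX) $^ -o $@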

The same point applies to the complaining about environment
variables, changes that affect include search path results,
etc. Many of those can be handled with proper scripting to
touch some prereq files. For example, in a build system that
we maintain there is a script that compares the environment
with a capture file. If the current environment differs from
the capture, a set of changed variables is calculated, and all
source files (including makefiles) that reference one of
the changed variables are touched, thus triggering a rebuild
of everything that depends on them.
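
A rough sketch of the idea (not our actual script; the variable and
file names are illustrative, and this simplified variant makes the
capture file itself a prerequisite instead of touching individual
sources):

  # Re-capture the tracked variables on every run; only update the
  # capture file's timestamp when something actually changed.
  TRACKED_VARS := CC CXX CXXFLAGS INCLUDE_PATH

  env.capture: FORCE
          @printf '%s\n' $(foreach v,$(TRACKED_VARS),'$(v)=$($(v))') > $@.tmp
          @if cmp -s $@.tmp $@; then rm $@.tmp; else mv $@.tmp $@; fi

  # Anything sensitive to those variables lists the capture as a prereq.
  %.o: %.cpp env.capture
          $(CXX) $(CXXFLAGS) -c $< -o $@

  FORCE: ;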

In short, make is /one/ tool, a dependency analysis tool,
that is /part/ of a build system (called Unix). Learn to use
the full suite of tools instead of searching for a "One True
Uber Tool" monolith. Remember the Unix paradigm of "many small
tools working together to solve big problems".

Of course, there are some actual problems with make. A few
have been mentioned in other posts. Another is proper handling
of a single command that outputs multiple targets, which is,
well, let's say annoying ;-), with make.
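
For example, the classic stamp-file workaround, sketched with a
made-up generator called idlgen that emits both a header and a
source file:

  # A naive "foo.h foo.cpp: foo.idl" rule with a recipe can run the
  # generator once per target under parallel make.  A stamp file that
  # stands in for both outputs avoids that.
  foo.h foo.cpp: foo.stamp ;

  foo.stamp: foo.idl
          idlgen foo.idl
          touch $@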

KHD
 

James Kanze

On 07/09/2010 08:25 PM, Joshua Maurice wrote:

[...]
You are correct about current build systems not handling all
incremental changes correctly,

What do you mean by "not handling incremental changes
correctly"? Do you mean that some files are not being
recompiled when they should be? Or do you mean that files are
being recompiled when it isn't necessary (because all you did
was modify an inline function in the header, and the file in
question doesn't use that function)?

For the first, make, used correctly and in conjunction with the
compiler, seems to work well, provided you don't go messing
around with the timestamps on the file.
I've been programming long enough to have come across that
problem. Without more thought on this topic I'm not entirely
sure how to ensure incremental changes are correctly handled,
but it sounds like you'd need to generate some sort of
relationship graph.

Generating the relationship graph is exactly what make does.
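
The usual arrangement is something like the sketch below, with the
compiler emitting the header dependencies that make then reads back
in (GCC flags shown; other compilers have equivalents):

  SRCS := $(wildcard *.cpp)
  OBJS := $(SRCS:.cpp=.o)

  # -MMD writes a .d file listing the headers each object depends on;
  # -MP adds dummy targets so a deleted header doesn't break the build.
  %.o: %.cpp
          $(CXX) $(CXXFLAGS) -MMD -MP -c $< -o $@

  app: $(OBJS)
          $(CXX) $^ -o $@

  -include $(OBJS:.o=.d)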
 

Joshua Maurice

No, not unless the clients /want/ them to be "part of the
source code". Make is a rather general purpose dependency
analysis tool. The dependencies are specified as files; so
if you want one or more makefiles to be in the prereqs of
one or more targets then put them there! The complaint or
"insight" as you call would instead be properly directed
at the dependency generation (whether manual or automatic)
tool not the dependency analysis tool (Make).

Make's model is to have developers specify rules in a Turing-complete
programming language, aka makefiles. This is a horrible model.

First, as a matter of practicality, very few people in my company, and
I would imagine the industry at large, are anywhere near as knowledgeable
as I am about build systems. I have an architect in my company who has
sworn up and down that the Recursive Make Considered Harmful solution
wasn't known or implementable 6 years ago, but it was. As most of the
users of make do not even understand the basics of make, they will
break it. Moreover, it's somewhat unreasonable to require them to.
They're supposed to be experts at writing code in the product domain,
not in writing build systems.

Second, make's model is fundamentally fubar. You cannot have a fully
correct incremental build written in idiomatic make ala Recursive Make
Considered Harmful. See else-thread, or the end of this post for a
synopsis. Make was good back in the day when a single project did fit
into a single directory and a single developer knew all of the code,
but when a developer does not know all of the code, make's model no
longer works.

Simply put, this is my use case which make will not handle. I'm
working in a company on a project with over 25,000 source files in a
single build. The compile / link portion takes over an hour on a
developer machine, assuming no random errors, which there frequently
are on an incremental build. I work on one of the root components, a
reusable module which is used by several services (also part of the
same build). It is my explicit responsibility to do a decent effort at
not breaking the build from any checkin. As the closest thing my
company has to a build expert, I know that the build is not
incrementally correct. I hacked a large portion of it together. I can
do an incremental build most of the time, and just cross my fingers
and hope that it's correct, but I have no way of knowing it.

On the bright side, I manage to not break it most of the time.
However, with a large number of developers working on it, the last
build on the automated build machine is almost always broken. On an
almost weekly basis checkin freezes are enacted in an attempt to
"stabilize" the build. The problem is that most other developers are
not as thorough in their own testing as I, and the automated build
machine takes hours to do the full clean recompile. The time from a
checkin to a confirmed bug is quite large, and as the number of build
breakages goes up, so does this turnaround time as compile failures
hide compile failures.

Yes, I know the standard solution is to break up the project into
smaller projects. I would like that too. However, I'm not in a
position of power to make that happen, and no one else seems
interested in changing the status quo there.
The same point applies to the complaining about environment
variables, changes that affect include search path results,
etc. Many of those can be handled with proper scripting to
touch some prereq files. For example, in a build system that
we maintain there is a script that compares the environment
with a capture file. If the current environment differs from
the capture, a set of changed variables is calculated, and all
source files (including makefiles) that reference one of
the changed variables are touched, thus triggering a rebuild
of everything that depends on them.

Pretty cool system. I would still argue no, that there is a difference
between what I want and what your system handles. As I mentioned else-
thread, no build system is perfect. The line must be drawn somewhere.
At the very least, the correctness of an incremental build system is
conditional on the correctness of the code of the build system itself.
Moreover, if the developer installs the wrong version of the build
system, then he's also fubar.

However, there is an obvious constrained problem set which make and
other build systems claim to solve. The problem is: "Given a correctly
set up environment, the build system should be able to do a correct
incremental build over any and all possible changes in source control
for the project, and any and all possible changes in a developer's
local view of source control." A developer working on a large project,
who does not know all of the code, is unable to distinguish any finer.
He doesn't know about that other component's code, nor their
makefiles. The entire point of an incremental build system is the
automation of such dependency tracking, and that should include the
build system scripts themselves.
In short, make is /one/ tool, a dependency analysis tool,
that is /part/ of a build system (called Unix). Learn to use
the full suite of tools instead of searching for a "One True
Uber Tool" monolith. Remember the Unix paradigm of "many small
tools working together to solve big problems".

Of course, there are some actual problems with make. A few
have been mentioned in other posts. Another is proper handling
of a single command that outputs multiple targets, which is,
well, let's say annoying ;-), with make.

Interesting. I'm sure there's some logical fallacy in here somewhere,
but I don't know the name(s). In short, you assert that the Unix way
works, and is better than other ways, especially better than "One True
Uber Tool monolith". I'd rather not get into this discussion, as it's
mostly tangential. The discussion at hand is that make is broken. Not being
able to handle multiple outputs from a single step is annoying,
but it's relatively minor. Its major problems are:
1- Its core philosophy of a file dependency graph with cascading
rebuilds without termination, combined with its idiomatic usage, is
inherently limited and broken without further enhancement.
- a- It will not catch new nodes hiding other nodes in search path
results (such as an include path).
- b- It will not catch removed nodes nor edges which should trigger
rebuilds (such as removing a cpp file will not relink the library).
- c- It will not catch when the rule to build a node has changed, which
should trigger a rebuild (such as adding a command-line preprocessor
define; a common workaround is sketched after this list).
- d- It will not get anywhere close to a good incremental build for
other compilation models, specifically Java. A good incremental Java
build cannot be based on a file dependency graph. In my solution, file
timestamps are involved yes, but the core is not a file dependency
graph with cascading rebuilds without termination conditions.
2- It exposes a full Turing-complete language to common developer
edits, and these common developer edits wind up in source control. The
immediate conclusion is that a build system based on idiomatic make
can never be incrementally correct over all possible changes in source
control. That is, a developer will inevitably make a change which
breaks the idiomatic usage (what little there is) and will result in
incremental incorrectness. False negatives are quite annoying, but
perhaps somewhat acceptable. False positives are the bane of a build
system's existence, but they are possible when the build system itself
is being constantly modified \without any tests whatsoever\ as is
common with idiomatic make.
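
For concreteness, point 1c is the kind of thing people typically
patch by recording the command line in a file and making it a
prerequisite, roughly like this sketch (all names are illustrative):

  COMPILE.cc = $(CXX) $(CXXFLAGS) $(CPPFLAGS) -c

  # Rewrite cmdline.txt only when the command actually changes, so a
  # new -D flag forces recompilation but an unchanged one does not.
  cmdline.txt: FORCE
          @echo '$(COMPILE.cc)' > $@.tmp
          @if cmp -s $@.tmp $@; then rm $@.tmp; else mv $@.tmp $@; fi

  %.o: %.cpp cmdline.txt
          $(COMPILE.cc) $< -o $@

  FORCE: ;

But every such patch is yet more hand-maintained makefile code, which
is exactly the problem described in point 2.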
 

Ian Collins

Make's model is to have developers specify rules in a Turing-complete
programming language, aka makefiles. This is a horrible model.

But it works, and tools can hide them from the nervous developer.
First, as a matter of practicality, very few people in my company, and
I would imagine the industry at large, are anywhere near as knowledgeable
as I am about build systems. I have an architect in my company who has
sworn up and down that the Recursive Make Considered Harmful solution
wasn't known or implementable 6 years ago, but it was. As most of the
users of make do not even understand the basics of make, they will
break it. Moreover, it's somewhat unreasonable to require them to.
They're supposed to be experts at writing code in the product domain,
not in writing build systems.

Which is why every team I have worked with or managed had one or two
specialists who look after the build system and other supporting tools
(SCM for instance).
Second, make's model is fundamentally fubar. You cannot have a fully
correct incremental build written in idiomatic make ala Recursive Make
Considered Harmful. See else-thread, or the end of this post for a
synopsis. Make was good back in the day when a single project did fit
into a single directory and a single developer knew all of the code,
but when a developer does not know all of the code, make's model no
longer works.

No matter how deep a project's directories go, you can still use a
single flat makefile. I use a single makefile for all my in house code.
There is no better way of supporting distributed building. On larger
projects I have run, we used a single makefile for each layer of the
application (typically no more than 4).
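
Roughly this shape, where the directory and library names are
invented for illustration and the source lists could equally be
explicit:

  LIBFOO_SRCS := $(wildcard libfoo/src/*.cpp)
  LIBBAR_SRCS := $(wildcard libbar/src/*.cpp)
  APP_SRCS    := $(wildcard app/*.cpp)
  ALL_SRCS    := $(LIBFOO_SRCS) $(LIBBAR_SRCS) $(APP_SRCS)

  %.o: %.cpp
          $(CXX) $(CXXFLAGS) -MMD -MP -c $< -o $@

  libfoo.a: $(LIBFOO_SRCS:.cpp=.o)
          $(AR) rcs $@ $^

  libbar.a: $(LIBBAR_SRCS:.cpp=.o)
          $(AR) rcs $@ $^

  app: $(APP_SRCS:.cpp=.o) libfoo.a libbar.a
          $(CXX) $^ -o $@

  -include $(ALL_SRCS:.cpp=.d)

One makefile, many directories, and make sees the whole dependency
graph at once.
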
Simply put, this is my use case which make will not handle. I'm
working in a company on a project with over 25,000 source files in a
single build. The compile / link portion takes over an hour on a
developer machine, assuming no random errors, which there frequently
are on an incremental build. I work on one of the root components, a
reusable module which is used by several services (also part of the
same build). It is my explicit responsibility to do a decent effort at
not breaking the build from any checkin. As the closest thing my
company has to a build expert, I know that the build is not
incrementally correct. I hacked a large portion of it together. I can
do an incremental build most of the time, and just cross my fingers
and hope that it's correct, but I have no way of knowing it.

On the bright side, I manage to not break it most of the time.
However, with a large number of developers working on it, the last
build on the automated build machine is almost always broken. On an
almost weekly basis checkin freezes are enacted in an attempt to
"stabilize" the build. The problem is that most other developers are
not as thorough in their own testing as I, and the automated build
machine takes hours to do the full clean recompile. The time from a
checkin to a confirmed bug is quite large, and as the number of build
breakages goes up, so does this turnaround time as compile failures
hide compile failures.

It sounds like it is your company's process that's broken, not make.
Yes, I know the standard solution is to break up the project into
smaller projects. I would like that too. However, I'm not in a
position of power to make that happen, and no one else seems
interested in changing the status quo there.

Ah, so it is!

Your tools have to support your process and your process has to suit
your tools. You can't fix a process problem by using "better" tools.
 

Joshua Maurice

But it works and tool can hide them from the nervous developer.


Which is why every team I have worked with or managed had one or two
specialists who look after the build system and other supporting tools
(SCM for instance).

Yes, my company has those too. Unfortunately the build specialists are
only that in name; they have no actual power, and a lot of them have
no actual knowledge of builds. The developers control the build, and
the "build specialists" just manage the automated build machines.
No matter how deep a project's directories go, you can still use a
single flat makefile.  I use a single makefile for all my in house code.
  There is no better way of supporting distributed building.  On larger
projects I have run, we used a single makefile for each layer of the
application (typically no more than 4).

All of your arguments seem to address only one of my complaints, that
a turing complete build script language is hopelessly not
incrementally correct. If the makefile could be made simple enough so
all developers could understand it, and so it's exceedingly hard to
make a bad edit, I could live with that. As an educated guess
on a matter of fact, I do not think this is possible with split
makefiles ala GNU Make include, nor with a single large makefile.

There are still the problems that the idiomatic make usage will not
result in an incrementally correct build system (for the reasons
mentioned else-thread), and it basically will never have a good
incremental build for some other compilation models, like Java.
Ah, so it is!

Your tools have to support your process and your process has to suit
your tools.  You can't fix a process problem by using "better" tools.

Somewhat of a red herring and a straw man in that it doesn't have much
to do with Make's shortcomings. Also, I disagree. One can fix a
process, or at least alleviate the problems, by using better tools.
Also, the better tools I suggest would also help in the situation of a
componentized build with well defined, stable inter-component
interfaces. The build time may be shorter, but you would still be
forced to do a clean build to have some known measure of success. An
incrementally correct build system would be useful in all cases.
 

Ian Collins

All of your arguments seem to address only one of my complaints, that
a turing complete build script language is hopelessly not
incrementally correct. If the makefile could be made simple enough so
all developers could understand it, and so it's exceedingly hard to
make a bad edit, I could live with that. As an educated guess
on a matter of fact, I do not think this is possible with split
makefiles ala GNU Make include, nor with a single large makefile.

I loathe nested makefiles!

But with our single makefiles, all the developer edits were to add or
remove source files.
There are still the problems that the idiomatic make usage will not
result in an incrementally correct build system (for the reasons
mentioned else-thread), and it basically will never have a good
incremental build for some other compilation models, like Java.

Maybe. If so, that is because make grew up with C and related
languages. Don't most Java projects use Java based build tools like Ant?
Somewhat of a red herring and a straw man in that it doesn't have much
to do with Make's shortcomings. Also, I disagree. One can fix a
process, or at least alleviate the problems, by using better tools.

One can to an extent, but that is analogous to prescribing aspirin
instead of finding the cause of the headache.
Also, the better tools I suggest would also help in the situation of a
componentized build with well defined, stable inter-component
interfaces. The build time may be shorter, but you would still be
forced to do a clean build to have some known measure of success. An
incrementally correct build system would be useful in all cases.

Oh I agree. I don't claim that make like tools are perfect for C++
projects. I have made several attempts over the years (decades!) to
build better makes, but time constraints always got the better of me.

One of my large projects did have some obscure dependencies between
generated files that did require a clean build if the moon was in the
wrong phase when they were edited. We were willing to live with that
given the overall simplicity and speed of the distributed make based
build system.

Life and software development is built around compromise!
 

Joshua Maurice

I loathe nested makefiles!

But with our single makefiles, all the developer edits were to add or
remove source files.

Interesting. I think you're saying you had all of your files in a
single directory. At least, I think that's what you're saying.
Otherwise, wouldn't the developer have to modify the giant makefile to
add new source directories? I guess you could have done it so it
automatically scans some root dir for all possible source dirs.
However, even then, you'd have to specify the link dependencies
somewhere. I don't see how you get around developers "commonly"
modifying the giant makefile.

I'm working on a project with C++, Java, code generation from a simple
language to Java and C++ to facilitate serialization, an eclipse
plugin build thingy, and various other build steps. Depending on your
answer, I'm not sure if that would work for me.
Maybe.  If so, that is because make grew up with C and related
languages.  Don't most Java projects use Java based build tools like Ant?

Idiomatic GNU Make is much much closer to incremental correctness than
Ant with Java. Ant with Java, even with its depends task, is
hopelessly bad. At least GNU Make covers 95% or more of cases - it
covers cpp file edits, but doesn't cover arbitrary file creation,
deletion, and build script modification. Ant with Java doesn't even
handle basic Java file edits.
One can to an extent, but that is analogous to prescribing aspirin
instead of finding the cause of the headache.

But either way you'll have a headache. We'll still have a build, and
it will still take more time than 0, so an incrementally correct build
will always be useful.

Also, proof by analogy is fraud - Bjarne Stroustrup. (Not that I'm
claiming you're trying to prove by analogy. At least I hope you're
not.)
Oh I agree.  I don't claim that make like tools are perfect for C++
projects.  I have made several attempts over the years (decades!) to
build better makes, but time constraints always got the better of me.

One of my large projects did have some obscure dependencies between
generated files that did require a clean build if the moon was in the
wrong phase when they were edited.  We were willing to live with that
given the overall simplicity and speed of the distributed make based
build system.

Life and software development is built around compromise!

I disagree. Some parts are open to compromise, but other parts really
aren't. Take a compiler, for example. There is little to no room in a
compiler for compromising correctness. (This is very true of most
software. If it doesn't give the right behavior, then in most domains
it's effectively worthless. Correctness in software is usually a hard
requirement.) If a compiler randomly spat out bad code, especially for
the common case, you know people would be up in arms, and it would be
fixed post haste.

A build system is very much like a compiler, but for some reason we
tolerate a lower standard, perhaps because we can always do a full
clean build if we want a guaranteed correct build. As such, it does
become a tradeoff of developer time lost by incorrect builds vs
developer time required to write a correct incremental build system.
I'm just kind of surprised no one in their spare time for fun has done
it yet. Writing such a system only needs to be done once.
 

Ian Collins

Interesting. I think you're saying you had all of your files in a
single directory. At least, I think that's what you're saying.
Otherwise, wouldn't the developer have to modify the giant makefile to
add new source directories? I guess you could have done it so it
automatically scans some root dir for all possible source dirs.
However, even then, you'd have to specify the link dependencies
somewhere. I don't see how you get around developers "commonly"
modifying the giant makefile.

No, the files were anywhere but there was only one makefile for each
layer or module (where a module comprised a number of libraries). New
files were added as required.
I'm working on a project with C++, Java, code generation from a simple
language to Java and C++ to facilitate serialization, an eclipse
plugin build thingy, and various other build steps. Depending on your
answer, I'm not sure if that would work for me.

Does the build thingy manage makefiles for you?
But either way you'll have a headache. We'll still have a build, and
it will still take more time than 0, so an incrementally correct build
will always be useful.

Also, proof by analogy is fraud - Bjarne Stroustrup. (Not that I'm
claiming you're trying to prove by analogy. At least I hope you're
not.)

No, I was just saying tools aren't the way to fix a broken process. One
of the most common questions I hear from companies in a mess with their
process is what tool should we use to fix it?
I disagree. Some parts are open to compromise, but other parts really
aren't. Take a compiler, for example. There is little to no room in a
compiler for compromising correctness. (This is very true of most
software. If it doesn't give the right behavior, then in most domains
it's effectively worthless. Correctness in software is usually a hard
requirement.) If a compiler randomly spat out bad code, especially for
the common case, you know people would be up in arms, and it would be
fixed post haste.

True, but build systems are like compilers in another way: they have
bugs. If you write some unusual code that upsets the compiler, you can
file a bug and get it fixed. Or you can change the code. If you do
something unusual which breaks your build, you can try and fix the tool.
Or you can solve the problem a different way. Compromise!
A build system is very much like a compiler, but for some reason we
tolerate a lower standard, perhaps because we can always do a full
clean build if we want a guaranteed correct build. As such, it does
become a tradeoff of developer time lost by incorrect builds vs
developer time required to write a correct incremental build system.

It is and that's the point. We knew we had an issue, probably due to
nested dependencies, that mucked up our build. So we added a process
rule "if you change this file, do a clean build and send out a notice to
the rest of the team". If it had been a file we changed often, we would
have fixed the problem, but as it wasn't, we compromised.

Most projects I've worked on have had multiple targets, so there were
always continuous clean builds running in the background to ensure a
change to one target didn't break another. Where "another" could
include the build process itself!
I'm just kind of surprised no one in their spare time for fun has done
it yet. Writing such a system only needs to be done once.

Oh I've tried, but there simply isn't enough spare time!
 

Keith H Duggar

A fool with a tool is still a fool.

Never heard that before, great quote!

I was actually going to explain in some detail the scripting
solutions I've created to eliminate ALL of the make problems
Joshua complains about. But the more I ruminated on this:

Joshua said:
First, as a matter of practicality, very few people in my company, and
I would imagine the industry at large, are anywhere near as knowledgeable
as I am about build systems. I have an architect in my company who has
sworn up and down that the Recursive Make Considered Harmful solution
wasn't known or implementable 6 years ago, but it was. As most of the
users of make do not even understand the basics of make, they will
break it.

The more I came to think that said scripts are a significant
competitive advantage! After all, if one of the /industry's/
foremost leading build system masters, a prime among firsts,
does not know how to script these problems away then Wow Wow
Wubbsy I've stumbled on scripts made of pure gold!

Though half of me thinks in contradistinction that many have
implemented the same or very similar solutions and that the
problem here is a need for less whinging and more winning.

KHD
 

Joshua Maurice

The more I came to think that said scripts are a significant
competitive advantage! After all, if one of the /industry's/
foremost leading build system masters, a prime among firsts,
does not know how to script these problems away then Wow Wow
Wubbsy I've stumbled on scripts made of pure gold!

I hesitate to go that far. I doubt I am. I can't quite tell if you're
being sarcastic or not. Uhh, thank you if you're serious, but it's too
much. If you are being sarcastic, then you're putting words into my
mouth.

I just seem to be the only one raising a ruckus about the lack of
incrementally correct builds at the moment. GNU Make and other tools
promised me incrementally correct builds, and I want incrementally
correct builds. This is not too much to ask for. As I did mention else-
thread, I'm sure I'm not the first to come up with these ideas or
notice these problems. I did link to that paper for Java, Ghost
Dependencies, which lays out nearly all of my grievances. The Turing
complete scripting grievance is of my own original creation, though
again I doubt I am the first to say so.
 

Joshua Maurice

No, the files were anywhere but there was only one makefile for each
layer or module (where a module comprised a number of libraries). New
files were added as required.


Does the build thingy manage makefiles for you?






No, I was just saying tools aren't the way to fix a broken process.  One
of the most common questions I hear from companies in a mess with their
process is what tool should we use to fix it?





True, but build systems are like compilers in another way: they have
bugs.  If you write some unusual code that upsets the compiler, you can
file a bug and get it fixed.  Or you can change the code.  If you do
something unusual which breaks your build, you can try and fix the tool.
  Or you can solve the problem a different way.  Compromise!

Compilers can have bugs, yes, but if a bug is discovered, especially if
it affects common usage, it will be fixed, or the compiler will be dropped.
There is no compromise. I claim that simple file creation, file
deletion, and adding or removing command-line preprocessor defines, are
the common case, and if an incremental build system doesn't handle
every such case all the time, it is woefully more broken than any
commonly used compiler.

However, as I said before, yes, you can compromise with an incremental
build system, because if it breaks, it only wastes some developer
time, and once he figures it out, he can do a full clean build to work
around the issue. With a broken compiler, your options are much
more limited, depending on the exact bug.
It is and that's the point.  We knew we had an issue, probably due to
nested dependencies, that mucked up our build.  So we added a process
rule "if you change this file, do a clean build and send out a notice to
the rest of the team".  If it had been a file we changed often, we would
have fixed the problem, but as it wasn't, we compromised.

An email notice? Really? I can understand that a single company has
little motivation to fix this problem, but I am flabbergasted that the
unix community and the other open source communities for C, C++, Java,
and other languages, continue to put up with this. I can only conclude
the problem is smaller for them because they have a smaller code base
size. The problem still exists, but it's not as acute as in my
company. It's when you scale up to a project of this size, ~25,000
source files and growing, with 100+ developers concurrently working on
it, that you start to see these problems really hurt. Sending out an
email notification to do a full clean build like that would result in
several such emails sent to me a day, at least.

That still also doesn't address my concern that as I'm at the bottom
of the whole hierarchy, aka most of the other guys depend on me, I
have to do a full clean build before checking in because I don't know
what will or what will not break stuff downstream. Perhaps I might
feel a little safer if it was just Recursive Make Considered Harmful
with purely C++ source code to executables, but the build has a wide
variety of stuff, like Java, code generation to Java and C++, etc.,
and it's not incrementally correct at all. This gets back to my
original point that having to maintain a build script in a Turing
complete language is like writing your own build system every time.
Changes almost certainly are untested before being deployed. That really
makes me nervous, and it should make you nervous as well. At least, as Ian
Collins mentions in a quoted section at the bottom of this post, he
attempts some sort of testing of the build system, but I suspect the
tests are quite minimal and could easily let through bugs. In his own
words, "compromise".

As much as it pains me to quote Peter Olcott, I must agree with him on
a limited basis in this case.

http://groups.google.com/group/comp.lang.c++.moderated/msg/5f02c79589a3a465

Conventional wisdom says to get the code working and then profile it and
then make those parts that are too slow faster.

What if the original goal is to make the code in the ball park of as
fast as possible?

For example how fast is fast enough for a compiler? In the case of the
time it takes for a compiler to compile, any reduction in time is a
reduction of wasted time.

A compiler, or any other developer tool, is used so much that any time
savings will almost always have a return on investment if you use it
for long enough. When amortized over the whole language or programming
community, the investment pays off even sooner. There is no "fast
enough" for a compiler, and there is no "fast enough" for a build
system. Yes, there is a "fast enough" to get it in use or to sell it -
it just has to be comparable to the competition on speed, but if
someone comes along with an equal or better one which is also faster,
we all know which we would use. That's not true of all software today.
Some software today does have a fast enough, but compilers and builds
are nowhere near close to "fast enough". "Fast enough" for a compiler
would be compiling the entire Linux kernel and all apps, on an old
computer, in a second. Anything slower and being faster would be a
selling point for a compiler.

Yes, I admit this is a balancing act. It's developer time spent now vs
developer time saved in the future. However, it appears obvious to
me that it's an easy win investment, which is why it's so frustrating
that no one has done it already, and even more frustrating that no one
else is interested \now\.

Then there's the personal aspect. I don't like spending 20% or more of
my time on builds. I like coding.

Most projects I've worked on have had multiple targets, so there were
always continuous clean builds running in the background to ensure a
change to one target didn't break another.  Where "another" could
include the build process itself!

I'm curious. How would you test this? A continuous build in the
background which compared full clean build results to an incremental
build? I always figured I would set this up if I was in control and I
managed to write a correct incremental build system, just as a measure
of sanity checking. However, that's all it would be - sanity checking.
It is nowhere near a set of robust acceptance tests. If I ever had a
failure in this sanity test, I would definitely be adding a new test
to my suite of acceptance tests of my build system.
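
Something like this hypothetical target is what I have in mind (every
path and name is made up, and binary artifacts may need normalizing
before a byte-for-byte compare):

  # Build incrementally in place, build the same revision cleanly in a
  # scratch copy, and diff the produced artifacts.
  sanity-check:
          $(MAKE) all
          rm -rf /tmp/clean-copy
          cp -a . /tmp/clean-copy
          $(MAKE) -C /tmp/clean-copy clean all
          diff -r build /tmp/clean-copy/build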
 

Ian Collins

I'm curious. How would you test this? A continuous build in the
background which compared full clean build results to an incremental
build? I always figured I would set this up if I was in control and I
managed to write a correct incremental build system, just as a measure
of sanity checking. However, that's all it would be - sanity checking.
It is nowhere near a set of robust acceptance tests. If I ever had a
failure in this sanity test, I would definitely be adding a new test
to my suite of acceptance tests of my build system.

The first test was did the build succeed? The second was did the built
executable pass its tests?
 

Joshua Maurice

The first test was did the build succeed?  The second was did the built
executable pass its tests?

As a test of the cpp source code of your product, that's a very good
test - well, at least as good as the tests which are run. However, the
build system is not the product. The build system is a separate
program with different input and output. You confirmed that the build
system produced "acceptable" output for some input. That's like
testing gcc vs a single simple input source program. You just tested
the incremental build system over one (1) input. This is not a robust
testing scheme, yet the changes to the Turing complete program - the
incremental build system - are deployed after this depressingly low
coverage test.

You said it yourself: sometimes there are failures, and in which case
your solution used in X situation was to send out an email to ignore
the broken program - the incremental build system - and just do a full
clean build. This is an example of the incremental build system
failing a basic acceptance test, but the solution isn't to fix the
program. Instead, the solution is to simply not use it and send a mass
email to the company that the program is known broken for this one
input.

I'm sorry to be so pedantic, but I think this is an important
distinction. I reject your insinuation that your tests for the build
system are anything but \incredibly\ poor. The level of correctness
commonly accepted for incremental build systems is by far the worst of
any commercially or professionally used program ever. Failures and
inadequacies in basic usage are not treated as bug reports but instead
as a reason to send out a company-wide email detailing the bug and a
workaround of "don't use the program (incremental build system)". I am
not trying to draw any sort of conclusions from this particular post.
I'm merely pointing out that AFAIK your so called tests of the build
system are the worst set of tests of any commercially or
professionally used program besides other incremental build systems.
 

Ian Collins

As a test of the cpp source code of your product, that's a very good
test - well, at least as good as the tests which are run. However, the
build system is not the product. The build system is a separate
program with different input and output. You confirmed that the build
system produced "acceptable" output for some input. That's like
testing gcc vs a single simple input source program. You just tested
the incremental build system over one (1) input. This is not a robust
testing scheme, yet the changes to the Turing complete program - the
incremental build system - are deployed after this depressingly low
coverage test.

No we did not. I clearly stated above that we ran "continuous clean
builds". The unit test for the product were comprehensive. If they
passed, we could ship it to beta customers. The final executable was
the product, if the build system failed to build it, it would not have
passed its tests.
You said it yourself: sometimes there are failures, and in which case
your solution used in X situation was to send out an email to ignore
the broken program - the incremental build system - and just do a full
clean build. This is an example of the incremental build system
failing a basic acceptance test, but the solution isn't to fix the
program. Instead, the solution is to simply not use it and send a mass
email to the company that the program is known broken for this one
input.

Which as I clearly said was a compromise. Yes we could and maybe should
have fixed it, but the problem was well known and infrequent. The
result of the failure was either a failed (incremental) build, or
failure of the tests added for the change.
I'm sorry to be so pedantic, but I think this is an important
distinction. I reject your insinuation that your tests for the build
system are anything but \incredibly\ poor.

Nonsense. If a pair added a new test or updated existing tests and they
failed when they should have passed, the problem was spotted. There was
no way for a "broken build" to pass through unnoticed. The symptom of
the problem was failure to regenerate a header. This invariably caused
the build to fail (if something had been added), or tests to fail (if
something was changed).
The level of correctness
commonly accepted for incremental build systems is by far the worst of
any commercially or professionally used program ever. Failures and
inadequacies in basic usage are not treated as bug reports but instead
as a reason to send out a company-wide email detailing the bug and a
workaround of "don't use the program (incremental build system)".

You are twisting my words out of context. I will say this again in the
hope it gets through: yes we could and maybe should have fixed it, but
the problem was well known and infrequent. I have reported bugs in the
distributed make tool I use and had them fixed.
I am
not trying to draw any sort of conclusions from this particular post.
I'm merely pointing out that AFAIK your so called tests of the build
system are the worst set of tests of any commercially or
professionally used program besides other incremental build systems.

You have no idea what our tests were. If any tool in the chain had
misfired, the tests would fail. We weren't testing the build process,
we were testing the product it produced.
 

Joshua Maurice

No we did not. I clearly stated above that we ran "continuous clean
builds". The unit test for the product were comprehensive. If they
passed, we could ship it to beta customers. The final executable was
the product, if the build system failed to build it, it would not have
passed its tests.


Which as I clearly said was a compromise. Yes we could and maybe should
have fixed it, but the problem was well known and infrequent. The
result of the failure was either a failed (incremental) build, or
failure of the tests added for the change.


Nonsense. If a pair added a new test or updated existing tests and they
failed when they should have passed, the problem was spotted. There was
no way for a "broken build" to pass through unnoticed. The symptom of
the problem was failure to regenerate a header. This invariably caused
the build to fail (if something had been added), or tests to fail (if
something was changed).


You are twisting my words out of context. I will say this again in the
hope it gets through: yes we could and maybe should have fixed it, but
the problem was well known and infrequent. I have reported bugs in the
distributed make tool I use and had them fixed.


You have no idea what our tests were. If any tool in the chain had
misfired, the tests would fail. We weren't testing the build process,
we were testing the product it produced.

Let me be very explicit now. There are two programs, the sellable
product, and the build system. Your sellable product presumably has a
very extensive test suite.

The incremental build system doesn't have quite the same extensive
test suite. It can be tested by comparing the output executables of an
incremental build to the executables of a full clean build. It can
also be indirectly tested by testing the output executables with the
sellable product test suite.

Your overall process is presumably something like:

1- Let developers check out whatever source code they want from source
control. Tell them to highly prefer the last promoted changelist /
revision for their work.

2- Every developer checkin, both to product source code and the
incremental build system source code - aka the giant makefile - will
trigger a set of builds on the official automated build machine. After a
successful build, the source code changelist will be promoted, aka
declared good. (Now, you can promote a build after a successful
incremental build with tests, or you can choose to only promote a
build after a successful full clean build with tests. This is
tangential to my main argument that the testing of the incremental
build system is pisspoor.)

Under this system, a developer can make a change to the incremental
build system, aka the giant makefile. After a single build, full clean
or incremental depending on the exact flavor, the change to the
incremental build system, aka the giant makefile, will be deployed to
users, aka developers. However, this single build represents only a
small fraction of the possible input to the incremental build system.
The "input" to an incremental build system is:
- the source code of the incremental build system, aka the giant
makefile, at the time of the previous complete build.
- the source code of the incremental build system, aka the giant
makefile, now.
- the source code of the sellable product at the time of the previous
complete build.
- the source code of the sellable product now.

Incremental correctness is the property of an incremental build system
which will always produce output equivalent to a full clean build on
the source code of now. However, the incremental build takes
additional input, the source code of the previous complete build and
the build system of the previous complete build.

To give an example, you might test a change to the giant makefile with
a full clean build. However, a developer might have missed a
dependency somewhere so that under some set of edits, the incremental
build will not be correct.

That is, you did not test a comprehensive set of possible source code
deltas before deploying a change to the build system. In fact, you did
only one (1) test. Developers will be consuming this change, throwing
many different source code deltas at the incremental build system, all
effectively untested.
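
To make "testing a source code delta" concrete, a test of the build
system itself would have roughly this shape (the directory layout and
target name are hypothetical):

  # Each case has a "before" tree and an "after" tree.  Build "before",
  # morph the tree into "after", run an incremental build, and compare
  # it against a clean build of "after".
  test-delta-%:
          rm -rf scratch && cp -a cases/$*/before scratch
          $(MAKE) -C scratch all
          rsync -a --delete --exclude=build/ cases/$*/after/ scratch/
          $(MAKE) -C scratch all
          rm -rf reference && cp -a cases/$*/after reference
          $(MAKE) -C reference all
          diff -r scratch/build reference/build

A real suite would need many such before/after pairs (file adds, file
deletions, flag changes, makefile edits), which is exactly the coverage
a single continuous clean build never gives you.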

Let me apply this reasoning to specific points now:

(Reproducing most of his post. Sorry, I can't think of a better way
offhand to structure this.)

No we did not. I clearly stated above that we ran "continuous clean
builds". The unit test for the product were comprehensive. If they
passed, we could ship it to beta customers. The final executable was
the product, if the build system failed to build it, it would not have
passed its tests.

"Continuous clean builds" are irrelevant. A full clean build is not a
test of an incremental build system at all. "Continuous clean builds",
and comparing their output to an incremental build, is, at best, one
very small test of the very large input space of the incremental build
system.
Nonsense. If a pair added a new test or updated existing tests and they
failed when they should have passed, the problem was spotted. There was
no way for a "broken build" to pass through unnoticed. The symptom of
the problem was failure to regenerate a header. This invariably caused
the build to fail (if something had been added), or tests to fail (if
something was changed).

There is a difference between "broken build" and "broken incremental
build system". I agree that it sounds like it would be very hard for a
broken build to pass the tests. However, it sounds \very easy\ for a
bug to be introduced in the incremental build \system\.
You are twisting my words out of context. I will say this again in the
hope it gets through: yes we could and maybe should have fixed it, but
the problem was well known and infrequent. I have reported bugs in the
distributed make tool I use and had them fixed.

I don't think I'm twisting them out of context.

Moreover, there is still a fundamental nuance that I'm trying to get
across. If there's a bug in the compiler for named return value
optimization (looking at you MSVC), then it's possible to work around
it. It's not pleasant, and it becomes more unpleasant the more common
it becomes. However, with an incremental build system, you are no
longer in control of the issue. It could be someone else's edit to the
giant makefile, or it could be someone else's addition of a new header
which hides a previously included header. In this case, it requires
you to know everything about the system and its input, all of source
code, or at least keep up on your emails, to work around the bugs.

Suppose you need to go back in time 3 months in source control. Should
you have to look through all of your old emails to use the incremental
build system? The difference is if I go back in time I don't have to
review emails to work around known compiler bugs. To work around the
MSVC named return value optimization bug, I just need to look at a
single C++ function. To work around a header hiding a header, I need
to track all makefile changes and source control file adds. The
difference is that the compiler and the browser are known quantities
with comprehensive test suites, and they change at a slow rate,
whereas the build system is an unknown quantity, always changing,
always introducing new bugs, without even a basic test suite; bugs are
rarely fixed, and bugs are tracked \via email\.
You have no idea what our tests were. If any tool in the chain had
misfired, the tests would fail. We weren't testing the build process,
we were testing the product it produced.

I claimed that your tests of the \build system\ are the worst ever.
You reply with a non sequitur, talking about the tests of the \build\.
It doesn't matter if you tested every single possible input to your
sellable product for a given build. The build is not the build system.
The actual testing of the incremental build system before deployment
to users, aka internal developers, is next to nothing. You did not
test your incremental build system - the giant makefile - for
(incremental) correctness any more than you tested gcc for correctness
or emacs for correctness. Yes, you did test it for "full clean build"
correctness reasonably well, but this is not incremental correctness.
It is quite possible, and highly likely, that there are source code
deltas for which the incremental build system - the giant makefile -
will produce incorrect output, and more bugs will creep in all the
time.

And yes, we can talk about if it's reasonable to compromise here. I
argued it's not, but that's another separate point, a normative
statement. Disputing that does nothing to dispute the main point of
this post, a statement of fact that the tests for promotion of your
incremental build \system\ are among the worst test suites of any
commercial or professional software AFAIK except for other incremental
build systems.
 

Öö Tiib

Let me be very explicit now. There are two programs, the sellable
product, and the build system. Your sellable product presumably has a
very extensive test suite.

Probably it also has several rooms of dedicated human testers. These
presumably test the real product for a month after code freeze, if it does
anything useful and mission-critical for anyone at all. Giving them
something that does not install or run is out of the question; developers
lose face if their build system spat out something like that
without noticing it itself. Talking about the importance of building is
therefore like talking about how there should also be stable
electricity present, otherwise nothing runs. Sure it is so, but it is
relatively cheap to arrange.
The incremental build system doesn't have quite the same extensive
test suite. It can be tested by comparing the output executables of an
incremental build to the executables of a full clean build. It can
also be indirectly tested by testing the output executables with the
sellable product test suite.

Why is it important to have that *incremental* build when you have a
large commercial product? You anyway want each module to carry a
clear mark of which build it belongs to. You make a build farm that
gives you a clean build each time. It is a lot simpler solution. What does
a computer cost? It depends perhaps on the continent, but you can likely buy
several for the cost of a single month of a good worker.

One may want incremental build when he develops something with over
average open-source-project size on sole computer at home. Then he
uses incremental builds and still majority of time goes into building,
not developing nor running tests. That is market for good incremental
build system.
"Continuous clean builds" are irrelevant. A full clean build is not a
test of an incremental build system at all. "Continuous clean builds",
and comparing their output to an incremental build, is, at best, one
very small test of the very large input space of the incremental build
system.

Why? On the contrary. Incremental builds are irrelevant. No one uses
these anyway.
There is a difference between "broken build" and "broken incremental
build system". I agree that it sounds like it would be very hard for a
broken build to pass the tests. However, it sounds \very easy\ for a
bug to be introduced in the incremental build \system\.

Yes. So *do* *not* use incremental build systems and sun is shining
once again.

Test of a clean build is lot simpler. Did it build everything it was
made to build? Yes? Success! No? Fail! That is it. Tested. Simple part
is over. Now can run automatic tests (that presumably takes lot more
time than building) to see if all the modules that were built (all
modules of full product) are good too.

No one notices it because build system does everything automatically.
Checks out changed production branch from repository, runs all sort of
tools and tests and also produces web site about success and the
details and statistics (and changes in such) of various tools ran on
the code and on the freshly built modules. Building is tiny bit of its
job. It can even blame whose exactly changeset in repository did
likely break something. It can use some instant messaging system if
team dislikes e-mails. Of course ... all such features of build system
have to be tested too. If team does not sell the build system, then
testing it is less mission critical. Lets say it did blame an
innocent ... so there is defect and also interested part (wrongly
accused innocent) who wants it to be fixed.
 

Jorgen Grahn

No, not unless the clients /want/ them to be "part of the
source code". Make is a rather general purpose dependency
analysis tool. The dependencies are specified as files; so
if you want one or more makefiles to be in the prereqs of
one or more targets then put them there! The complaint or
"insight" as you call would instead be properly directed
at the dependency generation (whether manual or automatic)
tool not the dependency analysis tool (Make).
[snip]

I think you are responding to JM rather than to me.

My claim -- which I now see was weaker than JM's -- was simply
that within a C++ project, the Makefile (if you use one) is just as
important as the C++ code: it too has to be correct, readable, etc.
Not something "that CM guy" handles and noone else has to think about.

/Jorgen
 

Joshua Maurice

Why is it important to have that *incremental* build when you have a
large commercial product? You anyway want each module to carry a
clear mark of which build it belongs to. You make a build farm that
gives you a clean build each time. It is a lot simpler solution. What does
a computer cost? It depends perhaps on the continent, but you can likely buy
several for the cost of a single month of a good worker.

One may want incremental build when he develops something with over
average open-source-project size on sole computer at home. Then he
uses incremental builds and still majority of time goes into building,
not developing nor running tests. That is market for good incremental
build system.

First, let me note that the post to which you are replying only argued
that he basically did not test the incremental build system. I was
quite explicit in this. However, else-thread, I made arguments in
favor of incremental. Let me add a couple new replies here.

Some companies don't have those build farms. For starters, they're
expensive. I also disagree that build farms are the "simpler"
solution. Maintaining a build farm is a huge PITA. Just maintaining
automated build machines for the dozen or so platforms my company
supports for emergency bug fixes, hot fixes, and daily ML builds takes
the full time employment of 3+ people. The simpler solution is to have
the developer build everything himself. A lot fewer moving parts. A lot
fewer different versions running around. A lot fewer active Cruise
Control instances. Is building it all in one giant build more
sensible? Perhaps not; it depends on the situation, but a build farm
is definitely not simpler than building everything yourself in a giant
build.

Now, correct incremental. It's relatively clear to me that this is
more complex than a full clean build every time, aka not simpler. Is
it simpler or more complex than a build farm? I don't know. I would
think that a correct incremental build system is actually simpler than
a build farm. A build farm really is complex and takes effort to
maintain, but a correct incremental build system would only have to be
written once, or at least each different kind of task would only have
to be written once, amortized over all companies, whereas a build farm
would probably have a lot more per-company setup cost.
Why? On the contrary. Incremental builds are irrelevant. No one uses
these anyway.

The argument made was that he did not test the incremental build
system to any significant degree. He argued that he did test the
incremental build system with "continuous clean builds". He is in
error, and you are arguing a non sequitur.

Also, no one uses incremental builds? Why do we even use Make anymore
then? I suppose that Make has some nifty parallelization features, so
I guess it has some use if we ignore incremental.

I must say this is somewhat surprising to hear in a C++ forum. I
expected most people to still be under the incremental spell.
Incremental really is not that hard to achieve. It's just that no one
has really tried as far as I can tell. (Specifically under my
definition of "incremental correctness" which by definition includes
the associated build scripts as source, though I think even ignoring
build script changes, all common, purported incremental, build systems
are not incrementally correct under idiomatic usage.)
Yes. So *do* *not* use incremental build systems and sun is shining
once again.

Test of a clean build is lot simpler. Did it build everything it was
made to build? Yes? Success! No? Fail! That is it. Tested. Simple part
is over. Now can run automatic tests (that presumably takes lot more
time than building) to see if all the modules that were built (all
modules of full product) are good too.

Again, arguing a non sequitur. I would love to have a discussion of whether
we should have clean builds, but you reply as though I was making the
argument in that quote that we should have incremental build systems.
I was not. I was very clear and explicit in that post that I was
arguing solely that he did not test the incremental build system for
correctness in any significant way.

Again, to change topics to your new point: why should we have correct
incremental builds? Because it's faster, because componentizing might
not make sense, and because componentizing might be more costly to the
company than a correct incremental build system, especially when the
cost of the incremental build system can be amortized over all
companies.

Think about it like this. It's all incremental. Splitting it up into
components is one kind of incremental. It's incremental at the
component level. However, the benefit of this can only go so far.
Eventually there would be too many different components, and we're
right in the situation described in Recursive Make Considered Harmful,
the situation without automated dependency analysis. Yes, we do need
to break it down at the component level at some point. It's not
practical to rebuild all of the Linux kernel whenever I compile a
Hello World! app, but nor is it practical to say that componentization
solves all problems perfectly without the need for other solutions
like a
parallel build, a distributed build, build farms, faster compilers,
pImpl, and/or incremental builds.
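
For reference, the automated dependency analysis that the paper
advocates can be sketched in one non-recursive makefile, roughly like
this (directory and file names are hypothetical):

# Compiler-generated header dependencies in a single makefile.
SRCS := $(wildcard src/*.cpp)
OBJS := $(SRCS:.cpp=.o)
DEPS := $(OBJS:.o=.d)

app: $(OBJS)
	$(CXX) -o $@ $(OBJS)

# -MMD writes src/foo.d listing the headers src/foo.cpp actually
# included; -MP adds dummy rules so a deleted header does not break
# the build.
%.o: %.cpp
	$(CXX) $(CXXFLAGS) -MMD -MP -c -o $@ $<

# Pull in the dependency fragments left by previous builds, if any.
-include $(DEPS)
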
No one notices it because the build system does everything
automatically. It checks out the changed production branch from the
repository, runs all sorts of tools and tests, and also produces a web
site about the success, details and statistics (and changes in such)
of the various tools run on the code and on the freshly built modules.
Building is a tiny bit of its job. It can even blame exactly whose
changeset in the repository likely broke something. It can use some
instant messaging system if the team dislikes e-mails. Of course ...
all such features of the build system have to be tested too. If the
team does not sell the build system, then testing it is less mission
critical. Let's say it did blame an innocent ... so there is a defect,
and also an interested party (the wrongly accused innocent) who wants
it to be fixed.

Yes, a build system does everything "automatically": if you do a full
clean build every time, then it is all handled automatically. Well,
except that it's slow. And if there are a lot of dependency components
which change frequently, and you have to get these extra-project
dependencies manually, then we're back in Recursive Make Considered
Harmful territory. If instead you use some automated tool like Maven
to download dependencies, and you do a full clean build, aka
redownload, of those every time, then it's really, really slow. (I'm
in the situation now at work where we use Maven to handle downloading
a bazillion different kinds of dependencies. As Maven has this nasty
habit of automatically downloading newer "versions" of the same
snapshot version, it's quite easy to get inconsistent versions of
other in-house components. It's quite inconvenient and annoying. I've
managed to deal with it, and work around several bugs, to avoid this
unfortunate default. Did I mention I hate Maven as a build system?)

Also, an automated build machine polling source control for checkins
can only tell you which checkin broke the automated build (and tests)
if your full clean build runs faster than the average checkin
interval. At my company, the build of the ~25,000-source-file project
can take 2-3 hours on some supported systems, and that's without any
tests. The basic test suite adds another 5-6 hours. As a rough guess,
I would imagine we have 100s of checkins a day.

Even if we were to break up the stuff by team, with our level of
testing, I don't think we could meet this ideal of "automated build
machine isolates breaking checkin" without a build farm. Even then, as
all of this code is under active development, arguably a change to my
component should trigger tests of every component downstream, and as
inter-component interfaces change relatively often (but thankfully
slowing down in rate), it might even require recompiles of things
downstream. As occasional recompiles of things downstream are needed,
the only automation solution without incremental is to do full clean
rebuilds of the entire shebang.

Yes, I know the canned answer is "fix your build process". That is
still no reason to use inferior tools (build systems): a correct
incremental build system would help even after "we did the right
thing" and componentized. Simply
put, full clean rebuilds do not scale to the size of my company's
project, and I argue that incremental correctness would be the
cheapest way to solve all of the problems.

It's unsurprising that I get about as much support in the company as I
do here.

However, I do admit that it might be a bad business decision to do it
fully in-house at this point in time. As I emphasized else-thread, it
is only easily worth it when amortized over all companies, or when
done by someone as a GPL project in their spare time for fun. However,
the only
people who really need it are the large companies, and any single one
of them has little incentive to do it themselves. It's most
unfortunate.
 
I

Ian Collins

First, let me note that the post to which you are replying only argued
that he basically did not test the incremental build system. I was
quite explicit in this. However, else-thread, I made arguments in
favor of incremental. Let me add a couple new replies here.

Well, I still maintain that our process did, indirectly, test the
incremental build system. Let me explain. My teams follow a
test-driven development process with continuous integration, so they
are continuously adding code, building, integrating and testing. If a
developer adds a failing test, or the new code to pass it, and the
make (which is always "make test") fails, there is a problem with the
build. These are always incremental builds, run hundreds of times a
day (every time a few lines of code change).
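
For concreteness, the kind of makefile being described might look
roughly like this (target and file names are hypothetical): "make
test" recompiles only what changed, relinks the test binary if
necessary, and runs it.

# Sketch of an incremental "make test".
CXXFLAGS := -O2 -Wall
OBJS     := widget.o widget_test.o

.PHONY: test
test: widget_test
	./widget_test

widget_test: $(OBJS)
	$(CXX) -o $@ $(OBJS)

%.o: %.cpp widget.h
	$(CXX) $(CXXFLAGS) -c -o $@ $<
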
Some companies don't have those build farms. For starters, they're
expensive.

I have one in my garage. It's only a couple of multi-core boxes, but
it's quick and saves me a lot of billable time.
Now, correct incremental. It's relatively clear to me that this is
more complex than a full clean build every time, aka not simpler.

I don't think anyone was advocating a clean build every time, certainly
not me. What goes on in automated background builds is a world apart
from what happens on the developer's desktop.
Is
it simpler or more complex than a build farm? I don't know. I would
think that a correct incremental build system is actually simpler than
a build farm.

The build farm typically serves several masters (unless you have the
luxury of more than one). It will be running distributed incremental
builds for developers and it will be running continuous clean builds,
often for different platforms or with different compile options.

I don't agree with that; 99% of my builds are incremental, often just
the one file I'm editing.
The argument made was that he did not test the incremental build
system to any significant degree. He argued that he did test the
incremental build system with "continuous clean builds". He is in
error, and you are arguing a non sequitur.

I don't think I did, although I probably wasn't clear on that point.
 
J

Jorgen Grahn


Why do you say that? They have a goal to meet, and they have to use
two languages to do that: C++ and Make. You don't think it's
unreasonable to require them to know C++ well enough, so why forgive
them if they write Makefiles without having a clue? (I'm assuming here
of course that they have to use Make, or something equivalent.)

Yes, I guess not /everyone/ in the project has to know /all/ about the
build system. But I also think it's dangerous to delegate build and
SCM to someone, especially someone who's not dedicated 100% to the
project, and who doesn't know the code itself.

You get problems like people not arranging their code to support
incremental builds. For example, if all object files have a dependency
on version.h or a config.h which is always touched, incremental builds
don't exist. (Delegating SCM is even worse IMO, but I won't go into
that.)
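
One common workaround, as a sketch with hypothetical names: regenerate
version.h into a temporary file and only replace the real header when
its contents actually changed, so its timestamp (and everything that
depends on it) is left alone otherwise.

# Regenerate version.h on every run, but only touch the real file when
# the contents differ, so dependents are not rebuilt for nothing.
# VERSION is assumed to be set elsewhere (e.g. on the command line).
version.h: FORCE
	echo '#define BUILD_VERSION "$(VERSION)"' > version.h.tmp
	if cmp -s version.h.tmp version.h; then \
	    rm -f version.h.tmp; \
	else \
	    mv version.h.tmp version.h; \
	fi

.PHONY: FORCE
FORCE:
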
Yes, my company has those too. Unfortunately the build specialists are
only that in name; they have no actual power, and a lot of them have
no actual knowledge of builds. The developers control the build, and
the "build specialists" just manage the automated build machines.

That is my experience, too. What happens then (if you're lucky) is
that you get unofficial experts within the team -- at least mildly
interested in the topic, but with no time allocated for such work, and
no official status.

/Jorgen
 
