Joshua Maurice
I'm back. I've been learning a lot over the last couple of months,
toying around with solutions, and realizing the inadequacies of each
approach I've tried.
From a high level perspective, I realized what I want from a build
system: A developer should be able to do any combination of the
following actions and trigger a small / minimal / incremental build,
and it should be correct, with no corner cases.
1- Add, remove, and/or edit a source file, such as a Java file, cpp
file, etc.
2- Add, remove, and/or edit a build system script file to invoke a
standard rule, such as adding a new jar to the build, modifying a
classpath of an existing jar, removing a jar from the build, etc.
By contrast, some actions would require a full clean build:
1- If the logic which tracks incremental dependencies is changed, then
the developer must do a full clean.
2- If any other tool changes, such as the Javac version, then the
developer must do a full clean.
These are akin to modifying the build system itself, which I understand
has very hard "backwards compatibility" issues. That's outside the
scope of what I want. However, the aforementioned activities of
messing with source files, and invoking build system macros / rules to
create new standard binaries should "just work", and it should
"just work" quickly.
So, I've been trying to do this for Java. Man, this is actually quite
hard, and harder the more I learn. I think I finally "struck gold" when I
found this wonderful paper here:
http://www.jot.fm/issues/issue_2004_12/article4.pdf
For build systems focusing on Java, I rank it as just as important as
Recursive Make Considered Harmful.
However, the paper assumes that the build will "cascade", or recompile
everything downstream. I am trying very hard to avoid this if
possible, to get a much smaller rebuild without writing my own Java
compiler à la Eclipse. I think the solution I currently have in mind
will work, using a combination of:
1- Ghost Dependencies
2- Each class file depends on the last not-no-op build of all
previously used class files from the last build.
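To make the two ideas above concrete, here is a minimal sketch under my own reading of them, not a definitive implementation. I take a "ghost dependency" to be a path that does not exist today but would shadow the class actually found if someone created it, and a "no-op build" to be a recompile that produces byte-identical output. The names `DependencySketch`, `Lookup`, `resolve`, and `lastChangedBuild` are hypothetical, and the lookup is simplified to an ordered list of classpath directories (no jars, no source path, no import resolution).

```java
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch of ghost dependencies and no-op build stamps. */
final class DependencySketch {

    /** Outcome of looking a class up on an ordered list of classpath directories. */
    static final class Lookup {
        final Path found;         // where the class file was found, or null if nowhere
        final List<Path> ghosts;  // absent paths checked before the hit (or all paths if no hit)

        Lookup(Path found, List<Path> ghosts) {
            this.found = found;
            this.ghosts = Collections.unmodifiableList(ghosts);
        }
    }

    /**
     * Walks the roots in order. Every candidate path that does not exist before
     * the first hit is recorded as a ghost dependency: if that file appears later,
     * the lookup result changes and dependents must be rebuilt.
     */
    static Lookup resolve(List<Path> classpathRoots, String binaryName, Predicate<Path> exists) {
        String relative = binaryName.replace('.', '/') + ".class";
        List<Path> ghosts = new ArrayList<>();
        for (Path root : classpathRoots) {
            Path candidate = root.resolve(relative);
            if (exists.test(candidate)) {
                return new Lookup(candidate, ghosts);
            }
            ghosts.add(candidate);
        }
        return new Lookup(null, ghosts);
    }

    /**
     * Returns the build number to record as "last changed" for a class file.
     * If the recompile produced identical bytes, the old stamp is kept, so
     * dependents that compare against this stamp are not rebuilt.
     */
    static long lastChangedBuild(byte[] previousBytes, byte[] newBytes,
                                 long previousStamp, long currentBuild) {
        boolean noOp = previousBytes != null && Arrays.equals(previousBytes, newBytes);
        return noOp ? previousStamp : currentBuild;
    }
}
```

On a real file system the `exists` argument could be `Files::isRegularFile`; taking it as a parameter just keeps the sketch easy to exercise without touching disk. A dependent would then count as out of date when any ghost path has come into existence, or when any class file it used has a "last changed" stamp newer than the dependent's own last compile.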
I finally finished an implementation of part 1. Part 2 is much easier
if I rely on Javac's verbose output, but to do that I need to do one
compile up front, passing all the out of date java files, and then a
separate compile per java file to get useful information from Javac's
output, namely exactly which java file caused each class to be loaded.
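For illustration, here is a rough sketch of pulling the loaded entries out of captured `javac -verbose` output from a single invocation. It assumes the loads show up as lines of the form `[loading ...]`; the text inside the brackets differs between Javac versions, so the sketch only strips the prefix and the final bracket and leaves the rest for the caller to interpret. The class and method names are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: collects the "[loading ...]" entries from javac -verbose output. */
final class VerboseLoadParser {
    // Assumed line prefix; the payload after it varies by Javac version.
    private static final String PREFIX = "[loading ";

    /**
     * Returns the distinct payloads of all "[loading ...]" lines, in first-seen order.
     * Other verbose lines (parsing, writing, timing) are ignored.
     */
    static Set<String> loadedEntries(List<String> verboseLines) {
        Set<String> loaded = new LinkedHashSet<>();
        for (String line : verboseLines) {
            String trimmed = line.trim();
            if (trimmed.startsWith(PREFIX) && trimmed.endsWith("]")) {
                // Drop the prefix and only the outermost closing bracket.
                loaded.add(trimmed.substring(PREFIX.length(), trimmed.length() - 1));
            }
        }
        return loaded;
    }
}
```

Run against the output of one compile per java file, this gives a per-file set of loaded entries. Run against a single compile of many files, it only gives the union, which is the limitation described above.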
So, I post here because I feel better prepared to discuss this
subject. I still disagree that "build from clean" is the correct
answer. That would make our product's build still around ~25 minutes
for just the Java compilation of around ~20,000 source files (and
growing). There must / should be something better. Separate
translation units make so much sense. I just wish Java had them.
And yes, we're also working on "componentizing" to some degree, but
when all of the components are under active development, I would still
very much like builds to be as fast as possible to do regular
integration tests on an automated build machine.
So, does anyone know a quick and easy way to get the list of class
files loaded during a compile, and to know exactly which subset of the
java files in the compile needed each class file? Invoking a separate
Javac after the fact (using the tools.jar analyze API) almost
quadruples my overall from-clean build time for a subset of real code
in my company's codebase.