How to set up a fast correct java build?

Discussion in 'Java' started by Joshua Maurice, Jan 8, 2010.

  1. I'm sorry if this is answered in a FAQ somewhere, but the first
    comp.lang.java.programmer FAQ I found referenced Java version 1.1, so
    I stopped reading there.

    I'm working on a project now with over 20,000 java source files, in
    addition to more than 4,000 C++ source files, some forms of custom
    code generation, an eclipse build, and probably other things I don't
    know offhand.

    Due to various requirements, we cannot put all of the source files
    into a single jar. Many jars are required. (Different jars for
    plugins, for client API, server impl. Then multiply by several
    different products, and we arrive at over 200 required jars.)

    How do you build your java code?

    I'm looking for a fast, correct build. Let me be very specific with
    the term "correct". A build is correct iff its result is equivalent to
    what you would get after completely cleaning the source tree of all
    previously built files and building from scratch.

    To be fast with such a large number of files, you basically need to
    have an incremental parallel build, possibly distributed (though this
    is harder with Java than say C++, for example).

    Let's ignore parallel (and distributed) for the moment. If I can find
    a way to incrementally build java, then I could probably write the
    parallel part myself with make. However, from my limited googling,
    there is no such thing as a correct incremental java compile, except
    maybe Jikes.

    javac doesn't cut it on its own. If it had the "-M" option of gcc or
    any other sane compiler, I could pretty easily hack it together
    myself. I need the compiler's help to do it; the class file does not
    contain sufficient information. Anything based on reading class files
    for dependency information is fatally flawed as it might miss critical
    dependencies. My QA department does not want to take an incremental
    build if there's a 5-10% chance of it being incorrect. They'd rather
    wait for a clean build, and rightfully so. They don't want to waste
    days of work just to learn it's a build issue.

    javamake aka JMake appears to be barely supported, and it seems from
    what little I can gather that it uses information from classfiles, so
    it is also incorrect and insufficient. Its GNU copyleft license might
    also make my company's lawyers wince.

    Ant doesn't cut it either. Its depend task does even less than
    javamake. It doesn't check any dependencies except the classfile
    timestamp on its java source file. I doubt it even checks classfile
    and java timestamps for non-public top level classes.

    As a Hail Mary, I could look up the Java grammar, or some open source
    Java parser, and parse the Java files and get the information myself.
    I just need to be able to get the list of package qualified classes
    used by a Java source file.

    However, perhaps Jikes could do what I need. Does anyone have any
    notable experience with it? Will it give me a list of package
    qualified classes used by each Java source file? Unfortunately, its
    documentation appears non-existent. Could anyone point me
    to it?

    I was thinking that perhaps there's some way to invoke Eclipse from
    the command line. Is there a way? And is Eclipse just "smart enough"
    to correctly incrementally compile Java code when invoked from the
    command line?

    How do you compile your Java? 20,000 source files is not something to
    laugh at and just "clean each time" nor "cross your fingers and hope
    that Ant depends or javamake catches all of the dependencies" as each
    missed dependency could result in lost developer man weeks. Moreover,
    the problem is exacerbated with the other non-Java components of our
    build. For example, some java source is generated by in-house custom
    tools (in order to get serialization between C++ and Java), and
    incremental becomes even more important as it's not just 20,000 java
    source files anymore. It's even more code, taking even more time, greatly
    increasing the need for a correct fast (and thus incremental and
    parallel) build.

    Frankly, the current state of affairs in the Java community is not
    acceptable, and even laughable, given that solutions to these problems
    (fast build, correct build) are known and have been known for many,
    many years in the context of C and C++. (That javac cannot or will not
    output dependency information ala gcc -M is amazing.) Having said
    that, I would love to be proven quite wrong. I thank you in advance
    for any advice or insight you are willing to give.
     
    Joshua Maurice, Jan 8, 2010
    #1

  2. On Jan 8, 3:08 am, Joshua Maurice <> wrote:
    > However, perhaps Jikes could do what I need. Does anyone have any
    > notable experience with it? Will it give me a list of package
    > qualified classes used by each Java source file? Unfortunately, its
    > documentation appears non-existent. Could potentially anyone point me
    > to it perhaps?


    Never mind. It appears from Wikipedia that Jikes is also no longer
    updated.

    I just found gcj from the GNU Compiler Collection as well. Sadly, I
    don't see an option in the manual for outputting class file
    dependencies like gcc -M. It's also not working on my current Linux
    install, so I can't really test further.

    It appears the "sanest" approach is to obtain or write my own Java
    parser. I don't need to do any syntax or semantic checking at all. I
    just need an exhaustive list of all used package qualified class
    names. Still sounds like a large project though. /sigh
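
    One way to avoid writing a parser from scratch is to reuse the one inside
    javac. Below is a minimal sketch, assuming a JDK (not just a JRE) is
    available at run time, that uses the JDK's Compiler Tree API to list the
    package qualified types a source file refers to. The class name UsedTypes,
    the helper StringSource and the method usedTypes are made up for this
    illustration; the source is passed in as a string only to keep the sketch
    self-contained, and types from other jars would additionally need a
    classpath option.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.IdentifierTree;
import com.sun.source.tree.MemberSelectTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreePathScanner;
import com.sun.source.util.Trees;

import javax.lang.model.element.Element;
import javax.lang.model.element.TypeElement;
import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class UsedTypes {
    /** Hypothetical helper: an in-memory source file, so no files on disk are needed. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className.replace('.', '/') + ".java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Returns the qualified names of all types that the given source refers to. */
    static Set<String> usedTypes(String className, String code) {
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler(); // null when no compiler is present
        JavacTask task = (JavacTask) javac.getTask(null, null, null,
                List.of("-proc:none"), null, List.of(new StringSource(className, code)));
        Set<String> names = new TreeSet<>();
        try {
            Iterable<? extends CompilationUnitTree> units = task.parse();
            task.analyze(); // resolve names so identifiers can be mapped to elements
            Trees trees = Trees.instance(task);
            TreePathScanner<Void, Void> scanner = new TreePathScanner<Void, Void>() {
                private void record() {
                    Element e = trees.getElement(getCurrentPath());
                    if (e instanceof TypeElement) {
                        names.add(((TypeElement) e).getQualifiedName().toString());
                    }
                }

                @Override
                public Void visitIdentifier(IdentifierTree node, Void p) {
                    record();
                    return super.visitIdentifier(node, p);
                }

                @Override
                public Void visitMemberSelect(MemberSelectTree node, Void p) {
                    record();
                    return super.visitMemberSelect(node, p);
                }
            };
            for (CompilationUnitTree unit : units) {
                scanner.scan(unit, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }
}
```

    Because the scan works on the attributed source tree rather than on the
    class file, it also sees types that are only used for a local variable
    declaration or as the owner of an inlined constant. It does not follow
    reflection, and it has not been tried on a code base of this size.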
     
    Joshua Maurice, Jan 8, 2010
    #2

  3. On Jan 8, 12:08 pm, Joshua Maurice <> wrote:
    > I'm sorry if this is answered in a FAQ somewhere, but the first
    > comp.lang.java.programmer FAQ I found referenced Java version 1.1, so
    > I stopped reading there.
    >
    > I'm working on a project now with over 20,000 java source files, in
    > addition to more than 4,000 C++ source files, some forms of custom
    > code generation, an eclipse build, and probably other things I don't
    > know offhand.
    >
    > Due to various requirements, we cannot put all of the source files
    > into a single jar. Many jars are requirements. (Different jars for
    > plugins, for client API, server impl. Then multiply by several
    > different products, and we arrive at over required 200 jars.)
    >
    > How do you build your java code?
    >
    > I'm looking for a fast, correct build. Let me be very specific with
    > the term "correct". A build is correct iff a build is equivalent to if
    > you completely cleaned the source file system beforehand of previously
    > build files.
    >
    > To be fast with such a large number of files, you basically need to
    > have an incremental parallel build, possibly distributed (though this
    > is harder with Java than say C++, for example).
    >
    > Let's ignore parallel (and distributed) for the moment. If I can find
    > a way to incrementally build java, then I could probably write the
    > parallel part myself with make. However, from my limited googling,
    > there is no such thing as a correct incremental java compile, except
    > maybe Jikes.
    >
    > javac doesn't cut it on its own. If it had the "-M" option of gcc or
    > any other sane compiler, I could pretty easily hack it together
    > myself. I need the compiler's help to do it; the class file does not
    > contain sufficient information. Anything based on reading class files
    > for dependency information is fatally flawed as it might miss critical
    > dependencies. My QA department does not want to take an incremental
    > build if there's a 5-10% chance of it being incorrect. They'd rather
    > wait for a clean build, and rightfully so. They don't want to waste
    > days of work just to learn it's a build issue.
    >
    > javamake aka JMake appears to be barely supported, and it seems from
    > what little I can gather that it uses information from classfiles, so
    > it is also incorrect and insufficient. It's GNU copyleft license might
    > also make my company's lawyers wince.
    >
    > Ant doesn't cut it either. Its depend task does even less than
    > javamake. It doesn't check any dependencies except the classfile
    > timestamp on its java source file. I doubt it even checks classfile
    > and java timestamps for non-public top level classes.
    >
    > As a Hail Mary, I could look up the Java grammar, or some open source
    > Java parser, and parse the Java files and get the information myself.
    > I just need to be able to get the list of package qualified classes
    > used by a Java source file.
    >
    > However, perhaps Jikes could do what I need. Does anyone have any
    > notable experience with it? Will it give me a list of package
    > qualified classes used by each Java source file? Unfortunately, its
    > documentation appears non-existent. Could potentially anyone point me
    > to it perhaps?
    >
    > I was thinking that perhaps there's some way to invoke Eclipse from
    > the command line. Is there a way? And is Eclipse just "smart enough"
    > to correctly incrementally compile Java code when invoked from the
    > command line?
    >
    > How do you compile your Java? 20,000 source files is not something to
    > laugh at and just "clean each time" nor "cross your fingers and hope
    > that Ant depends or javamake catches all of the dependencies" as each
    > missed dependency could result in lost developer man weeks. Moreover,
    > the problem is exacerbated with the other non-Java components of our
    > build. For example, some java source is generated by in-house custom
    > tools (in order to get serialization between C++ and Java), and
    > incremental becomes even more important as it's not just 20,000 java
    > source files anymore. It's even more taking even more time, greatly
    > increasing the need for a correct fast (and thus incremental and
    > parallel) build.
    >
    > Frankly, the current state of affairs in the Java community is not
    > acceptable, and even laughable, given that solutions to these problems
    > (fast build, correct build) are known and have been known for many,
    > many years in the context of C and C++. (That javac cannot or will not
    > output dependency information ala gcc -M is amazing.) Having said
    > that, I would love to be proven quite wrong. I thank you in advance
    > for any advice or insight you are willing to give.


    I don't get why you believe that Java class files do not contain
    enough dependency information. At least the class hierarchy and
    classes referred to by fields and methods (signature and code) should
    be found there. Reflection can be problematic, but it would be with a
    Java parser too. However, since I'm not an expert in the matter, I may
    be missing something.

    If you choose the Java parser way, instead, there are probably ways to
    avoid writing your own. You mentioned you use Eclipse: Eclipse is able
    to gather dependency information, at least at the source level (they
    call it "Call hierarchy" iirc). You might be able to hook into that,
    either with Eclipse's plugin facilities or by hacking Eclipse itself,
    which is open source. Or, you might use an existing Java parser:
    OpenJDK and BeanShell (Java interpreter) surely contain one.

    hth,
    Alessio
     
    Alessio Stalla, Jan 8, 2010
    #3
  4. Joshua Maurice <> wrote:
    > It appears the "sanest" approach is to obtain or write my own Java
    > parser. I don't need to do any syntax or semantic checking at all. I
    > just need an exhaustive list of all used package qualified class
    > names. Still sounds like a large project though. /sigh


    This topic appears here every once in a while.

    Java just doesn't lend itself to structured C/C++-style dependencies,
    because of the 1:n relationship of source files to generated .class
    files.

    In C/C++ you have a couple of source-files (ideally one .c and a flock
    of .h's) that determine the contents of a particular object file. And
    you know exactly which object files you need for an executable.

    In Java, however, one source file can result in any number of .class
    files for all the inner, nested, anonymous, synthetic, or further
    (non-public) toplevel classes. (I know, it's a partially redundant list.)

    So the very concept of "which source-files are relevant for this .class"
    is futile, when one doesn't know the set of .class files in advance.

    The only generally *safe* Java build is the complete rebuild.
     
    Andreas Leitgeb, Jan 8, 2010
    #4
  5. Alessio Stalla <> wrote:
    > I don't get why you believe that Java class files do not contain
    > enough dependency information. At least the class hierarchy and
    > classes referred to by fields and methods (signature and code) should
    > be found there. Reflection can be problematic, but it would be with a
    > Java parser too. However, since I'm not an expert in the matter, I may
    > be missing something.


    Indeed: static final fields used from other classes. Only their
    values are used, but their names do not show up in your .class file.
     
    Andreas Leitgeb, Jan 8, 2010
    #5
  6. Andreas Leitgeb <> wrote:
    > Alessio Stalla <> wrote:
    >> I don't get why you believe that Java class files do not contain
    >> enough dependency information. At least the class hierarchy and
    >> classes referred to by fields and methods (signature and code) should
    >> be found there. Reflection can be problematic, but it would be with a
    >> Java parser too. However, since I'm not an expert in the matter, I may
    >> be missing something.

    >
    > Indeed: static final fields used from other classes. Only their
    > values are used, but their names do not show up in your .class file.


    Small correction: static final fields with a compiletime known value.
     
    Andreas Leitgeb, Jan 8, 2010
    #6
  7. On Jan 8, 1:46 pm, Andreas Leitgeb <>
    wrote:
    > Andreas Leitgeb <> wrote:
    > > Alessio Stalla <> wrote:
    > >> I don't get why you believe that Java class files do not contain
    > >> enough dependency information. At least the class hierarchy and
    > >> classes referred to by fields and methods (signature and code) should
    > >> be found there. Reflection can be problematic, but it would be with a
    > >> Java parser too. However, since I'm not an expert in the matter, I may
    > >> be missing something.

    >
    > > Indeed:  static final fields used from other classes.   Only their
    > > values are used, but their names do not show up in your .class file.

    >
    > Small correction:  static final fields with a compiletime known value.


    Ok, thanks, I didn't know that. However, if this is the only problem,
    it's a very infrequent one: how often is a static final field initialized
    with a compile-time constant changed? Probably infrequently enough
    that you can just perform a full clean and build in those rare cases.

    Cheers,
    Alessio
     
    Alessio Stalla, Jan 8, 2010
    #7
  8. Alessio Stalla <> wrote:
    > ... However, if this is the only problem,
    > it's a very infrequent: how often is a static final field initialized
    > with a compile time constant changed?


    I came across it all too often during development - but then again, some
    of those compile-time "constants" would have been better stored in a
    properties file for runtime :)

    > Probably infrequently enough
    > that you can just perform a full clean and build in those rare cases.


    My latest conclusion after such a discussion here was, that one could
    write a tool to extract only the *API* of every (non-private) .class file
    after a compile, and next time, compile only the modified sources, and
    then rerun the API-extraction, and if anything changed: recompile all.

    Some tools might already do that, and I never even started to
    implement such an API capturer myself - but at least it looks like
    a theoretically safe approach that might still beat always recompiling
    everything.
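
    The API-extraction idea can be sketched roughly as follows. This is only
    an illustration under stated assumptions: it uses reflection on a loaded
    class instead of reading .class files from disk, and the names
    ApiFingerprint, api and Sample are invented here. Values of static final
    primitive and String fields are included because they may have been
    inlined into other classes; a real tool would more likely read the
    ConstantValue attribute from the class file.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

final class ApiFingerprint {
    /** Returns a sorted description of everything non-private that other classes could compile against. */
    static SortedSet<String> api(Class<?> c) {
        SortedSet<String> lines = new TreeSet<>();
        lines.add("type " + c.toGenericString()
                + " extends " + c.getGenericSuperclass()
                + " implements " + Arrays.toString(c.getGenericInterfaces()));
        for (Field f : c.getDeclaredFields()) {
            int mods = f.getModifiers();
            if (Modifier.isPrivate(mods) || f.isSynthetic()) {
                continue;
            }
            String line = "field " + f.toGenericString();
            // Conservative assumption: any static final primitive or String may be an inlined constant.
            boolean maybeConstant = Modifier.isStatic(mods) && Modifier.isFinal(mods)
                    && (f.getType().isPrimitive() || f.getType() == String.class);
            if (maybeConstant) {
                try {
                    f.setAccessible(true);
                    line += " = " + f.get(null);
                } catch (ReflectiveOperationException | RuntimeException e) {
                    line += " = <unreadable>";
                }
            }
            lines.add(line);
        }
        for (Constructor<?> k : c.getDeclaredConstructors()) {
            if (!Modifier.isPrivate(k.getModifiers()) && !k.isSynthetic()) {
                lines.add("ctor " + k.toGenericString());
            }
        }
        for (Method m : c.getDeclaredMethods()) {
            if (!Modifier.isPrivate(m.getModifiers()) && !m.isSynthetic()) {
                lines.add("method " + m.toGenericString());
            }
        }
        return lines;
    }
}

/** Hypothetical class used only to try the sketch out. */
final class Sample {
    public static final int LIMIT = 10;
    private int hidden;

    public String name() {
        return "sample";
    }

    private void helper() {
    }
}
```

    Comparing the set returned before and after an incremental compile with
    equals() would then decide whether to fall back to recompiling everything.
    Changing the body of name() or anything private leaves the set unchanged,
    while changing LIMIT or a signature does not.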
     
    Andreas Leitgeb, Jan 8, 2010
    #8
  9. On Fri, 8 Jan 2010, Christian Kütbach wrote:

    > Try maven.
    >
    > It is like Ant but better :)
    >
    > In fact you can produce clean builds of your software, as easy as typing "mvn
    > clean compile" to a commandline.
    >
    > Maven can manage all your dependencies and release softwaremodules.
    >
    > But you need some kind of infrastructure (Artifactory, Nexus).


    Did you even read the original post?

    tom

    --
    There are lousy reviews, and then there's empirical shitness. -- pikelet
     
    Tom Anderson, Jan 8, 2010
    #9
  10. On Fri, 8 Jan 2010, Alessio Stalla wrote:

    > On Jan 8, 1:46 pm, Andreas Leitgeb <>
    > wrote:
    >> Andreas Leitgeb <> wrote:
    >>> Alessio Stalla <> wrote:
    >>>> I don't get why you believe that Java class files do not contain
    >>>> enough dependency information. At least the class hierarchy and
    >>>> classes referred to by fields and methods (signature and code) should
    >>>> be found there. Reflection can be problematic, but it would be with a
    >>>> Java parser too. However, since I'm not an expert in the matter, I may
    >>>> be missing something.

    >>
    >>> Indeed:  static final fields used from other classes.   Only their
    >>> values are used, but their names do not show up in your .class file.

    >>
    >> Small correction:  static final fields with a compiletime known value.

    >
    > Ok, thanks, I didn't know it. However, if this is the only problem,
    > it's a very infrequent: how often is a static final field initialized
    > with a compile time constant changed? Probably infrequently enough
    > that you can just perform a full clean and build in those rare cases.


    No - because you don't even have enough information to *detect* these
    cases. Consider:

    // A.java
    class A {
        public static final String FOO = "foo";
    }

    // B.java
    class B {
        public static final String FOO = A.FOO;
    }

    You change A.java. How do you know you have to recompile B?
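
    The effect can be reproduced with a small program. The following is a
    sketch, assuming it runs on a JDK so that javax.tools can find a compiler;
    the class name StaleConstantDemo and its methods are invented for the
    illustration, and A and B are declared public here (unlike above) only so
    that the fields can be read reflectively.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

final class StaleConstantDemo {
    /** Writes one source file into src and compiles only that file into out. */
    static void compile(Path src, Path out, String fileName, String code) throws Exception {
        Path file = src.resolve(fileName);
        Files.writeString(file, code);
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler(); // null when no compiler is present
        int rc = javac.run(null, null, null,
                "-d", out.toString(), "-cp", out.toString(), file.toString());
        if (rc != 0) {
            throw new IllegalStateException("javac failed for " + fileName);
        }
    }

    /** Loads the named class from out in a fresh class loader and reads its static FOO field. */
    static String readFoo(Path out, String className) throws Exception {
        try (URLClassLoader loader = new URLClassLoader(new URL[] {out.toUri().toURL()}, null)) {
            return (String) loader.loadClass(className).getField("FOO").get(null);
        }
    }

    /** Returns {A.FOO, B.FOO} as observed after changing and recompiling only A.java. */
    static String[] run() {
        try {
            Path src = Files.createTempDirectory("src");
            Path out = Files.createTempDirectory("out");
            compile(src, out, "A.java", "public class A { public static final String FOO = \"foo\"; }");
            compile(src, out, "B.java", "public class B { public static final String FOO = A.FOO; }");
            // Change the constant and rebuild A only, as a timestamp-based build would do.
            compile(src, out, "A.java", "public class A { public static final String FOO = \"bar\"; }");
            return new String[] {readFoo(out, "A"), readFoo(out, "B")};
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] values = run();
        System.out.println("A.FOO = " + values[0] + ", B.FOO = " + values[1]);
    }
}
```

    After the last step A.FOO reads "bar" while B.FOO still reads "foo",
    because the value was copied into B.class when B was compiled and B.class
    keeps no reference back to A.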

    tom

    --
    There are lousy reviews, and then there's empirical shitness. -- pikelet
     
    Tom Anderson, Jan 8, 2010
    #10
  11. On Fri, 8 Jan 2010, Andreas Leitgeb wrote:

    > Joshua Maurice <> wrote:
    >> It appears the "sanest" approach is to obtain or write my own Java
    >> parser. I don't need to do any syntax or semantic checking at all. I
    >> just need an exhaustive list of all used package qualified class
    >> names. Still sounds like a large project though. /sigh

    >
    > This topic appears here every once in a while.
    >
    > Java just doesn't lend itself to structured C/C++-style dependencies,
    > because of the 1:n relation-ship of source files to generated .class
    > files.
    >
    > In C/C++ you have a couple of source-files (ideally one .c and a flock
    > of .h's) that determine the contents of a particular object file. And
    > you know exactly which object files you need for an executable.
    >
    > In Java, however, one source file can result in any number of .class
    > files for all the inner, nested, anonymous, synthetic, or further
    > (non-public) toplevel classes. (I know, it's a partially redundant list.)
    >
    > So the very concept of "which source-files are relevant for this .class"
    > is futile, when one doesn't know the set of .class files in advance.


    This is an outrageous calumny! The conclusion from the last Grand
    Incremental Build Debate was that you *could* do it, but only if you had a
    lot more metadata. We came to the same conclusion as Mr Maurice - you'd
    have to write a java parser to extract the requisite information.

    Although i think the only raw data you couldn't get from class files is
    the origins of compile-time constants. This would be a pretty easy thing
    to add to the class file format, as an attribute at the top level of the
    file, indicating the origins of any entries in the constant pool derived
    from compile-time constants in other classes. Sun might be amenable to an
    RFE to add it.

    Anyway, once you have that information, there is some complicated work in
    computing dependencies, but it's entirely possible. It's just that
    nobody's done it, or done it and kept it current. Except possibly Eclipse ...

    > The only generally *safe* Java build is the complete rebuild.


    Certainly, the only *generally available* safe build is a clean one.

    tom

    --
    There are lousy reviews, and then there's empirical shitness. -- pikelet
     
    Tom Anderson, Jan 8, 2010
    #11
  12. On Fri, 8 Jan 2010 03:08:52 -0800 (PST), Joshua Maurice
    <> wrote, quoted or indirectly quoted someone
    who said :

    >How do you build your java code?


    If I had your problem I would do it one of two ways:

    1. write a "stomp" program that cranks out ant scripts, one per
    package. See http://mindprod.com/jgloss/ant.html
    to see the end result of such a stomp. I don't distribute my stomp
    program since it is highly customised to my needs, and the code is
    unpolished, but you can look at it in the Subversion repository at
    https://wush.net/websvn/mindprod/listing.php?repname=mindprod&path=/com/mindprod/stomp

    In particular look at the MkAnt.java. Stomp as a whole builds many
    other files besides the build.xml and gleans the information it needs
    from several sources. Much of it is concerned with verifying
    consistency of information in various places.

    2. look into Maven. I have not used it, but I gather you tell it your
    overall goals, point it at your source, and it figures out on its own
    how to build it.
    http://mindprod.com/jgloss/maven.html

    --
    Roedy Green Canadian Mind Products
    http://mindprod.com
    There is no end to what can be accomplished if you don’t care who gets the credit.
    ~ Art Rennison
     
    Roedy Green, Jan 8, 2010
    #12
  13. On Fri, 8 Jan 2010, Joshua Maurice wrote:

    > I'm working on a project now with over 20,000 java source files, in
    > addition to more than 4,000 C++ source files, some forms of custom code
    > generation, an eclipse build, and probably other things I don't know
    > offhand.
    >
    > How do you build your java code?


    With a clean build. I have a much smaller project than you.

    My short and serious answer is that your project is too big. It's big
    enough that trying to manage it as a single entity is utter madness. You
    need to find a way to split it into smaller independent parts, each with a
    more tractable build problem. You may balk at this, and indeed, it would
    be difficult, but if you don't do it, there is simply no way that you're
    going to be able to do builds without pain, not with all the build magic
    in the world. Sorry.

    However, if you do want to keep a single build, but make it faster, then
    i think your analysis is correct. The pre-compilation (code generation
    etc) and post-compilation (jarring etc) steps are pretty much like those
    in the C world, and can be done with make or ant, or anything that can
    compare timestamps and trigger a process. The problem is the compilation
    of the java, where changes to a file in one place can require spooky
    recompilation at a distance. And, as you say, there isn't a body of
    practice on doing this in java. It's pretty obvious how it could be
    done, but nobody's done it.

    Except Eclipse.

    > I was thinking that perhaps there's some way to invoke Eclipse from
    > the command line. Is there a way?


    http://help.eclipse.org/ganymede/in...t.doc.isv/guide/jdt_apt_building_with_apt.htm

    > And is Eclipse just "smart enough" to correctly incrementally compile
    > Java code when invoked from the command line?


    Possibly.

    > Frankly, the current state of affairs in the Java community is not
    > acceptable, and even laughable, given that solutions to these problems
    > (fast build, correct build) are known and have been known for many, many
    > years in the context of C and C++.


    And have been necessary purely because compiling C is slow, and C's weak
    dynamic binding means you have to build everything all at once. Java
    projects simply work in smaller bits, and do complete builds quickly.
    Before you dismiss this as a cop-out, consider that there are some really
    very big java systems out there - they got built, and you don't hear a
    great wailing and gnashing of teeth in the java community about the pain
    of building. Indeed, try an experiment - find someone with experience of
    both C and java, working in the normal modes for both, and ask him which
    build experience he'd prefer. Your problem stems from working on java in a
    C mindset.

    tom

    --
    There are lousy reviews, and then there's empirical shitness. -- pikelet
     
    Tom Anderson, Jan 8, 2010
    #13
  14. On Fri, 8 Jan 2010 03:08:52 -0800 (PST), Joshua Maurice
    <> wrote, quoted or indirectly quoted someone
    who said :

    >I'm working on a project now with over 20,000 java source files, in
    >addition to more than 4,000 C++ source files, some forms of custom
    >code generation, an eclipse build, and probably other things I don't
    >know offhand.



    from: http://mindprod.com/jgloss/ant.html#CLEANCOMPILE

    Ant and Java combined create a mighty compiling engine that will blow
    your socks off. The reason is, it loads the Java compiler only once no
    matter how many source files you have. So even a clean compile is over
    in a twinkling.

    My rule in Java is this:

    When I am debugging, I do an ordinary compile unless I update a
    non-private static final constant. Then I do a clean compile of the
    source file containing that constant and of all users of that
    constant.

    If something strange seems to be happening, with old code being used,
    I do a bulk clean compile.

    Just prior to a big test or final release, I do a clean compile.

    After changes that affect a number of packages, I do a site wide clean
    compile. I don’t try to sort out which ones really need it.

    Building the jars and the zips takes more time than compiling, so
    compiling is not the bottleneck any more. MAKE-like tracking of
    dependencies is a game not worth the candle.

    --
    Roedy Green Canadian Mind Products
    http://mindprod.com
    There is no end to what can be accomplished if you don’t care who gets the credit.
    ~ Art Rennison
     
    Roedy Green, Jan 8, 2010
    #14
  15. On Jan 8, 7:28 am, Alessio Stalla <> wrote:
    > I don't get why you believe that Java class files do not contain
    > enough dependency information. At least the class hierarchy and
    > classes referred to by fields and methods (signature and code) should
    > be found there. Reflection can be problematic, but it would be with a
    > Java parser too. However, since I'm not an expert in the matter, I may
    > be missing something.


    I believe my simple example is

    // source foo.java
    public class foo { foo() { bar x = null; } }
    // source bar.java
    public class bar { bar() {} }

    I just double checked with javap, (though I didn't run my own hand-
    written class file parser), and it appears that in foo.class, there is
    not a single reference to "bar". If bar were to be moved so that it
    could no longer be found, then foo's compilation would fail. I don't
    know if there are more cases like this, but I imagine that javac could
    do other optimizations which would remove references to used classes
    from the compiled classfile.

    Then there are also static final int fields.
     
    Joshua Maurice, Jan 8, 2010
    #15
  16. On Jan 8, 7:38 am, Andreas Leitgeb <>
    wrote:
    > Joshua Maurice <> wrote:
    > > It appears the "sanest" approach is to obtain or write my own Java
    > > parser. I don't need to do any syntax or semantic checking at all. I
    > > just need an exhaustive list of all used package qualified class
    > > names. Still sounds like a large project though. /sigh

    >
    > This topic appears here every once in a while.
    >
    > Java just doesn't lend itself to structured C/C++-style dependencies,
    > because of the 1:n relation-ship of source files to generated .class
    > files.
    >
    > In C/C++ you have a couple of source-files (ideally one .c and a flock
    > of .h's) that determine the contents of a particular object file. And
    > you know exactly which object files you need for an executable.
    >
    > In Java, however, one source file can result in any number of .class
    > files for all the inner, nested, anonymous, synthetic, or further
    > (non-public) toplevel classes. (I know, it's a partially redundant list.)
    >
    > So the very concept of "which source-files are relevant for this .class"
    > is futile, when one doesn't know the set of .class files in advance.
    >
    > The only generally *safe* Java build is the complete rebuild.


    I disagree. My plan was to set up make rules like the following. Note
    that these would not be written by hand, but generated from much
    simpler input reminiscent of the input to Maven.

    ## rules for making classfiles of jar1

    # each .class file depends upon its .java source file
    # obtainable through the .class file for non-public top level classes
    jar1/classes/foo.class : jar1/src/foo.java
    jar1/classes/bar.class : jar1/src/bar.java

    # each .class file depends upon the .java source file of all used classes
    # this is the information I would need to write a java parser to get reliably
    jar1/classes/foo.class : jar1/src/bar.java

    ## rules for making classfiles of jar2

    # each .class file depends upon its .java source file
    # obtainable through the .class file for non-public top level classes
    jar2/classes/baz.class : jar2/src/baz.java

    # each .class file depends upon the .java source file of all used classes
    # this is the information I would need to write a java parser to get reliably
    jar2/classes/baz.class : jar1/src/foo.java

    # the actual javac invocation needs the .class files for stuff in other
    # jars aka other javac invocations, so put an order only dependency here
    # to make sure foo.class and bar.class exist when compiling baz.class
    jar2/classes/baz.class | jar1/classes/foo.class jar1/classes/bar.class

    ## then put in rules to make jars
    jar1/jar1.jar : jar1/classes/foo.class jar1/classes/bar.class
    jar2/jar2.jar : jar2/classes/baz.class

    #

    It's more pseudo makefile code at the moment, but I'm pretty sure this
    would work. The build command would be a single javac invocation per
    "jar". The command would first delete all out of date class files of
    the jar, then it would pass only the source java files of those
    deleted class files to javac. Voila: incremental compilation of java.
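
    The "which class files are out of date" step is the part make would do
    from the rules above. As a rough sketch of that step alone, assuming the
    rules and file timestamps have already been collected (the names
    StaleClasses, Rule and sourcesToRecompile are hypothetical, and timestamps
    are plain numbers here rather than real file times):

```java
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

final class StaleClasses {
    /** One generated rule: a class file, its own source, and the sources of the classes it uses. */
    static final class Rule {
        final String classFile;
        final String ownSource;
        final List<String> usedSources;

        Rule(String classFile, String ownSource, List<String> usedSources) {
            this.classFile = classFile;
            this.ownSource = ownSource;
            this.usedSources = usedSources;
        }
    }

    /**
     * Returns the sources to hand to javac: the own source of every class file that is
     * missing or older than any of its prerequisites.
     */
    static SortedSet<String> sourcesToRecompile(List<Rule> rules, Map<String, Long> modified) {
        SortedSet<String> sources = new TreeSet<>();
        for (Rule rule : rules) {
            Long built = modified.get(rule.classFile); // null means the class file does not exist yet
            boolean stale = built == null || isNewer(modified, rule.ownSource, built);
            for (String used : rule.usedSources) {
                stale = stale || isNewer(modified, used, built);
            }
            if (stale) {
                sources.add(rule.ownSource);
            }
        }
        return sources;
    }

    private static boolean isNewer(Map<String, Long> modified, String source, long built) {
        Long time = modified.get(source);
        if (time == null) {
            throw new IllegalArgumentException("unknown source: " + source);
        }
        return time > built;
    }
}
```

    With the three rules above and only bar.java modified after the last
    build, this selects foo.java and bar.java and leaves baz.java alone. That
    is exactly what the make rules say, and it is only as correct as the
    "used classes" lists fed into it.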
     
    Joshua Maurice, Jan 8, 2010
    #16
  17. On Jan 8, 11:15 am, Christian Kütbach <> wrote:
    > Try maven.
    >
    > It is like Ant but better :)
    >
    > In fact you can produce clean builds of your software, as easy as typing
    > "mvn clean compile" to a commandline.
    >
    > Maven can manage all your dependencies and release softwaremodules.
    >
    > But you need some kind of infrastructure (Artifactory, Nexus).


    My company is currently using Maven. I think we all hate it with a
    passion. ~3-4 hour build times without tests from clean. ~1 hour build
    times without tests after a complete build and no other changes! It's
    not incrementally correct, and it's not parallel, thus it's quite
    possibly the worst build system I've ever seen.

    Note that I do admit that Maven is just a framework, but it does come
    with standard plugins, and these standard plugins do not lend
    themselves to incremental compilation at all, so I think it's fair to
    say that Maven is not an incremental build system (unless you hack /
    rewrite the bejesus out of it).

    I was thinking of at least writing my own Maven plugin to get its
    reactor dependency information so that I could parallelize it myself
    at the pom level. However, parallelization at the pom level is missing
    out on a lot of parallelization opportunities. See Recursive Make
    Considered Harmful.
    http://aegis.sourceforge.net/auug97.pdf
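
    To illustrate the difference in granularity, here is a small Python
    sketch that groups targets into "waves" that could run in parallel.
    The function name build_waves and the example graphs are made up for
    illustration (they reuse the jar1/jar2 names from the earlier pseudo
    makefile); it uses the standard library graphlib module (Python 3.9+)
    and has nothing to do with how Maven's reactor actually works.

```python
from graphlib import TopologicalSorter


def build_waves(deps):
    """Group targets into waves; each wave could be built in parallel.

    deps maps each target to the set of targets it depends on.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on a dependency cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything buildable right now
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Module (pom) level: jar2 simply waits for all of jar1.
pom_level = {"jar2": {"jar1"}}

# File level: packaging jar1.jar can overlap with compiling baz.class.
file_level = {
    "jar1.jar": {"foo.class", "bar.class"},
    "baz.class": {"foo.class", "bar.class"},
    "jar2.jar": {"baz.class"},
}

print(build_waves(pom_level))
print(build_waves(file_level))
```

    At the module level nothing overlaps, while at the file level the
    second wave contains both baz.class and jar1.jar. With 200+ jars the
    lost overlap adds up.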

    I also strongly suspect the internal coding is shit for a lot of it, as
    it takes much longer to do its work than a comparable ant or make
    system. Even more so, all of the network accesses, zipping and
    unzipping, copying, etc., really make it unbearable. Maven just has all
    the signs that it was never seriously intended to replace ant, make,
    or other build systems for large projects. I'm trying to find an
    *alternative* to Maven, but thanks for mentioning it.
     
    Joshua Maurice, Jan 8, 2010
    #17
  18. On Jan 8, 9:31 pm, Joshua Maurice wrote:
    > My company is currently using Maven. I think we all hate it with a
    > passion. ~3-4 hour build times without tests from clean. ~1 hour build
    > times without tests after a complete build and no other changes! It's
    > not incrementally correct, and it's not parallel, thus it's quite
    > possibly the worst build system I've ever seen.


    That's true, at the moment. Maven 3 will probably have parallel builds
    [1]. On well set up project structures this should be very nice.

    [1] http://jira.codehaus.org/browse/MNG-3004
     
    Stefan Lotties, Jan 8, 2010
    #18
  19. On Jan 8, 12:11 pm, Tom Anderson wrote:
    > On Fri, 8 Jan 2010, Joshua Maurice wrote:
    > > I'm working on a project now with over 20,000 java source files, in
    > > addition to more than 4,000 C++ source files, some forms of custom code
    > > generation, an eclipse build, and probably other things I don't know
    > > offhand.

    >
    > > How do you build your java code?

    >
    > With a clean build. I have a much smaller project than you.
    >
    > My short and serious answer is that your project is too big. It's big
    > enough that trying to manage it as a single entity is utter madness. You
    > need to find a way to split it into smaller independent parts, each with a
    > more tractable build problem. You may balk at this, and indeed, it would
    > be difficult, but if you don't do it, there is simply no way that you're
    > going to be able to do builds without pain, not with all the build magic
    > in the world. Sorry.


    I agree this is the "best" approach. However, that would require
    changing my company's mindset and culture, and doing significant
    refactoring of code. The culture here is that they don't believe in
    design by contract or in general purpose reusable components, and as
    such, we've coded ourselves into a tangled mess. I've been reassigned
    to try and speed up build times, but before that I was working on one
    of the things at the base of the dependency tree. When I made a change,
    I generally had to build everything and run all tests because the tests
    for my component were not comprehensive, mostly because of this
    culture of no design by contract. Instead, any small change made at
    the root may have subtle consequences and break code far far away,
    either at compile time or test time.

    I disagree with your assessment though that no magic bullet will make
    these problems go away. I don't call it a magic bullet, but an
    incremental parallel build would work wonders. If I had that, compile
    times would go down drastically, like 2+ orders of magnitude.

    > However, if you do want to keep a single, build, but make it faster, then
    > i think your analysis is correct. The pre-compilation (code generation
    > etc) and post-compilation (jarring etc) steps are pretty much like those
    > in the C world, and can be done with make or ant, or anything that can
    > compare timestamps and trigger a process. The problem is the compilation
    > of the java, where changes to a file in one place can require spooky
    > recompilation at a distance. And, as you say, there isn't a body of
    > practice on doing this in java. It's pretty obvious how it could be
    > done, but nobody's done it.
    >
    > Except Eclipse.
    >
    > > I was thinking that perhaps there's some way to invoke Eclipse from
    > > the command line. Is there a way?

    >
    > http://help.eclipse.org/ganymede/index.jsp?topic=/org.eclipse.jdt.doc....
    >
    > > And is Eclipse just "smart enough" to correctly incrementally compile
    > > Java code when invoked from the command line?

    >
    > Possibly.
    >
    > > Frankly, the current state of affairs in the Java community is not
    > > acceptable, and even laughable, given that solutions to these problems
    > > (fast build, correct build) are known and have been known for many, many
    > > years in the context of C and C++.

    >
    > And have been necessary purely because compiling C is slow, and C's weak
    > dynamic binding means you have to build everything all at once. Java
    > projects simply work in smaller bits, and do complete builds quickly.
    > Before you dismiss this as a cop-out, consider that there are some really
    > very big java systems out there - they got built, and you don't hear a
    > great wailing and gnashing of teeth in the java community about the pain
    > of building. Indeed, try an experiment - find someone with experience of
    > both C and java, working in the normal modes for both, and ask him which
    > build experience he'd prefer. Your problem stems from working on java in a
    > C mindset.


    I'll bite. How long do you think it would take to recompile 20,000
    java files, and just the java files? I'm about to get numbers on that
    anyway, and I'll share if / when I get them.

    I agree with most of your assessment except with one minor
    qualification: if your codebase is pure java, then it works out pretty
    well. You can just get it all in Eclipse, and you have a wonderfully
    good incremental compiler. However, my codebase is not all java. It
    has custom inhouse codegen which makes Java. It has Java classes
    implemented in JNI, so some of the C++ compile depends on Java compile
    for javah. A lot of the tests are the reverse: the Java tests depend upon
    the C++ code, some of which is itself generated by a Java tool, which
    itself depends on more C++ and Java code being built. That's ignoring
    entirely the biggest mess in it all: our Eclipse GUI plugins thingy
    build.

    Maybe, given all that, it could just recompile the java every time. It
    might work, as long as I define proper hackery to have the javah step
    run iff the java source file has changed, not the class file, so as not
    to trigger rebuilds of C++ stuff. (Would that work?)
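
    One related trick, sketched here in Python, is a different technique
    from the one just described: let the generator run whenever it wants,
    but only replace the generated header when its content actually
    differs, so its timestamp stays put and timestamp-based C++ rules see
    nothing new. The function name write_if_changed is made up, and how
    the header text is produced (javah or anything else) is left out.

```python
from pathlib import Path


def write_if_changed(path, new_text):
    """Write new_text to path only if it differs from the current content.

    Returns True if the file was (re)written. An identical file is left
    alone, so its modification time does not move.
    """
    path = Path(path)
    if path.exists() and path.read_text() == new_text:
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(new_text)
    return True
```

    The generation step still costs time on every build; what this avoids
    is the cascade of C++ recompiles after a Java change that did not
    touch the native method signatures.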

    However, one of my goals is still to get the build system overhead
    down to seconds ideally, though I think 1-3 minutes is a more
    reasonable goal. Something where I can hit "build" from root after
    making a change, instead of the current situation where every
    developer thinks he knows more than the build system, and only builds
    subfolders which he knows are affected by his change. However, when
    the developer misses something, and it gets to the official build
    machine streaming build, it causes lots of lost time. I want the
    developer to no longer feel obliged to "hack it" and instead let the
    build system do its job.

    Speaking of Eclipse, perhaps the way to go might be to just do the
    entire build in Eclipse. Write Eclipse build plugins for C++, for our
    custom codegen, etc. I'll have to seriously consider that.
     
    Joshua Maurice, Jan 8, 2010
    #19
  20. On 08-01-2010 06:08, Joshua Maurice wrote:
    > I'm sorry if this is answered in a FAQ somewhere, but the first
    > comp.lang.java.programmer FAQ I found referenced Java version 1.1, so
    > I stopped reading there.
    >
    > I'm working on a project now with over 20,000 java source files, in
    > addition to more than 4,000 C++ source files, some forms of custom
    > code generation, an eclipse build, and probably other things I don't
    > know offhand.
    >
    > Due to various requirements, we cannot put all of the source files
    > into a single jar. Many jars are requirements. (Different jars for
    > plugins, for client API, server impl. Then multiply by several
    > different products, and we arrive at over required 200 jars.)
    >
    > How do you build your java code?
    >
    > I'm looking for a fast, correct build. Let me be very specific with
    > the term "correct". A build is correct iff a build is equivalent to if
    > you completely cleaned the source file system beforehand of previously
    > build files.
    >
    > To be fast with such a large number of files, you basically need to
    > have an incremental parallel build, possibly distributed (though this
    > is harder with Java than say C++, for example).


    > How do you compile your Java? 20,000 source files is not something to
    > laugh at and just "clean each time" nor "cross your fingers and hope
    > that Ant depends or javamake catches all of the dependencies" as each
    > missed dependency could result in lost developer man weeks. Moreover,
    > the problem is exacerbated with the other non-Java components of our
    > build. For example, some java source is generated by in-house custom
    > tools (in order to get serialization between C++ and Java), and
    > incremental becomes even more important as it's not just 20,000 java
    > source files anymore. It's even more taking even more time, greatly
    > increasing the need for a correct fast (and thus incremental and
    > parallel) build.


    Use ant, let it clean everything and build from scratch. And buy
    a box that can do the job.

    > Frankly, the current state of affairs in the Java community is not
    > acceptable, and even laughable, given that solutions to these problems
    > (fast build, correct build) are known and have been known for many,
    > many years in the context of C and C++. (That javac cannot or will not
    > output dependency information ala gcc -M is amazing.)


    Java usually compiles so much faster than C/C++ that it is not a
    problem that needs to be handled.

    Arne
     
    Arne Vajhøj, Jan 9, 2010
    #20