Incremental Java Compile

Tom Anderson

I'm guessing that what happens is this: new BB().x gets converted into
an AST which is roughly a FieldAccess where the object is (new BB()) (an
opaque expression) and the field is "x". The code generation sees an
opaque expression--and expressions may be null, so it does the check.
The key is that it doesn't know that the expression is of a type which
cannot return null--Java does not do that much static analysis at
compile time, to my knowledge.

Yes, that sounds entirely plausible. Javac *could* eliminate that null
check if it did more analysis, but it wouldn't really be worth the effort,
so it doesn't.

tom
 
Joshua Maurice

Update! Initial estimates put my approach at ~7x the build time
(that's about 600% more) when calling javac once per java file after a
javac invocation on all of the java files in a directory. This is for
my new Ant-like tool, profiled with the YourKit profiler. It's
currently doing a full clean rebuild of ~2000 java files in about 3
minutes, 37 seconds (which itself is a near meaningless measure without
understanding the content of the Java files, I know). Only about 15%
of the time is doing "useful" work. The rest is spent redoing javac
invocations to get the required dependency information.

I could probably speed that up by further parallelizing the separate
javac invocations on single files of each java source dir.
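
A minimal sketch of that parallelization, assuming the in-process
compiler from javax.tools is used and that concurrent tasks which do
not share a file manager are safe to run side by side. The class name
ParallelSingleFileCompiles and its methods are hypothetical; they are
not part of the tool described above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

import javax.tools.JavaCompiler;
import javax.tools.StandardJavaFileManager;
import javax.tools.ToolProvider;

// Hypothetical helper, not part of the tool described in the post.
class ParallelSingleFileCompiles {

    // Runs one javac task per source file, up to `threads` at a time.
    // Returns how many of the single-file compiles succeeded.
    static int compileEach(List<Path> sources, final Path classesDir, int threads) {
        final JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        if (javac == null) {
            throw new IllegalStateException("no system Java compiler; is this a JRE?");
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Boolean>> pending = new ArrayList<>();
            for (final Path source : sources) {
                Callable<Boolean> job = () -> {
                    // Each task gets its own file manager, so none is shared across threads.
                    try (StandardJavaFileManager files =
                            javac.getStandardFileManager(null, null, StandardCharsets.UTF_8)) {
                        List<String> options = Arrays.asList(
                                "-d", classesDir.toString(),
                                "-classpath", classesDir.toString());
                        return javac.getTask(null, files, null, options, null,
                                files.getJavaFileObjects(source.toFile())).call();
                    }
                };
                pending.add(pool.submit(job));
            }
            int succeeded = 0;
            for (Future<Boolean> result : pending) {
                if (result.get()) succeeded++;
            }
            return succeeded;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    // Demo: writes two independent sources to a temp dir and compiles them in parallel.
    static int demo() {
        try {
            Path dir = Files.createTempDirectory("single-file-javac");
            Path a = Files.write(dir.resolve("A.java"),
                    "public class A {}".getBytes(StandardCharsets.UTF_8));
            Path b = Files.write(dir.resolve("B.java"),
                    "public class B {}".getBytes(StandardCharsets.UTF_8));
            return compileEach(Arrays.asList(a, b), dir, 2);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Whether this helps in practice depends on how much of each invocation
is compiler start-up versus real work; the sketch only shows the
fan-out, not a measurement.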

I posted a comment to an already existing bug report / enhancement
request over at Sun, I mean Oracle. I wonder if anyone will look or
care.
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4639384

In the meantime, I fancy in my head modifying javac myself, though I'm
not sure how practical that would be to maintain privately going
forward.
 
Joshua Maurice

See http://www.jot.fm/issues/issue_2004_12/article4.pdf
for "ghost dependencies".

I don't think as presented in the paper that ghost dependencies will
catch this. Again, take the example
//AA.java
public class AA { public final int x = 1; }
//BB.java
public class BB extends AA {}
//CC.java
public class CC { public final int x = new BB().x; }

CC.java has ghost dependencies "CC", "BB", "x", aka all names in the
class file (using the Java technical definition of "name" as a single
identifier, or a list of identifiers separated by dots '.'), then get
all possible interpretations under all imports (including the implicit
import <this-package>.*;), then close over all such prefixes. (Or
something like that. The details are somewhat involved. See the
paper.)
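
To make the description above concrete, here is a rough sketch of the
"all interpretations under all imports, closed over prefixes" step. It
is a simplification of the idea as summarized in this post, not the
paper's actual algorithm, and the class GhostCandidates and its
methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper; a simplification of the idea, not the paper's algorithm.
class GhostCandidates {

    // "a.b.C" -> ["a", "a.b", "a.b.C"]
    static List<String> prefixes(String dottedName) {
        List<String> out = new ArrayList<>();
        int dot = dottedName.indexOf('.');
        while (dot >= 0) {
            out.add(dottedName.substring(0, dot));
            dot = dottedName.indexOf('.', dot + 1);
        }
        out.add(dottedName);
        return out;
    }

    // Every fully qualified name that some name in the source could resolve to,
    // given the packages imported on demand (use "" for the unnamed package).
    static Set<String> candidates(List<String> namesInSource, List<String> onDemandPackages) {
        Set<String> out = new LinkedHashSet<>();
        for (String name : namesInSource) {
            for (String prefix : prefixes(name)) {
                for (String pkg : onDemandPackages) {
                    out.add(pkg.isEmpty() ? prefix : pkg + "." + prefix);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // CC.java from the example: unnamed package plus the implicit java.lang.*
        System.out.println(candidates(
                Arrays.asList("CC", "BB", "x"), Arrays.asList("", "java.lang")));
        // prints [CC, java.lang.CC, BB, java.lang.BB, x, java.lang.x]
    }
}
```

Note that "AA" never shows up among the candidates for CC.java, which
is the gap the example is meant to expose.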

AA.class exports the name "AA", aka the full name of the class.
BB.class exports the name "BB", aka the full name of the class.

I'm not sure offhand if there is a good way to extend ghost
dependencies to catch this case without introducing a lot of false
positives.

--
I've also given some thought, as you had, to maintaining this list
that keeps track of super classes. I'm not sure how it would interact
with this example:

//AAA.java
public class AAA { public static int aaa = 1; }
//BBB.java
public class BBB { public static AAA bbb = null; }
//CCC.java
public class CCC { public static BBB ccc = null; }
//DDD.java
public class DDD { public final int ddd = CCC.ccc.bbb.aaa; }

If we change AAA.aaa to "public static double aaa = 2", then BBB.class
would be a noop recompile, CCC.class would be a noop recompile, but
DDD.class would need a recompile. Again, I think I would need the same
information to make this work without endless cascading; I would need
to know that DDD (directly) uses AAA. I thus think that your / my
scheme of keeping track of super classes would not be terribly
effective / productive.

I might have to backtrack and/or apologize. I've actually come back to
this idea here, and I'm thinking it could work decently well.
Specifically, the rules would be:

1- A java file's compilation is out of date when its source file has
been modified since the last compilation.

2- A java file's compilation is out of date when it has a newer Ghost
Dependency, see paper: www.jot.fm/issues/issue_2004_12/article4.pdf

3- A java file's compilation is out of date when one of its output
class files has a reference to a type
3a- whose class file has a last "interface changed" time which is
newer than the java file's last compilation,
3b- or which is in an output class file of an "out of date" java file
which is part of this javac task,
3c- or which has a super type (direct or transitive) whose class has a
last "interface changed" time which is newer than the java file's last
compilation,
3d- or which has a super type (direct or transitive) which is in an
output class file of an "out of date" java file which is part of this
javac task.

4a- A java file's compilation is out of date when
- it has a potentially used constant variable field simple name X
(which is basically any simple name of any name in the source),
- and there is a class file on the compile classpath which "exports" a
constant variable field which has simple name X,
- and the "exported" constant variable field has a "last changed" time
which is newer than the java file's last compilation.
4b- A java file's compilation is out of date when
- it has a potentially used constant variable field simple name X
(which is basically any simple name of any name in the source),
- and there is an "out of date" java file in this javac task which has
a class file which "exports" a constant variable field which has
simple name X.
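
A minimal sketch of how rules 1, 3a-3d and 4a might be checked,
assuming the build tool has already recorded per-class timestamps and
direct super types. Rules 2 and 4b are left out. The data model
(TypeInfo, the maps, every method name) is hypothetical and only there
to make the rules executable.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical data model; all class, field, and method names are illustrative.
class OutOfDateRules {

    // What the build tool would have recorded about one class file.
    static class TypeInfo {
        final long interfaceChanged;        // last "interface changed" time
        final boolean fromOutOfDateSource;  // output of an out-of-date file in this task
        final List<String> directSupers;    // direct super class and interfaces
        TypeInfo(long interfaceChanged, boolean fromOutOfDateSource, String... directSupers) {
            this.interfaceChanged = interfaceChanged;
            this.fromOutOfDateSource = fromOutOfDateSource;
            this.directSupers = Arrays.asList(directSupers);
        }
    }

    // Rules 3a-3d: the type itself, or any direct or transitive super type, is stale.
    static boolean isStale(String type, long lastCompiled,
            Map<String, TypeInfo> types, Set<String> seen) {
        TypeInfo info = types.get(type);
        if (info == null || !seen.add(type)) return false; // unknown, or already checked
        if (info.interfaceChanged > lastCompiled || info.fromOutOfDateSource) return true;
        for (String superType : info.directSupers) {
            if (isStale(superType, lastCompiled, types, seen)) return true;
        }
        return false;
    }

    static boolean isOutOfDate(long sourceModified, long lastCompiled,
            Set<String> referencedTypes, Map<String, TypeInfo> types,
            Set<String> simpleNamesInSource, Map<String, Long> exportedConstantChanged) {
        if (sourceModified > lastCompiled) return true;             // rule 1
        Set<String> seen = new HashSet<>();
        for (String type : referencedTypes) {                       // rules 3a-3d
            if (isStale(type, lastCompiled, types, seen)) return true;
        }
        for (String name : simpleNamesInSource) {                   // rule 4a
            Long changed = exportedConstantChanged.get(name);
            if (changed != null && changed > lastCompiled) return true;
        }
        return false;
    }

    // Demo: a file compiled at time 100 references T1, where T1 extends T2 extends T3.
    static boolean demo(long t3InterfaceChanged) {
        Map<String, TypeInfo> types = new HashMap<>();
        types.put("T1", new TypeInfo(10, false, "T2"));
        types.put("T2", new TypeInfo(10, false, "T3"));
        types.put("T3", new TypeInfo(t3InterfaceChanged, false));
        return isOutOfDate(50, 100, Collections.singleton("T1"), types,
                Collections.<String>emptySet(), Collections.<String, Long>emptyMap());
    }
}
```

With these inputs, demo(200) reports the file as out of date through
rule 3c, and demo(50) does not.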

I just thought this up today from a small discussion on an OpenJDK
mailing list, and due to a couple of realizations about how javac
internally works, specifically that I think closing dependencies over
all super types (direct and transitive) of the dependency would be
equivalent to using javac's -verbose output.

What remains to be seen is if there's any other corner case which I'm
missing.
 
Joshua Maurice

Further update. I implemented it, and my tests failed. The above rules
do not catch the following:

<root of test>/aa/src/main/java/T3.java ... contents
public class T3 { public static final int C = 1; }
<root of test>/bb/src/main/java/T2.java ... contents
public class T2 extends T3 {}
<root of test>/cc/src/main/java/T1.java ... contents
public class T1 extends T2 {}
<root of test>/dd/src/main/java/Test.java ... contents
public class Test { public static final int D = T1.C; }

Each separate directory under <root of test> is a different javac
task. The first build goes like:
- Enter <root>/aa. T3.java has not been built before. Build it now
with a single javac invocation.
- etc. for bb, cc, and dd.

Now, a developer comes in and modifies
<root of test>/bb/src/main/java/T2.java
to
public class T2 extends T3 { int x; }

A second build will come along and not find a rule which declares
Test.java to be "out of date". The problem is again constant variable
fields. The constant variable field T1.C is expanded inline in
Test.class, so I have no reference to the dependency of Test.java to
T1. I thought I might be able to handle constant variable fields with
special rules, but I don't think it will work. The fields can be
"hidden", and catching that with any more hacks is not worth the
trouble to me at the moment.

On other fronts, the OpenJDK mailing list discussion gave me a useful
piece of insight. Apparently one can use the JavacTask API in
tools.jar to get the types of "nodes" in the parse tree. Specifically,
call parse, save the CompilationUnitTree's, then call analyze, then
use ?? to get the types of the "nodes" in the saved
CompilationUnitTree. I have taken brief looks over the type APIs
several times, type mirrors and such, but I'm mostly at a loss. I'll
have to spend some significant time googling and playing around with
those to figure out how to do it, but I was hoping perhaps someone
here has had more experience with this library and can point me to an
example somewhere, or provide an example. Please?
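
One possible shape for this, offered as a sketch rather than a tested
recipe: the public com.sun.source.util.Trees class has
getElement(TreePath) and getTypeMirror(TreePath), which can be called
on the saved trees once analyze has run. The class ReferencedTypes,
the in-memory StringSource, and the choice to look only at identifiers
and member selects are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

import javax.lang.model.element.Element;
import javax.lang.model.element.TypeElement;
import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.IdentifierTree;
import com.sun.source.tree.MemberSelectTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreePathScanner;
import com.sun.source.util.Trees;

// Hypothetical helper: collects the types that identifiers and member selects resolve into.
class ReferencedTypes {

    // Holds one source file in memory so the sketch needs no files on disk.
    static class StringSource extends SimpleJavaFileObject {
        private final String code;
        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }
        @Override public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    static Set<String> find(String publicClassName, String code) {
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        JavacTask task = (JavacTask) javac.getTask(null, null, null, null, null,
                Collections.singletonList(new StringSource(publicClassName, code)));
        final Trees trees = Trees.instance(task);
        final Set<String> found = new TreeSet<>();
        TreePathScanner<Void, Void> scanner = new TreePathScanner<Void, Void>() {
            // Adds the type that declares the current node's element, and its enclosing types.
            private void note() {
                Element e = trees.getElement(getCurrentPath());
                for (; e != null; e = e.getEnclosingElement()) {
                    if (e instanceof TypeElement) {
                        found.add(((TypeElement) e).getQualifiedName().toString());
                    }
                }
            }
            @Override public Void visitIdentifier(IdentifierTree node, Void v) {
                note();
                return super.visitIdentifier(node, v);
            }
            @Override public Void visitMemberSelect(MemberSelectTree node, Void v) {
                note();
                return super.visitMemberSelect(node, v);
            }
        };
        try {
            Iterable<? extends CompilationUnitTree> units = task.parse();
            task.analyze(); // after this, tree nodes resolve to elements and types
            for (CompilationUnitTree unit : units) {
                scanner.scan(unit, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }
}
```

Run on the AA/BB/CC example from earlier in the thread, the member
select "x" resolves to the field declared in AA, so the result
contains both "AA" and "BB".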
 
Joshua Maurice

[...]
Now, a developer comes in and modifies
  <root of test>/bb/src/main/java/T2.java
to
  public class T2 extends T3 { int x; }

Err, that should read
public class T2 extends T3 { int C = 3; }
so that the new field "C" "hides" the super class's field "C".
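
The hiding effect can be seen in a single file. This is a sketch with
nested classes standing in for the separate source directories; the
wrapper classes HidingDemo, Before and After are hypothetical, and the
hiding field is declared static final here (an assumption) so that
T1.C still compiles from a static context.

```java
// Hypothetical single-file rendering of the T3/T2/T1 example, using nested classes.
class HidingDemo {

    // Before the edit: T2 declares nothing, so T1.C resolves to T3.C.
    static class Before {
        static class T3 { public static final int C = 1; }
        static class T2 extends T3 {}
        static class T1 extends T2 {}
        static final int D = T1.C;
    }

    // After the edit: T2 declares its own C, which hides T3.C for T2 and T1.
    // Assumption: declared static final so that T1.C still compiles.
    static class After {
        static class T3 { public static final int C = 1; }
        static class T2 extends T3 { static final int C = 3; }
        static class T1 extends T2 {}
        static final int D = T1.C;
    }

    public static void main(String[] args) {
        System.out.println(Before.D + " " + After.D); // prints "1 3"
    }
}
```

The source text "T1.C" is identical in both cases; only a declaration
in a super type changed, which is why a rule keyed on direct
references alone misses it.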
 
Joshua Maurice

This is probably one of the last updates for a while. It appears that
the "magic code" which I wanted from the start is simply the
following. I think that the dependencies it gives, once transitively
closed over super types, are sufficient to do an incrementally
correct, cascading rebuild with possible early termination. At least,
the tests I have thus far tell me so.


package com.informatica.devops.jicb.incr_java;

import java.util.HashSet;
import java.util.Set;

import javax.lang.model.element.Element;
import javax.lang.model.element.ElementKind;

import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.Tree;
import com.sun.source.util.TreeScanner;
import com.sun.tools.javac.tree.JCTree;
import com.sun.tools.javac.tree.TreeInfo;

public class CompileDependencies {
    // must call this after JavacTask.analyze to get useful information
    public static Set<String> get(final CompilationUnitTree tree) {
        final Visitor visitor = new Visitor();
        visitor.scan(tree, null);
        return visitor.fullNamesOfAllFoundTypes;
    }

    private static class Visitor extends TreeScanner<Void, Void> {
        public final Set<String> fullNamesOfAllFoundTypes = new HashSet<String>();

        // Called for every node in the tree, so every node's symbol is inspected.
        @Override public Void scan(Tree node, Void v) {
            if (node != null) {
                // The symbol the node resolves to, or null if it has none.
                Element ele = TreeInfo.symbol((JCTree) node);
                // Walk outward (member -> class -> outer class -> package)
                // and record every type encountered on the way.
                for ( ; ele != null; ele = ele.getEnclosingElement()) {
                    final ElementKind kind = ele.getKind();
                    if (ElementKind.CLASS == kind
                            || ElementKind.INTERFACE == kind
                            || ElementKind.ANNOTATION_TYPE == kind
                            || ElementKind.ENUM == kind) {
                        fullNamesOfAllFoundTypes.add(ele.toString());
                    }
                }
            }
            super.scan(node, null);
            return null;
        }
    }
}
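
The "transitively close over super types" step mentioned above could
look roughly like the following. This is only a sketch: it uses
reflection on already loaded classes as a stand-in for whatever type
information the build tool actually keeps, and the class
SuperTypeClosure is a hypothetical name.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Deque;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper; reflection stands in for the build tool's own type information.
class SuperTypeClosure {

    // Returns the names of the given types plus all of their direct and
    // transitive super classes and super interfaces.
    static Set<String> close(Collection<Class<?>> directDependencies) {
        Set<String> closed = new TreeSet<>();
        Deque<Class<?>> work = new ArrayDeque<>(directDependencies);
        while (!work.isEmpty()) {
            Class<?> type = work.pop();
            if (!closed.add(type.getName())) continue; // already expanded
            if (type.getSuperclass() != null) work.push(type.getSuperclass());
            work.addAll(Arrays.asList(type.getInterfaces()));
        }
        return closed;
    }

    public static void main(String[] args) {
        // A direct dependency on ArrayList pulls in AbstractList, List, Collection, ...
        System.out.println(close(Arrays.<Class<?>>asList(java.util.ArrayList.class)));
    }
}
```

A file would then be considered for recompilation when any name in the
closed set belongs to a class whose interface changed.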
 
