The curse of constant fields


Arved Sandstrom

Arne said:
I would expect almost everyone to do clean before an official build.

It is only with developer builds that safe versus quick is an issue.

I must admit that in development I don't always clean as part of build.

Arne

As the current official builder (all changes to the Subversion production
branch, EAR construction etc) for a reasonably important J2EE app (let's
just say that if I don't include what I should include, all motor vehicle
administrative activities grind to a halt in a Canadian province until we
deploy the previous working EAR) I am pretty fanatic about sanitizing
everything prior to an official build.

After bad experiences with the reliability of IDEs with regard to builds
(unless one is very careful) I no longer use them in any stage of an
official build. All the SVN ops are command line (and a fresh checkout is
involved at some stage), and I use an Ant build file for the EAR
construction. No clean is required since I do a fresh checkout.

After the classes are built I run JUnit tests, and after the EAR is built
I deploy it locally using the server's admin console for some sanity
tests through the application web interface.

I also no longer use IDEs for some Subversion operations. I've
encountered worrisome behaviour in certain cases. For example, with
Eclipse it's one workspace to a branch (I'll never do SVN switching in
the IDE, and in fact I tend to avoid switching when at all possible), I
won't do merges through an IDE, I tend not to do updates through an
IDE...it pretty much boils down to doing small groups of commits.

I mention the Subversion SOP I have because it ties in with how I
approach production and test builds.

I now always clean everything associated with an EAR build/deploy in
development, because I've seen partial cleans mess stuff up too often.
And in any case even with monster J2EE apps the clean takes very little
time. We have Hudson set up to do pretty much all of the above - fresh
checkout, compile, JUnit test suite, build - after each developer
commit...this is more for convenience than anything else.

In case anyone is wondering what I do use an IDE for, it's mainly as a
high-powered editor. :)

AHS
 

Mike Schilling

Arne said:
What this thread started with.

If the class file contains 123, then you don't know that it really
was mypack.MyClass.ConstX ...

I think the referencing class file contains both the name of the
referenced constant and its value, but I wouldn't swear to it. Hold
on ...

No, it doesn't, even if you compile with -g. That's just wrong. The
name should be in the constant table, as documentation.
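
To make the inlining concrete, here's a minimal demonstration (class
names hypothetical):

public class Lib {
    // A compile-time constant: javac copies the value into every class
    // that references Lib.VERSION, and records no reference back to Lib.
    public static final int VERSION = 1;
}

public class App {
    public static void main(String[] args) {
        System.out.println(Lib.VERSION);
    }
}

Compile both and run App: it prints 1. Change VERSION to 2, recompile only
Lib.java, and App still prints 1 - the old value is baked into App.class,
with nothing to say it came from Lib.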
 

Arne Vajhøj

Mike said:
I think the referencing class file contains both the name of the
referenced constant and its value, but I wouldn't swear to it. Hold
on ...

No, it doesn't, even if you compile with -g. That's just wrong. The
name should be in the constant table, as documentation.

To me it only makes sense to either get it from the other class or
use it as a literal (as it does). Using it as a literal but having
a ref to the other class seems odd to me.

I wonder why they chose not to get it from the other class. Can the
performance gain from not doing it really justify this?

I believe that C# does the same as Java, so there must be some reason.

Arne
 

Arne Vajhøj

Arved said:
As the current official builder (all changes to the Subversion production
branch, EAR construction etc) for a reasonably important J2EE app (let's
just say that if I don't include what I should include, all motor vehicle
administrative activities grind to a halt in a Canadian province until we
deploy the previous working EAR) I am pretty fanatic about sanitizing
everything prior to an official build.

After bad experiences with the reliability of IDEs with regard to builds
(unless one is very careful) I no longer use them in any stage of an
official build. All the SVN ops are command line (and a fresh checkout is
involved at some stage), and I use an Ant build file for the EAR
construction. No clean is required since I do a fresh checkout.

After the classes are built I run JUnit tests, and after the EAR is built
I deploy it locally using the server's admin console for some sanity
tests through the application web interface.

Details probably vary quite a bit, but the overall concept of
a non-IDE-based build is what most companies do.

Arne
 

Mike Schilling

Arne said:
To me it only makes sense to either get it from the other class or
use it as a literal (as it does). Using it as a literal but having
a ref to the other class seems odd to me.

It's documentation: "I, class A, use the constant pkg.Colors.BLUE".
IMHO, putting more information into something like a class file is
always better (within reason) than putting in less. If you put it in,
at worst you have a bit of wasted space. If you leave it out, there's
information it's impossible to get (as in this example, where its
absence makes it impossible to build a classfile-based dependency
engine.)
I wonder why they chose not to get it from the other class. Can the
performance gain from not doing it really justify this?

I'd think the gain must be in not loading the class that defines the
constants. Surely the JIT could optimize the access sufficiently
well.
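
There's support for that: reading a compile-time constant doesn't even
trigger initialisation of the defining class. A quick test (class names
hypothetical):

class Constants {
    static { System.out.println("Constants initialized"); }

    public static final int ANSWER = 42;              // compile-time constant
    public static final Object MARKER = new Object(); // not a constant expression
}

public class InitDemo {
    public static void main(String[] args) {
        System.out.println(Constants.ANSWER); // prints 42, no initializer message
        System.out.println(Constants.MARKER); // "Constants initialized" runs first
    }
}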
 

Tom Anderson

Arne said:
I don't think it is simple.

As far as I can see, then:
* it is not possible only using class files - source code analysis is
needed

Yes, that's true. It would be nice if you could do it with class files
alone, but the constant inlining thing means you can't (although i think
this is the only reason you can't - very annoying!). You'd either need to
parse the source yourself (there are grammars for java for popular
compiler-compilers) or hook into compilation using javax.lang.model
somehow.
* it is not possible only with current code - both current and previous
code is necessary

That's certainly true. Or rather, you need to record the crucial
information from the old code, even if you don't keep the code itself.
Classpath for source code is not a well-defined concept.

Perhaps not. But the vast majority of code is laid out as a tree rooted
somewhere, just like class files - wouldn't that be adequate? Or do i
misunderstand?
And a database with info for different versions adds work.

Well, yes. It's the heart of the problem. But i don't think it's a very
complicated database, and i don't think all this adds up to being
astonishingly difficult.

tom
 

Tom Anderson

Mike said:
And the values of their constants, which is where we came in.

Ah, i'm defining the values of constants to be part of the interface!
And if there were any changes to that class's ancestor classes, or any
interfaces it implements (directly or indirectly.)

That's why i said that a class is a dependent of "every other class it
references directly, and every class which is a superclass of a class it
references" (with classes implicitly including interfaces).

You probably wouldn't want to track dependencies like that, though - you'd
want to record direct dependencies and then compute transitive closures as
needed.
All of this is feasible, but it's complicated. Once you've figured out
in detail all of the possible sorts of changes you'd like to track (and
that's complex in itself), you've now got to make the right tradeoff
between:

Yes. I think the class-level approach would probably be sufficient.
And you'd like to build this all into the compiler, rather than having
to run successive compilation passes, as each one invalidates a new set
of classes.

I think, but am not certain, that this isn't as big a problem as you might
think. Changes can only cascade via inheritance, not use - if i change the
interface of A, and B uses A, then i need to recompile B, to make sure
it's valid, but if it is, i don't think any classes which depend on B need
recompilation.

Some kinds of relationships created by generics might need to be treated in
the same way as inheritance. Not sure. I don't think so.
1. Keeping very detailed dependencies, which minimizes the number of
recompilations, but explodes the size of your dependency database, and
2. Keeping coarser dependencies, which minimizes the size of your
dependency database, but increases the number of recompilations

I'm not worried about the size of the dependency database in the
slightest. I don't think the amount of information stored per class would
be bigger than the class files, and in most cases would be smaller. I'm
happy with that.

If space was a concern, recording references as indexes into a table of
class names, rather than using class names directly, would be easy. As
long as you have fewer than 65,536 classes, each reference will fit in two
bytes. If each class has 100 dependencies (which seems like a lot), then
it's only 200 bytes per class. Going up to feature-level granularity only
adds a few bits - it would probably still work with 16-bit references for
all but massive projects, and would definitely work with 32-bit ones.
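
A rough sketch of that encoding, assuming a simple interned name table
(all names hypothetical):

import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DependencyTable {
    private final List<String> names = new ArrayList<String>();
    private final Map<String, Integer> indexOf = new HashMap<String, Integer>();

    // Intern a class name, returning its index into the name table.
    int intern(String className) {
        Integer i = indexOf.get(className);
        if (i == null) {
            i = Integer.valueOf(names.size());
            names.add(className);
            indexOf.put(className, i);
        }
        return i.intValue();
    }

    // Write one class's dependencies as 16-bit indexes: two bytes per
    // reference, good for up to 65,536 distinct class names.
    void writeDependencies(DataOutputStream out, List<String> deps) throws IOException {
        out.writeShort(deps.size());
        for (String dep : deps) {
            out.writeShort(intern(dep));
        }
    }
}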
Doing comparisons between file dates is a lot simpler.

Yes. And doesn't solve the same problem.

tom
 

Tom Anderson

Arne said:
To me it only makes sense to either get it from the other class or use it
as a literal (as it does). Using it as a literal but having a ref to the
other class seems odd to me.

I'd rather have the reference than not, even if it was used as a literal.
I wonder why they chose not to get it from the other class. Can the
performance gain from not doing it really justify this?

I think it's to do with the rules about class initialisation. I think they
make certain guarantees about the initialisation of constant expressions -
the order, or not needing to load other classes to do them, or something.
The definition of constant expressions is something like expressions of
string or primitive type that are literals, references to other constant
expressions, and compound expressions built from other constant
expressions. Basically, things which can be resolved at compile-time in a
fairly simple way. The rules effectively say that they have to be
resolved, and stored in their resolved form.

For instance, compile these:

public class Foo {
    public static final String ALPHA = "alpha";
    public static final String BETA = "beta";
}

public class Bar {
    public static final String ALPHABET = Foo.ALPHA + Foo.BETA;
}

And run strings over Bar.class. It contains the string "alphabeta".
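
Conversely, an initializer that is not a constant expression is not
inlined, and referencing classes fetch the value at run time. A minimal
sketch (Foo2 hypothetical):

public class Foo2 {
    // The method call makes this a non-constant expression, so ALPHA is
    // no longer a compile-time constant: classes using Foo2.ALPHA emit a
    // getstatic and pick up new values when Foo2 alone is recompiled.
    public static final String ALPHA = "alpha".intern();
}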

I think this was a fairly bad decision by the designers of java, given the
headache it causes today. I'd be interested to know their reason for it.

tom
 

Tom Anderson

Arved said:
As the current official builder (all changes to the Subversion
production branch, EAR construction etc) for a reasonably important J2EE
app (let's just say that if I don't include what I should include, all
motor vehicle administrative activities grind to a halt in a Canadian
province until we deploy the previous working EAR) I am pretty fanatic
about sanitizing everything prior to an official build.

After bad experiences with the reliability of IDEs with regard to builds
(unless one is very careful) I no longer use them in any stage of an
official build. All the SVN ops are command line (and a fresh checkout
is involved at some stage), and I use an Ant build file for the EAR
construction. No clean is required since I do a fresh checkout.

This sounds like excellent policy.
After the classes are built I run JUnit tests, and after the EAR is
built I deploy it locally using the server's admin console for some
sanity tests through the application web interface.

By hand? No automated testing of the complete app through its web
interface?

Also, presumably, before you deploy to production, you deploy to a staging
or pre-production environment which replicates the production environment,
and where you do some more thorough (manual or automatic) testing, right?
Or rather, you have someone else - a QA or client team - do it?

So getting things wrong in the build is not actually posing a threat to
the production site. I'm certainly not saying that you shouldn't bother
taking precautions to get the build right, but it's not quite the Mad Max
scenario you mention if it goes wrong!
I also no longer use IDEs for some Subversion operations. I've
encountered worrisome behaviour in certain cases. For example, with
Eclipse it's one workspace to a branch (I'll never do SVN switching in
the IDE, and in fact I tend to avoid switching when at all possible), I
won't do merges through an IDE, I tend not to do updates through an
IDE...it pretty much boils down to doing small groups of commits.

Yikes. Has your experience really been that bad? We only use CVS through
Eclipse, and haven't had much trouble - we did hit a snag at one point
about Eclipse being out of sync with what was on disk, so now we always do
a refresh before synchronising. But that's it.
I now always clean everything associated with an EAR build/deploy in
development, because I've seen partial cleans mess stuff up too often.
And in any case even with monster J2EE apps the clean takes very little
time. We have Hudson set up to do pretty much all of the above - fresh
checkout, compile, JUnit test suite, build - after each developer
commit...this is more for convenience than anything else.

That also sounds like an excellent idea. We have a nightly build (and
actually, we don't at the moment, because the thing we're working on takes
a lot of manual intervention to build, and we haven't invested the time in
automating it yet), but i'd love to have a build and test after every
checkin. With a klaxon and flashing light that goes off if it's broken.
Hell yes, a set of traffic lights - green if the current version in CVS is
good, amber if there's just been a checkin and it's currently undergoing
testing, and red if it's broken!

In fact, i fantasise about having a build and test running on a pre-commit
hook, so that if you try to check in code that doesn't build and run, it
gets rejected! This is why i was thinking about an automatic dependency
tracking system, in fact, so you wouldn't need to recompile everything to
do this. Obviously you'd also need a fairly fast set of unit tests - you'd
need to flag anything slow as not to be run on checkin.

tom
 

Arne Vajhøj

Tom said:
Yes, that's true. It would be nice if you could do it with class files
alone, but the constant inlining thing means you can't (although i think
this is the only reason you can't - very annoying!). You'd either need
to parse the source yourself (there are grammars for java for popular
compiler-compilers) or hook into compilation using javax.lang.model
somehow.


That's certainly true. Or rather, you need to record the crucial
information from the old code, even if you don't keep the code itself.


Perhaps not. But the vast majority of code is laid out as a tree rooted
somewhere, just like class files - wouldn't that be adequate? Or do i
misunderstand?

I don't think "the vast majority of" is good enough for a tool.

It would need a -javapath similar to -classpath, and some advanced
rules about what on the classpath matches what on the javapath.
Well, yes. It's the heart of the problem. But i don't think it's a very
complicated database, and i don't think all this adds up to being
astonishingly difficult.

"astonishingly difficult" is a flexible concept.

It is doable. It will involve some work to develop. And it will
require some environment setup where it is to be used. It is more
difficult than the C tools that started this thread. It has AFAIK
never been done.

Arne
 

Arne Vajhøj

Mike said:
It's documentation: "I, class A, use the constant pkg.Colors.BLUE".
IMHO, putting more information into something like a class file is
always better (within reason) than putting in less. If you put it in,
at worst you have a bit of wasted space. If you leave it out, there's
information it's impossible to get (as in this example, where its
absence makes it impossible to build a classfile-based dependency
engine.)

Considering how late annotations came into the game, it seems as
if "intelligent processing" of classes was not part of the
original Java design.
I'd think the gain must be in not loading the class that defines the
constants.

Could be. But how many constants does one load without using
something else from the class? It does not seem to be that big
an overhead to me!
Surely the JIT could optimize the access sufficiently
well.

That was my thought.

Arne
 

Arne Vajhøj

Tom said:
I'd rather have the reference than not, even if it was used as a literal.


I think it's to do with the rules about class initialisation. I think
they make certain guarantees about the initialisation of constant
expressions - the order, or not needing to load other classes to do
them, or something. The definition of constant expressions is something
like expressions of string or primitive type that are literals,
references to other constant expressions, and compound expressions built
from other constant expressions. Basically, things which can be resolved
at compile-time in a fairly simple way. The rules effectively say that
they have to be resolved, and stored in their resolved form.

For instance, compile these:

public class Foo {
    public static final String ALPHA = "alpha";
    public static final String BETA = "beta";
}

public class Bar {
    public static final String ALPHABET = Foo.ALPHA + Foo.BETA;
}

And run strings over Bar.class. It contains the string "alphabeta".

But what if Bar had to do a String concat as part of initialization?
I don't think it would make a big performance difference for the app.
I think this was a fairly bad decision by the designers of java, given
the headache it causes today. I'd be interested to know their reason for
it.

Too bad Lew is not related to Joshua - if he had been then he could
have invited Joshua over for a snack and asked him about all the
inside info.

Arne
 

Lew

Arne said:
Too bad Lew is not related to Joshua - if he had been then he could
have invited Joshua over for a snack and asked him about all the
inside info.

Actually, Joshua Bloch is rather outspoken in his criticism of certain aspects
of Java, as well as in his defense of others. I don't recall him addressing
the matter of constants, though.
 

Mike Schilling

Arne said:
I don't think "the vast majority of" is good enough for a tool.

Is it a problem, though? Put the full paths of all referenced source
files in the dependency database. If anything's different on the next
run, you need to recompile.
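
A minimal sketch of that check, assuming the record is just the set of
absolute paths (names hypothetical):

import java.io.File;
import java.util.HashSet;
import java.util.Set;

class SourceSetCheck {
    // Collect the full paths of all .java files under a source root.
    static Set<String> scan(File dir, Set<String> paths) {
        File[] entries = dir.listFiles();
        if (entries == null) return paths;
        for (File f : entries) {
            if (f.isDirectory()) {
                scan(f, paths);
            } else if (f.getName().endsWith(".java")) {
                paths.add(f.getAbsolutePath());
            }
        }
        return paths;
    }

    // If the recorded set differs from the current scan, recompile everything.
    static boolean needsFullRecompile(Set<String> recorded, File srcRoot) {
        return !recorded.equals(scan(srcRoot, new HashSet<String>()));
    }
}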
 

Joshua Cranmer

Arne said:
Too bad Lew is not related to Joshua - if he had been then he could
have invited Joshua over for a snack and asked him about all the
inside info.

Took me a few seconds to realize that you were referring to Joshua
Bloch, not me. Although those few seconds of wondering how I got dragged
into this thread were amusing :) .
 

Lew

Mike said:
Agree or disagree, he'd know the arguments on both sides and which
prevailed. (And you might learn even more if you invited him over for
a few beers instead of a snack.)

I am willing, and offer here publicly, to invite Mr. Bloch for a few beers and
a chat about Java or any other topic that interests him. If he is ever in the
DC or Maryland area and is willing, the first six-pack is on me.

My current favorite beer is "Loose Cannon" from Clipper City Brewing Co. It's
got triple the hops of even Sam Adams brews.

Pun intended.
 

Arved Sandstrom

Tom Anderson said:
This sounds like excellent policy.


By hand? No automated testing of the complete app through its web
interface?

Also, presumably, before you deploy to production, you deploy to a staging
or pre-production environment which replicates the production environment,
and where you do some more thorough (manual or automatic) testing, right?
Or rather, you have someone else - a QA or client team - do it?

We have a team of dedicated testers (they've come up through the ranks on
the business side and know exactly what should happen), and quite a few
others that do it part-time as required (again, business-side people who
know the business rules to a T). By the time a production build is to be
produced, a test build which is identical except for some configuration has
been deployed in a similar environment, and tested for weeks (or months).

The idea of using automated tests for the web GUI has been bruited, but in
our environment it would be unrealistic.

It might be more accurate to say that I am quite concerned to make sure that
the test builds are correct, because a lot of time can be wasted if a tester
reports that an error is still there: is the error still there because the
build is flawed, or because the developer who "fixed" it only fixed it for a
different use case or in his own development environment?
So getting things wrong in the build is not actually posing a threat to
the production site. I'm certainly not saying that you shouldn't bother
taking precautions to get the build right, but it's not quite the Mad Max
scenario you mention if it goes wrong!

That's correct - if things were obviously wrong the first business day after
a new deployment (it just so happens that tomorrow AM is one such), then
we'd quickly deploy the previous EAR and figure out what went wrong.
Yikes. Has your experience really been that bad? We only use CVS through
Eclipse, and haven't had much trouble - we did hit a snag at one point
about Eclipse being out of sync with what was on disk, so now we always do
a refresh before synchronising. But that's it.

It's been bad enough. In Eclipse I avoid the team synchronization stuff as
much as possible. I find it much easier to do my svn status and svn update
etc on the CL, and refresh in Eclipse.

In order to keep my SVN commits as close as possible to a logical changeset,
I find it easier to "svn status" into a file, edit it, and then use
the --targets option. It is of course doable in Eclipse, but when you're
wanting to select 15 or 20 different files in a dozen different spots in 3 or
4 separate projects, it gets a bit busy in the IDE.
That also sounds like an excellent idea. We have a nightly build (and
actually, we don't at the moment, because the thing we're working on takes
a lot of manual intervention to build, and we haven't invested the time in
automating it yet), but i'd love to have a build and test after every
checkin. With a klaxon and flashing light that goes off if it's broken.
Hell yes, a set of traffic lights - green if the current version in CVS is
good, amber if there's just been a checkin and it's currently undergoing
testing, and red if it's broken!

We pretty much have the klaxon and flashing lights. :) If a build breaks
then Hudson emails every developer, and the email tells you what the base
event was (whose commit), and what broke (if JUnit tests then you follow the
emailed link and drill down). Hudson itself is quite easy to configure, and
jobs (say one for trunk, one for each feature branch etc) are also very easy
to configure. What's nice too is that the Ant build.xml (if that's what you
choose to use) that you point Hudson at is exactly the same thing you'd use in
your dev environment, or test or production.
In fact, i fantasise about having a build and test running on a pre-commit
hook, so that if you try to check in code that doesn't build and run, it
gets rejected! This is why i was thinking about an automatic dependency
tracking system, in fact, so you wouldn't need to recompile everything to
do this. Obviously you'd also need a fairly fast set of unit tests - you'd
need to flag anything slow as not to be run on checkin.

tom

This is one thing we don't enforce. It has not been an issue to date.

With the exception of slow tests (slow by their nature), I haven't seen that
large test suites take so long to run that a developer couldn't execute them
on each commit. On the specific app I refer to we have probably close to
3000 JUnit tests, and it's on the order of a minute on my local box to run
them all. They're typically not trivial tests either.

Speaking of tests and Hudson, one handy other thing to hook in is test
coverage, like Emma. It just gets added to the script that Hudson is
provided...you end up with nice graphs of coverage at various levels. IMO
this is indispensable (even in a TDD environment) for staying on top of
whether your tests are sufficiently blanketing the codebase.

AHS
 

Martin Gregorie

Mike said:
Is it a problem, though? Put the full paths of all referenced source
files in the dependency database. If anything's different on the next
run, you need to recompile.

I think there's an even easier way which could easily be built into Ant.

Instead of merely compiling every class which is older than the
corresponding source file, set a target timestamp to match the most
recent class file and then recompile all classes whose sources have been
amended more recently than the target. That will sometimes do more than
the minimum amount of work, but I don't think it will ever fail to
recompile everything that needs to be recompiled.

Since Ant already has to look at the age of every class and source file,
this would add a minimal overhead to the <javac> step.
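
A rough sketch of that rule (directory layout hypothetical):

import java.io.File;
import java.util.ArrayList;
import java.util.List;

class TargetTimestamp {
    // The target timestamp: modification time of the newest .class file.
    static long newestClass(File dir) {
        long newest = 0L;
        File[] entries = dir.listFiles();
        if (entries == null) return newest;
        for (File f : entries) {
            if (f.isDirectory()) {
                newest = Math.max(newest, newestClass(f));
            } else if (f.getName().endsWith(".class")) {
                newest = Math.max(newest, f.lastModified());
            }
        }
        return newest;
    }

    // Recompile every source amended more recently than the target:
    // sometimes more than the minimum, but never less.
    static List<File> sourcesToRecompile(File srcDir, long target) {
        List<File> stale = new ArrayList<File>();
        File[] entries = srcDir.listFiles();
        if (entries == null) return stale;
        for (File f : entries) {
            if (f.isDirectory()) {
                stale.addAll(sourcesToRecompile(f, target));
            } else if (f.getName().endsWith(".java") && f.lastModified() > target) {
                stale.add(f);
            }
        }
        return stale;
    }
}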
 

Mike Schilling

Martin said:
I think there's an even easier way which could easily be built into
ant.

Instead of merely compiling every class which is older than the
corresponding source file, set a target timestamp to match the most
recent class file and then recompile all classes whose sources have
been amended more recently than the target. That will sometimes do
more than the minimum amount of work, but I don't think it will ever
fail to recompile everything that needs to be recompiled.

Classes can depend on jars as well as classes. In the sort of large
system where simply doing a clean build is unacceptable, this will
usually be the case, since multiple subsystems are being built, each
producing one or more jars. If you extend this to checking the date
on each jar file in the classpath, the result is that any change to a
subsystem will cause all subsystems that depend on it to be completely
rebuilt.
 
