Chris said:
I seem to recall, in years past, that the interface approach was slower.
Is this still the case?
As others have said, it depends, and (if it matters that much to you) you'll
have to measure it.
I got interested enough to try a few experiments (these were all with jdk 1.6,
but 1.5 gives very similar results).
For a start, consider a method like:
public void method() { ++datum; }
where "datum" is an instance field. Just to get a ballpark figure, calling
that using the client JVM takes about 12 nanoseconds if it is an override of
an abstract method, vs about 15 if the same code is the implementation of an
interface method. That is /not/ the bottom line: there are many
highly important effects to consider, but even from that approximate data we
can see that the difference between the two kinds of method invocation is
/very/ small. If you are going to invoke the "same" method(s) a billion or two
times then using abstract methods will save you a handful of seconds overall.
More importantly, the time is so small that it's difficult (but not
impossible) to imagine a case where the body of the method -- even a very
trivial method, as above -- wouldn't dominate the execution time. At this very
fine-grained level, the number of parameters will also affect the invocation
overhead -- in this case the method has one parameter (this), in other cases it
might have more or fewer than that.
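To put rough numbers on the "handful of seconds", here is a minimal sketch that just does the arithmetic with the approximate client-JVM figures above (12 vs 15 nanoseconds per call). CallSavings and secondsSaved are hypothetical names used only for this illustration; it works out to about 3 seconds per billion calls.

```java
// Back-of-envelope estimate only; the per-call figures are the
// approximate client-JVM measurements quoted above, not new data.
class CallSavings
{
    // Seconds saved over `calls` invocations when each call costs
    // fastNanos instead of slowNanos.
    static double secondsSaved(long calls, double fastNanos, double slowNanos)
    {
        return calls * (slowNanos - fastNanos) / 1.0e9;
    }

    public static void main(String[] args)
    {
        long billion = 1000L * 1000L * 1000L;
        System.out.println("1 billion calls: "
            + secondsSaved(billion, 12.0, 15.0) + " seconds saved");
        System.out.println("2 billion calls: "
            + secondsSaved(2 * billion, 12.0, 15.0) + " seconds saved");
    }
}
```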
However, if you are /that/ bothered by performance, you are not going to be
using the client JVM in the first place. Switching to the server JVM (in this
specific test) reduces both times, to about 9 vs 12 nanoseconds -- so the
absolute difference is about the same (roughly 3 nanoseconds).
Another thing to note is the degree of actual polymorphism at the call site.
In my test code (given in full at the end of this post), the naively expected
polymorphism is 3 (there are three different -- albeit identical --
implementations of the abstract and interface methods). What's more, each of
these possibilities occurs equally often. Considerations of polymorphism
affect how the JITed code works. A typical implementation technique (and one
used by Sun's VMs) is to attempt to compile a virtual or interface method call
into a sequence of test-and-branch instructions (thus avoiding a slow vtable
indirection). So the number of possible candidate classes which have to be
tested for at each invocation will affect both the strategy for generating the
code, and its actual execution profile. So, what happens if a given call site
is not, in fact, heavily polymorphic? To try that out I commented out the
last two lines of the time() function (below), so that the calls were actually
monomorphic. With the client JVM that completely eliminated the difference
between the two kinds of method invocation (about 12 nanos in either case).
However, with the server JVM, once it had realised that there was no real
polymorphism at the call site (in either case) it was able to inline the method
calls -- but that further permitted it to optimise the loops, and in fact
allowed it to remove such a large chunk of the code (not all) that the test was
no longer useful.
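The test-and-branch idea can be illustrated at the source level. This is only a hand-written analogue, assuming hypothetical names (Counter, CounterA/B/C, GuardedDispatch); a real JIT does the equivalent in machine code by comparing the receiver's class, and I make no claim that this matches what any particular JVM emits.

```java
interface Counter { void bump(); }

final class CounterA implements Counter { long datum; public void bump() { ++datum; } }
final class CounterB implements Counter { long datum; public void bump() { ++datum; } }
final class CounterC implements Counter { long datum; public void bump() { ++datum; } }

class GuardedDispatch
{
    // Hand-written analogue of a test-and-branch call site: test the
    // receiver's class, run an "inlined" copy of the body for the
    // expected classes, and fall back to ordinary dispatch otherwise.
    static void bumpGuarded(Counter it)
    {
        if (it.getClass() == CounterA.class)
        {
            CounterA a = (CounterA) it;
            ++a.datum;          // inlined body of CounterA.bump()
        }
        else if (it.getClass() == CounterB.class)
        {
            CounterB b = (CounterB) it;
            ++b.datum;          // inlined body of CounterB.bump()
        }
        else
        {
            it.bump();          // fallback: normal interface dispatch
        }
    }

    public static void main(String[] args)
    {
        CounterA a = new CounterA();
        CounterC c = new CounterC();
        bumpGuarded(a);         // takes the first guarded branch
        bumpGuarded(c);         // takes the fallback path
        System.out.println(a.datum + " " + c.datum);
    }
}
```

The more candidate classes there are, the longer the chain of tests, which is why the degree of polymorphism at the call site matters.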
But that kind of optimisation makes considering call sites difficult. In the
case of my test code there are two call sites (in timeInterfaceCall() and
timeAbstractCall()) which /may/ be polymorphic. But there is no guarantee that
the optimisation will leave those alone. If the optimiser had attempted to
"split" either of those methods (effectively pushing the test-and-branch
outside the timing loop) then it would have been able to inline both cases
completely, even in the polymorphic case (there would be three largely
identical copies of each of the two timing methods). I don't know whether
Sun's current generation of JVMs is capable of splitting, but the technique is
moderately well-known, and is implemented in the Animorphic SELF VM on which a
lot of the runtime optimisation in Hotspot is based.
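Done by hand at the source level, splitting looks roughly like the sketch below: the type test is hoisted out of the loop, and each branch gets its own copy of the loop with the call inlined. The names (Base, Impl1, Impl2, SplitDemo) are hypothetical, and this only illustrates the transformation; it says nothing about whether a given JVM performs it.

```java
abstract class Base { abstract void call(); }

final class Impl1 extends Base { long datum; void call() { ++datum; } }
final class Impl2 extends Base { long datum; void call() { ++datum; } }

class SplitDemo
{
    // Unsplit: the receiver's type is (conceptually) tested on every iteration.
    static void run(Base it, int loops)
    {
        for (int i = 0; i < loops; i++)
            it.call();
    }

    // Split by hand: test once, outside the loop, then run a copy of
    // the loop specialised for that class with the body inlined.
    static void runSplit(Base it, int loops)
    {
        if (it.getClass() == Impl1.class)
        {
            Impl1 one = (Impl1) it;
            for (int i = 0; i < loops; i++)
                ++one.datum;
        }
        else if (it.getClass() == Impl2.class)
        {
            Impl2 two = (Impl2) it;
            for (int i = 0; i < loops; i++)
                ++two.datum;
        }
        else
        {
            run(it, loops);     // unknown class: keep the generic loop
        }
    }

    public static void main(String[] args)
    {
        Impl1 one = new Impl1();
        Impl2 two = new Impl2();
        run(one, 1000);
        runSplit(one, 1000);
        runSplit(two, 1000);
        System.out.println(one.datum + " " + two.datum);
    }
}
```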
Another point to consider (and it's a flaw in my test) is that conditional
branches directly affect execution time on processors with pipelining and
branch prediction. In my test the CPU's branch prediction would have been almost
perfect, but that might not be the case in a real application.
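One way to probe that would be to make the sequence of receivers unpredictable, picking the receiver for each call pseudo-randomly. A minimal sketch, assuming hypothetical names (Op, Op1 to Op3, MixedReceivers); note that it adds an array load per call, so its figures are not directly comparable with those from the test code at the end of this post.

```java
import java.util.Random;

interface Op { void call(); }

class Op1 implements Op { long datum; public void call() { ++datum; } }
class Op2 implements Op { long datum; public void call() { ++datum; } }
class Op3 implements Op { long datum; public void call() { ++datum; } }

class MixedReceivers
{
    // Build a sequence of receivers chosen pseudo-randomly from `choices`,
    // so the class seen at the call site does not follow a simple pattern.
    static Op[] randomOrder(Op[] choices, int length, long seed)
    {
        Random random = new Random(seed);
        Op[] order = new Op[length];
        for (int i = 0; i < length; i++)
            order[i] = choices[random.nextInt(choices.length)];
        return order;
    }

    // Time `passes` sweeps over the sequence; returns nanoseconds per call
    // (including the cost of the array load and loop overhead).
    static double nanosPerCall(Op[] order, int passes)
    {
        long start = System.nanoTime();
        for (int p = 0; p < passes; p++)
            for (int i = 0; i < order.length; i++)
                order[i].call();
        long end = System.nanoTime();
        return (end - start) / ((double) passes * order.length);
    }

    public static void main(String[] args)
    {
        Op[] choices = { new Op1(), new Op2(), new Op3() };
        Op[] order = randomOrder(choices, 100000, 42L);
        // repeat a few times to give the JIT a chance to settle
        for (int round = 0; round < 10; round++)
            System.out.println(nanosPerCall(order, 100) + " nanosec/call");
    }
}
```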
Anyway, I think the bottom line is more like this: the overhead is -- at worst --
very, very small (but measurable); it is also subject to very active (and
often successful) attempts by the JIT to optimise it out.
Sample code follows. Note that it runs in an infinite loop. It takes the
JITer a while to complete its optimisations -- on my machine the JIT generated
several versions of the timing loops as it put more and more work into
optimising them, and it seemed to have settled down after about 20 seconds.
-- chris
======== code ========
interface Interface
{
    void interfaceCall();
}

abstract class Abstract
{
    abstract public void abstractCall();
}

class Concrete1
extends Abstract
implements Interface
{
    public long datum;
    public void interfaceCall() { ++datum; }
    public void abstractCall() { ++datum; }
}

class Concrete2
extends Abstract
implements Interface
{
    public long datum;
    public void interfaceCall() { ++datum; }
    public void abstractCall() { ++datum; }
}

class Concrete3
extends Abstract
implements Interface
{
    public long datum;
    public void interfaceCall() { ++datum; }
    public void abstractCall() { ++datum; }
}

public class Test
{
    private static final int LOOPS_PER = 50 * 1000 * 1000;

    // leave this running for a while (20 seconds minimum) to allow the
    // JITer time to finish optimising
    public static void
    main(String[] args)
    {
        Concrete1 c1 = new Concrete1();
        Concrete2 c2 = new Concrete2();
        Concrete3 c3 = new Concrete3();
        // the condition is always true; reading the datum fields just
        // keeps the results of the calls "live"
        while (c1.datum + c2.datum + c3.datum >= 0)
            time(c1, c2, c3);
    }

    private static void
    time(Concrete1 c1, Concrete2 c2, Concrete3 c3)
    {
        // comment out some of these lines to reduce/eliminate
        // polymorphism at the call sites in timeAbstractCall()
        // and timeInterfaceCall()
        timeAbstractCall(c1); timeInterfaceCall(c1);
        timeAbstractCall(c2); timeInterfaceCall(c2);
        timeAbstractCall(c3); timeInterfaceCall(c3);
    }

    private static void
    timeAbstractCall(Abstract it)
    {
        long start = System.currentTimeMillis();
        for (int i = 0; i < LOOPS_PER; i++)
            it.abstractCall();
        long end = System.currentTimeMillis();
        report(end-start, " abstract", LOOPS_PER);
    }

    private static void
    timeInterfaceCall(Interface it)
    {
        long start = System.currentTimeMillis();
        for (int i = 0; i < LOOPS_PER; i++)
            it.interfaceCall();
        long end = System.currentTimeMillis();
        report(end-start, "interface", LOOPS_PER);
    }

    private static void
    report(long millis, String type, int calls)
    {
        // 1 millisecond = 1.0e6 nanoseconds
        double nanosPerCall = millis * 1.0e6 / calls;
        System.out.println(type + ": " + nanosPerCall + " nanosec/call");
    }
}
=====================