Java (bytecode) execution speed


Lee

All other things being equal, we expect an interpreted language to run a
bit slower than native machine code.

I understand that in the beginning, the earliest version of the JVM was
perceived to be demonstrably slower than a corresponding C program; but
that more recent versions have improved considerably.

What's the current state of the art? Would we expect a Java program to
run at 0.5 * the speed of C, or 0.7 or 0.9 or what?

Obviously it all depends on what you're doing and how you're doing it,
but still, there must be some rough rule of thumb as to what's a
reasonable expectation for how my Java program should run compared to
the equivalent C routine.

I'm asking programmers, rather than the "announcement/advocacy" groups
in the hopes you will have a more realistic idea of how things actually
work under the sphere of the moon.
 

Chris Smith

Lee said:
All other things being equal, we expect an interpreted language to run a
bit slower than native machine code.

Okay, but there are no interpreted languages discussed in the rest of
your post. All common implementations of Java use a JIT compiler, which
should lead to no foregone conclusions about whether it will be
faster or slower than a compiled language. There are reasons it's
likely to be slower (the compiler must run quickly, so it won't do a lot
of global optimization), and reasons that today's very sophisticated JIT
compilers for Java are likely to run faster (because the compiler runs
at runtime, it can take advantage of statistical information about how
users are using the application right now, and optimize the observed
hot paths as the fast cases).
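
To make that concrete, here is a small sketch of the kind of situation a
profile-guided JIT can exploit. The class names and numbers are entirely
hypothetical, my own illustration rather than anything from a particular
VM's documentation:

// Hypothetical illustration: a call site that is virtual in the source but
// monomorphic at run time. A profiling JIT such as HotSpot can observe that
// only Circle ever reaches s.area() here, devirtualize the call, and inline
// the multiplication, which an ahead-of-time C compiler could only match
// with whole-program knowledge.
interface Shape {
    double area();
}

final class Circle implements Shape {
    private final double r;
    Circle(double r) { this.r = r; }
    public double area() { return Math.PI * r * r; }
}

public class HotLoop {
    static double total(Shape[] shapes) {
        double sum = 0.0;
        for (Shape s : shapes) {
            sum += s.area();    // hot, monomorphic call site
        }
        return sum;
    }

    public static void main(String[] args) {
        Shape[] shapes = new Shape[1000000];
        for (int i = 0; i < shapes.length; i++) {
            shapes[i] = new Circle((i % 10) + 1);
        }
        System.out.println(total(shapes));
    }
}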

If you are looking into performance numbers for Java, you are making a
mistake by focusing on the bytecode execution model. In the end, it's
about a wash -- no loss or gain there. The really important stuff is in
garbage collection and management of the heap. Java's language
definition is such that there are often far more heap allocations than in a
typical C program, and they are far shorter lived. Different heap
management techniques are used to adapt to that situation. This affects
the performance of Java applications quite a lot.
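
As a rough, made-up illustration of that allocation pattern (the class name
and loop are mine, purely for demonstration):

// Illustrative sketch only: idiomatic Java tends to create many short-lived
// heap objects. Each pass below allocates a fresh String (plus a temporary
// StringBuilder behind the "+"), where a C version would likely reuse one
// stack buffer with sprintf(). Generational collectors are built so that
// objects like these die cheaply in the nursery.
public class AllocationDensity {
    public static void main(String[] args) {
        long checksum = 0;
        for (int i = 0; i < 1000000; i++) {
            String label = "item-" + i;   // new short-lived object every pass
            checksum += label.length();
        }
        System.out.println(checksum);
    }
}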

On average, the best heap allocation management (including garbage
collection) algorithms out there perform slightly worse than a C-style
explicit heap with free lists. Of course, there are applications where
it performs much better because it reduces the amount of computation
that goes into book-keeping; and there are applications where it
performs much worse if the lifecycles of objects in the application are
unusual. (Some poorly performing cases, should you be interested, are
when a large number of objects live just ever so slightly long enough to
make it out of the nursery, or when there are lots of long-lived objects
that contain frequently modified references to very short-lived
objects.)
What's the current state of the art? Would we expect a Java program to
run at 0.5 * the speed of C, or 0.7 or 0.9 or what?

I'd say anywhere from 0.7 to 1.1 times the speed of C would be a
reasonable guess, depending mainly on the nature of object lifetimes.
But it's just that: a guess.
 

Stefan Ram

Chris Smith said:
If you are looking into performance numbers for Java, you are
making a mistake by focusing on the bytecode execution model.
In the end, it's about a wash -- no loss or gain there. The
really important stuff is in garbage collection and management
of the heap. Java's language definition is such that there are
often far more heap allocations than a typical C program, and
they are far shorter lived.

Two quotations regarding garbage collection:

»Your essay made me remember an interesting phenomenon I
saw in one system I worked on. There were two versions of
it, one in Lisp and one in C++. The display subsystem of
the Lisp version was faster. There were various reasons,
but an important one was GC: the C++ code copied a lot of
buffers because they got passed around in fairly complex
ways, so it could be quite difficult to know when one
could be deallocated. To avoid that problem, the C++
programmers just copied. The Lisp was GCed, so the Lisp
programmers never had to worry about it; they just passed
the buffers around, which reduced both memory use and CPU
cycles spent copying.«

<[email protected]>

»A lot of us thought in the 1990s that the big battle
would be between procedural and object oriented
programming, and we thought that object oriented
programming would provide a big boost in programmer
productivity. I thought that, too. Some people still think
that. It turns out we were wrong. Object oriented
programming is handy dandy, but it's not really the
productivity booster that was promised. The real
significant productivity advance we've had in programming
has been from languages which manage memory for you
automatically. It can be with reference counting or
garbage collection; it can be Java, Haskell, Visual Basic
(even 1.0), Smalltalk, or any of a number of scripting
languages. If your programming language allows you to grab
a chunk of memory without thinking about how it's going to
be released when you're done with it, you're using a
managed-memory language, and you are going to be much more
efficient than someone using a language in which you have
to explicitly manage memory. Whenever you hear someone
bragging about how productive their language is, they're
probably getting most of that productivity from the
automated memory management, even if they misattribute it.«

http://www.joelonsoftware.com/articles/APIWar.html
 

Arne Vajhøj

Lee said:
All other things being equal, we expect an interpreted language to run a
bit slower than native machine code.

Since Java is not interpreted but JIT compiled, that point
is not so relevant.
I understand that in the beginning, the earliest version of the JVM was
perceived to be demonstrably slower than a corresponding C program; but
that more recent versions have improved considerably.

What's the current state of the art? Would we expect a Java program to
run at 0.5 * the speed of C, or 0.7 or 0.9 or what?

My expectation would be within the range 0.75-1.25!

The variation between different C compilers with different
settings, different JVMs with different switches, and
different tasks is so big that the language difference is
insignificant.

Arne
 

Lee

Lee said:
All other things being equal, we expect an interpreted language to run a
bit slower than native machine code.
<SNIP>

At least two people were kind enough to point out that Java uses a JIT
compilation system and that in any case the difference in execution time
between a Java program and a hypothetical compiled version of the same
algorithm (or as similar as the two languages allow) would probably be
due more to the differences in heap management and/or garbage collection
than in raw compilation speed.

In that context it becomes plausible that in some circumstances Java
might actually run faster than an equivalent C/C++ implementation.

I must be missing an important nuance about Java and JIT.

Perhaps someone can "debug" me on this:

I had thought that bytecode was, so to speak, the "machine code" of the
Java Virtual Machine. If that were true, I can't see how there would be
any room (or any need) for further compilation of the byte code. The
byte code itself would "drive" the VM, taking the VM from internal state
to internal state until the computation was done.

But if the bytecode were just a portable abstraction, something "above"
the JVM's machine language but "below" the Java source language, that
would create the need to compile the byte code "the rest of the way
down" to the actual jvm machine language, but NOT to native hardware
machine language.

So even in that case, the "compilation" would be down to the VM's
machine language, not the actual hardware's machine language.

But all the descriptions I see on the net about JIT talk in terms of
compilation to native machine language. I can see how that would work
with something like, say, the Pascal P-system, where Pascal source would
be compiled into "p-code", and then the p-code would either be
interpreted or "just-in-time" compiled to native hardware machine language.

My problem is that in my conception, when it is a question of running a
virtual machine, the "compilation" would be to that vm's "machine"
language and that's as "low" as you could go.

What have I got wrong?
 

JT

Lee said:
I had thought that bytecode was, so to speak,
the "machine code" of the Java Virtual Machine.

It is.
If that were true, I can't see how there would be any room

The Java Virtual Machine Specification is a precise document
on the meaning of bytecodes. So JIT compilers simply
attempt to produce native binaries that have the same behavior
as the bytecode.

I don't see how the JVMS prevents that.
(or any need) for further compilation of the byte code.

The need is speed. Interpretation is much slower
than native execution.
But if the bytecode were just a portable abstraction, something "above"
the JVM's machine language but "below" the java source language

No. The byte code is the native machine code of the JVM.
My problem is that in my conception, when it is a question of running a
virtual machine, the "compilation" would be to that vm's "machine"
language

The JIT does not compile to the VM's machine language.
In fact, the JIT always compiles to the CPU's machine language.
and thats as "low" as you could go.

Why? The JVM itself has full access to the Operating System
that the JVM is running on. Whenever you have native methods
(e.g. some of the GUI methods and the I/O methods...), the JVM
will have to invoke the corresponding services from the Operating
System.

So, a JVM could invoke a JIT to translate frequently-executed code
into a suitable binary format that the OS can execute.

I see no problem there. (And as you noted, there are many
powerful Java JITs out there.)

- JT
 

Kai Schwebke

Lee said:
My problem is that in my conception, when it is a question of running a
virtual machine, the "compilation" would be to that vm's "machine"
language and that's as "low" as you could go.

What have I got wrong?

In the end the code does not run on the VM, but on the real machine.
A runtime with "just in time compilation" compiles the virtual machine
code to real, machine-dependent code, much like a compiler would do.


Kai
 

Christian

Kai said:
In the end the code does not run on the VM, but on the real machine.
A runtime with "just in time compilation" compiles the virtual machine
code to real, machine-dependent code, much like a compiler would do.


Kai
Is there any interpretation going on today in the JVM, or is everything
simply compiled to machine code just in time before execution?
 

JT

Christian said:
Is there any interpretation going on today in the JVM, or is everything
simply compiled to machine code just in time before execution?

Depends on the JIT. The default JIT from Sun ("HotSpot")
will initially interpret the code. As the interpreter runs, HotSpot
then analyzes the runtime behavior and tries to identify which methods
should be compiled to native code.

See this section on Sun.com:
http://java.sun.com/products/hotspot/whitepaper.html#hotspot

- JT
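
If you want to watch that happen, here is a small test program (the program
is my own illustration; the switches are standard HotSpot flags) that you can
run with -XX:+PrintCompilation to see methods being compiled as they become
hot, or with -Xint / -Xcomp to force pure interpretation or up-front
compilation and compare the timings:

// Run as:  java -XX:+PrintCompilation HotSpotDemo   (watch methods get JITed)
//          java -Xint HotSpotDemo                   (interpret everything)
//          java -Xcomp HotSpotDemo                  (compile everything up front)
// These flags are HotSpot-specific; other JVMs have their own switches.
public class HotSpotDemo {
    static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += (long) i * i;
        }
        return sum;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long result = 0;
        for (int round = 0; round < 100; round++) {
            result += sumOfSquares(1000000);   // hot method, a JIT candidate
        }
        long millis = (System.nanoTime() - start) / 1000000;
        System.out.println(result + " in " + millis + " ms");
    }
}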
 

Chris Smith

Lee said:
Perhaps someone can "debug" me on this:

I had thought that bytecode was, so to speak, the "machine code" of the
Java Virtual Machine. If that were true, I can't see how there would be
any room (or any need) for further compilation of the byte code. The
byte code itself would "drive" the VM, taking the VM from internal state
to internal state until the computation was done.

Bytecode is just a file format to represent the actions of a piece of
Java code. Nothing more, and nothing less. It can be used directly, or
it can be converted to a different format.

Early implementations of Java interpreted it; that is, they used it
directly. Actually, implementations of Java on cellular phones and
other embedded devices often still do this because it's more efficient
in terms of memory usage and generally no one does high-performance
computation on a cell phone.

Newer Java implementations (as of 1999 or so) for desktop and server
platforms rarely interpret the bytecode. They translate it into the
native machine language, and let the processor run that native machine
language directly. This is, obviously, much faster.
But if the bytecode were just a portable abstraction, something "above"
the JVM's machine language but "below" the java source language, that
would create the need to compile the byte code "the rest of the way
down" to the actual jvm machine language, but NOT to native hardware
machine language.

There is no JVM machine language. Perhaps what's confusing you is that
there is no such thing as "the" JVM. There is a JVM for x86, another for
x86-64, another for Sparc, and so on... There are different
JVMs for different operating systems as well, though they often share
most of the JIT implementation. Each of these implementations of a JVM
contains its own different JIT compiler that generates code appropriate
for that processor.

So in the end, the JVM for a particular platform does the transformation
to the native machine language for that CPU, and from that point on it
just runs the code and sits back and waits for the code to call it;
after the JIT step, the JVM is essentially just a library
that is called by the application code.
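
To see the bytecode form for yourself before any JIT gets involved, the
javap disassembler that ships with the JDK will print it. For a deliberately
trivial class like this one (my own example; the listing in the comment is
abridged, and the exact output depends on your javac version):

// Compile and disassemble with:  javac Doubler.java   then   javap -c Doubler
// The bytecode for twice() comes out roughly as (abridged):
//
//   0: iload_0      // push the int argument x
//   1: iload_0      // push it again
//   2: iadd         // add the two stack operands
//   3: ireturn      // return the sum
//
// That stack-machine form is what the JIT later translates into x86, Sparc,
// or whatever machine code the real CPU wants.
public class Doubler {
    static int twice(int x) {
        return x + x;
    }

    public static void main(String[] args) {
        System.out.println(twice(21));
    }
}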
 

Chris Smith

Christian said:
Is there any interpretation going on today in the JVM, or is everything
simply compiled to machine code just in time before execution?

Modern JVMs do both; the performance-critical stuff is JIT'ed, but a lot
of one-off initialization code will be interpreted. At one time, JIT
compilers would frequently run before any code was executed; this was
changed because mixed mode (some interpreting, some compiling) reduces
the perceived start-up time of applications.
 

Wojtek

Lee wrote :
<SNIP>

My problem is that in my conception, when it is a question of running a
virtual machine, the "compilation" would be to that vm's "machine" language
and that's as "low" as you could go.

In a typical native environment you have:

source - what the programmer wants done
object code - what the programmer wants done, but in a form the
computer can understand
library - how to do stuff for a particular operating system
executable - what the programmer wants done along with how to do it for
that OS

So the sequence is:
source -> object code (compiler with optimization switches which the
programmer "guesses" will make the code run faster/better)
object code + library -> executable (linker)
** the executable is distributed
the user runs the executable

In Java you have:

source - what the programmer wants done
bytecode - what the programmer wants done, but in a form the Java
Virtual Machine can understand
** the bytecode is distributed

On the client machine, the JVM reads the byte code and, since the JVM is
native to that OS, it knows how to do stuff.

The sequence is:
source -> byte code (compiler)
** the byte code is distributed
the user runs the JVM, pointing it at the byte code (the JVM does
on-the-fly optimizations depending on how THAT user uses the
application)
byte code + JVM -> running program

In a purely interpreted environment (such as Perl or PHP):
source - what the programmer wants done
** the source is distributed
the user runs the Perl (or PHP) interpreter, pointing it at the source code

This makes more sense with pretty diagrams :)

Note: Yes I know you can now get Perl and PHP linkers to produce
executables.
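
To put file names on the Java sequence above (assuming nothing more than a
standard JDK; the C side would be, say, "gcc hello.c -o hello" instead):

// javac Hello.java   ->  Hello.class (portable bytecode; this is what ships)
// java Hello         ->  the user's JVM loads the bytecode, interprets it,
//                        and JIT-compiles the hot parts for THAT user's CPU
//                        and usage pattern
public class Hello {
    public static void main(String[] args) {
        System.out.println("Hello from bytecode, compiled just in time.");
    }
}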
 

Lee

JT said:
It is.

So I'm not completely in cloud cuckoo land. Phew!

The Java Virtual Machine Specification is a precise document
on the meaning of bytecodes. So JIT compilers simply
attempt to produce native binaries that have the same behavior
as the bytecode.

I don't see how the JVMS prevents that.

See below.

No. The byte code is the native machine code of the JVM.

The JIT does not compile to the VM's machine language.
In fact, the JIT always compiles to the CPU's machine language.

Why?

Why I don't understand how you can go "lower" than the VM's machine code:

I'm handicapped by not knowing the design/architecture of the actual
JVM, so even though the best way to explain my difficulty would be to do
so in terms of the actual JVM instructions, I will do the next best
thing, and try to do it in terms of a simpler and entirely mythical
virtual machine.

Let's suppose I have a string-handling virtual machine. It's got a string
store and it has two native operations (among others, but we're just
interested in showing why I think it's not possible to get "below"
the virtual machine's own machine language). Your mission impossible,
should you choose to accept it, is to expose the flaw in how I'm
thinking.

The two operations are "Head", which returns the first (zeroth) character
of the string, and "Tail", which returns the substring consisting of the
string that remains after removing the first (zeroth) character. Gee,
sounds like Lisp car and cdr, but "never mind".

So the "byte" code for "Head" is x01 and the byte code for "Tail" is
x02. The implementation of the string virtual machine on a particular
hardware platform consists of the native hardware machine instructions
that make the internal structures of the VM (implemented of course as
"real" structures built out of real memory and real registers and all that.

I suppose in one sense, you can say that the "compilation" of the byte
code x01 and/or x02 is the set of machine instructions used to implement
that part of the string virtual machine in the real hardware.

Compilation of the byte code would be nothing more or less than
re-implementing a portion of the string virtual machine. So that makes
no sense to me, as presumably you've done it right the first time when
you implemented the string virtual machine for that hardware in the
first place.

Are you saying that the vm is dynamically re-implemented at run time?

Another way to see my difficulty is to imagine that a virtual machine
instruction changes the internal state of the virtual machine in some
"Atomic" way. No single native machine instruction can do that, because
the native instructions change the state of the real machine, not the
state of the virtual machine. A small change in state of the virtual
machine involves lots of "non atomic" changes in the state of the
underlying real machine. The implementation of the virtual machine runs
lots of native machine instructions to achieve that effect, but those
instructions are determined when you implement the virtual machine, not
dynamically at run time. Unless of course I'm all wet and what you're
really doing is in fact dynamically re-writing the JVM implementation,
which seems a bit mind-boggling to me.





The JVM itself has full access to the Operating System
that the JVM is running on. Whenever you have native methods
(eg. some of the GUI methods and the IO methods...), the JVM
will have to invoke the corresponding services from the Operating
System.
Er, yes. A fixed set of instructions determined at implementation time,
for each JVM machine instruction. Or is that not so?
So, a JVM could invoke a JIT to translate frequently-executed code
into a suitable binary format that the OS can execute.
Which means that the implementation of any given Java machine language
primitive is dynamically altered at run time. Eek! Can that be true?
I see no problem there.

You don't? The native hardware instructions that find the head of a
string are re-invented every time somebody does the "head" operation?
Can that be right?

 

Lew

Lee said:
Why I don't understand how you can go "lower" than the VM's machine code:
Are you saying that the vm is dynamically re-implemented at run time?
Yes.

dynamically at run time. Unless of course I'm all wet and what you're
really doing is in fact dynamically re-writing the JVM implementation,
which seems a bit mind-boggling to me.

Yes, that's what's happening.
Er, yes. A fixed set of instructions determined at implementation time,
for each JVM machine instruction. Or is that not so?

That is not so.
Which means that the implementation of any given Java machine language
primitive is dynamically altered at run time. Eek! Can that be true?
Yes.

You don't? The native hardware instructions that find the head of a
string are re-invented every time somebody does the "head" operation?
Can that be right?

No. Just when necessary to optimize the program.
 

Eric Sosman

Lee said:
[...]

Compilation of the byte code would be nothing more or less than
re-implementing a portion of the string virtual machine. So that makes
no sense to me, as presumably you've done it right the first time when
you implemented the string virtual machine for that hardware in the
first place.

For one thing, the virtual-to-native compilation can
eliminate all the decoding of the virtual instructions. A
straightforward interpreter will fetch a virtual instruction,
fiddle with it for a while, and dispatch to an appropriate
sequence of actual instructions that accomplish the virtual
instruction's mission. It may amount to only a few masks, a
few tests, and a big switch construct, but the interpreter
goes through it on every virtual instruction. Once the code
is compiled to native instructions, all the decoding and
dispatching simply vanishes: it was done once, by the compiler,
and need never be done again.
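
(To make the decode-and-dispatch cost concrete: here is a minimal sketch,
added purely as an illustration and assuming Lee's two-opcode string machine
from earlier in the thread, of the sort of interpreter loop being described.
Every virtual instruction pays for the fetch and the switch before any
useful work happens; that is exactly the work a virtual-to-native translation
does once and then never again.)

// Minimal sketch of an interpreter for the hypothetical string machine.
// The fetch + switch below runs for EVERY virtual instruction, which is the
// overhead a JIT removes by translating the opcode sequence to native code
// once.
public class StringMachine {
    static final byte HEAD = 0x01;   // emit the first character of the string
    static final byte TAIL = 0x02;   // drop the first character of the string

    static String run(byte[] program, String register) {
        StringBuilder output = new StringBuilder();
        for (byte op : program) {            // fetch
            switch (op) {                    // decode + dispatch
                case HEAD:
                    output.append(register.charAt(0));
                    break;
                case TAIL:
                    register = register.substring(1);
                    break;
                default:
                    throw new IllegalArgumentException("bad opcode: " + op);
            }
        }
        return output.toString();
    }

    public static void main(String[] args) {
        byte[] program = { HEAD, TAIL, HEAD, TAIL, HEAD };
        System.out.println(run(program, "JVM"));   // prints "JVM"
    }
}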

Another effect is that the virtual instructions are quite
often more general than they need to be for particular uses.
Stepping away from your two-instruction string machine for a
moment, let's suppose you've got a virtual instruction that adds
two integers to form their sum. The interpreter probably fetches
operand A, fetches operand B, adds them, and stores the sum in
target C. Well, the virtual-to-native compiler might "notice"
that A,B,C are the same variable, which the program adds to itself
in order to double it. The generated native machine code is then
quite unlikely to do two fetches: one will suffice, followed by
a register-to-register add or a left shift or some such. Not only
that, but the compiler may further notice that C is immediately
incremented after doubling, so instead of storing C and fetching
it back again for incrementation, the native machine code says
"Hey, I've already got it in this here register" and eliminates
both the store and the subsequent fetch.
[...]
So, a JVM could invoke a JIT to translate frequently-executed code
into a suitable binary format that the OS can execute.
Which means that the implementation of any given Java machine language
primitive is dynamically altered at run time. Eek! Can that be true?
I see no problem there.

You don't? The native hardware instructions that find the head of a
string are re-invented every time somebody does the "head" operation?
Can that be right?

Could be. The virtual-to-native compiler has the advantage
of being able to see the context in which a virtual instruction
is used, and may be able to take shortcuts, as in the instruction-
combining example above. As an example of a JVM-ish application
of this sort of thing, consider compiling `x[i] += x[i];', our
familiar doubling example but this time with arrays. Formally
speaking, each array reference requires a range check -- but the
JIT may notice that if the left-hand side passes the range check,
there is no need to do it a second time on the right-hand side.
Even better, the JIT may notice common patterns like

for (int i = 0; i < x.length; ++i)
    x[i] += x[i];

... and skip the range checking entirely.

A viewpoint you may find helpful, if a little wrenching at
first, is to think of the virtual instruction set as the elements
of a low-level programming language. You could, with sufficient
patience, write Java bytecode by hand, but it might be easier to
write Java and use javac to generate bytecode from it. Either
way, the bytecode is just an expression of a program, written in
a formal language, and there's no reason a translator couldn't
accept that formal language as its "source" for compilation.
 

John W. Kennedy

Eric said:
Could be. The virtual-to-native compiler has the advantage
of being able to see the context in which a virtual instruction
is used, and may be able to take shortcuts, as in the instruction-
combining example above.

It also knows /exactly/ what processor it's running on, and can take
advantage of detailed timing information and new opcodes.

--
John W. Kennedy
"But now is a new thing which is very old--
that the rich make themselves richer and not poorer,
which is the true Gospel, for the poor's sake."
-- Charles Williams. "Judgement at Chelmsford"
* TagZilla 0.066 * http://tagzilla.mozdev.org
 

Lee

Lew said:

Awesome.

I kept thinking of the virtual machine as a fixed "simulation"
application, written once and set in stone; but I can see how it's
possible to optimize whole blocks of code in ways that are not likely
when considering just one primitive operation.

Wow! I'm still blown away by the concept.
 

Joshua Cranmer

Lee said:
What's the current state of the art? Would we expect a Java program to
run at 0.5 * the speed of C, or 0.7 or 0.9 or what?

In one programming contest I participate in, the Java factor is 1.5x, BUT
this is in a setting where all code is expected to run in 1 second or less
and great emphasis is placed on optimized code.

I would expect that most applications would run at approximately native
speeds.


As a side note, said competition used to use a 5x factor (but it used
Java 1.3)...
 
