Java processors

bob smith

What ever happened to those processors that were supposed to run Java natively?

Did Sun or anyone else ever make those?
 
BGB

http://en.wikipedia.org/wiki/Java_processor

(If you need help clicking links, just ask.)

and, of those, AFAIK, ARM's Jazelle was the only one to really gain much
widespread adoption, and even then is largely being phased out in favor
of ThumbEE, where the idea is that instead of using direct execution, a
lightweight JIT or similar is used instead.

part of the issue I think is that there isn't really all that much
practical incentive to run Java bytecode directly on a CPU, since if
similar (or better) results can be gained by using a JIT to another ISA,
why not use that instead?


this is a merit of having a bytecode which is sufficiently abstracted
from the underlying hardware such that it can be efficiently targeted to
a variety of physical processors.

this is in contrast to a "real" CPU ISA, which may tend to expose enough
internal workings that efficient implementation on different CPU
architectures is problematic (say: differences in endianness, support
for unaligned reads, different ways of handling arithmetic status
conditions, ...). in such a case, conversion from one ISA to another may
come at a potentially significant performance hit.

whereas if this issue does not really apply, or if the output of the
JIT will potentially even execute faster than direct execution of the
ISA by hardware would (say, because the JIT can do a lot more advanced
optimizations, or can map the code onto a different and more efficient
execution model, such as transforming the stack-oriented code into
register-based machine code), then there is much less merit to the use
of direct execution.
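The stack-to-register transform mentioned above can be sketched concretely. This is a hypothetical illustration (the class and method names are made up, and the listings in the comments are approximate):

```java
// Rough illustration of the stack-to-register transform a JIT performs.
// For the static method f below, javac emits stack-oriented bytecode
// roughly like:
//     iload_0        // push a
//     iload_1        // push b
//     iload_2        // push c
//     imul           // pop b, c; push b*c
//     iadd           // pop a, b*c; push the sum
//     ireturn
// which a JIT can re-map onto register-based machine code, roughly:
//     mul r1, r_b, r_c
//     add r0, r_a, r1
public class StackVsRegister {
    static int f(int a, int b, int c) { return a + b * c; }

    public static void main(String[] args) {
        System.out.println(f(2, 3, 4));   // 2 + 3*4 = 14
    }
}
```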
 
Eric Sosman

BGB said:
and, of those, AFAIK, ARM's Jazelle was the only one to really gain much
widespread adoption, and even then is largely being phased out in favor
of ThumbEE, where the idea is that instead of using direct execution, a
lightweight JIT or similar is used instead.

part of the issue I think is that there isn't really all that much
practical incentive to run Java bytecode directly on a CPU, since if
similar (or better) results can be gained by using a JIT to another ISA,
why not use that instead?


this is a merit of having a bytecode which is sufficiently abstracted
from the underlying hardware such that it can be efficiently targeted to
a variety of physical processors.

this is in contrast to a "real" CPU ISA, which may tend to expose enough
internal workings that efficient implementation on different CPU
architectures is problematic (say: differences in endianness, support
for unaligned reads, different ways of handling arithmetic status
conditions, ...). in such a case, conversion from one ISA to another may
come at a potentially significant performance hit.

whereas if this issue does not really apply, or if the output of the
JIT will potentially even execute faster than direct execution of the
ISA by hardware would (say, because the JIT can do a lot more advanced
optimizations, or can map the code onto a different and more efficient
execution model, such as transforming the stack-oriented code into
register-based machine code), then there is much less merit to the use
of direct execution.

In principle, a JIT could do better optimization than a
traditional compiler because it has more information available.
For example, a JIT can know what classes are actually loaded in
the JVM and take shortcuts like replacing getters and setters with
direct access to the underlying members. A JIT can gather profile
information from a few interpreted executions and use the data to
guide the eventual realization in machine code. Basically, a JIT
can know what the environment actually *is*, while a pre-execution
compiler must produce code for every possible environment.
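A minimal sketch of the getter-inlining point (class and method names are made up). A JIT that observes `Point` is final and never overridden can replace the accessor calls with direct field loads, something a pre-execution compiler could not safely assume for every possible environment:

```java
// Hypothetical example: once the JIT knows Point is final (no overrides
// possible), the getX()/getY() calls in sumCoords can be inlined down to
// direct field loads.
public class GetterDemo {
    static final class Point {            // final: no subclass can override
        private final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
        int getX() { return x; }          // trivially inlinable accessor
        int getY() { return y; }
    }

    static long sumCoords(Point[] pts) {
        long sum = 0;
        for (Point p : pts) {
            sum += p.getX() + p.getY();   // after inlining: sum += p.x + p.y
        }
        return sum;
    }

    public static void main(String[] args) {
        Point[] pts = new Point[100];
        for (int i = 0; i < pts.length; i++) pts[i] = new Point(i, i);
        System.out.println(sumCoords(pts)); // 2 * (0+1+...+99) = 9900
    }
}
```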

On the other hand, a former colleague of mine once observed
that "Just-In-Time" is in fact a misnomer: it's a "Just-Too-Late"
compiler because it doesn't even start work until after you need
its output! Even if the JIT generates code better optimized for
the current circumstances than a pre-execution compiler could,
the JIT's code starts later. Does Achilles catch the tortoise?
 
Jim Janney

BGB said:
and, of those, AFAIK, ARM's Jazelle was the only one to really gain
much widespread adoption, and even then is largely being phased out in
favor of ThumbEE, where the idea is that instead of using direct
execution, a lightweight JIT or similar is used instead.

part of the issue I think is that there isn't really all that much
practical incentive to run Java bytecode directly on a CPU, since if
similar (or better) results can be gained by using a JIT to another
ISA, why not use that instead?

The cost of entry into CPU manufacturing is far from cheap, and once
you're in it's anything but a level playing field. Intel has an
enormous advantage due to the amount of money it can plow into improving
its manufacturing processes. And the demand for a system that can only
run JVM-based software is relatively limited.

Back in the day Niklaus Wirth had a system that was optimised for
running Modula-2, with its own processor and operating system written in
Modula-2. I don't remember now what it was called.
 
BGB

Jim Janney said:
The cost of entry into CPU manufacturing is far from cheap, and once
you're in it's anything but a level playing field. Intel has an
enormous advantage due to the amount of money it can plow into improving
its manufacturing processes. And the demand for a system that can only
run JVM-based software is relatively limited.

Back in the day Niklaus Wirth had a system that was optimised for
running Modula-2, with its own processor and operating system written in
Modula-2. I don't remember now what it was called.

yes, but ARM already had direct Java bytecode execution in the form of
Jazelle, which it is now phasing out in favor of ThumbEE, which is a
JIT-based strategy.

I suspect this is telling: even when one *can* directly execute
bytecode on raw hardware, does it actually buy enough to make it
worthwhile?

these occurrences imply a few things: Java is a fairly big thing on ARM,
and even then direct execution was likely either not sufficiently
performant or not cost-effective to keep, leading to a fallback strategy
of adding extensions to ease JIT compiler output.


yes, on x86 targets, it is a much harder sell.
 
BGB

Eric Sosman said:
In principle, a JIT could do better optimization than a
traditional compiler because it has more information available.
For example, a JIT can know what classes are actually loaded in
the JVM and take shortcuts like replacing getters and setters with
direct access to the underlying members. A JIT can gather profile
information from a few interpreted executions and use the data to
guide the eventual realization in machine code. Basically, a JIT
can know what the environment actually *is*, while a pre-execution
compiler must produce code for every possible environment.

well, yes, but it isn't clear how this is directly related (since it was
JIT vs raw HW support, rather than about JIT vs compilation in advance).

a limiting factor for JIT and optimizations is that they often have a
much smaller time window, and so are limited mostly to optimizations
which can themselves be performed fairly quickly.


FWIW though, there is also AOT, which can also optimize for a specific
piece of hardware, but avoids a lot of the initial delay of a JIT by
compiling in advance (or on first execution, so the first time the app
will take longer to start up, but the next time it will start much
faster).

yes, there are a lot of tradeoffs, for example, AOT will not be able to,
say, make decisions informed by profiler output, since in this case it
will not have this information available.

Eric Sosman said:
On the other hand, a former colleague of mine once observed
that "Just-In-Time" is in fact a misnomer: it's a "Just-Too-Late"
compiler because it doesn't even start work until after you need
its output! Even if the JIT generates code better optimized for
the current circumstances than a pre-execution compiler could,
the JIT's code starts later. Does Achilles catch the tortoise?

yeah.

even then, there may be other levels of tradeoffs, such as, whether to
do full compilation, or merely spit out some threaded code and run that.

the full compilation then is much more complicated (more complex JIT),
and also slower (since now the JIT needs to worry about things like
type-analysis, register allocation, ...), whereas a simpler strategy,
like spitting out a bunch of function calls and maybe a few basic
machine-code fragments, is much faster and simpler (the translation can
be triggered by trying to call a function, without too many adverse
effects on execution time, and will tend to only translate parts of the
program or library which are actually executed).
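A toy sketch of the threaded-code idea (all names and the two-op "bytecode" here are hypothetical): instead of re-decoding bytecode on every execution, translate it once into an array of pre-resolved handler calls, then just walk the array.

```java
// Minimal threaded-code sketch: translation happens once, execution is
// a plain walk over pre-resolved handlers, with no bytecode decoding.
import java.util.ArrayList;
import java.util.List;

public class ThreadedDemo {
    interface Op { void run(int[] stack, int[] sp); }

    // Handlers are resolved at "translation" time, not per execution.
    static final Op PUSH1 = (st, sp) -> st[sp[0]++] = 1;
    static final Op ADD   = (st, sp) -> { int b = st[--sp[0]]; st[sp[0] - 1] += b; };

    // "Translate" a toy bytecode (0 = push1, 1 = add) into threaded code.
    static Op[] translate(byte[] code) {
        List<Op> out = new ArrayList<>();
        for (byte b : code) out.add(b == 0 ? PUSH1 : ADD);
        return out.toArray(new Op[0]);
    }

    static int execute(Op[] threaded) {
        int[] stack = new int[16];
        int[] sp = {0};                  // stack pointer in a one-slot array
        for (Op op : threaded) op.run(stack, sp);
        return stack[0];
    }

    public static void main(String[] args) {
        // push1 push1 add push1 add  ->  (1+1)+1 = 3
        Op[] t = translate(new byte[]{0, 0, 1, 0, 1});
        System.out.println(execute(t));  // 3
    }
}
```

A real threaded-code backend would emit machine-level call sequences rather than walk an object array, but the tradeoff is the same: cheap translation, no decode loop.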


some of this can influence VM design as well (going technically OT here):
in my VM it led to the use of explicit type-tagging (via prefixes),
partly because the bytecode isn't directly executed anyways (merely
translated to threaded code by this point), and the "original plan" of
using type-inference and flow-analysis in the JIT backend was just too
much effort to bother with for the more "trivial" threaded-code backend,
so I instead ended up migrating a lot of this logic to the VM frontend,
and using prefixes to indicate types.

I still call the threaded-code execution "interpretation" though, partly
because it is a gray area and from what I can gather, such a thing is
still called an interpreter even when it no longer bases its execution
off direct interpretation of bytecode or similar.

but, the threaded-code is at least sufficiently fast to lessen the
immediate need for the effort of implementing a more proper JIT compiler.


or such...
 
Arne Vajhøj

Eric Sosman said:
On the other hand, a former colleague of mine once observed
that "Just-In-Time" is in fact a misnomer: it's a "Just-Too-Late"
compiler because it doesn't even start work until after you need
its output! Even if the JIT generates code better optimized for
the current circumstances than a pre-execution compiler could,
the JIT's code starts later. Does Achilles catch the tortoise?

It is my impression that modern JVMs, especially with -server
(or equivalent), are rather aggressive about JIT compiling.

.NET CLR always does it the first time, I believe.

Arne
 
Martin Gregorie


Well spotted.

IIRC that was roughly contemporary with the Burroughs x700 systems, which
had an interesting take on virtualisation: its MCP OS ran each user
program in a VM that supported the conceptual model used by its
programming language, so a FORTRAN program ran in a word-addressed VM
with a register set, COBOL ran in a byte-addressed VM (also with
registers) while Algol/Pascal/C (had it existed at the time) ran in a
stack-based VM, all using instruction sets that suited that programming
model. Unfortunately I never got to play with that kit, but wish I had
known more about it because it was well ahead of its time.

The nearest I got to that, somewhat later, was a 2966 running 1900
programs (24-bit word addressed, register-based, 6-bit ISO characters)
under George 3 simultaneously with native programs (byte-addressed,
stack-based, 8-bit EBCDIC characters) under VME/B.

IMHO the 2966 trick of hosting a VM per OS with appropriate microcode was
neat, but was blown away by the Burroughs trick of spinning up the
appropriate VM for each application program and controlling the lot from
the same OS. IBM's OS/400 could do this to run S/36 software on an AS/400
but I don't know of anything else that comes close.
 
BGB

Arne said:
It is my impression that modern JVMs, especially with -server
(or equivalent), are rather aggressive about JIT compiling.

.NET CLR always does it the first time, I believe.

yes, but also .NET CIL is not really well suited to direct
interpretation (the bytecode does not itself contain full type
information, ...), with the idea being that JIT is the sole "viable" way
to execute it.

so, when starting up a program, the .NET CLR will compile it to native
code, and then begin executing it.


also, very often .NET programs are AOT compiled during or shortly
following installation (if one observes a heavy CPU usage of "ngen.exe",
during or following installation of a .NET app, that is the AOT compiler
doing its thing).
 
Eric Sosman

Arne said:
It is my impression that modern JVMs, especially with -server
(or equivalent), are rather aggressive about JIT compiling.

My colleague's point was that JITting the code, aggressively
or not, is pre-execution overhead: It is work spent on something
other than running your code. If you just dove in and started
interpreting you might be running more slowly, but you'd have a
head start: Achilles is the faster runner, but cannot overcome
the tortoise's lead if the race is short.

I dunno: Are JITs nowadays smart enough to recognize code
that will (future tense) execute too few times to be worth JITting?
Static initializers without loops, say? Code in (some) catch
blocks?
 
Roedy Green

bob smith said:
What ever happened to those processors that were supposed to run Java natively?

Did Sun or anyone else ever make those?

For some reason the early designs had a big problem with heat. This is
a very bad thing in a portable unit where low power is the main goal.
--
Roedy Green Canadian Mind Products
http://mindprod.com
Why do so many operating systems refuse to define a standard
temporary file marking mechanism? It could be a reserved lead character
such as the ~ or a reserved extension such as .tmp.
It could be a file attribute bit. Because they refuse, there is no
fool-proof way to scan a disk for orphaned temporary files and delete them.
Further, you can't tell where the orphaned files came from.
This means the hard disks gradually fill up with garbage.
 
Lew

Arne said:
(or equivalent), are rather aggressive about JIT compiling.

.NET CLR always does it first time I believe.

WRT Java, there are options such as "-XX:CompileThreshold=10000"
(default for -server).

That means that the HotSpot compiler sees the same code 10000 times
before deciding to compile it.

<http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html>
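One way to watch this threshold behavior in action is HotSpot's -XX:+PrintCompilation flag. A toy hot method (names here are made up for illustration); run it as `java -XX:+PrintCompilation HotDemo` and square() should show up in the compilation log once its invocation count crosses the threshold:

```java
// Run with: java -XX:+PrintCompilation HotDemo
// The square() method is called well past the default -server
// CompileThreshold of 10000, so HotSpot should decide to compile it.
public class HotDemo {
    static long square(long n) { return n * n; }

    public static void main(String[] args) {
        long acc = 0;
        for (int i = 0; i < 20_000; i++) {   // well past 10000 invocations
            acc += square(i);
        }
        System.out.println(acc);
    }
}
```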

There's a reason performance white papers for Java discuss
GC and other options. They can influence performance more than
JIT does.

<http://www.oracle.com/technetwork/java/javase/tech/performance-jsp-141338.html>

Not that optimization of the JITter is a bad idea.

<http://www.oracle.com/technetwork/java/6-performance-137236.html#2.1.6>

You can see the effect of memory and other non-JIT enhancements on
the performance of Java 6 here:
<http://www.oracle.com/technetwork/java/6-performance-137236.html#2.3>

<http://www.oracle.com/technetwork/java/hotspotfaq-138619.html>

Perhaps the source code will help you understand:
<http://openjdk.java.net/groups/hotspot/>
 
Roedy Green

BGB said:
and, of those, AFAIK, ARM's Jazelle was the only one to really gain much
widespread adoption, and even then is largely being phased out in favor
of ThumbEE, where the idea is that instead of using direct execution, a
lightweight JIT or similar is used instead.

RAM cost has dropped precipitously. However, originally it was very
limited. If you did not need a JIT you could use that RAM for the
application.
 
Roedy Green

Eric Sosman said:
If you just dove in and started
interpreting you might be running more slowly, but you'd have a
head start

That is just what JITs do. It is only after a while, when they have
gathered some stats, that they decide which classes to turn into
machine code. The astounding thing is they can stop the interpreter
mid-flight while executing a method, replace it with machine code, and
restart it. That to me is far more impressive than walking on water.
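The mid-flight swap described here is what HotSpot calls on-stack replacement (OSR): the JIT compiles a loop while the interpreter is still inside it and jumps into the compiled version without returning from the method. A minimal sketch (names hypothetical) of a loop that only OSR can speed up, since its enclosing method is invoked just once:

```java
// hotLoop is called a single time, so the normal invocation-count
// trigger never fires; only on-stack replacement (OSR) can compile the
// loop while it is running. With -XX:+PrintCompilation, OSR compiles
// are marked with a '%' in the log.
public class OsrDemo {
    static long hotLoop(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {   // OSR candidate loop
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(hotLoop(1_000_000));  // 499999500000
    }
}
```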
 
Roedy Green

Jim Janney said:
Back in the day Niklaus Wirth had a system that was optimised for
running Modula-2, with its own processor and operating system written in
Modula-2. I don't remember now what it was called.

Lilith.
see
http://www.ethistory.ethz.ch/rueckb...weitere_seiten/lilith/index_EN/popupfriendly/
 
Gene Wirchenko

Roedy Green said:
That is just what JITs do. It is only after a while, when they have
gathered some stats, that they decide which classes to turn into
machine code. The astounding thing is they can stop the interpreter
mid-flight while executing a method, replace it with machine code, and
restart it. That to me is far more impressive than walking on water.

Do you have a cite for that? Restarting a method could be messy.
Imagine if files are opened, other objects created, etc.

I suspect that it might be as prosaic as a method's execution counter
reaching a threshold value and triggering the conversion.

Sincerely,

Gene Wirchenko
 
Roedy Green

Martin Gregorie said:
Unfortunately I never got to play with that kit, but wish I had
known more about it because it was well ahead of its time.

I got to write code for the Burroughs 1900, a successor. The code
density was about 10 times what I was used to. I loved the machine,
but it was not that much fun to code since everything was done at the
high-level-language level. It was just so straightforward. The thing
I found most fun was the NCP language. Even a salesman could write a
custom program for polling a set of multi-drop terminals.

The underlying hardware had only 24 bits of addressing, but it was bit
addressable. That let you address bytes with 21 bits, a mere 2
megabytes. Yet that little machine pumped out transactions like you
would not believe. It used memory very cleverly, dynamically balancing
system, app, database, and disk cache.

I suppose they could have extended the architecture, leaving the
per-process limits in place. Univac and Burroughs merged to form
Unisys. I don't know what happened to their various architectures.

Univac had the 1100 36-bit machines, and some mid-range IBM
compatibles inherited from RCA. Burroughs had the high-end Algol
machines (fiendishly complex), the mid-range decimal-addressed machines
(designed for easy assembler coding), and the 1900 series -- the
interpreter-per-language design.


 
