How big is the 1.5 JRE download?

Andrew Thompson

How large is the Java 1.5
Runtime Environment download?

[ In this specific instance,
speaking of the Windows download. ]

I have two 1.5 rt.jar's on my system.
<jre>/lib/rt.jar 31,971,302 bytes, and..
<sdk>/jre/lib/rt.jar 35,805,012 bytes.

OTOH, the JRE 1.5 installer itself, is..
j2re-1_5_0-beta-windows-i586.exe 14,445,056 bytes.

How can the installer be less than
half the size of the rt.jar alone?

What have I missed?
 
Carl Howells

P.Hill said:
"Compression"?

-Paul

Indeed, that is the answer... But you have to be a bit more precise.

Jar files are Zip files, which means they are, in fact, compressed
already. So just offering "compression" as a magic answer is wrong.

The problem is that zip isn't a very good compression scheme for .class
files. It doesn't take advantage of their structure and common idioms
effectively, and yields suboptimal results. MUCH better compression
algorithms, specifically for .class files, have been created. One of
those algorithms is used to package the .class files for the JRE
download. Part of the install process is extracting those .class files
and converting them into jar files.
 
Mickey Segal

Carl Howells said:
The problem is that zip isn't a very good compression scheme for .class
files. It doesn't take advantage of their structure and common idioms
effectively, and yields suboptimal results. MUCH better compression
algorithms, specifically for .class files, have been created. One of
those algorithms is used to package the .class files for the JRE
download. Part of the install process is extracting those .class files
and converting them into jar files.

It is too bad that such compression is not used for Java archives.
Microsoft's CAB files do considerably better than JAR files, but CAB files
do not seem to do nearly as well as the JRE download seems to do.
 
Andrew Thompson

"Compression"?

(slaps forehead) That is what you (..OK, I)
get for assuming Sun stored the rt.jar as
a compressed file.

[ It reduces to 9,808,537 bytes with compression. ]

Well.. that was embarrassing but informative.

Thanks Paul! I can continue slagging .NET for
being a 27 Meg download, ..though I'll have to
wade through over 60 Meg of 'critical updates'
before I can even get to check that, apparently.. :-(
 
Michael Borgwardt

Carl said:
Indeed, that is the answer... But you have to be a bit more precise.

Jar files are Zip files, which means they are, in fact, compressed
already.

Not necessarily.

The problem is that zip isn't a very good compression scheme for .class
files. It doesn't take advantage of their structure and common idioms
effectively, and yields suboptimal results. MUCH better compression
algorithms, specifically for .class files, have been created. One of
those algorithms is used to package the .class files for the JRE
download.

Can you back that up somehow? I severely doubt that there's all that much
difference in performance between such specialized compression schemes and
ZIP's general entropy encoding which is *quite* good at compressing the
kind of simple redundancies likely to turn up in any kind of executable.

Fact is, rt.jar is NOT compressed, probably to speed up startup times.
This alone is perfectly sufficient to explain the size difference.
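
The stored-versus-compressed question can be checked directly rather than argued. What follows is a minimal sketch, not anything Sun ships: the class name JarCompressionReport and its summarize method are made up for illustration, and it only reports what the archive's own directory says about each entry (the method used, and the sizes before and after).

```java
import java.io.IOException;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class JarCompressionReport {

    /** Returns {storedEntries, deflatedEntries, uncompressedBytes, compressedBytes}. */
    static long[] summarize(String path) throws IOException {
        long stored = 0, deflated = 0, rawBytes = 0, packedBytes = 0;
        try (ZipFile zip = new ZipFile(path)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.isDirectory()) {
                    continue; // directories carry no data
                }
                if (entry.getMethod() == ZipEntry.STORED) {
                    stored++;   // written as-is, no compression
                } else {
                    deflated++; // compressed with DEFLATE
                }
                rawBytes += entry.getSize();
                packedBytes += entry.getCompressedSize();
            }
        }
        return new long[] {stored, deflated, rawBytes, packedBytes};
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java JarCompressionReport <path-to-jar>");
            return;
        }
        long[] r = summarize(args[0]);
        System.out.printf("stored=%d deflated=%d raw=%,d bytes packed=%,d bytes%n",
                r[0], r[1], r[2], r[3]);
    }
}
```

Pointed at an rt.jar, a report where every entry is stored and the two byte totals are equal would confirm that the jar is an uncompressed archive.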
 
Mickey Segal

Michael Borgwardt said:
Can you back that up somehow? I severely doubt that there's all that much
difference in performance between such specialized compression schemes and
ZIP's general entropy encoding which is *quite* good at compressing the
kind of simple redundancies likely to turn up in any kind of executable.

Fact is, rt.jar is NOT compressed, probably to speed up startup times.
This alone is perfectly sufficient to explain the size difference.

Too bad. It would have been nice to have proof of a fabulous compression method.
But the point does remain that CAB files are substantially smaller than JAR
files, and some additional saving can be achieved using obfuscator programs,
so it does seem there is room for improvement on Java archive compression.
 
Chris Uppal

Mickey said:
Too bad. It would have been nice to have proof of a fabulous compression
method. But the point does remain that CAB files are substantially
smaller than JAR files[...]

Do CAB files support random access to their constituents ?

I don't know myself, but part of the reason for JAR file compression not being
optimal is that each file is compressed (if at all) independently. (As is
familiar to old UNIX hackers -- it's much better to compress a tar file, than to
tar a collection of individually compressed files) Also the compressor used
for each file is "naive" in that it has not been pre-trained on the bit
patterns that tend to occur in .class files[*].

There has been a fair amount of work on making JAR-like files smaller, I don't
have references to hand just now but I can find them tomorrow if anyone's
interested.

-- chris

[*] It'd be the work of a few minutes to test the effect of pre-training, I may
try that tomorrow too...
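
The pre-training idea in the footnote can be tried with the preset-dictionary support in java.util.zip. This is only a sketch under stated assumptions: the dictionary and the sample are invented strings of the sort found in class-file constant pools, not real .class data, and PresetDictionaryDemo is a made-up name. It compresses the same input with and without the dictionary so the two sizes can be compared.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class PresetDictionaryDemo {

    // Hypothetical dictionary: strings that tend to recur across class files.
    static final byte[] DICTIONARY =
        "LineNumberTable SourceFile Code <init> ()V java/lang/String java/lang/Object"
            .getBytes(StandardCharsets.US_ASCII);

    // Hypothetical stand-in for the text content of one small class file.
    static final byte[] SAMPLE =
        "java/lang/Object <init> ()V Code LineNumberTable SourceFile Example.java java/lang/String toString"
            .getBytes(StandardCharsets.US_ASCII);

    /** Deflates input, priming the compressor with dictionary when it is non-null. */
    static byte[] deflate(byte[] input, byte[] dictionary) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        if (dictionary != null) {
            deflater.setDictionary(dictionary);
        }
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Inflates packed data, supplying the dictionary if the stream asks for one. */
    static byte[] inflate(byte[] packed, byte[] dictionary, int originalLength)
            throws DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(packed);
        byte[] result = new byte[originalLength];
        int off = 0;
        while (off < originalLength && !inflater.finished()) {
            int n = inflater.inflate(result, off, originalLength - off);
            if (n == 0) {
                if (inflater.needsDictionary()) {
                    inflater.setDictionary(dictionary);
                } else {
                    break; // no progress possible
                }
            }
            off += n;
        }
        inflater.end();
        return result;
    }

    public static void main(String[] args) {
        System.out.println("original:           " + SAMPLE.length + " bytes");
        System.out.println("without dictionary: " + deflate(SAMPLE, null).length + " bytes");
        System.out.println("with dictionary:    " + deflate(SAMPLE, DICTIONARY).length + " bytes");
    }
}
```

The catch is that the reader must hold the same dictionary, so a standard zip tool could not unpack entries written this way.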
 
Chris Uppal

Michael said:
Fact is, rt.jar is NOT compressed, probably to speed up startup times.
This alone is perfectly sufficient to explain the size difference.

Has anyone here ever tested the effect on startup time ?

Never got around to it myself, but my /guess/ would be that compression would
make it better -- less IO needed, though it is probably OS/FS-dependent.

-- chris
 
Michael Borgwardt

Mickey Segal said:
Can you back that up somehow? I severely doubt that there's all that much
difference in performance between such specialized compression schemes and
ZIP's general entropy encoding which is *quite* good at compressing the
kind of simple redundancies likely to turn up in any kind of executable.
[]
Too bad. It would have been nice to have proof of a fabulous compression method.
But the point does remain that CAB files are substantially smaller than JAR
files, and some additional saving can be achieved using obfuscator programs,
so it does seem there is room for improvement on Java archive compression.

Well, one disadvantage (in regard to achievable compression, advantage
in regard to data safety) of the ZIP/JAR format is that files are
compressed independently, so it can't eliminate inter-file-redundancy.
And of course there are better entropy compression algorithms. But
these factors will usually not result in truly "fabulous" differences.

As for obfuscators, you can't count those because they DESTROY
information - a kind of lossy compression if you will.
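
The cost of compressing members independently is easy to see on toy data. The sketch below is an illustration only: SolidVsPerFile and fakeMember are invented names, the "files" are short made-up strings that share most of their content, and zip's per-entry headers are ignored, so the numbers say nothing precise about real jars.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

public class SolidVsPerFile {

    /** Returns the number of bytes DEFLATE produces for the given input. */
    static int deflatedSize(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        byte[] buf = new byte[4096];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buf);
        }
        deflater.end();
        return total;
    }

    /** Hypothetical member: shared boilerplate plus a small unique suffix. */
    static byte[] fakeMember(int i) {
        String s = "java/lang/Object java/lang/String java/util/ArrayList <init> ()V Code "
                 + "LineNumberTable SourceFile toString hashCode equals Example" + i;
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    /** Zip/JAR style: every member gets its own compression stream. */
    static int perMemberTotal(int count) {
        int total = 0;
        for (int i = 0; i < count; i++) {
            total += deflatedSize(fakeMember(i));
        }
        return total;
    }

    /** tar.gz style: concatenate first, then compress once. */
    static int solidTotal(int count) {
        ByteArrayOutputStream all = new ByteArrayOutputStream();
        for (int i = 0; i < count; i++) {
            byte[] member = fakeMember(i);
            all.write(member, 0, member.length);
        }
        return deflatedSize(all.toByteArray());
    }

    public static void main(String[] args) {
        int count = 50;
        System.out.println("compressed separately: " + perMemberTotal(count) + " bytes");
        System.out.println("compressed together:   " + solidTotal(count) + " bytes");
    }
}
```

The single stream wins here because later members can refer back to earlier ones. The price is the one already noted: no random access to one member, and damage to the stream affects everything after it.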
 
zoopy

Indeed, that is the answer... But you have to be a bit more precise.

Jar files are Zip files, which means they are, in fact, compressed
already. So just offering "compression" as a magic answer is wrong.

The problem is that zip isn't a very good compression scheme for .class
files. It doesn't take advantage of their structure and common idioms
effectively, and yields suboptimal results. MUCH better compression
algorithms, specifically for .class files, have been created. One of
those algorithms is used to package the .class files for the JRE
download. Part of the install process is extracting those .class files
and converting them into jar files.

You're probably referring to Pack200.

Quoting from JSR200 <http://jcp.org/en/jsr/detail?id=200>:
"The Java(TM) archive format can compress these classes at the byte level only, leading to a meager
reduction factor of about two. We need to compress the classes much more efficiently, thus making
network transfers faster and therefore more reliable."

"An example of a more effective compression technology is the Pack format which was developed to
reduce the download size of the JRE (J2SE Runtime Environment) installer for windows in J2SE v1.4.1
and J2SE1.4.2. The Pack format simultaneously organizes the layout of all classes and resource files
within a JAR, removing repetitions of shared structures, and yielding a reduction factor of seven to
nine."

Tools for Pack200 are now part of J2SE 5.0:
<http://java.sun.com/j2se/1.5.0/docs/tooldocs/index.html#deployment>
 
Mickey Segal

Michael Borgwardt said:
As for obfuscators, you can't count those because they DESTROY
information - a kind of lossy compression if you will.

If an obfuscator replaces a class name such as "SignInJustInTimePanel" with
an obfuscated name such as "qw" does that destroy anything of importance to
the user?
 
Paul Lutus

Mickey said:
If an obfuscator replaces a class name such as "SignInJustInTimePanel"
with an obfuscated name such as "qw" does that destroy anything of
importance to the user?

Only if the class is never used as a component in another application.
Obfuscators should only be applied to end-user applications.
 
Andrew Thompson

I don't know myself, but part of the reason for JAR file compression not being
optimal is that each file is compressed (if at all) independently.

Aha! That was something I had always
suspected about Zip compression algorithms.

But then, could you not deliver the 'index'
of common parts before the delivery of the first
compressed entry? ..though that might represent
30% of the entire file, sigh.. :-(
 
Andrew Thompson

Has anyone here ever tested the effect on startup time ?

Never got around to it myself, but my /guess/ would be that compression would
make it better -- less IO needed, though it is probably OS/FS-dependent.

I'd have to agree with Michael. What possible
advantage is there to loading a Zip file with
uncompressed files if it is not speed?

There was at least a three times size reduction
with standard Zip compression, a saving of some
20 Meg of disk space. If compressed files were
actually faster as well, I cannot believe Sun
would be so silly as to not do it.

Of course, a test case blows the entire
philosophising out of the water. Anybody..?
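
A starting point for such a test case might look like the sketch below. It rests on loud assumptions: it builds two temporary archives from synthetic, highly repetitive data (sampleFiles is a made-up helper), and it times only reading every entry once in an already-running JVM, most likely from a warm file cache. That is not the same thing as JVM startup from a cold disk, so treat any numbers as a hint at best.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class StoredVsDeflatedRead {

    /** Hypothetical test data: repetitive bytes, so DEFLATE has something to find. */
    static byte[][] sampleFiles(int count, int size) {
        byte[][] files = new byte[count][size];
        for (int i = 0; i < count; i++) {
            for (int j = 0; j < size; j++) {
                files[i][j] = (byte) ((i + j) % 16);
            }
        }
        return files;
    }

    /** Writes the files to a temporary archive, either deflated or stored. */
    static Path writeArchive(byte[][] files, boolean compress) throws IOException {
        Path path = Files.createTempFile(compress ? "deflated" : "stored", ".jar");
        path.toFile().deleteOnExit();
        try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(path))) {
            for (int i = 0; i < files.length; i++) {
                ZipEntry entry = new ZipEntry("f" + i + ".bin");
                if (!compress) {
                    // STORED entries need size and CRC set up front.
                    CRC32 crc = new CRC32();
                    crc.update(files[i]);
                    entry.setMethod(ZipEntry.STORED);
                    entry.setSize(files[i].length);
                    entry.setCompressedSize(files[i].length);
                    entry.setCrc(crc.getValue());
                }
                out.putNextEntry(entry);
                out.write(files[i]);
                out.closeEntry();
            }
        }
        return path;
    }

    /** Reads every entry to the end and returns the total bytes read. */
    static long readAll(Path path) throws IOException {
        long total = 0;
        byte[] buf = new byte[8192];
        try (ZipFile zip = new ZipFile(path.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                try (InputStream in = zip.getInputStream(entries.nextElement())) {
                    int n;
                    while ((n = in.read(buf)) != -1) {
                        total += n;
                    }
                }
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[][] files = sampleFiles(2000, 4096);
        for (boolean compress : new boolean[] {false, true}) {
            Path archive = writeArchive(files, compress);
            long start = System.nanoTime();
            long bytes = readAll(archive);
            long micros = (System.nanoTime() - start) / 1000;
            System.out.printf("%s: %,d bytes on disk, %,d bytes read in %,d us%n",
                    compress ? "deflated" : "stored", Files.size(archive), bytes, micros);
        }
    }
}
```

A fairer comparison would repeat the runs, alternate their order, and use real class files on a freshly booted machine.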
 
Andrew Thompson

You're probably referring to Pack200.

Nuh. I have now checked the 1.3.1 jar as well,
not one iota of compression used, it is simply
a big file archive.

Whatever compression techniques Sun is offering,
or planning to offer developers, they are not
applying them to their own rt.jar's.
 
zoopy

Nuh. I have now checked the 1.3.1 jar as well,
not one iota of compression used, it is simply
a big file archive.

Whatever compression techniques Sun is offering,
or planning to offer developers, they are not
applying them to their own rt.jar's.

You must have misinterpreted to whom I was replying: it wasn't you ;-) but Carl Howells,
specifically to his line:
"MUCH better compression algorithms,
specifically for .class files,
have been created"
 
Andrew Thompson


Though if you rename the 1.5 rt.jar to
a .zip and open it, WinZip pops up a 'comment'
saying simply 'PACK200'.. It still lists
the 'File compression ratio' as a fat 0% though.
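
The same archive comment can be read without WinZip. A small sketch, assuming a Java version that has ZipFile.getComment() (Java 7 or later, so not the 1.5 under discussion); ArchiveComment is a made-up class name. It only shows the comment, it does not establish what wrote it or why.

```java
import java.io.IOException;
import java.util.zip.ZipFile;

public class ArchiveComment {

    /** Returns the archive-level comment, or null if the archive has none. */
    static String commentOf(String path) throws IOException {
        try (ZipFile zip = new ZipFile(path)) {
            return zip.getComment();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java ArchiveComment <path-to-jar-or-zip>");
            return;
        }
        System.out.println("comment: " + commentOf(args[0]));
    }
}
```

A comment of 'PACK200' alongside a 0% ratio would be consistent with the jar having been rebuilt, uncompressed, from a packed download, though nothing in the comment itself proves that.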
 
