Deflater/Inflater and dictionary for Huffman

NOBODY

I want to compress and decompress small pieces of data, such as strings of 1 KB or less.
I know Huffman compression with a static dictionary is the best solution.

It seems I could use java.util.zip.Deflater with the HUFFMAN_ONLY strategy and
maybe prepare a dictionary.

Can anyone explain how to use this? I can't find any docs after 2
hours on Google and the groups! The Javadoc is useless.

Thanks!
 
Thomas Weidenfeller

NOBODY said:
It seems I could use java.util.zip.Deflater with the HUFFMAN_ONLY strategy and
maybe prepare a dictionary.

Can anyone explain how to use this? I can't find any docs after 2
hours on Google and the groups! The Javadoc is useless.

Yep, it is knowledge that is passed from father to son with a secret
handshake. I once tried to interest some Java journals in publishing an
article about "all" the magic of the Zip and Jar classes, but no one was
interested.

First of all, consider using a DeflaterOutputStream instead of a raw
Deflater. It handles all the ugly details of feeding data to the
compression algorithm and writing the results. If you want to have the
result in memory, provide a ByteArrayOutputStream to the
DeflaterOutputStream's constructor.
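
A minimal sketch of that in-memory approach follows. The class and method names are hypothetical, and applying HUFFMAN_ONLY via setStrategy() is an assumption taken from the question, not something the stream requires. Decompression with InflaterInputStream is shown for the round trip.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper class, for illustration only.
final class StreamCompressionSketch {

    // Compresses into memory; the stream feeds the Deflater and collects its output.
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION);
        deflater.setStrategy(Deflater.HUFFMAN_ONLY); // assumption: strategy from the question
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(sink, deflater)) {
            out.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            // The stream does not release a Deflater it did not create itself.
            deflater.end();
        }
        return sink.toByteArray();
    }

    // Reverses compress() by reading through an InflaterInputStream.
    static byte[] decompress(byte[] compressed) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[256];
            int n;
            while ((n = in.read(buf)) != -1) {
                sink.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }
}
```

Closing the DeflaterOutputStream is what finishes the compressed stream, so the byte array should only be read after the try block has ended.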

If you need to do it on your own, I suggest studying the
DeflaterOutputStream source code (the source comes with the J2SDK in a
file called src.zip or src.jar).

Please note that DeflaterOutputStream uses a Deflater differently than
the Deflater documentation suggests. The documentation suggests
checking needsInput() if deflate() returns 0. DeflaterOutputStream
ignores the return value and just loops until needsInput() returns
true. They can do this, because they ensure their output buffer has at
least a size of 1, and they immediately write out any result and reuse
the buffer.

To use a Deflater directly, you need two sets of "pointers" (this is where
the underlying C library shines through). One "pointer" points to the part
of the data that still needs to be processed, the other to some
storage location to which compressed data should be written.

Of course, you don't have pointers in Java, so what the API expects is
a pair of a byte[] and an integer. The byte[] holds the data, the
integer serves as a pointer into the byte[]. In addition, you need
to know how much data (or free space) remains in the byte[]; that's
another integer. So for the input data and for the output data you have
a triplet of

byte[] b;  // memory holding the data
int off;   // position "pointer" into b
int len;   // For input: remaining input data in this buffer
           // For output: remaining free memory in this buffer
           // In both cases: off + len <= b.length

You provide such a triplet to setInput() to tell the Deflater where to
get the data from. And you provide another such triplet to deflate() to
tell the Deflater where to place the output.
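
As a rough illustration of the two triplets, here is a sketch that compresses a slice of one array into a region of another. The class and method names are made up for the example, and it assumes the output region is large enough to take the whole result in a single deflate() call.

```java
import java.util.zip.Deflater;

// Hypothetical helper class, for illustration only.
final class TripletSketch {

    // Compresses input[inOff .. inOff + inLen) into output, starting at outOff.
    // Returns the number of compressed bytes written.
    // Assumption: the free space in output is big enough for the complete result.
    static int deflateSlice(byte[] input, int inOff, int inLen, byte[] output, int outOff) {
        Deflater deflater = new Deflater();
        try {
            // Input triplet: where the uncompressed data lives.
            deflater.setInput(input, inOff, inLen);
            deflater.finish(); // no further input will follow
            // Output triplet: where the compressed data may be written.
            int written = deflater.deflate(output, outOff, output.length - outOff);
            if (!deflater.finished()) {
                throw new IllegalStateException("output region too small");
            }
            return written;
        } finally {
            deflater.end();
        }
    }
}
```

The compressed bytes then occupy output[outOff .. outOff + written), which is itself a triplet that can be handed to Inflater.setInput().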

Now the really tricky part starts. You call deflate() and then you have
to react according to the return value of deflate():

deflate() == 0 && needsInput():

    All data provided via setInput() has been read (but maybe not
    completely processed). Either provide more data by calling
    setInput() again, or finish the compression.

    Finishing the compression is tricky, too:
    (a) Call finish()
    (b) Flush all data still in internal Deflater buffers. You do this
        by calling deflate() in a loop while finished() returns false.
        Check the return value of deflate(), because you still
        might need to provide more output memory.

deflate() == 0 && !needsInput():

    This is undocumented, but important. The Deflater needs more
    output storage. Save the already compressed data in the
    output byte[], and call deflate() again with more memory.

deflate() > 0:

    Some output data is available. The data is in the byte[] provided
    to deflate(), starts at off as provided to deflate() and has a
    length as returned by deflate().

    Do whatever you want with the data, then adjust off and len if
    necessary, and call deflate() again.
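
The cases above can be sketched as one loop. This version follows the simpler DeflaterOutputStream pattern: it copies out whatever deflate() returns and reuses the same output buffer, so it never has to grow the output memory. The class name, method name, and the chunkSize and outBufSize parameters are hypothetical, chosen only to force several setInput() and deflate() calls.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

// Hypothetical helper class, for illustration only.
final class DeflateLoopSketch {

    // Compresses input in chunks of chunkSize bytes, using an output buffer of
    // outBufSize bytes (both must be at least 1).
    static byte[] compress(byte[] input, int chunkSize, int outBufSize) {
        Deflater deflater = new Deflater();
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        byte[] out = new byte[outBufSize];
        try {
            int inOff = 0;
            while (inOff < input.length) {
                // needsInput() case: hand over the next input triplet.
                int inLen = Math.min(chunkSize, input.length - inOff);
                deflater.setInput(input, inOff, inLen);
                inOff += inLen;
                // Loop until the chunk has been read completely.
                while (!deflater.needsInput()) {
                    int n = deflater.deflate(out, 0, out.length);
                    // n > 0: save the compressed bytes, then reuse the buffer.
                    // n == 0: nothing to save; the next call makes progress.
                    result.write(out, 0, n);
                }
            }
            // (a) signal the end of input, (b) flush the internal buffers.
            deflater.finish();
            while (!deflater.finished()) {
                int n = deflater.deflate(out, 0, out.length);
                result.write(out, 0, n);
            }
        } finally {
            deflater.end();
        }
        return result.toByteArray();
    }
}
```

A call such as DeflateLoopSketch.compress(data, 100, 8) uses a deliberately tiny output buffer, so the flush loop after finish() runs more than once.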

HTH

/Thomas
 
