GZIPInputStream (how to uncompress from a succession of byte arrays)

chattycow

This message also posted on comp.lang.java.help --

I have a hopefully simple question regarding gunzipping compressed data
not in a file...

I have a gzipped file/object/whatever residing across a network (it's
large, about 1 GB).
The file is broken up into 32 KB chunks and sent across the network in
these chunks.

I need to decompress the file using GZIP at the other end of the
network from the bunch of 32 KB chunks.
I obviously can't just load 1 GB of bytes into an extremely large
ByteArrayInputStream and then pass that to the GZIPInputStream
constructor, as I would quickly run out of memory.

I need a way to do this a few 32 KB chunks at a time: keep adding
data to GZIPInputStream (or whatever) for decompressing, and read
the uncompressed data as we go.

Any ideas would be really, really, helpful. Code examples would be
extremely helpful as I'm pretty new to Java and love the language so
far.
 
Andrey Kuznetsov

Unified I/O could be helpful - http://uio.imagero.com
It's free and open source.

Andrey
 
chattycow

It doesn't appear that Unified I/O is going to help.
It doesn't seem to let me create an InputStream and then continually add
to it, all the while having it wrapped in a GZIPInputStream... unless
I'm missing something.

Thanks anyway. Does anybody else have any ideas, or has anyone done this before?

Thanks,
---Dean.
 
Chris Uppal

chattycow said:
I have a gzipped file/object/whatever residing across a network (it's
large, about 1 GB).
The file is broken up into 32 KB chunks and sent across the network in
these chunks.

I'm not sure what you mean by this. I can think of three possible meanings, but
each one has a different answer...

1) You have a large lump of gzipped data and you want to read that
incrementally across a network, i.e. without holding the whole thing in memory
at the same time. That's easy because GZIPInputStream is inherently
incremental.

2) You have a server of some sort which provides a sequence of 32K chunks of
data, each of which has been independently gzipped. Again that is easy -- just
de-gzip each chunk independently and assemble the results.

3) You have an awkwardly designed system which passes out chunks extracted from
a single gzipped file, but doesn't do so as an uninterrupted stream. Perhaps
some sort of web-service where you keep sending HTTP requests "give me the next
chunk". If so then your first thought should be "how can I fix the design?".
Assuming that you are unable to do so, and since you clearly have too much data
for a simple hack/workaround to suffice, you will have to create a custom
subclass of InputStream. That implementation would satisfy read()
requests by checking an internal buffer to see if it already had enough data
stored and, if not, then it would refill the buffer by issuing a network
request to your server for the next 32K chunk. You would create an instance of
that class and pass it to the GZIPInputStream constructor. You might find that
quite difficult if other parts of your application need to talk to the server
at the same time, or if the protocol for talking to the server is more
complicated than I'm assuming here.
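
The three cases above could be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not a drop-in solution: ChunkSource, ChunkedInputStream and Gunzip are hypothetical names invented here, and ChunkSource stands in for whatever call actually fetches the next 32K chunk from the server. Only InputStream and GZIPInputStream come from the standard library.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.GZIPInputStream;

/** Hypothetical stand-in for whatever call fetches the next chunk from the server. */
interface ChunkSource {
    /** Returns the next chunk of compressed bytes, or null when there are no more. */
    byte[] nextChunk() throws IOException;
}

/** Presents a sequence of chunks as one continuous InputStream (case 3). */
class ChunkedInputStream extends InputStream {
    private final ChunkSource source;
    private byte[] current = new byte[0]; // chunk currently being consumed
    private int pos = 0;                  // next unread index in current
    private boolean exhausted = false;    // true once the source returned null

    ChunkedInputStream(ChunkSource source) {
        this.source = source;
    }

    /** Ensures at least one unread byte is buffered; returns false at end of data. */
    private boolean fill() throws IOException {
        while (!exhausted && pos >= current.length) {
            byte[] next = source.nextChunk();
            if (next == null) {
                exhausted = true;
            } else {
                current = next; // an empty chunk simply triggers another fetch
                pos = 0;
            }
        }
        return pos < current.length;
    }

    @Override
    public int read() throws IOException {
        return fill() ? (current[pos++] & 0xFF) : -1;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        if (!fill()) {
            return -1;
        }
        int n = Math.min(len, current.length - pos);
        System.arraycopy(current, pos, b, off, n);
        pos += n;
        return n;
    }
}

class Gunzip {
    /**
     * Decompresses a gzip stream into sink using a fixed 32 KB buffer, so
     * memory use does not grow with the size of the data. Closes the
     * compressed stream but not the sink. Returns the number of bytes written.
     */
    static long copy(InputStream compressed, OutputStream sink) throws IOException {
        byte[] buffer = new byte[32 * 1024];
        long total = 0;
        try (GZIPInputStream in = new GZIPInputStream(compressed)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                sink.write(buffer, 0, n);
                total += n;
            }
        }
        return total;
    }
}
```

For case 1, something like Gunzip.copy(socket.getInputStream(), out) would be enough. For case 2, Gunzip.copy(new ByteArrayInputStream(chunk), out) could be called once per chunk with the same sink. For case 3, the call would be Gunzip.copy(new ChunkedInputStream(source), out), where source wraps the "give me the next chunk" request.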

-- chris
 
