incremental archive format for outputstream?

NOBODY · Feb 25, 2006

Hi,

Do you know of an 'incremental' archive format that would be suited for an
outputstream?

In other words, is there any archive format that can keep its existing
entries open and allow appending to them in an interleaved fashion? (an
incrementally updating archive?)

Let me explain.

Let's say I have 2 types (A and B) of CSV data to send.
A1.csv, A2.csv, A3.csv
B1.csv, B2.csv, B3.csv

I want to write to a stream an archive format that will contain 2 entries
(A and B) where A is the concatenation of A1+A2+A3, and B is the
concatenation of B1+B2+B3.

Now, imagine a zip file. It is easy enough to create a new zip entry A, and
push all A1, A2, A3 files in sequence, and create a second zip entry B and
push B1, B2, B3.

But here is the problem: the sequence is rolling (like a log4j
file-size-rolling appender), and by the time I have finished pushing A3, B1
will have rolled off. I want to push A1 B1, A2 B2, A3 B3.

So I cannot use Java's zip classes, at least not that I know of, to "append
to an existing entry" instead of calling putNextEntry().

Something smart like gzip (where you can concatenate independent gzip files
and the result is a valid single gzip file), but with support for multiple
entries (which gzip doesn't have), would be great!

Thanks.
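
The gzip property mentioned above can be checked with a small sketch. It assumes a reasonably recent JDK, where java.util.zip.GZIPInputStream continues into the next member of a concatenated stream; the class and method names (GzipConcat, gzip, gunzip, concat) are made up for the illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class: shows that independently compressed gzip
// members, joined byte for byte, decompress to the joined plain text.
class GzipConcat {

    /** Compresses one piece of text into a complete, standalone gzip member. */
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return out.toByteArray();
    }

    /** Joins byte arrays without touching their contents. */
    static byte[] concat(byte[]... parts) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] part : parts) {
            out.write(part, 0, part.length);
        }
        return out.toByteArray();
    }

    /** Decompresses everything readable from the given gzip bytes. */
    static String gunzip(byte[] data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(data))) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Three members compressed separately, then simply appended.
        byte[] joined = concat(gzip("A1\n"), gzip("A2\n"), gzip("A3\n"));
        System.out.print(gunzip(joined));
    }
}
```

This only covers a single logical stream, which is exactly the limitation described above: gzip has no notion of named entries, so A and B could not share one such file.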

Chris Uppal · Feb 26, 2006

NOBODY said:
I want to write to a stream an archive format that will contain 2 entries
(A and B) where A is the concatenation of A1+A2+A3, and B is the
concatenation of B1+B2+B3.

I doubt if that's possible in any existing archive format. Since the library
doesn't know how many "A" entries you are going to add, it doesn't know where
to put the "B" entries in the output file.

I suggest that you redesign. One simple option would be to use two (or more)
output archives which you write concurrently. A somewhat more complex, but
more elegant (IMO), option would be to layer your own "protocol" over an
existing archive format. So that you use what the archive code thinks of as
"files" as mere "chunks" in (logically) connected streams.

In the latter case, the archive would "think" that it contained:

A.csv/A1.csv
A.csv/A2.csv
B.csv/B1.csv
A.csv/A3.csv
B.csv/B2.csv
B.csv/B3.csv

but your code would interpret that as simply:

A.csv
B.csv

The ZIP file format (which has a table of contents) would be highly suitable
for the lower level of such a scheme, I think. Note that you can use any names
you like for the entries in a ZIP file -- they don't have to be names of real
files (nor even valid filenames).

-- chris

Andrey Kuznetsov · Feb 26, 2006

Chris Uppal said:
I doubt if that's possible in any existing archive format. Since the library
doesn't know how many "A" entries you are going to add, it doesn't know where
to put the "B" entries in the output file.

A possible solution could be to keep the table of contents in another file.
