Zip file: how to replace entry? Is that possible?

josh

Hello there,
after a huge amount of time with Google, I decided to ask you for help.

I am wondering whether it is possible, using the Java API, to take a ZIP
file and delete/add/replace some of the files inside it. Every example I
find shows how to create a new ZIP or how to read one, using
ZipFile, ZipEntry, Zip[Input/Output]Stream, etc., but not how to alter
an existing archive :(

Do you have any ideas?
 
RedGrittyBrick

Here's what occurs to me on reading your question ...

AFAIK, current operating systems don't provide any mechanism for
inserting data into, or deleting data from, the middle of a sequential
file. This applies whether it is a text file, a ZIP archive or any other
type of sequential file. You have to read the whole current file and
write a new file.

You can then delete the old file and rename the new file to the old
name, to maintain the illusion that you have inserted data into the
middle of a file.
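
For readers on Java 7 or later, the delete-and-rename step can be sketched with java.nio.file.Files.move. This is only an illustration: the class name ReplaceFile and the method replace are made up here, and the temp files in main stand in for the real old and new files.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: puts a freshly written file in place of the old one.
class ReplaceFile {

    static void replace(Path rewritten, Path original) throws IOException {
        // REPLACE_EXISTING removes the old file as part of the move,
        // so no separate delete call is needed.
        Files.move(rewritten, original, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path original = Files.createTempFile("old", ".txt");
        Path rewritten = Files.createTempFile("new", ".txt");
        Files.write(original, "old contents".getBytes(StandardCharsets.UTF_8));
        Files.write(rewritten, "new contents".getBytes(StandardCharsets.UTF_8));

        replace(rewritten, original);

        // The old name now refers to the new data.
        System.out.println(new String(
                Files.readAllBytes(original), StandardCharsets.UTF_8));
        Files.delete(original);
    }
}
```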

Obviously files used by databases are not treated as sequential files.
To insert data into the "middle" of a table held in a file I'd expect a
DBMS to overwrite one or more blocks and update an index. The data might
be written to an "empty" block or appended to the end of the file. The
database index is what makes the new data seem to be in the middle of
the table (when ordered by an indexed column).

A zip archive is a sequential file, not a database file.

None of the above is specific to Java.

So the answer to your question is to read the existing archive, stash
the extracted files in memory or in temporary disk space, then write a
new zip archive, including new items or skipping some items as needed.
You can read and write in parallel to reduce usage of temporary storage
(memory or disk).
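
A minimal sketch of that read-and-rewrite approach, using only java.util.zip and streaming each entry straight from the old archive to the new one. The class ZipRewriter and its methods are hypothetical names chosen for this example, and it assumes Java 7 or later for the syntax. It does not try to preserve per-entry metadata such as timestamps or the STORED method.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for illustration; not part of any library.
class ZipRewriter {

    /**
     * Copies the archive on 'in' to 'out'. Entries named in 'remove' are
     * skipped. Entries named in 'put' replace the entry of the same name,
     * or are added at the end if the source has no such entry.
     */
    static void rewrite(InputStream in, OutputStream out,
                        Set<String> remove, Map<String, byte[]> put)
            throws IOException {
        Map<String, byte[]> pending = new HashMap<>(put);
        ZipInputStream zin = new ZipInputStream(in);
        ZipOutputStream zout = new ZipOutputStream(out);
        byte[] buf = new byte[8192];
        for (ZipEntry e; (e = zin.getNextEntry()) != null; ) {
            String name = e.getName();
            if (remove.contains(name)) {
                continue; // delete: simply do not copy the entry
            }
            // A fresh ZipEntry lets the output stream compute the
            // sizes and CRC itself instead of trusting the old values.
            zout.putNextEntry(new ZipEntry(name));
            byte[] replacement = pending.remove(name);
            if (replacement != null) {
                zout.write(replacement); // replace
            } else {
                for (int n; (n = zin.read(buf)) != -1; ) {
                    zout.write(buf, 0, n); // copy unchanged
                }
            }
            zout.closeEntry();
        }
        for (Map.Entry<String, byte[]> add : pending.entrySet()) {
            zout.putNextEntry(new ZipEntry(add.getKey())); // add
            zout.write(add.getValue());
            zout.closeEntry();
        }
        zout.finish(); // writes the index but leaves 'out' open
    }

    /** Lists entry names in archive order (used by the demo). */
    static List<String> names(byte[] zip) throws IOException {
        List<String> result = new ArrayList<>();
        ZipInputStream zin =
                new ZipInputStream(new ByteArrayInputStream(zip));
        for (ZipEntry e; (e = zin.getNextEntry()) != null; ) {
            result.add(e.getName());
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream src = new ByteArrayOutputStream();
        try (ZipOutputStream z = new ZipOutputStream(src)) {
            for (String n : new String[] {"a.txt", "b.txt", "c.txt"}) {
                z.putNextEntry(new ZipEntry(n));
                z.write(n.getBytes("UTF-8"));
                z.closeEntry();
            }
        }
        Map<String, byte[]> put = new HashMap<>();
        put.put("a.txt", "changed".getBytes("UTF-8"));
        put.put("d.txt", "added".getBytes("UTF-8"));
        ByteArrayOutputStream dst = new ByteArrayOutputStream();
        rewrite(new ByteArrayInputStream(src.toByteArray()), dst,
                Collections.singleton("b.txt"), put);
        System.out.println(names(dst.toByteArray()));
    }
}
```

In the demo, a.txt is replaced in its original position, b.txt is dropped, c.txt is copied as-is, and d.txt is appended, so the listing is [a.txt, c.txt, d.txt].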

It's conceivable that some Java API provides a
zipArchive.insertFile(...) method that does all the above unseen in the
background. I'd have expected you to have found it, if it existed.
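
For readers on Java 7 or later: the zip file system provider reachable through java.nio.file.FileSystems comes fairly close to such a method. Below is a minimal sketch under that assumption; ZipFsEdit, put and get are names invented for this example. As far as I know the provider still writes out a new archive behind the scenes when the file system is closed, so treat it as a convenience rather than a true in-place update.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper built on the JDK's zip file system provider.
class ZipFsEdit {

    private static URI zipUri(Path zip) {
        return URI.create("jar:" + zip.toUri());
    }

    /** Adds or replaces one entry; pass null data to delete it. */
    static void put(Path zip, String entryName, byte[] data)
            throws IOException {
        Map<String, String> env = new HashMap<>();
        env.put("create", "true"); // create the archive if missing
        try (FileSystem fs = FileSystems.newFileSystem(zipUri(zip), env)) {
            Path entry = fs.getPath(entryName);
            if (data == null) {
                Files.deleteIfExists(entry);
            } else {
                if (entry.getParent() != null) {
                    Files.createDirectories(entry.getParent());
                }
                Files.write(entry, data); // creates or overwrites
            }
        } // closing the file system saves the changes to the archive
    }

    /** Reads one entry back through the same provider. */
    static byte[] get(Path zip, String entryName) throws IOException {
        Map<String, String> env = new HashMap<>();
        try (FileSystem fs = FileSystems.newFileSystem(zipUri(zip), env)) {
            return Files.readAllBytes(fs.getPath(entryName));
        }
    }

    public static void main(String[] args) throws IOException {
        // The archive path must not exist yet as an empty file,
        // so it is placed inside a fresh temp directory.
        Path dir = Files.createTempDirectory("zipfs");
        Path zip = dir.resolve("demo.zip");
        put(zip, "content.xml", "<old/>".getBytes(StandardCharsets.UTF_8));
        put(zip, "content.xml", "<new/>".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(
                get(zip, "content.xml"), StandardCharsets.UTF_8));
        Files.delete(zip);
        Files.delete(dir);
    }
}
```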

Just my $0.02 worth.
 
josh

Thanks for the help.
I did it as Chris Uppal suggested (in the link Andrew posted).

I am providing an attachment with source code in case anyone is
interested in my implementation. It saves the entire ODF file (an
OpenDocument Format file is a ZIP) from the source file, except
content.xml, which is replaced by an altered version.

Thanks again,
W.Sz.



import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import org.apache.commons.io.IOUtils; // Apache's Commons-IO

/**
 *
 * @author Witold Szczerba
 */

public class OpenDocumentSpreadsheet {
    public static final String CONTENT_FILE = "content.xml";
    private File odsTempFile;

    ......other things.......

    /** BETA version, needs improvements */
    public File save() throws Exception {
        File result = null;
        InputStream input = null;
        OutputStream output = null;
        ZipInputStream inZip = null;
        ZipOutputStream outZip = null;
        try {
            result = File.createTempFile("document", ".ods");
            input = new BufferedInputStream(
                    new FileInputStream(odsTempFile));
            output = new BufferedOutputStream(
                    new FileOutputStream(result));
            inZip = new ZipInputStream(input);
            outZip = new ZipOutputStream(output);

            ZipEntry in;
            while ((in = inZip.getNextEntry()) != null) {
                ZipEntry out;
                InputStream source;
                if (in.getName().equals(CONTENT_FILE)) {
                    /* replace content.xml with the altered version */
                    byte[] contentAsBytes = serializeDocument();
                    /* fresh entry: sizes and CRC are computed on write */
                    out = new ZipEntry(in.getName());
                    source = new ByteArrayInputStream(contentAsBytes);
                } else {
                    /* just copy from source ZIP */
                    out = new ZipEntry(in);
                    /* recompressed size may differ from the original */
                    out.setCompressedSize(-1);
                    source = inZip;
                }
                outZip.putNextEntry(out);
                IOUtils.copy(source, outZip); // Apache's Commons-IO
            }

        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            if (inZip != null) { inZip.closeEntry(); inZip.close(); }
            if (outZip != null) { outZip.closeEntry(); outZip.close(); }
            IOUtils.closeQuietly(input);
            IOUtils.closeQuietly(output);
            /* note: the old file is deleted even if the copy failed */
            odsTempFile.delete();
        }
        odsTempFile = result;
        return result;
    }
    ......other things.......
}
 
Guest

RedGrittyBrick said:
AFAIK Current operating systems don't provide any mechanism for
inserting data into, or deleting data from, the middle of a sequential
file. This applies whether it is a text file, a ZIP archive or any other
type of sequential file. You have to read the whole current file and
write a new file.

You can then delete old; rename new old; to maintain the illusion that
you have inserted data into the middle of a file.

Obviously files used by databases are not treated as sequential files.
To insert data into the "middle" of a table held in a file I'd expect a
DBMS to overwrite one or more blocks and update an index. The data might
be written to an "empty" block or appended to the end of the file. The
database index is what makes the new data seem to be in the middle of
the table (when ordered by an indexed column).

A zip archive is a sequential file, not a database file.

A zip file is *not* a sequential file in the meaning of the word
you use here.

The zip format does have the concept of indexes and length.

And as a consequence the content of a zip file can be updated.

Arne
 
Chris Uppal

Arne said:
A zip file is *not* a sequential file in the meaning of the word
you use here.

The zip format does have the concept of indexes and length.

And as a consequence the content of a zip file can be updated.

I'd say you are both about 2/3 right. Unfortunately, the bottom line is that
RedGrittyBrick's conclusion is correct -- you cannot update a ZIP file without
re-writing everything from the point at which you make the change up to the end
of the file.

(I don't think any sensible ZIP API would offer even that much flexibility --
better to keep it simple and only allow writing from the very start of the
file, or appending to the very end of the (logical) file. My own ZIP stuff
(not Java) allows both, but unless something has changed in 1.6/1.7[*], Java's
built-in ZIP handling doesn't support appending to files.)

The format of a ZIP is first a sequence of entries of variable sizes, followed
by an index to allow random access reading of the file (in case that should be
needed). So you can't change anything in the middle of the file without
rewriting that entry and everything that follows -- just like lines in text
files. There /is/ an index, and fast random access is supported, but it's
not the kind of ISAM-like index which would allow in-place modification of the
file.
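
To illustrate the read side of that layout, here is a small sketch, with made-up names (ZipIndexDemo, readEntry), that writes a three-entry archive and then fetches only the last entry through java.util.zip.ZipFile. ZipFile uses the index at the end of the archive, whereas ZipInputStream would have to walk the entries in order. The sketch assumes Java 7 or later for try-with-resources.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; the names are made up for illustration.
class ZipIndexDemo {

    /** Reads one entry by name, or returns null if it is absent. */
    static byte[] readEntry(File zip, String name) throws IOException {
        try (ZipFile zf = new ZipFile(zip)) {
            // ZipFile locates the entry through the index at the end
            // of the archive, not by scanning the entries before it.
            ZipEntry e = zf.getEntry(name);
            if (e == null) {
                return null;
            }
            try (InputStream in = zf.getInputStream(e)) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[8192];
                for (int n; (n = in.read(buf)) != -1; ) {
                    out.write(buf, 0, n);
                }
                return out.toByteArray();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        File zip = File.createTempFile("demo", ".zip");
        try (ZipOutputStream z =
                     new ZipOutputStream(new FileOutputStream(zip))) {
            String[] names = {"first.txt", "second.txt", "last.txt"};
            for (String n : names) {
                z.putNextEntry(new ZipEntry(n));
                z.write(("contents of " + n).getBytes("UTF-8"));
                z.closeEntry();
            }
        }
        byte[] data = readEntry(zip, "last.txt");
        System.out.println(new String(data, "UTF-8"));
        zip.delete();
    }
}
```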

-- chris

[*] I think there are changes in 1.6, but haven't looked for them yet, let
alone looked /at/ them ;-)
 
Witold Szczerba

A slightly better version in attachment.
I hope that will help someone eventually :)

W.Sz.


public File save() {
    File result = null;
    ZipInputStream inZip = null;
    ZipOutputStream outZip = null;
    try {
        result = File.createTempFile("document", ".ods");
        inZip = new ZipInputStream(
                new BufferedInputStream(
                        new FileInputStream(odsTempFile)));
        outZip = new ZipOutputStream(
                new FileOutputStream(result));

        for (ZipEntry in; (in = inZip.getNextEntry()) != null;) {
            ZipEntry out;
            InputStream source;
            if (in.getName().equals(CONTENT_FILE)) {
                // replace content.xml with the altered version
                byte[] contentAsBytes = serializeDocument();
                // fresh entry: sizes and CRC are computed on write
                out = new ZipEntry(in.getName());
                source = new ByteArrayInputStream(contentAsBytes);
            } else {
                // copy the entry unchanged from the source ZIP
                out = new ZipEntry(in);
                // recompressed size may differ from the original
                out.setCompressedSize(-1);
                source = inZip;
            }
            outZip.putNextEntry(out);
            IOUtils.copy(source, outZip); //Apache's Commons-IO
        }
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        IOUtils.closeQuietly(inZip); //Apache's Commons-IO
        IOUtils.closeQuietly(outZip); //Apache's Commons-IO
        // note: the old file is deleted even if the copy failed
        odsTempFile.delete();
    }
    odsTempFile = result;
    System.out.println("New file: " + odsTempFile.getPath());
    return result;
}


--------------------------------------------------
Now you can open it with the default application
in Java 6 like this:

File file = odsDocument.save();
if (java.awt.Desktop.isDesktopSupported())
    java.awt.Desktop.getDesktop().open(file);
 
Andrew Thompson

Witold Szczerba said:
A slightly better version in attachment.

Please note that this group was not intended
for messages with attachments*. As such, many
news servers will strip them.

* As far as I understand, in any case.

If you have some code worth sharing, please
simply post it in the body of a message,
perhaps delimited by..

<code>
</code>

...or..

<sscce>
</sscce>

etc.

Witold Szczerba said:
I hope that will help someone eventually :)

I was about to repost the code, then
realised it would probably be wrapped
and broken by Google Groups. I would
recommend reformatting code so that no
line extends beyond around 63 chars,
to ensure it is safe from (immediate,
before being quoted) line wrap.

Andrew T.
 
