Replacing files in a zip archive

  • Thread starter Дамјан Георгиевски

Дамјан Георгиевски

I'm writing a script that should modify ODF files. ODF files are just
.zip archives with some .xml files, images etc.

So far I open the zip file and play with the xml with lxml.etree, but I
can't replace the files in it.

Is there some recipe that does this?




--
дамјан ( http://softver.org.mk/damjan/ )

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
 

MRAB

Дамјан Георгиевски said:
I'm writing a script that should modify ODF files. ODF files are just
.zip archives with some .xml files, images etc.

So far I open the zip file and play with the xml with lxml.etree, but I
can't replace the files in it.

Is there some recipe that does this?

You'll have to create a new zip file and copy the files into that.
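
A minimal sketch of that approach, using only the standard zipfile module. The function name replace_members and the replacements dict are made up for illustration, not part of any library. It copies every member into a new archive and swaps in new bytes where given:

```python
import io
import zipfile


def replace_members(src, dst, replacements):
    """Copy the archive `src` to `dst`, swapping in new bytes for the
    member names listed in `replacements` (a dict of name -> bytes).

    `replace_members` is a hypothetical helper, not a zipfile API.
    """
    remaining = dict(replacements)
    with zipfile.ZipFile(src, "r") as zin, zipfile.ZipFile(dst, "w") as zout:
        for zinfo in zin.infolist():
            # Use the new data if supplied, otherwise copy the old bytes.
            data = remaining.pop(zinfo.filename, None)
            if data is None:
                data = zin.read(zinfo)
            # Passing the ZipInfo keeps the member order, timestamp and
            # compression type of the original entry.
            zout.writestr(zinfo, data)
        # Names that were not in the source archive are appended at the end.
        for name, data in remaining.items():
            zout.writestr(name, data)
    return dst
```

Both src and dst can be file names or file-like objects, so passing io.BytesIO() as dst should keep the result in memory. The original file is never modified; to "replace" it you would write to a temporary name and rename it over the old one afterwards.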
 

Дамјан Георгиевски

I'm writing a script that should modify ODF files. ODF files are just
.zip archives with some .xml files, images etc.

So far I open the zip file and play with the xml with lxml.etree, but
I can't replace the files in it.

Is there some recipe that does this?

I ended up writing this, a pretty specific subclass of ZipFile.
I've added 3 methods: one for getting the .xml content from the document, another for setting (notifying) back the
changes, and a third to save a copy of the ZipFile to another file (or in memory, e.g. a BytesIO).


from io import BytesIO
from zipfile import ZipFile

import lxml.etree


class ODFFile(ZipFile):
    def __init__(self, file, mode='r', compression=0, allowZip64=False):
        # forward the arguments instead of hard-coding the defaults
        ZipFile.__init__(self, file, mode=mode, compression=compression,
                         allowZip64=allowZip64)
        self.__unchanged = self.namelist()
        self.__updated = {}

    def get_xml(self, name):
        # parse <name>.xml from the archive into an lxml tree
        with self.open('%s.xml' % name) as fp:
            return lxml.etree.parse(fp)

    def set_xml(self, name, tree):
        # register a changed tree; tree=None drops the file from the copy
        name = '%s.xml' % name
        if name in self.__unchanged:
            self.__unchanged.remove(name)
        if tree is not None:
            self.__updated[name] = tree

    def save_changes(self, fp=None):
        if fp is None:
            fp = BytesIO()  # zip data is binary
        zo = ZipFile(fp, mode='w')
        updated = dict(self.__updated)
        for zinfo in self.infolist():
            name = zinfo.filename
            # just copy the unchanged
            if name in self.__unchanged:
                zo.writestr(zinfo, self.read(zinfo))
            # write the changed in the same order
            elif name in updated:
                s = lxml.etree.tostring(updated.pop(name))
                zo.writestr(name, s)
        # append the possible remaining (new?)
        for name, tree in updated.items():
            s = lxml.etree.tostring(tree)
            zo.writestr(name, s)
        zo.close()
        return fp



--
дамјан ( http://softver.org.mk/damjan/ )

Religion ends and philosophy begins,
just as alchemy ends and chemistry begins
and astrology ends, and astronomy begins.
 

Дамјан Георгиевски

Which will produce the same output as the original, confounding
your user. You could just write the new values out, since .read
picks the last entry (as I believe it should). Alternatively, if
you want to replace it "in place", you'll need a bit more smarts
when there is more than one copy of a file in the archive (when
z.namelist().count(filename) > 1).

I think I'll just hope that there will never be an ODF file like that,
and if there is, then it's not my problem :)
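
For anyone who would rather check than hope, here is a small sketch for spotting such archives. The helper names duplicate_names and make_archive_with_duplicate are made up for illustration; the second one only builds a test archive in memory:

```python
import io
import warnings
import zipfile
from collections import Counter


def duplicate_names(zf):
    """Return the member names that occur more than once in `zf`.

    `duplicate_names` is a hypothetical helper, not a zipfile API.
    """
    counts = Counter(zf.namelist())
    return sorted(name for name, n in counts.items() if n > 1)


def make_archive_with_duplicate():
    """Build an in-memory archive holding two entries named content.xml."""
    buf = io.BytesIO()
    with warnings.catch_warnings():
        # zipfile warns about the duplicate name; that is expected here.
        warnings.simplefilter("ignore", UserWarning)
        with zipfile.ZipFile(buf, "w") as z:
            z.writestr("content.xml", b"<first/>")
            z.writestr("content.xml", b"<second/>")
    return buf
```

A script could call duplicate_names() right after opening the file and refuse to continue if the list is not empty. As far as I can tell, reading by name in CPython's zipfile resolves to the last entry with that name, which matches what the quoted text says about .read.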


--
дамјан ( http://softver.org.mk/damjan/ )

Give me the knowledge to change the code I do not accept,
the wisdom not to accept the code I cannot change,
and the freedom to choose my preference.
 
