shelve in a ZipFile?

Terry Hancock · Jul 1, 2005

I only recently had a look at the shelve module, and it
looks pretty handy. My only question is: what if I really
want two shelves? Must I use two files?

Also, it seems strange that shelve only works with
filenames. I would've expected there to at least be
a variant that would put a shelve database on a file
that is already opened for read/write (that is, a file object).

That would be handy if, for example, I wanted to couple
my two shelf files (and compress them into the bargain)
by putting them into a single zip archive.

Of course, I could do this with temporary files,
but I wonder if there's a simpler way?

Scott David Daniels · Jul 1, 2005

Terry said:
I only just recently had a look at the shelve module....
That would be handy if, for example, I wanted to couple
my two shelf files (and compress them into the bargain)
by putting them into a single zip archive.

You are, however, fooling yourself if you think a shelve
solution can be made to gracefully interact with a
zip-compressed version. In order to zip the shelve data, it
must all be seen in one pass, so every update would necessarily
rewrite the entire shelve storage. It is much better to extract
the entire shelve file, operate on it, and re-compress it.

Even if uncompressed, the zip archive format is not going to
happily allow you to change the size of any of the "files" it
stores.
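
For what it's worth, the extract, operate, re-compress cycle described above might look roughly like the sketch below. The helper name zipped_shelves and the flat archive layout are made up for illustration. It unpacks into a temporary directory rather than a single temporary file because, depending on the dbm backend, one shelf may be stored as several files on disk.

```python
import os
import shelve
import tempfile
import zipfile
from contextlib import contextmanager


@contextmanager
def zipped_shelves(archive_path, names):
    """Yield {name: open shelf} for shelves stored inside one zip archive.

    Hypothetical helper: the whole archive is unpacked on entry and
    rewritten from scratch on exit, so it only suits small databases.
    """
    with tempfile.TemporaryDirectory() as tmp:
        # Extract: unpack any existing archive into the scratch directory.
        if os.path.exists(archive_path):
            with zipfile.ZipFile(archive_path) as zf:
                zf.extractall(tmp)
        # Operate: the shelves are ordinary files for the duration.
        shelves = {name: shelve.open(os.path.join(tmp, name)) for name in names}
        try:
            yield shelves
        finally:
            for shelf in shelves.values():
                shelf.close()
            # Re-compress: write every file the dbm backend produced.
            with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
                for fname in sorted(os.listdir(tmp)):
                    zf.write(os.path.join(tmp, fname), arcname=fname)
```

Usage would be along the lines of `with zipped_shelves("data.zip", ["first", "second"]) as s: s["first"]["key"] = value`. Note that every session rewrites the entire archive, which is exactly the cost described above.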

--Scott David Daniels

Andreas Kostyrka · Jul 1, 2005

On Friday, 01.07.2005, at 10:53 -0700, Scott David Daniels wrote:

You are, however, fooling yourself if you think a shelve
solution can be made to gracefully interact with a
zip-compressed version. In order to zip the shelve data, it
must all be seen in one pass, so every update would necessarily
rewrite the entire shelve storage. It is much better to extract
the entire shelve file, operate on it, and re-compress it.

Even if uncompressed, the zip archive format is not going to
happily allow you to change the size of any of the "files" it
stores.

It's even worse: shelve is basically a class that wraps a dictionary. It
provides a string -> picklable object dictionary built on top of a
string -> string dictionary. bsddb, gdbm, etc. probably access files via
low-level calls that cannot be intercepted.

One way to achieve your goals would be to add compression and/or a key
prefix (which would allow multiple dictionaries, or at least key spaces,
in one file).
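
A minimal sketch of the key-prefix idea, assuming string keys and prefixes that are not prefixes of one another. The class name KeySpace and the prefixes in the usage note are invented for illustration and are not part of shelve.

```python
import shelve
from collections.abc import MutableMapping


class KeySpace(MutableMapping):
    """Dictionary-like view of the keys in a shelf that share one prefix."""

    def __init__(self, shelf, prefix):
        self._shelf = shelf
        self._prefix = prefix

    def __getitem__(self, key):
        return self._shelf[self._prefix + key]

    def __setitem__(self, key, value):
        self._shelf[self._prefix + key] = value

    def __delitem__(self, key):
        del self._shelf[self._prefix + key]

    def __iter__(self):
        # Scans every key in the shelf, so this is O(total keys).
        n = len(self._prefix)
        for key in self._shelf:
            if key.startswith(self._prefix):
                yield key[n:]

    def __len__(self):
        return sum(1 for _ in self)
```

With this, `headers = KeySpace(shelf, "h:")` and `objects = KeySpace(shelf, "o:")` behave like two separate dictionaries backed by a single shelf file.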

Andreas


Terry Hancock · Jul 2, 2005

Scott David Daniels said:
You are, however, fooling yourself if you think a shelve
solution can be made to gracefully interact with a
zip-compressed version. In order to zip the shelve data, it
must all be seen in one pass, so every update would necessarily
rewrite the entire shelve storage. It is much better to extract
the entire shelve file, operate on it, and re-compress it.

Yeah, I've already decided to just go with the two linked
shelf files.

The only drawback to this is that the user can accidentally
separate the two files, which causes one of them to become
useless (the 1st shelf relies on named references to objects in
the 2nd shelf). I used to have problems like that with IRAF, which
stored image headers and images in two related files with
different file extensions. Once they're two separate units,
you have to consider the cases where they get separated.

OTOH, the 2nd shelf file has independent uses, which don't
depend on the 1st, so in that way, it's desirable for it to
be separable.

Probably if I was really worried about it, I'd come up with
a more serious solution, but this is a lightweight script, so
I think I'll just stick with shelve for convenience. Actually,
as small as the databases are in this project, I probably
could've just pickled a tuple of dictionaries (or something
similar). But I think shelves will probably do a better job.
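
For comparison, the pickled tuple of dictionaries mentioned above could be sketched as follows. The function names and the two-dictionary layout (headers, objects) are assumptions for illustration. Everything lives in one file, so nothing can get separated, but the whole file is read and rewritten each time.

```python
import pickle


def save_tables(path, headers, objects):
    """Write both dictionaries to a single file as one pickled tuple."""
    with open(path, "wb") as f:
        pickle.dump((headers, objects), f, protocol=pickle.HIGHEST_PROTOCOL)


def load_tables(path):
    """Read the (headers, objects) tuple back from the file."""
    with open(path, "rb") as f:
        headers, objects = pickle.load(f)
    return headers, objects
```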

Thanks though,
Terry

Terry Hancock · Jul 2, 2005

Andreas Kostyrka said:
It's even worse: shelve is basically a class that wraps a dictionary. It
provides a string -> picklable object dictionary built on top of a
string -> string dictionary. bsddb, gdbm, etc. probably access files via
low-level calls that cannot be intercepted.

One way to achieve your goals would be to add compression and/or a key
prefix (which would allow multiple dictionaries, or at least key spaces,
in one file).

Yeah, I'm already using a character prefix for header data in the
1st file. Right now, one of the headers tells where to find the 2nd file.
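
Roughly, the arrangement looks like the sketch below. The "#" prefix, the header name "#second_shelf" and the function open_linked are placeholders for illustration, not the names used in the actual script. It also shows one way to notice when the two files have been separated.

```python
import dbm
import os
import shelve

HEADER_PREFIX = "#"                          # placeholder prefix for header keys
SECOND_KEY = HEADER_PREFIX + "second_shelf"  # placeholder header name


def open_linked(first_path):
    """Open the 1st shelf, then the 2nd shelf that its header points to."""
    first = shelve.open(first_path)
    relative = first.get(SECOND_KEY)
    if relative is None:
        first.close()
        raise KeyError(SECOND_KEY)
    # The header stores a path relative to the directory of the 1st shelf.
    base = os.path.dirname(os.path.abspath(first_path))
    second_path = os.path.join(base, relative)
    try:
        # flag="r" refuses to create a new database if the file is gone.
        second = shelve.open(second_path, flag="r")
    except dbm.error:
        first.close()
        raise FileNotFoundError("2nd shelf is missing: %r" % second_path)
    return first, second
```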

Seems to be working okay.

I'm a little bothered by the idea of lumping both into one dictionary,
though I see this could be done the same way.

I wasn't really looking for a way to compress the data (just thought it
was a nice side benefit), but your post reminded me that I could do
it with zlib on the data *before* storing them in the shelf. I guess if
bulk becomes an issue I'll try that.
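
If it comes to that, a minimal sketch of the zlib idea might be the following. The helper names are made up, and it assumes the values are picklable. The shelf still pickles the compressed bytes, which adds only a few bytes of overhead per entry.

```python
import pickle
import shelve
import zlib


def put_compressed(shelf, key, obj, level=6):
    """Pickle and zlib-compress obj, then store the bytes in the shelf."""
    shelf[key] = zlib.compress(pickle.dumps(obj), level)


def get_compressed(shelf, key):
    """Reverse put_compressed: decompress, then unpickle."""
    return pickle.loads(zlib.decompress(shelf[key]))
```

Compression only pays off for values that are reasonably large or repetitive. Very small values can come out bigger than they went in.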
