shelve in a ZipFile?

Terry Hancock

I only just recently had a look at the shelve module, and it
looks pretty handy, my only question being, what if I really
want two shelves? Must I use two files?

Also, it seems strange that shelve only works with
filenames. I would've expected there to at least be
a variant that would put a shelve database on a file
that is opened for read/write (that is a file object).

That would be handy if, for example, I wanted to couple
(and compress into the bargain) by putting my two
shelf files into a single zip archive.

Of course, I could do this with temporary files,
but I wonder if there's a simpler way?
 
Scott David Daniels

Terry said:
I only just recently had a look at the shelve module....
That would be handy if, for example, I wanted to couple
(and compress into the bargain) by putting my two
shelf files into a single zip archive.

You are, however, fooling yourself if you think a shelve
solution can be made to gracefully interact with a zip-
compressed version. In order to zip the shelve data, it
must all be seen in a pass, so every update would necessarily
rewrite the entire shelve storage. Much better to extract the
entire shelve file, operate on it, and re-compress it.

Even if uncompressed, the zip archive format is not going to
happily allow you to change the size of any of the "files" it
stores.
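
A minimal sketch of the extract, operate, re-compress approach described
above, assuming Python 3. The helper name zipped_shelves and the shelf
names are made up for illustration. Because different dbm backends create
different files (some write several per shelf), the sketch archives
everything it finds in the temporary directory rather than guessing names.

```python
import os
import shelve
import tempfile
import zipfile
from contextlib import contextmanager


@contextmanager
def zipped_shelves(archive_path, names):
    """Yield {name: open shelf}, keeping all shelf files in one zip archive.

    Hypothetical helper: the archive is unpacked to a temporary directory,
    the shelves are opened there, and everything is re-zipped on exit.
    """
    with tempfile.TemporaryDirectory() as tmp:
        # Unpack an existing archive so the dbm backend sees ordinary files.
        if os.path.exists(archive_path):
            with zipfile.ZipFile(archive_path) as zf:
                zf.extractall(tmp)
        shelves = {name: shelve.open(os.path.join(tmp, name)) for name in names}
        try:
            yield shelves
        finally:
            for shelf in shelves.values():
                shelf.close()
            # Rewrite the whole archive; members are not updated in place.
            with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
                for filename in os.listdir(tmp):
                    zf.write(os.path.join(tmp, filename), arcname=filename)
```

Usage would look like "with zipped_shelves('pair.zip', ['first', 'second'])
as s: s['first']['key'] = value". The whole archive is rewritten every time
the block exits, which is the cost mentioned above.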

--Scott David Daniels
(e-mail address removed)
 
Andreas Kostyrka

On Friday, 01.07.2005, at 10:53 -0700, Scott David Daniels wrote:
You are, however, fooling yourself if you think a shelve
solution can be made to gracefully interact with a zip-
compressed version. In order to zip the shelve data, it
must all be seen in a pass, so every update would necessarily
rewrite the entire shelve storage. Much better to extract the
entire shelve file, operate on it, and re-compress it.

Even if uncompressed, the zip archive format is not going to
happily allow you to change the size of any of the "files" it
stores.

It's even worse: shelve is basically a class that wraps a dictionary. It
provides a dictionary string -> picklable object based on a dictionary
string -> string. bsddb, gdbm etc. probably access files via low-level
calls that are not interceptable.
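
To illustrate the wrapping, here is a small sketch, assuming Python 3,
that hands shelve.Shelf a plain in-memory dict instead of a dbm file.
It only shows the layering (encoded keys, pickled values); it does not
give you a shelf on an open file object.

```python
import pickle
import shelve

backing = {}                    # stands in for the string -> string database
shelf = shelve.Shelf(backing)   # adds the string -> picklable object layer

shelf["point"] = {"x": 1, "y": 2}

# The shelf pickles values and encodes keys before handing them on.
raw_key = "point".encode("utf-8")
stored = backing[raw_key]
restored = pickle.loads(stored)
```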

One way to achieve your goals would be to add compression and/or a key
prefix (which would allow multiple dictionaries or at least key spaces
in one file).
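
A rough sketch of the key-prefix idea, assuming Python 3. The class name
PrefixedView and the prefixes in the usage note are invented for
illustration. Choose prefixes so that none is the start of another, or
the key spaces will overlap.

```python
import shelve
from collections.abc import MutableMapping


class PrefixedView(MutableMapping):
    """A dict-like window onto the keys of `shelf` that start with `prefix`."""

    def __init__(self, shelf, prefix):
        self._shelf = shelf
        self._prefix = prefix

    def __getitem__(self, key):
        return self._shelf[self._prefix + key]

    def __setitem__(self, key, value):
        self._shelf[self._prefix + key] = value

    def __delitem__(self, key):
        del self._shelf[self._prefix + key]

    def __iter__(self):
        n = len(self._prefix)
        # Snapshot the keys so the shelf may be modified while iterating.
        for full_key in list(self._shelf.keys()):
            if full_key.startswith(self._prefix):
                yield full_key[n:]

    def __len__(self):
        return sum(1 for _ in self)
```

With one shelf opened as db, PrefixedView(db, "h:") and
PrefixedView(db, "o:") then behave like two separate dictionaries stored
in the same file.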

Andreas

 
Terry Hancock

You are, however, fooling yourself if you think a shelve
solution can be made to gracefully interact with a zip-
compressed version. In order to zip the shelve data, it
must all be seen in a pass, so every update would necessarily
rewrite the entire shelve storage. Much better to extract the
entire shelve file, operate on it, and re-compress it.

Yeah, I've already decided to just go with the two linked
shelf files.

The only drawback to this is that the user can accidentally
separate the two files, which causes one of them to become
useless (1st shelf relies on named references to objects in the
2nd shelf). I used to have problems like that with IRAF, which
stored image headers and images in two related files with
different file extensions. Once they're two separate units,
you have to consider the cases where they get separated.

OTOH, the 2nd shelf file has independent uses, which don't
depend on the 1st, so in that way, it's desirable for it to
be separable.

Probably if I was really worried about it, I'd come up with
a more serious solution, but this is a lightweight script, so
I think I'll just stick with shelve for convenience. Actually,
as small as the databases are in this project, I probably
could've just pickled a tuple of dictionaries (or something
similar). But I think shelves will probably do a better job.
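
For comparison, a sketch of the "pickle a tuple of dictionaries"
alternative, assuming Python 3 and data small enough to load and save in
one go. The function names are made up for illustration.

```python
import pickle


def save_pair(path, first, second):
    """Write both dictionaries to a single file as one pickled tuple."""
    with open(path, "wb") as f:
        pickle.dump((first, second), f, pickle.HIGHEST_PROTOCOL)


def load_pair(path):
    """Return the (first, second) tuple written by save_pair."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

This keeps the two dictionaries in one file, but every save rewrites
everything, and the 2nd dictionary can no longer be handed around alone.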

Thanks though,
Terry
 
Terry Hancock

It's even worse: shelve is basically a class that wraps a dictionary. It
provides a dictionary string -> picklable object based on a dictionary
string -> string. bsddb, gdbm etc. probably access files via low-level
calls that are not interceptable.

One way to achieve your goals would be to add compression and/or a key
prefix (which would allow multiple dictionaries or at least key spaces
in one file).

Yeah, I'm already using a character prefix for header data in the
1st file. Right now, one of the headers tells where to find the 2nd file.

Seems to be working okay.

I'm a little bothered by the idea of lumping both into one dictionary,
though I see this could be done the same way.

I wasn't really looking for a way to compress the data (just thought it
was a nice side benefit), but your post reminded me that I could do
it with zlib on the data *before* storing them in the shelf. I guess if
bulk becomes an issue I'll try that.
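
A minimal sketch of the zlib idea, assuming Python 3. The helper names
put_compressed and get_compressed are hypothetical. Each value is pickled
and compressed by hand, so the shelf itself only ever stores a bytes
object.

```python
import pickle
import shelve
import zlib


def put_compressed(shelf, key, obj):
    """Pickle and zlib-compress obj, then store the bytes under key."""
    shelf[key] = zlib.compress(pickle.dumps(obj, pickle.HIGHEST_PROTOCOL))


def get_compressed(shelf, key):
    """Inverse of put_compressed: decompress and unpickle the stored bytes."""
    return pickle.loads(zlib.decompress(shelf[key]))
```

Small values may not shrink at all, since compression adds its own
overhead, so this is probably only worth it for bulky, repetitive data.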
 
