Shelve operations are very slow and create huge files

Eric Wichterich · Nov 1, 2003

Hello Pythonistas,

I use Python shelves to store the results of MySQL queries (I use Python
for web scripting).
One script searches the MySQL database and stores the result; the next
script reads the shelf again and processes the result. But there is a
problem: if the second script is called too early, the error "(11,
'Resource temporarily unavailable')" occurs.
So I took a closer look at the file that the shelf generates. The
result list from the MySQL query contains 14,600 rows with 7 columns, but
the saved file is over 3 MB in size and contains over 230,000 lines (!),
which seems like far too much!

The following statements are used:

    dbase = shelve.open(filename)
    if dbase.has_key(key):  # overwrite objects stored with same key
        del dbase[key]
    dbase[key] = object
    dbase.close()

Any ideas?

Thanks,
Eric

Peter Otten · Nov 1, 2003

Eric said:
Hello Pythonistas,

I use Python shelves to store the results of MySQL queries (I use Python
for web scripting).
One script searches the MySQL database and stores the result; the next
script reads the shelf again and processes the result. But there is a
problem: if the second script is called too early, the error "(11,
'Resource temporarily unavailable')" occurs.
So I took a closer look at the file that the shelf generates. The
result list from the MySQL query contains 14,600 rows with 7 columns, but
the saved file is over 3 MB in size and contains over 230,000 lines (!),
which seems like far too much!

Let's see: 3 MB spread over 14,600 rows of 7 columns comes to roughly
thirty bytes per field.

Are thirty bytes per field, including administrative data, that much?
By the way, don't bother counting the lines in a file containing pickled
data; the text pickle protocol inserts a newline after each attribute unless
you specify binary mode, e.g.:

    shelve.open(filename, binary=True)
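
To illustrate why line counts say little about pickled data, here is a minimal sketch that pickles a made-up result list with the text protocol (0) and with a binary protocol (2). The `rows` data is a hypothetical stand-in for the query result, not Eric's actual data. Note that in current Python releases `shelve.open` takes a `protocol` argument rather than `binary`, and the default protocol is already a binary one.

```python
import pickle

# Hypothetical stand-in for the query result: 14,600 rows of 7 columns.
rows = [(i, "name%d" % i, i * 1.5, "x", "y", "z", None) for i in range(14600)]

# Protocol 0 is the text protocol: most values end with a newline.
text = pickle.dumps(rows, protocol=0)

# Protocol 2 is a binary protocol: values are length-prefixed instead.
binary = pickle.dumps(rows, protocol=2)

print("protocol 0:", len(text), "bytes,", text.count(b"\n"), "newlines")
print("protocol 2:", len(binary), "bytes")
```

With the text protocol, the number of "lines" is several times the number of rows, so a large line count alone does not indicate a problem.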

Eric said:
The following statements are used:

    dbase = shelve.open(filename)
    if dbase.has_key(key):  # overwrite objects stored with same key
        del dbase[key]
    dbase[key] = object
    dbase.close()

I've never used the shelve module so far, but the rule of least surprise
would suggest that

    if dbase.has_key(key):
        del dbase[key]
    dbase[key] = data

is the same as

    dbase[key] = data
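
A quick way to check this assumption is a small sketch like the following, which assigns to the same key twice and reads the value back. The temporary path and the key name `"result"` are made up for illustration.

```python
import os
import shelve
import tempfile

# Hypothetical location for the shelf file; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "results")

db = shelve.open(path)
db["result"] = [1, 2, 3]
db["result"] = [4, 5, 6]  # plain assignment, no del beforehand
db.close()

# Reopen the shelf and see what is stored under the key.
db = shelve.open(path)
stored = db["result"]
key_count = len(db)
db.close()

print(stored, key_count)
```

If the second assignment replaces the first, `stored` holds the second list and the shelf contains a single key.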

Eric said:
Any ideas?

Try to omit the shelve completely, preferably by moving the second script's
operations into the first. If you want to keep two scripts, don't invoke
them independently; write a little batch file or shell script instead.
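
As a rough sketch of the first suggestion, the two steps can live in one script and pass the result in memory, so no intermediate file is needed. Both `fetch_rows` and `process_rows` are hypothetical placeholders; the real query and processing code would go in their place.

```python
def fetch_rows():
    # Hypothetical stand-in for the MySQL query; returns 7-column rows.
    return [(i, "a", "b", "c", "d", "e", "f") for i in range(5)]


def process_rows(rows):
    # Hypothetical second step: count the rows and sum the first column.
    return len(rows), sum(row[0] for row in rows)


def main():
    rows = fetch_rows()        # formerly the first script
    return process_rows(rows)  # formerly the second script


if __name__ == "__main__":
    print(main())
```

Because the second step only runs after the first has returned, it cannot be called too early.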

If you need an intermediate step with a preprocessed snapshot of the MySQL
table, and you have sufficient rights, use a MySQL table for the temporary
data.

Peter
