Shelve operations are very slow and create huge files

  • Thread starter Eric Wichterich

Eric Wichterich

Hello Pythonistas,

I use Python shelves to store results from MySQL-Queries (using Python
for web scripting).
One script searches the MySQL-database and stores the result, the next
script reads the shelve again and processes the result. But there is a
problem: if the second script is called too early, the error "(11,
'Resource temporarily unavailable')" occurs.
So I took a closer look at the file that is generated by the shelf: the
result list from the MySQL query contains 14,600 rows with 7 columns. But
the saved file is over 3 MB and contains over 230,000 lines (!),
which seems like way too much!

Following statements are used:
dbase = shelve.open(filename)
if dbase.has_key(key):  # overwrite objects stored with same key
    del dbase[key]
dbase[key] = object
dbase.close()

Any ideas?

Thanks,
Eric
 
Peter Otten

Eric said:
Hello Pythonistas,

I use Python shelves to store results from MySQL-Queries (using Python
for web scripting).
One script searches the MySQL-database and stores the result, the next
script reads the shelve again and processes the result. But there is a
problem: if the second script is called too early, the error "(11,
'Resource temporarily unavailable')" occurs.
So I took a closer look at the file that is generated by the shelf: the
result list from the MySQL query contains 14,600 rows with 7 columns. But
the saved file is over 3 MB and contains over 230,000 lines (!),
which seems like way too much!

Let's see:
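
A rough back-of-the-envelope check, assuming "3 MB" means 3 * 1024 * 1024 bytes and ignoring whatever overhead the underlying database file adds:

```python
# Figures taken from the original post; the exact file size is an assumption.
rows, columns = 14600, 7
file_size = 3 * 1024 * 1024  # "over 3 MB", taken here as 3 MiB

fields = rows * columns
bytes_per_field = file_size / fields

print(fields)                      # total number of stored fields
print(round(bytes_per_field, 1))   # roughly thirty bytes per field
```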

Are thirty bytes per field, including administrative data, that much?
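By the way, don't bother counting the lines in a file containing pickled
data; the text pickle protocol (protocol 0, the default in Python 2) ends
each entry with a newline, unless you ask for a binary format. On versions
where the binary argument is deprecated or gone (it no longer exists in
Python 3), pass protocol=2 or higher instead. With the old argument, e. g.: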

shelve.open(filename, binary=True)
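
To illustrate why the line count says little, here is a small sketch that uses pickle directly (shelve pickles each stored value). The sample rows are made up for the example and are not the poster's data:

```python
import pickle

# Hypothetical stand-in for a query result: 100 rows with 3 columns each.
rows = [(i, "name%d" % i, 1.5 * i) for i in range(100)]

text = pickle.dumps(rows, protocol=0)    # text protocol, newline-terminated entries
binary = pickle.dumps(rows, protocol=2)  # binary protocol

# The text pickle contains several newlines per row, so "lines in the file"
# grows much faster than the number of rows.
newlines_in_text = text.count(b"\n")
print(len(rows), newlines_in_text)

# Both formats round-trip to the same data.
assert pickle.loads(text) == pickle.loads(binary) == rows
```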

Eric said:
Following statements are used:
dbase = shelve.open(filename)
if dbase.has_key(key):  # overwrite objects stored with same key
    del dbase[key]
dbase[key] = object
dbase.close()

I've never used the shelve module so far, but the rule of least surprise
would suggest that

if dbase.has_key(key):
    del dbase[key]
dbase[key] = data

is the same as

dbase[key] = data
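
A quick way to check that assumption is a sketch like the following, written for a current Python 3 (where shelves support the with statement). The key name and the temporary file location are made up for the example:

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "results")  # hypothetical shelf file name

    with shelve.open(path) as db:
        db["query"] = [1, 2, 3]
        db["query"] = ["a", "b"]  # plain assignment, no has_key/del first

    # Reopen the shelf to see what was actually stored.
    with shelve.open(path) as db:
        stored = db["query"]
        keys = list(db.keys())

print(stored, keys)
```

If the second assignment replaces the first value and leaves a single key behind, the explicit delete adds nothing.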

Eric said:
Any ideas?

Try to omit the shelf completely, preferably by moving the second script's
operations into the first. If you want to keep two scripts, don't invoke
them independently; write a small batch file or shell script that runs
them one after the other instead.
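
A minimal sketch of the first suggestion, with both steps in one script so the result is handed over in memory and no intermediate file is involved. The function names and the sample rows are hypothetical stand-ins for the real query and processing code:

```python
def fetch_rows():
    # Hypothetical placeholder for the MySQL query in the first script.
    return [(1, "alpha"), (2, "beta")]


def process_rows(rows):
    # Hypothetical placeholder for the work the second script does.
    return [name.upper() for _id, name in rows]


def main():
    rows = fetch_rows()        # step 1: run the query
    return process_rows(rows)  # step 2: runs only after step 1 has finished


result = main()
print(result)
```

Because the second step is called from the same process, it cannot start before the first one has produced its data.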

If you need an intermediate step with a preprocessed snapshot of the MySQL
table, and you have sufficient rights, use a MySQL table for the temporary
data.

Peter
 
