Shelve operations are very slow and create huge files

Discussion in 'Python' started by Eric Wichterich, Nov 1, 2003.

  1. Hello Pythonistas,

    I use Python shelves to store results from MySQL queries (using Python
    for web scripting).
    One script searches the MySQL database and stores the result; the next
    script reads the shelf again and processes the result. But there is a
    problem: if the second script is called too early, the error "(11,
    'Resource temporarily unavailable')" occurs.
    So I took a closer look at the file that is generated by the shelf: the
    result list from the MySQL query contains 14,600 rows with 7 columns, but
    the saved file is over 3 MB and contains over 230,000 lines (!),
    which seems way too much!

    The following statements are used:
    dbase = shelve.open(filename)
    if dbase.has_key(key): #overwrite objects stored with same key
        del dbase[key]
    dbase[key] = object
    dbase.close()

    Any ideas?

    Thanks,
    Eric
     
    Eric Wichterich, Nov 1, 2003
    #1

  2. Eric Wichterich

    Peter Otten Guest

    Eric Wichterich wrote:

    > Hello Pythonistas,
    >
    > I use Python shelves to store results from MySQL queries (using Python
    > for web scripting).
    > One script searches the MySQL database and stores the result; the next
    > script reads the shelf again and processes the result. But there is a
    > problem: if the second script is called too early, the error "(11,
    > 'Resource temporarily unavailable')" occurs.
    > So I took a closer look at the file that is generated by the shelf: the
    > result list from the MySQL query contains 14,600 rows with 7 columns, but
    > the saved file is over 3 MB and contains over 230,000 lines (!),
    > which seems way too much!


    Let's see:

    >>> 3*2**20/14600/7
    30.780117416829746


    Is thirty bytes per field, including administrative data, really that much?
    By the way, don't bother counting the lines in a file containing pickled
    data; the text pickle protocol inserts a newline after almost every item,
    unless you specify the binary mode, e.g.:

    shelve.open(filename, binary=True)
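
    The line-count effect can be checked without a shelf at all. The sketch
    below is written for current Python 3, where shelve.open() takes a
    protocol argument rather than binary, and it pickles a made-up list of
    14,600 seven-column rows (the row contents are an assumption, not the
    real query result) with the text protocol 0 and the binary protocol 2:

```python
import pickle

# Hypothetical stand-in for the MySQL result: 14,600 rows of 7 columns.
rows = [(i, "name%d" % i, i * 1.5, None, "x", i % 7, "2003-11-01")
        for i in range(14600)]

text_pickle = pickle.dumps(rows, protocol=0)    # ASCII (text) protocol
binary_pickle = pickle.dumps(rows, protocol=2)  # binary protocol

# Count newline bytes, i.e. what a "line count" of the file would report.
text_lines = text_pickle.count(b"\n")
binary_lines = binary_pickle.count(b"\n")

print("protocol 0:", len(text_pickle), "bytes,", text_lines, "newlines")
print("protocol 2:", len(binary_pickle), "bytes,", binary_lines, "newlines")
```

    With the text protocol the newline count is several times the number of
    rows, so a large line count says little about how much data was stored.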

    > The following statements are used:
    > dbase = shelve.open(filename)
    > if dbase.has_key(key): #overwrite objects stored with same key
    >     del dbase[key]
    > dbase[key] = object
    > dbase.close()


    I've never used the shelve module so far, but the rule of least surprise
    would suggest that

    if dbase.has_key(key):
        del dbase[key]
    dbase[key] = data

    is the same as

    dbase[key] = data
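
    That expectation is easy to test. A minimal sketch for current Python 3,
    using a throwaway shelf in a temporary directory (the file name
    "results" and the key "query" are made up for the example):

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "results")  # hypothetical shelf file name

    db = shelve.open(path)
    try:
        db["query"] = [1, 2, 3]
        db["query"] = ["a", "b"]   # plain assignment, no del beforehand
        stored = db["query"]       # value read back from the shelf
        keys = list(db.keys())     # keys currently in the shelf
    finally:
        db.close()

print(stored, keys)
```

    If plain assignment replaces the old value, stored holds only the second
    list and the shelf has a single key.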

    > Any ideas?


    Try to omit the shelf completely, preferably by moving the second script's
    operations into the first. If you want to keep two scripts, don't invoke
    them independently; use a small batch file or shell script that runs them
    one after the other instead.
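
    Folding both steps into one script could look roughly like the sketch
    below. fetch_rows() and process_rows() are hypothetical placeholders for
    the MySQL query and for the second script's processing; the point is only
    that the result is handed over in memory, so no intermediate file exists
    that could be read too early.

```python
def fetch_rows():
    """Hypothetical stand-in for the first script's MySQL query."""
    return [(1, "alpha"), (2, "beta")]


def process_rows(rows):
    """Hypothetical stand-in for the second script's processing."""
    return [name.upper() for _, name in rows]


def main():
    rows = fetch_rows()        # step 1: run the query
    return process_rows(rows)  # step 2: process the result directly


result = main()
print(result)
```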

    If you need an intermediate step with a preprocessed snapshot of the MySQL
    table, and you have sufficient rights, use a MySQL table for the temporary
    data.

    Peter
     
    Peter Otten, Nov 1, 2003
    #2