Serious problem with Shelve

Discussion in 'Python' started by Rami A. Kishek, Aug 17, 2003.

  1. Hi - this mysterious behavior with shelve is just about to kill me. I
    hope someone here can shed some light. First of all, I have this piece
    of code which uses shelve to save instances of some class I define. It
    works perfectly on an old machine (PII-400) running Python 2.2.1 under
    RedHat Linux 8.0. When I try to run it under Python for Windows ME on a
    P-4 1.4 GHz, however, it keeps crashing on reading from the shelved file
    the second time I try to access it. The Windows machine was originally
    running Python 1.5.2, so I upgraded to 2.2.3, thinking that would solve
    the problem, but it didn't!

    This is what the error looks like:
    tmprec = myrecs[key]
    File "D:\PROGRAMS\PYTHON22\lib\shelve.py", line 70, in __getitem__
    f = StringIO(self.dict[key])
    KeyError: A_G_08631616188
    ^

    Notes:
    Here's what my program does (it is too much code to include here).
    I have 4 related modules: one containing the class definitions (in all
    other modules I use from classfile import ___); the second module builds
    the shelve file by parsing a large text file containing the data,
    building classes; the third re-opens the file later to do reading and
    writing operations; and the 4th module is a GUI controller that simply
    calls the appropriate functions from the other 2 modules.

    The main breakdown occurs in module 3. Significantly, I initially had
    this module set up as a script in which everything was done on the
    module level, and it was working fine (apparently). The problems
    started appearing when I wrapped code inside functions (I need to do
    that since I want to call it from other modules, and I have about 4000
    lines of code altogether!). I spent painstaking hours trying to isolate
    the problem - I pass the open shelve file as a parameter to all the
    functions that need it, and I close it properly using try: finally
    statements after every use. I also make sure all the keys that go in
    there are unique.

    What module 3 does is a series of short reads and writes to the shelve
    file. First I test if a particular key is in there - if it is not, I
    add an item, if it is, I read the existing item, update it, then write
    it back like this:

    tmprec = myrecs[key]    # I read a particular instance from the shelve file
    tmprec.field = 1        # I update one field
    #del myrecs[key]        # Commented lines are things I tried while debugging
    #myrecs.sync()
    myrecs[key] = tmprec    # Then I write it back to the shelve file
    #myrecs.sync()

    This one function appears to be the guilty party. When I comment it
    out, the crash stops. However, it is a vital function for my program and
    I need to do it. Note that deleting the original item before rewriting
    it helped reduce the frequency of crashes, but didn't eliminate them
    completely. The other possibility (which is why I unsuccessfully tried
    the .sync() lines) is that it has to do with the timing of writing to
    disk. The library reference is vague about this, saying that shelve is
    incapable of simultaneous reads and writes, so the file shouldn't be
    opened twice for write. However, it does not say whether this implies we
    cannot read and write like this in quick succession.
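For readers following along, the add-or-update cycle described above can be sketched roughly as follows. This is a minimal sketch in current Python 3, not the poster's 2.2 code: the function names `touch_record`, `read_record` and `demo` are hypothetical, and a plain dict stands in for the class instance so the example stays self-contained. The one point it relies on is that an object fetched from a shelf is an unpickled copy, so a change only persists once the object is assigned back under its key.

```python
import os
import shelve
import tempfile


def touch_record(path, key):
    """Add the key if it is missing; otherwise read, update one field, write back."""
    myrecs = shelve.open(path)
    try:
        if key in myrecs:
            tmprec = myrecs[key]      # unpickled copy of the stored record
            tmprec["field"] = 1       # changes the copy only
            myrecs[key] = tmprec      # reassigning stores the change
        else:
            myrecs[key] = {"field": 0}
    finally:
        myrecs.close()                # always close, even after an error


def read_record(path, key):
    """Open the shelf read-only and return one record."""
    myrecs = shelve.open(path, flag="r")
    try:
        return myrecs[key]
    finally:
        myrecs.close()


def demo():
    """Run the cycle twice on a throwaway shelf (hypothetical file name)."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "myrecs")
        touch_record(path, "A_G_0863161618")   # first call adds the record
        first = read_record(path, "A_G_0863161618")
        touch_record(path, "A_G_0863161618")   # second call updates it
        second = read_record(path, "A_G_0863161618")
        return first, second
```

Opening and closing the shelf inside each function keeps every read and write between a matching open and close, which is the pattern the post describes with try: finally.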

    More details:
    * The first run of module 3 after creating the shelve file doesn't
    crash, although I suspect it is doing something funny.
    * The second time I get that error above, keeping in mind I am supposed
    to have a key in there called "A_G_0863161618" (without the extra '8' at
    the end), so the database is already corrupted. So the key
    'A_G_08631616188' is in myshelvefile.keys(), the original is no more,
    yet NEITHER can be accessed using myshelvefile[key]!
    * After creation, the shelve file size is only 71 kB. After running
    module 3 - which is supposed to mostly read and not really change the
    file much - the size jumps to 110 kB!
    * If I open the file in a text editor, I notice all sorts of things that
    are not supposed to be there (like directory paths, etc), indicating it
    is corrupted. I do not see those things when I open the file on the
    good (Linux) machine.
    * I did a scandisk to ensure the disk is OK and it is.
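One way to confirm the symptom in the list above (keys that show up in keys() but cannot be fetched) is to walk every key and try to read it back. The following is a rough sketch in current Python 3 under stated assumptions: `unreadable_keys` and `demo` are hypothetical names, and the "broken" entry is planted on purpose by writing non-pickle bytes through the underlying dbm file, which only imitates corruption and says nothing about what actually damaged the original database.

```python
import dbm
import os
import shelve
import tempfile


def unreadable_keys(path):
    """Return the keys a shelf lists but whose values cannot be read back."""
    bad = []
    db = shelve.open(path, flag="r")
    try:
        for key in list(db.keys()):
            try:
                db[key]                 # forces a lookup and an unpickle
            except Exception:           # KeyError, unpickling errors, ...
                bad.append(key)
    finally:
        db.close()
    return bad


def demo():
    """Check a healthy shelf, then the same shelf with one planted bad entry."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "myrecs")
        shelf = shelve.open(path)
        try:
            shelf["A_G_0863161618"] = {"field": 0}
        finally:
            shelf.close()
        healthy = unreadable_keys(path)

        # Imitate corruption: store bytes that are not a valid pickle.
        raw = dbm.open(path, "c")
        try:
            raw[b"broken"] = b"not a pickle"
        finally:
            raw.close()
        return healthy, unreadable_keys(path)
```

Running a check like this right after the file is built, and again after the read/write pass, would show at which step the first bad key appears.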
     
    Rami A. Kishek, Aug 17, 2003
    #1

  2. On Mon, 2003-08-18 at 03:04, Rami A. Kishek wrote:
    > Hi - this mysterious behavior with shelve is just about to kill me. I
    > hope someone here can shed some light. First of all, I have this piece
    > of code which uses shelve to save instances of some class I define. It
    > works perfectly on an old machine (PII-400) running Python 2.2.1 under
    > RedHat Linux 8.0. When I try to run it under Python for windows ME on a
    > P-4 1.4 GHz, however, it keeps crashing on reading from the shelved file
    > the second time I try to access it. The Windows machine was originally
    > running python 1.5.2, so I upgraded to 2.2.3, thinking that would solve
    > the problem, but it didn't!


    In Python 2.2 or earlier, by default, shelve uses the Berkeley database
    1.8 libraries, which we have found to be seriously broken on all
    platforms we have tried them on. Upgrading to a later version of the
    Berkeley libraries and using the pybsddb module fixed the mysterious,
    inconsistent crashes and segfaults we were seeing with shelve (and which
    were also driving us crazy). The easiest way to upgrade is to move to
    Python 2.3, which includes these later versions, but you can also
    easily install them under earlier version of Python (at least under
    2.2).
    --

    Tim C

     
    Tim Churches, Aug 17, 2003
    #2

  3. Well - I installed Python 2.3, but it still doesn't work. My program now
    crashes on the first pass. After deleting the old databases and
    creating new ones, I opened them for read and this is what I get:

    self.revs = shelve.open(os.path.join(tgtdir, dbfn))
    File "D:\PROGRAMS\PYTHON23\lib\shelve.py", line 231, in open
    return DbfilenameShelf(filename, flag, protocol, writeback, binary)
    File "D:\PROGRAMS\PYTHON23\lib\shelve.py", line 212, in __init__
    Shelf.__init__(self, anydbm.open(filename, flag), protocol,
    writeback, binary)
    File "D:\PROGRAMS\PYTHON23\lib\anydbm.py", line 82, in open
    mod = __import__(result)
    ImportError: No module named bsddb185


    I will try enclosing that import bsddb185 in anydbm.py in try: except:,
    though I hate messing around with source files, and there may be many
    more such problems. Python developers, be aware of this glitch.


    Tim Churches wrote:
    >
    > > of code which uses shelve to save instances of some class I define.
    > > it keeps crashing on reading from the shelved file
    > > the second time I try to access it.

    >
    > In Python 2.2 or earlier, by default, shelve uses the Berkeley database
    > 1.8 libraries, which we have found to be seriously broken on all
    > platforms we have tried them on. Upgrading to a later version of the
    > Berkeley libraries and using the pybsddb module fixed the mysterious,
    > inconsistent crashes and segfaults we were seeing with shelve (and which
    > were also driving us crazy). The easiest way to upgrade is to move to
    > Python 2.3, which includes these later versions, but you can also
    > easily install them under earlier version of Python (at least under
    > 2.2).
    > --
     
    Rami A. Kishek, Aug 19, 2003
    #3
  4. On Tue, 19 Aug 2003, Rami A. Kishek wrote:

    > File "D:\PROGRAMS\PYTHON23\lib\shelve.py", line 231, in open
    > return DbfilenameShelf(filename, flag, protocol, writeback, binary)
    > File "D:\PROGRAMS\PYTHON23\lib\shelve.py", line 212, in __init__
    > Shelf.__init__(self, anydbm.open(filename, flag), protocol,
    > writeback, binary)
    > File "D:\PROGRAMS\PYTHON23\lib\anydbm.py", line 80, in open
    > raise error, "db type could not be determined"
    > error: db type could not be determined
    >
    > Incidentally, on the other machine I mentioned (the one on which shelve
    > worked perfectly with 2.2.3) shelve still works perfectly after
    > upgrading to 2.3. Since that is a Linux 2 machine, I figure perhaps it
    > is using a different db like gdbm or something ...


    Your shelve file is in DB v1.85 format. Commenting out the lines in
    whichdb.py didn't do anything except deny the shelve module information
    about what the format actually _is_.

    You'll need to find/build a v1.85 compatible module to read the shelve
    then write it out in a later format.

    --
    Andrew I MacIntyre "These thoughts are mine alone..."
    Snail: PO Box 370, Belconnen ACT 2616, Australia
    Web: http://www.andymac.org/
     
    Andrew MacIntyre, Aug 19, 2003
    #4
  5. Rami> Well - I installed Python 2.3, but it still doesn't. My program
    Rami> now crashes on the first pass. After deleting the old databases
    Rami> and creating new ones, I opened them for read and this is what I
    Rami> get:

    How did you create those new databases, using an older version of Python
    perhaps? What's happening is that whichdb.whichdb() determined that the
    file you passed into anydbm.open() was an old hash style database, which can
    only be opened in Python 2.3 by the old v 1.85 library, which is only
    exposed through the bsddb185 module.

    Rami> I will try enclosing that import bsddb185 in anydbm.py in try:
    Rami> except:, though I hate messing around with source files, and there
    Rami> may be many more such problems. Python developers, be aware of
    Rami> this glitch.

    That won't work. What's anydbm.open() going to use to open the file?

    Can you explain how the files were created? (Sorry if you explained
    already. I'm just coming to this thread.)

    If you have Python 2.1 or 2.2 lying around with a bsddb module which can
    read the file in question, use Tools/scripts/db2pickle.py to convert the
    file to a pickle, then with Python 2.3, run Tools/scripts/pickle2db.py to
    convert the pickle back to a db file, using the new bsddb. Those two
    scripts are in the Python 2.3 distribution, but not the Python 2.2
    distribution. They should work with Python 2.1 or 2.2, however. This
    problem is exactly why I wrote them.

    Synopsis:

    python2.2 db2pickle.py olddbfile pickle.pck
    python2.3 pickle2db.py newdbfile pickle.pck
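The idea behind the two scripts (dump every key/value pair to a neutral pickle file, then load the pairs into a freshly created database) can be sketched in a few lines. This is only an illustration in current Python 3 using the generic dbm module, not the code of db2pickle.py or pickle2db.py; the function names and the second key in `demo` are hypothetical. In the situation described above the two halves would run under different interpreters, whereas here they share one process for brevity.

```python
import dbm
import os
import pickle
import tempfile


def db_to_pickle(dbfile, picklefile):
    """Write every (key, value) pair of a dbm file as a sequence of pickles."""
    db = dbm.open(dbfile, "r")
    try:
        with open(picklefile, "wb") as out:
            for key in db.keys():
                pickle.dump((key, db[key]), out)
    finally:
        db.close()


def pickle_to_db(picklefile, dbfile):
    """Create a new dbm file and fill it from a sequence of pickled pairs."""
    db = dbm.open(dbfile, "n")          # "n" always starts a new, empty database
    try:
        with open(picklefile, "rb") as src:
            while True:
                try:
                    key, value = pickle.load(src)
                except EOFError:        # no more pickles in the file
                    break
                db[key] = value
    finally:
        db.close()


def demo():
    """Round-trip a small database through a pickle file."""
    with tempfile.TemporaryDirectory() as tmpdir:
        old = os.path.join(tmpdir, "olddbfile")
        new = os.path.join(tmpdir, "newdbfile")
        pck = os.path.join(tmpdir, "pickle.pck")
        db = dbm.open(old, "c")
        try:
            db[b"A_G_0863161618"] = b"payload one"
            db[b"example_key"] = b"payload two"   # hypothetical second entry
        finally:
            db.close()
        db_to_pickle(old, pck)
        pickle_to_db(pck, new)
        db = dbm.open(new, "r")
        try:
            return {key: db[key] for key in db.keys()}
        finally:
            db.close()
```

Because the intermediate file holds nothing but pickled byte strings, it does not depend on which database library wrote the original file, which is what makes the two-step route useful across library versions.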

    Skip
     
    Skip Montanaro, Aug 19, 2003
    #5
  6. Rami> Incidentally, on the other machine I mentioned (the one on which
    Rami> shelve worked perfectly with 2.2.3) shelve still works perfectly
    Rami> after upgrading to 2.3. Since that is a Linux 2 machine, I figure
    Rami> perhaps it is using a different db like gdbm or something ...

    Try this using Python 2.2.3 and Python 2.3:

    import os, whichdb
    print(whichdb.whichdb(os.path.join(tgtdir, dbfn)))

    and see what it prints. That will keep you from guessing about the nature
    of the file.
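For anyone reading this with a current interpreter: in Python 3 the same check is available as dbm.whichdb(). The sketch below is an illustration under stated assumptions, with `describe_db` and `demo` as hypothetical helper names. It relies on the documented return values: None when the file cannot be opened, an empty string when the format is not recognised, and otherwise the name of the module that can read the file. The demo creates its test database with dbm.dumb so the result does not depend on which database libraries happen to be installed.

```python
import dbm
import dbm.dumb
import os
import tempfile


def describe_db(path):
    """Summarise what dbm.whichdb() reports for a database path."""
    kind = dbm.whichdb(path)
    if kind is None:
        return "missing or unreadable"
    if kind == "":
        return "format not recognised"
    return "readable with " + kind


def demo():
    """Probe a real database, a junk file, and a path that does not exist."""
    with tempfile.TemporaryDirectory() as tmpdir:
        good = os.path.join(tmpdir, "good")
        db = dbm.dumb.open(good, "c")
        try:
            db[b"A_G_0863161618"] = b"payload"
        finally:
            db.close()

        junk = os.path.join(tmpdir, "junk")
        with open(junk, "wb") as f:
            f.write(b"this is not a database file at all")

        missing = os.path.join(tmpdir, "missing")
        return [describe_db(p) for p in (good, junk, missing)]
```

A result naming a module that the running interpreter cannot import is the same situation as the "No module named bsddb185" traceback earlier in the thread.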

    Skip
     
    Skip Montanaro, Aug 19, 2003
    #6
  7. Thanks. With your help, I figured out one of the databases accessed WAS
    created with an older Python, so I simply cleaned up that one and now
    everything works!


     
    Rami A. Kishek, Aug 19, 2003
    #7
