default shelve on linux corrupts, does different DB system help?

Paul Sijben

I have the problem that my shelve(s) sometimes get corrupted (it looks like
this happens after Python has run out of threads).

I am using the default shelve, so on Linux I get the dbhash version.
Is there a different DB type I can choose that is known to be more
resilient? And if so, what is the elegant way of doing that?

Paul
 
skip

Paul> I have the problem that my shelve(s) sometimes get corrupted (it
Paul> looks like this happens after Python has run out of threads).

Paul> I am using the default shelve, so on Linux I get the dbhash
Paul> version. Is there a different DB type I can choose that is known
Paul> to be more resilient? And if so, what is the elegant way of doing
Paul> that?

You don't say what version of Python you're using or what version of the
Berkeley DB library underpins your installation, but I am going to guess it
is 1.85. Its hash file implementation has been known to have serious bugs
for over a decade. (The btree and recno formats are OK. Unfortunately, the
hash file implementation is what everybody has always gravitated to. Sort
of like moths to a flame...)

If that's the case, simply pick some other dbm file format for your shelves,
e.g.:
>>> import gdbm
>>> import shelve
>>> f = gdbm.open("/tmp/trash.db", "c")
>>> f.close()
>>> db = shelve.open("/tmp/trash.db")
>>> db["mike"] = "sharon"
>>> db["4"] = 5
>>> db.keys()
['4', 'mike']
>>> db.close()
>>> f = gdbm.open("/tmp/trash.db", "c")
>>> f.keys()
['4', 'mike']
>>> f['4']
'I5\n.'
>>> f['mike']
"S'sharon'\np1\n."
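
The session above works because shelve.open() detects the format of an existing file, so creating the file with gdbm first pins the backend. For readers on Python 3, where the gdbm module is named dbm.gnu, here is a rough sketch of the same idea that wraps an explicitly chosen dbm object in shelve.Shelf. The helper name open_shelf, the temporary path, and the fallback to dbm.dumb when GNU dbm is not compiled in are my own assumptions, not part of the original advice.

```python
import dbm
import os
import shelve
import tempfile

try:
    import dbm.gnu as backend   # GNU dbm, if this Python was built with it
except ImportError:
    import dbm.dumb as backend  # assumption: pure-Python fallback, always available


def open_shelf(path, flag="c"):
    """Hypothetical helper: open a shelf on an explicitly chosen dbm backend."""
    return shelve.Shelf(backend.open(path, flag))


# Use a scratch directory so the example does not touch real data.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "trash")

with open_shelf(path) as db:
    db["mike"] = "sharon"
    db["4"] = 5

# whichdb() reports which dbm flavour created the file on disk.
detected = dbm.whichdb(path)

# A plain shelve.open() now picks up the existing format automatically.
with shelve.open(path) as db:
    restored = dict(db)
```

Once the file exists in the chosen format, the rest of the application can keep calling shelve.open() unchanged.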

As for "uncorrupting" your existing database, see if your Linux distribution
has a db_recover program. If it does, you might be able to retrieve your
data, though in the case of BerkDB 1.85's hash file I'm skeptical that can
be done. I hope you weren't storing something valuable in it like your bank
account passwords.
 
Paul Sijben

Thanks very much for a clear and concise explanation of the problem and
the solution!

I am implementing it now in my system. Luckily we caught this one during
testing so no important data has been lost.

Unfortunately, Windows does not seem to support gdbm. But in our case,
everything that is on the Windows client is also available on the Linux
server, so we can recreate the DB at the expense of some bandwidth in
case of failures.
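
A rough sketch of that recreate-on-failure idea, written for Python 3 and assuming the shelf lives in its own directory. The names open_or_rebuild and fetch_from_server are hypothetical, and the "server" here is just a stand-in function. Note that this only catches a database that cannot be opened at all; it does not detect damaged individual records, and because dbm.error includes OSError, something like a permissions problem would also trigger a rebuild.

```python
import dbm
import glob
import os
import shelve
import tempfile


def open_or_rebuild(path, fetch_all):
    """Open the shelf at `path`; if it cannot be opened, rebuild it from fetch_all()."""
    try:
        return shelve.open(path)
    except dbm.error:
        # Remove every file that belongs to the damaged database
        # (some backends add suffixes such as .dat or .dir).
        for stale in glob.glob(glob.escape(path) + "*"):
            os.remove(stale)
        db = shelve.open(path, "n")  # "n" always creates a fresh database
        db.update(fetch_all())
        db.sync()
        return db


# Simulate a corrupted client-side file with some garbage bytes.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "client_cache")
with open(path, "wb") as handle:
    handle.write(b"this is not a dbm file at all")


def fetch_from_server():
    # Hypothetical stand-in for re-downloading the data from the Linux server.
    return {"mike": "sharon", "4": 5}


db = open_or_rebuild(path, fetch_from_server)
rebuilt = dict(db)
db.close()
```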

Paul

 
skip

Paul> Unfortunately windows does not seem to support gdbm.

That is a known issue, but I think it can be solved by getting rid of the
old 1.85 version of BerkDB and using something more modern. I believe the
current bsddb module in recent Python versions supports BerkDB 3.x and
4.x. Sleepycat got bought by Oracle a while ago. I believe you can download
4.7 from here:

http://www.oracle.com/technology/software/products/berkeley-db/index.html

They have a Windows installer as well as links to older versions.
 
