Database problems

  • Thread starter Edward Grefenstette

Edward Grefenstette

Dear Pythonistas,

For a project I'm working on, I need to store fairly large
dictionaries (several million keys) in some form (obviously not in
memory). The obvious course of action was to use a database of some
sort.

The operation is pretty simple: a function is handed a generator that
gives it keys and values, and it maps the keys to the values in a
non-relational database (simples!).
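
For readers following along, the operation described above can be sketched roughly as follows. This is only an illustration, not the poster's actual code: it assumes Python 3, where `anydbm` was renamed to `dbm`, and the names `store_pairs` and `example_pairs` are hypothetical.

```python
import dbm
import os
import tempfile


def store_pairs(pairs, path):
    """Write every (key, value) pair from an iterable into a dbm file.

    Returns the number of pairs written.
    """
    count = 0
    # "c" opens the database for reading and writing, creating it if needed.
    with dbm.open(path, "c") as db:
        for key, value in pairs:
            # dbm stores bytes, so str keys and values are encoded here.
            db[key.encode("utf-8")] = value.encode("utf-8")
            count += 1
    return count


def example_pairs(n):
    """Hypothetical generator standing in for the real data source."""
    for i in range(n):
        yield "key%d" % i, "value%d" % i


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "example_db")
    written = store_pairs(example_pairs(1000), db_path)
    with dbm.open(db_path, "r") as db:
        sample = db[b"key42"].decode("utf-8")
```

Because the pairs are consumed one at a time, memory use stays small regardless of how many keys there are; the size of the file on disk is what grows.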

I wrote some code implementing this using anydbm (which used dbhash on
my system), and it worked fine for about a million entries, but then
crashed raising a DBPageNotFoundError. I did a little digging around
and couldn't figure out what was causing this or how to fix it.

I then quickly swapped anydbm for good ol' fashioned dbm, which uses
gdbm, and it ran even faster and a little longer, but after a million
entries or so it raised the ever-so-unhelpful "gdbm fatal: write
error".

I then threw caution to the winds and tried simply using cPickle's
dump in the hope of obtaining some data persistence, but it crashed
fairly early with an "IOError: [Errno 122] Disk quota exceeded".

Now the question is: is there something wrong with these dbms? Can they
not deal with very large sets of data? If not, is there a better
tool for my needs? Or is the problem unrelated and has something to do
with my lab computer?

Best,
Edward
 
Benjamin Kaplan

Just as a guess, I'd say that you have a disk quota that you're
hitting with your several million key dbm. You might want to talk to
the lab administrator about raising the quota.
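
If the quota is indeed the cause, one way to confirm it from Python is to inspect the error number on the exception. The following is a minimal sketch, assuming Python 3 (where `IOError` is an alias of `OSError`); the helper names `is_disk_space_error` and `write_blob` are hypothetical. On Linux, errno 122 is `EDQUOT`.

```python
import errno


def is_disk_space_error(exc):
    """Return True if exc is an OSError caused by a full disk or quota."""
    codes = {errno.ENOSPC}
    # EDQUOT is not defined on every platform, so look it up defensively.
    edquot = getattr(errno, "EDQUOT", None)
    if edquot is not None:
        codes.add(edquot)
    return isinstance(exc, OSError) and exc.errno in codes


def write_blob(path, data):
    """Write bytes to path; return False on a quota or disk-full failure."""
    try:
        with open(path, "wb") as handle:
            handle.write(data)
    except OSError as exc:
        if is_disk_space_error(exc):
            return False
        # Any other I/O problem is unexpected here, so let it propagate.
        raise
    return True
```

Library-level messages such as "gdbm fatal: write error" do not carry an errno, so this check only helps for writes made from Python itself.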
 
Tim Harig

I then threw caution to the winds and tried simply using cPickle's
dump in the hope of obtaining some data persistence, but it crashed
fairly early with a "IOError: [Errno 122] Disk quota exceeded".

The error is telling you that you attempted to write the file but, in
the process, exceeded your disk quota (the space you are allowed to use),
so the operating system would not permit you to finish the file.

Now the question is: is it something wrong with these dbms? Can they
not deal with very large sets of data? If not, is there a more optimal
tool for my needs? Or is the problem unrelated and has something to do
with my lab computer?

The problem is that you are attempting to use more disk space than you are
permitted to use on your account (at least for the selected filesystem).
Possible solutions are to save the data somewhere else where you have a
greater quota, delete some other files from your account to make room for
the new file, or talk to whoever administers the systems and see if they
will add enough space to your quota to permit you to write this large file.
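
As a rough illustration of the first option (saving the data somewhere else), the sketch below picks the first writable directory with enough free space. The function names and the candidate list are hypothetical. Note the limitation: `shutil.disk_usage` reports free space on the filesystem and generally does not reflect a per-user quota, so a directory can pass this check and the write can still fail with "Disk quota exceeded".

```python
import os
import shutil
import tempfile


def free_bytes(directory):
    """Free space, in bytes, on the filesystem that holds directory."""
    return shutil.disk_usage(directory).free


def pick_directory(candidates, needed_bytes):
    """Return the first existing, writable candidate with enough free space.

    Returns None if no candidate qualifies.
    """
    for directory in candidates:
        if not os.path.isdir(directory):
            continue
        if not os.access(directory, os.W_OK):
            continue
        if free_bytes(directory) >= needed_bytes:
            return directory
    return None


# Hypothetical candidates: the system temp directory, then the home directory.
candidates = [tempfile.gettempdir(), os.path.expanduser("~")]
target = pick_directory(candidates, needed_bytes=1024)
```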
 
