Memory limit to dict?


Peter Beattie

I was wondering whether certain data structures in Python, e.g. dict,
might have limits as to the amount of memory they're allowed to take up.
Is there any documentation on that?

Why am I asking? I'm reading 3.6 GB worth of BLAST output files into a
nested dictionary structure (dict within dict ...). Looks something like
this:

{ GenomeID:
    { ProteinID:
        { GenomeID:
            { ProteinID, Score, PercentValue, EValue } } } }
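Concretely, one entry would look something like this (the IDs and numbers
here are invented, just to show the shape -- the innermost braces are
really a record of per-hit fields):

scores = {
    'genome1': {
        'protein7': {
            'genome2': {
                'ProteinID': 'protein9',
                'Score': 123.0,
                'PercentValue': 87.5,
                'EValue': 1e-30,
            },
        },
    },
}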

Now, the thing is: Even on a machine with 16 GB RAM, the program
terminates with a MemoryError, obviously long before the machine's RAM
is used up.

I've no idea how far the Windows task manager's resource monitor can be
trusted -- probably not as far as I could throw a heavy-set St Bernard
--, but it seems to stop roughly when that monitor records a swap file
size of 2.2 GB.

Barring any revamping of the code itself, which I will have to do
anyway, is there anything so far that would indicate a problem inherent
to Python?

(I can post the relevant code too, of course, if that would help.)

TIA!
 

Burton Samograd

> I've no idea how far the Windows task manager's resource monitor can be
> trusted -- probably not as far as I could throw a heavy-set St Bernard
> --, but it seems to stop roughly when that monitor records a swap file
> size of 2.2 GB.

I'm not a Windows expert at all, but I would assume that with 32-bit
Windows each process can have a user address space of ~2 GB: the 4 GB
virtual address space is split in half, the lower 2 GB for the process
and the upper 2 GB reserved for the kernel. (32-bit Linux typically
gives the process the lower 3 GB instead.) So it looks like you might
just be hitting the maximum allowed address space, regardless of how
much physical RAM is installed.
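One thing worth ruling out first: a 32-bit Python build is capped at a
32-bit address space even on a 64-bit machine. A quick check of the
interpreter's pointer size tells you which build you have:

import struct
print(struct.calcsize('P') * 8)  # 32 or 64 -- the interpreter's bitness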

You should partition your data hierarchically and load only the pieces
you need, letting the OS page things in and out for you... although you
have 16 gigs (I have to put a "holy crap" after that!), you will always
run into per-process limits, at least until true 64-bit OSes are in
vogue.
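For instance, here is a minimal sketch of pushing the outer layer onto
disk with the standard library's shelve module (the file name and key
scheme are made up):

import shelve

# A dict-like object backed by a file; only the entries you touch are
# loaded into memory.
db = shelve.open('blast_scores')

# Flatten the two outer keys into one string key; values must be
# picklable.
db['genome1:protein7'] = {'genome2': {'ProteinID': 'protein9',
                                      'Score': 123.0,
                                      'EValue': 1e-30}}

print(db['genome1:protein7'])
db.close()

Note that shelve doesn't notice in-place changes to a stored value:
either re-assign the value after mutating it, or open the shelf with
writeback=True.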
 

Felipe Almeida Lessa

On Tue, 2006-04-11 at 19:45 +0200, Peter Beattie wrote:
> I was wondering whether certain data structures in Python, e.g. dict,
> might have limits as to the amount of memory they're allowed to take up.
> Is there any documentation on that?
>
> Why am I asking? I'm reading 3.6 GB worth of BLAST output files into a
> nested dictionary structure (dict within dict ...). Looks something like
> this:
>
> { GenomeID:
>     { ProteinID:
>         { GenomeID:
>             { ProteinID, Score, PercentValue, EValue } } } }

I don't have the answer to your question, and I'll raise a new one
instead: isn't the overhead (performance and memory) of creating dicts
too large for them to be used at this scale?

I'm just speculating, but I *think* that using lists and objects may be
better.
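For what it's worth, here is a rough way to compare the per-record
container overhead on a recent CPython (sys.getsizeof counts only the
container itself, not the values, and the exact numbers vary by version
and platform):

import sys

class Hit(object):
    # __slots__ avoids the per-instance attribute dict
    __slots__ = ('protein_id', 'score', 'percent', 'evalue')
    def __init__(self, protein_id, score, percent, evalue):
        self.protein_id = protein_id
        self.score = score
        self.percent = percent
        self.evalue = evalue

as_dict  = {'ProteinID': 'p1', 'Score': 42.0,
            'PercentValue': 87.5, 'EValue': 1e-30}
as_tuple = ('p1', 42.0, 87.5, 1e-30)
as_slots = Hit('p1', 42.0, 87.5, 1e-30)

for label, obj in (('dict', as_dict), ('tuple', as_tuple),
                   ('slots', as_slots)):
    print(label, sys.getsizeof(obj))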

My 2 cents,
 

Steve M

An alternative is to use ZODB. For example, you could use a BTree class
for the outermost layers of the nested dict and a regular dict for the
innermost layer. If broken up properly, you can store an apparently
unlimited amount of data with reasonable performance.

Just remember not to iterate over the entire collection of objects
without aborting the transaction regularly.
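A minimal sketch of that layout, assuming ZODB is installed (the file
name and IDs are invented):

from ZODB import FileStorage, DB
from BTrees.OOBTree import OOBTree
import transaction

storage = FileStorage.FileStorage('blast.fs')  # on-disk storage file
db = DB(storage)
conn = db.open()
root = conn.root()

if 'scores' not in root:
    root['scores'] = OOBTree()            # outer layers: BTrees
scores = root['scores']

if 'genome1' not in scores:
    scores['genome1'] = OOBTree()
# Innermost layer is a plain dict, assigned whole so ZODB sees the change:
scores['genome1']['protein7'] = {'genome2': {'ProteinID': 'protein9',
                                             'Score': 123.0,
                                             'EValue': 1e-30}}
transaction.commit()

# When scanning everything, abort now and then so the connection's
# object cache doesn't grow without bound:
for i, (genome_id, proteins) in enumerate(scores.items()):
    if i % 10000 == 0:
        transaction.abort()

conn.close()
db.close()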
 
