import data.py using massive amounts of memory

Nick Craig-Wood

I've been dumping a database in a Python code format (for use with
Python on an S60 mobile phone, actually) and I've noticed that importing
it uses absolutely tons of memory compared to how much the data
structure actually needs once it is loaded.

The programs below create a file (z.py) containing a data structure
which looks like this

-- z.py ----------------------------------------------------
z = {
0 : (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19),
1 : (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20),
2 : (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21),
[snip]
998 : (998, 999, 1000, 1001, 1002, ..., 1012, 1013, 1014, 1015, 1016, 1017),
999 : (999, 1000, 1001, 1002, 1003, ..., 1013, 1014, 1015, 1016, 1017, 1018),
}
------------------------------------------------------------

Under python2.2 to python2.4, "import z" uses about 8 MB, whereas
loading a pickled dump of the same data takes only about 450 kB. This
has been improved in python2.5, where the import takes about 2.2 MB.

$ python2.5 memory_usage.py
Memory used to import is 2284 kB
Total size of repr(z) is 105215
Memory used to unpickle is 424 kB
Total size of repr(z) is 105215

$ python2.4 memory_usage.py
Memory used to import is 8360 kB
Total size of repr(z) is 105215
Memory used to unpickle is 456 kB
Total size of repr(z) is 105215

$ python2.3 memory_usage.py
Memory used to import is 8436 kB
Total size of repr(z) is 105215
Memory used to unpickle is 456 kB
Total size of repr(z) is 105215

$ python2.2 memory_usage.py
Memory used to import is 8568 kB
Total size of repr(z) is 105215
Memory used to unpickle is 392 kB
Total size of repr(z) is 105215

$ python2.1 memory_usage.py
Memory used to import is 10756 kB
Total size of repr(z) is 105215
Memory used to unpickle is 384 kB
Total size of repr(z) is 105215

Why does it take so much memory? Is it some consequence of the way
the data structure is parsed?

Note that once the .pyc file has been created, subsequent runs take
even less memory than the cPickle load.

S60 python is version 2.2.1. It doesn't have pickle unfortunately, but
it does have marshal and the datastructures I need are marshal-able so
that provides a good solution to my actual problem.
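
For anyone wanting to try the marshal route, here is a minimal sketch of the round trip. It is written for a current Python 3 interpreter rather than the 2.2.1 on S60, and the file name z.marshal and the temporary directory are just placeholders for illustration. Only marshal.dump and marshal.load are used. Bear in mind that the marshal format is not guaranteed to be compatible between Python versions, so the dump should be written by an interpreter whose format the loading side understands.

```python
import marshal
import os
import tempfile

# Same shape as the z.py data: 1000 keys, each mapping to a 20-int tuple.
z = {}
for i in range(1000):
    z[i] = tuple(range(i, i + 20))

# Hypothetical file name; a temporary directory keeps the sketch self-contained.
path = os.path.join(tempfile.mkdtemp(), "z.marshal")

# Write the structure once, on the machine that produces the dump.
with open(path, "wb") as f:
    marshal.dump(z, f)

# Read it back; no Python source is parsed or compiled in this step.
with open(path, "rb") as f:
    restored = marshal.load(f)

print(restored == z)
```

The load step only reads already-built values from the file, which is presumably why it behaves more like the unpickle case than like the first import of z.py.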

Save the two programs below with the names given to demonstrate the
problem. Note that these use some linux-isms to measure the memory
used by the current process which will need to be adapted if you don't
run it on linux!

-- memory_usage.py -----------------------------------------

import os
import sys
import re
from cPickle import dump

def memory():
    """Returns memory used (RSS) in kB"""
    status = open("/proc/self/status").read()
    match = re.search(r"(?m)^VmRSS:\s+(\d+)", status)
    memory = 0
    if match:
        memory = int(match.group(1))
    return memory

def write_file():
    """Write the file to be imported"""
    fd = open("z.py", "w")
    fd.write("z = {\n")
    for i in xrange(1000):
        fd.write(" %d : %r,\n" % (i, tuple(range(i,i+20))))
    fd.write("}\n")
    fd.close()

def main():
    write_file()
    before = memory()
    from z import z
    after = memory()
    print "Memory used to import is %s kB" % (after-before)
    print "Total size of repr(z) is ",len(repr(z))

    # Save a pickled copy for later
    dump(z, open("z.bin", "wb"))

    # Run the next part
    os.system("%s memory_usage1.py" % sys.executable)

if __name__ == "__main__":
    main()

-- memory_usage1.py ----------------------------------------

from memory_usage import memory
from cPickle import load

before = memory()
z = load(open("z.bin", "rb"))
after = memory()
print "Memory used to unpickle is %s kB" % (after-before)
print "Total size of repr(z) is ",len(repr(z))
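
The memory() helper above relies on /proc/self/status, which only exists on Linux. As one possible adaptation for other Unix-like systems, the sketch below uses the standard resource module instead. This is an assumption about what would be an acceptable substitute, not a drop-in equivalent: it is written for Python 3, the resource module is not available on Windows, and ru_maxrss is the peak resident set size rather than the current one, so a before/after difference only shows growth beyond the previous peak. The name peak_memory_kb is made up for this example.

```python
import resource
import sys

def peak_memory_kb():
    """Return the peak resident set size of this process in kB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.
    if sys.platform == "darwin":
        peak = peak // 1024
    return peak

before = peak_memory_kb()

# Build the same kind of structure that z.py holds.
data = {}
for i in range(1000):
    data[i] = tuple(range(i, i + 20))

after = peak_memory_kb()
print("Peak RSS grew by %d kB" % (after - before))
```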

GHUM

> Note that once it has made the .pyc file the subsequent runs take even
> less memory than the cPickle import.

Could that be the compiler compiling?

I don't know much about the details of that process, but from 2.4 to
2.5 the compiler was completely replaced, <whatever> to AST.
That would explain the drop from 8 MB to 2.2 MB.

Harald
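
One way to probe the "is it the compiler?" hypothesis is to measure the compile step and the execution step separately. The sketch below is only an illustration under some loud assumptions: it needs Python 3.4 or later for tracemalloc (none of the 2.x interpreters in this thread have it), it rebuilds the text of z.py in memory instead of importing a file, and the helper name peak_bytes is invented for the example. tracemalloc counts memory requested through Python's allocators, so the figures are not directly comparable with the VmRSS numbers above.

```python
import tracemalloc

def peak_bytes(func):
    """Run func() and return (result, peak bytes allocated while it ran)."""
    tracemalloc.start()
    try:
        result = func()
        current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak

# Rebuild the text of z.py in memory.
lines = ["z = {\n"]
for i in range(1000):
    lines.append("    %d : %r,\n" % (i, tuple(range(i, i + 20))))
lines.append("}\n")
source = "".join(lines)

# Step 1: parse and compile only; the module body has not run yet.
code, compile_peak = peak_bytes(lambda: compile(source, "z.py", "exec"))

# Step 2: run the compiled code, which binds z in the namespace.
namespace = {}
_, exec_peak = peak_bytes(lambda: exec(code, namespace))

print("peak while compiling: %d bytes" % compile_peak)
print("peak while executing: %d bytes" % exec_peak)
```

If the compile figure dominates, that would be consistent with the observation that runs using an existing .pyc file are cheap, since those skip step 1 entirely.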