checking available system memory?

Darren Dale

I am doing linear algebra with large numarray arrays. It is very efficient, but I
have a small problem due to the size of my data. The dot product of a
10,000x3 double array with a 3x6,250,000 double array will consume 500GB of
memory. I need to break the operations up into manageable chunks, so I don't
consume all the available memory and get a segmentation fault.

It's not a problem with numpy, I just need to intelligently slice up one of
my arrays so my routine works within the available system resources. Are
there any utilities that can query how much memory is available?

Thanks,
Darren
 
Ville Vainio

Darren> It's not a problem with numpy, I just need to intelligently
Darren> slice up one of my arrays so my routine works within the
Darren> available system resources. Are there any utilities that
Darren> can query how much memory is available?

What platform? On Linux at least you can do 'cat /proc/meminfo' to get
more info than you probably want to have...
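
As a rough illustration of the /proc/meminfo approach, here is a minimal sketch that reads the file from Python. It is Linux-only, and the function names parse_meminfo and available_bytes are made up for this example. MemAvailable is only reported by newer kernels, so the sketch falls back to MemFree, which is usually a more conservative figure.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Values reported with a "kB" unit are converted to bytes; unitless
    counters (for example HugePages_Total) are kept as plain integers.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        fields = rest.split()
        if not fields:
            continue
        value = int(fields[0])
        if len(fields) > 1 and fields[1] == "kB":
            value *= 1024
        info[key.strip()] = value
    return info


def available_bytes(path="/proc/meminfo"):
    """Best-effort estimate of memory available to new allocations (Linux only)."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    # Assumption: prefer MemAvailable when the kernel provides it, else MemFree.
    return info.get("MemAvailable", info["MemFree"])
```

On a Linux machine, available_bytes() returns a byte count that can be used as an upper bound when deciding how large a chunk to process. Other platforms need a different mechanism.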
 
David M. Cooke

Darren Dale said:
Darren> I am doing linear algebra with large numarray arrays. It is very
Darren> efficient, but I have a small problem due to the size of my data.
Darren> The dot product of a 10,000x3 double array with a 3x6,250,000
Darren> double array will consume 500GB of memory. I need to break the
Darren> operations up into manageable chunks, so I don't consume all the
Darren> available memory and get a segmentation fault.

Darren> It's not a problem with numpy, I just need to intelligently
Darren> slice up one of my arrays so my routine works within the
Darren> available system resources. Are there any utilities that
Darren> can query how much memory is available?

Not really, it tends to be quite operating-system specific.

Instead of saying "What's the largest chunk I can do at a time", how
about "What's the smallest chunk, where bigger chunks won't get me
much?". If you operate on chunks that are on the order of the cache
size of the processor, that's probably sufficient.
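
To make the small-chunk idea concrete, here is a minimal sketch that splits the columns of the second array into ranges whose result block stays under a byte budget. The name column_chunks and the 1 MiB budget are assumptions picked for illustration; the right budget depends on the actual cache size of the machine.

```python
def column_chunks(n_rows, n_cols, budget_bytes, itemsize=8):
    """Yield (start, stop) column ranges for a chunked dot product.

    Each range is sized so that an n_rows x (stop - start) block of results
    with the given itemsize fits in budget_bytes. At least one column is
    always processed per chunk, even if a single column exceeds the budget.
    """
    bytes_per_column = n_rows * itemsize
    cols_per_chunk = max(1, budget_bytes // bytes_per_column)
    for start in range(0, n_cols, cols_per_chunk):
        yield start, min(start + cols_per_chunk, n_cols)


# Shapes from the original post: (10,000 x 3) dot (3 x 6,250,000), doubles.
# The 1 MiB budget is a hypothetical stand-in for "about the cache size".
chunks = column_chunks(10_000, 6_250_000, budget_bytes=2 ** 20)
first_start, first_stop = next(chunks)
print(first_stop - first_start, "columns per chunk")
```

Each (start, stop) pair would then drive one partial product, for example dot(a, b[:, start:stop]), with the block processed or written out before moving on to the next range.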

Also, if you're using numarray.dot, note that it doesn't use BLAS (yet), so
it's not as efficient as it could be with a BLAS backend (ATLAS, for
instance).
 
Josiah Carlson

Darren> I am doing linear algebra with large numarray arrays. It is very
Darren> efficient, but I have a small problem due to the size of my data.
Darren> The dot product of a 10,000x3 double array with a 3x6,250,000
Darren> double array will consume 500GB of memory. I need to break the
Darren> operations up into manageable chunks, so I don't consume all the
Darren> available memory and get a segmentation fault.

Darren> It's not a problem with numpy, I just need to intelligently
Darren> slice up one of my arrays so my routine works within the
Darren> available system resources. Are there any utilities that
Darren> can query how much memory is available?

Unless you are running bigmem patches on Linux, or the equivalent on
Windows, you are limited to 2 gigs of memory per process.
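
A per-process ceiling of this kind typically comes from a 32-bit address space. As a rough sketch, the snippet below compares the size of the full result from the original post with what the running interpreter can address at all. The helper names pointer_bits and result_bytes are made up for this example, and the comparison is only an upper bound: the operating system may allow a process far less than the full address space.

```python
import struct


def pointer_bits():
    """Width of a native pointer in this interpreter, in bits."""
    return struct.calcsize("P") * 8


def result_bytes(n_rows, n_cols, itemsize=8):
    """Bytes needed to hold a dense n_rows x n_cols array of doubles."""
    return n_rows * n_cols * itemsize


needed = result_bytes(10_000, 6_250_000)  # full product from the original post
addressable = 2 ** pointer_bits()         # theoretical address-space bound
verdict = "fits" if needed < addressable else "does not fit"
print(needed, "bytes needed;", verdict, "in a", pointer_bits(), "bit address space")
```

The result of 10,000 x 6,250,000 doubles works out to 500,000,000,000 bytes, which matches the 500GB figure quoted above.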

How much memory do you really have?

- Josiah
 
Jeremy Bowers

Darren> I am doing linear algebra with large numarray arrays. It is very
Darren> efficient, but I have a small problem due to the size of my data.
Darren> The dot product of a 10,000x3 double array with a 3x6,250,000
Darren> double array will consume 500GB of memory. I need to break the
Darren> operations up into manageable chunks, so I don't consume all the
Darren> available memory and get a segmentation fault.

Darren> It's not a problem with numpy, I just need to intelligently
Darren> slice up one of my arrays so my routine works within the
Darren> available system resources. Are there any utilities that
Darren> can query how much memory is available?

I don't know what you're doing with that, but you're well into the domain
where you may have to trade running time for memory.

I am not familiar with the terms "10,000x3 double array with a 3x6,250,000
double array" (particularly "double array"), but speaking in general
terms, assuming the dot product is something like the vector dot product I
know, you can wrap your two source arrays in an object that lazily
computes the relevant dot product. Shell:

class LazyDotProduct(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __getitem__(self, index):
        # dot_product is a helper you supply; it computes one element
        return dot_product(self.a, self.b, index)

Add an optional cache to getitem if you need it and can afford it.
"dot_product" computes the relevant dot product element.
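
For reference, here is one way the shell above could be filled in. This is only a sketch: it assumes the arrays are plain nested sequences (a is n x k, b is k x m) and that the index is an (i, j) pair, neither of which comes from the original post. It uses no array library, so it shows the idea rather than a fast implementation.

```python
class LazyDotProduct:
    """Present dot(a, b) as a 2-D table whose entries are computed on demand."""

    def __init__(self, a, b):
        # a is a sequence of rows (n x k); b is a sequence of rows (k x m)
        if any(len(row) != len(b) for row in a):
            raise ValueError("inner dimensions do not match")
        self.a = a
        self.b = b

    def __getitem__(self, index):
        # Assumption: index is an (i, j) tuple, as in lazy[i, j]
        i, j = index
        return sum(self.a[i][k] * self.b[k][j] for k in range(len(self.b)))


a = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
b = [[1.0, 0.0],
     [0.0, 1.0],
     [2.0, 2.0]]

lazy = LazyDotProduct(a, b)
print(lazy[0, 1])  # row 0 of a times column 1 of b
```

Only the two source arrays are kept in memory; the full product is never stored, so each lookup pays k multiplications instead.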

Just a thought; I may be over-extrapolating from what I know.
 
Jeremy Bowers

Jeremy> I am not familiar with the terms "10,000x3 double array with a
Jeremy> 3x6,250,000 double array" (particularly "double array"),

Oh, duh, array of "doubles". The specification of dimensions had me
thinking of some sort of array where each cell had 2 elements in it or
something :)

Now I am pretty sure you can compute it lazily.
 
