Memory error

Jamie Mitchell

Hello all,

I'm afraid I am new to all this so bear with me...

I am looking to find the statistical significance between two large netCDF data sets.

Firstly I've loaded the two files into python:

swh=netCDF4.Dataset('/data/cr1/jmitchel/Q0/swh/controlperiod/averages/swh_control_concat.nc', 'r')

swh_2050s=netCDF4.Dataset('/data/cr1/jmitchel/Q0/swh/2050s/averages/swh_2050s_concat.nc', 'r')

I have then isolated the variables I want to perform the pearson correlation on:

hs=swh.variables['hs']

hs_2050s=swh_2050s.variables['hs']

Here is the metadata for those files:

print hs
<type 'netCDF4.Variable'>
int16 hs(time, latitude, longitude)
standard_name: significant_height_of_wind_and_swell_waves
long_name: significant_wave_height
units: m
add_offset: 0.0
scale_factor: 0.002
_FillValue: -32767
missing_value: -32767
unlimited dimensions: time
current shape = (86400, 350, 227)

print hs_2050s
<type 'netCDF4.Variable'>
int16 hs(time, latitude, longitude)
standard_name: significant_height_of_wind_and_swell_waves
long_name: significant_wave_height
units: m
add_offset: 0.0
scale_factor: 0.002
_FillValue: -32767
missing_value: -32767
unlimited dimensions: time
current shape = (86400, 350, 227)


Then to perform the Pearson correlation:

from scipy.stats.stats import pearsonr

pearsonr(hs,hs_2050s)

I then get a memory error:

Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/sci/lib/python2.7/site-packages/scipy/stats/stats.py", line 2409, in pearsonr
x = np.asarray(x)
File "/usr/local/sci/lib/python2.7/site-packages/numpy/core/numeric.py", line 321, in asarray
return array(a, dtype, copy=False, order=order)
MemoryError

This also happens when I try to create numpy arrays from the data.

Does anyone know how I can alleviate these memory errors?

Cheers,

Jamie
 
Jamie Mitchell

Just realised that obviously pearson correlation requires two 1D arrays and mine are 3D, silly mistake!
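
For anyone hitting the same thing: pearsonr expects two 1D sequences, so a (time, latitude, longitude) array has to be reduced to 1D first. Below is a minimal sketch of two ways to do that. It uses small made-up arrays rather than the real files, and plain NumPy (np.corrcoef) rather than pearsonr; the helper name pearson_along_time is hypothetical.

```python
import numpy as np

# Small synthetic stand-ins for the (time, latitude, longitude) variables;
# the shapes and values here are made up for illustration.
rng = np.random.default_rng(0)
a = rng.normal(size=(50, 4, 3))
b = 0.5 * a + rng.normal(size=(50, 4, 3))

# Option 1: one overall coefficient, treating every value as a sample.
r_all = np.corrcoef(a.ravel(), b.ravel())[0, 1]

# Option 2: one coefficient per grid point, correlating along the time axis.
def pearson_along_time(x, y):
    xd = x - x.mean(axis=0)  # deviations from each grid point's time mean
    yd = y - y.mean(axis=0)
    num = (xd * yd).sum(axis=0)
    den = np.sqrt((xd ** 2).sum(axis=0) * (yd ** 2).sum(axis=0))
    return num / den

r_map = pearson_along_time(a, b)
print(r_all, r_map.shape)
```

Which of the two is appropriate depends on the question being asked. Neither reduces the memory needed to load the full variables.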
 
Gary Herron

Jamie Mitchell said:
...
I have then isolated the variables I want to perform the pearson correlation on:

hs=swh.variables['hs']

hs_2050s=swh_2050s.variables['hs']

This is not really a Python question. It's a question about netCDF
(whatever that may be), or perhaps its Python interface, python-netCDF4.

You may get an answer here, but you are far more likely to get one
quickly and accurately from a forum dedicated to netCDF or python-netCDF4.

Good luck.

Gary Herron
 
dieter

Jamie Mitchell said:
...
I then get a memory error:

Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/sci/lib/python2.7/site-packages/scipy/stats/stats.py", line 2409, in pearsonr
x = np.asarray(x)
File "/usr/local/sci/lib/python2.7/site-packages/numpy/core/numeric.py", line 321, in asarray
return array(a, dtype, copy=False, order=order)
MemoryError

"MemoryError" means that Python cannot get sufficent memory
from the operating system.
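
To put a rough number on that for this case, the shape printed in the metadata can be multiplied out. This is only back-of-the-envelope arithmetic. The 8-bytes-per-value figure assumes the packed int16 values end up as float64 once the scale_factor is applied, which may not hold for every setup.

```python
# Shape taken from the "current shape" line in the variable metadata.
shape = (86400, 350, 227)

n_values = 1
for dim in shape:
    n_values *= dim

GIB = 1024.0 ** 3
as_int16 = n_values * 2    # stored type on disk: 2 bytes per value
as_float64 = n_values * 8  # assumption: unpacked to 8-byte floats in memory

print("values:  %d" % n_values)
print("int16:   %.1f GiB" % (as_int16 / GIB))
print("float64: %.1f GiB" % (as_float64 / GIB))
```

That works out to roughly 6.9 billion values per variable, about 12.8 GiB as int16 and about 51.1 GiB as float64, and there are two such variables.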


You have already found out one mistake. Should you continue to
get "MemoryError" after this is fixed, then your system does not
provide enough resources (memory) to solve the problem at hand.
You would need to find a way to provide more resources.
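
If adding memory is not an option, another route is to avoid holding the whole variable at once. The sketch below accumulates running sums over blocks of time steps and computes a single Pearson coefficient from them at the end. It is an illustration under assumptions, not a tested recipe for these files: the function name chunked_pearson and the block size are made up, it relies on slicing a netCDF4 variable reading only the requested block, and the plain sum-of-products formula can lose precision when the mean is large relative to the spread.

```python
import numpy as np

def chunked_pearson(x, y, chunk=100):
    """Pearson r over all values of x and y, read `chunk` time steps at a time.

    x and y are anything sliceable along the first axis with a .shape,
    e.g. NumPy arrays or (assumed) netCDF4 variables of identical shape.
    """
    n = sx = sy = sxx = syy = sxy = 0.0
    for start in range(0, x.shape[0], chunk):
        xs = np.ma.asarray(x[start:start + chunk], dtype=np.float64)
        ys = np.ma.asarray(y[start:start + chunk], dtype=np.float64)
        # Keep only positions that are unmasked (not fill values) in both.
        valid = ~(np.ma.getmaskarray(xs) | np.ma.getmaskarray(ys))
        xv = np.ma.getdata(xs)[valid]
        yv = np.ma.getdata(ys)[valid]
        n += xv.size
        sx += xv.sum()
        sy += yv.sum()
        sxx += (xv * xv).sum()
        syy += (yv * yv).sum()
        sxy += (xv * yv).sum()
    cov = n * sxy - sx * sy
    return cov / np.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))

# Demo on small synthetic data (made-up shapes and values).
rng = np.random.default_rng(1)
a = rng.normal(size=(10, 3, 2))
b = a + rng.normal(size=(10, 3, 2))
r_chunked = chunked_pearson(a, b, chunk=4)
r_direct = np.corrcoef(a.ravel(), b.ravel())[0, 1]
print(abs(r_chunked - r_direct) < 1e-9)
```

With the variables from the original post, the call would be chunked_pearson(hs, hs_2050s). Only one block of each variable is in memory at a time, at the cost of reading both files from start to end.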
 
