linregress and polyfit

Krishnan · Sep 18, 2013

I created an xy pair

y = slope*x + intercept

then I added some noise to "y" using

numpy.random.normal - call it z

I could recover the slope, intercept from (x,y) using linregress
BUT cannot calculate the slope, intercept from (x, z)

What is puzzling is that for both pairs (x,y) and (x,z) the
polyfit (order 1) works just fine (gives me the slope, intercept)
----------------------------------------------------------------------
import numpy as np
import scipy
# create a straight line, y= slope*x + intercept
x = np.linspace(0.0,10.0,21)
slope = 1.0
intercept = 3.0
y = []
for i in range(len(x)):
y.append(slope*x + intercept)
# now create a data file with noise
z= []
for i in range(len(x)):
z.append(y + 0.1*np.random.normal(0.0,1.0,1))
# now calculate the slope, intercept using linregress
from scipy import stats
# No error here this is OK, works for x, y
cslope, cintercept, r_value, p_value, std_err = stats.linregress(x,y)
print cslope, cintercept
# I get an error here
#ValueError: array dimensions must agree except for d_0
nslope, nintercept, nr_value, np_value, nstd_err = stats.linregress(x,z)
print nslope, nintercept
# BUT polyfit works fine, polynomial or order 1 with both data sets
ncoeffs = scipy.polyfit(x,z,1)
print ncoeffs
coeffs = scipy.polyfit(x,y,1)
print coeffs

Josef Pktd · Sep 18, 2013

I created an xy pair

y = slope*x + intercept

then I added some noise to "y" using

numpy.random.normal - call it z

I could recover the slope, intercept from (x,y) using linregress

BUT cannot calculate the slope, intercept from (x, z)

What is puzzling is that for both pairs (x,y) and (x,z) the

polyfit (order 1) works just fine (gives me the slope, intercept)

----------------------------------------------------------------------

import numpy as np

import scipy

# create a straight line, y= slope*x + intercept

x = np.linspace(0.0,10.0,21)

slope = 1.0

intercept = 3.0

y = []

for i in range(len(x)):

y.append(slope*x + intercept)

# now create a data file with noise

z= []

for i in range(len(x)):

z.append(y + 0.1*np.random.normal(0.0,1.0,1))

When z is converted to a numpy array then it has an extra dimension that linregress cannot handle, because np.random.normal(0.0,1.0, 1) returns an array and not a scalar.

much easier: use vectorized numpy instead of loop

z = y + 0.1*np.random.normal(0.0,1.0, len(y))

which is a one dimensional array and works with linregress

Josef

chitturk · Sep 18, 2013

Thanks - that helps ... but it is puzzling because

np.random.normal(0.0,1.0,1) returns exactly one
and when I checked the length of "z", I get 21 (as before) ...

Dave Angel · Sep 18, 2013

Thanks - that helps ... but it is puzzling because

np.random.normal(0.0,1.0,1) returns exactly one
and when I checked the length of "z", I get 21 (as before) ...

I don't use Numpy, so this is just a guess, plus reading one web page.

According to:
http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.normal.html

the 3rd argument to normal should be a tuple. So if you want a single
element, you should have made it (1,)

As for checking the 'length of "z"' did you just use the len() function?
That just tells you the first dimension. Have you tried simply printing
out "z" ?

chitturk · Sep 19, 2013

tried (1,) - still same error ...
printed "z" and looks right, len(z) OK
(puzzling)

Oscar Benjamin · Sep 19, 2013

I don't use Numpy, so this is just a guess, plus reading one web page.

According to:
http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.normal.html

the 3rd argument to normal should be a tuple. So if you want a single
element, you should have made it (1,)

Numpy accepts ints in place of shape tuples so this makes no difference.

As for checking the 'length of "z"' did you just use the len() function?
That just tells you the first dimension. Have you tried simply printing
out "z" ?

Exactly. What you need to check is the shape attribute (converting to
numpy array first if necessary):

import numpy as np
a = np.random.normal(0, 1, 1)
a array([-0.90292348])
a.shape (1,)
a[0] -0.90292348393433797
np.array(a[0]) array(-0.902923483934338)
np.array(a[0]).shape ()
[a, a] [array([-0.90292348]), array([-0.90292348])]
np.array([a, a])

Click to expand...

Click to expand...

array([[-0.90292348],
[-0.90292348]])

np.array([a, a]).shape (2, 1)
np.random.normal(0, 1, 2).shape

Click to expand...

Click to expand...

(2,)

The square brackets in 'array([-0.90292348])' indicate that numpy
considers this to be a 1-dimensional array of length 1 rather than a
scalar value (which would have an empty shape tuple).

Oscar

How do you correctly plot for a 3d graph because my code seems to be stuck in an infinite loop because it is unable to plot.	2	Jul 20, 2023
I Need Fix In Code	1	Apr 12, 2023
Trouble with prediction code, for the life of me I can't figure out why it isnt running properly. Help would be appreciated.	0	Jul 8, 2023
Value error: Argument Z must be 2-dimensional	0	May 7, 2022
Find slope of function given empirical data.	4	Jun 29, 2010
I am making a Snake game and it has a: "raise Terminator/turtle.Terminator" message.	2	Dec 20, 2021
Help for my project in the last minute	0	Apr 23, 2022
Pyautogui, cv2 and cannot find image	0	Feb 7, 2023

linregress and polyfit

Krishnan

Josef Pktd

chitturk

Dave Angel

chitturk

Oscar Benjamin

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads