Re: Linear regression in 3 dimensions

Discussion in 'Python' started by Robert Kern, Sep 2, 2006.

  1. Robert Kern

    Guest

    wrote:
    > Hi all,
    >
    > I am seeking a module that will do the equivalent of linear regression in
    > 3D to yield a best-fit plane through a set of points (X1, Y1, Z1), (X2,
    > Y2, Z2), ... (Xn, Yn, Zn).
    >
    > The resulting equation is to be of the form:
    >
    > Z = aX + bY + c
    >
    > The function I need would take the set of points and return a, b & c. Any
    > pointers to existing code / modules would be very helpful.


    Well, that's a very underspecified problem. You haven't defined "best."

    But if we assume that you want to minimize the squared error in Z,
    that is, minimize

    Sum((Z - (a*X + b*Y + c)) ** 2)

    then this is a standard linear algebra problem.

    In [1]: import numpy as np

    In [2]: a = 1.0

    In [3]: b = 2.0

    In [4]: c = 3.0

    In [5]: rs = np.random.RandomState(1234567890) # Specify a seed for repeatability

    In [6]: x = rs.uniform(size=100)

    In [7]: y = rs.uniform(size=100)

    In [8]: e = rs.standard_normal(size=100)

    In [9]: z = a*x + b*y + c + e

    In [10]: A = np.column_stack([x, y, np.ones_like(x)])

    In [11]: np.linalg.lstsq?
    Type: function
    Base Class: <type 'function'>
    String Form: <function lstsq at 0x6df070>
    Namespace: Interactive
    File:
    /Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/numpy-1.0b2.dev3002-py2.4-macosx-10.4-ppc.egg/numpy/linalg/linalg.py
    Definition: np.linalg.lstsq(a, b, rcond=1e-10)
    Docstring:
    returns x,resids,rank,s
    where x minimizes 2-norm(|b - Ax|)
    resids is the sum square residuals
    rank is the rank of A
    s holds the singular values of A in descending order

    If b is a matrix then x is also a matrix with corresponding columns.
    If the rank of A is less than the number of columns of A or greater than
    the number of rows, then residuals will be returned as an empty array
    otherwise resids = sum((b - dot(A,x))**2).
    Singular values less than s[0]*rcond are treated as zero.


    In [12]: abc, residuals, rank, s = np.linalg.lstsq(A, z)

    In [13]: abc
    Out[13]: array([ 0.93104714, 1.96780364, 3.15185125])
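
    As a quick sanity check, you can evaluate the fitted plane at the sample
    points (a small follow-on sketch, not part of the session above; the exact
    numbers depend on the random draw):

    z_fit = np.dot(A, abc)           # fitted plane at the sampled (x, y)
    rss = np.sum((z - z_fit) ** 2)   # should agree with `residuals` above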

    --
    Robert Kern

    "I have come to believe that the whole world is an enigma, a harmless enigma
    that is made terrible by our own mad attempt to interpret it as though it had
    an underlying truth."
    -- Umberto Eco
    Robert Kern, Sep 2, 2006
    #1

  2. Bernhard

    Guest

    Hi Robert,

    I'm using the scipy package for such problems. In the submodule
    scipy.optimize there is an implementation of a least-squares fitting
    algorithm (Levenberg-Marquardt) called leastsq.

    You have to define a function that computes the residuals between your
    model and the data points:

    import scipy.optimize

    def model(parameter, x, y):
        a, b, c = parameter
        return a*x + b*y + c

    def residual(parameter, data, x, y):
        # Residuals are computed element-wise on NumPy arrays, so no
        # explicit loops are needed.
        return data - model(parameter, x, y)

    params0 = [1., 1., 1.]
    result = scipy.optimize.leastsq(residual, params0, args=(data, x, y))
    fittedParams = result[0]
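
    Here data, x and y are assumed to be equal-length 1-D NumPy arrays, e.g.
    (a sketch mirroring Robert's synthetic example):

    import numpy as np

    rs = np.random.RandomState(0)    # arbitrary seed, for repeatability
    x = rs.uniform(size=100)
    y = rs.uniform(size=100)
    # True plane a=1, b=2, c=3 plus Gaussian noise.
    data = 1.0*x + 2.0*y + 3.0 + rs.standard_normal(size=100)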

    If you haven't used Numeric, numpy or scipy before, you should take a
    look at an introduction. They provide nice array objects that support
    neat indexing tricks and let you compute over whole arrays without
    looping through them.

    Cheers! Bernhard



    Robert Kern wrote:
    > <<snip>>
    Bernhard, Sep 4, 2006
    #2

  3. Andrew McLean

    Bernhard,

    Levenberg-Marquardt is a good choice when you want to solve a general
    non-linear least-squares problem. As Robert said, the OP's problem is
    linear, and Robert's solution exploits that. Using LM here is unnecessary
    and, I suspect, a fair bit less efficient (i.e. slower).
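
    A rough way to see the difference (an illustrative sketch only; timings
    depend on machine, problem size, and library versions):

    import time
    import numpy as np
    import scipy.optimize

    # Synthetic plane data, as in Robert's example.
    rs = np.random.RandomState(0)
    x = rs.uniform(size=1000)
    y = rs.uniform(size=1000)
    z = 1.0*x + 2.0*y + 3.0 + rs.standard_normal(size=1000)
    A = np.column_stack([x, y, np.ones_like(x)])

    def residual(p, z, x, y):
        a, b, c = p
        return z - (a*x + b*y + c)

    t0 = time.time()
    for _ in range(100):
        np.linalg.lstsq(A, z)                  # direct linear solve
    print("lstsq:   %.4f s" % (time.time() - t0))

    t0 = time.time()
    for _ in range(100):
        scipy.optimize.leastsq(residual, [1., 1., 1.], args=(z, x, y))
    print("leastsq: %.4f s" % (time.time() - t0))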

    - Andrew


    Bernhard wrote:
    > <<snip>>
    Andrew McLean, Sep 8, 2006
    #3
  4. David J. Braden

    Andrew McLean wrote:
    > <<snip>>

    Hi Bernhard,
    I am just starting to learn Python; could you please tell me specifically
    the introduction(s) you have in mind?

    TIA,
    DaveB
    David J. Braden, Sep 14, 2006
    #4
  5. Bernhard

    Guest

    Hi Dave!

    > <<snip>>
    >
    > Hi Bernhard,
    > I am just starting to learn Python; could you please tell me specifically
    > the introduction(s) you have in mind?
    >
    > TIA,
    > DaveB


    Take a look at the documentation section on www.scipy.org.

    The old NumPy documentation (follow the link to the NumPy tutorial)
    and the scipy tutorial (there is an 'old' version as a PDF) were
    especially helpful.

    There is also a fee-based e-book on NumPy, which I should buy to support
    the development, because the package is so great :)

    For matplotlib take a look at http://matplotlib.sourceforge.net/ and
    get the user's guide.

    Enjoy! Bernhard
    Bernhard, Sep 14, 2006
    #5
  6. David J. Braden

    Bernhard wrote:
    > <<snip>>


    Whoa, your suggestions are great! I used to use MATLAB, so your third
    suggestion particularly hits home. With these packages and the nature of
    Python, it's starting to feel a bit like S (well, in my case, R).

    Thanks mucho (or should I say vielen Dank?). Regards,
    DaveB
    David J. Braden, Sep 15, 2006
    #6