Wow, Python much faster than MatLab

Discussion in 'Python' started by Stef Mientki, Dec 29, 2006.

  1. Stef Mientki

    Stef Mientki Guest

    hi All,

    instead of questions,
    my first success story:

    I converted my first MatLab algorithm into Python (using SciPy),
    and it not only works perfectly,
    but also runs much faster:

    MatLab: 14 msec
    Python: 2 msec

    After taking the first difficult steps into Python,
    with all kinds of small problems, as you already know,
    it now seems a piece of cake to convert from MatLab to Python.
    (the final programs of MatLab and Python can almost only be
    distinguished by the comment character ;-)

    Especially I like:
    - more relaxed behavior when exceeding the upper limit of a (1-dimensional)
    array
    - many more functions available, like a simple "mean"
    - reduced datatypes where allowed (booleans of 1 byte)

    thanks for all your help,
    probably need some more in the future,
    cheers,
    Stef Mientki
     
    Stef Mientki, Dec 29, 2006
    #1

  2. Beliavsky

    Beliavsky Guest

    Stef Mientki wrote:
    > hi All,
    >
    > instead of questions,
    > my first success story:
    >
    > I converted my first MatLab algorithm into Python (using SciPy),
    > and it not only works perfectly,
    > but also runs much faster:
    >
    > MatLab: 14 msec
    > Python: 2 msec


    For times this small, I wonder if timing comparisons are valid. I do
    NOT think SciPy is in general an order of magnitude faster than Matlab
    for the tasks typically performed with Matlab.
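
One way to make millisecond-scale timings like these more trustworthy is to repeat the measurement many times and report the best run. A minimal sketch using the standard-library `timeit` module; `algorithm` is a hypothetical stand-in for the code being timed, not the original poster's routine:

```python
import timeit

def algorithm():
    # Hypothetical placeholder workload; substitute the real routine.
    return sum(i * i for i in range(1000))

# Run 5 independent trials of 200 calls each.
trials = timeit.repeat(algorithm, number=200, repeat=5)

# The minimum is the trial least disturbed by other system activity.
best_ms_per_call = min(trials) / 200 * 1000
print(f"best of 5: {best_ms_per_call:.4f} ms per call")
```

Taking the minimum rather than the mean is a common convention for micro-benchmarks, since slower trials mostly reflect interference from the rest of the system.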

    >
    > After taking the first difficult steps into Python,
    > with all kinds of small problems, as you already know,
    > it now seems a piece of cake to convert from MatLab to Python.
    > (the final programs of MatLab and Python can almost only be
    > distinguished by the comment character ;-)
    >
    > Especially I like:
    > - more relaxed behavior when exceeding the upper limit of a (1-dimensional)
    > array


    Could you explain what this means? In general, I don't want a
    programming language to be "relaxed" about exceeding array bounds.
     
    Beliavsky, Dec 30, 2006
    #2

  3. On Fri, 29 Dec 2006 19:35:22 -0800, Beliavsky wrote:

    >> Especially I like:
    >> - more relaxed behavior when exceeding the upper limit of a (1-dimensional)
    >> array

    >
    > Could you explain what this means? In general, I don't want a
    > programming language to be "relaxed" about exceeding array bounds.


    I'm not sure about SciPy, but lists in standard Python allow this:

    >>> array = [1, 2, 3, 4]
    >>> array[2:50000]
    [3, 4]

    That's generally a good thing.
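
The "relaxed" behavior applies to slices only; indexing a single element out of range is still an error. A small sketch with a plain Python list, reusing the `array` name from the session above (the other variable names are introduced here for illustration):

```python
array = [1, 2, 3, 4]

# A slice whose end is past the last element is clamped to the list length.
tail = array[2:50000]
print(tail)  # [3, 4]

# A slice that starts past the end is simply empty.
empty = array[10:20]
print(empty)  # []

# Indexing a single element out of range still raises IndexError.
try:
    array[50000]
    index_error_raised = False
except IndexError:
    index_error_raised = True
print(index_error_raised)  # True
```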




    --
    Steven.
     
    Steven D'Aprano, Dec 30, 2006
    #3
  4. Stef Mientki

    Stef Mientki Guest


    >> MatLab: 14 msec
    >> Python: 2 msec

    >
    > For times this small, I wonder if timing comparisons are valid. I do
    > NOT think SciPy is in general an order of magnitude faster than Matlab
    > for the tasks typically performed with Matlab.

    The algorithm is meant for real-time analysis,
    where these kinds of differences count a lot.
    I'm also a typical "surface programmer"
    (don't need/want to know what's going on inside),
    just want to get my analysis done,
    and the fact that Python has many more functions available
    means I have to write far fewer explicit or implicit for loops,
    and thus I expect it to always "look" faster to me.
    >
    >> After taking the first difficult steps into Python,
    >> with all kinds of small problems, as you already know,
    >> it now seems a piece of cake to convert from MatLab to Python.
    >> (the final programs of MatLab and Python can almost only be
    >> distinguished by the comment character ;-)
    >>
    >> Especially I like:
    >> - more relaxed behavior when exceeding the upper limit of a (1-dimensional)
    >> array

    >
    > Could you explain what this means? In general, I don't want a
    > programming language to be "relaxed" about exceeding array bounds.
    >

    Well, I have to admit, that wasn't a very tactful remark; "noise" is still
    an unwanted issue in software.
    But in the meantime I've been reading further, and I should replace that with
    some other great things:
    - the very efficient way comments are turned into help information
    - the (at first sight) very easy, yet quite powerful OOP implementation.

    cheers,
    Stef Mientki
     
    Stef Mientki, Dec 30, 2006
    #4
  5. Stef Mientki

    Stef Mientki Guest

    > I'm not sure about SciPy,

    Yes, SciPy allows it too!

    > but lists in standard Python allow this:
    >
    > >>> array = [1, 2, 3, 4]
    > >>> array[2:50000]
    > [3, 4]
    >
    > That's generally a good thing.


    You're not, by any chance, an analog engineer by origin? ;-)

    cheers,
    Stef Mientki
     
    Stef Mientki, Dec 30, 2006
    #5
  6. Another great thing: with rpy you have R bindings for Python.
    So you have the power of R and the easy syntax and big standard library of Python! :)
     
    Mathias Panzenboeck, Dec 30, 2006
    #6
  7. Stef Mientki

    Stef Mientki Guest

    Mathias Panzenboeck wrote:
    > Another great thing: with rpy you have R bindings for Python.

    Forgive my ignorance, what are R and rpy?
    Or is this only relevant for Linux users?

    cheers
    Stef

    > So you have the power of R and the easy syntax and big standard lib of python! :)
     
    Stef Mientki, Dec 30, 2006
    #7
  8. John J. Lee

    John J. Lee Guest

    Stef Mientki writes:

    > Mathias Panzenboeck wrote:
    > > Another great thing: with rpy you have R bindings for Python.
    >
    > Forgive my ignorance, what are R and rpy?
    > Or is this only relevant for Linux users?

    [...]

    R is a language / environment for statistical programming. RPy is a
    Python interface to let you use R from Python. I think they both run
    on both Windows and Linux.

    http://www.r-project.org/

    http://rpy.sourceforge.net/


    John
     
    John J. Lee, Dec 30, 2006
    #8
  9. sturlamolden

    sturlamolden Guest

    Stef Mientki wrote:

    > MatLab: 14 msec
    > Python: 2 msec


    I have the same experience. NumPy is usually faster than Matlab. But it
    very much depends on how the code is structured.

    I wonder if it is possible to improve the performance of NumPy by
    having its fundamental types in the language, instead of depending on
    operator overloading. For example, in NumPy, a statement like

    array3[:] = array1[:] + array2[:]

    allocates an intermediate array that is not needed. This is because the
    operator overloading cannot know if it's evaluating a part of a larger
    statement like

    array1[:] = (array1[:] + array2[:]) * (array3[:] + array4[:])

    If arrays had been a part of the language, as they are in Matlab and
    Fortran 95, the compiler could see this and avoid intermediate storage,
    as well as loop over the data only once. This is one of the main
    reasons why Fortran is better than C++ for scientific computing. I.e.
    instead of

    for (i=0; i<n; i++)
        array1[i] = (array1[i] + array2[i]) * (array3[i] + array4[i]);

    one actually gets something like three intermediates and four loops:

    tmp1 = malloc(n*sizeof(whatever));
    for (i=0; i<n; i++)
        tmp1[i] = array1[i] + array2[i];
    tmp2 = malloc(n*sizeof(whatever));
    for (i=0; i<n; i++)
        tmp2[i] = array3[i] + array4[i];
    tmp3 = malloc(n*sizeof(whatever));
    for (i=0; i<n; i++)
        tmp3[i] = tmp1[i] * tmp2[i];
    free(tmp1);
    free(tmp2);
    for (i=0; i<n; i++)
        array1[i] = tmp3[i];
    free(tmp3);

    In C++ this is actually further bloated by constructor, destructor and
    copy constructor calls.
    Why one should use Fortran over C++ is obvious. But it also applies to
    NumPy, and also to the issue of NumPy vs. Matlab, as Matlab knows about
    arrays and has a compiler that can deal with this, whilst NumPy depends
    on bloated operator overloading. On the other hand, Matlab is
    fundamentally impaired on function calls and array slicing compared
    with NumPy (basically copies are created instead of views). Thus, which
    is faster - Matlab or NumPy - very much depends on how the code is
    written.
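
The temporaries described above can be avoided by hand in NumPy through the `out=` argument that ufuncs such as `np.add` and `np.multiply` accept. A minimal sketch, assuming NumPy is installed; the array names mirror the post, and `scratch` is a hypothetical buffer introduced for illustration. It still makes three passes over the data, so it addresses the allocations, not the number of loops:

```python
import numpy as np

n = 5
array1 = np.arange(n, dtype=float)
array2 = np.ones(n)
array3 = np.full(n, 2.0)
array4 = np.full(n, 3.0)

# Plain expression: NumPy generally allocates a new array for each
# intermediate result before producing the final product.
expected = (array1 + array2) * (array3 + array4)

# Hand-written version: one reusable scratch buffer, results written in place.
scratch = np.empty(n)
np.add(array3, array4, out=scratch)       # scratch = array3 + array4
np.add(array1, array2, out=array1)        # array1 = array1 + array2
np.multiply(array1, scratch, out=array1)  # array1 = array1 * scratch

print(np.array_equal(array1, expected))  # True
```

The trade-off is readability: the one-line expression says what is computed, while the `out=` version spells out where each result is stored.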

    Now for my question: operator overloading is (as shown) not the
    solution to efficient scientific computing. It creates serious bloat
    where it is undesired. Can NumPy's performance be improved by adding
    the array types to the Python language itself? Or is the dynamic
    nature of Python preventing this?

    Sturla Molden
     
    sturlamolden, Dec 31, 2006
    #9
  10. Robert Kern

    Robert Kern Guest

    sturlamolden wrote:
    > array3[:] = array1[:] + array2[:]


    OT, but why are you slicing array1 and array2? All that does is create new array
    objects pointing to the same data.
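
The point about slices can be checked directly. A short sketch, assuming NumPy is installed; `view` and `copy` are names introduced here for illustration:

```python
import numpy as np

array1 = np.arange(4)

# A full slice builds a new array object that views the same buffer.
view = array1[:]
print(view is array1)                  # False
print(np.shares_memory(view, array1))  # True

# Writing through the view changes the original array.
view[0] = 99
print(array1[0])  # 99

# An explicit copy owns its own data.
copy = array1.copy()
print(np.shares_memory(copy, array1))  # False
```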

    > Now for my question: operator overloading is (as shown) not the
    > solution to efficient scientific computing. It creates serious bloat
    > where it is undesired. Can NumPy's performance be improved by adding
    > the array types to the Python language itself? Or is the dynamic
    > nature of Python preventing this?


    Pretty much. Making the array types builtin rather than from a third party
    module doesn't really change anything. However, if type inferencing tools like
    psyco are taught about numpy arrays like they are already taught about ints,
    then one could make them avoid temporaries.

    --
    Robert Kern

    "I have come to believe that the whole world is an enigma, a harmless enigma
    that is made terrible by our own mad attempt to interpret it as though it had
    an underlying truth."
    -- Umberto Eco
     
    Robert Kern, Dec 31, 2006
    #10
  11. Klaas

    Klaas Guest

    sturlamolden wrote:

    > as well as looping over the data only once. This is one of the main
    > reasons why Fortran is better than C++ for scientific computing. I.e.
    > instead of
    >
    > for (i=0; i<n; i++)
    >     array1[i] = (array1[i] + array2[i]) * (array3[i] + array4[i]);
    >
    > one actually gets something like three intermediates and four loops:
    >
    > tmp1 = malloc(n*sizeof(whatever));
    > for (i=0; i<n; i++)
    >     tmp1[i] = array1[i] + array2[i];
    > tmp2 = malloc(n*sizeof(whatever));
    > for (i=0; i<n; i++)
    >     tmp2[i] = array3[i] + array4[i];
    > tmp3 = malloc(n*sizeof(whatever));
    > for (i=0; i<n; i++)
    >     tmp3[i] = tmp1[i] * tmp2[i];
    > free(tmp1);
    > free(tmp2);
    > for (i=0; i<n; i++)
    >     array1[i] = tmp3[i];
    > free(tmp3);


    C/C++ do not allocate extra arrays. What you posted _might_ bear a
    small resemblance to what numpy might produce (if using vectorized
    code, not explicit loop code). This is entirely unrelated to the
    reasons why fortran can be faster than c.

    -Mike
     
    Klaas, Dec 31, 2006
    #11
  12. sturlamolden

    sturlamolden Guest

    Klaas wrote:
    > C/C++ do not allocate extra arrays. What you posted _might_ bear a
    > small resemblance to what numpy might produce (if using vectorized
    > code, not explicit loop code). This is entirely unrelated to the
    > reasons why fortran can be faster than c.


    Array libraries in C++ that use operator overloading produce
    intermediate arrays for the same reason as NumPy. There is a C++
    library that is sometimes able to avoid intermediates (Blitz++), but
    it can only do so for small arrays for which bounds are known at
    compile time.

    Operator overloading is sometimes portrayed as required for scientific
    computing (e.g. in Java vs. C# flame wars), but the cure can be worse
    than the disease.

    C does not have operator overloading and is an entirely different case.
    You can of course avoid intermediates in C++ if you use C++ as C. You
    can do that in Python as well.
     
    sturlamolden, Jan 1, 2007
    #12
