Speed-up for loops

Discussion in 'Python' started by Michael Kreim, Sep 2, 2010.

  1. Hi,

    I was comparing the speed of a simple loop program between Matlab and
    Python.

    My Codes:
    $ cat addition.py
    imax = 1000000000
    a = 0
    for i in xrange(imax):
    a = a + 10
    print a

    $ cat addition.m
    imax = 1e9;
    a = 0;
    for i=0:imax-1
    a = a + 10;
    end
    disp(a);
    exit;

    The results look like this:
    $ /usr/bin/time --verbose python addition.py
    10000000000
    Command being timed: "python addition.py"
    User time (seconds): 107.30
    System time (seconds): 0.08
    Percent of CPU this job got: 97%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 1:50.09
    [...]

    $ /usr/bin/time --verbose matlab -nodesktop -nosplash -r "addition"
    [...]
    1.0000e+10
    Command being timed: "matlab -nodesktop -nosplash -r addition"
    User time (seconds): 7.65
    System time (seconds): 0.18
    Percent of CPU this job got: 94%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:08.25
    [...]

    Unfortunately my Python Code was much slower and I do not understand why.

    Are there any ways to speed up the for/xrange loop?
    Or do I have to live with the fact that Matlab beats Python in this example?

    Thanks a lot for your answers.

    With best regards,

    Michael
     
    Michael Kreim, Sep 2, 2010
    #1
    1. Advertising

  2. Michael Kreim

    Peter Otten Guest

    Michael Kreim wrote:

    > I was comparing the speed of a simple loop program between Matlab and
    > Python.
    >
    > My Codes:
    > $ cat addition.py
    > imax = 1000000000
    > a = 0
    > for i in xrange(imax):
    > a = a + 10
    > print a


    > Are there any ways to speed up the for/xrange loop?


    Move it into a function; this turns a and i into local variables.

    def f():
    imax = 1000000000
    a = 0
    for i in xrange(imax):
    a = a + 10
    print a
    f()

    > Or do I have to live with the fact that Matlab beats Python in this
    > example?


    I think so.
     
    Peter Otten, Sep 2, 2010
    #2
    1. Advertising

  3. Peter Otten wrote:
    > Move it into a function; this turns a and i into local variables.
    >
    > def f():
    > imax = 1000000000
    > a = 0
    > for i in xrange(imax):
    > a = a + 10
    > print a
    > f()


    Wow. It is still slower than Matlab, but your suggestion speeds up the
    code by ca 50%.
    But I do not understand why the change of a global to a local variable
    gives such a big difference.


    $ cat addition.py
    imax = 1000000000
    a = 0
    for i in xrange(imax):
    a = a + 10
    print a

    $ cat additionOtten.py
    def f():
    imax = 1000000000
    a = 0
    for i in xrange(imax):
    a = a + 10
    print a
    f()

    $ /usr/bin/time --verbose python addition.py
    10000000000
    Command being timed: "python addition.py"
    User time (seconds): 110.52
    System time (seconds): 0.01
    Percent of CPU this job got: 98%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 1:52.76
    [...]

    $ /usr/bin/time --verbose python additionOtten.py
    10000000000
    Command being timed: "python additionOtten.py"
    User time (seconds): 56.38
    System time (seconds): 0.00
    Percent of CPU this job got: 99%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:56.64
    [...]
     
    Michael Kreim, Sep 2, 2010
    #3
  4. Michael Kreim

    Peter Otten Guest

    Michael Kreim wrote:

    > Peter Otten wrote:
    >> Move it into a function; this turns a and i into local variables.
    >>
    >> def f():
    >> imax = 1000000000
    >> a = 0
    >> for i in xrange(imax):
    >> a = a + 10
    >> print a
    >> f()

    >
    > Wow. It is still slower than Matlab, but your suggestion speeds up the
    > code by ca 50%.
    > But I do not understand why the change of a global to a local variable
    > gives such a big difference.


    Basically the local namespace is a C array where accessing an item is just
    pointer arithmetic while the global namespace is a Python dictionary.

    There may be optimisations for the latter. If you can read C have a look at
    the LOAD/STORE_FAST and LOAD/STORE_GLOBAL implementations for the gory
    details:

    http://svn.python.org/view/python/trunk/Python/ceval.c?view=markup

    Peter
     
    Peter Otten, Sep 2, 2010
    #4
  5. Michael Kreim <> writes:

    > Are there any ways to speed up the for/xrange loop?
    > Or do I have to live with the fact that Matlab beats Python in this
    > example?


    To a point, yes. However, there are things you can do to make your
    Python code go faster. One has been pointed out by Peter.

    Another is that Python treats numbers as regular heap objects, so
    creating a bunch of unneeded integers by xrange slows things down
    (despite allocation of integer objects being heavily optimized). For
    this reason, you may want to change xrange(1000000000) to
    itertools.repeat(None, 1000000000).

    $ python -m timeit -s 'from itertools import repeat' 'for _ in repeat(None, 100000): pass'
    1000 loops, best of 3: 1.71 msec per loop
    $ python -m timeit -s 'from itertools import repeat' 'for _ in xrange(100000): pass'
    100 loops, best of 3: 2.43 msec per loop
     
    Hrvoje Niksic, Sep 2, 2010
    #5
  6. Michael Kreim

    Nobody Guest

    On Thu, 02 Sep 2010 12:02:40 +0200, Michael Kreim wrote:

    > I was comparing the speed of a simple loop program between Matlab and
    > Python.


    > imax = 1000000000
    > a = 0
    > for i in xrange(imax):
    > a = a + 10
    > print a


    > Are there any ways to speed up the for/xrange loop?


    Sure; the above can be reduced to just:

    print imax * 10
    ;)

    More seriously, if you're comparing against Matlab, you should look at
    NumPy. If there's a reasonably direct approach using NumPy, it will be
    much quicker than a Python "for" loop (in a sense, NumPy is a library of
    useful "for" loops implemented in C).

    Even a fairly indirect NumPy approach is often quicker than pure Python.
     
    Nobody, Sep 2, 2010
    #6
  7. Michael Kreim

    Philip Bloom Guest

    Uh.

    Try:
    Imax=1000000000
    a=0
    i=0
    While(i<imax):
    a= a+10
    i=i+1
    print a

    I suspect you will find it is way faster than using range or xrange for
    large numbers and map far more closely in the final result to what you
    are doing on matlab's side. At least last I checked, xrange and range
    both involve iterating through an array, which is much slower in all
    cases than just doing an int vs int compare (which is what your matlab
    is doing).

    -----Original Message-----
    From: python-list-bounces+pbloom=
    [mailto:python-list-bounces+pbloom=] On Behalf Of
    Nobody
    Sent: Thursday, September 02, 2010 9:05 AM
    To:
    Subject: Re: Speed-up for loops

    On Thu, 02 Sep 2010 12:02:40 +0200, Michael Kreim wrote:

    > I was comparing the speed of a simple loop program between Matlab and
    > Python.


    > imax = 1000000000
    > a = 0
    > for i in xrange(imax):
    > a = a + 10
    > print a


    > Are there any ways to speed up the for/xrange loop?


    Sure; the above can be reduced to just:

    print imax * 10
    ;)

    More seriously, if you're comparing against Matlab, you should look at
    NumPy. If there's a reasonably direct approach using NumPy, it will be
    much quicker than a Python "for" loop (in a sense, NumPy is a library of
    useful "for" loops implemented in C).

    Even a fairly indirect NumPy approach is often quicker than pure Python.

    --
    http://mail.python.org/mailman/listinfo/python-list

    ______________________________________________________________________
    This email has been scanned by the MessageLabs
    ______________________________________________________________________

    ______________________________________________________________________
    This email has been scanned by the MessageLabs
    ______________________________________________________________________
     
    Philip Bloom, Sep 2, 2010
    #7
  8. Michael Kreim

    Peter Otten Guest

    Philip Bloom wrote:

    > Uh.


    > Try:
    > Imax=1000000000
    > a=0
    > i=0
    > While(i<imax):
    > a= a+10
    > i=i+1
    > print a


    > I suspect you will find it is way faster than using range or xrange for
    > large numbers and map far more closely in the final result to what you
    > are doing on matlab's side. At least last I checked, xrange and range
    > both involve iterating through an array, which is much slower in all
    > cases than just doing an int vs int compare (which is what your matlab
    > is doing).


    How did you check?

    $ python -m timeit "for i in xrange(1000000): pass"
    10 loops, best of 3: 47.5 msec per loop
    $ python -m timeit "i = 0" "while i < 1000000: i += 1"
    10 loops, best of 3: 152 msec per loop
    $

    So an empty while loop takes about three times as long as the equivalent for
    loop. Also:

    """
    class xrange(object)
    | xrange([start,] stop[, step]) -> xrange object
    |
    | Like range(), but instead of returning a list, returns an object that
    | generates the numbers in the range on demand. For looping, this is
    | slightly faster than range() and more memory efficient.
    """

    Peter
     
    Peter Otten, Sep 2, 2010
    #8
  9. Michael Kreim

    BartC Guest

    "Michael Kreim" <> wrote in message
    news:...

    > I was comparing the speed of a simple loop program between Matlab and
    > Python.
    >
    > My Codes:
    > $ cat addition.py
    > imax = 1000000000
    > a = 0
    > for i in xrange(imax):
    > a = a + 10
    > print a


    > Unfortunately my Python Code was much slower and I do not understand why.
    >
    > Are there any ways to speed up the for/xrange loop?
    > Or do I have to live with the fact that Matlab beats Python in this
    > example?


    I'm not sure the Python developers were interested in getting fast loops.

    For-loops which iterate between two numbers are amongst the easiest things
    to make fast in a language. Yet originally you had to use:

    for i in range(N):

    which (if I understood correctly) actually created a list of N objects,
    populated it with the values 0, 1, 2...N-1 (presumably using a more sensible
    loop), then iterated between the values of the list!

    So Python had the distinction of being one of the slowest languages in which
    to do nothing (ie. running an empty loop).

    --
    Bartc
     
    BartC, Sep 3, 2010
    #9
  10. BartC, 03.09.2010 22:17:
    > for i in range(N):
    >
    > which (if I understood correctly) actually created a list of N objects,
    > populated it with the values 0, 1, 2...N-1 (presumably using a more
    > sensible loop), then iterated between the values of the list!


    I guess what applies here is "special cases aren't special enough to break
    the rules". The performance is good enough in most cases, it only hurts
    when the range is large and the loop body is small in comparison, such as
    in the most obvious stupid benchmarks.

    Also, xrange() is a pretty old addition the the language and now replaces
    range() in Python 3.

    Stefan
     
    Stefan Behnel, Sep 3, 2010
    #10
  11. On Fri, 3 Sep 2010 21:17:44 +0100, "BartC" <> declaimed
    the following in gmane.comp.python.general:


    > I'm not sure the Python developers were interested in getting fast loops.
    >
    > For-loops which iterate between two numbers are amongst the easiest things
    > to make fast in a language. Yet originally you had to use:
    >
    > for i in range(N):
    >
    > which (if I understood correctly) actually created a list of N objects,
    > populated it with the values 0, 1, 2...N-1 (presumably using a more sensible
    > loop), then iterated between the values of the list!
    >


    In those languages in which a for loop cycles over a range of
    integral values (or even floating point values) one often ends up with
    code that then performs subscripting to access the real goal of the
    loop...

    The Python for loop is innately focused on giving one an object from
    some iterable (list, tuple, dictionary, user-defined...) with which one
    can directly process... Compare {pseudo-code}:

    for item in world_objects:
    item.update(simulation_time)

    to

    for i in range(len(world_objects)):
    world_objects.update(simulation_time)

    Those languages that have optimized numeric-only (integer-only in
    some cases) for loops do not support the clean interface of the first
    sample and force all access to look like the latter.

    --
    Wulfraed Dennis Lee Bieber AF6VN
    HTTP://wlfraed.home.netcom.com/
     
    Dennis Lee Bieber, Sep 4, 2010
    #11
  12. On Fri, 03 Sep 2010 21:17:44 +0100, BartC wrote:


    > I'm not sure the Python developers were interested in getting fast
    > loops.
    >
    > For-loops which iterate between two numbers are amongst the easiest
    > things to make fast in a language. Yet originally you had to use:
    >
    > for i in range(N):


    I don't have any versions of Python prior to version 1.5, but as far back
    as that there was always a choice between creating a list with range()
    and a lazy iterator with xrange().

    > which (if I understood correctly) actually created a list of N objects,
    > populated it with the values 0, 1, 2...N-1 (presumably using a more
    > sensible loop), then iterated between the values of the list!


    By "more sensible", do you mean "in C code"? If so, then you are correct.


    > So Python had the distinction of being one of the slowest languages in
    > which to do nothing (ie. running an empty loop).


    Nonsense.


    [steve@sylar ~]$ time python test.py

    real 0m3.441s
    user 0m2.969s
    sys 0m0.024s
    [steve@sylar ~]$ time perl test.pl

    real 0m3.490s
    user 0m2.722s
    sys 0m0.011s
    [steve@sylar ~]$ time ruby test.rb

    real 0m11.875s
    user 0m6.740s
    sys 0m3.995s


    The difference between an empty loop in Python and Perl is insignificant,
    and much faster than Ruby (at least the specific version of Ruby
    installed on my machine, 1.8.6).

    And if you want to see the code I ran:


    [steve@sylar ~]$ cat test.*
    # perl
    for ($i = 0; $i < 10_000_000; ++$i) {
    1;
    }
    # python
    for i in xrange(10000000):
    1
    # ruby
    for i in 0...10000000
    1
    end


    Just for comparisons' sake:

    [steve@sylar ~]$ gpc empty_test.p
    [steve@sylar ~]$ time ./a.out

    real 0m0.106s
    user 0m0.070s
    sys 0m0.004s
    [steve@sylar ~]$ cat empty_test.p
    program main(input, output);
    var
    i: integer;
    begin
    for i := 0 to 10000000 do
    begin
    end;
    end.


    Of course, a real optimizing compiler would realise that the Pascal code
    did nothing at all, and compile it all away to an empty a.out file...



    --
    Steven
     
    Steven D'Aprano, Sep 5, 2010
    #12
  13. Steven D'Aprano, 05.09.2010 17:00:
    > Of course, a real optimizing compiler would realise that the Pascal code
    > did nothing at all, and compile it all away to an empty a.out file...


    Which is just one of the reasons why this kind if "benchmark" provides no
    insight into anything that should have an impact on a programmer's daily job.

    Stefan
     
    Stefan Behnel, Sep 5, 2010
    #13
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Ham

    I need speed Mr .Net....speed

    Ham, Oct 28, 2004, in forum: ASP .Net
    Replies:
    6
    Views:
    2,342
    Antony Baula
    Oct 29, 2004
  2. efiedler
    Replies:
    1
    Views:
    2,051
    Tim Ward
    Oct 9, 2003
  3. Tim Wintle

    Re: Speed-up for loops

    Tim Wintle, Sep 2, 2010, in forum: Python
    Replies:
    5
    Views:
    314
    Tim Wintle
    Sep 3, 2010
  4. David Cournapeau

    Re: Speed-up for loops

    David Cournapeau, Sep 5, 2010, in forum: Python
    Replies:
    19
    Views:
    700
    BartC
    Sep 8, 2010
  5. Me
    Replies:
    2
    Views:
    246
Loading...

Share This Page