dot products

Discussion in 'Python' started by Rahul, Dec 19, 2004.

  1. Rahul

    Rahul Guest

    HI.
    I want to compute dot product of two vectors stored as lists a and b.a
    and b are of the same length.

    one simple way is
    sum(a*b for i in range(len(a)))

    another simple way is
    ans=0.0
    for i in range(len(a)):
    ans=ans+a*b

    But is there any other way which is faster than any of the above. (By
    the way profiling them i found that the second is faster by about 30%.)
    rahul
    Rahul, Dec 19, 2004
    #1
    1. Advertising

  2. Rahul wrote:

    > I want to compute dot product of two vectors stored as lists a and b.a
    > and b are of the same length.
    >
    > one simple way is
    > sum(a*b for i in range(len(a)))
    >


    btw, imho the most "Pythonic" would be:

    sum(i*j for (i,j) in zip(a,b))
    Andrey Tatarinov, Dec 19, 2004
    #2
    1. Advertising

  3. Rahul

    Nick Coghlan Guest

    Rahul wrote:
    > HI.
    > I want to compute dot product of two vectors stored as lists a and b.a
    > and b are of the same length.
    >
    > one simple way is
    > sum(a*b for i in range(len(a)))
    >
    > another simple way is
    > ans=0.0
    > for i in range(len(a)):
    > ans=ans+a*b
    >
    > But is there any other way which is faster than any of the above.


    Try:

    sum(x * y for x, y in zip(a, b))

    Between zip() (lockstep iteration over several sequences) and enumerate()
    (iteration over a sequence, but also providing an index counter), it is rare
    that you will want to use indexing notation in a generator expression.

    > (By the way profiling them i found that the second is faster by about 30%.)


    For short sequences, generator expressions may end up slightly slower than list
    comprehensions or for loops, as the latter two do not involve the overhead of
    setting up the generator and retrieving values from it. As the sequences
    increase in length, generator expressions generally win in the end due to their
    reduced memory impact.

    Cheers,
    Nick.

    --
    Nick Coghlan | | Brisbane, Australia
    ---------------------------------------------------------------
    http://boredomandlaziness.skystorm.net
    Nick Coghlan, Dec 19, 2004
    #3
  4. Rahul

    Peter Otten Guest

    Rahul wrote:

    > I want to compute dot product of two vectors stored as lists a and b.a
    > and b are of the same length.
    >
    > one simple way is
    > sum(a*b for i in range(len(a)))
    >
    > another simple way is
    > ans=0.0
    > for i in range(len(a)):
    > ans=ans+a*b
    >
    > But is there any other way which is faster than any of the above. (By
    > the way profiling them i found that the second is faster by about 30%.)


    You could try

    sigma = 0
    for ai, bi in itertools.izip(a, b):
    sigma += ai * bi

    or

    sum(itertools.starmap(operator.mul, itertools.izip(a, b)))

    but if you are really concerned about number-crunching efficiency, use
    Numarray.

    Peter
    Peter Otten, Dec 19, 2004
    #4
  5. Rahul wrote:
    > I want to compute dot product of two vectors stored as lists a and b.a
    > and b are of the same length.


    .. >>> import numarray as na
    .. >>> a, b = na.arange(5), na.arange(5, 10)
    .. >>> na.dot(a, b)
    .. 80

    Steve
    Steven Bethard, Dec 19, 2004
    #5
  6. [Rahul].
    > I want to compute dot product of two vectors stored as lists a and b.a
    > and b are of the same length.
    >
    > one simple way is
    > sum(a*b for i in range(len(a)))
    >
    > another simple way is
    > ans=0.0
    > for i in range(len(a)):
    > ans=ans+a*b
    >
    > But is there any other way which is faster than any of the above.


    Yes:
    from itertools import imap
    from operator import mul
    ans = sum(imap(mul, a, b))

    In general:
    * reduction functions like sum() do not need their arguments to
    take time building a full list; instead, an iterator will do fine
    * applying itertools instead of genexps can save the eval-loop overhead
    * however, genexps are usually more readable than itertools solutions
    * xrange() typically beats range()
    * but indexing is rarely the way to go
    * izip() typically beats zip()
    * imap() can preclude the need for either izip() or zip()
    * the timeit module settles these questions quickly

    Here are the some timings for vectors of length 10 and 3 respectively

    C:\pydev>python timedot.py 3
    1.25333310984 sum(a*b for i in xrange(len(a)))
    1.16825625639 sum(x*y for x,y in izip(a,b))
    1.45373455807 sum(x*y for x,y in zip(a,b))
    0.635497577901 sum(imap(mul, a, b))
    0.85894416601 sum(map(mul, a, b))

    C:\pydev>python timedot.py 10
    2.19490353509 sum(a*b for i in xrange(len(a)))
    2.01773998894 sum(x*y for x,y in izip(a,b))
    2.44932533231 sum(x*y for x,y in zip(a,b))
    1.24698871922 sum(imap(mul, a, b))
    1.49768685362 sum(map(mul, a, b))



    Raymond Hettinger
    Raymond Hettinger, Dec 19, 2004
    #6
  7. Raymond Hettinger wrote:

    > * applying itertools instead of genexps can save the eval-loop overhead
    > * however, genexps are usually more readable than itertools solutions


    I'm still waiting for you to implement itertools as a parse-tree analyzer/code generator,
    rather than an "bare" extension module.

    </F>
    Fredrik Lundh, Dec 19, 2004
    #7
  8. Rahul

    Rahul Guest

    Raymond Hettinger wrote:
    > [Rahul].
    > > I want to compute dot product of two vectors stored as lists a and

    b.a
    > > and b are of the same length.
    > >
    > > one simple way is
    > > sum(a*b for i in range(len(a)))
    > >
    > > another simple way is
    > > ans=0.0
    > > for i in range(len(a)):
    > > ans=ans+a*b
    > >
    > > But is there any other way which is faster than any of the above.

    >
    > Yes:
    > from itertools import imap
    > from operator import mul
    > ans = sum(imap(mul, a, b))
    >
    > In general:
    > * reduction functions like sum() do not need their arguments to
    > take time building a full list; instead, an iterator will do fine
    > * applying itertools instead of genexps can save the eval-loop

    overhead
    > * however, genexps are usually more readable than itertools solutions
    > * xrange() typically beats range()
    > * but indexing is rarely the way to go
    > * izip() typically beats zip()
    > * imap() can preclude the need for either izip() or zip()
    > * the timeit module settles these questions quickly
    >
    > Here are the some timings for vectors of length 10 and 3 respectively
    >
    > C:\pydev>python timedot.py 3
    > 1.25333310984 sum(a*b for i in xrange(len(a)))
    > 1.16825625639 sum(x*y for x,y in izip(a,b))
    > 1.45373455807 sum(x*y for x,y in zip(a,b))
    > 0.635497577901 sum(imap(mul, a, b))
    > 0.85894416601 sum(map(mul, a, b))
    >
    > C:\pydev>python timedot.py 10
    > 2.19490353509 sum(a*b for i in xrange(len(a)))
    > 2.01773998894 sum(x*y for x,y in izip(a,b))
    > 2.44932533231 sum(x*y for x,y in zip(a,b))
    > 1.24698871922 sum(imap(mul, a, b))
    > 1.49768685362 sum(map(mul, a, b))
    >
    >
    >
    > Raymond Hettinger

    Thanks all of you guys for enlightening me. Python is truly elegant.
    Rahul, Dec 20, 2004
    #8
  9. On 19 Dec 2004 03:04:15 -0800, "Rahul" <> wrote:

    >HI.
    >I want to compute dot product of two vectors stored as lists a and b.a
    >and b are of the same length.
    >
    >one simple way is
    >sum(a*b for i in range(len(a)))
    >
    >another simple way is
    >ans=0.0
    >for i in range(len(a)):
    >ans=ans+a*b
    >
    >But is there any other way which is faster than any of the above. (By
    >the way profiling them i found that the second is faster by about 30%.)
    >rahul
    >

    Don't know about the timing, but another way:

    >>> import operator
    >>> a, b = range(5), range(5,10)
    >>> sum(map(operator.mul, a, b))

    80

    Checking...

    >>> class OpShow(object):

    ... def __init__(self): self.tot = 0
    ... def __call__(self, x, y):
    ... prod = x*y
    ... self.tot += prod
    ... print '%3s * %-3s => %s (tot %s)' %(x, y, prod, self.tot)
    ... return prod
    ...
    >>> sum(map(OpShow(), a, b))

    0 * 5 => 0 (tot 0)
    1 * 6 => 6 (tot 6)
    2 * 7 => 14 (tot 20)
    3 * 8 => 24 (tot 44)
    4 * 9 => 36 (tot 80)
    80

    Regards,
    Bengt Richter
    Bengt Richter, Dec 20, 2004
    #9
  10. Rahul

    John Lenton Guest

    On Sun, Dec 19, 2004 at 03:04:15AM -0800, Rahul wrote:
    > HI.
    > I want to compute dot product of two vectors stored as lists a and b.a
    > and b are of the same length.
    >
    > one simple way is
    > sum(a*b for i in range(len(a)))
    >
    > another simple way is
    > ans=0.0
    > for i in range(len(a)):
    > ans=ans+a*b
    >
    > But is there any other way which is faster than any of the above. (By
    > the way profiling them i found that the second is faster by about 30%.)
    > rahul


    some numbers to confirm my previous reply:

    1
    zip: 0.00115 (1.00)
    range: 0.00108 (0.94)
    array: 0.00075 (0.65)
    10
    zip: 0.00306 (1.00)
    range: 0.00288 (0.94)
    array: 0.00074 (0.24)
    100
    zip: 0.02195 (1.00)
    range: 0.02035 (0.93)
    array: 0.00079 (0.04)
    1000
    zip: 0.21016 (1.00)
    range: 0.19930 (0.95)
    array: 0.00130 (0.01)
    10000
    zip: 4.98902 (1.00)
    range: 2.70843 (0.54)
    array: 0.01405 (0.00)

    (the integers are the number of elements in a random array of floats;
    'zip' refers to

    sum([x*y for (x,y) in zip(a,b)])

    'range', to

    sum([a*b for i in range(len(a))])

    and 'array' makes a and b Numeric's 'array' objects, with atlas
    installed (and hence dotblas loading and assembler version of dot
    tuned for the system's processor (in this case a pentium3)). The code
    in this case is simply

    Numeric.dot(a, b)

    The advantage of atlas on systems with sse2 should be even greater.

    --
    John Lenton () -- Random fortune:
    El deseo nos fuerza a amar lo que nos hará sufrir.
    -- Marcel Proust. (1871-1922) Escritor francés.

    -----BEGIN PGP SIGNATURE-----
    Version: GnuPG v1.2.5 (GNU/Linux)

    iD8DBQFBxv1BgPqu395ykGsRAvuKAJkBULU763LN368mVFJf+trqku8/KQCbB75C
    noYAuTFkRh4SJ3VmJCXpwZY=
    =F9aS
    -----END PGP SIGNATURE-----
    John Lenton, Dec 20, 2004
    #10
  11. [Rahul].
    > > I want to compute dot product of two vectors stored as lists a and b.a
    > > and b are of the same length.
    > >
    > > one simple way is
    > > sum(a*b for i in range(len(a)))
    > >
    > > another simple way is
    > > ans=0.0
    > > for i in range(len(a)):
    > > ans=ans+a*b
    > >
    > > But is there any other way which is faster than any of the above.

    >
    > Yes:
    > from itertools import imap
    > from operator import mul
    > ans = sum(imap(mul, a, b))


    Doh! We all missed it.

    If your vector length is known in advance (and it often is in some apps), the
    simplest and fastest approach is:

    a[0]*b[0] + a[1]*b[1] + a[2]*b[2]


    C:\pydev>python timedot.py 3
    0.32 sec: a[0]*b[0] + a[1]*b[1] + a[2]*b[2]
    1.38 sec: sum(a*b for i in xrange(len(a)))
    1.32 sec: sum(x*y for x,y in izip(a,b))
    1.62 sec: sum(x*y for x,y in zip(a,b))
    0.75 sec: sum(imap(mul, a, b))
    1.04 sec: sum(map(mul, a, b))



    Raymond Hettinger
    Raymond Hettinger, Dec 21, 2004
    #11
  12. Rahul

    Alan G Isaac Guest

    [Rahul].
    > I want to compute dot product of two vectors stored as lists a and b.a
    > and b are of the same length


    from scipy import dot
    ans=dot(a,b)

    This times faster than the alternatives I have seen mentioned so far,
    given scipy.

    Cheers,
    Alan Isaac
    Alan G Isaac, Dec 21, 2004
    #12
  13. Rahul

    Alan G Isaac Guest

    "Alan G Isaac" <> wrote in message
    news:...
    > This times faster than the alternatives I have seen mentioned so far,
    > given scipy.


    Actually, since I am new to 'timeit', I probably should check that
    I am not overlooking something. Especially since I see an order
    of magnitude difference in performance. Does the code below
    give the right comparisons?

    Thanks,
    Alan Isaac

    #-------------------------------------------------------------
    import timeit
    env1='''
    from operator import mul
    from itertools import imap
    def innerprod(x,y):
    return sum(imap(mul,x,y))
    from scipy import rand
    x=rand(50); y=rand(50)
    '''
    env2='''
    from operator import mul
    from itertools import imap
    from scipy import rand
    x=rand(50); y=rand(50)
    '''
    env3='''
    from scipy import rand,dot
    x=rand(50); y=rand(50)
    '''
    t1=timeit.Timer("innerprod(x,y)",env1)
    t2=timeit.Timer("sum(imap(mul,x,y))",env2)
    t3=timeit.Timer("dot(x,y)",env3)
    trials=1000
    print t1.repeat(2,trials) #about 0.1 seconds
    print t2.repeat(2,trials) #about 0.1 seconds
    print t3.repeat(2,trials) #about 0.01 seconds
    Alan G Isaac, Dec 21, 2004
    #13
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Samuël van Laere

    To dot or not to dot?

    Samuël van Laere, Oct 16, 2003, in forum: HTML
    Replies:
    8
    Views:
    423
    Samuël van Laere
    Oct 16, 2003
  2. Christopher M. Lusardi

    volatile struct in dot h vs dot c

    Christopher M. Lusardi, May 11, 2004, in forum: C Programming
    Replies:
    3
    Views:
    473
    Peter Shaggy Haywood
    May 15, 2004
  3. Nathan Sokalski
    Replies:
    11
    Views:
    702
    AAaron123
    Aug 14, 2009
  4. krishnan

    Dot Net Project Execution without Dot Net and Framework....

    krishnan, Jan 7, 2006, in forum: ASP .Net Building Controls
    Replies:
    0
    Views:
    185
    krishnan
    Jan 7, 2006
  5. Replies:
    6
    Views:
    244
    Thomas 'PointedEars' Lahn
    Dec 12, 2005
Loading...

Share This Page