Efficient way to sum a product of numbers...

Discussion in 'Python' started by vsoler, Aug 31, 2009.

  1. vsoler

    vsoler Guest

    Hi,

    After simplifying my problem, I can say that I want to get the sum of
    the product of two columns:

    Say
    m= [[ 'a', 1], [ 'b', 2],[ 'a', 3]]
    r={'a':4, 'b':5, 'c':6}

    What I need is the calculation

    1*4 + 2*5 + 3*4 = 4 + 10 + 12 = 26

    That is, for each row list in variable 'm' look for its first element
    in variable 'r' and multiply the value found by the second element in
    row 'm'. After that, sum all the products.

    What's an efficient way to do it? I have thousands of these
    calculations to make on a big data file.

    Thank you.
    vsoler, Aug 31, 2009
    #1

  2. Tim Chase

    Tim Chase Guest

    > After simplifying my problem, I can say that I want to get the sum of
    > the product of two columns:
    >
    > Say
    > m= [[ 'a', 1], [ 'b', 2],[ 'a', 3]]

    assuming you meant ['c', 3] for the last row here...
    > r={'a':4, 'b':5, 'c':6}
    >
    > What I need is the calculation
    >
    > 1*4 + 2*5 + 3*4 = 4 + 10 + 12 = 26

    and you mean "3*6" here instead of "3*4", which is 18 instead of
    12, making the whole sum 4+10+18=32


    Then it sounds like you could do something like

    result = sum(v * r[k] for k,v in m)

    where "m" is any arbitrary iterable of tuples. If the keys (the
    letters) aren't guaranteed to be in "r", then you can use
    defaults (in this case "0", but could just as likely be "1"
    depending on your intent):

    result = sum(v * r.get(k,0) for k,v in m)


    If the conditions above don't hold, you'll have to introduce me
    to your new math. ;-)
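
    To make the choice of default concrete, here is a small sketch. The
    supplier 'd' and the names skip_unknown and count_unknown are made up
    for illustration; they are not part of the original data.

```python
# Rates per key, as in the original post.
r = {'a': 4, 'b': 5, 'c': 6}
# Hypothetical rows: 'd' is deliberately missing from r.
m = [['a', 1], ['b', 2], ['d', 3]]

# Default 0: rows with an unknown key contribute nothing to the sum.
skip_unknown = sum(v * r.get(k, 0) for k, v in m)

# Default 1: rows with an unknown key contribute their raw value.
count_unknown = sum(v * r.get(k, 1) for k, v in m)

print(skip_unknown)   # 1*4 + 2*5 + 3*0 = 14
print(count_unknown)  # 1*4 + 2*5 + 3*1 = 17
```

    Either default silently hides the missing key, so it only fits cases
    where a missing entry is expected and harmless.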

    -tkc
    Tim Chase, Aug 31, 2009
    #2

  3. vsoler

    vsoler Guest

    Hello Tim,

    There is no mistake in my original post, so I really meant [ 'a', 3]

    Imagine that m contains time sheets of suppliers

    supplier 'a' has worked for you 1 hour
    supplier 'b' has worked for you 2 hours
    supplier 'a' has worked for you 3 hours

    Now

    supplier 'a' charges $4 per hour
    supplier 'b' charges $5 per hour
    supplier 'c' charges $6 per hour

    I want to know how much I will be charged this month by my panel of
    suppliers.

    1*4 + 2*5 + 3*4 = 4 + 10 + 12 = 26

    This is what I am after.
    I expect all my suppliers to have given me their hourly fee in
    advance. If at least one hasn't, I need to know that the result is
    undefined.

    Hope this helps


    Vicente Soler
    vsoler, Aug 31, 2009
    #3
  4. Tim Chase

    Tim Chase Guest

    vsoler wrote:
    > There is no mistake in my original post, so I really meant [ 'a', 3]


    Ah...that makes more sense of the data. My answer still holds
    then. Use the r[k] version instead of the r.get(...) version,
    and it will raise an exception if the rate doesn't exist in your
    mapping. (a KeyError, if you want to catch it)
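
    A sketch of how the KeyError could be turned into an explicit
    "undefined" result. The function name total_charge, the extra supplier
    'z', and the use of None as the undefined marker are assumptions made
    for illustration.

```python
def total_charge(m, r):
    """Return the sum of hours * rate, or None if any rate is missing."""
    try:
        return sum(hours * r[name] for name, hours in m)
    except KeyError:
        # At least one supplier in m has no rate in r.
        return None


m = [['a', 1], ['b', 2], ['a', 3]]
r = {'a': 4, 'b': 5, 'c': 6}

print(total_charge(m, r))               # 26
print(total_charge(m + [['z', 1]], r))  # None, because 'z' has no rate
```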

    -tkc
    Tim Chase, Aug 31, 2009
    #4
  5. vsoler

    vsoler Guest

    It works!!!

    Thank you
    vsoler, Aug 31, 2009
    #5
  6. Paul Rubin

    Paul Rubin Guest

    vsoler <> writes:
    > m= [[ 'a', 1], [ 'b', 2],[ 'a', 3]]
    > r={'a':4, 'b':5, 'c':6}
    >
    > What I need is the calculation
    >
    > 1*4 + 2*5 + 3*4 = 4 + 10 + 12 = 26


    sum(r[k]*w for k,w in m)
    Paul Rubin, Aug 31, 2009
    #6
  7. 31-08-2009 at 18:19:28 vsoler <> wrote:

    > Say
    > m= [[ 'a', 1], [ 'b', 2],[ 'a', 3]]
    > r={'a':4, 'b':5, 'c':6}
    >
    > What I need is the calculation
    >
    > 1*4 + 2*5 + 3*4 = 4 + 10 + 12 = 26
    >
    > That is, for each row list in variable 'm' look for its first element
    > in variable 'r' and multiply the value found by the second element in
    > row 'm'. After that, sum all the products.
    >
    > What's an efficient way to do it? I have thousands of these
    > calculations to make on a big data file.



    31-08-2009 at 18:30:27 Tim Chase <> wrote:

    > result = sum(v * r[k] for k,v in m)



    You can also check if this isn't more efficient:

    from itertools import starmap
    from operator import mul

    result = sum(starmap(mul, ((r[name], hour) for name, hour in m)))


    Or, if you had m in form of two lists:

    names = ['a', 'b', 'a']
    hours = [1, 2, 3]

    ...then you could do:

    from itertools import imap as map # <- remove if you use Py3.x
    from operator import mul

    result = sum(map(mul, map(r.__getitem__, names), hours))
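
    For reference, a Python 3 sketch of the same two variants, using the
    data from the original post. In Python 3 the built-in map is already
    lazy, so no imap import is needed. The names result_rows and
    result_columns are made up for this example.

```python
from itertools import starmap
from operator import mul

m = [['a', 1], ['b', 2], ['a', 3]]
r = {'a': 4, 'b': 5, 'c': 6}

# Row-oriented data: build (rate, hours) pairs and multiply each pair.
result_rows = sum(starmap(mul, ((r[name], hour) for name, hour in m)))

# Column-oriented data: look up each rate, then multiply pairwise.
names = ['a', 'b', 'a']
hours = [1, 2, 3]
result_columns = sum(map(mul, map(r.__getitem__, names), hours))

print(result_rows, result_columns)  # 26 26
```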


    Cheers,
    *j

    PS. I've done a quick test on my computer (Pentium 4, 2.4Ghz, Linux):

    >>> setup = "from itertools import starmap, imap ; from operator import
    >>> mul; import random, string; names =
    >>> [rndom.choice(string.ascii_letters) for x in xrange(10000)]; hours =
    >>> [random.randint(1, 12) for x in xrange(1000)]; m = zip(names, hours);
    >>> workers = set(names); r = dict(zip(workers, (random.randint(1, 10) for
    >>> x in xrange(en(workers)))))"
    >>> import timeit
    >>> tests = (
    ... 'sum(v * r[k] for k,v in m)',
    ... 'sum(starmap(mul, ((r[name], hour) for name, hour in m)))',
    ... 'sum(imap(mul, imap(r.__getitem__, names), hours))',
    ... )
    >>> for t in tests:
    ...     print t
    ...     timeit.repeat(t, setup, number=1000)
    ...     print
    ...
    sum(v * r[k] for k,v in m)
    [6.2493009567260742, 6.1892399787902832, 6.2634339332580566]

    sum(starmap(mul, ((r[name], hour) for name, hour in m)))
    [9.3293819427490234, 10.280816078186035, 9.2766909599304199]

    sum(imap(mul, imap(r.__getitem__, names), hours))
    [5.7341709136962891, 5.5898380279541016, 5.7318859100341797]


    --
    Jan Kaliszewski (zuo) <>
    Jan Kaliszewski, Aug 31, 2009
    #7
  8. 31-08-2009 at 22:28:56 Jan Kaliszewski <> wrote:

    > >>> setup = "from itertools import starmap, imap ; from operator

    > import mul; import random, string; names = [rndom.choice(string.
    > ascii_letters) for x in xrange(10000)]; hours = [random.randint(
    > 1, 12) for x in xrange(1000)]; m = zip(names, hours); workers =
    > set(names); r = dict(zip(workers, (random.randint(1, 10) for x i
    > n xrange(en(workers)))))"


    Erratum -- should be:

    >>> setup = (
    ... 'from itertools import starmap, imap;'
    ... 'from operator import mul;'
    ... 'import random, string; names'
    ... ' = [random.choice(string.ascii_letters)'
    ... ' for x in xrange(10000)];'
    ... 'hours = [random.randint(1, 12)'
    ... ' for x in xrange(10000)];'
    ... 'm = zip(names, hours);'
    ... 'workers = set(names);'
    ... 'r = dict(zip(workers, (random.randint(1, 10)'
    ... ' for x in xrange(len(workers)))))'
    ... )
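
    The session above is Python 2. A rough Python 3 port of the same
    benchmark might look like the sketch below. The smaller data size, the
    repeat and number values, and the results dictionary are choices made
    here to keep the run short; they are not from the original test, and
    the timings it prints will differ from the ones quoted in this thread.

```python
import timeit

# Python 3 port of the setup: range replaces xrange, map replaces imap.
# m is wrapped in list() because zip() returns a one-shot iterator in
# Python 3 and would be empty after the first timed run.
setup = (
    'from itertools import starmap;'
    'from operator import mul;'
    'import random, string;'
    'names = [random.choice(string.ascii_letters) for x in range(1000)];'
    'hours = [random.randint(1, 12) for x in range(1000)];'
    'm = list(zip(names, hours));'
    'workers = set(names);'
    'r = dict(zip(workers, (random.randint(1, 10) for x in range(len(workers)))))'
)

tests = (
    'sum(v * r[k] for k, v in m)',
    'sum(starmap(mul, ((r[name], hour) for name, hour in m)))',
    'sum(map(mul, map(r.__getitem__, names), hours))',
)

results = {}
for t in tests:
    # Three repeats of 100 runs each; absolute numbers depend on the machine.
    results[t] = timeit.repeat(t, setup, repeat=3, number=100)
    print(t)
    print(results[t])
    print()
```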

    --
    Jan Kaliszewski (zuo) <>
    Jan Kaliszewski, Aug 31, 2009
    #8
  9. John Nagle

    John Nagle Guest

    vsoler wrote:
    > Hi,
    >
    > After simplifying my problem, I can say that I want to get the sum of
    > the product of two columns:
    >
    > Say
    > m= [[ 'a', 1], [ 'b', 2],[ 'a', 3]]
    > r={'a':4, 'b':5, 'c':6}
    >
    > What I need is the calculation
    >
    > 1*4 + 2*5 + 3*4 = 4 + 10 + 12 = 26


    You need a matrix package.

    Use "numpy", the Python numerics module, if you're trying to do
    operations on multidimensional arrays. In NumPy, you can extract
    columns, multiply them together, and take the sum.

    John Nagle
    John Nagle, Aug 31, 2009
    #9