A question on python performance.

Discussion in 'Python' started by Bruno Desthuilliers, Sep 23, 2007.

  1. Joe Goldthwaite wrote:
    > Hi everyone,
    >
    > I'm a developer who's been using python for a couple of years. I wrote a
    > fairly large application using it but I was learning the language at the
    > same time so most of the code kind of sucks.
    >
    > I've learned a lot since then and I've been going through my code trying to
    > organize it better and make better use of Python's features. I'm still not
    > an expert by any definition but I'm slowly getting better.
    >
    > I've been working on a trend class that takes twelve monthly numbers and
    > returns a period to date, quarter to date, year to date and quarterly year
    > to date numbers for a specific period. This worked but I ended up with a lot
    > of code like this;
    >
    > def getValue(trend, param, per):
    >     if param == 'Ptd':
    >         return trend.Ptd(per)
    >     elif param == 'Qtd':
    >         return trend.Qtd(per)
    >     elif param == 'Ytd':
    >         return trend.Ytd(per)
    >     elif param == 'YtdQ':
    >         return trend.YtdQ(per)


    The first obvious simplification is to replace this with:

    def getValue(trend, param, per):
        meth = getattr(trend, param)
        return meth(per)

    The main difference is that it will raise an AttributeError (instead of
    silently returning None) if param is not the name of an attribute of trend.

    The second simplification is to get rid of getValue() altogether, since it
    is mostly useless.
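
    A minimal sketch of the getattr-based dispatch above, written for Python 3.
    The Trend class is a hypothetical stand-in (the real class is not shown in
    this thread), the Ptd/Ytd bodies are assumed meanings, and the ALLOWED set
    is an optional extra that was not in the original code; it only stops
    arbitrary attribute names from being called.

```python
class Trend:
    """Hypothetical stand-in for the trend class discussed in this thread."""

    def __init__(self, months):
        self.months = list(months)  # twelve monthly numbers

    def Ptd(self, per):
        # Period to date (assumed meaning): value of the 1-based period.
        return self.months[per - 1]

    def Ytd(self, per):
        # Year to date (assumed meaning): sum of periods 1..per.
        return sum(self.months[:per])


ALLOWED = {"Ptd", "Ytd"}  # assumption: names we are willing to dispatch to


def get_value(trend, param, per):
    if param not in ALLOWED:
        raise ValueError("unknown parameter: %r" % (param,))
    # getattr returns a bound method, so only `per` is passed.
    return getattr(trend, param)(per)


trend = Trend(range(1, 13))
ptd = get_value(trend, "Ptd", 3)
ytd = get_value(trend, "Ytd", 3)
```

    Without the whitelist, a bad name simply surfaces as the AttributeError
    raised by getattr.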

    > The code gets kind of wordy


    indeed

    > so I started trying to figure out how to call
    > them dynamically since the param type is the same as the method that
    > retrieves it. I came up with this;
    >
    > def getValue(trend, param, per):
    >     return trend.__class__.__dict__[param](trend, per)


    Note that this is not strictly equivalent:

    class Parent(object):
        def __init__(self, name):
            self.name = name

        def __repr__(self):
            return "<%s %s>" % (self.__class__.__name__, self.name)

        def dothis(self):
            return "parent.dothis %s" % self

    class Child(Parent):
        def dothis(self):
            return "Child.dothis %s" % self

    class OtherChild(Parent):
        pass

    def dothat(obj):
        return "dothat %s" % obj

    p = Parent('p')
    c1 = Child('c1')
    c2 = Child('c2')
    # Per-instance override: bind dothat to this one instance only.
    c2.dothis = dothat.__get__(c2, type(c2))
    o1 = OtherChild('o1')
    o2 = OtherChild('o2')
    o2.dothis = dothat.__get__(o2, type(o2))

    for obj in p, c1, c2, o1, o2:
        print("obj : %s" % obj)
        print("direct call :")
        print(obj.dothis())
        print("via obj.__class__.__dict__ :")
        try:
            print(obj.__class__.__dict__["dothis"](obj))
        except KeyError as e:
            print("oops - key error: %s" % e)
        print("")

    =>
    obj : <Parent p>
    direct call :
    parent.dothis <Parent p>
    via obj.__class__.__dict__ :
    parent.dothis <Parent p>

    obj : <Child c1>
    direct call :
    Child.dothis <Child c1>
    via obj.__class__.__dict__ :
    Child.dothis <Child c1>

    obj : <Child c2>
    direct call :
    dothat <Child c2>
    via obj.__class__.__dict__ :
    Child.dothis <Child c2>

    obj : <OtherChild o1>
    direct call :
    parent.dothis <OtherChild o1>
    via obj.__class__.__dict__ :
    oops - key error: 'dothis'

    obj : <OtherChild o2>
    direct call :
    dothat <OtherChild o2>
    via obj.__class__.__dict__ :
    oops - key error: 'dothis'


    In other words, direct access to obj.__class__.__dict__ bypasses both
    inheritance and per-instance overriding.
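
    To make clear what the normal attribute lookup does that a bare
    obj.__class__.__dict__[name] does not, here is a deliberately simplified
    model of it for Python 3. simple_lookup is a hypothetical helper, not
    something from the standard library, and it ignores data descriptors,
    __getattr__ and __slots__; it only shows the instance check and the walk
    through the base classes.

```python
def simple_lookup(obj, name):
    """Simplified model of obj.name (illustration only)."""
    # 1. Per-instance override, if the instance has a __dict__.
    inst_dict = getattr(obj, "__dict__", {})
    if name in inst_dict:
        return inst_dict[name]
    # 2. Walk the method resolution order: the class, then its bases.
    for klass in type(obj).__mro__:
        if name in klass.__dict__:
            attr = klass.__dict__[name]
            # Plain functions are descriptors: bind them to the instance.
            if hasattr(attr, "__get__"):
                return attr.__get__(obj, type(obj))
            return attr
    raise AttributeError(name)


class Parent(object):
    def dothis(self):
        return "Parent.dothis"


class OtherChild(Parent):
    pass


o1 = OtherChild()
inherited = simple_lookup(o1, "dothis")()    # found on Parent via the MRO

o2 = OtherChild()
o2.dothis = lambda: "per-instance override"
overridden = simple_lookup(o2, "dothis")()   # found in the instance __dict__
```

    In real code, getattr(obj, name) does all of this (and more) for you.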

    > That worked but it seems like the above line would have to do lots more
    > object look ups at runtime so I didn't think it would be very efficient. I
    > thought maybe I could add a caller method to the trend class and I came up
    > with this;
    >
    > class trend:
    >     ...
    >     ...
    >     ...
    >     def caller(self, param, *args):
    >         return self.__class__.__dict__[param](self, *args)
    >
    > This simplified the getValue function to this;
    >
    > def getValue(trend, param, per):
    >     return trend.caller(param, per)


    Err... It actually means *more* lookups and function calls, and it still
    fails to behave correctly with respect to polymorphic dispatch.
     
    Bruno Desthuilliers, Sep 23, 2007
    #1

  2. Hi everyone,

    I'm a developer who's been using python for a couple of years. I wrote a
    fairly large application using it but I was learning the language at the
    same time so most of the code kind of sucks.

    I've learned a lot since then and I've been going through my code trying to
    organize it better and make better use of Python's features. I'm still not
    an expert by any definition but I'm slowly getting better.

    I've been working on a trend class that takes twelve monthly numbers and
    returns a period to date, quarter to date, year to date and quarterly year
    to date numbers for a specific period. This worked but I ended up with a lot
    of code like this;

    def getValue(trend, param, per):
        if param == 'Ptd':
            return trend.Ptd(per)
        elif param == 'Qtd':
            return trend.Qtd(per)
        elif param == 'Ytd':
            return trend.Ytd(per)
        elif param == 'YtdQ':
            return trend.YtdQ(per)

    The code gets kind of wordy so I started trying to figure out how to call
    them dynamically since the param type is the same as the method that
    retrieves it. I came up with this;

    def getValue(trend, param, per):
        return trend.__class__.__dict__[param](trend, per)

    That worked but it seems like the above line would have to do lots more
    object look ups at runtime so I didn't think it would be very efficient. I
    thought maybe I could add a caller method to the trend class and I came up
    with this;

    class trend:
        ...
        ...
        ...
        def caller(self, param, *args):
            return self.__class__.__dict__[param](self, *args)

    This simplified the getValue function to this;

    def getValue(trend, param, per):
        return trend.caller(param, per)

    Out of curiosity, I thought I'd do some benchmarking and see which one
    performs the best. I executed three loops multiple times;

    loop one. Time=11.71 seconds;
    trend.Ptd(per)
    trend.Qtd(per)
    trend.Ytd(per)
    trend.YtdQ(per)

    loop two. 12.107 seconds;
    trend.__class__.__dict__['Ptd'](trend, per)
    trend.__class__.__dict__['Qtd'](trend, per)
    trend.__class__.__dict__['Ytd'](trend, per)
    trend.__class__.__dict__['YtdQ'](trend, per)

    loop three. 17.085 seconds;
    trend.caller('Ptd', per)
    trend.caller('Qtd', per)
    trend.caller('Ytd', per)
    trend.caller('YtdQ', per)

    The first surprise was how close the first and second loops were. I would
    have thought the first loop would be much faster. The second surprise was
    how much slower the third loop was. I know it has an extra call in there
    but other than that, it's doing basically the same thing as loop two. Is
    there that much overhead in making a class method call?

    Can anyone explain the differences?
     
    Joe Goldthwaite, Sep 26, 2007
    #2

  3. Bruno Desthuilliers

    Guest

    On Sep 26, 2:26 pm, "Joe Goldthwaite" wrote:
    > Out of curiosity, I thought I'd do some benchmarking and see which one
    > performs the best. I executed three loops multiple times;
    >
    > loop one. Time=11.71 seconds;
    > trend.Ptd(per)
    > trend.Qtd(per)
    > trend.Ytd(per)
    > trend.YtdQ(per)
    >
    > loop two. 12.107 seconds;
    > trend.__class__.__dict__['Ptd'](trend, per)
    > trend.__class__.__dict__['Qtd'](trend, per)
    > trend.__class__.__dict__['Ytd'](trend, per)
    > trend.__class__.__dict__['YtdQ'](trend, per)
    >
    > loop three. 17.085 seconds;
    > trend.caller('Ptd', per)
    > trend.caller('Qtd', per)
    > trend.caller('Ytd', per)
    > trend.caller('YtdQ', per)
    >
    > The first surprise was how close the first and second loops were. I would
    > have thought the first loop would be much faster. The second surprise was
    > how much slower the third loop was. I know it has an extra call in there
    > but other than that, it's doing basically the same thing as loop two. Is
    > there that much overhead in making a class method call?
    >
    > Can anyone explain the differences?


    Makes perfect sense to me! Think about it:

    method 1: looks up the method directly from the object (fastest)
    method 2: looks up __class__, then looks up __dict__, then gets the
    element from __dict__
    method 3: looks up caller, looks up __class__, looks up __dict__, gets
    element from __dict__

    To get the element directly from the object (method 1), Python has to
    internally check the instance and then its class's __dict__, which is why
    method 1 and method 2 are nearly the same speed. The last version has to
    look up and call caller in addition to the process described by method 2.

    The best way to do what you are doing:

    getattr(self, param)(*args)

    getattr already returns a method bound to self, so self is not passed again.
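
    For anyone who wants to repeat the comparison, here is a rough timeit
    sketch for Python 3. The Trend class and its trivial Ptd body are
    hypothetical stand-ins, since the real class is not shown in this thread,
    and absolute timings will differ by machine and Python version. Each call
    is wrapped in a lambda, which adds the same small overhead to every
    variant.

```python
import timeit


class Trend:
    """Hypothetical stand-in; the real trend class is not shown here."""

    def Ptd(self, per):
        return per

    def caller(self, param, *args):
        return self.__class__.__dict__[param](self, *args)


trend = Trend()
per = 3

candidates = {
    "direct": lambda: trend.Ptd(per),
    "class __dict__": lambda: trend.__class__.__dict__["Ptd"](trend, per),
    "caller": lambda: trend.caller("Ptd", per),
    "getattr": lambda: getattr(trend, "Ptd")(per),
}

results = {}
for label, func in candidates.items():
    # Best of three runs reduces noise from other processes.
    results[label] = min(timeit.repeat(func, number=100000, repeat=3))

# Print the variants from fastest to slowest.
for label, seconds in sorted(results.items(), key=lambda item: item[1]):
    print("%-15s %.4f s" % (label, seconds))
```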
     
    , Sep 26, 2007
    #3
  4. Paul Hankin

    Guest

    On Sep 26, 7:26 pm, "Joe Goldthwaite" wrote:
    > The code gets kind of wordy so I started trying to figure out how to call
    > them dynamically since the param type is the same as the method that
    > retrieves it. I came up with this;
    >
    > def getValue(trend, param, per):
    >     return trend.__class__.__dict__[param](trend, per)
    >
    > That worked but it seems like the above line would have to do lots more
    > object look ups at runtime so I didn't think it would be very efficient. I
    > thought maybe I could add a caller method to the trend class and I came up
    > with this;
    >
    > class trend:
    >     ...
    >     ...
    >     ...
    >     def caller(self, param, *args):
    >         return self.__class__.__dict__[param](self, *args)
    >
    > This simplified the getValue function to this;
    >
    > def getValue(trend, param, per):
    >     return trend.caller(param, per)


    You're calling a function (getValue) that just calls a method of trend
    (caller), that just calls another method of trend (Ptd or Qtd or ...).
    You can skip all these steps and call the method yourself: replace the
    code that calls getValue(trend, param, per) with trend.<something>(per)
    if you're calling getValue with a static value for param, or with
    getattr(trend, param)(per) if param is dynamic.
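
    A small sketch of both cases in Python 3, assuming a hypothetical Trend
    class with an assumed Ytd meaning (the real one is not shown in this
    thread). In the dynamic case, the bound method can be looked up once and
    reused when the same param is applied to many periods.

```python
class Trend:
    """Hypothetical stand-in for the trend class in this thread."""

    def __init__(self, months):
        self.months = list(months)

    def Ytd(self, per):
        # Year to date (assumed meaning): sum of periods 1..per.
        return sum(self.months[:per])


trend = Trend([10] * 12)

# Static case: the name is known when writing the code.
static_value = trend.Ytd(2)

# Dynamic case: the name arrives as a string at runtime.
param = "Ytd"
method = getattr(trend, param)  # look the bound method up once
dynamic_values = [method(per) for per in (1, 2, 3)]
```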

    --
    Paul Hankin
     
    Paul Hankin, Sep 27, 2007
    #4