Re: fastest way to detect a user type

Discussion in 'Python' started by Steven D'Aprano, Feb 1, 2009.

  1. Robin Becker wrote:

    > Whilst considering a port of old code to python 3 I see that in several
    > places we are using type comparisons to control processing of user
    > instances (as opposed to instances of built in types eg float, int, str)
    >
    > I find that the obvious alternatives are not as fast as the current
    > code; func0 below. On my machine isinstance seems slower than type for
    > some reason. My 2.6 timings are


    First question is, why do you care that it's slower? The difference between
    the fastest and slowest functions is 1.16-0.33 = 0.83 microsecond. If you
    call the slowest function one million times, your code will run less than a
    second longer.

    Does that really matter, or are you engaged in premature optimization? In
    your test functions, the branches all execute "pass". Your real code
    probably calls other functions, makes calculations, etc, which will all
    take time. Probably milliseconds rather than microseconds. I suspect you're
    concerned about a difference of 0.1 of a percent, of one small part of your
    entire application. Unless you have profiled your code and this really is a
    bottleneck, I recommend you worry more about making your code readable and
    maintainable than worrying about micro-optimisations.
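
    For anyone who wants to reproduce this kind of measurement, here is a
    minimal sketch using the standard timeit module. It is not the OP's
    benchmark: the class X and the helpers by_type and by_isinstance are
    stand-ins invented for illustration, and the absolute timings will vary
    from machine to machine.

```python
import timeit

class X:
    pass

def by_type(ob):
    # Exact-type comparison: does not match subclasses of X.
    return type(ob) is X

def by_isinstance(ob):
    # Subclass-aware check.
    return isinstance(ob, X)

ob = X()
n = 100_000
t_type = timeit.timeit(lambda: by_type(ob), number=n)
t_isinstance = timeit.timeit(lambda: by_isinstance(ob), number=n)

# Per-call cost in microseconds; the numbers depend on the machine.
print(f"type():       {t_type / n * 1e6:.3f} us per call")
print(f"isinstance(): {t_isinstance / n * 1e6:.3f} us per call")

# The arithmetic from the post: a 0.83 microsecond difference per call,
# accumulated over one million calls, is under one second.
extra_seconds = (1.16 - 0.33) * 1e-6 * 1_000_000
print(f"extra time over 1e6 calls: {extra_seconds:.2f} s")
```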

    Even more important than being readable is being *correct*, and I believe
    that your code has some unexpected failure modes (bugs). See below:



    > so func 3 seems to be the fastest option for the case when the first
    > test matches, but is poor when it doesn't. Can anyone suggest a better
    > way to determine if an object is a user instance?
    >
    > ##############################
    > from types import InstanceType


    I believe this will go away in Python 3, as all classes will be New Style
    classes.


    > class X:
    >     __X__=True


    This is an Old Style class in Python 2.x, and a New Style class in Python 3.

    Using hasattr(ob, '__X__') is a curious way of detecting what you want. I
    suppose it could be argued that it is a variety of duck-typing: "if it has
    a duck's bill, it must be a duck". (Unless it is a platypus, of course.)
    However, attribute names with leading and trailing double-underscores are
    reserved for use as "special methods". You should rename it to something
    more appropriate: _MAGIC_LABEL, say.


    > class V(X):
    >     pass
    >
    > def func0(ob):
    >     t=type(ob)
    >     if t is InstanceType:
    >         pass


    This test is too broad. It will succeed for an instance of *any* old-style
    class, not just X and V instances. That's probably not what you want.

    It will also fail if ob is an instance of a New Style class. Remember that
    in Python 3, all classes become new-style.
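
    A quick sketch of what that means in practice, assuming the code is run
    under Python 3 (the class X here is just a placeholder):

```python
import types

class X:
    pass

# In Python 3 the types module has no InstanceType attribute,
# so "from types import InstanceType" raises ImportError.
has_instance_type = hasattr(types, "InstanceType")
print(has_instance_type)       # False on Python 3

# Every instance reports its own class as its type, so there is no
# single "instance" type left to compare against.
print(type(X()) is X)          # True on Python 3
```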


    >     elif t in (float, int):
    >         pass


    This test will fail if ob is an instance of a subclass of float or int.
    That's almost certainly the wrong behavior. A better way of writing that is:

    elif issubclass(t, (float, int)):
        pass
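
    To make the difference concrete, here is a small sketch. Celsius is a
    hypothetical float subclass made up for this example; it does not appear
    in the OP's code.

```python
class Celsius(float):
    """Hypothetical float subclass, used only for illustration."""

t = type(Celsius(21.5))

exact_match = t in (float, int)
subclass_match = issubclass(t, (float, int))

print(exact_match)      # False: the tuple test only matches float and int themselves
print(subclass_match)   # True: subclasses are accepted
```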


    >     else:
    >         pass
    >
    > def func1(ob):
    >     if isinstance(ob,X):
    >         pass


    If you have to do type checking, that's the recommended way of doing so.



    >     elif type(ob) in (float, int):
    >         pass


    The usual way to write that is:

    if isinstance(ob, (float, int)):
        pass
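
    Putting the two isinstance checks together, a version of func1 that works
    unchanged in Python 2.6 and Python 3 might look like the sketch below.
    The name classify and the returned strings are placeholders invented here
    so the branches can be told apart; the real code would do its own work in
    each branch.

```python
class X:
    pass

class V(X):
    pass

def classify(ob):
    # Branch 1: instances of X or any of its subclasses (such as V).
    if isinstance(ob, X):
        return "user"
    # Branch 2: floats, ints, and instances of their subclasses.
    elif isinstance(ob, (float, int)):
        return "number"
    # Branch 3: everything else.
    else:
        return "other"

print(classify(V()))     # user
print(classify(1.5))     # number
print(classify("spam"))  # other
```

    Note that bool is a subclass of int, so classify(True) also lands in the
    "number" branch.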



    Hope this helps,


    --
    Steven
     
    Steven D'Aprano, Feb 1, 2009
    #1

  2. Steven D'Aprano writes:

    > First question is, why do you care that it's slower? The difference between
    > the fastest and slowest functions is 1.16-0.33 = 0.83 microsecond.


    That's a 71% speedup, pretty good if you ask me.

    > If you call the slowest function one million times, your code will
    > run less than a second longer.


    What if you call it a billion times, or a trillion times, or a
    quadrillion times, you see where this is going? If you're testing
    100-digit numbers, there are an awful lot of them before you run out.
     
    Paul Rubin, Feb 1, 2009
    #2

  3. Paul Rubin wrote:

    > Steven D'Aprano writes:
    >> First question is, why do you care that it's slower? The difference
    >> between the fastest and slowest functions is 1.16-0.33 = 0.83
    >> microsecond.
    >
    > That's a 71% speedup, pretty good if you ask me.


    Don't you care that the code is demonstrably incorrect? The OP is
    investigating options to use in Python 3, but the fastest method will fail,
    because the "type is InstanceType" test will no longer work. (I believe the
    fastest method, as given, is incorrect even in Python 2.x, as it will
    accept ANY old-style class instead of just the relevant X or V classes.)

    That reminds me of something that happened to my wife some years ago: she
    was in a van with her band's roadies, and one asked the driver "Are you
    sure you know where you're going?", to which the driver replied, "Who
    cares? We're making great time." (True story.)

    If you're going to accept incorrect code in order to save time, then I can
    write even faster code:

    def func4(ob):
        pass

    Try beating that for speed!


    >> If you call the slowest function one million times, your code will
    >> run less than a second longer.
    >
    > What if you call it a billion times, or a trillion times, or a
    > quadrillion times, you see where this is going?


    It doesn't matter. The proportion of time saved will remain the same. If you
    run it a trillion times, you'll save 12 minutes in a calculation that takes
    278 hours to run. Big Effing Deal. Saving such trivial amounts of time is
    not worth the cost of hard-to-read or incorrect code.

    Of course, if you have profiled your code and discovered that *significant*
    amounts of time are being used in type-testing, *then* such a
    micro-optimization may be worth doing. But I already allowed for that:

    "Does that really matter...?"
    (the answer could be Yes)

    "Unless you have profiled your code and this really is a bottleneck ..."
    (it could be)


    > If you're testing
    > 100-digit numbers, there are an awful lot of them before you run out.


    Yes. So what? Once you've tested them, then what? If *all* you are doing
    with them is testing them, your application is pretty boring. Even a print
    statement afterwards is going to take 1000 times longer than doing the
    type-test. In any useful application, the amount of time used in
    type-testing is almost surely going to be a small fraction of the total
    runtime. A 71% speedup on 50% of the runtime is significant; but a 71%
    speedup on 0.1% of the total execution time is not.
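
    The arithmetic behind that last sentence, as a rough sketch. Here "71%
    speedup" is read as "the type-test takes 71% less time" (1.16 down to
    0.33 microseconds), and overall_saving is a helper name made up for this
    example.

```python
def overall_saving(fraction, reduction):
    """Share of total runtime saved when `fraction` of the runtime
    is made `reduction` (0..1) cheaper and the rest is unchanged."""
    return fraction * reduction

half = overall_saving(0.50, 0.71)    # type-testing is 50% of the runtime
tiny = overall_saving(0.001, 0.71)   # type-testing is 0.1% of the runtime

print(f"{half:.2%}")   # 35.50%
print(f"{tiny:.3%}")   # 0.071%
```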



    --
    Steven
     
    Steven D'Aprano, Feb 1, 2009
    #3
  4. Robin Becker wrote:

    > Steven D'Aprano wrote:
    >> Paul Rubin wrote:
    >>
    >>> Steven D'Aprano writes:
    >>>> First question is, why do you care that it's slower? The difference
    >>>> between the fastest and slowest functions is 1.16-0.33 = 0.83
    >>>> microsecond.
    >>> That's a 71% speedup, pretty good if you ask me.
    >>
    >> Don't you care that the code is demonstrably incorrect? The OP is
    >> investigating options to use in Python 3, but the fastest method will
    >> fail, because the "type is InstanceType" test will no longer work. (I
    >> believe the fastest method, as given, is incorrect even in Python 2.x, as
    >> it will accept ANY old-style class instead of just the relevant X or V
    >> classes.)
    >
    > I'm not clear why this is true? Not all instances will have the __X__
    > attribute or has something else changed in Python3?


    The func0() test doesn't look for __X__.


    > The original code was intended to be called with only a subset of all
    > class instances being passed as argument; as currently written it was
    > unsafe because an instance of an arbitrary old class would pass into
    > branch 1. Of course it will still be unsafe as arbitrary instances end
    > up in branch 3
    >
    > The intent is to firm up the set of cases being accepted in the first
    > branch. The problem is that when all instances are new style then
    > there's no easy check for the other acceptable arguments eg float,int,
    > str etc,


    Of course there is.

    isinstance(ob, (float, int))

    is the easy, and correct, way to check if ob is a float or int.


    > as I see it, the instances must be of a known class or have a
    > distinguishing attribute.


    Are you sure you need to check for different types in the first place? Just
    how polymorphic is your code, really? It's hard to judge because I don't
    know what your code actually does.


    > As for the timing, when I tried the effect of func1 on our unit tests I
    > noticed that it slowed the whole test suite by 0.5%.


    An entire half a percent slower. Wow.

    That's like one minute versus one minute and 0.3 second. Or one hour, versus
    one hour and 18 seconds. I find it very difficult to get worked up over
    such small differences. I think you're guilty of premature optimization:
    wasting time and energy trying to speed up parts of the code that are
    trivial. (Of course I could be wrong, but I doubt it.)



    > Luckily func 3
    > style improved things by about 0.3% so that's what I'm going for.


    I would call that the worst solution. Not only are you storing an attribute
    which is completely redundant (instances already know what type they are,
    you don't need to manually store a badge on them to mark them as an
    instance of a class), but you're looking up this attribute only to
    immediately throw away the value you get. The only excuse for this extra
    redirection would be if it were significantly faster. But it isn't: you
    said it yourself, 0.3% speed up. That's like 60 seconds versus 59.82
    seconds.
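
    A small sketch of why the badge adds nothing that isinstance doesn't
    already give you, and how it can mislead. The marker is named
    _MAGIC_LABEL as suggested earlier in this thread; Platypus is a
    hypothetical unrelated class invented for the example.

```python
class X:
    _MAGIC_LABEL = True   # marker attribute stored on the class

class V(X):
    pass

class Platypus:
    _MAGIC_LABEL = True   # unrelated class that happens to carry the same marker

results = []
for ob in (X(), V(), Platypus(), 1.5):
    row = (type(ob).__name__, hasattr(ob, "_MAGIC_LABEL"), isinstance(ob, X))
    results.append(row)
    print(row)

# The two checks agree for X, V and float, but the marker test also
# accepts Platypus, which is not an X at all.
```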


    --
    Steven
     
    Steven D'Aprano, Feb 1, 2009
    #4
