is None or == None ?

Discussion in 'Python' started by mk, Nov 6, 2009.

  1. mk

    mk Guest

    Hello,

    Some claim that one should test for None using:

    if x is None:

    ...but the standard equality which is theoretically safer works as well:

    if x == None:

    So, which one is recommended?

    Can there be two None objects in the interpreter's memory? Is testing for
    identity of some variable with None safe? Does the language guarantee that?
    Or is it just a property of the implementation?

    Regards,
    mk
     
    mk, Nov 6, 2009
    #1

  2. mk, 06.11.2009 14:20:
    > Some claim that one should test for None using:
    >
    > if x is None:


    Which is the correct and safe way of doing it.
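As far as I know, the language reference describes None as the sole value of its type, so there is only ever one None object to compare against. A minimal sketch, assuming Python 3 (in Python 2, `type(None)()` raises TypeError instead of returning None):

```python
import copy
import pickle

# Every route that could plausibly produce a "second" None hands back
# the very same object, so an identity test cannot miss it.
candidates = [
    None,
    type(None)(),                      # Python 3: calling NoneType returns None
    copy.deepcopy(None),               # copying does not duplicate it
    pickle.loads(pickle.dumps(None)),  # nor does a pickle round trip
    [].append(1),                      # a method call that returns nothing
]

all_identical = all(c is None for c in candidates)
print(all_identical)
```

If any of these produced a distinct object, `all_identical` would be False.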


    > ..but the standard equality which is theoretically safer works as well:
    >
    > if x == None:


    Absolutely not safe, think of

    class Test(object):
        def __eq__(self, other):
            return other == None

    print Test() == None, Test() is None
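
To make the hazard concrete, the sketch below uses a hypothetical class `Chameleon` (not from the thread) whose `__eq__` claims equality with everything. The helper function names are likewise made up for illustration. An equality test against None is fooled, while the identity test is not.

```python
class Chameleon(object):
    """Hypothetical class whose __eq__ answers True for any operand."""

    def __eq__(self, other):
        return True


def is_missing_by_equality(x):
    # Delegates to x.__eq__, so the answer depends on x's class.
    return x == None  # noqa: E711 (deliberately the discouraged form)


def is_missing_by_identity(x):
    # Cannot be overridden: true only for the None object itself.
    return x is None


obj = Chameleon()
print(is_missing_by_equality(obj))   # True, although obj is not None
print(is_missing_by_identity(obj))   # False
```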

    Stefan
     
    Stefan Behnel, Nov 6, 2009
    #2

  3. * mk:
    > Hello,
    >
    > Some claim that one should test for None using:
    >
    > if x is None:
    >
    > ..but the standard equality which is theoretically safer works as well:
    >
    > if x == None:
    >
    > So, which one is recommended?
    >
    > Can there be two None objects in interpreter's memory? Is testing for
    > identity of some variable with None safe? Does language guarantee that?
    > Or is it just property of implementation?


    As I understand it, 'is' will always work and will always be efficient (it just
    checks the variable's type), while '==' can depend on the implementation of
    equality checking for the other operand's class.


    Cheers & hth.,

    - Alf
     
    Alf P. Steinbach, Nov 6, 2009
    #3
  4. John Machin

    On Nov 7, 12:35 am, "Alf P. Steinbach" <> wrote:
    > * mk:
    >
    > > Hello,
    > >
    > > Some claim that one should test for None using:
    > >
    > > if x is None:
    > >
    > > ..but the standard equality which is theoretically safer works as well:
    > >
    > > if x == None:
    > >
    > > So, which one is recommended?
    >
    > As I understand it, 'is' will always work and will always be efficient (it just
    > checks the variable's type),


    It doesn't check the type. It doesn't need to. (x is y) is true if x
    and y are the same object. If that is so, then of course (type(x) is
    type(y)) is true, and if not so, their types are irrelevant. "is"
    testing is very efficient in the CPython implementation: addressof(x)
    == addressof(y)
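
A small sketch of the distinction, assuming CPython, where `id()` happens to return the object's address. The language itself only promises that `id()` is unique among objects alive at the same time.

```python
a = [1, 2, 3]
b = a            # second name for the same object
c = [1, 2, 3]    # equal value, but a separate object

same_ab = a is b
same_ac = a is c
equal_ac = a == c

# 'is' agrees with comparing id() values while both objects are alive.
ids_agree = (same_ab == (id(a) == id(b))) and (same_ac == (id(a) == id(c)))

print(same_ab, same_ac, equal_ac, ids_agree)  # True False True True
```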
     
    John Machin, Nov 6, 2009
    #4
  5. Alf P. Steinbach wrote:

    > As I understand it, 'is' will always work and will always be efficient
    > (it just checks the variable's type), while '==' can depend on the
    > implementation of equality checking for the other operand's class.


    "== None" makes sense, for instance, in the context of the SQLAlchemy
    sql construction layer, where the underlying machinery defines __eq__()
    / __ne__() and generates the appropriate 'IS NULL' SQL code when
    appropriate.
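
The idea can be sketched without SQLAlchemy itself. The `Column` class below is a hypothetical stand-in, not SQLAlchemy's API: its `__eq__` and `__ne__` return SQL text rather than a boolean, which is why `== None` is meaningful there and `is None` would not be.

```python
class Column(object):
    """Hypothetical stand-in for an ORM column; not SQLAlchemy's real class."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Build a SQL fragment instead of answering True/False.
        if other is None:
            return "%s IS NULL" % self.name
        return "%s = %r" % (self.name, other)

    def __ne__(self, other):
        if other is None:
            return "%s IS NOT NULL" % self.name
        return "%s != %r" % (self.name, other)


email = Column("email")
print(email == None)   # email IS NULL
print(email != None)   # email IS NOT NULL
print(email == "a@b")  # email = 'a@b'
print(email is None)   # False: identity cannot be intercepted
```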
     
    Marco Mariani, Nov 6, 2009
    #5
  6. mk

    mk Guest

    Stefan Behnel wrote:
    > mk, 06.11.2009 14:20:
    >> Some claim that one should test for None using:
    >>
    >> if x is None:

    >
    > Which is the correct and safe way of doing it.


    ok

    >> ..but the standard equality which is theoretically safer works as well:
    >>
    >> if x == None:
    >
    > Absolutely not safe, think of
    >
    > class Test(object):
    >     def __eq__(self, other):
    >         return other == None
    >
    > print Test() == None, Test() is None


    Err, I don't want to sound daft, but what is wrong in this example? It
    should work as expected:

    >>> class Test(object):
    ...     def __eq__(self, other):
    ...         return other == None
    ...
    >>> Test() is None
    False
    >>> Test() == None
    True

    My interpretation of 1st call is that it is correct: instance Test() is
    not None (in terms of identity), but it happens to have value equal to
    None (2nd call).

    Or perhaps your example was supposed to show that I should test for
    identity with None, not for value with None?

    That, however, opens a can of worms, sort of: whether one should compare
    Test() for identity with None or for value with None depends on what
    programmer meant at the moment.

    Regards,
    mk
     
    mk, Nov 6, 2009
    #6
  7. * John Machin:
    > On Nov 7, 12:35 am, "Alf P. Steinbach" <> wrote:
    >> * mk:
    >>
    >>> Hello,
    >>> Some claim that one should test for None using:
    >>> if x is None:
    >>> ..but the standard equality which is theoretically safer works as well:
    >>> if x == None:
    >>> So, which one is recommended?

    >>
    >> As I understand it, 'is' will always work and will always be efficient (it just
    >> checks the variable's type),

    >
    > It doesn't check the type.
    > It doesn't need to. (x is y) is true if x
    > and y are the same object. If that is so, then of course (type(x) is
    > type(y)) is true, and if not so, their types are irrelevant. "is"
    > testing is very efficient in the CPython implementation: addressof(x)
    > == addressof(y)


    Maybe.

    I imagined it wouldn't waste additional space for e.g. (Python 2.x) int values,
    but just use the same space as used for pointer in the case of e.g. string, in
    which case it would have to check the type -- an integer can very easily have
    the same bitpattern as a pointer residing there.

    If you imagine that instead, for an integer variable x it stores the integer
    value in the variable in some other place than ordinarily used for pointer, and
    let the pointer point to that place in the same variable, then without checking
    type the 'is' operator should report false for 'x = 3; y = 3; x is y', but it
    doesn't with my Python installation, so if it doesn't check the type then even
    this half-measure (just somewhat wasteful of space) optimization isn't there.

    In short, you're saying that there is an extreme inefficiency with every integer
    dynamically allocated /plus/, upon production of an integer by e.g. + or *,
    inefficiently finding the previously allocated integer of that value and
    pointing there, sort of piling inefficiency on top of inefficiency, which is
    absurd but I have seen absurd things before so it's not completely unbelievable.

    I hope someone else can comment on these implications of your statement.


    Cheers,

    - Alf
     
    Alf P. Steinbach, Nov 6, 2009
    #7
  8. Alf P. Steinbach wrote:

    > If you imagine that instead, for an integer variable x it stores the
    > integer value in the variable in some other place than ordinarily used
    > for pointer, and let the pointer point to that place in the same
    > variable, then without checking type the 'is' operator should report
    > false for 'x = 3; y = 3; x is y', but it doesn't with my Python


    Yes, CPython caches a handful of small, "commonly used" integers, and
    creates objects for them upon startup. Using "x is y" with integers
    makes no sense and has no guaranteed behaviour, AFAIK.

    > In short, you're saying that there is an extreme inefficiency with every
    > integer dynamically allocated /plus/, upon production of an integer by
    > e.g. + or *, inefficiently finding the previously allocated integer of
    > that value and pointing there,


    no, it doesn't "point there":

    >>> a=1E6
    >>> a is 1E6
    False
    >>> a=100
    >>> a is 100
    True
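
Whether two equal integers are the same object is an implementation detail, so the sketch below prints what the running interpreter does rather than assuming a result. The helper `shares_object` is a hypothetical name. It builds the integers at run time via `int(str(n))` so that the compiler cannot merge identical literals.

```python
def shares_object(n):
    """Report whether two independently built ints equal to n are one object."""
    first = int(str(n))
    second = int(str(n))
    assert first == second      # equality always holds
    return first is second      # identity may or may not hold


for n in (-5, 0, 100, 256, 257, 10 ** 6):
    print(n, shares_object(n))
```

The printed pattern may differ between interpreters and versions, which is the reason not to rely on `is` for numbers.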
     
    Marco Mariani, Nov 6, 2009
    #8
  9. * Marco Mariani:
    > Alf P. Steinbach wrote:
    >
    >> If you imagine that instead, for an integer variable x it stores the
    >> integer value in the variable in some other place than ordinarily used
    >> for pointer, and let the pointer point to that place in the same
    >> variable, then without checking type the 'is' operator should report
    >> false for 'x = 3; y = 3; x is y', but it doesn't with my Python

    >
    > Yes, CPython caches a handful of small, "commonly used" integers, and
    > creates objects for them upon startup. Using "x is y" with integers
    > makes no sense and has no guaranteed behaviour AFAIK
    >
    >> In short, you're saying that there is an extreme inefficiency with
    >> every integer dynamically allocated /plus/, upon production of an
    >> integer by e.g. + or *, inefficiently finding the previously allocated
    >> integer of that value and pointing there,

    >
    > no, it doesn't "point there":
    >
    > >>> a=1E6
    > >>> a is 1E6
    > False
    > >>> a=100
    > >>> a is 100
    > True


    I stand corrected on that issue, I didn't think of cache for small values.

    On my CPython 3.1.1 the cache seems to support integer values -5 to +256,
    inclusive, apparently using 16 bytes of storage per value (this last assuming
    id() just returns the address).

    But wow. That's pretty hare-brained: dynamic allocation for every stored value
    outside the cache range, needless extra indirection for every operation.

    Even Microsoft COM managed to get this right.

    On the positive side, except that it would probably break every C module (I
    don't know), in consultant speak that's definitely a potential for improvement. :p


    Cheers,

    - Alf
     
    Alf P. Steinbach, Nov 6, 2009
    #9
  10. On Fri, 06 Nov 2009 08:54:53 -0800, Alf P. Steinbach <>
    wrote:

    > But wow. That's pretty hare-brained: dynamic allocation for every stored
    > value outside the cache range, needless extra indirection for every
    > operation.
    >


    Perhaps I'm not understanding this thread at all but how is dynamic
    allocation hare-brained, and what's the 'needless extra indirection'?



    --
    Rami Chowdhury
    "Never attribute to malice that which can be attributed to stupidity" --
    Hanlon's Razor
    408-597-7068 (US) / 07875-841-046 (UK) / 0189-245544 (BD)
     
    Rami Chowdhury, Nov 6, 2009
    #10
  11. On Nov 6, 5:20 am, mk <> wrote:
    > Some claim that one should test for None using:
    >
    > if x is None:
    >
    > ..but the standard equality which is theoretically safer works as well:
    >
    > if x == None:
    >
    > So, which one is recommended?


    In the standard library, we use "x is None".

    The official recommendation in PEP 8 reads:
    '''
    Comparisons to singletons like None should always be done with
    'is' or 'is not', never the equality operators.

    Also, beware of writing "if x" when you really mean "if x is not None"
    -- e.g. when testing whether a variable or argument that defaults to
    None was set to some other value. The other value might have a type
    (such as a container) that could be false in a boolean context!
    '''
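
The second paragraph of that advice is easy to trip over. A minimal sketch with two hypothetical functions, `collect_buggy` and `collect`, whose `bucket` argument defaults to None:

```python
def collect_buggy(item, bucket=None):
    # Wrong: an empty list passed by the caller is falsy, so it is
    # silently replaced by a fresh list.
    if not bucket:
        bucket = []
    bucket.append(item)
    return bucket


def collect(item, bucket=None):
    # Right: only the "argument not supplied" case creates a new list.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


mine = []
collect_buggy("a", mine)
print(mine)   # [] because the caller's list was never touched

mine = []
collect("a", mine)
print(mine)   # ['a']
```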


    Raymond
     
    Raymond Hettinger, Nov 6, 2009
    #11
  12. "Alf P. Steinbach" <> writes:

    > But wow. That's pretty hare-brained: dynamic allocation for every
    > stored value outside the cache range, needless extra indirection for
    > every operation.
    >
    > Even Microsoft COM managed to get this right.
    >
    > On the positive side, except that it would probably break every C
    > module (I don't know), in consultant speak that's definitely a
    > potential for improvement. :p


    Tagged integers have been tried, shown not really worth it, and
    ultimately rejected by the BDFL:

    http://mail.python.org/pipermail/python-dev/2004-July/thread.html#46139
     
    Hrvoje Niksic, Nov 6, 2009
    #12
  13. * Rami Chowdhury:
    > On Fri, 06 Nov 2009 08:54:53 -0800, Alf P. Steinbach <>
    > wrote:
    >
    >> But wow. That's pretty hare-brained: dynamic allocation for every
    >> stored value outside the cache range, needless extra indirection for
    >> every operation.
    >>

    >
    > Perhaps I'm not understanding this thread at all but how is dynamic
    > allocation hare-brained, and what's the 'needless extra indirection'?


    Dynamic allocation isn't hare-brained, but doing it for every stored integer
    value outside a very small range is, because dynamic allocation is (relatively
    speaking, in the context of integer operations) very costly even with a
    (relatively speaking, in the context of general dynamic allocation) very
    efficient small-objects allocator - here talking order(s) of magnitude.

    A typical scheme for representing dynamically typed objects goes like, in C++,

    enum TypeId { int_type_id, dyn_object_type_id };

    struct Object
    {
        int type_id;
        union
        {
            void* p;
            int i;
            // Perhaps other special cased type's values in this union.
        };
    };

    This would then be the memory layout of what's regarded as a variable at the
    script language level.

    Then getting the integer value reduces to

    int intValueOf( Object const& o )
    {
        if( o.type_id != int_type_id ) { throw TypeError(); }
        return o.i;
    }

    If on the other hand int (and perhaps floating point type, whatever) isn't
    special-cased, then it goes like

    int intValueOf( Object const& o )
    {
        if( o.type_id != int_type_id ) { throw TypeError(); }
        return static_cast<IntType*>( o.p )->value; // Extra indirection
    }

    and depending on where the basic type id is stored it may be more extra
    indirection, and worse, creating that value then involves a dynamic allocation.


    Cheers & hth.

    - Alf
     
    Alf P. Steinbach, Nov 6, 2009
    #13
  14. * Hrvoje Niksic:
    > "Alf P. Steinbach" <> writes:
    >
    >> But wow. That's pretty hare-brained: dynamic allocation for every
    >> stored value outside the cache range, needless extra indirection for
    >> every operation.
    >>
    >> Even Microsoft COM managed to get this right.
    >>
    >> On the positive side, except that it would probably break every C
    >> module (I don't know), in consultant speak that's definitely a
    >> potential for improvement. :p

    >
    > Tagged integers have been tried, shown not really worth it, and
    > ultimately rejected by the BDFL:
    >
    > http://mail.python.org/pipermail/python-dev/2004-July/thread.html#46139


    Yah, as I suspected. I looked at the first few postings in that thread and it
    seems an inefficient baroque implementation was created and tested, not
    realizing more than 50% speedup in a test not particularly much exercising its
    savings, and against that counts as mentioned in the thread and as I mentioned
    in quoted material above, breaking lots of existing C code.

    Speedup would likely be more realistic with normal implementation (not fiddling
    with bit-fields and stuff) not to mention when removing other inefficiencies
    that likely dwarf and hide the low-level performance increase, but still I agree
    wholeheartedly with those who argue compatibility, not breaking code.

    As long as it Works, don't fix it... ;-)


    Cheers, (still amazed, though)

    - Alf
     
    Alf P. Steinbach, Nov 6, 2009
    #14
  15. Carl Banks

    On Nov 6, 9:28 am, "Alf P. Steinbach" <> wrote:
    > * Rami Chowdhury:
    >
    > > On Fri, 06 Nov 2009 08:54:53 -0800, Alf P. Steinbach <>
    > > wrote:

    >
    > >> But wow. That's pretty hare-brained: dynamic allocation for every
    > >> stored value outside the cache range, needless extra indirection for
    > >> every operation.

    >
    > > Perhaps I'm not understanding this thread at all but how is dynamic
    > > allocation hare-brained, and what's the 'needless extra indirection'?

    >
    > Dynamic allocation isn't hare-brained, but doing it for every stored integer
    > value outside a very small range is, because dynamic allocation is (relatively
    > speaking, in the context of integer operations) very costly even with a
    > (relatively speaking, in the context of general dynamic allocation) very
    > efficient small-objects allocator - here talking order(s) of magnitude.



    Python made a design trade-off: it chose a simpler implementation and
    uniform object semantic behavior, at a cost of speed. C# made a
    different trade-off, choosing a more complex implementation and a
    language with two starkly different object semantic behaviors, so as
    to allow better performance.

    You don't have to like the decision Python made, but I don't think
    it's fair to call a deliberate design trade-off hare-brained.


    Carl Banks
     
    Carl Banks, Nov 6, 2009
    #15
  16. On Fri, 06 Nov 2009 09:28:08 -0800, Alf P. Steinbach <>
    wrote:

    > * Rami Chowdhury:
    >> On Fri, 06 Nov 2009 08:54:53 -0800, Alf P. Steinbach <>
    >> wrote:
    >>
    >>> But wow. That's pretty hare-brained: dynamic allocation for every
    >>> stored value outside the cache range, needless extra indirection for
    >>> every operation.
    >>>

    >> Perhaps I'm not understanding this thread at all but how is dynamic
    >> allocation hare-brained, and what's the 'needless extra indirection'?

    >
    > Dynamic allocation isn't hare-brained, but doing it for every stored
    > integer value outside a very small range is, because dynamic allocation
    > is (relatively speaking, in the context of integer operations) very
    > costly even with a (relatively speaking, in the context of general
    > dynamic allocation) very efficient small-objects allocator - here
    > talking order(s) of magnitude.


    Well, sure, it may seem that way. But how large a cache would you want to
    preallocate? I can't see the average Python program needing to use the
    integers from -10000 to 10000, for instance. In my (admittedly limited)
    experience Python programs typically deal with rather more complex objects
    than plain integers.

    > int intValueOf( Object const& o )
    > {
    >     if( o.type_id != int_type_id ) { throw TypeError(); }
    >     return static_cast<IntType*>( o.p )->value; // Extra indirection
    > }


    If a large cache were created and maintained, would it not be equally
    indirect to check for the presence of a value in the cache, and return
    that value if it's present?

    > creating that value then involves a dynamic allocation.


    Creating which value, sorry -- the type object?


     
    Rami Chowdhury, Nov 6, 2009
    #16
  17. * Carl Banks:
    > On Nov 6, 9:28 am, "Alf P. Steinbach" <> wrote:
    >> * Rami Chowdhury:
    >>
    >>> On Fri, 06 Nov 2009 08:54:53 -0800, Alf P. Steinbach <>
    >>> wrote:
    >>>> But wow. That's pretty hare-brained: dynamic allocation for every
    >>>> stored value outside the cache range, needless extra indirection for
    >>>> every operation.
    >>> Perhaps I'm not understanding this thread at all but how is dynamic
    >>> allocation hare-brained, and what's the 'needless extra indirection'?

    >> Dynamic allocation isn't hare-brained, but doing it for every stored integer
    >> value outside a very small range is, because dynamic allocation is (relatively
    >> speaking, in the context of integer operations) very costly even with a
    >> (relatively speaking, in the context of general dynamic allocation) very
    >> efficient small-objects allocator - here talking order(s) of magnitude.

    >
    >
    > Python made a design trade-off, it chose a simpler implementation


    Note that the object implementation's complexity doesn't have to affect any
    other code since it's trivial to provide abstract accessors (even macros), i.e.,
    this isn't part of a trade-off except if the original developer(s) had limited
    resources -- and if so then it wasn't a trade-off at the language design level
    but a trade-off of getting things done then and there.


    > and uniform object semantic behavior,


    Also note that the script language level semantics of objects is /unaffected/ by
    the implementation, except for speed, i.e., this isn't part of a trade-off
    either. ;-)


    > at a cost of speed.


    In summary, the trade-off, if any, couldn't as I see it be what you describe,
    but there could have been a different kind of getting-it-done trade-off.

    It is usually better with Something Usable than waiting forever (or too long)
    for the Perfect... ;-)

    Or, it could be that things just evolved, constrained by frozen earlier
    decisions. That's the main reason for the many quirks in C++. Not unlikely that
    it's also that way for Python.


    > C# made a
    > different trade-off, choosing a more complex implementation, a
    > language with two starkly different object semantic behaviors, so as
    > to allow better performance.


    Don't know about the implementation of C#, but whatever it is, if it's bad in
    some respect then that has nothing to do with Python.


    > You don't have to like the decision Python made, but I don't think
    > it's fair to call a deliberate design trade-off hare-brained.


    OK. :)


    Cheers,

    - Alf
     
    Alf P. Steinbach, Nov 6, 2009
    #17
  18. Mel

    Alf P. Steinbach wrote:
    > Note that the object implementation's complexity doesn't have to affect
    > any other code since it's trivial to provide abstract accessors (even
    > macros), i.e., this isn't part of a trade-off except if the original
    > developer(s) had limited resources -- and if so then it wasn't a
    > trade-off at the language design level but a trade-off of getting
    > things done then and there.


    But remember what got us in here: your belief (which followed from your
    assumptions) that computing `is` required testing the object types. You
    might optimize out the "extra indirection" to get an object's value, but
    you'd need the "extra indirection" anyway to find out what type it was
    before you could use it.

    Mel.
     
    Mel, Nov 6, 2009
    #18
  19. On Fri, 06 Nov 2009 11:50:33 -0800, Alf P. Steinbach <>
    wrote:

    > * Rami Chowdhury:
    >> On Fri, 06 Nov 2009 09:28:08 -0800, Alf P. Steinbach <>
    >> wrote:
    >>
    >>> * Rami Chowdhury:
    >>>> On Fri, 06 Nov 2009 08:54:53 -0800, Alf P. Steinbach <>
    >>>> wrote:
    >>>>
    >>>>> But wow. That's pretty hare-brained: dynamic allocation for every
    >>>>> stored value outside the cache range, needless extra indirection for
    >>>>> every operation.
    >>>>>
    >>>> Perhaps I'm not understanding this thread at all but how is dynamic
    >>>> allocation hare-brained, and what's the 'needless extra indirection'?
    >>>
    >>> Dynamic allocation isn't hare-brained, but doing it for every stored
    >>> integer value outside a very small range is, because dynamic
    >>> allocation is (relatively speaking, in the context of integer
    >>> operations) very costly even with a (relatively speaking, in the
    >>> context of general dynamic allocation) very efficient small-objects
    >>> allocator - here talking order(s) of magnitude.

    >> Well, sure, it may seem that way. But how large a cache would you want
    >> to preallocate? I can't see the average Python program needing to use
    >> the integers from -10000 to 10000, for instance. In my (admittedly
    >> limited) experience Python programs typically deal with rather more
    >> complex objects than plain integers.

    >
    > Uhm, you've misunderstood or failed to understand something basic, but
    > what?


    Oh, I see, you were referring to a tagging scheme as an alternative. Sorry
    for the misunderstanding.

    >
    > Well it's an out-of-context quote, but t'was about creating the value
    > object that a variable contains a pointer to with the current CPython
    > implementation.
    >


    Again, perhaps I'm just misunderstanding what you're saying, but as I
    understand it, in CPython if you're looking for the value of a
    PyIntObject, that's stored right there in the structure, so no value
    object needs to be created...



     
    Rami Chowdhury, Nov 6, 2009
    #19
  20. * Mel:
    > Alf P. Steinbach wrote:
    >> Note that the object implementation's complexity doesn't have to affect
    >> any other code since it's trivial to provide abstract accessors (even
    >> macros), i.e., this isn't part of a trade-off except if the original
    >> developer(s) had limited resources -- and if so then it wasn't a
    >> trade-off at the language design level but a trade-off of getting
    >> things done then and there.

    >
    > But remember what got us in here: your belief (which followed from your
    > assumptions) that computing `is` required testing the object types.


    Yes, I couldn't believe what I've now been hearing. Uh, reading. :)


    > You
    > might optimize out the "extra indirection" to get an object's value, but
    > you'd need the "extra indirection" anyway to find out what type it was
    > before you could use it.


    No, that type checking is limited (it just checks whether the type is special
    cased), doesn't involve indirection, and is there anyway except for 'is'. It can
    be moved around but it's there, or something more costly is there. 'is' is about
    the only operation you /can/ do without checking the type, but I don't see the
    point in optimizing 'is' at cost of all other operations on basic types.


    Cheers & hth.,

    - Alf
     
    Alf P. Steinbach, Nov 6, 2009
    #20