Difference between 'is' and '=='

Felipe Almeida Lessa

On Mon, 2006-03-27 at 21:05 -0500, Dan Sommers wrote:
Right off the top of my head, I can't think of a way to make "a = b; a
is b" return False.

Sorry for being so --quiet. I will try to be more --verbose.

I can think of two types of constants:
1) Those defined in the language, like True, None, 0 and the like.
2) Those defined on your code.

You said type 1 can be used with "is", you're right:
False

I said type 2 can (maybe "should"?) be used with "is", and AFAICT I'm
right as well:
b = a
b is a
True

That said, you can do things like:
True

That kind of constant can be used with "is". But if you don't want to be
as prone to errors as I am, use "is" only when you really know for sure
that you're dealing with singletons.

HTH,
 
alex23

Felipe said:
I said [constants defined in your code] can (maybe "should"?) be used with "is", and
AFAICT I'm right as well:True

You should _never_ use 'is' to check for equivalence of value. Yes, due
to the implementation of CPython the behaviour you quote above does
occur, but it doesn't mean quite what you seem to think it does.

Try this:

Comparing a changing variable to a pre-defined constant seems a lot
more general a use case than sequential binding & comparison...and as
this should show, 'is' does _not_ catch these cases.
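
A minimal sketch of the point above (not the code from the original post; the names CONST and computed are illustrative). An integer built at runtime compares equal to the constant, but the language does not guarantee that it is the same object, so the result of the `is` test is printed without relying on it. Freshly built lists show the same distinction in a way that is guaranteed.

```python
CONST = 123456789

# Build an equal value at runtime instead of binding the same object.
computed = int("123456789")

print(computed == CONST)   # value equality holds
# Identity for integers is an implementation detail; do not rely on it.
print(computed is CONST)

# Two list displays always create two distinct objects.
a = [1, 2, 3]
b = [1, 2, 3]
print(a == b)   # True: same contents
print(a is b)   # False: different objects
```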

- alex23
 
Antoon Pardon

On 2006-03-27 said:
But even better style is just `foo' or `not foo'. Or not,
depending on what you're thinking.

No it is not. When you need None to be treated special,
that doesn't imply you want to treat zero numbers or empty
sequences as special too.
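
A small sketch of that distinction, using a hypothetical helper named describe. A truth test lumps None together with zero and empty sequences, while an identity test singles out None.

```python
def describe(foo):
    """Classify foo by truth value and by identity with None."""
    by_truth = "empty-ish" if not foo else "has value"
    by_identity = "missing" if foo is None else "provided"
    return by_truth, by_identity

for value in (None, 0, [], "", 42):
    print(repr(value), describe(value))
```

Only None is reported as "missing"; 0, [] and "" are falsy but still "provided".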
 
Felipe Almeida Lessa

On Mon, 2006-03-27 at 23:02 -0800, alex23 wrote:
Felipe said:
I said [constants defined in your code] can (maybe "should"?) be used with "is", and
AFAICT I'm right as well:
b = a
b is a
True

You should _never_ use 'is' to check for equivalence of value. Yes, due
to the implementation of CPython the behaviour you quote above does
occur, but it doesn't mean quite what you seem to think it does.

/me not checking for value. I'm checking for identity. Suppose "a" is a
constant. I want to check if "b" is the same constant.
Try this:


Comparing a changing variable to a pre-defined constant seems a lot
more general a use case than sequential binding & comparison...and as
this should show, 'is' does _not_ catch these cases.

That's *another* kind of constant. I gave you the example of
socket.AF_UNIX, the kind of constant I'm talking about. Are you going to
sequentially create numbers until you find it? Of course not.

The problem with Python (and other languages like Java) is that we don't
have a type like an enum (yet) so we have to define constants in our
code. By doing an "is" instead of a "==" you *can* catch some errors.
For example, a very dummy function (picked MSG_EOR as its value is
greater than 99):

---
from socket import MSG_EOR, MSG_WAITALL

def test(type):
    # "type" shadows the builtin; the name is kept from the original post.
    if type is MSG_EOR:
        print("This *is* MSG_EOR")
    elif type == MSG_EOR:
        print("This maybe be MSG_EOR")
    else:
        print("*Not MSG_EOR")
---

Now testing it:
test(MSG_EOR)
This *is* MSG_EOR

Fine, but:
test(128)
This maybe be MSG_EOR

This is a mistake. Here I knew 128 == MSG_EOR, but what if that was a
coincidence of some other function I created? I would *never* catch that
bug as the function that tests for MSG_EOR expects any integer. By
testing with "is" you test for *that* integer, the one defined on your
module and that shouldn't go out of it anyway.

Of course using an enum should make all said here obsolete.
 
Joel Hedlund

This does *not* also mean constants and such:

I didn't mean that kind of constant. I meant named constants with defined
meaning, as in the example that I cooked up in my post. More examples: os.R_OK,
or more complex ones like mymodule.DEFAULT_CONNECTION_CLASS.

Sorry for causing unnecessary confusion.

Cheers!
/Joel Hedlund
 
Joel Hedlund

You should _never_ use 'is' to check for equivalence of value. Yes, due
/me not checking for value. I'm checking for identity. Suppose "a" is a
constant. I want to check if "b" is the same constant.

/me too. That's what my example was all about. I was using identity to a known
CONSTANT (in caps as per python naming conventions :) to sidestep costly value
equality computations.
By doing an "is" instead of a "==" you *can* catch some errors.
<snip>
By
testing with "is" you test for *that* integer, the one defined on your
module and that shouldn't go out of it anyway.

I totally agree with you on this point. Anything that helps guarding against
"stealthed" errors is a good thing by my standards.

Cheers!
/Joel Hedlund
 
Joel Hedlund

Not those kind of constants, but this one:
Python 2.4.2 (#2, Nov 20 2005, 17:04:48)
[GCC 4.0.3 20051111 (prerelease) (Debian 4.0.2-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
CONST = 123456789
a = CONST
a == CONST
True
a is CONST
True

That's a little misleading, and goes back to the questions of "what is
assignment in Python?" and "What does it mean for an object to be
mutable?"

The line "a = CONST" simply gives CONST a new name. After that, "a is
CONST" will be True no matter what CONST was. Under some circumstances,
I can even change CONST, and "a is CONST" will *still* be True.

Anyone who thinks it's a good idea to change a CONST that's not in a module
that they have full control over must really know what they're doing or suffer
the consequences. Most often, the consequences will be nasty bugs.
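
A short sketch of the quoted point about assignment, using a mutable list as a hypothetical "constant". Assignment only binds another name to the same object, so mutating the object keeps the identity test true, while rebinding the name does not.

```python
CONST = [1, 2, 3]      # a mutable "constant" (illustrative only)
a = CONST              # binds a second name to the same list object

CONST.append(4)        # mutate the object in place
print(a is CONST)      # True: still one object under two names
print(a)               # [1, 2, 3, 4]

CONST = [1, 2, 3, 4]   # rebinding makes CONST refer to a new object
print(a == CONST)      # True: equal contents
print(a is CONST)      # False: a still refers to the old object
```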

Cheers!
/Joel Hedlund
 
Joel Hedlund

a is None
is quicker than

a == None

I think it's not such a good idea to focus on speed gains here, since they
really are marginal (max 2 seconds total after 10000000 comparisons):
2.48372101784

Your observation is certainly correct, but I think it's better applied to more
complex comparisons (say for example comparisons between gigantic objects or
objects where value equality determination requires a lot of nontrivial
computations). That's where any real speed gains can be found. PEP8 tells me
it's better style to write "a is None" and that's good enough for me. Otherwise
I try to stay away from speed microoptimisations as much as possible since it
generally results in less readable code, which in turn often results in an
overall speed loss because code maintenance will be harder.
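
The code behind the timing quoted above did not survive in this copy of the thread. A sketch of how such a measurement might be made with the standard timeit module follows; the iteration count N is an arbitrary choice, and the numbers will vary by machine and Python version.

```python
import timeit

# Comparisons per measurement; kept modest so the sketch runs quickly.
N = 1000000

t_is = timeit.timeit("a is None", setup="a = 1", number=N)
t_eq = timeit.timeit("a == None", setup="a = 1", number=N)

print("a is None: %.4f s" % t_is)
print("a == None: %.4f s" % t_eq)
```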

Cheers!
/Joel Hedlund
 
Peter Hansen

Joel said:
I didn't mean that kind of constant. I meant named constants with defined
meaning, as in the example that I cooked up in my post. More examples: os.R_OK,
or more complex ones like mymodule.DEFAULT_CONNECTION_CLASS.

If it weren't for the current CPython optimization (caching small
integers) this code which it appears you would support writing, would fail:

if (flags & os.R_OK) is os.R_OK:
    # do something

while this, on the other hand, is not buggy, because it correctly uses
equality comparison when identity comparison is not called for:

if (flags & os.R_OK) == os.R_OK:
    # do something
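
A runnable version of the equality form, wrapped in a hypothetical helper named has_read_flag. The result of the bitwise expression is a computed value, so it is compared with "==" rather than "is".

```python
import os

def has_read_flag(flags):
    """Return True if the os.R_OK bit is set in flags."""
    return (flags & os.R_OK) == os.R_OK

print(has_read_flag(os.R_OK | os.W_OK))  # True
print(has_read_flag(os.W_OK))            # False
```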

(I think you should give it up... you're trying to push a rope.)

-Peter
 
Steven D'Aprano

I try to stay away from speed microoptimisations as much as possible since it
generally results in less readable code, which in turn often results in an
overall speed loss because code maintenance will be harder.

+1 QOTW
 
Ross Ridge

Felipe said:
That said, you can do thinks like:
True

That kind of constants can be used with "is". But if don't want to be
prone to errors as I do, use "is" only when you really know for sure
that you're dealing with singletons.

It's only safe to compare address family values with socket.AF_UNIX
using "is", if small integers are guaranteed to be singletons, and
socket.AF_UNIX has one of those small values. Otherwise, address
family values equal in value to socket.AF_UNIX can be generated using
different objects. There's no requirement that the socket module or
anything else return values using the same object that the
socket.AF_UNIX constant uses.

Consider this example using the socket.IPPROTO_RAW constant:
socket.getaddrinfo("localhost", None, socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_RAW)[0][2] is socket.IPPROTO_RAW
False
socket.getaddrinfo("localhost", None, socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_RAW)[0][2] == socket.IPPROTO_RAW
True

Ross Ridge
 
Felipe Almeida Lessa

On Tue, 2006-03-28 at 15:18 -0800, Ross Ridge wrote:
[snip]
Consider this example using the socket.IPPROTO_RAW constant:
socket.getaddrinfo("localhost", None, socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_RAW)[0][2] is socket.IPPROTO_RAW
False
socket.getaddrinfo("localhost", None, socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_RAW)[0][2] == socket.IPPROTO_RAW
True

Ok, you win. It's not safe to do "is" checks on these kinds of
constants.
 
Joel Hedlund

If it weren't for the current CPython optimization (caching small
integers)

This has already been covered elsewhere in this thread. Read up on it.

this code which it appears you would support writing

if (flags & os.R_OK) is os.R_OK:

I do not.

You compare a module.CONSTANT to the result of an expression (flags & os.R_OK).
Expressions are not names bound to objects, the identity of which is what I'm
talking about. This example does not apply. Also, the identity check in my
example has a value equality fallback. Yours doesn't, so it really does not apply.

> (I think you should give it up... you're trying to push a rope.)

I'm not pushing anything. I just don't like being misquoted.

Cheers,
Joel Hedlund
 
Joel Hedlund

There's no requirement that the socket module or
anything else return values using the same object that the
socket.AF_UNIX constant uses.

Ouch. That's certainly an eyeopener.

For me, this means several things, and I'd really like to hear people's
thoughts about them.

It basically boils down to "don't ever use 'is' unless pushed into a corner,
and nevermind what PEP8 says about it".

So here we go... *takes deep breath*

Identity checks can only be done safely to compare a variable to a defined
builtin singleton such as None. Since this is only marginally faster than a
value equality comparison, there is little practical reason for doing so.
(Except for the sake of following PEP8, more of that below).

You cannot expect to ever have identity between a value returned by a
function/method and a CONSTANT defined in the same package/module, if you do
not have complete control over that module. Therefore, such identity checks
should always be given a value equality fallback. In most cases the identity
check will not be significantly faster than a value equality check, so for the
sake of readability it's generally a good idea to skip the identity check and
just do a value equality check directly. (Personally, I don't think it's good
style to define constants and not be strict about how you use them, but that's
on a side note and not very relevant to this discussion)

It may be a good idea to use identity checks for variables vs CONSTANTs defined
in the same module/package, if it's your module/package and you have complete
control over it. Felipe Almeida Lessa provided a good argument for this earlier
in this thread:
Here I knew 128 == MSG_EOR, but what if that was a
coincidence of some other function I created? I would *never* catch that
bug as the function that tests for MSG_EOR expects any integer. By
testing with "is" you test for *that* integer, the one defined on your
module and that shouldn't go out of it anyway.

However it may be a bad idea to do so, since it may lure you into a false sense
of security, so you may start to unintentionally misuse 'is' in an unsafe manner.

So the only motivated use of 'is' would then be the one shown in my first
example with the massive_computations() function: as a shortcut past costly
value equality computations where the result is known, and with an added value
equality fallback for safety. Preferably, the use of identity should then also
be motivated in a nearby comment.
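
The massive_computations() example mentioned above is not part of this excerpt. As a stand-in, here is a sketch of the same pattern with a hypothetical Expensive class whose equality test counts how often it runs: the identity test short-circuits the costly comparison, and value equality remains as the fallback.

```python
class Expensive:
    """Hypothetical object whose equality test is costly."""
    eq_calls = 0  # counts how often the slow comparison actually runs

    def __init__(self, data):
        self.data = list(data)

    def __eq__(self, other):
        Expensive.eq_calls += 1
        if not isinstance(other, Expensive):
            return NotImplemented
        return self.data == other.data  # stands in for a heavy computation

def same(a, b):
    # Identity shortcut: skip the costly comparison when both names refer
    # to one object; otherwise fall back to value equality.
    return a is b or a == b

DEFAULT = Expensive(range(1000))
print(same(DEFAULT, DEFAULT), Expensive.eq_calls)                  # True 0
print(same(Expensive(range(1000)), DEFAULT), Expensive.eq_calls)   # True 1
```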

My conclusion is then that using 'is' is a bad habit and leads to less readable
code. You should never use it, unless it leads to a *measurable* gain in
performance, in which case it should also be given a value equality fallback and a
comment. And lastly, PEP8 should be changed to reflect this.

Wow... that got a bit long and I applaud you for getting this far! :) Thanks
for taking the time to read it.

So what are your thoughts about this, then?

Cheers!
/Joel Hedlund
 
Fredrik Lundh

Joel said:
For me, this means several things, and I'd really like to hear people's
thoughts about them.

It basically boils down to "don't ever use 'is' unless pushed into a corner,
and nevermind what PEP8 says about it".

nonsense.

Identity checks can only be done safely to compare a variable to a defined
builtin singleton such as None.

utter nonsense.

You cannot expect to ever have identity between a value returned by a
function/method and a CONSTANT defined in the same package/module, if you do
not have complete control over that module.

or if the documentation guarantees that you can use "is" (e.g. by specifying
that you get back the object you passed in, or by specifying that a certain
object is a singleton, etc).

Therefore, such identity checks should always be given a value equality
fallback.

if the documentation guarantees that you can use "is", you don't need any
"value equality fallback".

My conclusion is then that using 'is' is a bad habit and leads to less readable
code. You should never use it, unless it leads to a *measurable* gain in
performance, in which case it should also be given a value equality fallback and a
comment. And lastly, PEP8 should be changed to reflect this.

Wow... that got a bit long and I applaud you for getting this far! :) Thanks
for taking the time to read it.

So what are your thoughts about this, then?

you need to spend more time relaxing, and less time making up arbitrary
rules for others to follow.

read the PEP and the documentation. use "is" when you want object identity,
and you're sure it's the right thing to do. don't use it when you're not sure.
any other approach would be unpythonic.
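
One example of a documented guarantee of the "you get back the object you passed in" kind is dict.get, which returns its default argument when the key is absent. The sketch below relies on that; the names _MISSING, settings and lookup are hypothetical.

```python
_MISSING = object()   # private sentinel; unique by construction

settings = {"timeout": None}   # None is a legitimate stored value here

def lookup(key):
    value = settings.get(key, _MISSING)
    # dict.get hands back the default object itself when the key is absent,
    # so an identity test separates "absent" from "present but None".
    if value is _MISSING:
        return "absent"
    return "present: %r" % (value,)

print(lookup("timeout"))   # present: None
print(lookup("retries"))   # absent
```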

</F>
 
Max M

Joel said:
Ouch. That's certainly an eyeopener.

For me, this means several things, and I'd really like to hear people's
thoughts about them.

It basically boils down to "don't ever use 'is' unless pushed into a
corner, and nevermind what PEP8 says about it".



Identity checks are often used for checking input parameters of a function:

def somefunc(val=None):
    if val is None:
        val = []
    do_stuff(val)



Or if None is a possible parameter you can use your own object as a marker:

_marker = []

def somefunc(val=_marker):
    if val is _marker:
        val = []
    do_stuff(val)



--

hilsen/regards Max M, Denmark

http://www.mxm.dk/
IT's Mad Science

Phone: +45 66 11 84 94
Mobile: +45 29 93 42 96
 
Joel Hedlund

For me, this means several things, and I'd really like to hear people's
you need to spend more time relaxing, and less time making up arbitrary
rules for others to follow.

I'm very relaxed, thank you. I do not make up rules for others to follow. I ask
for other people's opinions so that I can reevaluate my views.

I do respect your views, as I clearly can see you have been helpful and
constructive in earlier discussions in this newsgroup. So therefore if you
think my statements are nonsense, there's a good chance you're right. And
that's why I posted. To hear what other people think.

Sorry if I came off stiff and belligerent because that certainly wasn't my intent.
> read the PEP and the documentation.

Always do.
> use "is" when you want object identity,
> and you're sure it's the right thing to do. don't use it when you're not sure.
> any other approach would be unpythonic.

Right.

Chill!
/Joel Hedlund
 
