Short-circuit Logic


Ahmed Abdulshafy

Hi,
I'm having a hard time wrapping my head around short-circuit logic that's used by Python, coming from a C/C++ background, so I don't understand why the following condition is written this way:

if not allow_zero and abs(x) < sys.float_info.epsilon:
    print("zero is not allowed")

The purpose of this snippet is to print the given line when allow_zero is False and x is 0.
 

Roy Smith

Ahmed Abdulshafy said:
Hi,
I'm having a hard time wrapping my head around short-circuit logic that's
used by Python, coming from a C/C++ background; so I don't understand why the
following condition is written this way:

if not allow_zero and abs(x) < sys.float_info.epsilon:
    print("zero is not allowed")

The purpose of this snippet is to print the given line when allow_zero is
False and x is 0.

I don't understand your confusion. Short-circuit evaluation works in
Python exactly the same way it works in C. When you have a boolean
operation, the operands are evaluated left-to-right, and evaluation
stops as soon as the truth value of the expression is known.

In C, you would write:

if (p && p->foo) {
    blah();
}

to make sure that you don't dereference a null pointer. A similar
example in Python might be:

if d and d["foo"]:
    blah()

which protects against trying to access an element of a dictionary if
the dictionary is None (which might happen if d was an optional argument
to a method and wasn't passed on this invocation).

But, none of that applies to your example. The condition is

not allow_zero and abs(x) < sys.float_info.epsilon

it's safe to evaluate "abs(x) < sys.float_info.epsilon" no matter what
the value of "not allow_zero". For the purposes of understanding your
code, you can pretend that short-circuit evaluation doesn't exist!
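For instance, you can verify mechanically that an eagerly-evaluated version of the test agrees with the short-circuit one (a sketch; the helper names are made up for illustration):

```python
import sys

def short_circuit(allow_zero, x):
    # Python stops after "not allow_zero" when it is False.
    return not allow_zero and abs(x) < sys.float_info.epsilon

def eager(allow_zero, x):
    # Evaluate both operands unconditionally, then combine them.
    left = not allow_zero
    right = abs(x) < sys.float_info.epsilon
    return left and right

# Both operands are safe to evaluate, so the two always agree.
for allow_zero in (True, False):
    for x in (0.0, 1.0, sys.float_info.epsilon / 2):
        assert short_circuit(allow_zero, x) == eager(allow_zero, x)
```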

So, what is your code doing that you don't understand?
 

Steven D'Aprano

Hi,
I'm having a hard time wrapping my head around short-circuit logic
that's used by Python, coming from a C/C++ background; so I don't
understand why the following condition is written this way!

if not allow_zero and abs(x) < sys.float_info.epsilon:
    print("zero is not allowed")

Follow the logic.

If allow_zero is a true value, then "not allow_zero" is False, and the
"and" clause cannot evaluate to true. (False and X is always False.) So
print is not called.

If allow_zero is a false value, then "not allow_zero" is True, and the
"and" clause depends on the second argument. (True and X is always X.) So
abs(x) < sys.float_info.epsilon is tested, and if that is True, print is
called.

By the way, I don't think much of this logic. Values smaller than epsilon
are not necessarily zero:

py> import sys
py> epsilon = sys.float_info.epsilon
py> x = epsilon/10000
py> x == 0
False
py> x * 3 == 0
False
py> x + epsilon == 0
False
py> x + epsilon == epsilon
False

The above logic throws away many perfectly good numbers and treats them
as zero even though they aren't.

The purpose of this snippet is to print the given line when allow_zero
is False and x is 0.

Then the snippet utterly fails at that, since it prints the line for many
values of x which can be distinguished from zero. The way to test whether
x equals zero is:

x == 0

What the above actually tests for is whether x is so small that (1.0+x)
cannot be distinguished from 1.0, which is not the same thing. It is also
quite arbitrary. Why 1.0? Why not (0.0001+x)? Or (0.00000001+x)? Or
(10000.0+x)?
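One common fix for that arbitrariness is to compare with a *relative* tolerance, scaled to the operands. A sketch (math.isclose arrived in Python 3.5, well after this thread, but it illustrates the idea):

```python
import math
import sys

eps = sys.float_info.epsilon

# epsilon is only meaningful relative to 1.0: it is the gap between
# 1.0 and the next representable float.
assert 1.0 + eps != 1.0
assert 1.0 + eps / 2 == 1.0  # half an epsilon vanishes next to 1.0

# A relative tolerance scales with the magnitude of the operands:
assert math.isclose(100.0, 100.0 + 1e-8)   # tiny *relative* difference
assert not math.isclose(1e-8, 2e-8)        # same absolute gap, huge relative one
```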
 

Terry Jan Reedy

if not allow_zero and abs(x) < sys.float_info.epsilon:
    print("zero is not allowed")

The reason for the order is to do the easy calculation first and the
harder one only if the first passes.
 

Roy Smith

Terry Jan Reedy said:
The reason for the order is to do the easy calculation first and the
harder one only if the first passes.

This is a particularly egregious case of premature optimization. You're
worried about how long it takes to execute abs(x)? That's silly.
 

Terry Jan Reedy

This is a particularly egregious case of premature optimization. You're
worried about how long it takes to execute abs(x)? That's silly.

This is a particularly egregious case of premature response. You're
ignoring an extra name lookup and two extra attribute lookups. That's silly.

That's beside the fact that one *must* choose, so any difference is a
reason to act rather than being frozen like Buridan's ass.
http://en.wikipedia.org/wiki/Buridan's_ass

If you wish, replace 'The reason' with 'A reason'. I also see the logical
flow as better with the order given.
 

Steven D'Aprano

This is a particularly egregious case of premature optimization. You're
worried about how long it takes to execute abs(x)? That's silly.

I don't think it's a matter of premature optimization so much as the
general principle "run code only if it needs to run". Hence, first you
check the flag to decide whether or not you care whether x is near zero,
and *only if you care* do you then check whether x is near zero.

# This is silly:
if x is near zero:
    if we care:
        handle near zero condition()

# This is better:
if we care:
    if x is near zero:
        handle near zero condition()


Not only is this easier to understand because it matches how we do things
in the real life, but it has the benefit that if the "near zero"
condition ever changes to become much more expensive, you don't have to
worry about reordering the tests because they're already in the right
order.
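Concretely, the nested form and the flat short-circuit form are equivalent; a sketch with a hypothetical care_about_zero flag standing in for "we care":

```python
import sys

def nested(care_about_zero, x):
    # The "check the flag first, then the value" structure, spelled out.
    if care_about_zero:
        if abs(x) < sys.float_info.epsilon:
            return "handled near zero"
    return "not handled"

def flat(care_about_zero, x):
    # Short-circuiting "and" gives the same control flow in one line.
    if care_about_zero and abs(x) < sys.float_info.epsilon:
        return "handled near zero"
    return "not handled"
```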
 

Cameron Simpson

| On Sun, 26 May 2013 16:22:26 -0400, Roy Smith wrote:
|
| > In article <[email protected]>,
| >
| >> On 5/26/2013 7:11 AM, Ahmed Abdulshafy wrote:
| >>
| >> > if not allow_zero and abs(x) < sys.float_info.epsilon:
| >> > print("zero is not allowed")
| >>
| >> The reason for the order is to do the easy calculation first and the
| >> harder one only if the first passes.
| >
| > This is a particularly egregious case of premature optimization. You're
| > worried about how long it takes to execute abs(x)? That's silly.
|
| I don't think it's a matter of premature optimization so much as the
| general principle "run code only if it needs to run". Hence, first you
| check the flag to decide whether or not you care whether x is near zero,
| and *only if you care* do you then check whether x is near zero.
|
| # This is silly:
| if x is near zero:
|     if we care:
|         handle near zero condition()
|
| # This is better:
| if we care:
|     if x is near zero:
|         handle near zero condition()
|
|
| Not only is this easier to understand because it matches how we do things
| in the real life, but it has the benefit that if the "near zero"
| condition ever changes to become much more expensive, you don't have to
| worry about reordering the tests because they're already in the right
| order.

I wouldn't even go that far, though nothing you say above is wrong.

Terry's assertion "The reason for the order is to do the easy
calculation first and the harder one only if the first passes" is
only sometimes the case, though well worth considering if the
second test _is_ expensive.

There are other reasons also. The first is of course your response,
that if the first test fails there's no need to even bother with
the second one. Faster, for free!

The second is that sometimes the first test is a guard against even
being able to perform the second test. Example:

if s is not None and len(s) > 0:
    ... do something with the non-empty string `s` ...

In this example, None is a sentinel value for "no valid string" and
calling "len(s)" would raise an exception because None doesn't have
a length.

With short-circuiting logic you can write this clearly and intuitively in one
line, without extra control structure like the nested ifs above.
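A minimal sketch of that guard in action (the describe() wrapper is made up for illustration):

```python
def describe(s):
    # "s is not None" guards the len() call: when s is None the
    # second operand is never evaluated, so no TypeError is raised.
    if s is not None and len(s) > 0:
        return "non-empty"
    return "empty or missing"

# Without the guard, len(None) raises:
#   TypeError: object of type 'NoneType' has no len()
```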

Cheers,
 

rusi

I don't think it's a matter of premature optimization so much as the
general principle "run code only if it needs to run". Hence, first you
check the flag to decide whether or not you care whether x is near zero,
and *only if you care* do you then check whether x is near zero.

# This is silly:
if x is near zero:
    if we care:
        handle near zero condition()

# This is better:
if we care:
    if x is near zero:
        handle near zero condition()

Not only is this easier to understand because it matches how we do things
in the real life, but it has the benefit that if the "near zero"
condition ever changes to become much more expensive, you don't have to
worry about reordering the tests because they're already in the right
order.

Three points:

3. These arguments are based on a certain assumption: that the inputs
are evenly distributed statistically.
If however that is not so, i.e. say:
"We-care" is mostly true
and
"x-is-near-zero" is more often false
then doing the near-zero test first would be advantageous.

Well that's the 3rd point...

2. Niklaus Wirth deliberately did not use short-circuit boolean
operators in his languages because he found these kinds of distinctions
deteriorate into irrelevance and miss the more crucial
questions of correctness.

1. As Roy pointed out in his initial response to the OP:
"I don't understand your confusion... None of <the above> applies to
your example"
it's not at all clear to me that anything being said has anything to do
with what the OP asked!
 

Vito De Tullio

Cameron said:
if s is not None and len(s) > 0:
    ... do something with the non-empty string `s` ...

In this example, None is a sentinel value for "no valid string" and
calling "len(s)" would raise an exception because None doesn't have
a length.

obviously in this case an `if s: ...` is more than sufficient :p
 

Nobody

I'm having a hard time wrapping my head around short-circuit logic that's
used by Python, coming from a C/C++ background; so I don't understand why
the following condition is written this way:

if not allow_zero and abs(x) < sys.float_info.epsilon:
    print("zero is not allowed")

The purpose of this snippet is to print the given line when allow_zero is
False and x is 0.

I don't understand your confusion. The above is directly equivalent to the
following C code:

if (!allow_zero && fabs(x) < DBL_EPSILON)
    printf("zero is not allowed\n");

In either case, the use of short-circuit evaluation isn't necessary here;
it would work just as well with a strict[1] "and" operator.

Short-circuit evaluation is useful if the second argument is expensive to
compute, or (more significantly) if the second argument should not be
evaluated if the first argument is false; e.g. if x is a pointer then:

if (x && *x) ...

relies upon short-circuit evaluation to avoid dereferencing a null pointer.

On an unrelated note: the use of the "epsilon" value here is
almost certainly wrong. If the intention is to determine if the result of
a calculation is zero to within the limits of floating-point accuracy,
then it should use a value which is proportional to the values used in
the calculation.
 

Ahmed Abdulshafy

Hi,

I'm having a hard time wrapping my head around short-circuit logic that's used by Python, coming from a C/C++ background, so I don't understand why the following condition is written this way:

if not allow_zero and abs(x) < sys.float_info.epsilon:
    print("zero is not allowed")

The purpose of this snippet is to print the given line when allow_zero is False and x is 0.

Thank you guys! You gave me valuable insights! But regarding my original post, I don't know why for the past two days I was looking at the code *only* this way:

if ( not allow_zero and abs(x) ) < sys.float_info.epsilon:

I feel so stupid now :-/, maybe it's the new syntax confusing me :)! Thanks again guys.
 

Ahmed Abdulshafy

Follow the logic.

If allow_zero is a true value, then "not allow_zero" is False, and the
"and" clause cannot evaluate to true. (False and X is always False.) So
print is not called.

If allow_zero is a false value, then "not allow_zero" is True, and the
"and" clause depends on the second argument. (True and X is always X.) So
abs(x) < sys.float_info.epsilon is tested, and if that is True, print is
called.

By the way, I don't think much of this logic. Values smaller than epsilon
are not necessarily zero:

py> import sys
py> epsilon = sys.float_info.epsilon
py> x = epsilon/10000
py> x == 0
False
py> x * 3 == 0
False
py> x + epsilon == 0
False
py> x + epsilon == epsilon
False

The above logic throws away many perfectly good numbers and treats them
as zero even though they aren't.

The purpose of this snippet is to print the given line when allow_zero
is False and x is 0.

Then the snippet utterly fails at that, since it prints the line for many
values of x which can be distinguished from zero. The way to test whether
x equals zero is:

x == 0

What the above actually tests for is whether x is so small that (1.0+x)
cannot be distinguished from 1.0, which is not the same thing. It is also
quite arbitrary. Why 1.0? Why not (0.0001+x)? Or (0.00000001+x)? Or
(10000.0+x)?

That may be true for integers, but for floats, testing for equality is not always precise.
 

Nobody

That may be true for integers,

What may be true for integers?
but for floats, testing for equality is not always precise

And your point is?

What Steven wrote is entirely correct: sys.float_info.epsilon is the
smallest value x such that 1.0 and 1.0+x have distinct floating-point
representations. It has no relevance for comparing to zero.
 

Dennis Lee Bieber

Thank you guys! You gave me valuable insights! But regarding my original post, I don't know why for the past two days I was looking at the code *only* this way:
if ( not allow_zero and abs(x) ) < sys.float_info.epsilon:
Ah... That's covered under "operator precedence" and not the
short-circuit evaluation rule.

Boolean "and" and "or" tend to come last in the parsing (bitwise &
and ^ come earlier, as I recall).

{Python 2.x}
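The two parses really do differ; a quick sketch making each grouping explicit:

```python
import sys

allow_zero = True
x = 0.0

# Correct parse: comparisons bind tighter than "not", which binds
# tighter than "and", so the condition reads
#   (not allow_zero) and (abs(x) < sys.float_info.epsilon)
a = not allow_zero and abs(x) < sys.float_info.epsilon

# The misreading, with explicit parentheses:
#   ((not allow_zero) and abs(x)) < sys.float_info.epsilon
# "False and abs(x)" short-circuits to False, and False (numerically 0)
# compares less than epsilon, so the whole expression is True.
b = (not allow_zero and abs(x)) < sys.float_info.epsilon
```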
 

Ahmed Abdulshafy

What may be true for integers?

And your point is?

What Steven wrote is entirely correct: sys.float_info.epsilon is the
smallest value x such that 1.0 and 1.0+x have distinct floating-point
representations. It has no relevance for comparing to zero.

He just said that the way to test for zero equality is x == 0, and I meant that this is true for integers but not necessarily for floats. And that's not specific to Python.
 

Carlos Nepomuceno

----------------------------------------
Date: Tue, 28 May 2013 01:39:09 -0700
Subject: Re: Short-circuit Logic
From: (e-mail address removed) [...]
What Steven wrote is entirely correct: sys.float_info.epsilon is the
smallest value x such that 1.0 and 1.0+x have distinct floating-point
representations. It has no relevance for comparing to zero.

He just said that the way to test for zero equality is x == 0, and I meant that this is true for integers but not necessarily for floats. And that's not specific to Python.

Have you read [1]? There's a section "Infernal Zero" that discusses this problem. I think it's very interesting to know! ;)

Just my 49.99999999999998¢! lol


[1] http://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/
 

Mark Lawrence

And that's not specific to Python.

Using google products is also not specific to Python. However, wherever
it's used it's a PITA as people are forced into reading double-spaced
crap. Please check out the link in my signature.
 

Steven D'Aprano

That may be true for integers, but for floats, testing for equality is
not always precise

Incorrect. Testing for equality is always precise, and exact. The problem
is not the *equality test*, but that you don't always have the number
that you think you have. The problem lies elsewhere, not equality!
Unfortunately, people who say "never test floats for equality" have
misdiagnosed the problem, or they are giving a simple work-around which
can be misleading to those who don't understand what is actually going on.

Any floating point library that supports IEEE-754 semantics can
guarantee a few things, including:

x == 0.0 if, and only if, x actually equals zero.

This was not always the case for all floating point systems prior to
IEEE-754. In his foreword to the Apple Numerics Manual, William Kahan
describes a Capriciously Designed Computer where 1/x can give a Division
By Zero error even though x != 0. Fortunately, if you are programming in
Python on Intel-compatible hardware, you do not have to worry about
nightmares like that.

Let me repeat that: in Python, you can trust that if x == 0.0 returns
False, then x is definitely not zero.

In any case, the test that you show is not a good test. I have already
shown that it wrongly treats many non-zero numbers which can be
distinguished from zero as if they were zero. But worse, it also fails as
a guard against numbers which cannot be distinguished from zero!

py> import sys
py> epsilon = sys.float_info.epsilon
py> x = 2*epsilon
py> x < epsilon # Is x so tiny it looks like zero?
False
py> y = 1e17 + x # x is not zero, so y should be > 1e17
py> 1/(1e17 - y)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ZeroDivisionError: float division by zero


So as you can see, testing for "zero" by comparing to machine epsilon
does not save you from Zero Division errors.
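An exact x == 0 guard, by contrast, does prevent the error; a minimal sketch:

```python
def safe_reciprocal(x):
    # Exact equality is the right guard here: for IEEE-754 floats,
    # 1.0/x raises ZeroDivisionError exactly when x == 0.0.
    if x == 0.0:
        return None
    return 1.0 / x
```

It works even for the smallest subnormal float, which the machine-epsilon test above would wrongly treat as zero.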
 

Steven D'Aprano

He just said that the way to test for zero equality is x == 0, and I
meant that this is true for integers but not necessarily for floats. And
that's not specific to Python.

Can you show me a value of x where x == 0.0 returns False, but x actually
is zero?

Built-in floats only, if you subclass you can do anything you like:

class Cheating(float):
    def __eq__(self, other):
        return False
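Restated self-contained, with a __hash__ added (Python 3 otherwise sets it to None when __eq__ is overridden):

```python
class Cheating(float):
    def __eq__(self, other):
        return False  # lie: never equal to anything, not even itself

    __hash__ = float.__hash__  # restore hashability lost by overriding __eq__

z = Cheating(0.0)
# The subclass lies about equality, but the underlying value is still zero:
#   z == 0.0        -> False
#   float(z) == 0.0 -> True
```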
 
