Attack a sacred Python Cow

Matthew Woodcraft

Steven D'Aprano said:
"if x" is completely type agnostic. You can pass an object of any type to
it, and it will work. (Excluding objects with buggy methods, naturally.)

There are many circumstances where if a parameter is None I'd rather
get an exception than have the code carry on with the 'empty container'
branch (and silently give me a meaningless result).
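
A small sketch of that concern, using two hypothetical `mean_*` helpers that are not from the thread. The truth test treats None like an empty list, while the explicit length test fails loudly:

```python
def mean_implicit(values):
    # Truth test: None, [], () and 0 all take the "empty" branch.
    if not values:
        return 0.0
    return sum(values) / len(values)


def mean_explicit(values):
    # Explicit length test: len(None) raises TypeError instead.
    if len(values) == 0:
        return 0.0
    return sum(values) / len(values)


print(mean_implicit(None))  # silently reports 0.0
try:
    mean_explicit(None)
except TypeError as exc:
    print("rejected:", exc)
```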

-M-
 
Matthew Woodcraft

Ben Finney said:
No, he retracted the *insult* and restated the *advice* as a distinct
statement. I think it's quite worthwhile to help people see the
difference.

Ben, it was quite clear from Anders' post that he knows about
__nonzero__ . That's why the so-called advice is insulting. The
original phrasing was just the icing on the cake.

-M-
 
Carl Banks

It's pretty elementary, and people thought just describing the issue of
polymorphism and duck-typing was sufficient to explain it. Since it
apparently isn't:

Let's say you come up with some kind of custom sequence class. You want
to act like any native sequence type (list, tuple, array, string, etc.)
in all reasonable ways (length testing, iteration, indexing, etc.) so
that it can be used in place of these things in code that doesn't
require explicit types. You know, standard polymorphism and duck-typing.

So you want a test for whether your custom sequence isn't empty. To
create a "simple, explicit test", you would define an `isntEmpty` method
that you can call, like so:

if myObject.isntEmpty():
    # then do something

However, this wouldn't be polymorphic since now someone would have to
call a "simple, explicit test" that doesn't exist on all the other
sequence-like objects. Therefore, you've broken polymorphism.

The solution is to override the `__nonzero__` method so that you can use
Boolean testing, just like all the other sequence-like objects:

if myObject:
    # then do the same thing

Now people who use your custom sequence type don't have to write special
code, and code written to deal with sequences using duck typing (which
is typically nearly all Python code) doesn't have to know anything special
about your custom sequence class.
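
A minimal sketch of the idea in Python 3, where the hook is spelled `__bool__` rather than `__nonzero__`. The `Playlist` class and the `describe` function are made up for illustration; the point is only that `describe` needs no special case for the custom type:

```python
class Playlist:
    """Hypothetical custom sequence that behaves like a built-in one."""

    def __init__(self, *tracks):
        self._tracks = list(tracks)

    def __len__(self):
        return len(self._tracks)

    def __getitem__(self, index):
        return self._tracks[index]

    def __bool__(self):
        # Python 3 name for the hook called __nonzero__ in this thread.
        return bool(self._tracks)


def describe(seq):
    # Duck-typed: works for list, tuple, str and Playlist alike.
    if seq:
        return "first item: %s" % (seq[0],)
    return "empty"


for candidate in ([], ("a", "b"), "xyz", Playlist(), Playlist("intro")):
    print(describe(candidate))
```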

Bzzt. "if len(x)!=0" is a simple explicit test that would work for this
class and all built-in containers. (Or should--Steven D'Aprano's
objections notwithstanding, any reasonable container type should
support this invariant. From a language design standpoint, an "empty"
builtin could have been created to simplify this even more, but since
there isn't one len(x)!=0 will have to do.)

Let me reframe the question to see if we can make some headway.

The vast majority of true/false tests fit into one of the following
four categories:

1. Testing the explicit result of a boolean operation (obviously)
2. Testing whether a numeric type is nonzero.
3. Testing whether a container type is empty.
4. Testing whether you have a (non-numeric, non-container) object of
some sort, or None.

There's a few other cases, but let's start with these to keep things
simple and add other cases as necessary.

We already know that cases 2, 3, and 4 can, individually, be converted
to a simple explicit test (using x!=0, len(x)!=0, and x is not None,
respectively). As long as you know which kind of object you're
expecting, you can convert the implicit to an explicit test.
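
The three explicit tests can be sketched as below. The function names are invented here for illustration. The last lines show why each test only fits the kind of object it was written for:

```python
def nonzero(number):
    return number != 0          # case 2: numeric types


def nonempty(container):
    return len(container) != 0  # case 3: container types


def present(obj):
    return obj is not None      # case 4: object or None


print(nonzero(0), nonzero(2.5))
print(nonempty([]), nonempty("abc"))
print(present(None), present(object()))

# Applying a test to the wrong kind of object either misleads or fails:
print(nonzero([]))              # a list never equals 0, so this is True
try:
    nonempty(5)
except TypeError as exc:
    print("len() rejects numbers:", exc)
```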

Now, you guys keep whining "But what if you don't know what kind of
object you're expecting?!!" It's a fair question, and my belief is
that, in practice, this almost never happens. Duck typing happens
between numeric types often, and between container types often, but
almost never between both numeric and container types. Their usages
are simply too different.

So I present another question to you: Give me a useful, non-trivial
example of some code where x could be either a numeric or
container type. That would be the first step to finding a
counterexample. (The next step would be to show that it's useful to
use "if x" in such a context.)


Once again, I'm invoking the constraint that simply using x in a
boolean context, or passing x to a function expecting a boolean,
doesn't count, since in those cases x can be set to the result of the
explicit test. So none of this:

def nand(a,b): return not (a and b)

Or anything trivial like this:

def add(a,b): return a+b


Carl Banks
 
Heiko Wundram

Also, just a couple of points:

1. Any container type that returns a length that isn't exactly the
number of elements in it is broken.

I agree, but how do you ever expect to return an infinite element count?
The direction I took in that recipe was not returning some "magic" value
but raising an OverflowError (for example, you could've also cropped the
length at 2**31-1 as meaning anything equal to or larger). This is the
thing that breaks your explicit test for non-emptiness using len(x) > 0,
but it's also the only possible thing to do if you want to return the
number of elements exactly where possible and inform the user when not
(and OverflowError should make the point clear).
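
A rough sketch of that behaviour, not the ASPN recipe itself. `IntegersFrom` is a hypothetical class standing in for an infinite set: it refuses to report a length but can still answer the emptiness question:

```python
class IntegersFrom:
    """Hypothetical infinite set {start, start + 1, start + 2, ...}."""

    def __init__(self, start):
        self.start = start

    def __contains__(self, n):
        return isinstance(n, int) and n >= self.start

    def __len__(self):
        # No integer describes the size, so refuse to answer.
        raise OverflowError("set is infinite")

    def __bool__(self):
        # Emptiness is still easy to decide: there is always a member.
        return True


naturals = IntegersFrom(0)
if naturals:                      # uses __bool__, never calls __len__
    print("non-empty")
try:
    len(naturals) != 0            # the explicit test breaks here
except OverflowError as exc:
    print("len() failed:", exc)
```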

Anyway, that's why there is a separate member function which is explicitly
documented to return a magic value in case of an infinite set (i.e., -1)
and an exact element count otherwise, but using that (you could write
x.len() != 0 for the type in question to test for non-emptiness) breaks
polymorphism.

2. The need for __nonzero__ in this case depends on a limitation in
the language.

True, but only for sets which are finite. For an infinite set, as I said
above: what would you want __len__() to return? There is no proper
interpretation of __len__() for an infinite set, even though the set is
non-empty, except if you introduced the magic value infinity into Python
(which I did as -1 for my "personal" length protocol).

3. On the other hand, I will concede that sometimes calculating len is
a lot more expensive than determining emptiness, and at a basic level
it's important to avoid these costs. You have found a practical use
case for __nonzero__.

This is just a somewhat additional point I was trying to make; the main
arguments are the two points you see above.

However, I'd like to point out the contrasting example of numpy
arrays. For numpy arrays, "if x" fails (it raises an exception) but
"if len(x)!=0" succeeds.

The only sane advice for dealing with nonconformant classes like numpy
arrays or your integer set is to be wary of nonconformances and don't
expect polymorphism to work all the time.

The thing is: my integer set type IS conformant to the protocols of all
other sequence types that Python offers directly, and as such can be used
in any polymorphic function that expects a sequence type and doesn't test
for the length (because of the obvious limitation that the length might
not have a bound), but only for emptiness/non-emptiness. It's the numpy
array that's non-conformant (at least from what you're saying here; I
haven't used numpy yet, so I can't comment).

So I guess I'll concede that in the occasional cases with
nonconformant classes the "if x" might help increase polymorphism a
little.

(BTW: here's another little thing to think about: the "if x" is useful
here only because there isn't an explicit way to test emptiness
without len.)

The thing being, again, as others have already stated: __nonzero__() IS
the explicit way to test non-emptiness of a container (type)! If I wanted
to make things more verbose, I'd not use "if len(x)>0", but "if bool(x)"
anyway, because "casting" to a boolean calls __nonzero__(). "if len(x)>0"
solves a different problem (even though in set theory the two are
logically similar), and might not apply to all container types because of
the restrictions on the return value of __len__(), which will always exist.
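
For reference, a short sketch of how truth testing picks its answer in Python 3, where `__nonzero__` is named `__bool__`. The four classes are made up for illustration: `__bool__` is consulted first, `__len__` is only a fallback, and an object with neither is always true:

```python
class OnlyLen:
    def __len__(self):
        return 0


class OnlyBool:
    def __bool__(self):
        return False


class Both:
    def __len__(self):
        return 0        # ignored by truth testing because __bool__ exists

    def __bool__(self):
        return True


class Neither:
    pass


for cls in (OnlyLen, OnlyBool, Both, Neither):
    print(cls.__name__, bool(cls()))
```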

--- Heiko.
 
Carl Banks

As I wrote in the second reply email I sent, check out my integer set
recipe on ASPN (and to save you the search: http://code.activestate.com/recipes/466286/).

Couple points:

1. Any container type that returns a length that isn't exactly the
number of elements in it is broken.
2. The need for __nonzero__ in this case depends on a limitation in
the language.
3. On the other hand, I will concede that sometimes calculating len is
a lot more expensive than determining emptiness, and at a basic level
it's important to avoid these costs. You have found a practical use
case for __nonzero__.

However, I'd like to point out the contrasting example of numpy
arrays. For numpy arrays, "if x" fails (it raises an exception) but
"if len(x)!=0" succeeds.
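
To illustrate without depending on NumPy, here is a stand-in class, `AmbiguousArray`, which is hypothetical and not NumPy's API. It mimics the relevant behaviour: NumPy raises ValueError when an array with more than one element is used in a boolean context, while `len` still works:

```python
class AmbiguousArray:
    """Stand-in for an array type that refuses to be truth-tested."""

    def __init__(self, data):
        self._data = list(data)

    def __len__(self):
        return len(self._data)

    def __bool__(self):
        raise ValueError("truth value of an array is ambiguous")


def is_nonempty(x):
    return len(x) != 0


arr = AmbiguousArray([1, 2, 3])
print(is_nonempty(arr))     # the explicit test succeeds
try:
    if arr:                 # the implicit test raises
        pass
except ValueError as exc:
    print("if x failed:", exc)
```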

The only sane advice for dealing with nonconformant classes like numpy
arrays or your integer set is to be wary of nonconformances and don't
expect polymorphism to work all the time.

So I guess I'll concede that in the occasional cases with
nonconformant classes the "if x" might help increase polymorphism a
little.

(BTW: here's another little thing to think about: the "if x" is useful
here only because there isn't an explicit way to test emptiness
without len.)

Carl Banks
 
giltay

I'm not going to take your word for it.  Do you have code that
demonstrates how "if x" improves polymorphism relative to simple
explicit tests?

Carl Banks

I think "if x" works in some cases, but not in others. Here's an
example I just made up and a counterexample from my own code.

Example:

Here's a function, print_members. It's just something that takes some
iterable and prints its members in XML. It's special-cased so that an
empty iterable gets an empty tag. (This is a bit of a trivial
example, I admit; the point is that the empty iterable is a special
case.)

def print_members(iterable):
    if not iterable:
        print '<members />'
        return
    print '<members>'
    for item in iterable:
        print '<item>%s</item>' % item
    print '</members>'

>>> print_members(['a', 'b', 'c'])
<members>
<item>a</item>
<item>b</item>
<item>c</item>
</members>

>>> print_members([])
<members />

I could have used "if len(iterable) == 0:" for these cases, but only
because the length of a list means something. Say I were to implement
a matrix class. The length of a matrix isn't well-defined, so I won't
implement __len__. I will, however, implement __nonzero__, since if
it has no members, it's empty.

class SimpleMatrix(object):

    def __init__(self, *rows):
        self.rows = rows

    def __nonzero__(self):
        for row in self.rows:
            if row:
                return True
        return False

    def __iter__(self):
        return iter(sum(self.rows, []))

>>> a = SimpleMatrix([1, 2, 3], [4, 5, 6])
>>> print_members(a)
<members>
<item>1</item>
<item>2</item>
<item>3</item>
<item>4</item>
<item>5</item>
<item>6</item>
</members>
>>> b = SimpleMatrix([],[],[])
>>> len(b)
Traceback (most recent call last):
  [...]
>>> print_members(b)
<members />

So print_members can work on iterables that have no len(), and handle
the special case of an empty iterable, as long as __nonzero__ is
implemented.
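
For readers on Python 3, the same idea might be ported roughly as follows. This is a sketch, not the code from the post: `__nonzero__` becomes `__bool__`, and `render_members` is a renamed variant that returns the lines instead of printing them so the result is easy to check:

```python
from itertools import chain


def render_members(iterable):
    """Return the XML lines for an iterable, special-casing emptiness."""
    if not iterable:
        return ["<members />"]
    lines = ["<members>"]
    lines.extend("<item>%s</item>" % item for item in iterable)
    lines.append("</members>")
    return lines


class SimpleMatrix:
    def __init__(self, *rows):
        self.rows = rows

    def __bool__(self):
        # True as soon as any row has at least one element.
        return any(self.rows)

    def __iter__(self):
        return chain.from_iterable(self.rows)


print("\n".join(render_members(SimpleMatrix([1, 2, 3], [4, 5, 6]))))
print("\n".join(render_members(SimpleMatrix([], [], []))))
```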

Counterexample:

While "if x" works well in some circumstances, I don't like using it
for purely numeric types. For instance, I have a mutable Signal class
for doing some basic DSP. Among other things, I can apply a DC offset
to the signal (it just adds the offset to all the samples). I have a
special case for an offset of 0 so I don't have to loop through all
the samples (I also have a two-pass remove_offset method that
subtracts the average; if it's already properly centred, I can skip a
step).

class Signal:
    [...]
    def dc_offset(self, amount):
        if amount == 0:
            return
        self.samples = [sample + amount for sample in self.samples]

Here, "if amount == 0" is deliberate. At no point should I be adding
an offset that's a list or a dict, even an empty one.
Signal.dc_offset should raise an exception if I try to do that,
because that indicates there's a bug somewhere. If I do pass in [] or
{}, that test will fail, and it will try to add the list or dict to
the samples, at which point I get a TypeError.
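
A runnable sketch of that behaviour. The constructor is an assumption, since the original class body is elided; only `dc_offset` follows the post:

```python
class Signal:
    def __init__(self, samples):
        # Assumed constructor: the original elides everything but dc_offset.
        self.samples = list(samples)

    def dc_offset(self, amount):
        if amount == 0:
            return
        self.samples = [sample + amount for sample in self.samples]


sig = Signal([1.0, 2.0, 3.0])
sig.dc_offset(0)        # no-op, the samples are not touched
sig.dc_offset(0.5)      # every sample is shifted
print(sig.samples)

try:
    sig.dc_offset([])   # [] == 0 is False, so the addition runs and fails
except TypeError as exc:
    print("bug surfaced:", exc)
```

With `if not amount:` instead, the empty list would have been accepted silently as "no offset".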

Geoff G-T
 
Matthew Fitzgibbons

Carl said:
Much like in Steven D'Aprano's example, still the only actual code
snippet I've seen, it seems that this can easily be done with a simple
explicit test by having all no-advance filters return None and testing
with "if x is not None". So it doesn't pass my criterion of being not
replaceable with simple explicit test.

Maybe that's not workable for some reason. Perhaps if you'd post a
code example that shows this, rather than just talking about it, you
might be more persuasive.


Carl Banks

The no-advance filters have to return the object because I don't just
forget about it; I evaluate whether I pass it to the next filter or drop
it in a completely different queue for use in the next stage of the
operation. True means 'I'm ready to move on to the next stage,' False
means 'Do the filter thing some more.'

Furthermore, the argument that I should just change my API to make a
'simple test' work is not very convincing. The natural, obvious way for
a filter to work is to pass through the data it operates on; why on
Earth would it return None? I want to DO something with the data. In
this case, make a decision about where to pass the data next. In Java,
to accomplish this I would have to do lots of introspection and value
checking (adding more any time I came up with a new kind of input), or
make a new kind of interface that gives me a method so I can do a
'simple test' (including wrappers for ints and arrays and anything else
I decide to pass in down the road). But Python supports duck typing and
gives me a handy __nonzero__ method; I can rebind __nonzero__ in my
filters for my own classes, and ints and lists are handled how I want
them to be by default. So why jump through hoops instead of just using
'if x'?

I don't have any postable code (it's in a halfway state and I haven't
touched it for a while), but I'll see if I can't find the time to bang
something up to give you the gist.

-Matt
 
Steven D'Aprano

I'm not going to take your word for it. Do you have code that
demonstrates how "if x" improves polymorphism relative to simple
explicit tests?


On the rapidly decreasing chance that you're not trolling (looking more
and more unlikely every time you post):

# The recommended way:
if x:
    do_something


# Carl's so-called "simple explicit tests" applied to polymorphic code:
try:
    # could be a sequence or mapping?
    # WARNING: must do this test *before* the number test, otherwise
    # "if [] != 0" will return True, leading to the wrong branch being
    # taken.
    if len(x) != 0:
        do_something
except AttributeError:
    # not a sequence or mapping, maybe it's a number of some sort
    try:
        int(x)
    except TypeError:
        # not convertible to numbers
        # FIXME: not really sure what to do here for arbitrary types
        # so fall back on converting to a boolean, and hope that works
        if bool(x):
            do_something
    else:
        if x != 0:
            do_something


But wait... that can be re-written as follows:

if bool(x):
    do_something

and that can be re-written without the call to bool:

if x:
    do_something
 
Steven D'Aprano

Much like in Steven D'Aprano's example, still the only actual code
snippet I've seen, it seems that this can easily be done with a simple
explicit test by having all no-advance filters return None and testing
with "if x is not None".

Yet again you ignore the actual functional requirements of the code, and
make vague claims that if we change the function to another function that
behaves differently, we can replace our simple code with your more
complex code, and it (probably) won't break. Even if we could redesign
the functional requirements of the code, why on earth would we want to?
 
Steven D'Aprano

There are many circumstances where if a parameter is None I'd rather get
an exception than have the code carry on with the 'empty container'
branch (and silently give me a meaningless result).

Sure. The way I'd handle that is test for None at the start of the
function, and raise an error there, rather than deal with it at some
arbitrary part of the function body.
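
A minimal sketch of that approach, with a hypothetical `count_words` function. None is rejected up front, and the truth test further down only has to distinguish empty from non-empty:

```python
def count_words(lines):
    # Guard clause: fail at the boundary instead of deep in the body.
    if lines is None:
        raise TypeError("lines must be a sequence of strings, not None")
    if not lines:
        return 0
    return sum(len(line.split()) for line in lines)


print(count_words([]))
print(count_words(["if x", "is completely type agnostic"]))
try:
    count_words(None)
except TypeError as exc:
    print("rejected:", exc)
```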
 
Steven D'Aprano

Ben, it was quite clear from Anders' post that he knows about
__nonzero__ . That's why the so-called advice is insulting. The original
phrasing was just the icing on the cake.

Anders wrote:

"But then you decide to name the method "__nonzero__", instead of some
nice descriptive name?"

That suggests to me that Anders imagined that __nonzero__ is something I
just made up, instead of a standard Python method. What does it suggest
to you?
 
Steven D'Aprano

Bzzt. "if len(x)!=0" is a simple explicit test that would work for this
class and all built-in containers. (Or should--Steven D'Aprano's
objections notwithstanding, any reasonable container type should support
this invariant.

What's the length of an empty list with a sentinel at the end?

What's the length of a binary tree? Does it even have a length?

What's the length of a constant-sized table where each position might be
in use or free?

You're making assumptions about what an empty object "should" look like.
I say, just ask the object if it's empty, don't try to guess how you
would find out. Maybe the object carries around a flag x.isempty. Who
knows? Why do you care? You're basing your code on implementation
details. That should be avoided whenever possible.

[...]
Once again, I'm invoking the constraint against simply using x in a
boolean context,

So, to put it another way... you want us to give you an example of using
x in a boolean context, but you're excluding any example where we use x
in a boolean context.

Right-o, I've had enough of your trolling. I'm out of here.
 
Ethan Furman

Matthew said:
Ben, it was quite clear from Anders' post that he knows about
__nonzero__ . That's why the so-called advice is insulting. The
original phrasing was just the icing on the cake.

-M-

I got just the opposite -- it seems quite clear to me that Anders did
*not* know about __nonzero__, and perhaps doesn't know about
double-underscore functions in general...

Here's his quote:
> Okay, so you have this interesting object property that you often need
> to test for, so you wrap the code for the test up in a method, because
> that way you only need to write the complex formula once. I'm with
> you so far. But then you decide to name the method "__nonzero__",
> instead of some nice descriptive name? What's up with that?

His last question, "What's up with that?" is the indicator. His
comments after that, such as
> Even if we find out that C.__nonzero__ is called, what was it that
> __nonzero__ did again?

reinforce the impression that he is unaware of the double-underscore
functions and what they do and how they work. One can find out when
they are called with a simple search of the Python documentation.

> Better dig up the class C documentation and find out, because there is
> no single obvious interpretation of what it means for an object to
> evaluate to true.

If you are using somebody else's code, and maybe even your own, you
should always check the docs if you don't know/can't remember what a
function does.

~Ethan~
 
Matthew Woodcraft

Steven D'Aprano said:
Anders wrote:
That suggests to me that Anders imagined that __nonzero__ is something I
just made up, instead of a standard Python method. What does it suggest
to you?

That he thinks using __nonzero__ like this decreases readability.

He also wrote
In comparison, I gather you would write something like this:

class C:
    def __nonzero__(self):
        return len(self.method()[self.attribute]) > -1
...
c = get_a_C()
if c:
    ...

If he had imagined that __nonzero__ was something you just made up, the
second-last line there would have read
if c.__nonzero__():

-M-
 
Terry Reedy

Carl said:
Couple points:

1. Any container type that returns a length that isn't exactly the
number of elements in it is broken.
2. The need for __nonzero__ in this case depends on a limitation in
the language.
3. On the other hand, I will concede that sometimes calculating len is
a lot more expensive than determining emptiness, and at a basic level
it's important to avoid these costs. You have found a practical use
case for __nonzero__.

I thought of another one: testing whether an iterator is 'empty' (will
raise StopIteration on the next next() (3.0) call) or not. As virtual
collections, iterators generally have neither __len__ nor __bool__. But
__bool__ (but only __bool__) can be added to any iterator by wrapping it
with something like the following 3.0 code (not tested):

class look_ahead_it():
    def __init__(self, iterable):
        self.it = iter(iterable)
        self.fill_next()

    def __iter__(self):
        return self

    def __next__(self):
        tem = self.next
        if tem is self.empty:
            raise StopIteration
        else:
            self.fill_next()
            return tem

    # Sentinel marking "no buffered item left".
    empty = object()

    def fill_next(self):
        try:
            self.next = next(self.it)
        except StopIteration:
            self.next = self.empty

    def __bool__(self):
        return self.next is not self.empty
 
Carl Banks

That you choose not to test for non-emptiness doesn't change the fact
that it's already a builtin part of the language that is supported by
all fundamental types and is overridable by anyone writing a custom
type. Use it or don't use it, but it's an example of precisely what you
were asking for that is both practical and already in widespread use.

That's not what I was asking for. I was asking for a use case for "if
x" that can't be replaced by a simple explicit test. Your example
didn't satisfy that.


Carl Banks
 
Carl Banks

On the rapidly decreasing chance that you're not trolling (looking more
and more unlikely every time you post):

# The recommended way:
if x:
    do_something

# Carl's so-called "simple explicit tests" applied to polymorphic code:

No, the following isn't my way.

try:
    # could be a sequence or mapping?
    # WARNING: must do this test *before* the number test, otherwise
    # "if [] != 0" will return True, leading to the wrong branch being
    # taken.
    if len(x) != 0:
        do_something
except AttributeError:
    # not a sequence or mapping, maybe it's a number of some sort
    try:
        int(x)
    except TypeError:
        # not convertible to numbers
        # FIXME: not really sure what to do here for arbitrary types
        # so fall back on converting to a boolean, and hope that works
        if bool(x):
            do_something
    else:
        if x != 0:
            do_something

I say that you will never, ever have to do this because there isn't a
do_something that's actually useful for all these types.

Ok, tell me, oh indignant one, what is do_something? What could
possibly be the contents of do_something such that it actually does
something useful? If you can tell me what do_something is you will
have answered my question.


Carl Banks
 
Russ P.

Are you actually this stupid? I mean, you were entertaining while you
were mouthing off and insulting your betters, but now you're gonna
complain the second anyone insults you (and I mean, 'boy' - what an
insult!). Never mind that you're never gonna get off your ass to
write a PEP, which would be rejected on language design grounds anyway
(as demonstrated by alex23's link - the one you aren't
comprehending). The most irritating thing is that I like the idea of
being able to use '.x = 10' type notation (and have been for a long
time), but the person arguing for it is an insufferable buffoon who's
too dense to understand a cogent argument, never mind make one. So
great, thanks, the chances of this (or a VB 'with'-like 'using'
keyword) ever making it into the language get smaller every time you
fire up your keyboard. Nice work.

Iain

p.s. am looking forward to your post whining about the invalid reasons
your PEP got rejected, in the slim hope you actually write one.


+1 POTW

Thanks for the gentle prod! I'm on it -- well, soon anyway. I have to
consciously avoid thinking about this post whenever I consume a
beverage!
 
Carl Banks

It's a completely artificial request.

Bull. This is a request, that, if satisfied, would prove that "if x"
is more polymorphic than a simple explicit test. I posed the question
precisely to see if anyone could come up with a use case that shows
this benefit of "if x".

"if x" _is_ a completely simple
test. Simpler, in fact, than the ones you were advocating.

It's not explicit.

Here's what we know for sure.
1. "if x" uses fewer keystrokes than an explicit test
2. "if x" creates fewer nodes in the parse tree than an explicit test
3. Couple minor things we somehow managed to uncover in this thread

Is that it? Is that all the benefits of "if x"? This is what I want
to establish. A bunch of people in this thread are crowing that it
helps polymorphism, but I have only seen minor examples of it.

I've explained why I doubt that it helps polymorphism that much: you
almost never see code for which an integer and list both work, so
having the ability to spell a test the same way for both types isn't
useful. If you claim that "if x" does help polymorphism, please tell
me what's wrong with the above analysis.

Or just give me the use case I asked for.


Carl Banks
 
Carl Banks

I thought of another one: testing whether an iterator is 'empty' (will
raise StopIteration on the next next() (3.0) call) or not. As virtual
collections, iterators generally have neither __len__ nor __bool__. But
__bool__ (but only __bool__) can be added to any iterator by wrapping it
with something like the following 3.0 code (not tested):

class look_ahead_it():
    def __init__(self, iterable):
        self.it = iter(iterable)
        self.fill_next()

    def __iter__(self):
        return self

    def __next__(self):
        tem = self.next
        if tem is self.empty:
            raise StopIteration
        else:
            self.fill_next()
            return tem

    # Sentinel marking "no buffered item left".
    empty = object()

    def fill_next(self):
        try:
            self.next = next(self.it)
        except StopIteration:
            self.next = self.empty

    def __bool__(self):
        return self.next is not self.empty

Iterators are funny: if there's any reason you should not use "if x"
it's because of them. Built-in iterators are always true, so if
you're writing a function that accepts an iterable you should never
use the "if x" to test whether it's empty, because it fails for a
whole class of iterables.
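
A quick Python 3 sketch of the wart, plus one possible way around it. The `nonempty` helper is hypothetical; it pulls one item to learn whether anything is there and then chains it back on so nothing is lost:

```python
from itertools import chain

# Exhausted or not, plain iterators and generators are always true.
print(bool(iter([])))
print(bool(x for x in ()))


def nonempty(iterable):
    """Return (has_items, iterator) without dropping the first item."""
    it = iter(iterable)
    sentinel = object()
    first = next(it, sentinel)
    if first is sentinel:
        return False, iter(())
    return True, chain([first], it)


has_items, items = nonempty(n * n for n in range(3))
if has_items:
    print(list(items))
```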

However, given that that wart exists, your example does work for "if
x" and not with "if len(x)!=0".

Then again, it really only works to accommodate faulty code, because
no code that expects an iterable should be using that test in the
first place. (Unless you wrap every iterable as soon as you get it,
but then you're not bound to use __nonzero__.)


Carl Banks
 
