assertions to validate function parameters


Matthew Wilson

Lately, I've been writing functions like this:

def f(a, b):
    assert a in [1, 2, 3]
    assert b in [4, 5, 6]

The point is that I'm checking the type and the values of the
parameters.

I'm curious how this does or doesn't fit into python's duck-typing
philosophy.

I find that when I detect invalid parameters overtly, I spend less time
debugging.

Are other people doing things like this? Any related commentary is
welcome.

Matt
 

George Sakkis

Lately, I've been writing functions like this:

def f(a, b):
    assert a in [1, 2, 3]
    assert b in [4, 5, 6]

The point is that I'm checking the type and the values of the
parameters.

I'm curious how this does or doesn't fit into python's duck-typing
philosophy.

I find that when I detect invalid parameters overtly, I spend less time
debugging.

Are other people doing things like this? Any related commentary is
welcome.

Matt

Well, duck typing (or even static typing, for that matter) can't help
you if you're checking for specific *values*, not types. The real
question is rather: is f() intended to be used in the "outside world"
(whatever that might be: another program, a library, a web service,
etc.), or is it to be called only by you, in a very controlled fashion?

In the first case, passing an invalid input is a *user error*, not a
*programming error*. Assertions should be used only for programming
errors, to assert invariants: statements which should hold no matter
what. A violation means that the code is buggy and should be fixed
ASAP.

User errors, OTOH, should be handled by explicit checking and raising
appropriate exceptions, e.g. ValueError or a subclass of it. There are
several reasons for this, but a very practical one is that a user can
turn off assertions by running python with '-O' or '-OO'. Optimization
flags should never change the behavior of a program, so using
assertions for what's part of the normal program behavior (validating
user-provided input) is wrong.
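As a concrete sketch of the difference, reusing the OP's example (the
names and allowed values are just illustrative):

def f(a, b):
    # Explicit checks survive 'python -O'; asserts would be stripped.
    if a not in (1, 2, 3):
        raise ValueError("a must be 1, 2 or 3, not %r" % (a,))
    if b not in (4, 5, 6):
        raise ValueError("b must be 4, 5 or 6, not %r" % (b,))
    # ... do the actual work here ...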

George
 

Steven D'Aprano

Lately, I've been writing functions like this:

def f(a, b):
    assert a in [1, 2, 3]
    assert b in [4, 5, 6]

The point is that I'm checking the type and the values of the
parameters.

If somebody passes in a == MyNumericClass(2), which would have worked
perfectly fine except for the assert, your code needlessly raises an
exception.

Actually that's a bad example, because surely MyNumericClass(2) would
test equal to int(2) in order to be meaningful. So, arguably, testing that
values fall within an appropriate range is not necessarily a bad idea, but
type-testing is generally bad.

Note also that for real code, a bare assert like that is uselessly
uninformative:

>>> x = 1
>>> assert x == 3
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AssertionError


This is better:

>>> assert x == 3, "x must be equal to three but is %s instead" % x
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AssertionError: x must be equal to three but is 1 instead


This is even better still:

>>> if x != 3:
...     raise ValueError("x must be equal to three but is %s instead" % x)
...
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: x must be equal to three but is 1 instead


And even better still is to move that test out of your code and put it
into unit tests (if possible).
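A minimal sketch of what such a unit test might look like (mymodule
and the argument values are hypothetical):

import unittest
from mymodule import f   # hypothetical module containing f

class TestF(unittest.TestCase):
    def test_rejects_out_of_range(self):
        # The range check lives in the test suite, not in f itself.
        self.assertRaises(ValueError, f, 0, 4)

    def test_accepts_valid_input(self):
        f(1, 4)   # should not raise

if __name__ == '__main__':
    unittest.main()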


I'm curious how this does or doesn't fit into python's duck-typing
philosophy.

Doesn't fit, although range testing is probably okay.

I find that when I detect invalid parameters overtly, I spend less time
debugging.

Yes, probably. But you end up with less useful code:

def double(x):
    """Return x doubled."""
    assert x == 2.0 and type(x) == float
    return 2*x

Now I only need to test one case, x == 2.0. See how much testing I don't
have to do? *wink*

There's a serious point behind the joke. The less your function does, the
more constrained it is, the less testing you have to do -- but the less
useful it is, and the more work you put onto the users of your function.
Instead of saying something like

a = MyNumericClass(1)
b = MyNumericClass(6)
# more code in here...
# ...
result = f(a, b)


you force them to do this:

a = MyNumericClass(1)
b = MyNumericClass(6)
# more code in here...
# ...
# type-cast a and b to keep your function happy
result = f(int(a), int(b))
# and type-cast the result to what I want
result = MyNumericClass(result)

And that's assuming that they can even do that sort of type-cast without
losing too much information.

Are other people doing things like this? Any related commentary is
welcome.

Generally speaking, type-checking is often just a way of saying "My
function could work perfectly with any number of possible types, but I
arbitrarily want it to only work with these few, just because."

Depending on what you're trying to do, there are lots of strategies for
avoiding type-tests: e.g. it's better to use isinstance() rather than
type(), because isinstance() will accept subclasses -- though it still
rejects classes that use delegation.

Sometimes you might have a series of operations, and you want the lot to
either succeed or fail up front, and not fail halfway through (say, you're
modifying a list and don't want to make half the changes needed). The
solution to that is to check that your input object has all the methods
you need:

def f(s):
    """Do something with a string-like object."""
    try:
        # Grab every method we'll need before doing any work, so the
        # whole call fails up front rather than halfway through.
        upper = s.upper
        split = s.split
    except AttributeError:
        raise TypeError('input is not sufficiently string-like')
    return upper()
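For example, the lookup fails before any work is done:

>>> f("hello")
'HELLO'
>>> f(42)
Traceback (most recent call last):
  ...
TypeError: input is not sufficiently string-like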

Good unit tests will catch anything type and range tests will catch, plus
a whole lot of other errors, while type-testing and range-testing will
only catch a small percentage of bugs. So if you're relying on type- and
range-testing, you're probably not doing enough testing.
 

Matthew Woodcraft

Steven D'Aprano said:
The less your function does, the more constrained it is, the less
testing you have to do -- but the less useful it is, and the more work
you put onto the users of your function. Instead of saying something
like
a = MyNumericClass(1)
b = MyNumericClass(6)
# more code in here...
# ...
result = f(a, b)
you force them to do this:
a = MyNumericClass(1)
b = MyNumericClass(6)
# more code in here...
# ...
# type-cast a and b to keep your function happy
result = f(int(a), int(b))
# and type-cast the result to what I want
result = MyNumericClass(result)


I have a question for you. Consider this function:

def f(n):
    """Return the largest natural power of 2 which does not exceed n."""
    if n < 1:
        raise ValueError
    i = 1
    while i <= n:
        j = i
        i *= 2
    return j

If I pass it an instance of MyNumericClass, it will return an int or a
long, not an instance of MyNumericClass.
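A quick interactive check, using Decimal as a stand-in for
MyNumericClass:

>>> from decimal import Decimal
>>> f(Decimal("10"))
8
>>> type(f(Decimal("10")))
<type 'int'>

The comparisons work fine, but i starts life as a plain int and stays
one.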

In your view, is this a weakness of the implementation? Should the
author of the function make an effort to have it return a value of the
same type that it was passed?

-M-
 

Steven D'Aprano

I have a question for you. Consider this function:

def f(n):
    """Return the largest natural power of 2 which does not exceed n."""
    if n < 1:
        raise ValueError
    i = 1
    while i <= n:
        j = i
        i *= 2
    return j

If I pass it an instance of MyNumericClass, it will return an int or a
long, not an instance of MyNumericClass.

In your view, is this a weakness of the implementation? Should the
author of the function make an effort to have it return a value of the
same type that it was passed?

Only if it makes sense in the context of the function. I'd say it
depends on the principle of "least surprise": if the caller would
expect that passing in a MyNumericClass or a float or a Rational should
return the same type, then yes; otherwise it's optional.

Numeric functions are probably the least convincing example of this,
because in general people expect numeric functions to coerce arguments in
not-always-intuitive ways, especially when they pass multiple arguments
of mixed types. What should g(MyNumericClass, int, float, Rational)
return? And it often doesn't matter, not if you're just doing arithmetic,
because (in principle) any numeric type is compatible with any other
numeric type.

The principle of least surprise is sometimes hard to follow because it
means putting yourself in the shoes of random callers. Who knows what they
expect? One rule of thumb I use is to consider the function I'm writing,
and its relationship to the argument. Would I consider that the result is
somehow _made_from_ the argument? If so, then I should return the same
type (unless there is a compelling reason not to).

I'm NOT talking about implementation here, I'm thinking abstract
functions. Whether your implementation actually transforms the initial
argument, or creates a new piece of data from scratch, is irrelevant.

In your above example, the result isn't "made from" the argument (although
some implementations, using log, might do so). In abstract, the result is
an integer that is chosen by comparison to the argument, not by
construction from the argument. So it is unimportant for it to be the same
type, and in fact the caller might expect that the result is an int no
matter what argument he passes.
 

Nick Craig-Wood

Matthew Woodcraft said:
I have a question for you. Consider this function:

def f(n):
    """Return the largest natural power of 2 which does not exceed n."""
    if n < 1:
        raise ValueError
    i = 1
    while i <= n:
        j = i
        i *= 2
    return j

If I pass it an instance of MyNumericClass, it will return an int or a
long, not an instance of MyNumericClass.

In your view, is this a weakness of the implementation? Should the
author of the function make an effort to have it return a value of the
same type that it was passed?

Possibly... It is relatively easy to do anyway, and reasonably cheap,
so why not? In this case a number about the same size as the original
(possibly very large) will be returned, so the argument that it should
return the same type is reasonably strong:
.... """Return the largest natural power of 2 which does not exceed n."""
.... if n < 1:
.... raise ValueError
.... i = n - n + 1
.... while i <= n:
.... j = i
.... i *= 2
.... return j
....
There are other ways of writing that

i = n - n + 1

e.g.

i = n.__class__(1)

It is basically saying "make me a numeric type with this value", so
maybe the __class__ spelling is the clearest. It assumes that the
constructor can coerce an int into the type, whereas the first assumes
that the type can add an int.
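Both spellings are easy to try at the prompt, with Decimal standing in
for an arbitrary numeric type:

>>> from decimal import Decimal
>>> n = Decimal("10")
>>> n - n + 1
Decimal("1")
>>> n.__class__(1)
Decimal("1")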

Here is my function to calculate arctan() from any type. The only
subtle bit for a general numeric type is detecting when we've
calculated enough, without using any specific knowledge about which
numeric type.

def arctan(x):
    """
    Calculate arctan(x)

    arctan(x) = x - x**3/3 + x**5/5 - ... (-1 < x < 1)
    """
    total = x
    power = x
    divisor = 1
    old_delta = None
    while 1:
        # Generate the next term of the alternating series.
        power *= x
        power *= x
        power = -power
        divisor += 2
        old_total = total
        total += power / divisor
        # Stop once the terms stop shrinking: we've reached the
        # precision limit of whatever numeric type we were given.
        delta = abs(total - old_total)
        if old_delta is not None and delta >= old_delta:
            break
        old_delta = delta
    return total

>>> from decimal import Decimal
>>> arctan(Decimal("0.5"))
Decimal("0.4636476090008061162142562314")
 

Carl Banks

Note also that for real code, a bare assert like that is uselessly
uninformative:

File "<stdin>", line 1, in ?
AssertionError

In real code, a traceback usually prints the line of code containing
the failed assertion.

This is better:

File "<stdin>", line 1, in ?
AssertionError: x must be equal to three but is 1 instead

This is even better still:

...
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: x must be equal to three but is 1 instead

These are verbose to the point of silliness, and usually not worth the
effort, since assertions are only supposed to check for ostensibly
impossible conditions. They shouldn't pop up often enough to justify
writing a verbose error message in advance, and when one does trigger
you're going to have to print the failed result in a debugging run
anyway.


Carl Banks
 

Carl Banks

Lately, I've been writing functions like this:

def f(a, b):
    assert a in [1, 2, 3]
    assert b in [4, 5, 6]

The point is that I'm checking the type and the values of the
parameters.

I'm curious how this does or doesn't fit into python's duck-typing
philosophy.

The duck-typing thing fits into a wider philosophy of being liberal in
what you accept. Since assertions like these constrain what a function
accepts, they definitely go against that philosophy. I suggest you not
blindly slap assertions on every single function.

Assertions should only be used to check for ostensibly impossible
conditions. Therefore, guarding arguments like this is probably only
a good idea for internal or private functions that you can personally
guarantee will only be called with the right values.

Personally, I find assertions more helpful in complex situations where
I find myself having to maintain some sort of invariant: some condition
is supposed to always be true at this point, and my code relies on it.
I've taken steps to maintain the invariant, but I could have made a
mistake, so I throw an assertion in. If there's a leak somewhere, the
assertion will catch it.
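A sketch of that kind of invariant assertion (the bookkeeping example
is invented for illustration):

def transfer(accounts, src, dst, amount):
    """Move money between accounts in a dict of balances."""
    total_before = sum(accounts.values())
    accounts[src] -= amount
    accounts[dst] += amount
    # Invariant: a transfer must neither create nor destroy money.
    # If this fires, transfer() itself is buggy -- not its caller.
    assert sum(accounts.values()) == total_before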

I find that when I detect invalid parameters overtly, I spend less time
debugging.

If it helps, go ahead and use them. The world won't end if you use an
assertion in a less-than-ideal situation. And, after all, if someone
doesn't like it they can shut them off.

Are other people doing things like this? Any related commentary is
welcome.

Well, there are examples of this usage of assert in the standard
library. Some public functions (offhand I can think of the threading
module) use assertions to check for invalid arguments, a use I strongly
disagree with. The library shouldn't be making assertions on behalf of
the user. If an AssertionError is raised from the threading module, it
should be because there is a bug in the threading module, not because
the user passed it a bad value.

But, yes, it has been done.


Carl Banks
 

Steven D'Aprano

If it helps, go ahead and use them. The world won't end if you use an
assertion in a less-than-ideal situation. And, after all, if someone
doesn't like it they can shut them off.


Is there any way to have finer control of assertions than just passing -O
to the Python interpreter? Suppose I want to switch them off for certain
modules but not others, am I out of luck?
 

Carl Banks

Is there any way to have finer control of assertions than just passing -O
to the Python interpreter? Suppose I want to switch them off for certain
modules but not others, am I out of luck?

Please relax. The suggestion that one could shut them off was tongue
in cheek.


Carl Banks
 

Steven D'Aprano

Please relax. The suggestion that one could shut them off was tongue
in cheek.

(a) I am relaxed;

(b) One can shut down assertions, and it isn't a joke, it is a feature;

(c) I meant my question seriously, I think it would be a good thing.
 
