Is it ok to type check a boolean argument?

A

Adal Chiriliuc

Hello,

Me and my colleagues are having an discussion about the best way to
code a function (more Pythonic).

Here is the offending function:

def find(field, order):
.....if not isinstance(order, bool):
.........raise ValueError("order must be a bool")
.....order_by = "asc" if order else "desc"
.....return _find(field + "+" + order_by)

We are not sure what's the best practice here. Should we or should we
not check the type of the "order" variable, which should be a bool?

In one of our unit-tests we passed the "invalid_order" string as the
order argument value. No exception was raised, since the string was
evaluated as being True.

We know about "Don't look before we jump", but we are not sure how it
applies in this case, since we don't get any exception when passing an
invalid type argument.

This function is not visible to our clients, only internally in our
project. It's part of the public interface of a sub-system, so we are
not sure either if the fact that it returns an invalid result for a
badly type argument it's an API specification break or not.

The pro argument was that if a new programmer comes and passes a
wrongly typed argument, he will get a silent failure.
The cons argument was that there are many functions which could
silently fail in this mode, especially those having bool arguments.

Should we in general check when doing unit-testing that the methods
behave correctly when passed arguments of the wrong type? What do you
do in your projects?
 
P

Paul McGuire

Hello,

Me and my colleagues are having an discussion about the best way to
code a function (more Pythonic).

Here is the offending function:

def find(field, order):
....if not isinstance(order, bool):
........raise ValueError("order must be a bool")
....order_by = "asc" if order else "desc"
....return _find(field + "+" + order_by)

<grammar_rant>
First of all, please get used to saying "My colleagues and I..."
instead of "Me and my colleagues". My kids still haven't learned this
(both are in college!), and it annoys me daily. Sorry for the rant,
you were just the final straw, so I'm taking it out on you.
</grammar_rant>

The offending part of this function is the argument name "order". If
this is a boolean argument, I would assume that True should be ordered
data, and False would be unordered or randomly ordered data. The name
"order" really does not clue me in that the two choices are
"ascending" and "descending", nor is it clear what the mapping is to
True and False.

You are right, Python's inference of boolean-ness makes it easy for
coders to guess wrong as to the argument type and still get a program
that emits no exceptions, although possibly doing the opposite of the
intended function. I would suggest that in your project, you
establish a naming convention for these boolean arguments. The name
should be descriptive and assertive, so that the value of True is
reasonably intuitive. Instead of "order", perhaps
"return_ascending_results" with a default value of True. To further
cue your developers that a boolean argument is desired, you could try
adding "_flag" to all of your boolean arguments, sort of a Reverse
Hungarian Notation.

Or another option altogether would be to define some module level
constants, like this:

ASCENDING_RESULTS = object()
DESCENDING_RESULTS = object()
def find(field, results_order=ASCENDING_RESULTS):
order_by = { ASCENDING_RESULTS : "asc", DESCENDING_RESULTS: "desc"}
[results_order]
return _find(field + "+" + order_by)

If anything other than those two constants is passed for the
results_order argument, then a KeyError exception will be raised. (I
used a similar technique to this in pyparsing's operatorPrecedence
method, in which an argument indicates whether an operator is right-
or left-associative.)

-- Paul
 
P

Philip Semanchuk

Hello,

Me and my colleagues are having an discussion about the best way to
code a function (more Pythonic).

Here is the offending function:

def find(field, order):
....if not isinstance(order, bool):
........raise ValueError("order must be a bool")
....order_by = "asc" if order else "desc"
....return _find(field + "+" + order_by)

We are not sure what's the best practice here. Should we or should we
not check the type of the "order" variable, which should be a bool?

IMHO you should not.
The pro argument was that if a new programmer comes and passes a
wrongly typed argument, he will get a silent failure.

"Wrongly typed" is a matter of opinion. The following values also
evaluate to False when converted to bool:
None
[]
()
""
{}

To me, they're just as false as False is, if you catch my meaning. I
would not like to have to convert them to bool to be able to use them
when calling your function.

Imagine this scenario --

order = read_order_from_preferences_xml_file(default_value = None)

# order is now "ascending" or None

find(the_field_name, order)


It seems logical to pass a string or None in this case without
converting them to bool.

You might feel more comfortable if the parameter was called
"is_ordered", which would at least imply that it is a boolean so that
someone unfamiliar with the interface would be less tempted to pass
something like the last part of an ORDER BY statement, like
"last_name, first_name, age DESC".

I can understand your temptation to enforce bool-ness, but you have a
very good point about this one function then being different from all
of the others that aren't as picky.

HTH
Philip
 
T

Terry Reedy

Adal said:
Me and my colleagues are having an discussion about the best way to
code a function (more Pythonic).

Here is the offending function:

def find(field, order):
....if not isinstance(order, bool):
........raise ValueError("order must be a bool")
....order_by = "asc" if order else "desc"
....return _find(field + "+" + order_by)

My opinions:
1. 'order' should be 'a' (the default) or 'd'. True and False by
themselves are meaningless. Then the check, if needed, is "if order not
in 'ad':".
'up' and 'down' are possible too.
2. Consider renaming _find as find, with two params and do the parameter
check there. Except when one is calling a function recursively
(repeatedly) with known-good params, wrapping a function with a simple
arg check seems like noise to me.
We are not sure what's the best practice here. Should we or should we
not check the type of the "order" variable, which should be a bool?

I say it should not be a bool. The below illustrates another reason why
this is the wrong type.
In one of our unit-tests we passed the "invalid_order" string as the
order argument value. No exception was raised, since the string was
evaluated as being True.

If you make 'order' a string, then a bad string input should raise an
exception somewhere. I suspect _find could be written to do this if it
does not already.
We know about "Don't look before we jump", but we are not sure how it
applies in this case, since we don't get any exception when passing an
invalid type argument.
This function is not visible to our clients, only internally in our
project. It's part of the public interface of a sub-system, so we are
not sure either if the fact that it returns an invalid result for a
badly type argument it's an API specification break or not.

The pro argument was that if a new programmer comes and passes a
wrongly typed argument, he will get a silent failure.

That is bad, but in this case, I see the problem as a slight mis-design.
The cons argument was that there are many functions which could
silently fail in this mode, especially those having bool arguments.

Which are rather rare, I think. In the 3.0 builtin functions, sorted's
'reverse' param is the only one. For back compatibility, it actually
accepts ints.
Should we in general check when doing unit-testing that the methods
behave correctly when passed arguments of the wrong type?

As in 'raise an exception', I think at least one test is good.

Terry Jan Reedy
 
B

Bruno Desthuilliers

Adal Chiriliuc a écrit :
Hello,

Me and my colleagues are having an discussion about the best way to
code a function (more Pythonic).

Here is the offending function:

def find(field, order):
....if not isinstance(order, bool):
........raise ValueError("order must be a bool")
....order_by = "asc" if order else "desc"
....return _find(field + "+" + order_by)

We are not sure what's the best practice here. Should we or should we
not check the type of the "order" variable, which should be a bool?

This kind of typechecking is usually considered bad practice in Python,
but well, <python-zen>practicallity beats purity</python-zen> - and as a
matter of fact, if you refer to list.sort, passing a non-integer value
as the 'reverse' argument raises a TypeError...

This being said, I can only concur with other posters here about the
very poor naming. As far as I'm concerned, I'd either keep the argument
as a boolean but rename it "ascending" (and use a default True value),
or keep the 'order' name but then accept 'asc' and 'desc' as values
('asc' being the default).

My 2 cents...
 
A

Adal Chiriliuc

This being said, I can only concur with other posters here about the
very poor naming. As far as I'm concerned, I'd either keep the argument
as a boolean but rename it "ascending" (and use a default True value),
or keep the 'order' name but then accept 'asc' and 'desc' as values
('asc' being the default).

Well, I lied a bit :p

The actual function is 20 lines long, and does other stuff to. The
"order" argument is not passed as an argument, but as a field on an
input object. Since I generally don't like it when people post
questions and include gazillions of lines of code, I boiled down the
real function to the one posted above, which is the minimum required
to show our dilemma. "_find" also does not exist in the real code.

You are all right, "order" is a bad name choice. "sort_ascending"
would be a much better name for a boolean variable.

I found Paul's idea very interesting, but I prefer to not introduce
new objects which behave like C enums (maybe that would be more
Pythonic?), so I found this variant:

def find(field, sort_ascending):
.....order_by = {True: "asc", False: "desc"}
.....return _find(field + "+" + order_by[sort_ascending])

But what if we can't solve it as ellegantly, and we need to execute
different code depending on the condition:

def find(field, fast_mode=True):
.....if fast_mode:
.........do_something(field, 1, DEFAULT_PAGE_SIZE)
.....else:
.........do_another_thing(field, False, "xml", ignore_error=False)
.........and_another_one()

Should we typecheck in this case to ensure that if we pass a string
for "fast_mode" we will raise an exception?
 
J

James Stroud

Adal said:
Hello,

Me and my colleagues are having an discussion about the best way to
code a function (more Pythonic).

Here is the offending function:

def find(field, order):
....if not isinstance(order, bool):
........raise ValueError("order must be a bool")
....order_by = "asc" if order else "desc"
....return _find(field + "+" + order_by)

We are not sure what's the best practice here. Should we or should we
not check the type of the "order" variable, which should be a bool?

None of the above. You have put unnecessary constraints on find by
requiring a certain data type.

def find(field, order_by='desc'):
return _find(field + "+" + order_by)

Here we are deferring error checking to the _find function. Otherwise:

def find(field, order_by='desc'):
if order_by not in ['asc', 'desc']:
raise ValueError, 'Bad order_by parameter.'
else:
return _find(field + "+" + order_by)

Now you have achieved the exact same affect without any sort of explicit
type checking. Moreover, you have eliminated an "if" statement. Also,
you leave room for exotic orderings should you ever want them.

James
 
J

James Stroud

Ben said:
Why are people so reluctant to make error message templates clearer
with named placeholders?

Because they can never remember they even exist. Thanks for the reminder.
 
B

Bruno Desthuilliers

Adal Chiriliuc a écrit :
Well, I lied a bit :p

The actual function is 20 lines long, and does other stuff to. The
"order" argument is not passed as an argument, but as a field on an
input object.

Then the input object should probably be responsible for validating this
(IMHO).

(snip)
You are all right, "order" is a bad name choice. "sort_ascending"
would be a much better name for a boolean variable.

I found Paul's idea very interesting, but I prefer to not introduce
new objects which behave like C enums (maybe that would be more
Pythonic?), so I found this variant:

def find(field, sort_ascending):
....order_by = {True: "asc", False: "desc"}
....return _find(field + "+" + order_by[sort_ascending])

for a simple True/False dispatch, my usual idiom (predating the ternary
operator) is:

order_by = ("desc", "asc")[sort_ascending]

But what if we can't solve it as ellegantly, and we need to execute
different code depending on the condition:

def find(field, fast_mode=True):
....if fast_mode:
........do_something(field, 1, DEFAULT_PAGE_SIZE)
....else:
........do_another_thing(field, False, "xml", ignore_error=False)
........and_another_one()

Should we typecheck in this case to ensure that if we pass a string
for "fast_mode" we will raise an exception?

I don't think so. Going that way, you will end up typechecking each and
any function argument. While there are a couple typechecks here and
there even in the stdlib and builtins, this doesn't mean it's how Python
is supposed to be used - Python is a dynamic language, leave with it or
choose another one !-)

But anyway, ensuring a program's correctness requires much more than
static typing. What you want IMHO is:

1/ a solid *user input* validation/conversion framework (FormEncode
comes to mind)
2/ a sensible set of unittests and integration tests.

Trying to forcefit static typing into Python will only buy you pain and
frustration (not even talking about the waste of time).
 
C

Carl Banks

Why are you concerned only with type errors on inputs?
Even if you could do exhaustive checking of input parameters to
make sure they are the only acceptable values, what prevents your
user from providing the wrong valid value?  What made you pick on
type errors in the first place?  If it turns out that an argument
of 422 is a typo for 42, why is that less of a problem?  Just because
you are used to systems where one kind of error is always caught does
not really make you invulnerable.  You'll have no problem telling
your users that the 422 is their fault.  Why do you have such
certainty that passing in "nonsense" as a boolean is a failure
you need to deal with?


I'm going to play Devil's Advocate here.

The motivation here is not "we want type safety" but "our unit tests
can't register this deliberate error because it passes silently".
Presumably if they had a function that accepted integers, and 442 was
an invalid value, and the function failed silently in a unit test,
they would also consider whether it should instead fail loudly.

It's always a judgment call how much to screen for bad input, but type
errors aren't different from any other error in this regard.
Sometimes it's appropriate (note: not, IMHO, in this case), just like
it's sometimes appropriate to check for 442.


Carl Banks
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,582
Members
45,071
Latest member
MetabolicSolutionsKeto

Latest Threads

Top