# Clarity vs. code reuse/generality

Discussion in 'Python' started by kj, Jul 3, 2009.

1. ### kjGuest

I'm will be teaching a programming class to novices, and I've run
into a clear conflict between two of the principles I'd like to
it.

The context is the concept of a binary search. In one of their
homeworks, my students will have two occasions to use a binary
search. This seemed like a perfect opportunity to illustrate the
idea of abstracting commonalities of code into a re-usable function.
So I thought that I'd code a helper function, called _binary_search,
that took five parameters: a lower limit, an upper limit, a
one-parameter function, a target value, and a tolerance (epsilon).
It returns the value of the parameter for which the value of the
passed function is within the tolerance of the target value.

This seemed straightforward enough, until I realized that, to be
useful to my students in their homework, this _binary_search function
had to handle the case in which the passed function was monotonically
decreasing in the specified interval...

The implementation is still very simple, but maybe not very clear,
particularly to programming novices (docstring omitted):

def _binary_search(lo, hi, func, target, epsilon):
assert lo < hi
assert epsilon > 0
sense = cmp(func(hi), func(lo))
if sense == 0:
return None
target_plus = sense * target + epsilon
target_minus = sense * target - epsilon
while True:
param = (lo + hi) * 0.5
value = sense * func(param)
if value > target_plus:
hi = param
elif value < target_minus:
lo = param
else:
return param

if lo == hi:
return None

My question is: is the business with sense and cmp too "clever"?

Here's the rub: the code above is more general (hence more reusable)
by virtue of this trick with the sense parameter, but it is also
a bit harder to understand.

This not an unusual situation. I find that the processing of
abstracting out common logic often results in code that is harder
to read, at least for the uninitiated...

I'd love to know your opinions on this.

TIA!

kj

kj, Jul 3, 2009

2. ### Steven D'ApranoGuest

On Fri, 03 Jul 2009 14:05:08 +0000, kj wrote:

> ... I find that the processing of
> abstracting out common logic often results in code that is harder to

Yes. There is often a conflict between increasing abstraction and ease of
understanding.

[...]
> The implementation is still very simple, but maybe not very clear,
> particularly to programming novices (docstring omitted):
>
> def _binary_search(lo, hi, func, target, epsilon):
> assert lo < hi
> assert epsilon > 0

You're using assert for data sanitation. That is a very bad habit for you
to be teaching novices. Your code will fail to behave as expected
whenever the caller runs Python with the -O switch.

[...]
> My question is: is the business with sense and cmp too "clever"?

For production code? Maybe -- you would need to profile the code, and
compare it to a less general, more simple-minded function, and see which
performs better. If there is an excessive performance penalty, that's a
good sign that perhaps you're being too clever, too general, or too
abstract -- or all three.

Too clever for novices? That depends on how inexperienced they are -- are
they new to Python or new to programming in general? Are they children or
adults? Do they have a PhD degree or are they still at primary school? It
depends on their experience and their intelligence -- dare I say it, some
people will *never* get programming, let alone code-reuse. It also
depends on how you lead up to it -- are you dropping them straight into
the abstract code, or giving them two pieces of nearly the same code and
showing them how to pull out the common functionality?

--
Steven

Steven D'Aprano, Jul 3, 2009

3. ### Alan G IsaacGuest

On 7/3/2009 10:05 AM kj apparently wrote:
> The context is the concept of a binary search. In one of their
> homeworks, my students will have two occasions to use a binary
> search. This seemed like a perfect opportunity to illustrate the
> idea of abstracting commonalities of code into a re-usable function.
> So I thought that I'd code a helper function, called _binary_search,
> that took five parameters: a lower limit, an upper limit, a
> one-parameter function, a target value, and a tolerance (epsilon).
> It returns the value of the parameter for which the value of the
> passed function is within the tolerance of the target value.
>
> This seemed straightforward enough, until I realized that, to be
> useful to my students in their homework, this _binary_search function
> had to handle the case in which the passed function was monotonically
> decreasing in the specified interval...
>
> The implementation is still very simple, but maybe not very clear,
> particularly to programming novices (docstring omitted):
>
> def _binary_search(lo, hi, func, target, epsilon):
> assert lo < hi
> assert epsilon > 0
> sense = cmp(func(hi), func(lo))
> if sense == 0:
> return None
> target_plus = sense * target + epsilon
> target_minus = sense * target - epsilon
> while True:
> param = (lo + hi) * 0.5
> value = sense * func(param)
> if value > target_plus:
> hi = param
> elif value < target_minus:
> lo = param
> else:
> return param
>
> if lo == hi:
> return None

1. Don't use assertions to test argument values!

2.
from scipy.optimize import bisect
def _binary_search(lo, hi, func, target, epsilon):
def f(x): return func(x) - target
return bisect(f, lo, high, xtol=epsilon)

3. If you don't want to use SciPy (why?), have them
implement http://en.wikipedia.org/wiki/Bisection_method#Pseudo-code
to produce their own `bisect` function.

hth,
Alan Isaac

Alan G Isaac, Jul 3, 2009
4. ### AahzGuest

In article <h2l36k\$q5l\$>, kj <> wrote:
>
>This seemed straightforward enough, until I realized that, to be
>useful to my students in their homework, this _binary_search function
>had to handle the case in which the passed function was monotonically
>decreasing in the specified interval...
>
>def _binary_search(lo, hi, func, target, epsilon):
> assert lo < hi
> assert epsilon > 0
> sense = cmp(func(hi), func(lo))
> if sense == 0:
> return None
> target_plus = sense * target + epsilon
> target_minus = sense * target - epsilon
> while True:
> param = (lo + hi) * 0.5
> value = sense * func(param)
> if value > target_plus:
> hi = param
> elif value < target_minus:
> lo = param
> else:
> return param
>
> if lo == hi:
> return None
>
>My question is: is the business with sense and cmp too "clever"?

First of all, cmp() is gone in Python 3, unfortunately, so I'd avoid
using it. Second, assuming I understand your code correctly, I'd change
"sense" to "direction" or "order".
--
Aahz () <*> http://www.pythoncraft.com/

"as long as we like the same operating system, things are cool." --piranha

Aahz, Jul 3, 2009
5. ### BearophileGuest

On 3 Lug, 16:05, kj <> wrote:
> I'm will be teaching a programming class to novices, and I've run
> into a clear conflict between two of the principles I'd like to
> teach: code clarity vs. code reuse.

They are both important principles, but clarity is usually more
important because short code that can't be read can't be fixed and
modified, while long code that can be read is alive still.

> So I thought that I'd code a helper function, called _binary_search,
> that took five parameters: a lower limit, an upper limit, a
> one-parameter function, a target value, and a tolerance (epsilon).

Five arguments are a bit too much, even for non-novices. 2-3 arguments
are more than enough for novices.

Where are doctests'? Novices have to learn to think of tests as part
of the normal code

I think the main problem in all this discussion is that generally to
learn to program you have to write code yourself, so your students
have to invent and write their own binary search. Reading your code
they are going to learn much less. Creating a binary search from
almost scratch can turn out to be too much hard for them, it means you
have to ask them to invent something simpler

Programming is (among other things) problem solving, so they have to
learn to solve not easy problems from the beginning. Showing them
famous algorithms (and very good code to copy from) can be useful, but
it's less important than learning basic problem solving skills, and
much less important than learning why solving problems has to became
their purpose and pleasure

Bye,
bearophile

Bearophile, Jul 3, 2009
6. ### Bruno DesthuilliersGuest

kj a écrit :
> I'm will be teaching a programming class to novices, and I've run
> into a clear conflict between two of the principles I'd like to
> teach: code clarity vs. code reuse. I'd love your opinion about
> it.

(snip - others already commented on this code)

> Here's the rub: the code above is more general (hence more reusable)
> by virtue of this trick with the sense parameter, but it is also
> a bit harder to understand.

Perhaps better naming (s/sense/direction/g ?) and a small comment could
help ?

> This not an unusual situation. I find that the processing of
> abstracting out common logic often results in code that is harder
> to read, at least for the uninitiated...

IOW : the notion of "clarity" depends on the context (including the
reader). Anyway, there are algorithms (or implementations of...) that
are definitly and inherently on the 'hard to read' side - well,
complexity is something you have to learn to live with, period. The key
is to try and make it as simple and readable *as possible*.

Also, factoring out common code - or even slightly complex code - often
makes _client_ code (much) more readable.

Bruno Desthuilliers, Jul 3, 2009
7. ### Jean-Michel PichavantGuest

kj wrote:
> I'm will be teaching a programming class to novices, and I've run
> into a clear conflict between two of the principles I'd like to
> teach: code clarity vs. code reuse. I'd love your opinion about
> it.
>

[...]
> sense = cmp(func(hi), func(lo))
> if sense == 0:
> return None

My suggestion on how to improve this part for python novices:
# assuming func is monotonous
if func(high) > func(low):
direction = 1 # aka sign of the func derivative
elif func(low) > func(high):
direction = -1
else:
return None

Avoid using cmp, this will prevent the "what the hell is cmp doing ?" ,
unless you want to teach your students how to search for inline python
documentation.
Some other list members have suggested to improve the variable naming. I
couldn't agree more, in your case, I think clarity can be achieved as
well with the abstraction (these notions does not necessarily collide).

Here's a link on my how-to-name bible :
http://tottinge.blogsome.com/meaningfulnames/

Jean-Michel

Jean-Michel Pichavant, Jul 3, 2009
8. ### Lie RyanGuest

kj wrote:
> I'm will be teaching a programming class to novices, and I've run
> into a clear conflict between two of the principles I'd like to
> teach: code clarity vs. code reuse. I'd love your opinion about
> it.

Sometimes when the decision between clarity and generality becomes too
hard; you might fare better to save the code, go out for a walk to
forget the code, and start a blank text editor. Being in a fresh mind,
you may found an alternative approach, e.g.:

from __future__ import division
def binary_search(lo, hi, func, target, epsilon):
# reverses hi and lo if monotonically decreasing
lo, hi = (lo, hi) if func(hi) > func(lo) else (hi, lo)

param = (lo + hi) / 2

# loop while not precise enough
while abs(func(param) - target) > epsilon:
param = (lo + hi) / 2

if target < func(param):
hi = param
else:
lo = param
return param

> The context is the concept of a binary search. In one of their
> homeworks, my students will have two occasions to use a binary
> search. This seemed like a perfect opportunity to illustrate the
> idea of abstracting commonalities of code into a re-usable function.
> So I thought that I'd code a helper function, called _binary_search,
> that took five parameters: a lower limit, an upper limit, a
> one-parameter function, a target value, and a tolerance (epsilon).
> It returns the value of the parameter for which the value of the
> passed function is within the tolerance of the target value.
>
> This seemed straightforward enough, until I realized that, to be
> useful to my students in their homework, this _binary_search function
> had to handle the case in which the passed function was monotonically
> decreasing in the specified interval...
>
> The implementation is still very simple, but maybe not very clear,
> particularly to programming novices (docstring omitted):
>
> def _binary_search(lo, hi, func, target, epsilon):
> assert lo < hi
> assert epsilon > 0
> sense = cmp(func(hi), func(lo))
> if sense == 0:
> return None
> target_plus = sense * target + epsilon
> target_minus = sense * target - epsilon
> while True:
> param = (lo + hi) * 0.5
> value = sense * func(param)
> if value > target_plus:
> hi = param
> elif value < target_minus:
> lo = param
> else:
> return param
>
> if lo == hi:
> return None
>
> My question is: is the business with sense and cmp too "clever"?
>
> Here's the rub: the code above is more general (hence more reusable)
> by virtue of this trick with the sense parameter, but it is also
> a bit harder to understand.
>
> This not an unusual situation. I find that the processing of
> abstracting out common logic often results in code that is harder
> to read, at least for the uninitiated...
>
> I'd love to know your opinions on this.
>
> TIA!
>
> kj
>

Lie Ryan, Jul 3, 2009
9. ### kjGuest

In <mGo3m.591\$> Alan G Isaac <> writes:

>1. Don't use assertions to test argument values!

Out of curiosity, where does this come from?

Thanks,

kj

kj, Jul 3, 2009
10. ### kjGuest

In <h2l51m\$jps\$> (Aahz) writes:

>First of all, cmp() is gone in Python 3, unfortunately, so I'd avoid
>using it.

Good to know.

>Second, assuming I understand your code correctly, I'd change
>"sense" to "direction" or "order".

Thanks,

kj

kj, Jul 3, 2009
11. ### Paul RubinGuest

kj <> writes:
> sense = cmp(func(hi), func(lo))
> if sense == 0:
> return None
> target_plus = sense * target + epsilon
> target_minus = sense * target - epsilon
> ...

The code looks confusing to me and in some sense incorrect. Suppose
func(hi)==func(lo)==target. In this case the solver gives up
and returns None even though it already has found a root.

Also, the stuff with the sense parameter, and target_minus and
target_plus looks messy. I do like to use cmp. I'd just write
something like (untested):

def _binary_search(lo, hi, func, target, epsilon):
y_hi, y_lo = func(hi), func(lo)

while True:
x_new = (lo + hi) * 0.5
y_new = func(x_new)
if abs(y_new - target) < epsilon:
return x_new
elif cmp(y_new, target) == cmp(y_hi, target):
hi = x_new
else:
lo = x_new
if lo == hi:
return None

This uses an extra couple function calls in that trivial case, of
course.

Paul Rubin, Jul 3, 2009
12. ### Alan G IsaacGuest

> In <mGo3m.591\$> Alan G Isaac <> writes:
>> 1. Don't use assertions to test argument values!

On 7/3/2009 12:19 PM kj apparently wrote:
> Out of curiosity, where does this come from?

http://docs.python.org/reference/simple_stmts.html#grammar-token-assert_stmt
"The current code generator emits no code for an assert statement when optimization is requested at compile time."

Alan Isaac

Alan G Isaac, Jul 3, 2009
13. ### Steven D'ApranoGuest

On Fri, 03 Jul 2009 16:19:22 +0000, kj wrote:

> In <mGo3m.591\$> Alan G Isaac
> <> writes:
>
>>1. Don't use assertions to test argument values!

>
> Out of curiosity, where does this come from?

Assertions are disabled when you run Python with the -O (optimise) flag.
If you rely on assertions to verify data, then any time the user runs
python with -O your code will be running without error checking.

assert should be used to verify program logic, not to sanitize data.

--
Steven

Steven D'Aprano, Jul 3, 2009
14. ### Charles YeomansGuest

On Jul 3, 2009, at 2:03 PM, Steven D'Aprano wrote:

> On Fri, 03 Jul 2009 16:19:22 +0000, kj wrote:
>
>> In <mGo3m.591\$> Alan G Isaac
>> <> writes:
>>
>>> 1. Don't use assertions to test argument values!

>>
>> Out of curiosity, where does this come from?

>
> Assertions are disabled when you run Python with the -O (optimise)
> flag.
> If you rely on assertions to verify data, then any time the user runs
> python with -O your code will be running without error checking.
>
> assert should be used to verify program logic, not to sanitize data.

I wouldn't describe the use of an assert statement in the original
code as data sanitizing, but rather as a precondition. And with that
description, the use of an assert statement that might be compiled
away is not unreasonable; indeed, it certainly is so in the context of
design by contract.

Charles Yeomans

Charles Yeomans, Jul 3, 2009
15. ### Terry ReedyGuest

Steven D'Aprano wrote:
> On Fri, 03 Jul 2009 16:19:22 +0000, kj wrote:
>
>> In <mGo3m.591\$> Alan G Isaac
>> <> writes:
>>
>>> 1. Don't use assertions to test argument values!

>> Out of curiosity, where does this come from?

>
> Assertions are disabled when you run Python with the -O (optimise) flag.
> If you rely on assertions to verify data, then any time the user runs
> python with -O your code will be running without error checking.
>
> assert should be used to verify program logic, not to sanitize data.

In other words, assertions should never happen, whereas your assertions
could easily happen with 'bad' input. An example of program logic
assertion would be 'assert <loop-invariant>' at the top of the loop.
That would be useful for development, but would not be needed for
production use of tested code.

tjr

Terry Reedy, Jul 3, 2009
16. ### Pablo Torres N.Guest

On Fri, Jul 3, 2009 at 09:05, kj<> wrote:
> The context is the concept of a binary search. Â In one of their
> homeworks, my students will have two occasions to use a binary
> search. Â This seemed like a perfect opportunity to illustrate the
> idea of abstracting commonalities of code into a re-usable function.
> So I thought that I'd code a helper function, called _binary_search,
> that took five parameters: a lower limit, an upper limit, a
> one-parameter function, a target value, and a tolerance (epsilon).
> It returns the value of the parameter for which the value of the
> passed function is within the tolerance of the target value.

Five params for a novice function is overkill. I'd drop the epsilon
thing and, assuming you are searching in a list, tuple or something
similar, make the lower limit default to the first index and the upper
default to the last one.

In fact, I'd let them to realize that a function is convenient, and
base some of the grading in whether they wrote it or not. Just a
thought.

--
Pablo Torres N.

Pablo Torres N., Jul 4, 2009
17. ### kjGuest

In <> Paul Rubin <http://> writes:

>kj <> writes:
>> sense = cmp(func(hi), func(lo))
>> if sense == 0:
>> return None
>> target_plus = sense * target + epsilon
>> target_minus = sense * target - epsilon
>> ...

>The code looks confusing to me and in some sense incorrect. Suppose
>func(hi)==func(lo)==target.

In hindsight, I too think that it is incorrect, but for a different
reason. I've rewritten it like this:

sense = cmp(func(hi), func(lo))
assert sense != 0, "func is not strictly monotonic in [lo, hi]"

I regard the very special case of func(hi)==func(lo)==target as
pathological (analogous to the fact that a stopped watch is "exactly
right" twice a day), and not one I care to support.

kj

kj, Jul 4, 2009
18. ### kjGuest

In <zxr3m.613\$> Alan G Isaac <> writes:

>> In <mGo3m.591\$> Alan G Isaac <> writes:
>>> 1. Don't use assertions to test argument values!

>On 7/3/2009 12:19 PM kj apparently wrote:
>> Out of curiosity, where does this come from?

>http://docs.python.org/reference/simple_stmts.html#grammar-token-assert_stmt
>"The current code generator emits no code for an assert statement when optimization is requested at compile time."

Sorry, this doesn't say anything like "don't use assertions to test
argument values". I'm aware of the fact that assertions are silenced
under certain circumstances. But where does the principle 1. in
your first reply come from? I've never seen it before, and would
like to understand the reasoning behind it.

kj

kj, Jul 4, 2009
19. ### Pablo Torres N.Guest

On Sat, Jul 4, 2009 at 10:05, kj<> wrote:
>>http://docs.python.org/reference/simple_stmts.html#grammar-token-assert_stmt
>>"The current code generator emits no code for an assert statement when optimization is requested at compile time."

>
> Sorry, this doesn't say anything like "don't use assertions to test
> argument values". Â I'm aware of the fact that assertions are silenced
> under certain circumstances. Â But where does the principle 1. in
> your first reply come from? Â I've never seen it before, and would
> like to understand the reasoning behind it.
>
> kj
> --
> http://mail.python.org/mailman/listinfo/python-list
>

But...if no code is generated for assertions on some occasions, then the
parameters would go unchecked, potentially breaking your code in said
occasions.

--
Pablo Torres N.

Pablo Torres N., Jul 4, 2009
20. ### kjGuest

In <> "Pablo Torres N." <> writes:

>On Sat, Jul 4, 2009 at 10:05, kj<> wrote:
>tmt
>>>"The current code generator emits no code for an assert statement when op=

>timization is requested at compile time."
>>
>> Sorry, this doesn't say anything like "don't use assertions to test
>> argument values". =C2=A0I'm aware of the fact that assertions are silence=

>d
>> under certain circumstances. =C2=A0But where does the principle 1. in
>> your first reply come from? =C2=A0I've never seen it before, and would
>> like to understand the reasoning behind it.
>>
>> kj
>> --
>> http://mail.python.org/mailman/listinfo/python-list
>>

>But...if no code is generated for assertions on some occasions, then the
>parameters would go unchecked, potentially breaking your code in said
>occasions.

This implies that code that uses *any* assert statement (other than
perhaps the trivial and meaningless ones like "assert True") is
liable to break, because whatever it is that these assert statements
are checking "on some occasions, ... would go unchecked, potentially

I'm beginning to think that the original "precept" was simply "cargo
cult," i.e. one of those rules that are parroted without being
fully understood.

As I wrote in my original post, the function I posted was an internal
("helper") function, not to be used outside of the file. This fact
was further (ahem) underscored by the leading underscore in the
function's name. Under these circumstances, the assert statements
seem to me perfectly in keeping with the intended use for them.

kj

kj, Jul 4, 2009