Code correctness and testing strategies

David

Hi list.

What strategies do you use to ensure correctness of new code?

Specifically, if you've just written 100 new lines of Python code, then:

1) How do you test the new code?
2) How do you ensure that the code will work correctly in the future?

Short version:

For (1) I thoroughly (manually) test code as I write it, before
checking in to version control.

For (2) I code defensively.

Long version:

For (2), I have a lot of error checks, similar to contracts (post &
pre-conditions, invariants). I've read about Python libs which help
formalize this[1][2], but I don't see a great advantage over using
regular ifs and asserts (and a few disadvantages, like additional
complexity). Simple ifs are good enough for Python built-in libs :)

[1] PEP 316: http://www.python.org/dev/peps/pep-0316/
[2] An implementation:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/436834
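For illustration, the kind of plain-if contract checking I mean might look roughly like this (the function and its conditions are made up for the example):

def withdraw(balance, amount):
    """Return the new balance after withdrawing amount (made-up example)."""
    # Preconditions, written as ordinary ifs
    if amount <= 0:
        raise ValueError("amount must be positive, got %r" % amount)
    if amount > balance:
        raise ValueError("amount exceeds balance")
    new_balance = balance - amount
    # Postcondition / invariant check
    if not 0 <= new_balance < balance:
        raise RuntimeError("postcondition violated: new_balance=%r" % new_balance)
    return new_balance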

An aside: What is the correct situation in which to use assert
statements in Python? I'd like to use them for enforcing 'contracts'
because they're quick to type, but from the docs:

"Assert statements are a convenient way to insert debugging assertions
into a program:"

So to me it sounds like 'assert' statements are only useful while
debugging, and not when an app is live, where you would also
(especially!) want it to enforce contracts. Also, asserts can be
removed with -O, and you only ever get AssertionError, where
ValueError and the like might be more appropriate.
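To make that concrete, here is a small sketch (the function and object names are invented): the assert version silently stops checking when the program is run with python -O, while the if/raise version keeps enforcing the 'contract' in a live app and gives callers a more descriptive exception type:

def set_ratio_checked(obj, ratio):
    # This check disappears entirely under python -O
    assert 0.0 <= ratio <= 1.0, "ratio out of range"
    obj.ratio = ratio

def set_ratio_enforced(obj, ratio):
    # This check always runs, and callers get a ValueError instead of AssertionError
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1, got %r" % ratio)
    obj.ratio = ratio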

As for point 1 (how do you test the new code?):

I like the idea of automated unit tests. However, in practice I find
they take a long time to write and test, especially if you want to
have good coverage (not just lines, but also possible logic branches).

So instead, I prefer to thoroughly test new code manually, and only
then check in to version control. I feel that if you are disciplined,
then unit tests are mainly useful for:

1) Maintenance of legacy code
2) More than 1 person working on a project

One recent personal example:

My workstation is a Debian Unstable box. I like to upgrade regularly
and try out new library & app versions. Usually this doesn't cause
major problems. One exception is sqlalchemy. Its API seems to change
every few months, causing warnings and breakage in code which used the
old API. This happened regularly enough that for one project I spent a
day adding unit tests for the ORM-using code, and getting the unit
tests up to 100% coverage. These tests should allow me to quickly
catch and fix all sqlalchemy API breakages in my app in the future.
The breakages also make me want to stop using ORM entirely, but it
would take longer to switch to SQL-only code than to keep the unit
tests up to date :)

My 'test code thoroughly before checkin' methodology is as follows:

1) Add "raise 'UNTESTED'" lines to the top of every function
2) Run the script
3) Look where the script terminated
4) Add print lines just before the exception to check the variable values
5) Re-run and check that the variables have the expected values.
6) Remove the print and 'raise "UNTESTED"' lines
7) Add liberal 'raise "UNTESTED"' lines to the body of the function.
8.1) For short funcs, before every line (if it seems necessary)
8.2) For longer funcs, before and after each logic entry/exit point
(blocks, exits, returns, throws, etc):

eg, before:

if A():
    B()
    C()
    D()
E()

after:

raise 'UNTESTED'
if A():
    raise 'UNTESTED'
    B()
    C()
    D()
    raise 'UNTESTED'
raise 'UNTESTED'
E()

8.2.1) Later I add "raise 'UNTESTED'" lines before each line in the
blocks also, if it seems necessary.

9) Repeat steps 2 to 8 until the script stops throwing exceptions
10) Check for 'raise "UNTESTED"' lines still in the script
11) Cause those sections of code to be run also (sometimes I need to
temporarily set vars to impossible values inside the script, since the
logic will never run otherwise)
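(A side note for anyone trying this today: raising a bare string is no longer legal Python, so the marker has to be a real exception. A minimal sketch of the same trick, with a made-up function:)

class Untested(Exception):
    """Marker: this code path has not been manually exercised yet."""

def frobnicate(items):      # made-up function, just to show where the markers go
    raise Untested          # removed once the function is reached and checked
    if not items:
        raise Untested      # removed once this branch is reached and checked
        return []
    return sorted(items)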

And here is one of my biggest problems with unit tests. How do you unit
test code which almost never runs? The only easy way I can think of is
for the code to have lines like 'if <some almost impossible condition>
or <busy running test case XYZ>'. I know I'm meant to make 'fake' testing
classes which return erroneous values, and then pass these objects to
the code being tested. But this can take a long time and even then
isn't guaranteed to reach all your error-handling code.
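For what it's worth, a minimal sketch of that 'fake object' approach, using unittest.mock (which is newer than this thread; the backup() function and its disk object are invented for the example):

import unittest
from unittest import mock

def backup(disk):
    """Made-up function: refuses to write to a disk that reports errors."""
    if disk.is_failing():
        raise RuntimeError("disk reported errors, aborting backup")
    return disk.write("backup data")

class BackupErrorPathTest(unittest.TestCase):
    def test_aborts_when_disk_reports_errors(self):
        fake_disk = mock.Mock()
        fake_disk.is_failing.return_value = True   # force the 'almost never happens' condition
        with self.assertRaises(RuntimeError):
            backup(fake_disk)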

The above methodology works well for me. It goes fairly quickly, and
is much faster than writing and testing elaborate unit tests.

So finally, my main questions:

1) Are there any obvious problems with my 'correctness' strategies?

2) Should I (regardless of time it takes initially) still be adding
unit tests for everything? I'd like to hear what XP/agile programming
advocates have to say on the subject.

3) Are there easy and fast ways to write and test (complete) unit tests?

4) Any other comments?

Thanks for your time.

David.
 
Roy Smith

David said:
Hi list.

What strategies do you use to ensure correctness of new code?

Specifically, if you've just written 100 new lines of Python code, then:

1) How do you test the new code?
2) How do you ensure that the code will work correctly in the future?

There's various philosophies about this (and none of them are specific to
Python), but I like a process called Test Driven Development (TDD).

In TDD, you invert the order you describe above. First write the test,
then write the code to make the tests pass. What I usually do is write the
documentation first, then write some tests that assert things the
documentation says, then write the code.

Python has several testing modules; unittest and doctest come with the
system, and there's a couple of other third-party modules that have sprung
up. I find unittest works well for me and stick with that.

I work in small increments, writing one test at a time, then some code,
then another test, then some more code, etc. In fact, I take this to what
many people might call an extreme. When I sit down to write a class, the
first test I write is one that just instantiates an instance of that class
with no arguments. Then I write the stub class:

class foo:
    pass

and see that the test works. If I'm feeling really expansive, I'll write a
more sophisticated version:

class foo:
    def __init__(self):
        pass

and use that to satisfy my initial test. Then it's just lather, rinse,
repeat until the class is done, fully documented, and fully tested.
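The very first test in that cycle can be almost nothing, e.g. (a sketch; the module name is assumed for the example):

import unittest
from foo_module import foo   # wherever the class will eventually live

class FooTest(unittest.TestCase):
    def test_can_be_created_with_no_arguments(self):
        foo()

if __name__ == '__main__':
    unittest.main()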
For (1) I thoroughly (manually) test code as I write it, before
checking in to version control.

I agree with the idea that it's fully tested before checking it in, but not
with the manual testing. In my mind, any test that's not fully automated
might as well not exist.
 
David

I work in small increments, writing one test at a time, then some code,
then another test, then some more code, etc. In fact, I take this to what
many people might call an extreme.

Thanks for the replies.

I've read about the TDD (and similar) approaches. Maybe I need to try
it and get used to it, but there are a few things I don't like about
it on a gut level. I'll try to enumerate my concerns here.

Problem 1: You can only code against tests

Basically, with TDD you write the tests first, then the code which
passes/fails the tests as appropriate. However, as you're writing the
code you will also think of a lot of corner cases you should also
handle. The natural way to do this is to add them to the code first.
But with TDD you have to first write a test for the corner case, even
if setting up test code for it is very complicated. So, you have these
options:

- Take as much time as needed to put a complicated test case in place.
- Don't add corner case to your code because you can't (don't have
time to) write a test for it.
- Add the corner case handling to the code first, and try to add a
test later if you have time for it.

Problem 2: Slows down prototyping

In order to get a new system working, it's nice to be able to throw
together a set of modules quickly, and if that doesn't work, scrap it
and try something else. There's a rule (forget where) that your first
system will always be a prototype, regardless of intent.

With TDD, you have to first write the tests for the first version.
Then when that first version doesn't work out, you end up scrapping
the tests and the code. The time spent writing the tests was wasted.

Problem 3: Slows down development in general

Having to write tests for all code takes time. Instead of eg: 10 hours
coding and say 1/2 an hour manual testing, you spend eg: 2-3 hours
writing all the tests, and 10 on the code.

Problem 4: Can make refactoring difficult.

If you have very complete & detailed tests for your project, but one
day you need to change the logic fundamentally (maybe change it from
single-threaded to multi-threaded, or from running on 1 server to
distributed), then you need to do a large amount of test refactoring
also. The more tests you have (usually a good thing), the longer it
will take to update all the tests, write new ones, etc. It's worse if
you have to do all this first before you can start updating the code.

Problem 5: Tests are more important than code

You need to justify the extra time spent on writing test code. Tests
are nice, and good to have for code maintainability, but they aren't
an essential feature (unless you're writing critical software for life
support, etc). Clients, deadlines, etc require actual software, not
tests for software (that couldn't be completed on time because you
spent too much time writing tests first ;-)).

Problems like the above sum up why I'm not comfortable with the idea of TDD.

I think that automated tests can be very valuable for maintainability,
making sure that you or other devs don't break something down the
line. But these benefits must be worth the time (and general
inconvenience) spent on adding/maintaining the tests.

If I did start doing some kind of TDD, it would be more of the 'smoke
test' variety. Call all of the functions with various parameters, test
some common scenarios, all the 'low hanging fruit'. But don't spend a
lot of time trying to test all possible scenarios and corner cases,
100% coverage, etc, unless I have enough time for it.

I'm going to read more on the subject (thanks to Ben for the link).
Maybe I have some misconceptions.

David.
 
Jacob Hallen

Thanks for the replies.

I've read about the TDD (and similar) approaches. Maybe I need to try
it and get used to it, but there are a few things I don't like about
it on a gut level. I'll try to enumerate my concerns here.

Problem 1: You can only code against tests

Basically, with TDD you write the tests first, then the code which
passes/fails the tests as appropriate. However, as you're writing the
code you will also think of a lot of corner cases you should also
handle. The natural way to do this is to add them to the code first.
But with TDD you have to first write a test for the corner case, even
if setting up test code for it is very complicated. So, you have these
options:

- Take as much time as needed to put a complicated test case in place.
- Don't add corner case to your code because you can't (don't have
time to) write a test for it.
- Add the corner case handling to the code first, and try to add a
test later if you have time for it.

As you come up with the corner case, write a test for it and leave the implementation
for later. The hard part of coding is always defining your problem. Once it is
defined (by a test) the solution is just a matter of tidy work.
Problem 2: Slows down prototyping

In order to get a new system working, it's nice to be able to throw
together a set of modules quickly, and if that doesn't work, scrap it
and try something else. There's a rule (forget where) that your first
system will always be a prototype, regardless of intent.

With TDD, you have to first write the tests for the first version.
Then when that first version doesn't work out, you end up scrapping
the tests and the code. The time spent writing the tests was wasted.

Agreed. There is no good way of reusing your prototype code in TDD. You end
up having to throw your prototype away in order to have a properly tested
implementation in the end. Takes more time up front, but less time over the
lifecycle of the program you are building.
Problem 3: Slows down development in general

Having to write tests for all code takes time. Instead of eg: 10 hours
coding and say 1/2 an hour manual testing, you spend eg: 2-3 hours
writing all the tests, and 10 on the code.

This is incorrect. It speeds up development in general. The debugging phase of
development becomes much shorter because the bugs are fewer and the ones you
have are much shallower. There are so many tests that reduce the scope in which
you have to search for the bug that it usually becomes trivial to find.

I have direct experience from this, getting my company to change to TDD about
10 months ago. Productivity has improved enormously. I'd say that we have cut
between 25 and 50% in development time.
Problem 4: Can make refactoring difficult.

If you have very complete & detailed tests for your project, but one
day you need to change the logic fundamentally (maybe change it from
single-threaded to multi-threaded, or from running on 1 server to
distributed), then you need to do a large amount of test refactoring
also. The more tests you have (usually a good thing), the longer it
will take to update all the tests, write new ones, etc. It's worse if
you have to do all this first before you can start updating the code.

No, this is a total misunderstanding. It makes refactoring much easier.
It takes a bit of time to refactor the affected tests for the module, but you
gain so much by having tests that show that your refactoring is not breaking
code that should be unaffected that it saves the extra time spent many times
over.

Introducing bugs because you missed some aspect in a refactoring is one of the
most common problems in non-TDD code and it is a really nasty quality
concern.
Problem 5: Tests are more important than code

You need to justify the extra time spent on writing test code. Tests
are nice, and good to have for code maintainability, but they aren't
an essential feature (unless you're writing critical software for life
support, etc). Clients, deadlines, etc require actual software, not
tests for software (that couldn't be completed on time because you
spent too much time writing tests first ;-)).
The tests are as important as the code. As a customer, I don't think I'd buy
software today unless I know that it has been built using TDD. Certainly I am
ahead of the curve in this, but it won't be long before this will be required
by skilled organisations buying software and sooner or later the rest of the
world will follow.

Getting into a TDD mindset is hard work, but those who succeed produce better
software with less effort.

Jacob Hallén

 
David

Hi again list.
As you come up with the corner case, write a test for it and leave the implementation
for later. The hard part of coding is always defining your problem. Once it is
defined (by a test) the solution is just a matter of tidy work.

Is it considered to be cheating if you make a test case which always
fails with a "TODO: Make a proper test case" message?

While it is possible to describe all problems in docs, it can be very
hard to write actual test code.

For example: sanity tests. Functions can have tests for situations
that can never occur, or are very hard to reproduce. How do you unit
test for those?

A few examples off the top of my head:

* Code which checks for hardware defects (pentium floating point,
memory or disk errors, etc).

* Code that checks that a file is less than 1 TB large (but you only
have 320 GB harddrives in your testing environment).

* Code which checks if the machine was rebooted over a year ago.

And so on. These I would manually test by temporarily changing
variables in the code, then changing them back. To unit test these you
would need to write mock functions and arrange for the tested code to
call them instead of the python built-ins.

Also, there are places where mock objects can't be used that easily.

eg 1: A complicated function, which needs to check the consistency of
its local variables at various points.

It *is* possible to unit test those consistency checks, but you may
have to do a lot of re-organization to enable unit testing.

In other cases it might not be appropriate to unit test, because it
makes your tests brittle (as mentioned by another poster).

eg: You call function MyFunc with argument X, and expect to get result Y.

MyFunc calls __private_func1, and __private_func2.

You can check in your unit test that MyFunc returns result Y, but you
shouldn't check __private_func1 and __private_func2 directly, even if
they really should be tested (maybe they sometimes have unwanted side
effects unrelated to MyFunc's return value).

eg: Resource usage.

How do you unit test how much memory, cpu, temporary disk space, etc a
function uses?

eg: Platforms for which unit tests are hard to setup/run.

- embedded programming. You would need to load your test harness into
the device, and watch LED patterns or feedback over serial. Assuming
it has enough memory and resources :)
- mobile devices (probably the same issues as above)

eg: race conditions in multithreaded code: You can't unit test
effectively for these.

And so on.
Agreed. There is no good way of reusing your prototype code in TDD. You end
up having to throw your prototype away in order to have a properly tested
implementation in the end. Takes more time up front, but less time over the
lifecycle of the program you are building.

Sounds like you are talking about cases where you have to throw away
the prototype *because* you couldn't unit test it properly? (but it
was otherwise functioning perfectly well).
This is incorrect. It speeds up development in general. The debugging phase of
development becomes much shorter because the bugs are fewer and the ones you
have are much shallower. There are so many tests that reduce the scope in which
you have to search for the bug that it usually becomes trivial to find.

Depends on the type of bug. If it's a bug which breaks the unit tests,
then it can be found quickly. Unit tests won't help with bugs they
don't explicitly cover. eg off-by-one, memory leaks, CPU load,
side-effects (outside what the unit tests test), and so on.

That's another reason why I don't think that unit tests are a silver
bullet. You can have code that's totally wrong, but still succeeds in
the tests (even if they're very detailed). eg: hardcoding return
values expected by the tests, and returning garbage the rest of the
time.

But once you track down problems like the above you can write more
unit tests to catch those exact bugs in the future. This is one case
where I do favour unit tests.

I guess you could compare unit tests to blacklists or antivirus
software. All of them only catch cases that have been explicitly
coded into them.
I have direct experience from this, getting my company to change to TDD about
10 months ago. Productivity has improved enormously. I'd say that we have cut
between 25 and 50% in development time.

Going by your figures and other cases I've read on the web, there are
definitely cases where TDD is beneficial and can save a lot of time.
What I'm not sure of (probably inexperience on my part) is when you
should and shouldn't use TDD, and to what extent.

I'm sure that factors like this have to come in to play when deciding
if and how to use TDD in a given project:

- size and age of the project (new, small code is easier to understand
than large, old)
- complexity & modularity of project
- programming language used (dynamic langs need unit tests more than compiled)
- importance of project (consequences of bugs)
- who the project is for (yourself, inhouse, or for client)
- how easy it is to fix problems in deployed software
- number of developers
- skill & discipline of developers
- development style (waterfall/incremental, etc)
- consistency checks already built into the software

That last one (internal consistency checks) is a big one for me. If
software has a lot of internal consistency checks (contracts), then I
feel that the need for unit tests is a lot less.
No, this is a total misunderstanding. It makes refactoring much easier.
It takes a bit of time to refactor the affected tests for the module, but you
gain so much by having tests that show that your refactoring is not breaking
code that should be unaffected that it saves the extra time spent many times
over.

Makes refactoring easier how?

I assume you mean the unchanged tests, which check functionality which
should be the same before and after your refactoring. All the other
tests need to be replaced. I agree that the unchanged tests can be
helpful. It's the extra work of unit test maintenance I have a problem
with.

In an extreme case you might refactor the same code multiple times,
requiring the test cases to also be refactored each time too (more
test cases = more work each time). To me it feels like the test cases
can be a lot of 'dead weight', slowing down development.

Even if you want unit tests for the final version which goes to the
customer, you still had to spend time with re-writing unit tests for
all the refactored versions inbetween. To me those intermediate unit
test versions sound like a complete waste of time. Please correct me
if this is also mistaken :)
Introducing bugs because you missed some aspect in a refactoring is one of the
most common problems in non-TDD code and it is a really nasty quality
concern.

I agree here. But I feel that you can get the same quality by good QA
and human testing (ie: developer does a lot of testing, then hands it
over to testers, then it goes to the customer). Which should be done
whether or not you have unit tests. I feel that unit tests are mainly
good for catching rare cases which might not be tested by a human.

In other words, full-on TDD only becomes really useful when projects
grow large (and complicated) and can't be fully tested by humans?
Until that point, only using unit tests to catch regressions seems to
be more than enough to ensure good quality. (Again, correct me if this
is mistaken).
The tests are as important as the code. As a customer, I don't think I'd buy
software today unless I know that it has been built using TDD. Certainly I am
ahead of the curve in this, but it won't be long before this will be required
by skilled organisations buying software and sooner or later the rest of the
world will follow.

How much software is written with TDD? Do companies generally
advertise this? I get the idea that most high quality open source
software (Apache, linux kernel, GNU, Python, etc) are developed in a
non-TDD way. What they do have is intelligent developers, coding
conventions, and a very good testing & QA process. Where they do have
unit tests it's usually for regressions. Why don't they use TDD if it
would make such a big difference? Are you going to stop using open
source software (like Python) which isn't written with TDD?
Getting into a TDD mindset is hard work, but those who succeed produce better
software with less effort.

Thanks for your informative reply. I've learned a bit from this thread
and will definitely look more into TDD :)

David.
 
David

In order to get a new system working, it's nice to be able to throw
That's fine. It's alright to prototype without tests. The only rule is that
you cannot then use any of that code in production.

So, at what point do you start writing unit tests? Do you decide:
"Version 1 I am going to definitely throw away and not put it into
production, but version 2 will definitely go into production, so I
will start it with TDD?".

Where this doesn't work so well is if version 2 is a refactored and
incrementally-improved version of version 1. At some point you need to
decide "this is close to the version that will be in production, so
let's go back and write unit tests for all the existing code".
You are either a very slow coder or a very poor tester: there should be a
lot more than 1/2 hour testing for 10 hours coding. I would say the
comparison might be 10 hours coding, 10 hours testing, then about a week
tracking down the bugs which escaped testing and got out to the customers.
With proper unit tests you will reduce all 3 of these numbers but
especially the last one. Any bug which gets away from the programmer and is
only caught further downstream costs vastly more than bugs caught during
development, and not just for the programmer but for everyone else who
is affected.

Seriously, 10 hours of testing for code developed in 10 hours? What
kind of environment do you write code for? This may be practical for
large companies with hordes of full-time testing & QA staff, but not
for small companies with just a handful of developers (and where you
need to borrow someone from their regular job to do non-developer
testing). In a small company, programmers do the lion's share of
testing. For programmers to spend 2 weeks on a project, and then
another 2 weeks testing it is not very practical when they have more
than one project.

As for your other points - agreed, bugs getting to the customer is not
a good thing. But depending on various factors, it may not be the end
of the world if they do. eg: There are many thousands of bugs in open
source bug trackers, but people still use open source for important
things. Sometimes it is better to have software with a few bugs, than
no software (or very expensive, or very long time in development). See
"Worse is Better": http://en.wikipedia.org/wiki/Worse_is_better. See
also: Microsoft ;-)
You seem to think that people are suggesting you write all the tests up
front: what you should be doing is interleaving design+testing+coding all
together. That makes it impossible to account for test time separately,
as the test time is tightly mixed with other coding; what you can be
sure of, though, is that after an initial slowdown while you get used to
the process, your overall productivity will be higher.

Sounds like you are suggesting that I obfuscate my development process
so no one can tell how much time I spent doing what :)

I think that moderate amounts of unit tests can be beneficial and not
slow down development significantly (similar to a bit more time spent
using version control vs not using it at all). Regression tests is a
good example. But going to the TDD extreme of always coding tests
before *any* code, for *all* projects, does not sit well with me (bad
analogy: similar to wasting time checking in each line into version
control separately, with a paragraph of comments).
The first time you make a change to some code and a test which is
apparently completely unrelated to the change you made breaks is the point
when you realise that you have just saved yourself hours of debugging when
that bug would have surfaced weeks later.

The next time your project is running late, your manager and the
customer will be upset if you spend time updating your unit tests
rather than finishing off the project (and handing it over to QA etc)
and adding the unit tests when there's actually time for it.
Clients generally require *working* software. Unfortunately it is all too
easy to ship something broken because then you can claim you completed the
coding on time and any slippage gets lost in the next 5 years of
maintenance.

That's why you have human testing & QA. Unit tests can help, but they
are a poor substitute. If the customer is happy with the first
version, you can improve it, fix bugs, and add more unit tests later.

David

PS: To people following this thread: I don't mean to be argumentative.
This is a subject I find interesting and I enjoy the debate. I'm
playing devil's advocate (troll?) to provoke further discussion.
 
D'Arcy J.M. Cain

Basically, with TDD you write the tests first, then the code which
passes/fails the tests as appropriate. However, as you're writing the
code you will also think of a lot of corner cases you should also
handle. The natural way to do this is to add them to the code first.
But with TDD you have to first write a test for the corner case, even
if setting up test code for it is very complicated. So, you have these
options:

- Take as much time as needed to put a complicated test case in place.

Absolutely. You may think that it is slowing you down but I can assure
you that in the long run you are saving yourself time.
- Don't add corner case to your code because you can't (don't have
time to) write a test for it.

If you don't have time to write complete, working, tested code then you
have a problem with your boss/client, not your methodology.
- Add the corner case handling to the code first, and try to add a
test later if you have time for it.

Never! It won't happen.
Having to write tests for all code takes time. Instead of eg: 10 hours
coding and say 1/2 an hour manual testing, you spend eg: 2-3 hours
writing all the tests, and 10 on the code.

In conventional development, 10 hours of code requires 90 hours of
testing, debugging and maintenance. Under TDD (and agile in general)
you spend 20 hours testing and coding. That's the real economics if
you want to deliver a good product.
I think that automated tests can be very valuable for maintainability,
making sure that you or other devs don't break something down the
line. But these benefits must be worth the time (and general
inconvenience) spent on adding/maintaining the tests.

I can assure you from experience that it always is worth the time.
If I did start doing some kind of TDD, it would be more of the 'smoke
test' variety. Call all of the functions with various parameters, test
some common scenarios, all the 'low hanging fruit'. But don't spend a
lot of time trying to test all possible scenarios and corner cases,
100% coverage, etc, unless I have enough time for it.

Penny wise, pound foolish. Spend the time now or spend the time later
after your client complains.
I'm going to read more on the subject (thanks to Ben for the link).
Maybe I have some misconceptions.

Perhaps just lack of experience. Read up on actual case studies.
 
D'Arcy J.M. Cain

Is it considered to be cheating if you make a test case which always
fails with a "TODO: Make a proper test case" message?

Yes. It's better to have the daily reminder that some code needs to be
finished.
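Such a placeholder is only a couple of lines, e.g. (a sketch, names invented):

import unittest

class CornerCaseTest(unittest.TestCase):
    def test_handles_corrupt_header(self):
        # Deliberately failing placeholder: it shows up in every test run
        # until a real test (and the handling code) gets written.
        self.fail("TODO: write a proper test for the corrupt-header case")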
While it is possible to describe all problems in docs, it can be very
hard to write actual test code.

It may be hard to start but once you have your framework in place it
becomes very easy.
For example: sanity tests. Functions can have tests for situations
that can never occur, or are very hard to reproduce. How do you unit
test for those?

Believe me, thousands of people reading this are remembering situations
where something that couldn't possibly happen happened.
A few examples off the top of my head:

* Code which checks for hardware defects (pentium floating point,
memory or disk errors, etc).

* Code that checks that a file is less than 1 TB large (but you only
have 320 GB harddrives in your testing environment).

* Code which checks if the machine was rebooted over a year ago.

And so on. These I would manually test by temporarily changing
variables in the code, then changing them back. To unit test these you
would need to write mock functions and arrange for the tested code to
call them instead of the python built-ins.

Yes but the mock functions can be wrappers around the real functions
which only change the results that you are testing for.
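A sketch of that wrapper style, using unittest.mock (which postdates this thread) and the 1 TB example from above; check_file() and the path are invented:

import os
import unittest
from unittest import mock

def check_file(path):
    """Made-up function under test: rejects files of 1 TB or more."""
    if os.path.getsize(path) >= 10**12:
        raise ValueError("file too large: %s" % path)

class HugeFileTest(unittest.TestCase):
    def test_rejects_one_terabyte_file(self):
        real_getsize = os.path.getsize

        def fake_getsize(path):
            # Wrapper around the real function: only the file we care about
            # gets a faked size, everything else behaves normally.
            if path == "/tmp/huge.bin":
                return 2 * 10**12
            return real_getsize(path)

        with mock.patch("os.path.getsize", fake_getsize):
            with self.assertRaises(ValueError):
                check_file("/tmp/huge.bin")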
eg: You call function MyFunc with argument X, and expect to get result Y.

MyFunc calls __private_func1, and __private_func2.

You can check in your unit test that MyFunc returns result Y, but you
shouldn't check __private_func1 and __private_func2 directly, even if
they really should be tested (maybe they sometimes have unwanted side
effects unrelated to MyFunc's return value).

It isn't your job to test __private_func1 and __private_func2 unless
you are writing MyFunc.
Depends on the type of bug. If it's a bug which breaks the unit tests,
then it can be found quickly. Unit tests won't help with bugs they
don't explicitly cover. eg off-by-one, memory leaks, CPU load,
side-effects (outside what the unit tests test), and so on.

No but when you find that your code breaks due to these problems that's
when you write new unit tests.
But once you track down problems like the above you can write more
unit tests to catch those exact bugs in the future. This is one case
where I do favour unit tests.

Yes! One of the biggest advantages to unit testing is that you never
ever deliver the same bug to the client twice. Delivering software
with a bug is bad but delivering it with the same bug after it was
reported and fixed is calamitous.
 
Roy Smith

David said:
Problem 1: You can only code against tests

Yup. That's the flavor of Kool-Aid being served at a convenient TDD
outlet near you.
Basically, with TDD you write the tests first, then the code which
passes/fails the tests as appropriate. However, as you're writing the
code you will also think of a lot of corner cases you should also
handle. The natural way to do this is to add them to the code first.

That's only the natural way if you haven't drunk the Kool-Aid :) It
takes a while to get used to, but once you get the hang of it, doing it
this way becomes very natural. Sure, I think of corner cases when I'm
writing code. But, what I do with them is write a test for them first,
then write the code which implements it.
 
Roy Smith

David said:
While it is possible to describe all problems in docs, it can be very
hard to write actual test code.

For example: sanity tests. Functions can have tests for situations
that can never occur, or are very hard to reproduce. How do you unit
test for those?

In some cases, you can use mock objects to mimic the "can never happen"
situations. But, you are right, there are certainly cases which are
difficult or impossible to test for. TDD is a very powerful tool, but it's
just that: a tool. It's not a magic wand.

My suggestion is to make using TDD a habit, but don't turn it into a
religion. You will undoubtedly find places where it's just the wrong tool.
Don't let the fact that it can't do everything keep you from using it when
it makes sense.
 
Terry Reedy

| > But once you track down problems like the above you can write more
| > unit tests to catch those exact bugs in the future. This is one case
| > where I do favour unit tests.
|
| Yes! One of the biggest advantages to unit testing is that you never
| ever deliver the same bug to the client twice. Delivering software
| with a bug is bad but delivering it with the same bug after it was
| reported and fixed is calamitous.

Writing a test for each code bug is now part of the Python maintenance
procedure.
 
Michael L Torrie

David said:
Seriously, 10 hours of testing for code developed in 10 hours? What
kind of environment do you write code for? This may be practical for
large companies with hordes of full-time testing & QA staff, but not
for small companies with just a handful of developers (and where you
need to borrow someone from their regular job to do non-developer
testing). In a small company, programmers do the lion's share of
testing. For programmers to spend 2 weeks on a project, and then
another 2 weeks testing it is not very practical when they have more
than one project.

Watch your programmers then. They do have to write and debug the code.
And they will spend at least as much or more time debugging as writing
the code. It's a fact. I have several programmers working for me on
several projects. What you have been told is fact. In my experience
it's 3-10x more time debugging than programming. I've heard that *good*
programmers write, on average, 10 new lines of code per day. I can also
verify that this is pretty accurate, both in my own programming
experience, and watching programmers working for me.
 
Matthew Woodcraft

Michael L Torrie said:
Watch your programmers then. They do have to write and debug the
code. And they will spend at least as much or more time debugging as
writing the code. It's a fact. I have several programmers working
for me on several projects. What you have been told is fact.

This isn't the case for everyone. In my workplace the time we spend
debugging is small compared to the time writing the code in the first
place. I wonder what the difference is?

We do use unit-testing quite widely, but by no means everywhere. The
code which doesn't have unit tests doesn't tend to be any buggier than
the code which does. Where testsuites really help is when you have to
upgrade some library or service that your programs are depending on,
and you get to find out about subtle backwards-incompatibilities.

-M-
 
David

Hi again.

Taking the advice of numerous posters, I've been studying BDD further.

I spent a while looking for a Python library which implemented BDD in
Python similar to jbehave, as described by Dan North on this page:
http://dannorth.net/introducing-bdd. I did find a few, but they either
had awful-looking syntax, or they were overly-complicated. So I
decided that using regular unit tests (via nosetest) was good enough,
even if it doesn't have support for stories, scenarios, givens, etc,
and it uses names with "Test" in them instead of "Behavior".

One thing I just tried was to put together a basic stack class
following BDD, with nosetest. I got the idea from this page:
http://www.ibm.com/developerworks/web/library/j-cq09187/index.html

It was an interesting exercise, and I'm encouraged to try it further.

I ended up with these 2 modules:

======test_stack.py========

from nose.tools import raises
import stack

class TestStackBehaviour:
    def setup(self):
        self.stack = stack.Stack()
    @raises(stack.Empty)
    def test_should_throw_exception_upon_pop_without_push(self):
        self.stack.pop()
    def test_should_pop_pushed_value(self):
        self.stack.push(12345)
        assert self.stack.pop() == 12345
    def test_should_pop_second_pushed_value_first(self):
        self.stack.push(1)
        self.stack.push(2)
        assert self.stack.pop() == 2
    def test_should_leave_value_on_stack_after_peep(self):
        self.stack.push(999)
        assert self.stack.peep() == 999
        assert self.stack.pop() == 999
    def test_should_pop_values_in_reverse_order_of_push(self):
        self.stack.push(1)
        self.stack.push(2)
        self.stack.push(3)
        assert self.stack.pop() == 3
        assert self.stack.pop() == 2
        assert self.stack.pop() == 1
    @raises(stack.Empty)
    def test_peep_should_fail_when_stack_is_empty(self):
        self.stack.peep()
    def test_should_be_empty_when_new(self):
        assert len(self.stack) == 0

======stack.py========

class Empty(Exception):
    """Thrown when a stack operation is impossible because it is empty"""
    pass

class Stack:
    """Basic implementation of a stack"""
    def __init__(self):
        self._data = []
    def push(self, value):
        """Push an element onto a stack"""
        self._data.append(value)
    def pop(self):
        """Pop an element off a stack"""
        try:
            return self._data.pop()
        except IndexError:
            raise Empty
    def peep(self):
        """Return the top-most element of the stack"""
        try:
            return self._data[-1]
        except IndexError:
            raise Empty
    def __len__(self):
        """Return the number of elements in the stack"""
        return len(self._data)

===================
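(For comparison, since @raises is nose-specific: the first of those tests written against nothing but the standard library's unittest would look roughly like this.)

import unittest
import stack

class StackBehaviourTest(unittest.TestCase):
    def setUp(self):
        self.stack = stack.Stack()
    def test_should_throw_exception_upon_pop_without_push(self):
        self.assertRaises(stack.Empty, self.stack.pop)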

Does the above look like a decent BDD-developed class?

Is it ok that there are no 'scenarios', 'stories', 'having', 'given',
etc references?

Some pages suggest that you should use so-called contexts
(EmptyStackContext, StackWithOneElementContext, FullStackContext,
AlmostFullStackContext, etc).

Would you normally start with a basic TestStackBehaviour class, and
when Stack becomes more complicated, split the tests up into
TestEmptyStackContext, TestStackWithOneElementContext, etc?

Another thing I noticed is that some of my test cases were redundant.
Would you normally leave in the redundant tests, or remove the ones
which are included in the more general test?

Also, I have another question. How do you unit test event loops?

eg: Your app is a (very basic) service, and you want to add some
functionality (following BDD principles)

Here's an example unit test:

class TestServiceBehavior:
    def setup(self):
        ...
    def test_handles_event_xyz(self):
        ...

If your service is normally single-threaded, would your unit test need
to start the service in a separate thread to test it?

Another method would be to update the event loop to enable unit
testing. eg only iterate once if a 'being_tested' variable is set
somewhere.

None of the above are ideal. What is a good way to unit test event loops?
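One sketch of a third option (not from the thread, just an illustration): write the loop as a thin shell around a method that handles a single event, and let the tests call that method directly, with no extra thread and no test flag:

class Service:
    def handle_event(self, event):
        # All the interesting logic lives here, so it is directly testable.
        if event == "xyz":
            return "handled xyz"
        return "ignored"

    def run(self, get_event):
        # The real event loop is just a shell around handle_event().
        while True:
            event = get_event()
            if event is None:
                break
            self.handle_event(event)

class TestServiceBehavior:
    def setup(self):
        self.service = Service()
    def test_handles_event_xyz(self):
        assert self.service.handle_event("xyz") == "handled xyz"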

David.
 
Aahz

Seriously, 10 hours of testing for code developed in 10 hours? What
kind of environment do you write code for? This may be practical for
large companies with hordes of full-time testing & QA staff, but not
for small companies with just a handful of developers (and where you
need to borrow someone from their regular job to do non-developer
testing). In a small company, programmers do the lion's share of
testing. For programmers to spend 2 weeks on a project, and then
another 2 weeks testing it is not very practical when they have more
than one project.

You must have poor project management/tracking. You WILL pay the cost
of testing, the only question is when. The when does have an impact on
other aspects of the development process.

Speaking as someone who started in my current job four years ago as the
third developer in a five-person company, I believe that your claim about
the differences between small companies and large companies is specious.
 
David

You must have poor project management/tracking. You WILL pay the cost
of testing, the only question is when. The when does have an impact on
other aspects of the development process.

Speaking as someone who started in my current job four years ago as the
third developer in a five-person company, I believe that your claim about
the differences between small companies and large companies is specious.

Might be a difference in project size/complexity then, rather than
company size. Most of my work projects are fairly small (a few
thousand lines each), very modular, and each is usually written and
maintained by one developer. A lot of the programs will be installed
together on a single server, but their boundaries are very clearly
defined.

Personally I've had to do very little bug fixing and maintenance.
Thorough testing of all my changes before they go into production
means that I've caught 99% of the problems, and there is very little
to fix later.

That's why I'm surprised to hear that such a huge amount of time is
spent on testing maintenance, and why the other posters make such a
big deal about unit tests.

I'm not a genius programmer, so it must be that I'm lucky to work on
smaller projects most of the time.

David.
 
Matthew Woodcraft

David said:
Might be a difference in project size/complexity then, rather than
company size. Most of my works projects are fairly small (a few
thousand lines each), very modular, and each is usually written and
maintained by one developer. A lot of the programs will be installed
together on a single server, but their boundaries are very clearly
defined.
Personally I've had to do very little bug fixing and maintenance.
Thorough testing of all my changes before they go into production
means that I've caught 99% of the problems, and there is very little
to fix later.
That's why I'm surprised to hear that such a huge amount of time is
spent on testing maintenance, and why the other posters make such a
big deal about unit tests.
I'm not a genius programmer, so it must be that I'm lucky to work on
smaller projects most of the time.

This is close to my experience. One lesson we might draw is that
there's an advantage to structuring your work as multiple small
projects whenever possible, even if making one big project seems more
natural.

I should think everyone would be happy with a bug-coping strategy of
"don't write the bugs in the first place", if at all possible. My guess
is that the main part of the 'small project' advantage is that all
changes can be written or reviewed by someone who is decently familiar
with the whole program.

That does suggest that if programmers do find that they're spending
more than half of their time fighting bugs, it might be worthwhile to
invest time in having more people become very familiar with the
existing code.

-M-
 
Paul Rubin

You must have poor project management/tracking. You WILL pay the cost
of testing, the only question is when. The when does have an impact on
other aspects of the development process.

Well, let's say you used TDD and your program has 5000 tests. One
might reasonably ask: why 5000 tests? Why not 10000? Why not 20000?
No number of tests can give you mathematical certainty that your code
is error-free. The only sensible answer I can think of to "why 5000"
is that 5000 empirically seemed to be enough to make the program
reliable in practice. Maybe if you used a more error-prone coding
process, or wrote in assembly language instead of Python, you would
have needed 10000 or 20000 tests instead of 5000 to get reliable code.

But then one might reasonably ask again: why 5000 tests? Why not 2000
or 1000? Was there something wrong with the coding process, that
couldn't produce reliable code with fewer tests?

So, I think you have to consider the total development cycle and not
treat test development as if it were free.

I also haven't yet seen an example of a real program written in this
test-driven style that people keep touting. I use doctest when
writing purely computational code, and maybe it helps some, but for
more typical code involving (e.g.) network operations, writing
automatic tests (with "mock objects" and all that nonsense) is a heck
of a lot more work than testing manually in the interactive shell, and
doesn't seem to help reliability much. I'd be interested in seeing
examples of complex, interactive or network-intensive programs with
automatic tests.
 
Jacob Hallen

That's why you have human testing & QA. Unit tests can help, but they
are a poor substitute. If the customer is happy with the first
version, you can improve it, fix bugs, and add more unit tests later.

The most important aspect of unit testing is actually that it makes the code testable.
This may sound like an oxymoron but it is actually a really important property. Testable
code has to have a level of modularity as well as simplicity and clarity in its
interfaces that you will not achieve in code that lacks automated unit tests.

You can easily convince yourself that this is true by adding complete coverage unit
tests to some code after you have written it. It's tough work and more often than
not, you need to refactor the code to make it testable.

Another aspect that you are raising is the use of human testing and QA. I agree that
these are important, but every bug they discover is a failure of the developers and
their tests. Our testers can sail through a testbed in 30 minutes if there are no bugs.
Every single bug adds 30-60 minutes of testers time in order to pinpoint the bug
and supply the developers with enough information to locate and fix it. Add some
10 minutes to testing time on the next testbed to verify that the bug is actually
fixed. In my end of the world, tester time is not cheaper than developer time. It
is also a scarcer resource than developer time.

The round trips between developers and testers also add to the real time required to
develop the product. There will always be delays introduced by having to provide the
tester with a testable system, wait for the tests to happen, have the report written up
and allocated to a developer, etc. With TDD most of these bugs will be caught
in a no-delay loop at the developer's desk.

This last fact has another important property that is easily overlooked. Once the
off-by-one errors and other trivial bugs don't clutter the minds of the testers
they will start thinking more clearly about the application and where the really
nasty bugs are found - the ones where your thinking went wrong but your implementation
is correct. If the testers are busy tracking trivial bugs, your customers will
find the nasty ones. If you are lucky, they will tell you.

Jacob Hallén

 
Paul Rubin

Duncan Booth said:
I made the mistake at one point, when I was trying to sell the concept of
TDD, of telling the people I was trying to persuade that writing the tests
up front influences the design of the code. I felt the room go cold:
they said the customer has to sign off the design before we start coding,
and once they've signed it off we can't change anything.

Usually the customer signs off on a functional specification but that
has nothing to do with the coding style. Jacob makes a very good
point that TDD influences coding style, for example by giving a strong
motivation to separate computational code from I/O. But that is
independent of the external behavior that the customer cares about.
 
