JUnit et al approach - criticisms

Chris Uppal

Mike said:
Have you never had managers who said things like "We're in crunch mode, so
we can't afford to do as much testing as we'd like to."?

I've had /customers/ who said that.

(To be fair, the customers had their own test plans, so we weren't installing
underdone software, only /shipping/ it ;-)

-- chris
 
Chris Uppal

Andrew said:
What do you mean by 'non observable behavior'?

All code should have some effect upon the system, else by definition that
code is not doing anything and should be deleted.

There is much behaviour which is not observable -- that is to say, not legally
observable according to the laws of the Java language. For instance:
1) any non-public member variable
2) any aspect of an object's state that is only reflected in
its non-public behaviour
3) any part of the execution state of another thread
4) the locked/unlocked status of a monitor
There are probably others. I would expect the first two of these to be the
most problematic in practice.

Additionally, there may be external restrictions -- e.g. it may be possible for
an application to log a message but not (without artificially boosting its
privileges) be permitted to read the resulting log.
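
(For concreteness, a minimal sketch of point (1), with hypothetical names of my
own: the cache below is real state, and a broken cache would be a real bug, but
no public method legally reveals whether a call was computed or served from the
cache.)

    // Hypothetical example: 'cache' is genuine state, but nothing in the
    // public interface reveals whether a result was computed or cached.
    public class Squarer {
        private final java.util.Map<Integer, Integer> cache =
            new java.util.HashMap<Integer, Integer>();

        public int square(int n) {
            Integer hit = cache.get(n);     // non-observable lookup
            if (hit != null) return hit;
            int result = n * n;
            cache.put(n, result);           // non-observable update
            return result;
        }
    }
    // A test can assert square()'s return value, but cannot legally assert
    // that the second call hit the cache.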

-- chris
 
Chris Uppal

I said:
2) any aspect of an object's state that is only reflected in
its non-public behaviour

I meant to revise that before posting, and then hit the send button
prematurely. The sentence should read "any aspect [...] that is only feasibly
reflected in [...]". If the behaviour is genuinely /never/ public then it
"doesn't matter", but there can certainly be cases where bugs could easily be
detected by examining non-public behaviour but which would be difficult or
impractical to examine using only public behaviour.

-- chris
 
Mike Schilling

Chris Uppal said:
I said:
2) any aspect of an object's state that is only reflected in
its non-public behaviour

I meant to revise that before posting, and then hit the send button
prematurely. The sentence should read "any aspect [...] that is only feasibly
reflected in [...]". If the behaviour is genuinely /never/ public then it
"doesn't matter", but there can certainly be cases where bugs could easily be
detected by examining non-public behaviour but which would be difficult or
impractical to examine using only public behaviour.

This is the best reason I know of for C++'s "friend" feature; it allows a
test class to see the complete state of the class it's testing.
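
(Java has no direct equivalent of "friend", but a rough approximation, with
hypothetical names, is to give the test package-private access by putting it
in the same package as the class under test:)

    // Rough Java analogue of C++ 'friend': the test lives in the same
    // package, so it may read package-private (default access) state.
    package com.example.buffers;            // hypothetical package

    public class Buffer {
        int used;        // package-private: visible to same-package tests
        public void add(byte b) {
            used++;
            // ... store the byte ...
        }
    }

    // Under the test source root, in the same package:
    //
    //   package com.example.buffers;
    //   public class BufferTest extends junit.framework.TestCase {
    //       public void testAddBumpsUsedCount() {
    //           Buffer buf = new Buffer();
    //           buf.add((byte) 1);
    //           assertEquals(1, buf.used);  // inspecting non-public state
    //       }
    //   }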
 
Mike Schilling

Chris Uppal said:
I've had /customers/ who said that.

(To be fair, the customers had their own test plans, so we weren't installing
underdone software, only /shipping/ it ;-)

It's still foolish. "We want to delay finding the bugs, so they'll be
harder and more expensive to fix."
 
Timbo

Chris said:
A wise man once told me that "testing is a deliberate attempt to break the
system".

Yes, this is a common view held in testing literature: a good test
is a test that produces a failure.

Chris said:
I think that's a very good description of the attitude that should be
present when testing -- or rather, should be applied during /some/ testing.
And that's what I find is missing in almost any automated testing -- whether
the wildly faddish TDD or classical overnight test suites -- the purpose of the
test is to confirm that the system (or module) works. There is no active
intent to make the system break, there is no intelligent exploration of
corner-cases, there is no inspired guessing at possibly unanticipated
combinations of inputs. Above all, there is no exploration of the problem
space -- the suite tests the same combinations each time. (The big exception
to this -- which I have very rarely seen used in practice -- is when automation
is used for a brute-force, exhaustive exploration of a significant sub-set of
the problem space.) I like to see some real testing (in the above sense) as
part of the development effort. To me (and assuming that exhaustive testing is
infeasible) that means interactive testing. Write a test harness or use
something like BeanShell. Try things out: did they work? Did they work
/exactly/ how you expected? Push things a bit. Did the disk-light flash when
it shouldn't? Did the screen flicker more than you expected during your GUI's
repaint? Try wildly implausible combinations. You are trying to break your
own code -- which requires imagination and attention to detail.

While I agree 100% with your view on how inputs should be tested
(many test scripts I see test only the normal case behaviours), I
disagree that interactive testing is better at finding faults, and
that unit testing and aggressive testing are mutually exclusive. If
you are trying wildly implausible combinations of behaviour in an
attempt to produce a failure in an interactive mode, then you
should be testing that same behaviour in your automated tests. I
don't really see why interactive testing would produce more
interesting inputs if it's the same person writing the tests.

This is not to say that interactive scripts are worthless. They
can be quite useful for debugging.

Chris said:
Anyway, the warning I wanted to give about automated testing isn't just that
it's not an adequate substitute for aggressive testing (in the above sense),
but also that there's a risk that it will /displace/ aggressive testing. Once
the automated tests are written (whether before, along-side, or after the code
they apply to), it takes a great deal of self-discipline[*] not just to rely on
those tests. Press the button, everything comes up green, "Good, it works!".
That's not /testing/ -- it has a great deal of value, but it's confirming that
the system works on some inputs.

That's exactly what testing is: confirming that a system works on
SOME inputs. As Dijkstra famously said: "Testing confirms the
presence of bugs, not their absence". While I agree that it takes
a good deal of discipline to maintain test scripts, IMO, it is
much more painful to sit down after every change and interactively
test, which is also testing only some of the inputs. This will
result in only testing the areas of the code that have changed,
and not testing the result of that on the rest of the system/module.

Chris said:
([*] /I/ don't have that much self-discipline, so -- although I've always put a
great deal of effort into testing -- I've settled into a working pattern where
I don't write automated tests until /after/ I'm satisfied that my stuff works.

I see no problem with that. Failure in testing processes generally
comes at the regression level. It's straightforward to test that
new functionality works as intended, but the effects of that
change on the rest of the system have to be tested also. As long as
those tests are automated, this makes the problem much more
straightforward.

Chris said:
In my experience, writing scripted tests (JUnit, and
similar) takes a lot of work -- about an order of magnitude more work than doing
the same tests in the kind of interactive environment I was talking about
earlier.)

Is it really more difficult to add tests for new behaviour in a
script than it is to interactively test that behaviour? A bonus is that
regression testing can be performed and, more importantly, the
tests are better documented.

Tim
 
Chris Uppal

Mike said:
It's still foolish. "We want to delay finding the bugs, so they'll be
harder and more expensive to fix."

Not necessarily. Both customers (it's only happened to me twice) were large,
experienced, and competent in the ways of IT. I think their reasoning was
something like [in the following "we" means the customer, "they" means the mob
I worked for]:

- We have a hard deadline to go live with a system which, we hope, will
  include their software. If not then we go live anyway with reduced
  functionality.
- We have only a limited time available for testing and are worried that it
  might be insufficient.
- Therefore we must ensure that we are not wasting /any/ time on
  inefficient testing.
- We know (by definition) exactly what priority to give to which functions
  and aspects of the system, and therefore are able to prioritise testing
  appropriately.
- Whatever testing we leave to them risks them spending too little time on
  things that are important to us, or too much time on things that we don't
  care so much about.

Downside:
a) They know the implementation of the system better than we do, and
so they can do finer grained testing, and also can concentrate on
things they consider (technically) high-risk.

But on the other hand:
a) There are some critical things that /only/ we can test properly.
b) They have proved pretty good at delivering working systems in the
past.
c) They can be testing in parallel with us, and so can look for the
things in downside (a) above (but we may have to pay extra
for that -- must talk to the lawyers).
d) We can blame them just as easily for delivering a buggy product as
we can for delivering late ;-) [*]

All in all I think it was certainly defensible, and probably sensible,
decision-making.

(BTW, why should an error found by their testing be any more or less expensive
to fix than the same error found by our testing?)

-- chris

([*] A more cynical version of the same idea, the customer's manager in charge
of the project thinks:
I have the choice of getting into acceptance testing on time, but with
increased risk of the delivered product being unacceptable (which we
would blame on the supplier), or of risking missing /my/ deadlines and
being blamed myself for not managing the supplier properly.
Hmm, /tough/ decision...
In all truth, I doubt if the thinking really was quite as cynical in the cases
I experienced, but one never knows.)
 
Roedy Green

Mike Schilling said:
It's still foolish. "We want to delay finding the bugs, so they'll be
harder and more expensive to fix."

There are cases where it is not foolish. E.g. you must have SOMETHING
to show for JavaOne. That date is not negotiable. Even if it is
buggy you can dance around the bugs in a demo.
 
Mike Schilling

Chris Uppal said:
(BTW, why should an error found by their testing be any more or less expensive
to fix than the same error found by our testing?)

Because the earlier you find a bug, the cheaper it is to fix it:

- Fewer people will run into it. (Ideally, only one, if it's found before
  the code is released to a shared area.)
- Less of the system will be affected by the problem when there is less of
  the system.
- If fixing it requires redesign, less of the system will be affected by
  that.


Unless both of you are doing end-to-end testing of the complete system:
then, there's no particular reason.
 
Andrew McDonagh

Mike said:
Because the earlier you find a bug, the cheaper it is to fix it:

- Fewer people will run into it. (Ideally, only one, if it's found before
  the code is released to a shared area.)
- Less of the system will be affected by the problem when there is less of
  the system.
- If fixing it requires redesign, less of the system will be affected by
  that.


Unless both of you are doing end-to-end testing of the complete system:
then, there's no particular reason.

And then there's developer memory. When a bug is introduced as a
direct effect of the work they are working on (it is identified before
they are 'done', or as soon as possible after), they are more likely
to 'know' (i.e. guess correctly) where the problem is. So they can
reproduce and fix it quicker and therefore cheaper.
 
Mike Schilling

Roedy Green said:
There are cases where it is not foolish. E.g. you must have SOMETHING
to show for JavaOne. That date is not negotiable. Even if it is
buggy you can dance around the bugs in a demo.

If the project is of any complexity, skipping unit testing is more likely
to result in nothing that's even demoable.
 
Chris Uppal

Timbo said:
While I agree 100% with your view on how inputs should be tested
(many test scripts I see test only the normal case behaviours), I
disagree that interactive testing is better at finding faults, and
that unit testing and aggressive testing are mutually exclusive.

If you have an adequate interactive testing environment, then interactive
testing is an order of magnitude faster, and far less disruptive, than writing
"formal" test code. (If your environment is not such that you have that order
of difference, then I'd say that's a very serious flaw in your environment, not
a strength of jUnit ;-) If creating scripted tests were as fast and fluid as
interactive testing, then I would much more nearly agree with you.

Another point is that much testing does not need to be repeated. E.g. if I
have written some simple buffer manipulation, and want to run a quick sanity
check, then I'll try using inputs that are bunched around the buffer size.
There's no point in "freezing" such tests as code, since it is only useful for
the /specific/ code that I'm testing. If that code changes then the test would
have to change (completely) too. Of course, if the buffer handling is in any
way complex, then some proper regression tests should be created too (but
that -- in my book -- is a different issue, and it should wait until the code
has settled down).
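
(For concreteness, a throwaway sketch of the kind of sanity check I mean; the
buffer's capacity and the roundTrip helper are hypothetical stand-ins:)

    // Throwaway sanity check: inputs bunched around the buffer size.
    public class BufferSanityCheck {
        static final int SIZE = 1024;  // assumed capacity of the buffer under test

        public static void main(String[] args) {
            int[] lengths = { 0, 1, SIZE - 1, SIZE, SIZE + 1, 2 * SIZE };
            for (int len : lengths) {
                byte[] in = new byte[len];
                java.util.Arrays.fill(in, (byte) 7);
                byte[] out = roundTrip(in);  // hypothetical write-then-read-back
                if (!java.util.Arrays.equals(in, out))
                    throw new AssertionError("round trip failed at length " + len);
            }
            System.out.println("ok");
        }

        // Stand-in for pushing data through the real buffer and reading it back.
        static byte[] roundTrip(byte[] in) {
            return in.clone();
        }
    }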

One last point -- which to my mind is /extremely/ important, but which I failed
to bring out in my earlier post -- is that each time you "play" with your code
interactively, you will be running a /different/ test. Think of it as "beta"
testing happening very early in the cycle. It's /precisely/ the lack of that
variation which makes me uneasy about software (including my own) which is
tested only via scripts.

-- chris
 
Timbo

Chris said:
If you have an adequate interactive testing environment, then interactive
testing is an order of magnitude faster, and far less disruptive, than writing
"formal" test code. (If your environment is not such that you have that order
of difference, then I'd say that's a very serious flaw in your environment, not
a strength of jUnit ;-) If creating scripted tests were as fast and fluid as
interactive testing, then I would much more nearly agree with you.

Well, we'll have to agree to disagree here, because I think that
if one's interactive testing is quicker than a test script, then
you are not testing a component adequately. For your
example of a buffer, one could write an automated script to test
different sizes (0, 1, many) and the boundaries of those in a matter of
minutes, which would be just as straightforward.

Either way, I'm guessing your interactive test code is something
like a 'main' program that drives the unit in question. If that's
the case, I don't think it is much harder to write a test script
that reads in files placed in a specific test directory, with each
file specifying the input that you would otherwise be typing in,
and the expected output. It may take a bit more time, but you then
have a test suite that allows you to run your tests over and over
after each change, and that is easily extensible (add new test
input/output files), and easily changed.
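
(A rough sketch of that idea; the testdata/ layout, the *.in/*.out pairing and
process() are all hypothetical:)

    // Data-driven runner: each testdata/NAME.in file holds input that would
    // otherwise be typed interactively; testdata/NAME.out holds the expected
    // output. process() stands in for the unit under test.
    import java.io.File;
    import java.nio.file.Files;

    public class FileDrivenTests {
        public static void main(String[] args) throws Exception {
            File dir = new File("testdata");    // hypothetical layout
            for (File in : dir.listFiles((d, name) -> name.endsWith(".in"))) {
                String input = new String(Files.readAllBytes(in.toPath()));
                File outFile = new File(dir, in.getName().replace(".in", ".out"));
                String expected = new String(Files.readAllBytes(outFile.toPath()));
                String actual = process(input); // call into the unit under test
                if (!expected.equals(actual))
                    throw new AssertionError(in.getName() + ": expected "
                            + expected + " but got " + actual);
            }
            System.out.println("all file-driven tests passed");
        }

        // Stand-in: replace with a call into the real code under test.
        static String process(String input) {
            return input;
        }
    }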

Chris said:
One last point -- which to my mind is /extremely/ important, but which I failed
to bring out in my earlier post -- is that each time you "play" with your code
interactively, you will be running a /different/ test. Think of it as "beta"
testing happening very early in the cycle. It's /precisely/ the lack of that
variation which makes me uneasy about software (including my own) which is
tested only via scripts.

Yes, that is a very interesting point about running different
tests each time. However, you may use different tests each time,
but if the code-under-test has been changed, the tests you have
run previously are no longer valid, because they have tested a
different implementation.
 
Andrew McDonagh

Chris said:
There is much behaviour which is not observable. That is to say not legally
observable according to the laws of the Java language. For instance:

Unit tests test the result of a call, of which there has to be some
output, else the code isn't doing anything useful.


Note: All of the following answers rely upon normal Java access
privileges and do not assume the need to use any kind of reflection.

1) any non-public member variable

If the ClassUnderTest (CUT) has a private member var that changes as a
result of the test AND that change is the result that will tell the test
whether it passed, then we will need to somehow see that variable in one
form or another. In other words, the CUT would normally have some other
method that could be called to see if the private member has changed
(note, not talking solely about a simple getter here -- but it may be the
case).

The change may be that the private member has been nullified - in which
case, by stimulating the CUT again we can check for change.

An example here would be having a Listener reference. Our test could
give the CUT a MockListener which we verify is called when we stimulate
as necessary, then call the public method to set the listener var to
null, then stimulate the CUT again and check that the MockListener was NOT
called a second time.
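
(A minimal JUnit-style sketch of exactly that sequence; all the names are
hypothetical, and the toy Alarm class stands in for the CUT:)

    public class AlarmTest extends junit.framework.TestCase {

        interface AlarmListener { void alarmRaised(); }

        // Toy CUT with a private listener field, enough to show the pattern.
        static class Alarm {
            private AlarmListener listener;              // non-public state
            public void setListener(AlarmListener l) { listener = l; }
            public void trigger() { if (listener != null) listener.alarmRaised(); }
        }

        // Minimal hand-rolled mock: records how many times it was called.
        static class MockListener implements AlarmListener {
            int calls = 0;
            public void alarmRaised() { calls++; }
        }

        public void testClearedListenerIsNotNotifiedAgain() {
            Alarm alarm = new Alarm();
            MockListener mock = new MockListener();
            alarm.setListener(mock);
            alarm.trigger();                 // stimulate as necessary
            assertEquals(1, mock.calls);     // mock was notified
            alarm.setListener(null);         // null out the private listener var
            alarm.trigger();                 // stimulate again
            assertEquals(1, mock.calls);     // NOT called a second time
        }
    }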

One way or another, if a CUT is changing the state of a member var
and that change has no effect, then the member var is not needed -- and
so delete it.
2) any aspect of an object's state that is only reflected in
its non-public behaviour

Can you give me an example of what you mean here, because I'd say that
state was not needed if it can't be detected from outside.
3) any part of the execution state of another thread
4) the locked/unlocked status of a monitor


Testing the state of execution of another thread, multi-threadedness and
other threaded scenarios are not unit tests; they are functional (a la
acceptance) tests.

These types of test are not what JUnit was developed for, and so 'out of
the box' they aren't supported very well. That being said, there are add-ons
to do just those types of tests.

The main reason they are not considered to be part of the unit testing
stage is that they test functionality of the system, rather than the
behavior of a unit (class). Acceptance/functional tests are better
developed in other testing tools that support a scripting interface that
non-Java developers can read, understand and use themselves to help
create tests (see FitNesse.org, Selenium, Watir, etc.)

We would not normally write tests to detect monitor status itself, but
we could write tests to prove we don't get deadlocks.
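
(One common shape for such a test, with a hypothetical Account class: run the
contended operations in worker threads and fail if they have not finished
within a generous timeout -- the usual symptom of a deadlock:)

    // If Account's locking can deadlock, the joins below time out and the
    // worker threads are still alive when we assert.
    public void testOpposingTransfersDoNotDeadlock() throws Exception {
        final Account a = new Account(1000);   // hypothetical class under test
        final Account b = new Account(1000);
        Thread t1 = new Thread(new Runnable() {
            public void run() { for (int i = 0; i < 10000; i++) a.transferTo(b, 1); }
        });
        Thread t2 = new Thread(new Runnable() {
            public void run() { for (int i = 0; i < 10000; i++) b.transferTo(a, 1); }
        });
        t1.start(); t2.start();
        t1.join(5000); t2.join(5000);          // generous timeouts
        assertFalse("probable deadlock", t1.isAlive() || t2.isAlive());
    }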

Unit tests test individual classes/methods.

There are probably others. I would expect the first two of these to be the
most problematic in practice.

Weird, to me they are the easiest.
Additionally, there may be external restrictions -- e.g. it may be possible for
an application to log a message but not (without artificially boosting its
privileges) be permitted to read the resulting log.

A unit test is not a unit test if it relies upon external resources
and/or restrictions that cannot be substituted at runtime -- they are
acceptance tests.

In the example above, the unit test could very easily substitute the log
object that the class (not the application -- because then we'd be acceptance
testing) is writing to, so it can verify the correct behavior.

This dependency injection (via constructor, setter, dynamic loading from a
resource file, retrieval from a singleton, etc.), necessary for testability,
results in a better, more changeable and maintainable design.
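
(A minimal sketch of that substitution, hypothetical names throughout: the
class writes to a Log interface handed in through its constructor, and the
unit test hands in an in-memory fake it can inspect:)

    interface Log {
        void write(String message);
    }

    class Processor {
        private final Log log;
        Processor(Log log) { this.log = log; }   // constructor injection
        void fail(String reason) { log.write("FAILED: " + reason); }
    }

    // Test double: captures messages instead of touching any external resource.
    class FakeLog implements Log {
        final java.util.List<String> lines = new java.util.ArrayList<String>();
        public void write(String message) { lines.add(message); }
    }

    // In the unit test:
    //   FakeLog log = new FakeLog();
    //   new Processor(log).fail("disk full");
    //   assertEquals("FAILED: disk full", log.lines.get(0));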

If this seems strange, feel free to knock up an SSCCE (or whatever it's
called this week) in which you have something that is non-visible and so
can't be tested, and I'll show you how it could be -- or why it's not a
unit test.

Andrew
 
Chris Uppal

Andrew McDonagh wrote:

[choosing just this one paragraph as a representative of the whole post]
A unit test is not a unit test if it relies upon external resources
and/or restrictions that cannot be substituted at runtime -- they are
acceptance tests.

This is simply playing with words. I would not define "unit test" so narrowly,
and I /certainly/ would not relate the concept of "acceptance test" to external
resources in any way (they may use them, they may not). An acceptance test has
always (in my experience) been a test run by the /buyer/. Anyway, if we do use
your restricted sense of "unit test" then I have never at any time indicated
any restriction of the validity of /my/ comments to "unit tests". Indeed I
have little or no interest in them, nor in the religion that appears to
accompany the term when used so narrowly.

-- chris
 
Chris Uppal

Timbo said:
Well, we'll have to agree to disagree here, because I think that
if one's interactive testing is quicker than a test script, then
you are not testing a component adequately. For your
example of a buffer, one could write an automated script to test
different sizes (0, 1, many) and the boundaries of those in a matter of
minutes, which would be just as straightforward.

Minutes! Good gracious, man, I don't have /minutes/ to waste!

;-)

Seriously, if writing a test distracted me for minutes, then that is indeed
something that I would avoid. That would be killing the fluidity of
development, not to mention causing an immense disturbance of my train of
thought. As such it should be (IMO) postponed to a time when the software has
settled down, and I am in a position to concentrate on writing test code that
is not only sensible and reasonably thorough, but also clear and maintainable.

Either way, I'm guessing your interactive test code is something
like a 'main' program that drives the unit in question.

No. By interactive I /mean/ interactive. The ideal environment for doing this
sort of thing is Smalltalk (or, perhaps, Lisp), but tools like BeanShell, etc,
can get you part of the way there. Actually, I don't know of any Java
environment that /really/ supports interactive testing, but this is a pattern
of working that I developed before Java even existed (and long before I started
using Smalltalk). It's a matter of making best use of the available tools, and
writing your own whenever necessary.
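
(To illustrate, this is roughly what a quick interactive poke looks like at a
BeanShell prompt -- RingBuffer is a hypothetical class under test, print() is
a BeanShell built-in, and each line is typed by hand in Java syntax:)

    buf = new RingBuffer(4);       // loosely typed BeanShell variable
    buf.put("a"); buf.put("b");
    print(buf.size());             // 2 -- as expected?
    buf.put("c"); buf.put("d"); buf.put("e");
    print(buf.size());             // does it wrap, grow, or throw?
    print(buf.get());              // "a", or "b" if the oldest was overwritten?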

-- chris
 
Mike Schilling

Chris Uppal said:
Andrew McDonagh wrote:

[choosing just this one paragraph as a representative of the whole post]
A unit test is not a unit test if it relies upon external resources
and/or restrictions that cannot be substituted at runtime -- they are
acceptance tests.

This is simply playing with words. I would not define "unit test" so narrowly,
and I /certainly/ would not relate the concept of "acceptance test" to external
resources in any way (they may use them, they may not). An acceptance test has
always (in my experience) been a test run by the /buyer/.

I think you're being too specific here. I use:

Unit test -- test that is part of a subsystem, verifying that a part of that
subsystem behaves correctly. This may include both black- and white-box
testing.

Acceptance test -- test that is part of a subsystem's client, testing the
subsystem as called by that client. This is black-box testing only, since
the client is allowed to know nothing of the subsystem's implementation.

To re-implement a subsystem safely, you need both unit tests and acceptance
tests from all clients.
 
Timbo

Chris said:
Minutes! Good gracious, man, I don't have /minutes/ to waste!

;-)

LOL!

Chris said:
Seriously, if writing a test distracted me for minutes, then that is indeed
something that I would avoid. That would be killing the fluidity of
development, not to mention causing an immense disturbance of my train of
thought. As such it should be (IMO) postponed to a time when the software has
settled down, and I am in a position to concentrate on writing test code that
is not only sensible and reasonably thorough, but also clear and maintainable.

Fair enough. I'm the opposite. I like to run the test scripts
while I continue with other stuff (tidying up comments etc).

Chris said:
No. By interactive I /mean/ interactive.

Yeah, sorry, I meant an interactive 'main' program that drives the
testing by prompting for input etc.

Chris said:
The ideal environment for doing this sort of thing is Smalltalk (or, perhaps,
Lisp), but tools like BeanShell, etc, can get you part of the way there.
Actually, I don't know of any Java environment that /really/ supports
interactive testing, but this is a pattern of working that I developed before
Java even existed (and long before I started using Smalltalk). It's a matter
of making best use of the available tools, and writing your own whenever
necessary.

In my best Homer Simpson voice: MMmmmmm... Smalltalk.....

Have you tried that BlueJ (not BlueJay) interactive environment
(I've never used it myself)? I seem to recall that was aimed at
beginners for interacting with stand-alone units, although I don't
really remember many details.
 
Chris Uppal

Timbo said:
Have you tried that BlueJ (not BlueJay) interactive environment
(I've never used it myself)? I seem to recall that was aimed at
beginners for interacting with stand-alone units, although I don't
really remember many details.

I have tried it and have even recommended it. It's important to realise that
it's primarily a /teaching/ tool -- and a tool for teaching OO at that. Some of
the ideas behind it come about as close as one can conveniently get to
"interacting with objects" rather than "writing code" being the central concept
of a good (OO) development environment.

And of course, once one can talk directly to one's objects, testing (in the
interactive sense I've been going on about) becomes easy. Indeed there's a
sense in which that's all one ever does with them.

I have some ideas for a /fully/ interactive Java environment (needs bytecode
rewriting plus a bit of JNI) but with the layers of fluff that Sun keep adding
to the language, the gap between what's technically feasible (more-or-less
anything) and what's actually worthwhile doing /in Java/ gets bigger and
bigger...

-- chris
 
Chris Uppal

Mike said:
I think you're being too specific here. I use:

Unit test -- test that is part of a subsystem, verifying that a part of
that subsystem behaves correctly. This may include both black- and
white-box testing.

No significant disagreement there.

Only two points I'd add. One is that "unit testing" neither implies nor is
implied by the use of an xUnit framework. (I doubt if you confuse the two, but
I think some people do). The other is that given the fractal nature of
software designs, almost any test is simultaneously a unit test at one level
and an "integration test" at a lower level.

(I put "integration test" is scare quotes because that is not quite the usual
meaning of the term, but I don't know of anything closer in common usage.
Normally, in my experience, it is only used when a complete /system/ is being
tested -- full end-to-end functionality.)

Acceptance test -- test that is part of a subsystem's client, testing the
subsystem as called by that client. This is black-box testing only, since
the client is allowed to know nothing of the subsystem's implementation.

Hmm. I would definitely call that an odd use of the term. At least in an
environment where one has genuine paying customers, the question that
"acceptance testing" addresses is not "does the code work?", but "are we going
to be paid?". And, for all the importance of the first question, the second
has importance of a totally different order. I think few people would want to
confuse the two ;-)

-- chris
 
