Requesting critique of a C unit test environment

Ark Khasin

Ian said:
What was your conclusion?
Since you asked... I expected more substantial feedback (perhaps
naively).

I've been heavily involved in a SIL 3 project with ARM and IAR EWARM 4.x.
Our consultant-on-safety-stuff company spelled out requirements for unit
testing /documentation/ so that it could be accepted by TÜV - a
certifying agency. So we started looking for test automation tools that
would help /demonstrate/ test results, including code/branch coverage.
IPL Cantata++ didn't (at least at the time) support IAR. Period.
LDRA Testbed was (at least at the time) outright buggy; we had to
abandon it.
We ended up doing unit tests ad hoc with manual documentation, asking
for code coverage proof from the debugger. In the process, I found a way
of instrumenting the code by abusing the C preprocessor. It was not what
I asked you to critique, but the ideas were similar. I used it for
regression testing to make sure the execution trace stayed the same - or
that it changed for a purpose, in which case I had to set a new
regression base.
It occurred to me later that the same instrumentation ideas could be
used to demonstrate code coverage, and that the framework could be
commonized with a very small footprint. That's what I asked to critique.
Personally, I love this thingie. It can get me through SIL business for
free (as opposed to 10K per license or so), and additional coding-policy
items should not be an issue for those already restricted by (a subset
of) MISRA.
Note that this stuff is applicable whether you do TDD or a formal
testing campaign.
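
To give a rough idea of the kind of preprocessor instrumentation described
above, here is a minimal sketch; the names (TRACE_POINT, trace_record,
clamp) are invented for illustration and are not taken from the actual
framework under critique:

#include <stdio.h>

#ifdef UNIT_TEST
/* Record a marker each time a tagged point in the code is reached. */
void trace_record(const char *tag)
{
    /* Appending tags to a log lets a test compare the execution trace
     * against a stored regression baseline; the set of distinct tags
     * seen also gives a crude statement/branch coverage report. */
    FILE *log = fopen("trace.log", "a");
    if (log != NULL) {
        (void)fprintf(log, "%s\n", tag);
        (void)fclose(log);
    }
}
#define TRACE_POINT(tag) trace_record(tag)
#else
/* In production builds the instrumentation compiles away entirely. */
#define TRACE_POINT(tag) ((void)0)
#endif

/* Code under test: each branch carries its own marker. */
int clamp(int value, int lo, int hi)
{
    if (value < lo) {
        TRACE_POINT("clamp:below");
        return lo;
    }
    if (value > hi) {
        TRACE_POINT("clamp:above");
        return hi;
    }
    TRACE_POINT("clamp:within");
    return value;
}

#ifdef UNIT_TEST
int main(void)
{
    /* Exercising all three branches leaves all three markers in
     * trace.log, which documents branch coverage of clamp(). */
    (void)clamp(-5, 0, 10);
    (void)clamp(50, 0, 10);
    (void)clamp(7, 0, 10);
    return 0;
}
#endif

Built with -DUNIT_TEST the translation unit tests itself; built without
it, the markers expand to nothing and the production object code is
unchanged.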
 
Ian Collins

Ark said:
Since you asked... I expected more substantial feedback (perhaps
naively).

I've been heavily involved in a SIL 3 project with ARM and IAR EWARM 4.x.
Our consultant-on-safety-stuff company spelled out requirements for unit
testing /documentation/ so that it could be accepted by TÜV - a
certifying agency. So we started looking for test automation tools that
would help /demonstrate/ test results, including code/branch coverage.

Well, TDD done correctly will give you the required coverage; it might
be interesting proving it to the certifying agency though (a branch
won't be there unless a test requires it).
 
Ark Khasin

Ian said:
Well, TDD done correctly will give you the required coverage; it might
be interesting proving it to the certifying agency though (a branch
won't be there unless a test requires it).
The mantra is, if a branch is to never be executed, it shall not be in
the production code. If it is executable, I must demonstrate how I
exercised it.
BTW, I tend to doubt TDD can be followed in a semi-research environment
where you churn out piles of code to see what works vs. how the plant
behaves, etc. Of course there is some lightweight informal T, but none
of it documented. Once you've found a decent solution, you've got a fair
amount of code that needs, aside from bug finding/fixing, only
error-condition handling for productizing. At this point, T is already
way behind D.
Or I must be missing something here...
 
Ark Khasin

Ian said:
The last sentence is important, so I'll repeat it - the unit tests are
built and run each time the module is compiled.
Assuming that the test code itself must be reasonably dumb (so that
/its/ errors immediately stand out), that's not terribly realistic:
imagine a sweep over, say, "int24_t" range. One could only hope to run
automated tests overnight - on a long night :).
 
Ian Collins

Ark said:
Assuming that the test code itself must be reasonably dumb (so that
/its/ errors immediately stand out), that's not terribly realistic:
imagine a sweep over, say, "int24_t" range. One could only hope to run
automated tests overnight - on a long night :).

It may not appear that way, but it is the reality on any project I
manage. In all (C++) cases, the tests take less time to run than the
code takes to build (somewhere between 50 and 100 tests per second,
unoptimised).
 
Ian Collins

Ark said:
The mantra is, if a branch is to never be executed, it shall not be in
the production code. If it is executable, I must demonstrate how I
exercised it.
BTW, I tend to doubt TDD can be followed in a semi-research environment
where you churn out piles of code to see what works vs. how the plant
behaves, etc. Of course there is some lightweight informal T, but none
of it documented. Once you've found a decent solution, you've got a fair
amount of code that needs, aside from bug finding/fixing, only
error-condition handling for productizing. At this point, T is already
way behind D.
Or I must be missing something here...
Well, there you would be wrong. I even use it for quick one-offs,
because it helps me go faster. The time saved not having to debug more
than justifies the process.
 
Ian Collins

Ark said:
The mantra is, if a branch is to never be executed, it shall not be in
the production code. If it is executable, I must demonstrate how I
exercised it.

With TDD, if the branch isn't required to pass a test, it won't be
there at all.
 
Keith Thompson

Ark Khasin said:
Assuming that the test code itself must be reasonably dumb (so that
/its/ errors immediately stand out), that's not terribly realistic:
imagine a sweep over, say, "int24_t" range. One could only hope to run
automated tests overnight - on a long night :).

If every unit test has to check every possible value over a large
range, then yes, things could take a while. I just wrote a program
that iterated over a range of 2**24 values in under a second, but if a
function takes three 32-bit arguments, an exhaustive test starts to be
impractical.

But presumably in that case you'd just test a carefully chosen subset
of the possible argument values.
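
To make the "carefully chosen subset" concrete, here is a hedged sketch
(saturating_add and the list of values are invented for illustration):
each 32-bit argument gets a short list of boundary and representative
values, and the test walks their cross product - 64 cases instead of
2**64.

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Example function under test: 32-bit addition that saturates. */
static int32_t saturating_add(int32_t a, int32_t b)
{
    int64_t sum = (int64_t)a + (int64_t)b;
    if (sum > INT32_MAX) return INT32_MAX;
    if (sum < INT32_MIN) return INT32_MIN;
    return (int32_t)sum;
}

int main(void)
{
    /* Minimum, maximum, zero, values next to the edges, and one
     * mid-range value per argument. */
    static const int32_t interesting[] = {
        INT32_MIN, INT32_MIN + 1, -1, 0, 1, 12345, INT32_MAX - 1, INT32_MAX
    };
    size_t n = sizeof interesting / sizeof interesting[0];

    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            /* Oracle: compute the exact sum in 64 bits and clamp it. */
            int64_t expected = (int64_t)interesting[i] + interesting[j];
            if (expected > INT32_MAX) expected = INT32_MAX;
            if (expected < INT32_MIN) expected = INT32_MIN;
            assert(saturating_add(interesting[i], interesting[j]) == expected);
        }
    }
    puts("all chosen-subset cases passed");
    return 0;
}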
 
Phlip

Ian said:
Ark Khasin wrote:

Sounds like TDD.
Well, there you would be wrong. I even use it for quick one-offs,
because it helps me go faster. The time saved not having to debug more
than justifies the process.

Ian, you are responding to the straw-person argument, "Projects that use TDD
only ever write any tests before the tested code."

They don't. Now let's hear why these "research" environments are _allowed_
to write code without a failing test, first!
 
Phlip

Ian said:
Well, TDD done correctly will give you the required coverage; it might
be interesting proving it to the certifying agency though

Put the test into FIT, and let the certifying agency change the input
variables and watch the output responses change.

Oh, are you going to say there are "certifying agencies" out there which
_don't_ expect literate acceptance tests to cover their requirements??

Sure explains KBR, huh? (-;
 
Phlip

Ark said:
Ian Collins wrote:
Assuming that the test code itself must be reasonably dumb (so that /its/
errors immediately stand out), that's not terribly realistic: imagine a
sweep over, say, "int24_t" range. One could only hope to run automated
tests overnight - on a long night :).

Under TDD, you might only need enough tests to establish a linear function
along that int24_t (two tests), and enough to establish its extrema (two
more tests). These run fast.

If you then need more tests, you add them. There is no TDD "or" a formal
test campaign. You are allowed to use "unit test" techniques. For example,
you might calculate a sweep that covers a representative subset of all
possible integers.

If your tests run too long to help development, you push the slow ones out
into a slow suite, and run this on a test server. You can use one of many
Continuous Integration tools to trigger that batch run each time developers
commit. (And they should commit every 5 to 15 minutes.)

You leave every TDD test in the developers' suite.

If the long suites fail, you treat the failure the same as a bug reported by
users. You determine the fix, then write a trivial TDD test which fails
because the fix is not there. Note the test does not exhaustively prove the
fix is correct; it's just enough to pin the fix down. You leave this test
case with the developers' suite.

A program that uses a wide range of an int24_t's possible states is a
program with a high dimensional space. All programs have such a space! The
exhaustive test for such spaces would take forever to run. However, high
dimension spaces are inherently sparse. Tests only need to constrain
specific points and lines within that space.

One way to determine these points and lines is to analyze that space to
determine the minimum set of sweeps of inputs that will cover the whole
space. You test a carefully chosen subset of the possible input values.

Another way is to write the tests first. Then you have a test for two points
on every line, and each endpoint to each line, and so on.

If you write tests first, you also get to avoid a lot of debugging. If your
tests fail unexpectedly, you have the option to revert back to the last
state where all tests passed, and try again.

This leads to some curious effects. Firstly, you can make much more savage
changes between each test run. You can even change code that someone else
wrote! a long time ago!! that everything else uses!!!

Next, if your tests are cheap and sloppy, but the code can't exist without
them, you get the ultimate in Design for Testing. You get Design BY Testing.
That means your tests might fail even if your code had no bug.

Again: Your tests might fail even if your code had no bug.

That means your code occupies a high-dimension space that is easy for your
tests to cover. So instead of analyzing your code and wrapping your tests
around it, you used Adaptive Planning with both the code and tests to
simplify that high-dimension space.
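
As a rough sketch of the "two points pin the line, two more pin the
extrema" idea (scale_to_percent and RAW_MAX are invented here, standing
in for a linear function over a 24-bit input):

#include <assert.h>
#include <stdint.h>

#define RAW_MAX ((int32_t)0x00FFFFFF)   /* largest 24-bit value */

/* Example function under test: linear scaling of a 24-bit reading. */
static int32_t scale_to_percent(int32_t raw)
{
    /* The 64-bit intermediate avoids overflow in the multiply. */
    return (int32_t)(((int64_t)raw * 100) / RAW_MAX);
}

int main(void)
{
    /* Two interior points establish the linear behaviour... */
    assert(scale_to_percent((RAW_MAX + 1) / 2) == 50);
    assert(scale_to_percent((RAW_MAX + 1) / 4) == 25);

    /* ...and two more pin down the extrema. */
    assert(scale_to_percent(0) == 0);
    assert(scale_to_percent(RAW_MAX) == 100);
    return 0;
}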
 
Ark Khasin

Phlip said:
Put the test into FIT, and let the certifying agency change the input
variables and watch the output responses change.

Oh, are you going to say there are "certifying agencies" out there which
_don't_ expect literate acceptance tests to cover their requirements??

Sure explains KBR, huh? (-;
A certifying agency's stamp assures a Bhopal or Chernobyl plant manager
that it is safe to put your gadget in a safety application w.r.t.
acceptable safety risk according to a safety standard (IEC 61508 in
my case).
It verifies that
- you have an acceptable development process (from marketing
requirements to validation and everything in between, usually including
continuous improvement)
- you followed the process and can demonstrate it on every level.

It doesn't _run_ any tests for you, but it checks that you did so, that
your tests were comprehensive, and that you can show documentation for it.
 
Ark Khasin

Phlip said:
Sounds like TDD.


Ian, you are responding to the straw-person argument, "Projects that use TDD
only ever write any tests before the tested code."

They don't. Now let's hear why these "research" environments are _allowed_
to write code without a failing test, first!
[If we agree that a test is a contraption to check if the code works as
expected:]
If we don't know what to expect ("research"), we cannot write a test.
[Or again I am missing something]

E.g. if I'm writing a version control system, I know exactly what _has_
to happen, and I can write the tests.
If e.g. I'm writing monitoring code for a car's wheel speed sensors, I
may have a rock-solid idea that, whatever the speeds, the wheels always
remain at the vertices of a rectangle of the original size. Enter sensor
noise, wheel spinning, tire inflation and what not. I need lots of code
just to study what's going on before I can arrive at a sensible
algorithm.
 
Everett M. Greene

Ark Khasin said:
Assuming that the test code itself must be reasonably dumb (so that
/its/ errors immediately stand out), that's not terribly realistic:
imagine a sweep over, say, "int24_t" range. One could only hope to run
automated tests overnight - on a long night :).

Is it really necessary to test all 2**24 values? It would seem that
testing the minimum, maximum, zero, and some representative values in
between would suffice. The "representative values" should be numerically
irrational (pi and e are good, for instance) so as to catch cases of
certain bits not being handled properly; 1 and 2 are not good choices
although they need to work properly as well.

In the area of branch testing, one has to test loops for proper
termination. [I just found some bugs last evening that involved
some simple counting loops that didn't terminate due to doing a
check for <0 on an unsigned value -- oops.]
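
That loop bug is worth making concrete; a small sketch (names and
values invented) of the broken pattern and one common fix:

#include <stdio.h>

int main(void)
{
    /* BUG pattern: with an unsigned counter, "i >= 0" is always true,
     * so a count-down loop written like this never terminates (i wraps
     * from 0 to UINT_MAX instead of going negative):
     *
     *     for (unsigned i = 4; i >= 0; i--) { ... }
     *
     * Most compilers and lint will warn that the comparison is always
     * true.
     */

    /* One common fix: do the decrement in the loop condition, so the
     * body sees 4, 3, 2, 1, 0 and the loop then stops cleanly. */
    for (unsigned i = 5; i-- > 0; ) {
        printf("%u\n", i);
    }
    return 0;
}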
 
Everett M. Greene

Phlip said:
Ian, you are responding to the straw-person argument, "Projects that use
TDD only ever write any tests before the tested code."

They don't. Now let's hear why these "research" environments are _allowed_
to write code without a failing test, first!

Have you ever worked in a product R&D environment? A lot of concepts
are taken for a test drive without ever seeing the light of day outside
the lab. If the product does make it out the door, the original
concept proving/testing work is probably a very small portion of the
final product. You want to spend a lot of time and effort producing
more formalized testing processes for something that has a very low
probability of ever being used in a production environment?
 
Phlip

Ark said:
[If we agree that a test is a contraption to check if the code works as
expected:]

The weakest possible such contraption - yes.
If we don't know what to expect ("research"), we cannot write a test. [Or
again I am missing something]

If you can think of the next line of code to write, you must perforce be
able to think of a test case that will fail because the line is not there.

Next, if you are talking about research to generate algorithms for some
situation, then you aren't talking about production code. Disposable code
doesn't need TDD. Once you have a good algorithm, it will have details that
lead to simple test cases.
E.g. if I'm writing a version control system, I know exactly what _has_ to
happen, and I can write the tests.
If e.g. I'm writing monitoring code for a car's wheel speed sensors, I
may have a rock-solid idea that, whatever the speeds, the wheels always
remain at the vertices of a rectangle of the original size. Enter sensor
noise, wheel spinning, tire inflation and what not. I need lots of code
just to study what's going on before I can arrive at a sensible
algorithm.

That's an acceptance test. TDD tests don't give a crap if your code is
acceptable - if it targets wheels or wings. It's just a system to match
lines of code to trivial, nearly useless test cases.
 
Ark Khasin

Everett said:
Is it really necessary to test all 2**24 values? It would seem that
testing the minimum, maximum, zero, and some representative values in
between would suffice. The "representative values" should be numerically
irrational (pi and e are good, for instance) so as to catch cases of
certain bits not being handled properly; 1 and 2 are not good choices
although they need to work properly as well.

The example of "gullibility measurement and conversion" in
http://www.macroexpressions.com/dl/C code unit testing on a shoestring.pdf
may be reasonably convincing.
In the area of branch testing, one has to test loops for proper
termination. [I just found some bugs last evening that involved
some simple counting loops that didn't terminate due to doing a
check for <0 on an unsigned value -- oops.]

IMHO, an unsigned<0 condition doesn't rise to the level of testing:
Lint will find it before you compile.
 
Phlip

Everett said:
Have you ever worked in a product R&D environment?

Yes. I helped teach TDD and Python to polymaths who were into declaring
multi-dimensional arrays as void **.
A lot of concepts
are taken for a test drive without ever seeing the light of day outside
the lab. If the product does make it out the door, the original
concept proving/testing work is probably a very small portion of the
final product. You want to spend a lot of time and effort producing
more formalized testing processes for something that has a very low
probability of ever being used in a production environment?

TDD is faster and easier than debugger-oriented programming.
 
