Regression testing for pointers


Ian Collins

Hi Nick,



Doing so means you have now determined the *implementation*
as well as the functionality.

If you want to split out "sub-functions", do so. Then,
formally specify how they must work. And, formally
specify a test suite for them. Thereafter, the implementation
of the allocator must preserve all of these interfaces.

No, you do not. The interface of the allocator is formally specified,
not its internals.
The whole point of regression testing is to formalize the
test cases and ensure changes to a function preserve its
interface semantics and functionality.

Quite. If you make a change to the internals that breaks the published
interface, the published interface test will fail. You could have
several sets of internals for different environments and be free to
substitute one for another without breaking the published interface.
 

Guest

Doing so means you have now determined the *implementation*
as well as the functionality.

no. You've tested the implementation as well as the functionality.
If you want to split out "sub-functions", do so. Then,
formally specify how they must work. And, formally
specify a test suite for them. Thereafter, the implementation
of the allocator must preserve all of these interfaces.

no. A particular implementation has particular internal tests.
Perhaps you provide a call internalDataValid(); the high level test
need never know what the internal details of that are.
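
A minimal sketch of how that might look (the toy bump allocator,
my_alloc() and the single invariant checked here are purely
illustrative, not anything from this thread):

#include <assert.h>
#include <stddef.h>
#include <stdio.h>

static unsigned char pool[1024];
static size_t used;                  /* bytes handed out so far */

void *my_alloc(size_t n)             /* published interface */
{
    if (used + n > sizeof pool)
        return NULL;
    void *p = pool + used;
    used += n;
    return p;
}

int internalDataValid(void)          /* implementation-specific self-check */
{
    return used <= sizeof pool;      /* the only invariant this version has */
}

int main(void)
{
    assert(internalDataValid());     /* Step 1: internal self-check */
    assert(my_alloc(128) != NULL);   /* Step 2: first functional test */
    assert(internalDataValid());     /* internals still sane afterwards */
    puts("allocator tests passed");
    return 0;
}

A different implementation would keep internalDataValid() but check its
own, different invariants; the test sequence above would not need to change.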
The whole point of regression testing is to formalize the
test cases and ensure changes to a function preserve its
interface semantics and functionality.

that's your definition. To me regression tests are to test you
haven't broken something with your change. A purely functional
test might not check all the boundaries and special cases.
I write a CORDIC sqrt. I peek inside and notice that iteration
X for an input of FOO yields -- and *should* yield -- an
approximation having the value BAZ.

You rewrite the sqrt using Newton-Raphson. Suddenly your test
fails because it converges on the result differently.

so testing floating point is difficult.
Or, maybe my peek-inside test looks at some parameter of
the CORDIC implementation that simply doesn't EXIST in
the Newton-Raphson version.

You can *test* your code by single stepping through its
execution or "whatever". But *formalizing* that portion
of the test process is folly.

I'm not sure sqrt() is a really good example. I never suggested
stepping-through-the-code was a sensible validation step.
An air traffic control system will have *numerous* layers
of FORMAL SPECIFICATION below "system test/specification".
All of those lower "interfaces" are tested. But, they
are also points of inflexibility. Any change to any of them
can result in changes to the entire *system* above and below
it in the abstraction hierarchy.

"Let's make malloc(3) return a pointer to a struct that contains
a pointer to the allocated memory and the size of that memory".

Suddenly, everything that uses malloc changes.

quite. You've changed the public interface.
(let's make
sqrt rely on successive iterations of a CORDIC algorithm;
you are free to change the implementation of the CORDIC
algorithm -- but, you have to ensure that the results it
produces agree with those that are being tested for)
why?

If you FORMALIZE test suites then you formalize interfaces.

why? Why can't you write internal tests? You can call them "formal" if you like. Ok.

Step 1 "run internalTest() verify that it returns true"
Step 2 "run first functional test..."
If that's what you want or need, then do so FORMALLY. Make
sure there is a formal document that contractually specifies
how the interface is expected to function so that:
- implementors know how to implement the function described
- consumers know how to *use* that function
- validation suites know how to exercise that function

but why does that preclude any other form of testing?
If, OTOH, you don't want to go to all this trouble and tie
your future hands, implement whatever ad hoc testing YOU
want and keep it on a scrap of paper in your desk drawer
since it is not a formal part of the specification for the
function/module you are testing!

I find this very strange. Why can't I run internal tests that
verify correct internal behaviour? These tests are not "ad hoc",
they are not kept in the bottom left hand drawer. They can be reviewed,
configuration controlled, and results can be recorded.

It vastly increases confidence in the product over "just run the SAT".
I've *seen* systems that passed their SAT and then exhibited field bugs
that good module testing would have detected. It also makes debugging
easier...
And each of those *components* is formally specified, tested
and "vouched for" (by the manufacturer).

yes, but not by the customer of the bridge. He probably writes in the
contract "shall be constructed in accordance with best industry
practice" or some such boiler plate. The contract will then specify
its load capacity and how many days a year it will close due to high
winds, etc. They don't actually care what grade of steel is used as
long as someone ensures that the right one was used.
So a *lawyer* and
set of "expert witnesses" can testify that the reason the
bridge failed was because ACME Asphalt sold defective road
coating to the builder which didn't meet the specifications
that the supplier FORMALLY AND LEGALLY AGREED TO in his
CONTRACT with the builder.

You can't go and complain that ACME used *pink* asphalt (while
you EXPECTED black) -- unless it fails to meet some specified
criteria in that contract. Or, that ACME heated the raw material
using a wood fire instead of a gas oven in preparing the road
covering (unless the specification says "shall be heated in
a natural gas fired oven")

ACME has liberties as to how it approaches each specification
to which it has *contractually* agreed to be bound.


As I said, *you* can look at individual BITS in the CPU, memory,
etc. while *you* are testing it. But, the function formally
only has ONE interface that it must contractually satisfy.


we seem to have achieved stalemate by repetition. Though "sanity is not statistical", I will point out there seem to be two of us who don't agree with you.

Happy programming!
 

Joe keane

You need to learn to express yourself in a more cultured manner. Not
everyone who disagrees with you is talking crap, and even if they are,
often it's better not to say it.

'if you don't have anything nice to say'

then you're a jerk
 

Don Y

Hi Ian,

No, you do not. The interface of the allocator is formally specified,
not its internals.

Great! What is your internal test SUPPOSED TO DO when
given ___________? How do you *know* what it is supposed to
do in that case? What happens when your boss wants to
throw some manpower at your task to improve your chances of
meeting a deadline and he decides splitting off the development
of the test suite for <whatever> should be a nice, easily
compartmentalized task. The guy who is writing the test suite
for you wants to know what he should *expect* your code to do
given ___________.

Once you *have* specified this *internal* interface, any
changes you make to *that* interface require the rewriting
and revalidation of the layers *above* and *below*. I.e.,
the parts of your <top_level_function> that rely on that
<lower_level_implementation> now have to change. And, that
<lower_level_implementation> also has to change to agree
with the new interface definition.

Quite. If you make a change to the internals that breaks the published
interface, the published interface test will fail. You could have
several sets of internals for different environments and be free to
substitute one for another without breaking the published interface.

While you are away at lunch, I replace the code for function()
with a new implementation. It adheres to the PUBLISHED interface
for function(). But, fails ALL of your "internal tests" because
it is implemented in an entirely different manner (that is NOT
mandated by the published specification).

I haven't done anything wrong. But, my code fails your
tests -- because your tests are looking at undocumented
aspects of *an* implementation.

I rewrite sqrt(): the new RUN-TIME version causes a
packet to be sent across the internet to an accountant
sitting in a smoke-filled office. The value for which
you are seeking the sqrt appears on his display. He
takes out his pencil and a clean ream of paper and
gets to work on the problem. When he is done, some
time later, he types the answer in on his keyboard and
a packet is returned to the WAITING application. The
sqrt() code then converts this ASCII decimal string
into a suitable "double" and returns the value to the
caller.

There is *nothing* about this implementation that
violates the letter of the sqrt() interface specification.
Granted, it's dog slow and only works on machines with
internet connectivity. But, it still *works*. It
passes the regression tests for sqrt().

But, *doesn't* pass any of your undocumented internal
tests -- because you probably wouldn't have conceived of
such an approach.

[If this is too far-fetched, it shouldn't be hard for you
to imagine two or three *radically* incompatible means for
computing a sqrt that would satisfy the sqrt()'s published
interface yet share no "internals" in common with each
other. So, any FORMALIZATION of any internal interface
(as required to document the expected behavior for a test
suite that exercises that internal interface) will dictate
which of those implementations are "acceptable" to you.]
 

Ian Collins

Hi Ian,



Great! What is your internal test SUPPOSED TO DO when
given ___________? How do you *know* what it is supposed to
do in that case?

As I said a while back, most platforms have a number of interchangeable
allocators optimised for various applications. Their internals may be
completely different, but they still conform to the standard
allocator requirements.
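
For illustration only, a conformance test written purely against the
published malloc()/free() contract looks something like this; nothing
in it depends on which allocator happens to be linked in:

#include <assert.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    /* malloc(0) may return NULL or a unique pointer; both conform. */
    void *z = malloc(0);
    free(z);

    /* A successful allocation must be usable for the requested size. */
    char *p = malloc(64);
    assert(p != NULL);
    memset(p, 0xAA, 64);
    free(p);

    /* free(NULL) is required to be a no-op. */
    free(NULL);
    return 0;
}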
What happens when your boss wants to
throw some manpower at your task to improve your chances of
meeting a deadline and he decides splitting off the development
of the test suite for <whatever> should be a nice, easily
compartmentalized task. The guy who is writing the test suite
for you wants to know what he should *expect* your code to do
given ___________.

We don't work that way. In the TDD world, the bloke (or pair) writing
the tests is the bloke (or pair) writing the code.
Once you *have* specified this *internal* interface, any
changes you make to *that* interface require the rewriting
and revalidation of the layers *above* and *below*. I.e.,
the parts of your <top_level_function> that rely on that
<lower_level_implementation> now have to change. And, that
<lower_level_implementation> also has to change to agree
with the new interface definition.

That argument could go round in circles for ever.
While you are away at lunch, I replace the code for function()
with a new implementation. It adheres to the PUBLISHED interface
for function(). But, fails ALL of your "internal tests" because
it is implemented in an entirely different manner (that is NOT
mandated by the published specification).

It couldn't. You didn't change the internal code, you replaced it with
some new code, presumably with its own tests. In other words, you
produced another alternative implementation.
I haven't done anything wrong. But, my code fails your
tests -- because your tests are looking at undocumented
aspects of *an* implementation.
Nope.

I rewrite sqrt(): the new RUN-TIME version causes a
packet to be sent across the internet to an accountant
sitting in a smoke-filled office. The value for which
you are seeking the sqrt appears on his display. He
takes out his pencil and a clean ream of paper and
gets to work on the problem. When he is done, some
time later, he types the answer in on his keyboard and
a packet is returned to the WAITING application. The
sqrt() code then converts this ASCII decimal string
into a suitable "double" and returns the value to the
caller.

There is *nothing* about this implementation that
violates the letter of the sqrt() interface specification.
Granted, it's dog slow and only works on machines with
internet connectivity. But, it still *works*. It
passes the regression tests for sqrt().

But, *doesn't* pass any of your undocumented internal
tests -- because you probably wouldn't have conceived of
such an approach.

While slightly exaggerated, that isn't an uncommon situation. For
example, an application may happily be working away processing data from
a local store. At some point that data is moved to another system and
is accessed through a web service. So the persistence layer gets
replaced. The application still passes all its regression tests because
they don't care how the application outputs 2 when 4 is input, they just
care that it does.
 

Don Y

Hi Nick,
no. You've tested the implementation as well as the functionality.

If you peek inside AN IMPLEMENTATION and test expecting
some internal aspect of the implementation to comply with
some expectations under <some_set> of conditions, then
you have defined a new interface.

You have to formalize that interface -- otherwise, you can't
KNOW what to expect in any given situation. You can't *know* what
constitutes "correct" results at that internal level.
no. A particular implementation has particular internal tests.
Perhaps you provide a call internalDataValid(); the high level test
need never know what the internal details of that are.

How is internalDataValid() defined? You have now exposed
another interface. How do I know what that is supposed to
do? How does SOMEONE ELSE looking at the test suite
assure themselves that "internalDataValid()" is being
tested properly?
that's your definition. To me regression tests are to test you
haven't broken something with your change. A purely functional
test might not check all the boundaries and special cases.

Then your functional test sucks!

What happens when your test suite reports:
test string is located at: 0xFFFFFFFC
the first 4 locations of the string are: 'A' 'B' 'C' 'D'
the FUT (strlen()) returns a result of: ______
Do you *know* what the strlen() that you probably use *often*
will do in this case? Will the above test suite SIGSEGV before
it reports any of this??

Or, is your test suite "lame":
Please type a string, followed by CRLF: ABCDEFG
According to strlen(), you typed 7 characters.
Please type a string, followed by CRLF:

Do you have any faith in how it handles strings with, for example,
32767 characters? 32769??
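
A boundary test of the sort being asked about is cheap to write; a
sketch (the 32767/32768/32769 lengths are just the examples above):

#include <assert.h>
#include <stdlib.h>
#include <string.h>

static void check_len(size_t n)
{
    char *s = malloc(n + 1);
    assert(s != NULL);
    memset(s, 'A', n);               /* build an n-character string */
    s[n] = '\0';
    assert(strlen(s) == n);
    free(s);
}

int main(void)
{
    check_len(0);
    check_len(1);
    check_len(32767);
    check_len(32768);
    check_len(32769);
    return 0;
}

It still says nothing about the 0xFFFFFFFC case, which needs a
platform-specific harness to place the string at the very end of the
address space.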
so testing floating point is difficult.

No. CORDIC converges on the result differently than a FP
N-R approximation.

Imagine I have an infinitely large table in which all of the values
of sqrt() have been precomputed so sqrt just does a "lookup"
operation. Obviously, testing its internals would be very
different from testing one that performs a successive approximation.
I'm not sure sqrt() is a really good example. I never suggested
stepping-through-the-code was a sensible validation step.


quite. You've changed the public interface.

Exactly. Now, imagine changing some *internal* interface of
malloc that is not visible to the user. YOUR test suite examines
this internal interface. But, it's changed. Suddenly, your
tests don't work -- even though the rest of the malloc
implementation has been "fixed" so that it still satisfies
ITS public interface.

Because YOU want to test this internal interface. You've
documented it so that you can convince a validation
engineer that your test suite for that INTERNAL interface
functions properly. Even though I have just changed
the implementation of the function itself (while preserving
*its* public interface SO IT STILL PASSES *ITS* TEST SUITE)
why? Why can't you write internal tests? You can call them "formal" if you like. Ok.

Step 1 "run internalTest() verify that it returns true"
Step 2 "run first functional test..."

What is "internalTest()" for strlen()? for malloc()? for sqrt()?
A function with *two* entry points??
but why does that preclude any other form of testing?

Testing is the process of ensuring an implementation meets
its published goals/criteria. Anything you want to test
has to be formally specified -- else how do we agree that
your test really *does* validate proper operation?
I find this very strange. Why can't I run internal tests that
verify correct internal behaviour? These tests are not "ad hoc",
they are not kept in the bottom left hand drawer. They can be reviewed,
configuration controlled, and results can be recorded.

Are you REQUIRING the function to pass those tests?
If YOU quit and I am assigned the job of finishing your
work, do *I* have to use the same implementation that
you have chosen? Does the code I implement have to pass
those *internal* tests that you settled on? How do I
know what they should yield? *Why* should they yield those
results? What makes your implementation better than mine?
If I pass a *thorough* set of tests for the published
formal interface, why should I also have to pass your
particular set of internal tests?

I saw an implementation of production software that determined
the size of a file by reading the file and *counting* the
individual bytes. Sure, the result was correct. But a foolish
implementation.

How would you test this function's internals -- look at the
count at random points in time and verify that they were
constantly increasing? Have the loop emit the current
value of the counter periodically (e.g., every 1000 bytes)?

What happens if I rewrite the function and just use fstat()
and return the st_size member of the result? Your test
(for a 4567 byte file) would expect to see:
1000
2000
3000
4000
4567
for example. *My* implementation would simply produce the
final "4567" result. It has *failed* your "internal test"
because the implementation didn't rely on an iterative
counter approach.

Yet, it *passes* the test for the function itself!
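
A sketch of the contrast being drawn, assuming a POSIX stat(); the
function names are made up, and the published contract is simply
"return the size of the named file, or -1 on error":

#include <stdio.h>
#include <sys/stat.h>

/* Implementation A: read and count every byte (the "foolish" version). */
long file_size_by_counting(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    long count = 0;
    while (fgetc(f) != EOF)
        count++;             /* a peek-inside test might watch this counter */
    fclose(f);
    return count;
}

/* Implementation B: ask the filesystem. No counter exists to watch. */
long file_size_by_stat(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    return (long)st.st_size;
}

int main(int argc, char **argv)
{
    if (argc > 1)
        printf("%ld %ld\n", file_size_by_counting(argv[1]),
                            file_size_by_stat(argv[1]));
    return 0;
}

A test written only against the published contract passes both; a test
that expects to see the running 1000/2000/3000... counter passes only
the first.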
It vastly increases confidence in the product over "just run the SAT".
I've *seen* systems that passed their SAT and then exhibited field bugs
that good module testing would have detected. It also makes debugging
easier...

If you passed a formal test and failed in the field, then your
formal test sucks. You clearly aren't testing all of the
conditions that your code is *supposed* to be able to tolerate.

Debugging and testing DURING DEVELOPMENT are different issues.
That's the "desk drawer" comment I made. You can't *force*
someone else (i.e., The Organization) to embrace your
particular implementation without formally nailing down a
contract for that interface that you are exposing.
yes, but not by the customer of the bridge. He probably writes in the
contract "shall be constructed in accordance with best industry
practice" or some such boiler plate. The contract will then specify
its load capacity and how many days a year it will close due to high
winds, etc. They don't actually care what grade of steel is used as
long as someone ensures that the right one was used.

----------^^^^^^^^^^^^^^^^^^^^^^^^^AAAAAAAAA^^^^^^^^^^

A contract, *somewhere* specifies what that "right one" is
and who that "someone" will be. When the bridge fails,
you can bet these items will be readily identifiable (as
the lawyers sort out who to go after for the losses).

You can create whatever sorts of contracts you (and your
organization) want. You can hide whatever you want, too.
But, everything that you expose *for* testing ties down
another corner of HOW you implement something -- as well
as how you can reimplement it in the future.

E.g., the is___() character functions are often implemented
as table lookups. The argument to the function is used
to access a const table whose contents represent the
"classifications" for the "character" corresponding to the
argument.

So, table['c'] encodes information that tells me that 'c'
is alphabetic, NOT numeric, NOT whitespace, NOT punctuation,
graphic, NOT a control character, etc.

*BUT*, table[] and its contents are not formally exposed
in the specifications for any of these functions! E.g.,
I can reimplement isdigit() to NOT use that table[] but,
instead, examine the argument directly, algebraically.

OTOH, if you expose table[] and try to test it as an
"internal test", you now have to formally define it
*and* mandate its role in the is___() functions.
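
A sketch of those two implementation styles side by side (with made-up
names so as not to collide with <ctype.h>); a test written against the
published behaviour passes both, while a test that pokes at the table
only fits the first:

#include <assert.h>

#define IS_DIGIT 0x01                /* classification bit, illustration only */

static const unsigned char classify_table[256] = {
    ['0'] = IS_DIGIT, ['1'] = IS_DIGIT, ['2'] = IS_DIGIT, ['3'] = IS_DIGIT,
    ['4'] = IS_DIGIT, ['5'] = IS_DIGIT, ['6'] = IS_DIGIT, ['7'] = IS_DIGIT,
    ['8'] = IS_DIGIT, ['9'] = IS_DIGIT,
};

/* Implementation A: table lookup. */
static int my_isdigit_table(int c)
{
    return classify_table[(unsigned char)c] & IS_DIGIT;
}

/* Implementation B: examine the argument directly. No table involved. */
static int my_isdigit_direct(int c)
{
    return c >= '0' && c <= '9';
}

int main(void)
{
    for (int c = 0; c < 256; c++)    /* the published-interface check */
        assert(!!my_isdigit_table(c) == !!my_isdigit_direct(c));
    return 0;
}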
 

Ian Collins

Hi Nick,


If you peek inside AN IMPLEMENTATION and test expecting
some internal aspect of the implementation to comply with
some expectations under <some_set> of conditions, then
you have defined a new interface.

You have to formalize that interface -- otherwise, you can't
KNOW what to expect in any given situation. You can't *know* what
constitutes "correct" results at that internal level.

The tests formalise the interface.
 
