Regression testing for pointers


Don Y

Hi Ian,

This appears to be the flaw in your understanding of unit tests. You can
nail down actual addresses.

No. You can only nail them down in specific environments.
I code on bare iron. For me, I can put heap *anywhere*.
Indeed, I have to be able to do this else the operating
system won't be able to use memory as and where it wants to!

But, someone else operating on a different platform finds
my tests don't fit that platform. E.g., I put the heap at
0x00001000 and that happens to be right in the middle
of your interrupt vector table. Or, there is a bit of
hardware located there. Or your device only has a 16b
address space. Or...

Now *you* have to understand the intricacies of my routine
just to create (modify) a test suite that is equally
thorough. You have to understand the alignment constraints
that your compiler imposes. You have to know that the
portion of my test suite that addresses with very fine
grained (e.g., single byte) alignment won't port to your
environment (yet, I shouldn't omit it since there are
environments where this *is* permissible and it should
be verified).

I just don't see how this can be done, reliably -- short
of me writing code that actively reconfigures the tests
that it applies based on *its* observations of the
execution environment.
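
E.g., a minimal sketch of that self-configuring flavor -- createHeap() and myMalloc() are stand-ins for the real entry points, and _Alignof/max_align_t assume a C11 toolchain:

#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed entry points -- stand-ins for whatever the allocator
   under test actually exports. */
extern void createHeap(void *base, size_t size);
extern void *myMalloc(size_t size);

#define HEAP_SIZE 0x4000u

/* Let the toolchain place the heap wherever it likes; the test
   only notes where it landed. */
static uint8_t testHeap[HEAP_SIZE];

/* The strictest alignment this platform demands (C11). */
#define MIN_ALIGN _Alignof(max_align_t)

static void test_one_byte_request(void)
{
    uint8_t *p;

    createHeap(testHeap, HEAP_SIZE);
    p = myMalloc(1);

    assert(p != NULL);
    assert(((uintptr_t)p % MIN_ALIGN) == 0);  /* observed, not hard-coded */
    assert(p >= testHeap && p < testHeap + HEAP_SIZE);
}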
 

Don Y

Hi Nick,

On 3/11/2012 6:09 AM, (e-mail address removed) wrote:
[...] Yet that seems to be the most common approach to testing
nowadays (how many folks have formal specifications, test suites,
validation procedures, etc.).

There's a system test spec and we have to comply with an international standard. This is verified by an external 3rd party.

Most folks operate with "none of the above". How can you know
if you've written what you were *supposed* to write if you
don't have anything DEFINING it and PROVING your compliance??
I worked in a system test department for a while. I collected a
list of best excuses "that hardly ever happens", "the customer
would never do that" and my all time favourite "but the code
says that's what happens!"

When you're moving a few tons of material and "fail to stop"
when expected, people's eyebrows raise! :> Or, if you
extract money from a machine and there is no record of
that event.

OTOH, the fact that your dialog box didn't get redrawn on an
expose event tends to get dismissed as unimportant. ("Um,
if *that* isn't happening as planned/desired/DESIGNED, what
ELSE isn't??")
Why not? Because they don't terminate?

Because they never run long enough without CRASHING! :>
I was talking about actual proofs of correctness. VDM, Z, SPARK.
Are these widely used?

I've never found a case where you could apply such a technique
"across the board". Rather, you want people who can imagine
the "impossible" and codify those conditions -- in their code
and in their test suites.

E.g., if I am driving a motorized mechanism "to the left",
I don't *just* watch the "left limit switch" and use that
to stop further motion. What if the motor is wired
*backwards* and I am really moving the mechanism to the
*right*? What if the limit switches are mixed up and
left is wired to right, etc.

If I am moving left, the RIGHT limit switch should NEVER
be actuated (ignore boundary case where the mechanism is
currently *at* that limit). So, it should be perfectly
reasonable for me to watch the LEFT limit switch to
stop normal/EXPECTED motion... AND if I see activity on
the RIGHT limit switch, I can panic(), stop the mechanism,
etc.
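
A minimal sketch of that cross-check (every name here -- readLimit, startMotor, panic -- is hypothetical, not any real motion API):

/* Sketch only: watch the EXPECTED switch to stop, and treat any
   activity on the opposite switch as evidence of miswiring. */
enum limit { LEFT, RIGHT };

extern int  readLimit(enum limit which);
extern void startMotor(int direction);
extern void stopMotor(void);
extern void panic(const char *why);

#define DIR_LEFT (-1)

void move_left(void)
{
    startMotor(DIR_LEFT);

    for (;;) {
        if (readLimit(LEFT))     /* the EXPECTED terminating condition  */
            break;
        if (readLimit(RIGHT))    /* should NEVER trip while moving left */
            panic("motor or limit switches miswired");
    }
    stopMotor();
}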

In reality, this sort of "miswiring" *does* happen.
Especially on prototypes. In that case, the cost of
ruining a mechanism can be EXTREMELY painful (we don't
have another one!).

But, it takes a perverse sort of mind to think about these
sorts of things. And, teaching others to think the same
way is difficult. I resort to having folks sit down and
list the ASSUMPTIONS they are making in any aspect of
design. Then, ask yourself which of those are *truly*
safe!

E.g., if I have a "string" located at 0xFF...F8 that has
8 non-NUL characters in it, what should strlen() return?
Where it *doesn't* happen is smaller projects, "lone wolf"
developers, "immature" organizations, etc. "Pay me now or
pay me later" is a great mantra when it comes to software
quality and testing, in general.
The writers of the Space Shuttle software might have achieved [] what
you're trying to do. At enormous cost.

There are many industries where formal testing and validation
are the norm and "required" practices.

yes. I know. The Space Shuttle software was a step above this. CMM level 5 and all that.

Formal testing is not the same as formal methods

But formal testing WITHOUT formal methods is like buying food
that "Bob" said was prepared healthily.

(Great! Who's Bob??)

If Bob is a highly respected individual KNOWN (renowned!) for
his dedication to healthy food, that might be a great
endorsement. OTOH, if Bob is some anonymous guy...

Formal methods tries to take some of the variability out of
individual "preferences"/diligence.
 

Ian Collins

Hi Ian,



But those criteria might not test the full capabilities of
myMalloc. E.g., does it behave when the heap size is increased
to 0x6000? Does it behave if I move the heap to 0x0000? Does
it work for "big" requests -- e.g., a megabyte? (it *should*
on TARGET_Y but that wouldn't make sense on TARGET_X... two
different test suites?)

This is a demonstration, not a full test suite. I've been designing
allocators for 20 odd years and none of them have worked differently in
different locations.
You are relying on being able to enumerate all of the test
conditions for each potential platform.

No I'm not, that's just an example of allowing either constant or
variable parameters. The allocator code should be position independent.
But you can't write
that test suite "today" without knowledge of what they will be
(e.g., 1MB request).

Why not? Just add maxRequest as a constant/variable to the mix.
The problem is inherent in the fact that this is an area
(pointers, size_t) where the language gives the implementation
lots of flexibility. OTOH, you *know* that a double will
be good for ~10+ digits...

Flexibility that should not change the relative behaviour of the code.
Should be null for "too big" request.

So add a test.
Should be "in heap" for *all* requests.

So add a test.
Should be correctly aligned for *all* requests.

So add a test.
The point of my "should be's" is that you can pin the
function's return values down much tighter than that. And,
you *need* to be able to do so in order to assure yourself
that it truly is operating properly.

If I run the returnIsInHeapForOneByteRequest 0x1000 times,
will it always pass (without an intervening free())?

So add a test.
Shouldn't it fail at the 0x1000/minAlign point, instead?
Am I sure that no two of the returned pointers EVER coincide?

So add a test.
E.g., you wouldn't write:

ASSERT(sqrt(3) > 1)
ASSERT(sqrt(3) < 2)
ASSERT(sqrt(3) > sqrt(2))
ASSERT(sqrt(3) < sqrt(4))

*All* of these are appropriate. But, what you really want
to know is

ASSERT(sqrt(3) =~ 1.732...)

This embodies *all* of the above tests and gives the
function far less "wiggle room".
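
(In real C, that "=~" has to be an explicit tolerance check -- sqrt(3) is irrational, so the library's result is necessarily an approximation and a bare == is hopeless. A sketch, with the epsilon standing in for the library's documented precision:)

#include <assert.h>
#include <math.h>

/* The tolerance should come from the math library's documented
   precision; the value used below is only illustrative. */
#define ASSERT_NEAR(x, y, eps) assert(fabs((x) - (y)) <= (eps))

void test_sqrt3(void)
{
    ASSERT_NEAR(sqrt(3.0), 1.7320508, 0.5e-6);
}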

(malloc -- and your myMalloc -- are too loosely defined for
robust formal testing.)

It's only a demonstration to address some of your questions of a work in
progress.
 

Ian Collins

Hi Ian,



No. You can only nail them down in specific environments.
I code on bare iron. For me, I can put heap *anywhere*.
Indeed, I have to be able to do this else the operating
system won't be able to use memory as and where it wants to!

But, someone else operating on a different platform finds
my tests don't fit that platform. E.g., I put the heap at
0x00001000 and that happens to be right in the middle
of your interrupt vector table. Or, there is a bit of
hardware located there. Or your device only has a 16b
address space. Or...

So you write your code to be position independent. It isn't rocket science.
Now *you* have to understand the intricacies of my routine
just to create (modify) a test suite that is equally
thorough. You have to understand the alignment constraints
that your compiler imposes. You have to know that the
portion of my test suite that addresses with very fine
grained (e.g., single byte) alignment won't port to your
environment (yet, I shouldn't omit it since there are
environments where this *is* permissible and it should
be verified).

Again, write your code to be alignment independent.
 

Don Y

Hi Ian,

So add a test.
So add a test.
So add a test.
So add a test.
So add a test.

So I -- and anyone modifying the test suite -- should be diligent
enough to verify all of these issues INSTEAD OF something like:
ASSERT(result == 0x80032440)
which performs *all* of these "additional tests". If result is
*not* 0x80032440 then one of the other enumerated issues was
not satisfied!

If you were testing strchr(3), would you do:
template = "ABCDEABCFF
sought = 'C'
result = strchr(template, sought)
ASSERT(result >= &template[0])
ASSERT(result < &template[strlen(template)])
ASSERT(*result == sought)
ASSERT(result[1] == sought+1)

Or, would you do:
ASSERT(result == &template[2])
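
A compilable rendering of that exact-value style, for the record:

#include <assert.h>
#include <string.h>

void test_strchr(void)
{
    const char template[] = "ABCDEABCFF";
    const char sought = 'C';
    const char *result = strchr(template, sought);

    /* One exact-value check subsumes all four of the range/content
       checks sketched above. */
    assert(result == &template[2]);
}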
 

Don Y

Hi Ian,

So you write your code to be position independent. It isn't rocket science.

We're talking about the TEST SUITE! The code already *is* PIC!

I write the test suite. It picks an address for "heap".
Works fine in *my* environment -- because I know I have RAM
there, etc.

*You* want to run the code on a different processor in
a different environment. Gee, you *can't* put the heap
in that place because it's not "general purpose memory".

Now *you* have to modify the test suite and hopefully
not *break* it in the process!

You move the heap to 0xFFFFF000 -- since you have memory
there - and suddenly discover that the test suite's attempt
to allocate a 4K block of memory fails -- when it *shouldn't*
(there's nothing wrong with the code... the BUG is in your
test suite modification!)
 

Don Y

Hi Jorgen,
I just gave some rules for valid and invalid results above. Not *all*
rules, of course! You surely realize that it's not possible to prove
the correctness of a malloc() implementation this way.

Note that I'm not saying I would be satisfied with a test which only
uses the three rules above. It just seemed to me you believed there
was nothing you could do to test such a function, without a sort of
debugging interface into its internals.

Understood. The point I was making was that it's not the same
sort of validation exercise as testing, e.g., sqrt() might be.

In a given/fixed/known environment, it is relatively easy to
implement lots of tests "safely". The problem I am facing
here is wanting to make a "portable" test suite -- something
that is easily done with other types of functions but not
so readily with things that fiddle this close to the iron.
I don't see the relation between what you write and what I wrote ...
but in the case of free() the specification allows anything to happen,
so in some sense you don't need to test this.

You don't need to test it for the "free()" in the standard
libraries. Just let the OS throw a SIGSEGV when you do
something improper. :> (not my style!)

In my case, I need to ensure that myFree() catches conditions
that are indicative of errors in its implementation *or* in
its *use*. Does this refer to memory managed in *this* heap?
Is the referenced piece of memory known (by the memory
management system) to be "in use" -- and thus a legitimate
target of free()? Can the pointer legitimately reference a
real block of memory? (imagine passing 0xFFFFFFFF to free())
etc.

E.g., if a free-d pointer already *appears* to be contained
in the free list (possibly *within* a free memory fragment),
is that an indication of an error on the programmer's part?
(free-ing something that was already free-d?) Or, is it a
bug in the free() implementation (thinking something is free
that really *isn't*!)
 

Don Y

Hi Gordon,

Most functions that manipulate pointers can be tested by what it
puts *IN* the memory pointed at by pointers. For example, one test
of the non-standard function strdup() can be done by comparing the
string and the copy. A test of fgets() could test that the buffer
contents reflects the contents of the test file being read, and
that fgets() doesn't write too far. A test of a parser that builds
a complicated tree of nodes out of source code input can look at
the contents of the parse tree and perhaps check if all nodes
allocated are either freed or still reachable. The exact position
of the nodes is not important, as long as they don't overlap.
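
E.g., a minimal sketch of that strdup() suggestion (strdup is POSIX, not ISO C):

#include <assert.h>
#include <stdlib.h>
#include <string.h>

void test_strdup(void)
{
    const char *original = "hello, world";
    char *copy = strdup(original);       /* POSIX, not ISO C */

    assert(copy != NULL);
    assert(copy != original);            /* a distinct allocation...   */
    assert(strcmp(copy, original) == 0); /* ...with identical contents */
    free(copy);
}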

Correct. But when the "product" in question *is* the pointer,
the issue changes!
malloc() and free() are exceptions to this. You could check that
newly-allocated memory is filled with 0xdeadbeef and freed memory
is filled with 0xfreebee1 (and in the process, perhaps figure out
if it is rounding up your request size properly) if your malloc()
has that debug feature.

---^^^^^^^^^^^^^^^^^

Note I am not debugging "malloc(3)" but, rather, allocateMemory()
(there is no malloc in the system -- because malloc has poor
semantics)
Do you want to open the black box or not?

If so, you can write things like a malloc() arena consistency checker
(if it doesn't come with one already) to ensure that the arena is
consistent, and call it before and after every call to malloc().
(You can verify *that* by scribbling beyond the end of memory you
allocated and determining that the consistency check fails, or
deliberately introducing errors into malloc() or free().) You
may also be able to take snapshots of the free list before and after
and determine that it is obeying any policy options in effect for
where to allocate memory.

If I *know* the execution environment, I can get all of this
information just from looking at the returned pointers from
the allocator. No need to peek inside at the free list, etc.

My allocator gives the developer precise control over how it
operates. How it selects the "right" fragment from the free
list, how it decides which *part* of that fragment to use
for the request, what it does with the balance of the
fragment, etc.

So, starting with a full heap (as a single "fragment"), I
can predict exactly what the return value of each allocation
request will be (as long as you let me *see* each such request
and releaseMemory() invocations). Furthermore, I can predict
how "expensive" (time) each such request will be.

E.g., (pseudocode -- ignore errors)

#if BUFSIZE < MINFRAGSIZE
# error ...
#endif

#if BUFFERS < 1
# error ...
#endif

....

createHeap(&theHeap, memory, BUFFERS * BUFSIZE)

// create the individual buffers -- stop *before* dereferencing
// the NULL that signals heap exhaustion
lastptr = NULL
while ((ptr = (list *) allocateMemory(theHeap, BUFSIZE, FIRSTFIT, TRIMTOSIZE)) != NULL) {
    ptr->next = lastptr
    lastptr = ptr
}

// place the buffers back into the heap while retaining their sizes
ptr = lastptr
while (ptr != NULL) {
    lastptr = ptr->next // can't peek inside the heap once released!
    releaseMemory(ptr, RETAINSIZE, ATHEAD)
    ptr = lastptr
}

At this point, I have built a pool of fixed size buffers.
Hereafter,
allocateMemory(theHeap, BUFSIZE, FIRSTFIT, EXACTFIT)
will give me a result in constant time -- regardless of the
number of buffers in the pool (because it just grabs the first
fragment off the free list -- which we KNOW is the exact size
that we want!). Similarly,
releaseMemory(ptr, RETAINSIZE, ATHEAD)
will "free" that buffer (for reuse) also in constant time
(because it never has to traverse the free list to append
the "fragment/buffer" -- nor does it have to try to combine
the fragment with adjoining/contiguous fragments in an attempt
to make larger "free fragments")

Furthermore, I can take a single heap and use it to implement
two different types of memory -- a set of fixed size buffers
on one end and variable size buffers on the other. Or, a set
of buffers of a different fixed size.

It's important that the results of each such invocation be
predictable -- not "ad hoc" like malloc(3) would like you
to believe.
If not, you can check that no two allocations still in force overlap
each other. You can also check that the pointers returned are
properly aligned.

I haven't checked, but if you free() a block of memory of size N
previously allocated, then repeatedly allocate memory blocks of
size N, you should allocate a block that at least partially
overlaps the one you free()d before running out of memory.
[Recall, the location and size of the free store might change
from environment to environment;

The location of the free store might change from run to run of the
*same* executable with the same modification time, due to code that
tries to prevent buffer overflows from being exploitable by randomizing
load locations. There is nothing to prohibit this. On a 64-bit
machine, the OS might try to never re-use any address from one
program run to any other address in another program run, for the
lifetime of that CPU.

I'm *below*/alongside the OS. (*I* provide the memory that
the OS uses internally!) Any of these sorts of features *I*
would have to implement -- or, something layered on top of me.
(I write for bare iron)
alignment constraints might
vary, etc.]

E.g., if a test invocation of malloc(100) returns 0x12345678
in one environment, it could just as easily return 0x1234
in *another* (smaller address space). A second, identical test
invocation returning 0x87654321 could likewise be "correct".

About the only thing I can check for is if that second
invocation returned 0x12345699 (which overlaps the previous
allocation at 0x12345678!).
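
A sketch of that overlap check -- record every allocation still in force and compare each new one against the list (the fixed table and linear scan are deliberately naive):

#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LIVE 256

/* Every live allocation, as (base, length). */
static struct { uintptr_t base; size_t len; } live[MAX_LIVE];
static size_t nLive;

void noteAllocation(void *p, size_t len)
{
    uintptr_t b = (uintptr_t)p;
    size_t i;

    for (i = 0; i < nLive; i++) {
        /* disjoint iff the new block ends at/before block i,
           or starts at/after its end */
        assert(b + len <= live[i].base || b >= live[i].base + live[i].len);
    }
    assert(nLive < MAX_LIVE);
    live[nLive].base = b;
    live[nLive].len = len;
    nLive++;
}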

Furthermore, just examining the results from malloc doesn't
tell me that the freelist is being maintained properly.

Not getting back overlapping allocations is a fairly good indication.

For *malloc*, perhaps.

But, if I tell the allocator to select the LAST_FITTING fragment
from the free list, I would expect the pointer returned to be
significantly different than the pointer returned from the
allocation that preceded it using a FIRST_FITTING criteria.

I.e., the allocator's role isn't just to "provide memory"
but to provide the memory per criteria that the caller
supplies.
Some functions are like that. Try writing a test suite for a SQL
server sometime. Especially if you aren't allowed to examine any
of the stuff it has stored on disk. Or try testing a C compiler
as a black box when all the results you get are:

- Compilation failed (with a list of error message).
- Run failed (with a shell error message like "segmentation
violation" along with output produced up to that point)
- Run succeeded (with program output)


As far as I know, there is no rule that says that, given 3
exact-size-match blocks that could satisfy a request, malloc()
cannot choose one *randomly*.

Again, what malloc(3) does and what *my* allocator does are two
different issues, entirely. What they share is the fact that
they "effectively" return void *'s -- i.e., there's nothing
"there".
There are some good security reasons
to do that intentionally, although there are tradeoffs with speed
which prefers to quit looking after you find one match.


Most cases of "pointer manipulation" are not a test of a memory
allocator.

Sure. But that doesn't mean you can't test them!
The problem lies in trying to do so portably -- without
writing a piece of code of comparable complexity to the
allocator itself!
 

Kaz Kylheku

*You* want to run the code on a different processor in
a different environment. Gee, you *can't* put the heap
in that place because it's not "general purpose memory".

Now *you* have to modify the test suite and hopefully
not *break* it in the process!

You move the heap to 0xFFFFF000 -- since you have memory
there - and suddenly discover that the test suite's attempt
to allocate a 4K block of memory fails -- when it *shouldn't*
(there's nothing wrong with the code... the BUG is in your
test suite modification!)

Yes. A regression test suite doesn't catch bugs in the code;
it catches changes of behavior in the test-suite-code combo.

If you fix a bug in the main code, that can cause tests to fail; tests
sometimes confirm that the bugs that were there yesterday are still
there today.
 

Ian Collins

Hi Ian,







So I -- and anyone modifying the test suite -- should be diligent
enough to verify all of these issues INSTEAD OF something like:
ASSERT(result == 0x80032440)
which performs *all* of these "additional tests". If result is
*not* 0x80032440 then one of the other enumerated issues was
not satisfied!

If you have a condition you want to test, test it.
If you were testing strchr(3), would you do:
template = "ABCDEABCFF
sought = 'C'
result = strchr(template, sought)
ASSERT(result>=&template[0])
ASSERT(result< &template[strlen(template)])
ASSERT(*result == sought)
ASSERT(result[1] == sought+1)

Or, would you do:
ASSERT(result ==&template[2])

That doesn't make any sense and bears no resemblance to the questions
you asked and subsequently snipped. You added additional requirements,
I suggested you add tests for them.
 

Ian Collins

Hi Ian,



We're talking about the TEST SUITE! The code already *is* PIC!

As was the example test suite I posted.
I write the test suite. It picks an address for "heap".
Works fine in *my* environment -- because I know I have RAM
there, etc.

No *you* don't, the compiler/linker does. The test suite does not make
any assumptions about the memory heap location.
*You* want to run the code on a different processor in
a different environment. Gee, you *can't* put the heap
in that place because it's not "general purpose memory".

Good, because I didn't try. The compiler/linker on the new system will
do that for me.
Now *you* have to modify the test suite and hopefully
not *break* it in the process!

No, I don't.
You move the heap to 0xFFFFF000 -- since you have memory
there - and suddenly discover that the test suite's attempt
to allocate a 4K block of memory fails -- when it *shouldn't*
(there's nothing wrong with the code... the BUG is in your
test suite modification!)

Nope, show me where I specified the location of the heap (other than
heap = new uint8_t[heapSize]; heapBase = heap;).
 

Ian Collins

Yes. A regression test suite doesn't catch bugs in the code;
it catches changes of behavior in the test-suite-code combo.

If you fix a bug in the main code, that can cause tests to fail; tests
sometimes confirm that the bugs that were there yesterday are still
there today.

Very true. The first step in fixing a bug should be to add a new
failing test to show the bug is there.
 

Don Y

Hi Ian,







So I -- and anyone modifying the test suite -- should be diligent
enough to verify all of these issues INSTEAD OF something like:
ASSERT(result == 0x80032440)
which performs *all* of these "additional tests". If result is
*not* 0x80032440 then one of the other enumerated issues was
not satisfied!

If you have a condition you want to test, test it.
If you were testing strchr(3), would you do:
template = "ABCDEABCFF
sought = 'C'
result = strchr(template, sought)
ASSERT(result>=&template[0])
ASSERT(result< &template[strlen(template)])
ASSERT(*result == sought)
ASSERT(result[1] == sought+1)

Or, would you do:
ASSERT(result ==&template[2])
This embodies *all* of the above tests and gives the
function far less "wiggle room".

That doesn't make any sense and bears no resemblance to the questions
you asked and subsequently snipped.

If your comment applies to the strchr example, that is intended
to show that you can present lots of tests/criteria to *infer*
that the result is correct. *OR*, you could just CHECK THE
RESULT! You *know* what the conditions are when the test
is executed (i.e., you know what the resolution of your math
library is, you know the value(s) of the inputs to the routines
under test) so all you have to do is verify that the *result*
is EXACTLY what you expect it to be. No need to look at
indirect evidence when you have a concrete result that you
can test!
You added additional requirements, I
suggested you add tests for them.

ASSERT(result == 0x80032440)

exactly and unequivocally proves ALL CONDITIONS are met.
Specifically, it states (knowing the conditions in which
the test suite executed) that:
- the request was satisfied (i.e., the heap's condition
at the time was such that the criteria associated with
the request could be met)
- the *correct* fragment in the free list was selected
(because I *knew* what the free list should have contained)
- the correct *portion* of the selected fragment was chosen
to be returned to the caller
- the correct *alignment* was enforced
- the correct size chunk was returned to the caller
- the internal state of the free list remains consistent
(as will be evidenced in subsequent requests)

All of those facts are expressed in that single assertion.
Change any digit in the constant shown and one or more of
these facts are no longer true.

I don't have to, separately, test:
- did the function return non-NULL? OK, then the request
was satisfied
- did the function select the LASTFIT fragment as directed
(how do you know without peeking inside the free list?)
- did the function pick the right *part* of that fragment
to use in satisfying the request (how do you know if
you can't see the actual bounds of the fragment BEFORE
it was selected/manipulated?)
- did the request get properly aligned per the requirements
of the processor and compiler (again, without seeing the
pre-selected fragment, how do you know that any "work"
was done on that selection?)
- did the right amount of "excess" get trimmed off the
fragment prior to being returned
- what are the chances that the new state of the free list
is as it should be? how many more of these questions do
I need to answer in the next N example invocations to
assure myself of that?

*And*, any other criteria embedded in the function now
or in the future are also proven by this.

Change a digit in ASSERT(sqrt(3.0) == 1.732) and you
KNOW the sqrt code isn't working properly.
 

Don Y

Yes. A regression test suite doesn't catch bugs in the code;
it catches changes of behavior in the test-suite-code combo.

Yes. Hence the reason you want the test suite to be "of
impeccable quality". If your test suite becomes to complex,
then you risk introducing apparent -- *or* masking actual --
errors in the code.

If you require others to *modify* your test suite to make *it*
work ("operate") in their environment, then you've added another
potential source of error/unreliability.
If you fix a bug in the main code, that can cause tests to fail; tests
sometimes confirm that the bugs that were there yesterday are still
there today.

Sure! But, more often than not, it tells you that you had
a problem in your test suite. Something that you *thought*
you were testing for but actually weren't.

The beauty of a formal test suite is that you can "easily"
reapply the new test suite to the *old* code base and
determine if the test suite was defective, the code base
or *both*.
 

Ian Collins

Hi Ian,

On 3/12/2012 11:23 AM, Ian Collins wrote:

So add a test.

So add a test.

So add a test.

So add a test.

So add a test.

So I -- and anyone modifying the test suite -- should be diligent
enough to verify all of these issues INSTEAD OF something like:
ASSERT(result == 0x80032440)
which performs *all* of these "additional tests". If result is
*not* 0x80032440 then one of the other enumerated issues was
not satisfied!

If you have a condition you want to test, test it.
If you were testing strchr(3), would you do:
template = "ABCDEABCFF
sought = 'C'
result = strchr(template, sought)
ASSERT(result>=&template[0])
ASSERT(result< &template[strlen(template)])
ASSERT(*result == sought)
ASSERT(result[1] == sought+1)

Or, would you do:
ASSERT(result ==&template[2])

This embodies *all* of the above tests and gives the
function far less "wiggle room".

That doesn't make any sense and bears no resemblance to the questions
you asked and subsequently snipped.

If your comment applies to the strchr example, that is intended
to show that you can present lots of tests/criteria to *infer*
that the result is correct. *OR*, you could just CHECK THE
RESULT! You *know* what the conditions are when the test
is executed (i.e., you know what the resolution of your math
library is, you know the value(s) of the inputs to the routines
under test) so all you have to do is verify that the *result*
is EXACTLY what you expect it to be. No need to look at
indirect evidence when you have a concrete result that you
can test!

Your typical memory allocator is a little more complex and less
deterministic than strchr().
ASSERT(result == 0x80032440)

exactly and unequivocally proves ALL CONDITIONS are met.
Specifically, it states (knowing the conditions in which
the test suite executed) that:
- the request was satisfied (i.e., the heap's condition
at the time was such that the criteria associated with
the request could be met)
- the *correct* fragment in the free list was selected
(because I *knew* what the free list should have contained)
- the correct *portion* of the selected fragment was chosen
to be returned to the caller
- the correct *alignment* was enforced
- the correct size chunk was returned to the caller
- the internal state of the free list remains consistent
(as will be evidenced in subsequent requests)

All of those facts are expressed in that single assertion.
Change any digit in the constant shown and one or more of
these facts are no longer true.

Nowhere in your earlier descriptions was that requirement mentioned. I
thought you were testing a general purpose allocator.

If your requirements are that specific, what's wrong with

ASSERT(result == heapBase+0x32440)?

The base address can still be flexible.
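
E.g., a sketch of that base-relative style, with heapBase captured when the fixture creates the heap (myMalloc and the 0x32440 offset are just the running example, and the assertion presumes the scripted requests that precede it):

#include <assert.h>
#include <stddef.h>
#include <stdint.h>

extern void *myMalloc(size_t size);   /* stand-in for the allocator */

static uint8_t *heapBase;             /* captured at fixture setup  */

void testScriptedRequest(void)
{
    /* ...the scripted sequence of requests preceding this one
       has already run... */
    uint8_t *result = myMalloc(0x100);

    /* The offset is pinned down exactly; only the base floats with
       wherever this platform happened to put the heap. */
    assert(result == heapBase + 0x32440);
}
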
I don't have to, separately, test:
- did the function return non-NULL? OK, then the request
was satisfied
- did the function select the LASTFIT fragment as directed
(how do you know without peeking inside the free list?)
- did the function pick the right *part* of that fragment
to use in satisfying the request (how do you know if
you can't see the actual bounds of the fragment BEFORE
it was selected/manipulated?)
- did the request get properly aligned per the requirements
of the processor and compiler (again, without seeing the
pre-selected fragment, how do you know that any "work"
was done on that selection?)
- did the right amount of "excess" get trimmed off the
fragment prior to being returned
- what are the chances that the new state of the free list
is as it should be? how many more of these questions do
I need to answer in the next N example invocations to
assure myself of that?

*And*, any other criteria embedded in the function now
or in the future are also proven by this.

I would probably test the component functions individually (those
working with slabs) and mock them in the allocator tests. If you were
to build your allocator test first, the design you would end up with
wouldn't require tests that poke around in the internal structures.
 

Don Y

So I -- and anyone modifying the test suite -- should be diligent
enough to verify all of these issues INSTEAD OF something like:
ASSERT(result == 0x80032440)
which performs *all* of these "additional tests". If result is
*not* 0x80032440 then one of the other enumerated issues was
not satisfied!

If you have a condition you want to test, test it.

If you were testing strchr(3), would you do:
template = "ABCDEABCFF
sought = 'C'
result = strchr(template, sought)
ASSERT(result>=&template[0])
ASSERT(result< &template[strlen(template)])
ASSERT(*result == sought)
ASSERT(result[1] == sought+1)

Or, would you do:
ASSERT(result ==&template[2])

This embodies *all* of the above tests and gives the
function far less "wiggle room".

That doesn't make any sense and bears no resemblance to the questions
you asked and subsequently snipped.

If your comment applies to the strchr example, that is intended
to show that you can present lots of tests/criteria to *infer*
that the result is correct. *OR*, you could just CHECK THE
RESULT! You *know* what the conditions are when the test
is executed (i.e., you know what the resolution of your math
library is, you know the value(s) of the inputs to the routines
under test) so all you have to do is verify that the *result*
is EXACTLY what you expect it to be. No need to look at
indirect evidence when you have a concrete result that you
can test!

Your typical memory allocator is a little more complex and less
deterministic than strchr().

But it still produces *a* result. It is a deterministic
algorithm (even if the specification -- like malloc -- doesn't
make that determinism explicit). With a given set of operating
conditions and inputs, you **KNOW** what the result will be.
TEST FOR THAT RESULT!
Nowhere in your earlier descriptions was that requirement mentioned.

Because I didn't consider it germane to the concept of
"testing pointer values". The specification is 15 pages.
I thought you were testing a general purpose allocator.

If your requirements are that specific, what's wrong with

ASSERT(result == heapBase+0x32440)?

What happens if heapBase is effectively (char *) -32400?
What happens if the processor only has a 16b address space?
What happens if that +32440 was a consequence of pointer
alignment requirements on THIS PLATFORM?
What happens if there isn't physical memory present at
that location?
The base address can still be flexible.


I would probably test the component functions individually (those
working with slabs) and mock them in the allocator tests. If you were to
build your allocator test first, the design you would end up with
wouldn't require tests that poke around in the internal structures.

The approach I have taken (to testing on *one particular* platform)
eliminates the need for "peeking inside".

If I know where (on this particular platform) the heap is located
and how large it is and alignment constraints, etc. AND have just
initialized the heap... THEN:
- I know what the internal free list looks like, "conceptually"
(I might not know specific bit patterns but I know what the
free list *reflects* -- a heap that is completely FULL!)
- given that free list, I know how any particular allocation
request WILL be handled.
- given how that request is handled, I now know what the NEW
state of the free list will be.
- lather, rinse, repeat

And, all I need to do to convince myself that my idea of what's
happening "under the covers" agrees with what is *really* under
the covers is to repeat this process a sufficient number of times
and in an appropriately clever order to remove any doubt.
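
A sketch of that lather/rinse/repeat loop -- modelAlloc() is a hypothetical shadow model of the free list that the test suite maintains for itself, and the policy names are just the running example:

#include <assert.h>
#include <stddef.h>

/* Stand-ins: a shadow model of the free list, plus the code under
   test. The point is only the predict-then-check loop. */
extern void *modelAlloc(size_t size);               /* my prediction   */
extern void *allocateMemory(void *heap, size_t size,
                            int select, int trim);  /* code under test */
extern void *theHeap;
extern const int FIRSTFIT, TRIMTOSIZE;

void testScriptedSequence(void)
{
    static const size_t script[] = { 16, 64, 1, 4096, 24 };
    size_t i;

    for (i = 0; i < sizeof script / sizeof script[0]; i++) {
        void *expected = modelAlloc(script[i]);
        void *actual = allocateMemory(theHeap, script[i],
                                      FIRSTFIT, TRIMTOSIZE);

        /* Divergence flags a bug in the code *or* in my model of it. */
        assert(actual == expected);
    }
}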
 

Don Y

As was the example test suite I posted.


No *you* don't, the compiler/linker does. The test suite does not make any
assumptions about the memory heap location.

And, because of that, *you* can't run the entire test suite.

*Pick* a number. Pretend you are the compiler. Write it on
a little slip of paper. Don't show it to me.

Is it 0x27000010? Is it 0xF9880400? Is it something ELSE??
For *any* two of those cases, it's *wrong*. The code doesn't
expect to deal with a heap in a certain place in memory.
It doesn't expect the heap's size to be artificially constrained.
It has to address all possible deployments... different
size pointers, size_t's, etc.

Do you have 4GB of VM in the machine? Can you allocate a
4GB chunk of memory? What will your 32b system do when you
try? Is there a bug in the code that treats allocation sizes
as *signed* (so 2G is the max)? What if you're operating
on a 64b platform?

Do you only test conditions that you *use*?
"All of my strings are 30 characters or less so I only need
to test strlen for results of [0..30]"
"I never use numbers larger than 999999 so as long as sqrt
works for that range of values..."
"Because the linker located my heap at 0xC000, I can never
allocate a chunk larger than 16KB so..."
Good, because I didn't try. The compiler/linker on the new system will
do that for me.

So, you live with whatever conditions the linker wants to
impose on the code? Examine any parts of the test suite that
throw errors and rationalize which ones "need not pass"
based on the choice the linkage editor made for you?

A test suite should address the operation of the function(s)
that it purports to test -- over the entire expected range of
operation. I.e., sqrt should "work" for all doubles -- even
the ones that you don't use in your application.
Now *you* have to modify the test suite and hopefully
not *break* it in the process!

No, I don't.
You move the heap to 0xFFFFF000 -- since you have memory
there - and suddenly discover that the test suite's attempt
to allocate a 4K block of memory fails -- when it *shouldn't*
(there's nothing wrong with the code... the BUG is in your
test suite modification!)

Nope, show me where I specified the location of the heap (other than
heap = new uint8_t[heapSize]; heapBase = heap;).
 

Ian Collins

So I -- and anyone modifying the test suite -- should be diligent
enough to verify all of these issues INSTEAD OF something like:
ASSERT(result == 0x80032440)
which performs *all* of these "additional tests". If result is
*not* 0x80032440 then one of the other enumerated issues was
not satisfied!

If you have a condition you want to test, test it.

If you were testing strchr(3), would you do:
template = "ABCDEABCFF
sought = 'C'
result = strchr(template, sought)
ASSERT(result>=&template[0])
ASSERT(result< &template[strlen(template)])
ASSERT(*result == sought)
ASSERT(result[1] == sought+1)

Or, would you do:
ASSERT(result ==&template[2])

This embodies *all* of the above tests and gives the
function far less "wiggle room".

That doesn't make any sense and bears no resemblance to the questions
you asked and subsequently snipped.

If your comment applies to the strchr example, that is intended
to show that you can present lots of tests/criteria to *infer*
that the result is correct. *OR*, you could just CHECK THE
RESULT! You *know* what the conditions are when the test
is executed (i.e., you know what the resolution of your math
library is, you know the value(s) of the inputs to the routines
under test) so all you have to do is verify that the *result*
is EXACTLY what you expect it to be. No need to look at
indirect evidence when you have a concrete result that you
can test!

Your typical memory allocator is a little more complex and less
deterministic than strchr().

But it still produces *a* result. It is a deterministic
algorithm (even if the specification -- like malloc -- doesn't
make that determinism explicit). With a given set of operating
conditions and inputs, you **KNOW** what the result will be.
TEST FOR THAT RESULT!

It may produce a result, but more often than not, the exact value is of
little or no consequence. If you start to test for specific values
based on knowledge of the internals, you end up with a very fragile test
harness. I think this may have been what you were saying elsewhere. If
your tests become bonded to single implementation, you are stuck with
it. If your tests don't assume specific values, you are free to change
the internals of the function as long as you don't break its published
behaviour. Obviously if your internal implementation is specified you
are stuck.

Memory allocators are definitely one of those cases where one size does
not fit all, just look at the range of malloc libraries available for a
typical non-embedded environment. They would all pass the same test
suite assuming the tests didn't make assumptions about the implementation.
What happens if heapBase is effectively (char *) -32400?
What happens if the processor only has a 16b address space?
What happens if that +32440 was a consequence of pointer
alignment requirements on THIS PLATFORM?
What happens if there isn't physical memory present at
that location?

You break 0x32440 down into platform independent variables.
The approach I have taken (to testing on *one particular* platform)
eliminates the need for "peeking inside".

If I know where (on this particular platform) the heap is located
and how large it is and alignment constraints, etc. AND have just
initialized the heap... THEN:
- I know what the internal free list looks like, "conceptually"
(I might not know specific bit patterns but I know what the
free list *reflects* -- a heap that is completely FULL!)
- given that free list, I know how any particular allocation
request WILL be handled.
- given how that request is handled, I now know what the NEW
state of the free list will be.
- lather, rinse, repeat

Until someone introduces a critical optimisation that changes the layout...

I still think you are testing at too high a level; test the functions
that manipulate your internal structures independently.
 

Ian Collins

And, because of that, *you* can't run the entire test suite.

It hasn't stopped me in the past and I have quite a few (embedded)
allocators and (hosted) test suites.
So, you live with whatever conditions the linker wants to
impose on the code? Examine any parts of the test suite that
throw errors and rationalize which ones "need not pass"
based on the choice the linkage editor made for you?

No. I don't make assumptions about values.
 

Don Y

Hi Ian,
It may produce a result, but more often than not, the exact value is of
little or no consequence.

Is the exact value of sqrt()'s result of no consequence?
If you start to test for specific values based
on knowledge of the internals, you end up with a very fragile test

No! I am *not* looking at the internals. I am looking at
the SPECIFIED, CONTRACTUAL BEHAVIOR of the function. Are
you "looking at internals" when you test the return value
of sqrt(5)? The sqrt function returns a positive value
for all non negative arguments such that the value, when
multiplied by itself, yields the original argument. THAT
is what you test for!

A *good* test picks values that are likely to cause problems
in an implementation and/or *this* implementation.

A memory allocator is no different. Pretend the "addresses"
returned were integers and the specification was written in
terms of *integers*. You wouldn't throw an arbitrary input
value at it and not *carefully* examine the result to see that
it complies with EXACTLY what you would expect, given its
contract.

And, if your goal was to ensure that the function behaved well
in the wide variety of cases that it might be subjected to
once deployed, you would think very carefully about *which*
conditions you put it in and how you exercised them.

If you developed a general purpose (4 function) calculator,
would you use 1+1, 8-5, 3*4, etc. as your test cases? Or,
would you think: "Hmmm... I wonder if it handles negative
arguments properly? And negative *results*? And, what happens
if I give it something too big -- 99999999+1? Or, something
'undefined' -- 345/0?"

"Settling" for what the linker gives you as the location for
the heap (as well as its *size*!) means you will *never* test
the code with any degree of confidence GIVEN THAT IT WILL
BE DEPLOYED IN ENVIRONMENTS OTHER THAN YOURS!

How do you test sqrt? Don't *you* pick the values that are
of interest to you in proving correct operation of the
function? Don't *you* verify that the results are "as
expected"?

Granted, for an invertible function you could potentially do
a monte carlo simulation having the test suite check the
results (i.e., if result*result =~ argument, then pass).
How would you apply that "random" approach to testing a
non-invertible function? Or, one whose inverse function
is of a complexity greater than or equal to the original
function (e.g., a secure hash)?
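
E.g., a sketch of that monte carlo style for sqrt (the range and epsilon are arbitrary, illustrative choices):

#include <assert.h>
#include <math.h>
#include <stdlib.h>

/* Property-based check for an invertible function: instead of a table
   of expected values, verify result*result =~ argument. */
void testSqrtRandom(void)
{
    int i;

    for (i = 0; i < 100000; i++) {
        double x = (double)rand() / RAND_MAX * 1.0e6;  /* arbitrary span */
        double r = sqrt(x);

        assert(r >= 0.0);
        assert(fabs(r * r - x) <= 1.0e-9 * (x + 1.0)); /* loose rel. eps */
    }
}
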
harness. I think this may have been what you were saying elsewhere. If
your tests become bonded to single implementation, you are stuck with
it. If your tests don't assume specific values, you are free to change

If your tests don't assume specific values, then you can't guarantee
where in the function's *domain* you are testing.
the internals of the function as long as you don't break its published
behaviour. Obviously if your internal implementation is specified you
are stuck.

The internal implementation need not be "fixed". You are still
free to implement the function however you want -- as long as
the external interface and the guarantees that it affords the
developer are ALWAYS met.
Memory allocators are definitely one of those cases where one size does
not fit all, just look at the range of malloc libraries available for a
typical non-embedded environment. They would all pass the same test
suite assuming the tests didn't make assumptions about the implementation.

No. It would be easier for *your* "implementation agnostic" approach
to test *all* of those allocators with a given test suite. Because
you don't *know/care* how they are making their choices! "Did I
get a result? OK. Does it overlap any other results I have had?
No. Does the chunk returned fit entirely in the *stated* (by
the linker!) area of the heap? Yes. Lather, rinse, repeat."
You break 0x32440 down into platform independent variables.

And how is that any different from platform *dependent*
variables? You still need to "tune" the values to fit
the *particular* (test) deployment. And, you have no
guarantee that you are poking around all the right
"dark corners" for the routine -- since you are not in
control of the conditions that your function is being
subjected to!
Until someone introduces a critical optimisation that changes the layout...

Until someone changes what "sqrt" means!

I *only* -- though THOROUGHLY -- test *published*, contractual
guarantees that the function *must* honor. What goes on under the
hood is never examined, directly.

You can implement sqrt with Newton-Raphson, large lookup tables,
CORDIC, etc. Your test suite DOESN'T CARE! As long as sqrt(foo)
truly produces the square root of foo!

I can change the internals of the memory allocator any way I choose
AS LONG AS the results returned for a given set of operating
conditions and input variables REMAIN UNCHANGED.

E.g., the allocator makes no claims about the *content* of the
blocks of memory allocated. I could add some code to zero them
out, fill them with pseudo-random numbers -- or just leave
their previous contents "as is". The test suite can't *test*
any of those conditions because none are guaranteed in the
contract. If I find it more efficient to leave assorted
crap *in* the chunks, the test suite can't complain!

(If K&R's malloc wants to do a "first fit" selection strategy,
there's nothing preventing it! Nor anything preventing it
from changing that, in a later edition, to a "last fit".
The test suite for *their* malloc would have to accept both
sets of results as equally valid.)
I still think you are testing at too high a level, test the functions
that manipulate your internal independently.

There *is* nothing else. The allocator and the deallocator
are the *only* things that massage the internal state of the
heap!
 
