Audit

Richard Herring

In message, REH said:
On Jul 8, 9:22 am, Richard Herring <junk@[127.0.0.1]> wrote:

[restoring context]
I believe IEEE rules state that anytime a denormalized value is used
with a normalized value (e.g., arithmetic), the denormalized value is
normalized (becomes 0.0).

I'm talking about when x is *not* denormalized, just less than
epsilon().
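
(To put numbers on that - this is my own illustration, not part of the
original exchange, and assumes IEEE single precision - epsilon() is about
1.19e-7, while the smallest normalized float is about 1.18e-38, so a value
can be far below epsilon() and still be perfectly normal:)

#include <iostream>
#include <limits>

int main()
{
    float x = 1.0e-20f;  // far below epsilon(), yet still a normal number
    std::cout << std::boolalpha
              << "epsilon  = " << std::numeric_limits<float>::epsilon() << '\n'
              << "min      = " << std::numeric_limits<float>::min() << '\n'
              << "x >= min: " << (x >= std::numeric_limits<float>::min()) << '\n';
    return 0;
}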
 
REH

Well, I could be wrong (I usually am!). My understanding is that it is
not just division: any time a denormalized number is used
arithmetically with a normalized number, the former is "flushed" to
zero.

Here's a quick-and-dirty program to demonstrate the use of denormals
with normalized values:

#include <iostream>
#include <float.h>   // FLT_MIN_EXP
#include <math.h>    // ldexp

int main()
     {
     // base is 2^FLT_MIN_EXP, a small normalized float (twice FLT_MIN).
     float base = ldexp(1.0f, FLT_MIN_EXP);
     float fl = base;
     while (fl != 0.0f)
         {
         fl /= 2;    // keep halving until fl underflows to zero
         std::cout << fl << ", " << (base + fl) << std::endl;
         }
     return 0;
     }

The loop starts with fl equal to base (that's ldexp(1.0f, FLT_MIN_EXP),
twice the smallest normalized value), and generates successively smaller
values by dividing the previous value by 2; from the second halving on,
those values are denormals. Each output line shows that value and the
result of adding it to base. The output I get is this:

[work]$ g++ test.cpp
[work]$ ./a.out
1.17549e-38, 3.52648e-38
5.87747e-39, 2.93874e-38
2.93874e-39, 2.64486e-38
1.46937e-39, 2.49793e-38
7.34684e-40, 2.42446e-38
3.67342e-40, 2.38772e-38
1.83671e-40, 2.36936e-38
9.18355e-41, 2.36017e-38
4.59177e-41, 2.35558e-38
2.29589e-41, 2.35328e-38
1.14794e-41, 2.35214e-38
5.73972e-42, 2.35156e-38
2.86986e-42, 2.35128e-38
1.43493e-42, 2.35113e-38
7.17465e-43, 2.35106e-38
3.58732e-43, 2.35102e-38
1.79366e-43, 2.35101e-38
8.96831e-44, 2.351e-38
4.48416e-44, 2.35099e-38
2.24208e-44, 2.35099e-38
1.12104e-44, 2.35099e-38
5.60519e-45, 2.35099e-38
2.8026e-45, 2.35099e-38
1.4013e-45, 2.35099e-38
0, 2.35099e-38
[work]$

The first dozen or so lines show that adding the denormal value to base
produces a value that's different from base, so the denormal value is
not being flushed to zero. The last several lines show no difference,
because the denormal value is too small to affect the result.

Granted, those last lines could be described as "the denormal is flushed
to zero". But that's not because the value is a denormal; it's simply
because the other value in the sum is much larger, so the denormal value
gets lost in rounding. The same thing happens with 1.0e20 + 1.0e1.
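
For what it's worth, here's a quick check of that last point (my own
snippet, not from the post above): the small addend is lost to rounding
purely because of the difference in magnitude, not because anything is
denormal.

#include <iostream>

int main()
{
    float big   = 1.0e20f;
    float small = 1.0e1f;
    // prints true: adding 10 to 1e20 can't change any of float's ~7 significant digits
    std::cout << std::boolalpha << ((big + small) == big) << std::endl;
    return 0;
}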

I guess I'll have to try and locate where I read that info about
denormals. I thought it was in "What Every Computer Scientist Should
Know About Floating-Point Arithmetic," by David Goldberg. But I can't
seem to find it in there.

REH
 
Jonathan Lee

My comments were entirely based on the
implementation scheme you had in mind, which struck me as problematic
(and I stand by that). I hope that if nothing else I've raised a couple
of issues which might be worth considering.

Fair enough :) I appreciate the advice

--Jonathan
 
Jerry Coffin

[ ... ]
Well, I could be wrong (I usually am!). My understanding is that it is
not just division: any time a denormalized number is used
arithmetically with a normalized number, the former is "flushed" to
zero.

That would be a rather strange thing to do. The whole point of having
denormals in the first place is to _avoid_ flushing to zero if at all
possible. If using it in an expression with a normalized number
resulted in its being flushed to zero anyway, you'd render denormals
almost entirely useless.
 
Andrew Tomazos

The balance depends on the application. If your program going wrong
could potentially kill me (think code to control an X-ray scanner, or a
large, heavy robot arm in a factory, or...), I'd want you to be far
closer to (B) than (A). Equally, though, I'd want to make sure that your
being closer to (B) would actually have the desired effect of making my
demise less likely. Checklists of *irrelevant* points could distract you
from making your code safer.

I suspect James might doubt that it will be equally robust. But thinking
about it, I suppose there is an issue of how you quantify the robustness
of your code.

I'm all for
avoiding onerous processes, except when they're necessary (which is
sometimes the case).

Robustness is quantified by the degree to which software meets its
requirements.

What you are missing is that time saved on onerous process can be
spent on runtime testing/debugging. Perfect software is practically
impossible for a nontrivial requirement set. You have to look at how
much robustness you get per unit development time. This is true
whether or not the software is mission-critical and/or potentially
life-threatening. That is what I mean when I say that moving from the
balance point toward (B) doesn't get you more robustness in the same
amount of time.

If a software application could potentially injure people when it
fails to meet its requirements, then it should be treated in the same
way that a new medical drug is treated. After feature freeze, a crazy
amount of runtime testing is conducted in a safe, controlled
environment. Any new changes to the code must be put through full
regression (ie the testing process must be restarted from scratch).

Let's take a specific example. Suppose there is a function:

void f(int x);

and that f is not exception safe for all values of x. In fact when
f(3) is called, an exception is thrown in an unexpected place, and the
system misbehaves - zapping the patient with a deadly amount of X-ray
radiation.

But suppose that f(3) can never be called at runtime in the final
application. ie There is no calling site of f where x can possibly be
the parameter 3.

If we were to check f for exception safety, we might find this f(3)
problem and fix it. This will, however, be a waste of time, and
achieve zero improvement in robustness.

Therefore someone who did not bother to check f for exception safety
would have an equally robust application in less time.

This person can then spend this extra time on something else. (For
example runtime testing/debugging)
-Andrew.
 
Andrew Tomazos

All the testing in the world won't be enough in most cases: you'll never
have enough time to test your code for all possible inputs. For
safety-critical systems, it's a choice between something that can
potentially give you the confidence levels you require in your system,
and something that can't. It's a no-brainer really.

Whether you call it robustness, correctness, reliability, stability,
solidness or just plain old quality - isn't very interesting. What I
mean is how well the software works as expected (in all possible
situations). If you don't know what you expect then none of the
methods we have discussed can save you.

Formal Methods are different from general Software Engineering
Process. We started out this conversation talking about a Software
Engineering Process related activity (checking each function for 20
rules). Formal Methods are quite different. They use first-order
logic to prove that a given algorithm is correct. See the book
Software Reliability Methods by Doron Peled (Springer).

Formal Methods *cannot* be applied to anything other than a trivially
small program. They break down quickly with complexity and large
ranges of chaotic input. Certainly anything as complex as the
software running an X-ray machine cannot be, and is not, verified using
Formal Methods. I have colleagues who work on a line of one of the
most common medical robots used in surgery. They
specifically do not use this type of method, as it is
impractical. You may be surprised to learn that they work pretty much
the same way as most professional developers do; they are just given
more time to get it right, and a hell of a lot bigger QA team.

Nothing can prove the absence of bugs in a nontrivial real-world
application. As I said, perfect software is virtually impossible.
Yes, you can't test every possible set of inputs. Testing isn't
everything. Neither is anything else. None of the methods discussed
are a magic bullet. It is a question of balance. If you want more
reliable software then you need to spend more time on it. Moving off
balance isn't going to help.
Since 3 is unexpected, and you've improved the
behaviour for it, you've likely improved the robustness of the program
(assuming your change didn't mess up something else). Whether it's a
waste of time or not in this particular instance is another matter.

You didn't understand my example. It is impossible (not just
unexpected) that f is called with a parameter of 3. Go and study my
example again.
-Andrew.
 
James Kanze

What do you recommend for this scenario: I have a project
which uses the Qt libraries. From version 3 to version 4
of Qt there were many changes to the API, so my code has
a lot of this:
#ifndef QT_V3
statusbar->addPermanentWidget(ageLabel);
#else
statusbar->addWidget(ageLabel, true);
#endif
I would like to support both versions, but having two
separate cpp files for these little one liners seems
difficult to maintain. Having one file, I figure,
allows me to keep the two versions in step.

You don't necessarily need two separate source files. Just two
headers, with something like:

inline void
addPermanentWidget(
    QT_StatusBar*      statusBar,
    std::string const& label )
{
    statusBar->addWidget( label, true ) ;
}

in one, and:

inline void
addPermanentWidget(
    QT_StatusBar*      statusBar,
    std::string const& label )
{
    statusBar->addPermanentWidget( label ) ;
}

in the other. Put them in separate directories, and choose
which one by means of a -I option at compile time. Or wrap the
interface any other way you like.
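
For what it's worth, a single compatibility header along those lines might
look something like this. It is only a sketch of mine, reusing the QT_V3
macro and the two calls from the snippet above; the exact QStatusBar
signatures should be checked against whichever Qt releases you actually
target.

// qt_compat.h - hypothetical wrapper, guarded by the project's QT_V3 macro
#include <qstatusbar.h>

inline void
addPermanentWidget( QStatusBar* statusBar, QWidget* widget )
{
#ifdef QT_V3
    statusBar->addWidget( widget, true ) ;      // Qt 3 spelling, as in the original code
#else
    statusBar->addPermanentWidget( widget ) ;   // Qt 4 spelling
#endif
}

Client code then calls addPermanentWidget( statusbar, ageLabel ) everywhere
and never mentions the version difference again.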

(Of course, one might also consider changing suppliers
completely. Breaking client code in this manner is pretty
irresponsible, and I'd wonder about a supplier who cared so
little for his customers.)
 
James Kanze

On Jul 8, 11:30 am, James Kanze <[email protected]> wrote:

[...]
There is a difference between a 1 hour code review of a
manweek's worth of code, and double checking every function
for 20 things.

The code reviewer's job is to double check that the code works
(and is understandable). Checking against a list, for things
which can be checked against a list, makes the job easier and
quicker.

Note that that doesn't mean I agree with all of the things on
his list. "Thread safety" just doesn't work as a check
point---you have to actually analyse the code, and the context
in which it is designed to be used, to determine that. What
I've seen in practice is that the list should be short and
evolving: a list with a couple hundred points won't work,
because it will either be ignored, or it will cost too much time
to check each point separately. On the other hand, if you keep
the list short, and have it reflect actual errors which slipped
through code review, then it can be very effective. (I.e. if
you find errors after code review because variables haven't been
initialized, you add a point to the list "check that all
variables have been initialized".)
 
James Kanze

Robustness is quantified by the degree to which software meets
its requirements.

Amongst other things. (Or are you including "implicit
requirements"? I've never seen a requirements specification
which said that the program shouldn't core dump, but of course,
programs which core dump aren't robust.)
What you are missing is that time saved on onerous process can
be spent on runtime testing/debugging.

Who's talking about an "onerous process"? We're talking about a
means of simplifying, and perhaps even mechanising, part of an
essential step.
Perfect software is practically impossible for a nontrivial
requirement set. You have to look at how much robustness you
get per unit development time. This is true whether or not
the software is mission-critical and/or potentially
life-threatening. That is what I mean when I say that moving
from the balance point toward (B) doesn't get you more
robustness in the same amount of time.

No. It gets you more robustness in less time, because the
errors are found earlier. Finding an error which only shows up
in a test takes significantly more time than finding one where
the error is pointed out to you immediately in the source code.
In general, the further upstream the error is detected, the less
it costs to find and fix it: an error found in code review costs
less to find and fix than one found in unit tests; an error
found in unit tests costs less to find and fix than one found in
integration tests; and an error found in integration tests costs
less than one found in the field.

Using a check list is simply a way of making some aspects of the
code review more efficient (or even automatic).
If a software application could potentially injure people when
it fails to meet its requirements, then it should be treated
in the same way that a new medical drug is treated. After
feature freeze, a crazy amount of runtime testing is conducted
in a safe controlled environment. Any new changes to the code
must be put through full regression (ie the testing process
must be restarted from scratch).

Don't forget that testing can only prove that a program is
incorrect; it can't prove that a program is correct. For life
critical software, it's usual to require some sort of formal
proofs. And that can be expensive. Generally, judging from my
own experience, anything you do getting the error rate down to
about one error per 100KLoc, going into integration, also
reduces total development costs. Beyond that, I'm not sure.
Formal proofs aren't necessary for that level of quality, but
that level of quality probably isn't acceptable for life
critical software. (But the issue isn't cut and dried. For the
example mentioned, one could argue that the system should be
physically designed so that no matter what the software did, it
couldn't generate an overdose of radiation. I know that when I
worked on a locomotive brake system, the emergency brake
position acted physically, in such a way that nothing in the rest of
the system could prevent the brakes from being applied.)
Let's take a specific example. Suppose there is a function:
void f(int x);
and that f is not exception safe for all values of x. In fact
when f(3) is called, an exception is thrown in an unexpected
place, and the system misbehaves - zapping the patient with a
deadly amount of X-ray radiation.
But suppose that f(3) can never be called at runtime in the
final application. ie There is no calling site of f where x
can possibly be the parameter 3.
If we were to check f for exception safety, we might find this
f(3) problem and fix it. This will however be a waste of
time, and achieves zero improvement on robustness.

Until some modification elsewhere causes f to be called with 3.
Either f's contract allows it to be called with 3, or it
doesn't. If f's contract allows it to be called with 3, then f
must handle 3 correctly. If it doesn't, there should be an
assert at the top of the function, so that the system will fail
(and, of course, all calling code should be checked to ensure it
respects the contract).
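
Concretely, that might look something like the following (my sketch,
reusing the f from the example; the x != 3 precondition is purely
illustrative):

#include <cassert>

void f(int x)
{
    // Contract: callers must never pass 3. Fail loudly in a debug build
    // if a later modification ever violates that.
    assert(x != 3 && "precondition violated: f must not be called with 3");

    // ... normal processing for the values f does support ...
}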
 
Andrew Tomazos

Amongst other things.  (Or are you including "implicit
requirements"?  I've never seen a requirements specification
which said that the program shouldn't core dump, but of course,
programs which core dump aren't robust.)

I'm talking about all requirements. Everything you expect the
software to do in all situations. I'm not necessarily talking about
formal written requirements that are part of some engineering
process. Clearly most software is expected not to core dump.
Who's talking about an "onerous process"?  We're talking about a
means of simplifying, and perhaps even mechanising, part of an
essential step.

Stuart Golodetz was talking about onerous process, in the paragraph I
was replying to. Consider reading a thread before chiming in.
Perfect software is practically impossible for a nontrivial
requirement set.  You have to look at how much robustness you
get per unit development time.  This is true whether or not
the software is mission-critical and/or potentially
life-threatening.  That is what I mean when I say that moving
from the balance point toward (B) doesn't get you more
robustness in the same amount of time.

No.  It gets you more robustness in less time, because the
errors are found earlier.  Finding an error which only shows up
in a test takes significantly more time [snip] integration tests costs
less than one found in the field.

Your equation assumes that the number of errors found with each
process per unit time is constant. I agree that repairing a runtime
bug takes longer than repairing one found statically. However there
is a diminishing return from static analysis - eventually you reach a
BALANCE point at which looking for bugs at runtime is more useful.
Using a check list is simply a way of making some aspects of the
code review more efficient (or even automatic).

I am not against this. My position is that there is a balance.
Until some modification elsewhere causes f to be called with 3.

I said that in the *final* system, f cannot possibly be called with a
parameter of 3. There are no further modifications. Please study the
example more closely.
-Andrew.
 
Andrew Tomazos

If you expect wrong, you'd prefer that your mistake doesn't result in
your program launching a nuke at somebody.
If I write a piece of software that works correctly for all specified inputs, and I document
those for the end-user, then the program is "correct". If I don't
sensibly handle inputs that shouldn't be given by the end-user, then the
program still isn't "robust" - if the end-user accidentally provides
such an input regardless, bad things can happen (and we don't want
that). Two entirely different properties there.

You're quibbling about trivial differences in definition. Under *my*
definition of robustness (the one you are to use when I write the
word) there is no distinction between what you would prefer the
software to do and what you expect it to do. Your expectation covers
all possible inputs and states of the program. Think of it as a
function from the set of all possible inputs I to the set of all
possible behaviors B:

expect: I -> B

You can generate this function by answering the question:

for all i element of I:
expect(i) = "Given input i what behaviour should the program
exhibit?"

The robustness of the program is then measured by some metric
(count()) over I, where the actual performance:

actual: I -> B

is compared to the expected performance.

Let the set C be generated by:

for all i element of I
i is an element of C iff (actual(i) = expect(i))

And finally:

robustness = ( count(C) / count(I) )

This should be rigorous enough for you.
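
As a toy illustration of that metric (my own sketch, assuming a finite,
enumerable input set and directly comparable behaviours, which real
programs rarely have):

#include <cstddef>
#include <iostream>
#include <vector>

// Hypothetical stand-ins for expect() and actual() over integer inputs.
int expect(int i) { return 2 * i; }
int actual(int i) { return (i == 7) ? 0 : 2 * i; }   // one deliberate defect

int main()
{
    std::vector<int> inputs;                          // the set I
    for (int i = 0; i < 100; ++i) inputs.push_back(i);

    std::size_t matching = 0;                         // count(C)
    for (std::size_t k = 0; k < inputs.size(); ++k)
        if (actual(inputs[k]) == expect(inputs[k]))
            ++matching;

    std::cout << "robustness = "
              << static_cast<double>(matching) / inputs.size()   // 0.99 here
              << std::endl;
    return 0;
}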
I guess that slightly depends on your definition of "trivially small".
See the paper entitled "Practitioners' views on the use of formal
methods: an industrial survey by structured interview" if you've got
access to it. Just in case, here's a quote (emphasis mine):

*they indicate that formal methods
were used on systems typically in the region of tens of Kloc*.

Yeah - the paper you have quoted is bullshitting you basically.
Nobody has applied a formal method to any real-world program of 20,000
lines. Perhaps some self-contained mathematical simulation that
doesn't actually do anything. It just can't be done. Any time you hit
the network, the filesystem, any input or output device, the
user interface, a system call, or a library - it just does not
fit into a formal proof.
"Formal methods are worthwhile in terms of improved quality of software
with little or no additional lifecycle costs, but only when compared to
a rigorous development lifecycle where the cost of software errors is
high. If the market does not demand high quality software, then it is
more difficult to justify their use."

Of course formal methods aren't a magic bullet - but you can achieve
levels of confidence in your software using them that are unattainable
just by testing. In some domains that's important.

Listen carefully. I've studied and used formal methods first hand.
They are infeasible to apply to real-world applications. A project
must be small, self-contained, and have rigorous, mathematically defined
requirements under all possible inputs. Go and try to use formal
methods to prove a tiny toy program is correct, and you will see quite
quickly what I am talking about.
I did understand your example, but there's no reason to suppose that
f(3) can never happen. Just because it doesn't happen in the present
iteration of the code doesn't mean that it never will. Software changes.
If you can assume that bad inputs will never happen, then software
robustness is a non-issue - but in the real world, you can't.

No, you have not understood my example. I said that in the *final*
application (ie there is no future iteration - software does not
magically change by itself) there is no calling site where f can
possibly be called with a parameter of 3. Therefore it is completely
impossible (even for bad input to the application) that f is called
with a parameter of 3. Look at it again until you understand.
-Andrew.
 
Jonathan Lee

No, you have not understood my example.  I said that in the *final*
application (ie there is no future iteration - software does not
magically change by itself) there is no calling site where f can
possibly be called with a parameter of 3.  Therefore it is completely
impossible (even for bad input to the application) that f is called
with a parameter of 3.  Look at it again until you understand.
  -Andrew.

You seem to have simply moved the proof from the function to the
calling system. You claim that f() doesn't need to be checked for an
input of 3 because the caller _cannot_ produce 3 as an argument. But to
make a claim that "it is completely impossible [...] that f is called
with a parameter of 3" you must have *proved* that.

Your conclusion is built into your hypothesis.

--Jonathan
 
Andrew Tomazos

No, you have not understood my example.  I said that in the *final*
application (ie there is no future iteration - software does not
magically change by itself) there is no calling site where f can
possibly be called with a parameter of 3.  Therefore it is completely
impossible (even for bad input to the application) that f is called
with a parameter of 3.  Look at it again until you understand.

You seem to have simply moved the proof from the function to the
calling system. You claim that f() doesn't need to be checked for an
input of 3 because the caller _cannot_ produce 3 as an argument. But to
make a claim that "it is completely impossible [...] that f is called
with a parameter of 3" you must have *proved* that.

Your conclusion is built into your hypothesis.

No, it is not a circular argument. Consider the following
application:

void f(unsigned int x)
{
    if (x == 3)
        CRASH_HORRIBLY;
    else
        OKAY;
}

void main_program(unsigned int i)
{
    unsigned int x = i | 7;   // the low three bits are always set, so x >= 7 and can never be 3
    f(x);
}

No matter what the application input value i is, f(3) can never be
called. It's impossible. This is true whether or not I prove it to
be true.

Fixing the function f for the case of x=3 will have zero impact on the
robustness of the application, and simply be a waste of time.

Do you understand now?
-Andrew.
 
Jonathan Lee

No matter what the application input value i is, f(3) can never be
called.  It's impossible.  This is true whether or not I prove it to
be true.

Yes, but you are using your knowledge that it is true to make the
claim that the check is unnecessary. You can't know that the check is
unnecessary unless, well, you know that the check is unnecessary. See,
in saying that _you know_ it is impossible you must have _proved_ it
somehow otherwise you just _believe_ it.

Consider a different example with the same f() you just provided, but
a different main_program(). One that I make up and won't tell you
about. You don't know if main_program() calls f() with 3 or not. Does
your argument still stand? Should you not handle 3 in f?

Now if I were to post the contents of main_program() you could
consider it "finalized" and determine if 3 was ever passed to f().
But starting from a position of ignorance about main_program() (as you
must) does not allow you to form your hypothesis. You have to check
main_program() in order to do that.

--Jonathan
 
Andrew Tomazos

Yes, but you are using your knowledge that it is true to make the
claim that the check is unnecessary. You can't know that the check is
unnecessary unless, well, you know that the check is unnecessary. See,
in saying that _you know_ it is impossible you must have _proved_ it
somehow otherwise you just _believe_ it.

In this particular example the check was unnecessary. It doesn't
matter whether we know it or not. With this application, someone who
did not make the check would have an equally robust application in the
same time. Therefore we can conclude that there exists at least one
case where checking every function against the "checklist" is a waste
of time.
Consider a different example with the same f() you just provided, but
a different main_program(). One that I make up and won't tell you
about. You don't know if main_program() calls f() with 3 or not. Does
your argument still stand? Should you not handle 3 in f?

No, in your modified example, the same argument cannot be made,
because you have changed the conditions. This does not affect the
existence of my example.
Now if I were to post the contents of main_program() you could
consider it  "finalized" and determine if 3 was ever passed to f().
But starting from a position of ignorance about main_program() (as you
must) does not allow you to form your hypothesis. You have to check
main_program() in order to do that.

At this point I suspect you are being intentionally obtuse. My
purpose was to provide a concrete example of a situation where the
"checklist every function" process was a waste of time. I have
succeeded in providing that example. I was not suggesting this one
example was a proof that my overall position was correct - simply
trying to motivate your imagination to understand why there is a
balance between (A) and (B).
-Andrew.
 
Michael Tsang

Jonathan said:
Hello all,
To be a good little coder I want to ensure all of my functions pass
a checklist of "robustness". To keep things simple, I want to document
each function with a string that will indicate which of the checklist
items the function has been audited for. Something like

abcdefghiJklMnopqRsTuvwxyz

which would show that items J, M, R, and T have been checked. Off the
top of my head I came up with the list below. I wonder if anyone has
items they think should be added to the list. Any advice welcome,

--Jonathan

Audit list (an implicit "where applicable" should be assumed)
A - Arguments checked against domain
B - Arrays have bounded access
C - No C-style casts; other casts as appropriate. Avoid reinterpret_cast<>
D - No #defines - use static const, enum, or functions
E - Exception safe
F - Floating point comparisons are safe (e.g., don't check against 0.0)
I - Use initialization lists in constructors
L - Loops always terminate
M - Const-qualify member functions that need it
N - "new" memory is not leaked, esp. in light of exceptions
O - Integer overflow checked for
P - Wrap non-portable code in "#if"s and warn user with #else
Q - Const-qualify object arguments
R - Reentrant
T - Thread safe
V - Virtual destructor

I usually use the following compiler flags when compiling C/C++ programs:

CFLAGS="-pipe -pedantic -Wall -Wextra -Wformat=2 -Winit-self -Wunused -Wfloat-equal -Wundef -Wshadow -Wcast-qual -Wcast-align -Wwrite-strings -Wno-empty-body -Wsign-conversion -Wlogical-op -Wno-missing-field-initializers -Wredundant-decls -ggdb3 -O0 -Wconversion -std=c99 -Wbad-function-cast -Wc++-compat -Wstrict-prototypes -Wold-style-definition -Wnested-externs"

CXXFLAGS="-pipe -pedantic -Wall -Wextra -Wformat=2 -Winit-self -Wunused -Wfloat-equal -Wundef -Wshadow -Wcast-qual -Wcast-align -Wwrite-strings -Wno-empty-body -Wsign-conversion -Wlogical-op -Wno-missing-field-initializers -Wredundant-decls -ggdb3 -O0 -Wconversion -std=c++0x -Wctor-dtor-privacy -Weffc++ -Wstrict-null-sentinel -Wold-style-cast -Woverloaded-virtual -Wsign-promo -Wno-vla"

However, I can't avoid some warnings about virtual destructors, because I
derive classes from STL containers, which don't have virtual destructors.
 
Andrew Tomazos

However, I can't avoid some warnings about virtual destructors, because I
derive classes from STL containers, which don't have virtual destructors.

That's not recommended practice, you know. You're supposed to use
containment. (Actually, we took the third option and threw the
entire STL out the window. :) )
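
In case it helps, here's roughly what containment looks like next to
derivation (a sketch of mine, using std::vector as a stand-in for
whichever STL class was being derived from):

#include <cstddef>
#include <vector>

// Deriving publicly from std::vector is risky: it has no virtual
// destructor, so deleting a derived object through a std::vector<int>*
// is undefined behaviour.
//
// Containment instead: hold the container as a member and forward only
// the operations the class actually needs.
class IntList
{
public:
    void push_back(int value)             { data_.push_back(value); }
    std::size_t size() const              { return data_.size(); }
    int operator[](std::size_t i) const   { return data_[i]; }

private:
    std::vector<int> data_;
};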
-Andrew.
 
Nick Keighley

I remember this happening - not sure what you're saying though. I
thought this was a straightforward case of testing gone wrong?

It was an example where, if there had been a better means of validation
than testing, it might have paid to use it. The human test subjects
were given a dose that was supposed to be far below that needed to
produce any clinical symptoms. And it damn nearly killed them.
 
