accuracy when comparing doubles


vk

I am writing image processing code for a university project.
When I work with numbers (doubles), is it safe to assume that:

a) simple arithmetic is accurate to 15 decimal places, always
b) boolean comparisons will be accurate with doubles (up to 15 decimal
places)

?

If not, I'll have to scrap all of my c++ (!!) or use GMP.. and both
would suck.
 

Kai-Uwe Bux

vk wrote:

[snip]
When I work with numbers (doubles), is it safe to assume that;

a) simple arithmetic is accurate to 15 decimal places, always

That is implementation defined. The implementation, however, makes some
information about the precision of floating point arithmetic available
through the header <limits>. Note that even if the precision of a double is
15 decimals, it is still not guaranteed that, say, addition of two doubles
will yield the value closest to the exact sum (one of the at most two nearest
representable values). However, the compiler documentation should point out
additional guarantees and standards governing the floating point arithmetic.
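
For instance, a short program along these lines (just a sketch) reports what
your implementation promises:

#include <iostream>
#include <limits>

int main()
{
    // decimal digits guaranteed to survive a round trip through double
    std::cout << "digits10: " << std::numeric_limits<double>::digits10 << '\n';
    // gap between 1.0 and the next representable double
    std::cout << "epsilon:  " << std::numeric_limits<double>::epsilon() << '\n';
    // true if the implementation uses IEEE 754 (IEC 559) doubles
    std::cout << std::boolalpha
              << "IEEE 754: " << std::numeric_limits<double>::is_iec559 << '\n';
}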

b) boolean comparisons will be accurate with doubles (up to 15 decimal
places)
[snip]

Comparison of floating point numbers is accurate as long as the values
involved are objects. If you compare intermediate values, then clause
[5/10] may apply:

The values of the floating operands and the results of floating
expressions may be represented in greater precision and range than that
required by the type; the types are not changed thereby.

In particular, numbers may compare non-equal even if they are the same once
written to memory.
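
A contrived sketch of how that can show up (whether the two branches agree is
implementation-specific; x87-style extended registers are the classic culprit):

#include <iostream>

double inverse(double d) { return 1.0 / d; }

int main(int argc, char**)
{
    double d = argc + 9.0;       // runtime value, so the division can't be folded at compile time
    double stored = inverse(d);  // rounded to double when written to the object
    if (stored == 1.0 / d)       // the right-hand side may live in a wider register
        std::cout << "equal\n";
    else
        std::cout << "non-equal\n";   // a legitimate outcome under [5/10]
}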


Best

Kai-Uwe Bux
 

Juha Nieminen

vk said:
I am writing image processing code for a university project.
When I work with numbers (doubles), is it safe to assume that;

a) simple arithmetic is accurate to 15 decimal places, always

Consider this:

double sum = 0.0;
for(int i = 1; i <= 10; ++i)
sum += 0.1;
if(sum == 1.0) std::cout << "Correct result." << std::endl;
else std::cout << "Incorrect result." << std::endl;

Take a guess of what this program will output.
If not, I'll have to scrap all of my c++ (!!) or use GMP.. and both
would suck.

GMP won't help you in this case.
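
Printing the sum with full precision shows what is going on (the digits in the
comment assume IEEE 754 doubles):

#include <iomanip>
#include <iostream>

int main()
{
    double sum = 0.0;
    for (int i = 1; i <= 10; ++i)
        sum += 0.1;
    // Typically prints 0.99999999999999989, which is why the == 1.0 test fails.
    std::cout << std::setprecision(17) << sum << '\n';
}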
 

joecook

If not, I'll have to scrap all of my c++ (!!) or use GMP.. and both
would suck.

If you have chosen to use an 8-byte floating point representation in your
program, this has very little to do with C++. GMP will not help at
all, nor will scrapping C++ (i.e. using MATLAB, Fortran, etc. will
have the same issues).

You are going to have to define what it means to be accurate to 15
decimal digits for "simple arithmetic".

For instance, you might expect (1.0e16 + 0.05) - 1.0e16 to be 0.05,
but you may very well get 0 as the answer. This is simple
arithmetic, but it is accurate to zero digits.
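
If you want to see it for yourself (assuming IEEE 754 doubles):

#include <iostream>

int main()
{
    // Near 1.0e16 adjacent doubles are 2.0 apart, so the 0.05 is absorbed
    // completely and the subtraction gives back exactly 0.
    std::cout << ((1.0e16 + 0.05) - 1.0e16) << '\n';   // prints 0
}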

Joe Cook
 

Chinson

vk said:
I am writing image processing code for a university project.
When I work with numbers (doubles), is it safe to assume that;
a) simple arithmetic is accurate to 15 decimal places, always
b) boolean comparisons will be accurate with doubles (up to 15 decimal
places)

If not, I'll have to scrap all of my c++ (!!) or use GMP.. and both
would suck.

This may help

#ifndef FLOAT_UTILS_Header__
#define FLOAT_UTILS_Header__

// Note: GetExpoBase2 type-puns through pointer casts and assumes a
// little-endian IEEE 754 layout for double, so this header is not
// portable C++. It extracts the unbiased base-2 exponent of d by
// masking the 11 exponent bits (binary 0111111111110000, i.e. 32752)
// out of the high 16-bit word.
inline int GetExpoBase2(double d)
{
    int i = 0;
    ((short *)(&i))[0] = (((short *)(&d))[3] & (short)32752);
    return (i >> 4) - 1023;
}

// Treat d1 and d2 as equal when their difference is at least 48 binary
// orders of magnitude smaller than both operands.
inline bool Equals(double d1, double d2)
{
    if (d1 == d2)
        return true;
    int e1 = GetExpoBase2(d1);
    int e2 = GetExpoBase2(d2);
    int e3 = GetExpoBase2(d1 - d2);
    return (e3 - e2 < -48) && (e3 - e1 < -48);
}

inline int Compare(double d1, double d2)
{
    if (Equals(d1, d2))
        return 0;
    return d1 > d2 ? 1 : -1;
}

inline bool Greater(double d1, double d2)
{
    return !Equals(d1, d2) && d1 > d2;
}

inline bool GreaterOrEqual(double d1, double d2)
{
    return Equals(d1, d2) || d1 > d2;
}

inline bool Less(double d1, double d2)
{
    return !Equals(d1, d2) && d1 < d2;
}

inline bool LessOrEqual(double d1, double d2)
{
    return Equals(d1, d2) || d1 < d2;
}

#endif

It's great. I learn a lot.
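
Note that GetExpoBase2 relies on a little-endian IEEE 754 layout, so the header
above isn't portable. A sketch of the same relative-tolerance idea using only
the standard library (the 2^-48 threshold simply mirrors the code above; the
function name is illustrative):

#include <algorithm>
#include <cmath>

// Relative comparison: true when the difference is at least 48 binary orders
// of magnitude smaller than the smaller operand (roughly what Equals() does).
inline bool EqualsPortable(double d1, double d2)
{
    if (d1 == d2)
        return true;
    double diff  = std::fabs(d1 - d2);
    double scale = std::min(std::fabs(d1), std::fabs(d2));
    return diff < std::ldexp(scale, -48);   // scale * 2^-48
}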
 

arnuld

  Consider this:

    double sum = 0.0;
    for(int i = 1; i <= 10; ++i)
        sum += 0.1;
    if(sum == 1.0) std::cout << "Correct result." << std::endl;
    else std::cout << "Incorrect result." << std::endl;

  Take a guess of what this program will output.

What the heck !

it prints: Incorrect Result. That's devastatingly funny.
 

arnuld

Now this is from the FAQ:

But on this pretend machine, 0.1 cannot be represented exactly since it
cannot be formed as a sum of a finite number of powers of 2. You can get
close but you can't represent it exactly. In particular you'd have a 0
in the 1/2-place, a 0 in the 1/4-place, a 0 in the 1/8-place, and finally a 1
in the "sixteenths" place, leaving a remainder of 1/10 - 1/16 = 3/80.
Figuring out the other bits is left as an exercise (hint: look for a
repeating bit-pattern, analogous to trying to represent 1/3 or 1/7 in
decimal format).


I don't understand what exactly the FAQ means by a 0 in the 1/2 place and
a 0 in the 1/8 place. What does that mean? What are those 1/2 and
1/8 places? I understand that 0.1 can be written as 1/10, so
it's in the 1/10th place.
 

Bart van Ingen Schenau

Now this is from the FAQ:


I don't understand what exactly the FAQ means by a 0 in the 1/2 place and
a 0 in the 1/8 place. What does that mean? What are those 1/2 and
1/8 places? I understand that 0.1 can be written as 1/10, so
it's in the 1/10th place.

The 1/2-place, 1/4-place, 1/8-place, etc. all refer to fractional
binary digits in a binary number. Each fractional binary digit has a
value corresponding to 2^(-N), where N denotes the position after the
binary point. As 10 is not a power of two, there is no 1/10-place in a
binary representation.

As an example of different representations, the value 3/8 can be
written as 0.011{2} or as 0.375{10} (The number in braces indicates
the base).
0.011{2} == 0*2^-1 + 1*2^-2 + 1*2^-3
0.375{10} == 3*10^-1 + 7*10^-2 + 5*10^-3
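
The same distinction is easy to see from C++ (the digits in the comments
assume IEEE 754 doubles):

#include <iomanip>
#include <iostream>

int main()
{
    std::cout << std::setprecision(20)
              << 0.375 << '\n'   // 0.375: finite binary expansion, stored exactly
              << 0.1   << '\n';  // 0.10000000000000000555...: repeating expansion, rounded
}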

HTH,
Bart v Ingen Schenau
 

Juha Nieminen

Giuliano said:
- floating point numbers are an approximation of real numbers.

Not always. There are many real numbers which can be represented in a
completely exact manner with floating point numbers.
- you should never compare computed results against fixed value
constants with the equal operator.

Never say never. This should work just ok:

double value = 1.0;
if(value == 1.0)
    std::cout << "Ok" << std::endl;
else
    std::cout << "Error" << std::endl;

In the case above it would have been wiser to write something like:

if (fabs(sum - 1.0) < EPSILON) ...

where EPSILON is your maximum error tolerance. In this case: 1e-15

That's the "classical" way of comparing floating point numbers, but it
doesn't work well in all cases.

The problem with the above comparison is that the actual amount of
digits being compared depends on the magnitude of the values. For
example, if you are comparing values between 1000.0 and 10000.0, you are
actually comparing 19 digits, but if you are comparing values between
0.001 and 0.01, you are comparing only 12 digits. Clearly the
comparisons are not equal in tolerance: The larger values are being
compared with a much lesser tolerance than the smaller values, and a
change of the same relative magnitude may cause the comparison to fail
in one case but not the other.

Sometimes you need to compare the n *most significant* digits, not
that the numbers are equal up to n digits after the decimal. The formula
to do this is a bit more complicated.
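
A rough sketch of such a relative comparison (the name and the default
tolerance are only illustrative, and it still needs extra thought for values
near zero):

#include <algorithm>
#include <cmath>

bool nearly_equal(double a, double b, double rel_eps = 1e-12)
{
    // Scale the tolerance by the larger magnitude, so that roughly the same
    // number of significant digits must agree at any magnitude.
    return std::fabs(a - b) <= rel_eps * std::max(std::fabs(a), std::fabs(b));
}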
 

James Kanze

Juha Nieminen ha scritto:
Some considerations/suggestions when using floating point:
- floating point numbers are an approximation of real numbers.

Not really, and that's where the problem lies. Floating point
numbers are not real numbers, and don't obey the rules of real
number arithmetic. They have their own rules, which must be
learned and understood to use them effectively.
- precision of the final result depends also on the
computation you make

The final result is always exact, according to the rules of
floating point arithmetic. It may not be what you expect,
however, if you are thinking in terms of real numbers. (And I'm
not sure that "precision" is the best word to categorize the
difference. The difference between a floating point result and
what the result would be with real numbers can easily be several
orders of magnitude.)
- you should never compare computed results against fixed value
constants with the equal operator.
Nonsense.

In the case above it would have been wiser to write something
like:
if (fabs(sum - 1.0) < EPSILON) ...
where EPSILON is your maximum error tolerance. In this case:
1e-15

That's fine for this case, but it doesn't cover all cases, by
far. In most of the floating point arithmetic in my current
application, we use ==, because that's what we need.
 

James Kanze

James Kanze ha scritto:
Well, technically floating point numbers are a subset of
rational numbers except for +/-INF and NAN elements (at least
according on the IEEE 754 standard which is the most used
today).
Clearly the opposite is not true, that is rational number
cannot always be represented by a floating point numbers.
However that's not the point. The point is:
- clearly floating point numbers can be viewed as an
independent algebraic structure with its own rules. However
they have been designed in that way with the specific purpose
of mimicking as closely as possible the behaviour of real
numbers and operators. Therefore the approximation view is
preferable in most cases.

Except when it isn't. In fact, if you start thinking of
floating point values as an "approximation" of real numbers,
you'll quickly get into trouble. The issues are more
complicated, and if you want to write robust floating point
software, you have to understand floating point arithmetic, and
not just real arithmetic.
- whether the approximation is good or bad depends on many
factors including the number of bits used, the value, the
operators applied, the sequence of application and most
importantly the accuracy required.

There are very few applications which can tolerate errors of several
orders of magnitude, and unless you design your algorithms with the
characteristics of floating point arithmetic (and not real
arithmetic) in mind, you can easily end up with errors of several
orders of magnitude.
- their algebraic structure is important because its knowledge
allows us to estimate (at least in simple cases) the error we
are making with this model. I say simple cases because there
are complex algorithms (e.g. matrix inversion) where it can be
difficult to estimate the maximum error we're making.

There are some simple cases where you can pretty much settle for
using them as an "approximation"---I do so for calculating
percents, for example. But they don't go very far.
"The use of the equality test (if (x==y) ...) is usually not recommended
when expectations are based on results from pure mathematics.

There's a significant difference between "usually not
recommended when expectations are based on the results from pure
mathematics" and "never". Even in the Wikipedia article,
however, the expression should be "mathematics with real
numbers", and not "pure mathematics"---floating point is also
subject to some real mathematics. And an even better
recommendation would be to not use floating point with such
expectations at all.
Such tests
are sometimes replaced with "fuzzy" comparisons (if (abs(x-y) < epsilon)
...), where epsilon is sufficiently small and tailored to the
application, such as 1.0E-13 - see machine epsilon). The wisdom of doing
this varies greatly. It is often better to organize the code in such a
way that such tests are unnecessary."

The important point is that you have to know what you're doing.
Just replacing all of your tests for equality with tests for
approximate equality isn't going to make broken code correct.
 

James Kanze

James Kanze ha scritto:
Your statements are simply contradictory.
You are after "robust floating point software", but at the
same time you consider exact the computations made with
floating point arithmetic.

They are exact, according to the definitions of floating point
arithmetic.
But with this logic whatever you do with fp is robust and
sound unless, and here's the contradiction, by robust you
actually mean that the final result approximates as well as it
can the one you would get from real arithmetic.

No. Floating point arithmetic obeys different rules than real
arithmetic, and you have to understand and obey those rules if
you want "useful" results.
So eventually you're after an approximation of the reals.

Often (although not in my work). For the final results.
Whether the "exact" floating point result is an adequate
approximation of the real value you want depends on the laws of
floating point arithmetic, however, and not on the laws of real
arithmetic. And those laws are exact; there are no
approximations involved at that level.
But nobody is saying that you should follow the real model
(i.e. making the same computations); I'm saying you're after
the real model's final result. The best path to get there is
something fp knowledge has to tell you.

OK. There are exceptions, but that's usually the case. The
results are an approximation, but the analysis you do to ensure
that the approximation is adequate depends on the exact behavior
of floating point arithmetic.
Can you show us an example where you use floating point not as
an approximation of real numbers ? (note: fractional or
integer numbers are also real numbers).

Well, if you take the point of view expressed in the
parentheses, no. But the most frequent use of floating point in
my current applications is monetary values, and the abstraction
we use isn't real, but decimal arithmetic (with five digits
after the decimal); we take pains to ensure that the results are
correctly rounded to these values before continuing
calculations.
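
(Just to illustrate the kind of step I mean, not our actual code: forcing an
intermediate result back onto the five-decimal grid could look something like
this.)

#include <cmath>

double roundTo5Decimals(double v)
{
    // Round to the nearest multiple of 0.00001 before continuing the calculation.
    return std::round(v * 100000.0) / 100000.0;
}
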
Well, yes never is a bit too strong.
But that's really a bad practice.

In some cases. In others, it's perfectly appropriate. (In
particular, comparing for equality with 0.0 is often
appropriate.)

The point is that you have to understand what you're doing, and
act in consequence. Never use == is as bad a rule as always use
==.
To me it simply means that when x or y are the result of some
computation (e.g. x = a + b), you should not rely on the ==
operator, because even a small error you carry along can lead
you completely off track.

It depends. If the small error is acceptable, fine. If it's
not, you should compare for equality. There are many cases
where such comparisons are reasonable.
This is different for example from settings some values to
-1.0 in a matrix and then checking back whether an element is
== -1.0.
Clearly this works if, for example, you're sure that no
computation on that matrix will set a negative value.

Sentinel values which you set are an obvious case. There are
also cases where a calculation must result in 0.0, and
doubtlessly some where it must result in some other exact value
(1.0, etc.).
That depends on why the code is broken.
If for example I'm trying to find a solution of:
x^2 = exp(-x)
replacing the equality test with a fabs( ... ) < epsilon helps
a lot. (searching for the minimum is even better).
Clearly the rest of the algorithm must also be clever enough
to hunt the root down properly, epsilon chosen properly, and
so on, but that's another story.

It's that other story which always worries me. It's possible,
for example, for an algorithm to converge using real numbers,
but not with floating point arithmetic. Other algorithms need
serious rework---even something as simple as finding the average
of a large set of values.
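
(For the averaging example, the textbook rework is compensated summation; a
sketch, not code from my application:)

#include <vector>

double mean(std::vector<double> const& v)
{
    double sum = 0.0;
    double comp = 0.0;                          // carries the lost low-order bits
    for (std::vector<double>::size_type i = 0; i != v.size(); ++i) {
        double y = v[i] - comp;
        double t = sum + y;
        comp = (t - sum) - y;                   // algebraically zero, but not in floating point
        sum = t;
    }
    return v.empty() ? 0.0 : sum / v.size();
}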
 

James Kanze

With respect, I think you are arguing semantics rather than
mathematics.

No. Mathematics knows many different kinds of numbers. My
point is precisely that floating point numbers are a type of
number; there are a certain number of operations defined on
them, and those operations are defined precisely.
Your statement seems to follow circular logic -- floating
point arithmetic is exact because it follows the rules of
floating point arithmetic.

And integer arithmetic is exact because it follows the rules of
integer arithmetic, and real arithmetic is exact because it
follows the rules of real arithmetic. Obviously. We're dealing
with definitions here.
That's like saying that a digital computer program should give
the same answer every time, given the same inputs.

And that that answer is deterministically known, according to a
strict set of rules. (Of course, programs don't always give the
same answers for the same input. C++ has undefined behavior,
and when threading kicks in...)
Well, of course it does, unless the computer is broken. Even
programs with simulated noise (i.e., a random number generator
buried inside), should give the same exact answer, if the RNG
seed isn't changed.

Reading /dev/random doesn't always give the same answer.
Counting the number of times you can loop between keyboard input
doesn't always give the same answer.
Ok, let's clear the air here. We are really discussing AT
LEAST three different kinds of numbers here. "Real" numbers,
which I take to mean numbers in the mathematical sense

Let's clear the air here. There is no such thing as "numbers in
the mathematical sense". Mathematics knows many different types
of numbers, and they do follow different rules.
-- all rational plus irrational real numbers; the
APPROXIMATIONS of real numbers,

No. Rational numbers have their own laws, which aren't exactly
the same as real numbers. And integers have somewhat different
laws as well.
which are represented in computers using floating point (IEEE
or any other definition); and the practical case where we use
fp numbers to represent real-world values of some measurement
or computation.

Which may or may not work. Typically, we can represent
real-world values adequately with floating point numbers. The
problems come afterwards, because floating point numbers don't
obey the rules of real numbers. (They're not associative over
addition, for example.)
I think part of the problems you and Guiliano are having are
that you tend to ping-pong from one of these entities to
another.
That's right.

The problem is that that point of view doesn't lead us anywhere.
All floating point numbers do correspond to an exact real
number; in that sense, floating point numbers are a subset of
the real numbers. But addition is defined differently over
them; in that sense, they are not the real numbers.
Regardless of some notion you have of theoretical behavior of
floating point numbers, they were invented for the specific
purpose of representing values of things in the physical world
-- voltages, currents, positions, velocities, and the like. Or
even non-physical things like sqrt(2) or sin(x).

Why they were invented is irrelevant. Trying to understand what
they really do is relevant. Since they've been invented, a lot
of research has been done on them, and the principles underlying
them are fairly well understood today.
I can't think of a single use of fp that doesn't involve doing
the best one can to approximate some other number that can't
be represented, exactly, in the computer.

I can. I regularly use them to represent monetary values, and
monetary values can always be represented exactly in a computer.

But that's not the point. It's clear that we can't implement
real numbers in a computer, since that would require infinite
memory. And it's clear that for many applications, floating
point is the best we can do. But that doesn't make them real
numbers, and they don't magically start obeying the rules of
real numbers just because we want them to.
You seem to be arguing that fp numbers have a domain all their
own, defined by the way they are implemented in a given
computer. Those rules are non-portable from computer to
computer, or even from, say, real and double.

And. It's not so much that I'm arguing it; I'm simply
stating a fact of life, one that we have to deal with. It
doesn't make things simpler, but that's life.
Giuliano (and I) are saying that use of fp doesn't have much
value, or make much sense, except to APPROXIMATE the values of
real numbers in the mathematical sense.

The problem is that arithmetic over floating point numbers
behaves significantly differently than arithmetic over real
numbers.
Ah. There's the rub. IMHO you are using a fundamentally wrong
method to represent a whole 'nuther kind of real number. Use
of fp numbers to represent monetary values is fundamentally
flawed. You should define a new type entirely, called
something like MonetaryValue, which uses decimal arithmetic
throughout. I think you are deluding yourself that your
numbers are being "correctly" computed. The fact that you get
the same number every time doesn't mean it's the right one.
And comparing it to the same number, computed by the same
computer, using the same fp calculations, does _NOT_ prove
that they are "equal."

Certainly. It's not what I would have chosen. But in fact,
for what we're doing, it works fairly well, and it's certainly a
lot easier than defining all of the necessary operations over
some new numeric type. (We're often dealing with things like
compound interest and such, which require more or less
complicated functions.)
No, it's not. Not even close. 1.0e-31, or even 1.0e-300, is
_NOT_ zero, nor will any == test claim it to be.

Exactly. And if you need 0.0, then you can't say that they are.
Furthermore, you seem to be ignoring the possibility that
computations lose precision. If you subtract two 15-digit
numbers, you can easily get a _ONE_ digit result.

You're the one who seems to be ignoring it. I'm the one saying
that it isn't a question of an approximation; that the results
aren't governed by the rules of real arithmetic.
Again IMO, it's _NOT_ a bad rule, and use of a == comparison
is _NEVER_ a good idea. That rule has not changed since 1960,
and for good reason.

That was never a rule among people who knew what they were
doing. It's some silly rule made up by people who didn't want
to go to the bother of learning what they were doing.
Even that (use of sentinel numbers), I submit, is bad
practice.

But you can't give a reason for it.
If you want to know if a given number has ever been given a
value, you should initialize it to NaN, not -1.0 or any other
"real" number.

Not all floating point representations support NaN.
Better yet, find another way, using a boolean or some other
flag, to indicate that a number has been initialized.

Sure. There are other possible solutions. Whether they are
better or not depends on context.
I'll go even further than that. An iterative computation --
the only kind for which the notion of convergence makes sense
-- will almost _ALWAYS_ fail to converge,

Unless it was designed to converge, according to the rules of
floating point arithmetic.
if one sets the criterion for convergence too small. When
differences get down to the point of differences in the LSB,
it's common for them to ping-pong back and forth between two
numbers, neither of which is exactly equal to zero.
Please tell me you are not using iterative convergence
techniques on monetary values.

Not directly. But I suppose that some of the functions we use
do. How else would you calculate a square root or an exponential?
And sqrt(2.0) has an exact answer in floating point arithmetic
(which isn't exactly the same as in real arithmetic); a test
program for a sqrt function will use an exact comparison
(probably with an artificially constructed floating point) to
verify that it works.
 

Lionel B

James Kanze wrote:
Is that true? If that's the case, then /dev/random is broken as well.

Um, the entire reason for existence of /dev/random is to generate (as near
as possible) *unpredictable* sequences.
Or, to be more accurate, one must use it with great care.

One must use it appropriately, with awareness of what it does.
Many
scientific and engineering problems require the use of random numbers to
simulate noise in real, physical systems. Those pseudorandom numbers are
generated, amazingly enough, by random number generators (RNGs).
Ironically, those RNGs _MUST_ be deterministic in what they do. Given a
certain seed, the RNG must always produce the same sequence of fp
numbers. If it didn't, it would be impossible to validate the software.
/dev/random may be Ok for dealing cards in Freecell, but not in serious
software.

It's a different domain; you use a PRNG when you actually want
reproducibility, /dev/random when you explicitly do *not* want
reproducibility.

[...]
Yes, indeed they do. Which is why testing them for equality is a Bad
Idea.

I find that a bit of a non-sequitur. Why should it be a Bad Idea to test
for equality of floating point numbers if that is what you want; ie. you
truly wish to test for equality in the (unambiguously defined) sense of
floating point - not real number - arithmetic.

Sentinel values are a case in point. You do, of course, have to be aware
of potential pitfalls, such as the possibility that you end up comparing
the result of an fp computation that "accidentally" happens to compute your
sentinel value.

I might be more sympathetic to something along the lines of: "Testing
floating point numbers for equality is often a bad idea because there are
many potential pitfalls associated with doing so". But then I could
equally claim: "Using virtual functions is often a bad idea because there
are many potential pitfalls associated with doing so".

All we can say is that it is always a Bad Idea to use any technique that
you do not fully understand the implications of.
 

James Kanze

Are we going to talk about mathematics here, or computer arithmetic?

Computer arithmetic is a branch of mathematics.
As I said, I think the tendency to ping-pong between topics is
part of the "failure to communicate." I agree with you 100%
that one can define an arithmetic, in the purely mathematical
sense, for floating point numbers.

Not only can---it's been done, and it's what you use when you do
floating point arithmetic.
I also agree that such an arithmetic can be made very precise,
at least as long as you stick to the same hardware and same
computer language types (e.g., float or double). What we seem
to disagree on is whether such an arithmetic has any value
whatsoever, outside of a Comp. Sci. classroom.

Without it, you can't get useful results from floating point
arithmetic. Unless you're using it, you have no business using
floating point arithmetic, since you don't know what your
results mean.
As Giuliano and I have both pointed out, in the real world
people don't use computers (you might prefer "don't usually
use computers) to calculate practical things that apply to ...
um ... the real world. You yourself have said that this is
exactly what you're doing -- using fp numbers to represent
financial values. The very instant you associate the
arithmetic of fp numbers with the arithmetic of financial
data, you have crossed the line into the real world. In this
world, the fp number is only an approximation of the
real-world number.

Not "only". The original floating point number may be an
approximation of some real-world number (in my case, a decimal
fraction with five places after the decimal). The final results
may also be interpreted as such. But that doesn't make
floating point arithmetic decimal arithmetic. Floating point
arithmetic continues to be floating point arithmetic, and obey
the rules of floating point arithmetic, and not those of decimal
arithmetic (or real arithmetic, if your original abstraction is
real numbers).
Then we'd better dang well fix C++.

With regards to undefined behavior in a single threaded
environment, I agree. But the standardization committee
doesn't. With regards to threading, I don't think it's even
possible.
Repeatable behavior is not just a desire. It's an absolute
requirement. If your computer program, or the system that you
run it on, doesn't give repeatable results, it's broken, plain
and simple.

Then every system in the world is broken, because none of them
gives truly repeatable results. (If /dev/random gave repeatable
results, I'd consider it broken.) A physical system can never
give repeatable results.

But I think we're straying from the subject. My comment was in
parentheses, because it wasn't concerned with floating point
arithmetic, but with randomness due to timing issues, etc. (A real time
system will generally have different behavior depending on
whether event a occurs before or after event b. And when the
two events occur at the same time, or very close to the same
time, whether event a or event b is considered to occur first is
more or less random.)
Is that true? If that's the case, then /dev/random is broken
as well.

Just the opposite. The whole point of /dev/random is to
generate a truly random value, for programs which need such.
Or, to be more accurate, one must use it with great care. Many
scientific and engineering problems require the use of random
numbers to simulate noise in real, physical systems. Those
pseudorandom numbers are generated, amazingly enough, by
random number generators (RNGs). Ironically, those RNGs
_MUST_ be deterministic in what they do. Given a certain seed,
the RNG must always produce the same sequence of fp numbers.
If it didn't, it would be impossible to validate the software.
/dev/random may be Ok for dealing cards in Freecell, but not
in serious software.

Actually, it's used in some very serious software; there are
times when the computability (and even the possibility of
deterministically repeating a sequence) are undesirable. Things
like generating a one-time key in cryptographic applications,
for example.
As for the keyboard press example, that's a contrived case,
and I think you know it.

It's a case from the real world.
You're not computing anything in the computer at all: You're
simple measuring the operator at the keyboard. The only result
you get is that people are unpredictable. I think we already
knew that.

The result I'm getting is that physical systems are
unpredictable. People admittedly more than most, but you have
some unpredictability anytime a physical system is involved.

(None of which has anything to do with how to use floating point
arithmetic, of course. It's completely a side issue.)
Again, I think you have to distinguish between number theory
and computer implementations.

And what do you think underlies IEEE floating point, if it isn't
number theory?
It's also quite true that one can implement rational numbers
as rational numbers; i.e., compute the numerator and
denominator of each fraction. However, more often than not, we
use fp numbers to represent both rationals and integers. The
three kinds (four, if you count complex -- six, if you allow
for strange things like complex rationals and complex
integers) end up being calculated in the same way, and
according to the same rules.

No, they all have their own rules, and properties which hold for
one don't necessarily hold for others. In particular, (a+b)+c
== a+(b+c) for real numbers (and rationals, and integers), but
not for floating point. If this property is essential for your
algorithm to work, then it won't work with floating point
numbers.
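
A two-line demonstration, assuming IEEE 754 doubles:

#include <iostream>

int main()
{
    double a = 1.0e16, b = -1.0e16, c = 1.0;
    std::cout << ((a + b) + c) << '\n';   // prints 1
    std::cout << (a + (b + c)) << '\n';   // prints 0: c vanishes when added to b
}
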
I think we just said the same thing.

I don't think so. You seem to be trying to treat floating point
as if they were real numbers, with some error factor. They're
not; they form an arithmetic of their own, with different rules
and properties than real numbers.

[...]
What you just said is strictly true, but irrelevant. Since
you've made such a point that the rules of fp arithmetic is
not the same as those for real numbers, integers, and
rationals, you should be aware that every floating point
number is a rational number.

So. Every integer is a real number, too. Every real is a
complex number as well, but reals and complex numbers have
different properties.

You're ignoring the essential point: the properties of real
arithmetic don't hold for floating point arithmetic. Other
properties do, however, and if you're using floating point
arithmetic, you need to understand those properties.

[...]
It would not seem to be the case. If it were true, we wouldn't
be holding this discussion.

It's easily verified. Look at the amount of literature
concerning the mathematical theory behind floating point
arithmetic, and the papers analysing different algorithms. It's
true that there are still some things for which the best
algorithm isn't known, but the underlying principles of how
floating point arithmetic works, and the rules which apply to
it, are established. (They do require a high degree of
mathematical competence to understand, and so aren't presented
in popular treatments. That doesn't mean that they aren't
understood, however.)
Yes, you said that. Yet, a bit further down your msg, you talk
about iteration, square roots, exp(x), and other things,
_NONE_ of which can be represented exactly.

The result I want can be represented exactly. The law specifies
very precisely what it must be. Even when exponentials and such
are involved.
Again, we seem to agree. Real numbers are not the same as fp
numbers. As long as we recognize that they are not, we're all
happy. The problem arises because we are using computer
implementations of fp arithmetic to represent variables that
are continuous, rather than discrete.

Sort of. The real problem arises because programmers expect
properties to hold which don't, because floating point numbers
are not real numbers.
Yes, indeed they do. Which is why testing them for equality
is a Bad Idea.

No. That's why doing anything with them without understanding
how they work is a bad idea. There are times when you test for
equality, and times when you don't. Just avoiding tests for
equality, and replacing them with range tests (within an
epsilon) is a sure recipe for disaster.

[...]
Many years ago, I was asked to help rescue an accounting
program for the engineering firm I worked for. They had hired
a guy (for not much money, I think) to write the program, and
he chose Fortran. He, too, was using fp numbers to represent
monetary values. He, too, was computing things like compound
interest. And he, too, was rounding the calculations in a
manner that he thought was safe. But despite his best efforts,
he was not able to make the program produce the same results
that our customer (NASA) got using COBOL. Every week, a penny
would pop up, either positive or negative, due to rounding. So
every week, the accountants had to double-check the
computations by hand, and add or subtract a penny here or
there.
We looked the program over pretty carefully, and finally
concluded that it was unfixable. In the end, the company ended
up spending a lot more money to have the program rewritten in
COBOL.

Accounting laws define exactly how rounding is supposed to work.
In terms of decimal arithmetic. Obviously, it's a lot simpler
to implement if you have an abstraction which implements decimal
arithmetic directly. But it can also be made to work using
floating point, if you know what you're doing. (Not float, of
course---seven digits decimal precision isn't enough for much of
anything.)

[...]
Of course I can. It's sloppy programming.

Why? It's the established best practice, and it provably works.
It's using a real number to transmit information that is
fundamentally Boolean in nature (e.g., has the value of this
element been given a value?). People use sentinel numbers to
save having to pass boolean flags around. I've done it too,
but the better solution is to use a boolean to represent a
logical condition.

A lot depends on context. I'm a great fan of Fallible; I use it
all the time. But if you're dealing with large matrixes of
values, it can introduce unacceptable overhead.
 

SG

Without it, you can't get useful results from floating point
arithmetic. Unless you're using it, you have no business using
floating point arithmetic, since you don't know what your
results mean.

I beg to differ and agree with Jack.

Machine floats (as a set) and the operations +,-,*,/ defined on this
set don't form a field. But a field is one of the algebraic
structures mathematicians like to deal with. That's why numerical
error analysis of algorithms is much easier via the "approximation-of-
real-numbers model". Every operation is assumed to be exact in terms
of real arithmetic but is then followed by a distortion that is below
a certain fraction (epsilon) of the exact result. This distortion is
due to the representation limits of numbers.

There's little value to the actual rules for the operations as long as
an epsilon that constrains this approximation is known. This is what
numeric_limits<double>::epsilon() is for. All numeric algorithms I
know of are based on this assumption. There are some exceptions that
go beyond that. For example, multiplying a number by a power of B is
exact in terms of real arithmetic for representations of the form sign
* B^exponent * mantissa. This is used in some algorithms for error-
free normalization of numbers to avoid exponent underflow/overflow.
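
For instance (a small sketch, assuming no overflow or underflow):

#include <cassert>
#include <cmath>

int main()
{
    double x = 0.1;                   // already rounded once when converted to double
    double y = std::ldexp(x, 40);     // x * 2^40: only the exponent changes
    assert(std::ldexp(y, -40) == x);  // exact round trip, no further rounding error
}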

I don't know what the current practice is for banking software. I
always assumed they would use fixed-point numbers.

Cheers!
SG
 

Lionel B

Lionel, can you give me an example of the latter, in an application more
serious than Freecell?

Another poster mentioned cryptographic applications.

In fact, reproducibility may not be the whole issue. No PRNG generates
"truly random" sequences [I really don't want to get into what a "truly
random" sequence might mean] and there are cases - in scientific
programming, for instance - where you want to be very, very sure indeed
that a sequence of random deviates doesn't have correlations, etc. that
might skew results. In which case you might use something like /dev/random
(or even download a huge file full of random deviates generated from some
comparable natural source of entropy.).
That's it exactly. Certainly testing a number for equality, when you're
the one who gave it its value, is perfectly valid and safe. The last
time I did that was yesterday. But, as you say, there is always the
possibility, however improbable, that the result of an fp computation
also yields that exact number. That's why it's not a good practice.

Testing floating point numbers for equality is often a bad idea because
there are many potential pitfalls associated with doing so.

How's that?

I think Pete Becker has answered that better than I could.
 

Lionel B

In fact, reproducibility may not be the whole issue. No PRNG generates
"truly random" sequences [I really don't want to get into what a "truly
random" sequence might mean] and there are cases - in scientific
programming, for instance - where you want to be very, very sure indeed
that a sequence of random deviates doesn't have correlations, etc. that
might skew results. In which case you might use something like
/dev/random (or even download a huge file full of random deviates
generated from some comparable natural source of entropy.).

But once you have that file the results are completely reproducible. And
at the very least that's what you need to stabilize the results for
debugging.

Indeed. That's what I meant by "reproducibility may not be the whole
issue".
 

Lionel B

Lionel said:
Lionel B wrote:
On Mon, 12 Jan 2009 03:29:35 -0500, Jack Crenshaw wrote:

James Kanze wrote:
Reading /dev/random doesn't always give the same answer.
Is that true? If that's the case, then /dev/random is broken as
well.
Um, the entire reason for existence of /dev/random is to generate (as
near as possible) *unpredictable* sequences.

Or, to be more accurate, one must use it with great care.
One must use it appropriately, with awareness of what it does.

Many
scientific and engineering problems require the use of random
numbers to simulate noise in real, physical systems. Those
pseudorandom numbers are generated, amazingly enough, by random
number generators (RNGs). Ironically, those RNGs _MUST_ be
deterministic in what they do. Given a certain seed, the RNG must
always produce the same sequence of fp numbers. If it didn't, it
would be impossible to validate the software. /dev/random may be Ok
for dealing cards in Freecell, but not in serious software.
It's a different domain; you use a PRNG when you actually want
reproducibility, /dev/random when you explicitly do *not* want
reproducibility.
Lionel, can you give me an example of the latter, in an application
more serious than Freecell?

Another poster mentioned cryptographic applications.

In fact, reproducibility may not be the whole issue. No PRNG generates
"truly random" sequences [I really don't want to get into what a "truly
random" sequence might mean] and there are cases - in scientific
programming, for instance - where you want to be very, very sure indeed
that a sequence of random deviates doesn't have correlations, etc. that
might skew results. In which case you might use something like
/dev/random (or even download a huge file full of random deviates
generated from some comparable natural source of entropy.).

And what are you going to do for the next run? Download a different huge
file?

Depends whether I want reproducibility or not :) [Actually I probably
wouldn't have to download a different huge file, as I would have sub-
sampled the original huge file].

Seriously, I might on the one hand, want to run with a different set of
(truly) random deviates to check whether, for instance, some confidence
limit agrees with some analytical result. Alternatively, I might be
testing/debugging my program - in which case I would obviously use the
same set of deviates.
Maybe I should clarify what I said about RNG's, testing, etc. When using
a program that includes an RNG (or RNG's), I agree that it's important
that the results not be repeatable.

It may or may not be, depending on the application.
When _TESTING_, however, it's critically important that they are.

Of course.
The best solution I've been able to
find is an RNG whose seed you can set. Set it to a constant (1 is not a
bad one) for testing; get it from somewhere else -- wall clock or
/dev/random or whatever -- for production.

My point was that there may be other reasons - eg. statistical
considerations - for *not* using a PRNG, but rather using a "truly random"
source such as /dev/random.
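
For what it's worth, a sketch of that arrangement using std::mt19937 and
std::random_device from <random> (the helper name is just illustrative):

#include <random>

std::mt19937 make_rng(bool testing)
{
    if (testing)
        return std::mt19937(1u);                    // fixed seed: repeatable sequence for tests
    return std::mt19937(std::random_device{}());    // non-deterministic seed for production runs
}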
 
