double add problem

Discussion in 'Java' started by Joe Pribele, Jul 30, 2003.

  1. Joe Pribele

    Joe Pribele Guest

    When I add these two numbers together I get a number that is off by a very
    small amount. Is this because double is a floating-point type?
    This may not seem like much, but when I add a couple more numbers and
    then round, I am off by 1, which is a lot.

    java
    20.5114 + 1.5890000000000002 = 22.100399999999997 // not quite right

    windows calculator
    20.5114 + 1.5890000000000002 = 22.1004000000000002 // this is correct

    public class test {
        public static void main(String[] args) {
            System.out.println(20.5114 + 1.5890000000000002);
        }
    }
    Joe Pribele, Jul 30, 2003
    #1

  2. Joe Pribele:

    >When I add these two numbers together I get a number that is off by very
    >small amount is this because double is a floating point.


    Yes, floating point types by their very nature are inexact.

    Check out Roedy's Java glossary or just ask Google, there are some
    introductory texts on the topic.

    Regards,
    Marco
    --
    Please reply in the newsgroup, not by email!
    Java programming tips: http://jiu.sourceforge.net/javatips.html
    Other Java pages: http://www.geocities.com/marcoschmidt.geo/java.html
    Marco Schmidt, Jul 30, 2003
    #2

  3. Joe Pribele wrote:
    > When I add these two numbers together I get a number that is off by very
    > small amount is this because double is a floating point.
    > This may not seem like much but when I add on couple more numbers and
    > then round I am off by 1 which is alot.
    >
    > java
    > 20.5114 + 1.5890000000000002 = 22.100399999999997 // not quite right
    >
    > windows calculator
    > 20.5114 + 1.5890000000000002 = 22.1004000000000002 // this is correct
    >
    > public class test {
    > public static void main(String[] args ) {
    > System.out.println( 20.5114 + 1.5890000000000002 );
    > }
    > }
    >


    It seems that the current (Windows XP) version of calculator isn't using
    doubles for calculation. It can produce values like
    22.0004000000000001001000001, which has far more precision than a
    double. If you want to do this in Java, use the java.math.BigDecimal
    class. This is much slower than double, but will behave as expected.
    Floating point arithmetic is faster, but more complex to understand and
    use correctly.
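    [Ed.: a minimal sketch of the BigDecimal approach, with a hypothetical
    class name; constructing from strings keeps the decimal inputs exact,
    so the addition from the original post comes out as the calculator's
    answer.]

```java
import java.math.BigDecimal;

// Illustrative only: exact decimal addition of the values from the post
public class ExactAdd {
    public static void main(String[] args) {
        // String constructors avoid any binary rounding on input
        BigDecimal a = new BigDecimal("20.5114");
        BigDecimal b = new BigDecimal("1.5890000000000002");
        System.out.println(a.add(b)); // prints 22.1004000000000002
    }
}
```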

    Mark Thornton
    Mark Thornton, Jul 30, 2003
    #3
  4. Joe Pribele

    Tim Slattery Guest

    Joe Pribele <jpribele@no_spam.bglgroup.com> wrote:

    >When I add these two numbers together I get a number that is off by very
    >small amount is this because double is a floating point.


    Floating-point numbers are approximations; there's nothing you can do
    about that. If you require a large degree of precision, check out the
    BigDecimal class.

    --
    Tim Slattery
    Tim Slattery, Jul 30, 2003
    #4
  5. Joe Pribele <jpribele@no_spam.bglgroup.com> writes:

    > 20.5114 + 1.5890000000000002 = 22.100399999999997 // not quite right


    Binary floating point numbers are not 100% accurate.

    > 20.5114 + 1.5890000000000002 = 22.1004000000000002 // this is correct


    Then approximate the calculator by using java.math.BigDecimal
    Tor Iver Wilhelmsen, Jul 30, 2003
    #5
  6. Joe Pribele

    Joe Pribele Guest

    In article <MPG.1991ecfc4289d621989680
    @nntp.lndn.phub.net.cable.rogers.com>, jpribele@no_spam.bglgroup.com
    says...
    > When I add these two numbers together I get a number that is off by very
    > small amount is this because double is a floating point.
    > This may not seem like much but when I add on couple more numbers and
    > then round I am off by 1 which is alot.
    >
    > java
    > 20.5114 + 1.5890000000000002 = 22.100399999999997 // not quite right
    >
    > windows calculator
    > 20.5114 + 1.5890000000000002 = 22.1004000000000002 // this is correct
    >
    > public class test {
    > public static void main(String[] args ) {
    > System.out.println( 20.5114 + 1.5890000000000002 );
    > }
    > }
    >
    >

    Thanks every one for the quick response. You confirmed my suspicions.

    Joe
    Joe Pribele, Jul 30, 2003
    #6
  7. Joe Pribele

    Roedy Green Guest

    On Wed, 30 Jul 2003 19:59:07 GMT, Joe Pribele
    <jpribele@no_spam.bglgroup.com> wrote or quoted :

    >When I add these two numbers together I get a number that is off by very
    >small amount is this because double is a floating point.


    see http://mindprod.com/jgloss/floatingpoint.html

    --
    Canadian Mind Products, Roedy Green.
    Coaching, problem solving, economical contract programming.
    See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.
    Roedy Green, Jul 30, 2003
    #7
  8. Joe Pribele

    Tom McGlynn Guest

    Tor Iver Wilhelmsen wrote:

    > Joe Pribele <jpribele@no_spam.bglgroup.com> writes:
    >
    >
    >>20.5114 + 1.5890000000000002 = 22.100399999999997 // not quite right

    >
    >
    > Binary floating point numbers are not 100% accurate.
    >
    >
    >>20.5114 + 1.5890000000000002 = 22.1004000000000002 // this is correct

    >
    >
    > Then approximate the calculator by using java.math.BigDecimal


    This subject comes up frequently, and I've got some problems with
    the typical discussion. It tends to say things that are not
    true, or at least misleading. This includes Roedy's discussion
    in the glossary as well as the first statement in the reply above.
    My feeling is this tends to make floating point numbers
    more mysterious than they should be.

    Binary floating point numbers as used by Java <are>
    precise/accurate/exact. They represent a unique point on the real
    number line. However the set of available numbers is not dense.
    Only an infinitesimally small fraction (literally) of the
    points on the number line are included in either of the
    floating point representations that Java uses.

    There are two common cases where a Java process needs to choose
    which one of these numbers that it can represent should be
    used in the place of a number that it cannot. These approximations
    are well defined, but give results different from what would
    be the case if Java floats/doubles could exactly represent all numbers.

    The first is the conversion from finite decimal floating
    point numbers. Most numbers which have finite decimal representation
    do not have a finite binary representation. E.g., the fraction 3/10 has
    a finite decimal floating representation (0.3) but no finite
    representation in binary floating point (it's something like
    0.010011001100110011...B).

    So when a user enters a string "0.3" and asks to convert that string
    to a binary floating point number -- at either compile time or during
    the execution of the program -- there just isn't any such number
    available. The rules state that the closest available number will be
    chosen, but in the vast majority of cases there will not be an exact
    match. The conversion is well defined, but it is many-one. Many
    (indeed an infinity) of real numbers could be converted to each of the
    available numbers.
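    [Ed.: the many-to-one conversion can be seen directly. The
    BigDecimal(double) constructor preserves a double's exact binary value,
    so it reveals what the literal 0.3 actually became. An illustrative
    snippet, not from the original posts:]

```java
import java.math.BigDecimal;

public class ExactValueOfPointThree {
    public static void main(String[] args) {
        // new BigDecimal(double) keeps the double's exact stored value,
        // unlike new BigDecimal("0.3"), which keeps the decimal string
        System.out.println(new BigDecimal(0.3));
        // prints 0.299999999999999988897769753748434595763683319091796875
    }
}
```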

    Once Java has numbers in binary, it allows the user to do arithmetic
    on them. There are specific rules for how this arithmetic is done.
    While these rules try to match what we are familiar with, often
    the 'exact' result is a number that is not in the representable
    set. E.g., with

    double x= 3; double y=10; double z = x/y;

    we again have a case where the value that would naively be expected
    for z is not in the set of double values that Java supports. Java is
    very explicit in specifying exactly which value will be selected, but
    it's not quite the same value as one might anticipate. This is
    of course what's happened in the example above. The number
    22.1004000000000002 is not representable in Java, so it took
    the nearest number in the representable set which is something
    very close to 22.100399999999997. In fact it probably requires
    over 50 digits to represent exactly but Java just writes out
    enough digits to distinguish it from any of the other representable
    numbers. [All numbers with finite binary representations also have
    finite decimal representations, usually with about as many decimal
    digits as there are in the binary representation.]

    Roedy's Web page implies that all calculations are fuzzy. This is not
    the case. If the calculated value is in the set of representable
    numbers, the calculation is performed exactly, e.g., for adding,
    subtracting or multiplying small integer values.
    If not, then the nearest representable value is chosen.
    Arithmetic is precisely defined -- there
    is almost no room for different machines to choose different values --
    but Java gives slightly different results than would be expected if all
    real numbers were representable in the set of Java floating point
    numbers.
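    [Ed.: for instance (illustrative, not from the thread), small-integer
    and power-of-two arithmetic is exact, while a result outside the
    representable set is rounded to the nearest double:]

```java
public class ExactWhenRepresentable {
    public static void main(String[] args) {
        // operands and results all in the representable set: exact
        System.out.println(3.0 + 4.0 == 7.0);    // true
        System.out.println(0.5 * 0.25 == 0.125); // true, powers of two
        // exact result not representable: nearest double is chosen
        System.out.println(0.1 + 0.2 == 0.3);    // false
    }
}
```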

    Roedy's page also misleads users in the discussion of the
    StrictMath library and the non-strictfp expressions. These have nothing
    to do with guard bits in the normal sense of the word.

    The StrictMath library is used to give the standard results from the
    standard math functions (trig, log, etc). A Java implementation
    is allowed to use a Math library which gives very slightly different
    (but essentially as accurate) values, that uses a different
    algorithm in the computation -- perhaps taking advantage of local
    hardware. This has nothing to do with guard bits. I believe in Sun's
    JDK's the two actually are the same.

    There is also a strictfp qualifier that users can specify for methods
    and blocks. Within a strictfp block, all intermediate results
    must be in the representable set. However outside of strictfp,
    intermediate results may have exponents which are outside the range
    of those that are usually permitted. If no intermediate result
    would overflow, underflow (or result in denormalized value), then
    the strictfp qualifier makes no difference. So, for double precision
    as long as the intermediate values have magnitudes between 1.e-300 and
    1.e300 (or exactly 0), strictfp makes no difference. However
    something like:

    double x= 1.e-200;
    x = x*x/x;

    must return 0 in a strictfp expression: since 1.e-400 is smaller than
    the smallest representable number it would underflow to 0. In a
    non-strictfp expression, it is allowed to return 1.e-200 (but not
    required to).
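    [Ed.: a sketch of that point, with hypothetical class and method names.
    Under strictfp the intermediate x*x, about 1e-400, must underflow to
    zero before the division:]

```java
public class StrictFpDemo {
    strictfp static double strictVersion(double x) {
        // for x = 1e-200, x*x is below the smallest subnormal double,
        // so under strictfp it must underflow to 0 before dividing
        return x * x / x;
    }

    public static void main(String[] args) {
        System.out.println(strictVersion(1e-200)); // prints 0.0
    }
}
```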

    While strictfp and StrictMath address quite distinct issues,
    the overwhelming majority of users can completely
    ignore the existence of both.

    This isn't meant as a criticism of Roedy or the other posters. However
    I think that in trying to simplify the discussion there's a tendency to
    use language that makes it hard to understand how floating point
    numbers really work and make it seem like floating point arithmetic
    isn't well defined.

    Tom
    Tom McGlynn, Jul 31, 2003
    #8
  9. Joe Pribele

    Roedy Green Guest

    On Thu, 31 Jul 2003 13:42:50 -0400, Tom McGlynn
    <> wrote or quoted :

    >Binary floating point numbers as used by Java <are>
    >precise/accurate/exact.


    That is a philosophical point. What does it mean to say a "number"
    is accurate? It means: does it reflect the value it stands for?

    Ints can get values bang on; doubles often cannot.

    Further, I don't believe the results of floating point operations are
    guaranteed to be precisely correct to the bit -- e.g. presuming
    infinite accuracy, then rounded.

    Newbies are the ones puzzled by this. I think it easiest to explain
    that floating point always has some fuzz. When they get older they
    can learn the full truth.

    --
    Canadian Mind Products, Roedy Green.
    Coaching, problem solving, economical contract programming.
    See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.
    Roedy Green, Aug 4, 2003
    #9
  10. Joe Pribele

    Dale King Guest

    "Roedy Green" <> wrote in message
    news:...
    > On Thu, 31 Jul 2003 13:42:50 -0400, Tom McGlynn
    > <> wrote or quoted :
    >
    > >Binary floating point numbers as used by Java <are>
    > >precise/accurate/exact.

    >
    > That is a philosophical point. What does it mean to say an "number"
    > is accurate? It means "does it reflect the value it stands for?"


    And floating point fits that definition. There is a precise finite subset of
    the real numbers that is represented in floating point, and the encodings of
    those values do reflect the values they stand for.

    > Ints can get values bang on. doubles often cannot.


    That is not true. Ints can only get integers, just as doubles can only get
    certain values. Both pick some finite subset of the set of real numbers. The
    only difference is that the set chosen for floating point is not evenly
    spaced. The distribution is more concentrated around zero and more sparse as
    the values get larger. There is nothing imprecise about the operations.

    > Further, I don't believe the results of floating point operations are
    > guaranteed to be precisely correct to the bit -- e.g. presuming
    > infinite accuracy, then rounded.


    They are for each individual operation.

    > Newbies are the ones puzzled by this. I think it easiest to explain
    > that floating point always has some fuzz. When they get older they
    > can learn the full truth.


    I disagree. I do not believe in teaching someone a lie only to contradict it
    later.

    --
    Dale King
    Dale King, Aug 4, 2003
    #10
  11. Joe Pribele

    Tom McGlynn Guest

    Roedy Green wrote:

    > On Thu, 31 Jul 2003 13:42:50 -0400, Tom McGlynn
    > <> wrote or quoted :
    >
    >
    >>Binary floating point numbers as used by Java <are>
    >>precise/accurate/exact.

    >
    >
    > That is a philosophical point. What does it mean to say an "number"
    > is accurate? It means "does it reflect the value it stands for?"
    >
    > Ints can get values bang on. doubles often cannot.


    Ints can get values bang on if the underlying value is an integer
    in the appropriate range. However, if I try to store the value 2.2 in
    an integer, the best it can do is the value 2. Similarly, in integer
    arithmetic 5/3 results in 1. This behavior is perfectly reasonable:
    it doesn't reflect inaccuracy or fuzziness in the definition of ints
    or arithmetic operations on them. Nor do the similar issues that
    come up with floating point numbers.
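    [Ed.: concretely (illustrative):]

```java
public class IntTruncation {
    public static void main(String[] args) {
        System.out.println((int) 2.2); // prints 2: conversion truncates
        System.out.println(5 / 3);     // prints 1: integer division truncates
    }
}
```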

    >
    > Further, I don't believe the results of floating point operations are
    > guaranteed to be precisely correct to the bit -- e.g. presuming
    > infinite accuracy, then rounded.
    >


    IEEE arithmetic <is> guaranteed to be precise in exactly the way you
    describe. From the JLS:

    The Java programming language requires that floating-point arithmetic
    behave as if every floating-point operator rounded its floating-point
    result to the result precision. Inexact results must be rounded to the
    representable value nearest to the infinitely precise result; if the
    two nearest representable values are equally near, the one with its
    least significant bit zero is chosen. This is the IEEE 754 standard's
    default rounding mode known as round to nearest. (JLS 4.2.4)


    > Newbies are the ones puzzled by this. I think it easiest to explain
    > that floating point always has some fuzz. When they get older they
    > can learn the full truth.


    My experience is that this approach leads to further confusion, whereas
    a discussion that is couched in terms of the set of representable
    points seems to lead to a natural understanding of elementary and
    advanced issues in floating point arithmetic, e.g., which values are in
    the set of floats or doubles, or what strictfp really does.

    Your glossary discussion of floating point includes the phrase
    "My general rule is, if at all possible, avoid floating point."
    That's a natural reaction if one thinks of floating point numbers as
    mysterious fuzzy things with inexact rules. But users should not
    be looking to avoid floating point numbers. In many regimes they
    are natural and seeking alternatives is counterproductive. Such
    misleading elements of the glossary's floating point
    discussion detract from its valuable comments on many issues.

    Regards,
    Tom McGlynn
    Tom McGlynn, Aug 5, 2003
    #11
  12. Joe Pribele

    Roedy Green Guest

    On Tue, 05 Aug 2003 13:04:18 -0400, Tom McGlynn
    <> wrote or quoted :

    > The Java programming language requires that floating-point arithmetic


    certainly not functions like sin though?

    --
    Canadian Mind Products, Roedy Green.
    Coaching, problem solving, economical contract programming.
    See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.
    Roedy Green, Aug 5, 2003
    #12
  13. Joe Pribele

    Roedy Green Guest

    On Mon, 4 Aug 2003 17:36:56 -0500, "Dale King" <>
    wrote or quoted :

    >I disagree. I do not believe in teaching someone a lie only to contradict it
    >later.


    if you read the essay I don't lie. I say that if a newbie acts as if
    a demon added some fuzz to the result of his every calculation, it
    will keep him out of trouble.


    see http://mindprod.com/jgloss/floatingpoint.html

    --
    Canadian Mind Products, Roedy Green.
    Coaching, problem solving, economical contract programming.
    See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.
    Roedy Green, Aug 5, 2003
    #13
  14. Joe Pribele

    Roedy Green Guest

    On Tue, 05 Aug 2003 20:35:37 +0100, Mark Thornton
    <> wrote or quoted :

    >"The IBM Accurate Portable Mathematical library (IBM APMathLib) consists
    >of routines that compute some of the standard common transcendental
    >functions. The computed results are the exact theoretical values
    >correctly rounded (nearest or even) to the closest number representable
    >by the IEEE 754 double format."


    I wonder what sort of speed penalty you get for that. You'd think,
    though, that anyone designing such a library would tweak it so the
    obvious points came out bang on: sin(Math.PI), cos(Math.PI),
    tan(Math.PI / 4), etc.

    --
    Canadian Mind Products, Roedy Green.
    Coaching, problem solving, economical contract programming.
    See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.
    Roedy Green, Aug 5, 2003
    #14
  15. Joe Pribele

    ghl Guest

    "Roedy Green" <> wrote in message
    news:...
    > On Mon, 4 Aug 2003 17:36:56 -0500, "Dale King" <>
    > wrote or quoted :
    >
    > >I disagree. I do not believe in teaching someone a lie only to contradict

    it
    > >later.

    >
    > if you read the essay I don't lie. I say that if a newbie acts as if
    > a demon added some fuzz to the result of your every calculation it
    > will keep him out of trouble.


    This just reminded me of a comment I once heard about floating point
    numbers:
    Floating point numbers are like a pile of sand; every time you use it you
    lose a little sand and pick up a little dirt.
    --
    Gary
    ghl, Aug 6, 2003
    #15
  16. Roedy Green wrote:
    > On Tue, 05 Aug 2003 20:35:37 +0100, Mark Thornton
    > <> wrote or quoted :
    >
    >
    >>"The IBM Accurate Portable Mathematical library (IBM APMathLib) consists
    >>of routines that compute some of the standard common transcendental
    >>functions. The computed results are the exact theoretical values
    >>correctly rounded (nearest or even) to the closest number representable
    >>by the IEEE 754 double format."

    >
    >
    > I wonder what sort of speed penalty you get for that. You think
    > though that anyone designing such a library would tweak it so the
    > obvious points came out bang on sin ( Math.PI ) cos ( Math.PI ) tan (
    > Math.PI / 4 )
    > etc.


    No, you can't tweak it like that. Math.PI must be the value of pi
    correctly rounded to double. The sin, cos, etc. methods must then be
    applied to that rounded value, not the infinite-precision pi, and finally
    the result is rounded again. So it is quite correct that
    Math.sin(Math.PI) is not zero.
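    [Ed.: a one-line check (illustrative). Since sin(x) near pi is
    approximately pi - x, the result is roughly the rounding error of
    Math.PI itself:]

```java
public class SinOfPi {
    public static void main(String[] args) {
        // tiny but nonzero: about 1.2e-16, the gap between Math.PI and pi
        System.out.println(Math.sin(Math.PI));
    }
}
```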

    Mark Thornton
    Mark Thornton, Aug 6, 2003
    #16
  17. Joe Pribele

    Tom McGlynn Guest

    Mark Thornton wrote:

    > Roedy Green wrote:
    >
    >> On Tue, 05 Aug 2003 20:35:37 +0100, Mark Thornton
    >> <> wrote or quoted :
    >>
    >>
    >>> "The IBM Accurate Portable Mathematical library (IBM APMathLib)
    >>> consists of routines that compute some of the standard common
    >>> transcendental functions. The computed results are the exact
    >>> theoretical values correctly rounded (nearest or even) to the closest
    >>> number representable by the IEEE 754 double format."

    >>
    >>
    >>
    >> I wonder what sort of speed penalty you get for that. You think
    >> though that anyone designing such a library would tweak it so the
    >> obvious points came out bang on sin ( Math.PI ) cos ( Math.PI ) tan (
    >> Math.PI / 4 )
    >> etc.

    >
    >
    > No you can't tweak it like that. Math.PI must be the value of pi
    > correctly rounded to double. The sin, cos, etc methods must then be
    > applied to that rounded value not the infinite precision pi, and finally
    > the result is rounded again. So it is quite correct that
    > Math.sin(Math.PI) is not zero.
    >
    > Mark Thornton
    >


    This is an example of how it helps to have a clean
    understanding of how floating point numbers work. If one is thinking
    of floating point numbers and calculations as 'fuzzy', then one could
    imagine tweaking the fuzz in ways such as Roedy suggests.
    However when one recognizes that floating point numbers have precisely
    defined behavior, one is led inevitably to Mark's conclusions.

    Looking at the source code for the library that Mark mentioned, it
    looks like it works using fairly standard algorithms with a big
    lookup table and with
    a kind of extended precision double. Each underlying value
    is broken into a base value and an offset and all additions
    and multiplications involve function calls. I'd imagine
    the penalty for using this library is relatively severe, something
    in the factor of 3-30 range, but it's hard to tell without
    testing.

    By the by, it would be perfectly legal for a Java implementation
    to use these accurate functions in the java.lang.Math class.
    However, even though these functions are more accurate than the
    standard functions, they cannot be used in java.lang.StrictMath,
    which must give identical results on all platforms. There,
    portability is paramount.

    Regards,
    Tom McGlynn
    Tom McGlynn, Aug 6, 2003
    #17
  18. Tom McGlynn wrote:

    > Mark Thornton wrote:
    >
    >> Roedy Green wrote:
    >>
    >>> On Tue, 05 Aug 2003 20:35:37 +0100, Mark Thornton
    >>> <> wrote or quoted :
    >>>
    >>>
    >>>> "The IBM Accurate Portable Mathematical library (IBM APMathLib)
    >>>> consists of routines that compute some of the standard common
    >>>> transcendental functions. The computed results are the exact
    >>>> theoretical values correctly rounded (nearest or even) to the
    >>>> closest number representable by the IEEE 754 double format."
    >>>
    >>>
    >>>
    >>>
    >>> I wonder what sort of speed penalty you get for that. You think
    >>> though that anyone designing such a library would tweak it so the
    >>> obvious points came out bang on sin ( Math.PI ) cos ( Math.PI ) tan (
    >>> Math.PI / 4 )
    >>> etc.

    >>
    >>
    >>
    >> No you can't tweak it like that. Math.PI must be the value of pi
    >> correctly rounded to double. The sin, cos, etc methods must then be
    >> applied to that rounded value not the infinite precision pi, and
    >> finally the result is rounded again. So it is quite correct that
    >> Math.sin(Math.PI) is not zero.
    >>
    >> Mark Thornton
    >>

    >
    > This is an example of how it helps to have a clean
    > understanding of how floating point numbers work. If one is thinking
    > of floating point numbers and calculations as 'fuzzy', then one could
    > imagine tweaking the fuzz in ways such as Roedy suggests.
    > However when one recognizes that floating point numbers have precisely
    > defined behavior, one is led inevitably to Mark's conclusions.
    >
    > Looking at the source code for the library that Mark mentioned, it
    > looks like it works using fairly standard algorithms with a big
    > lookup table and with
    > a kind of extended precision double. Each underlying value
    > is broken into a base value and an offset and all additions
    > and multiplications involve function calls. I'd imagine
    > the penalty for using this library is relatively severe, something
    > in the factor of 3-30 range, but it's hard to tell without
    > testing.


    If you have a look at the code used for StrictMath, it isn't that simple
    either.

    >
    > By the by, it would be perfectly legal for a Java implementation
    > to use these accurate functions in the java.lang.Math class.
    > However even though these functions are more accurate than the
    > standard functions, they cannot be used in java.lang.StrictMath


    It has been suggested that the specification of StrictMath be changed,
    once a practical library giving perfectly rounded results has been
    demonstrated. It was not known to be possible at the time the Math
    specification was originally written.

    Mark Thornton
    Mark Thornton, Aug 6, 2003
    #18
  19. Joe Pribele

    Dale King Guest

    In article <>,
    says...
    > On Mon, 4 Aug 2003 17:36:56 -0500, "Dale King" <>
    > wrote or quoted :
    >
    > >I disagree. I do not believe in teaching someone a lie only to contradict it
    > >later.

    >
    > if you read the essay I don't lie. I say that if a newbie acts as if
    > a demon added some fuzz to the result of your every calculation it
    > will keep him out of trouble.


    It will not keep him out of trouble; it just keeps him ignorant. If you
    want to see how far that ignorance can be carried, go take a look at this
    very long thread from a while back:

    http://groups.google.com/groups?th=22a6cb86dd19aa5

    The fact is that you are treating floating point as if it is somehow
    different from integers in this respect when in reality it isn't.

    Going back to my numerical analysis class instead of fuzz, what you are
    talking about is error. Each operation in a finite representation
    produces the exact mathematical value plus an error term.

    Take some mathematical operation which we will say is f(a,b). That is its
    exact mathematical value. We will call the computer version of that
    operation f'(a,b). Then

    f'(a,b) = f(a,b) + e

    With repeated operations the effects of these error terms can build up.

    But don't delude yourself into thinking that the error term is only there
    for floating point.

    Taking Tom's examples, consider the operation of converting a number to a
    stored value. The exact operation would lose nothing: f(a) = a. For the
    case of integers and a = 2.2 we have:

    f.integer(2.2) = f(2.2) - 0.2

    Storing 2.2 into an integer introduces an error term of -0.2. Converting
    2.2 into a double, we get a double whose exact value is

    2.20000000000000017763568394002504646778106689453125

    Therefore for doubles we get:

    f.double(2.2) = f(2.2) +
    0.00000000000000017763568394002504646778106689453125

    In this case the error term is much less than the integer operation.

    Yet we perceive that integer operations are somehow more exact than
    floating point. So how do we classify the differences between the two?

    In numerical analysis, you are not so much interested in the exact value
    of the error term but rather on what the bounds of the magnitude of it
    is. In other words what is the maximum amount of error we can have. We
    want to say |e| < some value.

    For integers that is very easy. |e| <= 0.5. No operation can ever be off
    more than 0.5.

    For floating point it is |e| < 0.5 ulp. Ulp stands for unit in the last
    place. A ulp is not a constant; it depends on the magnitude of the value.
    The size of a ulp is smallest near zero and gets larger as the value gets
    larger. Plotting it, I believe, would give you a somewhat stair-stepped
    logarithmic shape, or would it be exponential? I'll have to think about it
    some more.
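    [Ed.: the ulp at 1.0 can be demonstrated directly (illustrative).
    Doubles just above 1.0 are spaced 2^-52 apart; adding half that is an
    exact halfway case, which round-to-nearest-even sends back to 1.0:]

```java
public class UlpAtOne {
    public static void main(String[] args) {
        double ulp = Math.pow(2, -52); // spacing of doubles just above 1.0
        // half an ulp: halfway case rounds to even, i.e. back to 1.0
        System.out.println(1.0 + ulp / 2 == 1.0); // true
        // a full ulp reaches the next representable double
        System.out.println(1.0 + ulp == 1.0);     // false
    }
}
```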

    The other difference is that certain integer operations produce no error
    terms. Addition, subtraction, and multiplication add no error (we are
    ignoring overflow for this discussion). Number conversion and division
    can introduce error. For floating point, any operation can introduce
    additional error of up to 0.5 ulp.

    One operation in particular that contributes to this notion of the
    difference is conversion to a text string (the subject of that
    very long thread I mentioned above). This is an exact operation for
    integers. The normal way of converting a double to a text string, using
    String.valueOf( double ), introduces an error in the result. What that
    method does is produce the shortest string whose mathematical value
    is within +/-0.5 ulp of the exact value. Another way to think of it is
    that Double.parseDouble introduces some error e and String.valueOf( double )
    produces an error of -e, so they cancel each other out.

    There is a way to convert to text that introduces no additional error. If
    you convert the double to a BigDecimal object and invoke the toString
    method on that, it is an exact operation. That is, in fact, how I
    obtained the value 2.20000000000000017763568394002504646778106689453125
    above.
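    [Ed.: the two conversions side by side (illustrative); the exact
    expansion matches the value quoted above:]

```java
import java.math.BigDecimal;

public class TwoViewsOfADouble {
    public static void main(String[] args) {
        double d = 2.2;
        // shortest string that round-trips to the same double
        System.out.println(String.valueOf(d)); // prints 2.2
        // exact decimal expansion of the stored binary value
        System.out.println(new BigDecimal(d));
        // prints 2.20000000000000017763568394002504646778106689453125
    }
}
```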

    I only wish I could have explained it this simply to Paul Lutus in that
    thread, but he probably would not have listened anyway.

    Another thing that confuses many people is a difference between the
    default behavior of C and Java. While Java by default gives you a string
    that is within 0.5 ulp, C's printf will by default usually round to
    something like 6 decimal digits. printf introduces more error, but since
    the result is rounded to fewer digits it can convince people that C is
    more accurate than Java.

    Consider this code:

    double a = 0.1;
    double b = 1.1;
    System.out.println( a + b );

    Java will print "1.2000000000000002". If you do the equivalent code in C
    using printf( "%f\n", a + b ) in place of the System.out.println you will
    probably get "1.2". Someone who doesn't understand the things I have
    explained here and just thinks that floating point is "fuzzy" will
    conclude that Java is generating more fuzz and is less exact than C. In
    reality if C is also using a 64-bit double then it computes the exact
    same value.

    By the way Roedy have you read this:

    http://docs.sun.com/source/806-3568/ncg_goldberg.html

    Judging by your glossary entry I'm not sure you have, particularly since
    you did not provide a link to it. Frankly, I would much rather that
    somebody read and understand that article than to just tell them that
    floating point has some mystical fuzz.

    --
    Dale King

    P.S. This is one of those threads where I really miss Patricia Shanahan,
    whom I call the queen of floating point.
    Dale King, Aug 7, 2003
    #19
  20. Joe Pribele

    Tom McGlynn Guest

    Dale King wrote:

    > In article <>,

    ....

    >
    > It will not keep him out of trouble it just keeps him ignorant. If you
    > want to see how far that ignorance can be carried, go take a look at this
    > very long thread from a while back:
    >
    > http://groups.google.com/groups?th=22a6cb86dd19aa5
    >
    > The fact is that you are treating floating point as if it is somehow
    > different from integers in this respect when in reality it isn't.
    >

    Kind of interesting to go back to an old thread like that and
    reread it. I only wish I could write with the clarity and
    precision that Patricia Shanahan shows in the early parts
    of that thread. I had not realized there was such a simple
    method of printing out the exact value of a double (which you also
    use below). Maybe people should submit their top 10 threads.
    Could be a new Google service...

    Regards,
    Tom McGlynn
    Tom McGlynn, Aug 7, 2003
    #20
