Re: Python fails on math

Discussion in 'Python' started by Grant Edwards, Feb 22, 2011.

  1. On 2011-02-22, christian schulze <> wrote:
    > Hey guys,
    >
    > I just found out, how much Python fails on simple math.


    Python doesn't do math.

    It does floating point operations.

    They're different. Seriously.

    On all of the platforms I know of, it's IEEE 754 (base-2) floating
    point.

    > I checked a simple equation for a friend.
    >
    >
    Code:
    > >>> from math import e as e
    > >>> from math import sqrt as sqrt
    > >>> 2*e*sqrt(3) - 2*e == 2*e*(sqrt(3) - 1)
    > False
    > 
    >
    > So WTF?


    Python doesn't do equations. Python does floating point operations.

    [And it does them in _base_2_ -- which is important, because it makes
    things even more difficult.]


    > The equation is definitively equivalent. (See http://mathbin.net/59158)


    But, the two floating point expressions you provided are not
    equivalent.

    Remember, you're not doing math with Python.

    You're doing binary floating point operations.

    > #1:
    > >>> 2.0 * e * sqrt(3.0) - 2.0 * e

    > 3.9798408154464964
    >
    > #2:
    > >>> 2.0 * e * (sqrt(3.0) - 1.0)

    > 3.979840815446496
    >
    > I was wondering what exactly is failing here. The math module?
    > Python, or the IEEE specifications?


    I'm afraid it's the user that's failing. Unfortunately, in many
    situations using floating point is neither intuitive nor easy to get
    right.

    http://docs.python.org/tutorial/floatingpoint.html
    http://en.wikipedia.org/wiki/Floating_point
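
    [Editor's illustration: the tolerance-based comparison the linked
    tutorial recommends, as a minimal sketch. Note `math.isclose` was
    added later, in Python 3.5; on older versions, write the
    comparison by hand with `abs(a - b)`.]

    ```python
    import math

    e = math.e
    a = 2 * e * math.sqrt(3) - 2 * e        # evaluation order #1
    b = 2 * e * (math.sqrt(3) - 1)          # algebraically identical, order #2

    print(a == b)                            # False: the two orders round differently
    print(math.isclose(a, b, rel_tol=1e-9)) # True: equal within a relative tolerance
    ```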

    --
    Grant Edwards <grant.b.edwards at gmail.com>
    Yow! I like your SNOOPY POSTER!!
     
    Grant Edwards, Feb 22, 2011
    #1

  2. Roy Smith (Guest)

    In article <ik0rmr$ck4$>,
    Grant Edwards <> wrote:

    > Python doesn't do equations. Python does floating point operations.


    More generally, all general-purpose programming languages have the same
    problem. You'll see the same issues in Fortran, C, Java, Ruby, Pascal,
    etc, etc. You'll see the same problem if you punch the numbers into a
    hand calculator. It's just the nature of how digital computers do
    floating point calculations.

    If you really want something that understands that:

    >>> 2*e*sqrt(3) - 2*e == 2*e*(sqrt(3) - 1)


    you need to be looking at specialized math packages like Mathematica and
    things of that ilk.
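
    [Editor's illustration: the OP's identity does check out symbolically.
    A minimal sketch using the third-party `sympy` package -- an assumed
    choice here; the post itself names only Mathematica.]

    ```python
    from sympy import E, sqrt, simplify

    lhs = 2 * E * sqrt(3) - 2 * E
    rhs = 2 * E * (sqrt(3) - 1)

    # Exact symbolic arithmetic: the difference simplifies to zero,
    # so the two expressions really are the same quantity.
    print(simplify(lhs - rhs))   # 0
    print(lhs.equals(rhs))       # True
    ```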
     
    Roy Smith, Feb 22, 2011
    #2

  3. On 2011-02-22, Roy Smith <> wrote:
    > In article <ik0rmr$ck4$>,
    > Grant Edwards <> wrote:
    >
    >> Python doesn't do equations. Python does floating point operations.

    >
    > More generally, all general-purpose programming languages have the same
    > problem. You'll see the same issues in Fortran, C, Java, Ruby, Pascal,
    > etc, etc. You'll see the same problem if you punch the numbers into a
    > hand calculator.


    Some hand calculators use base-10 (BCD) floating point, so the
    problems aren't exactly the same, but they're very similar.
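
    [Editor's illustration: the base-10-vs-base-2 difference is easy to
    see with the standard library's `decimal` module, which does decimal
    arithmetic much like a BCD calculator. A minimal sketch:]

    ```python
    from decimal import Decimal

    # Base 2 cannot represent 0.1, 0.2, or 0.3 exactly:
    print(0.1 + 0.2 == 0.3)                                    # False
    # Base 10 can, so the "calculator" answer comes out exact:
    print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))   # True
    # But base 10 still rounds what it can't represent, e.g. 1/3:
    print(Decimal(1) / Decimal(3))
    ```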

    --
    Grant Edwards <grant.b.edwards at gmail.com>
    Yow! YOU PICKED KARL MALDEN'S NOSE!!
     
    Grant Edwards, Feb 22, 2011
    #3
  4. John Nagle (Guest)

    On 2/22/2011 9:59 AM, Roy Smith wrote:
    > In article<ik0rmr$ck4$>,
    > Grant Edwards<> wrote:
    >
    >> Python doesn't do equations. Python does floating point operations.

    >
    > More generally, all general-purpose programming languages have the same
    > problem. You'll see the same issues in Fortran, C, Java, Ruby, Pascal,
    > etc, etc.


    Not quite. CPython has the problem that it "boxes" its floating
    point numbers. After each operation, the value is stored back into
    a 64-bit space.

    The IEEE 754 compliant FPU on most machines today, though, has
    an 80-bit internal representation. If you do a sequence of
    operations that retain all the intermediate results in the FPU
    registers, you get 16 more bits of precision than if you store
    after each operation. Rounding occurs when the 80-bit value is
    forced back to 64 bits.

    So it's quite possible that this would look like an equality
    in C, or ShedSkin, or maybe PyPy (which has some unboxing
    optimizations) but not in CPython.

    (That's not the problem here, of course. The problem is that
    the user doesn't understand floating point. The issues I'm talking
    about are subtle, and affect few people. Those of us who've had
    to worry about this and read Kahan's papers are typically developers
    of simulation systems, where cumulative error can be a problem.
    In the 1990s, I had to put a lot of work into this for collision
    detection algorithms for a physics engine. As two objects settle
    into contact, issues with tiny differences between large numbers
    start to dominate. It takes careful handling to prevent that from
    causing high frequency simulated vibration in the simulation,
    chewing up CPU time at best and causing simulations to fly apart
    at worst. The problems are understood now, but they weren't in
    the mid-1990s. The licensed Jurassic Park game "Trespasser" was a flop
    for that reason.)
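
    [Editor's illustration: the "tiny differences between large numbers"
    failure mode reproduces in two lines -- a minimal sketch, not the
    physics-engine case itself:]

    ```python
    big = 1.0e16   # near the limit of a double's 53-bit significand

    # Adding a small number to a large one loses the small one's digits.
    # The true sum 1e16 + 1 is exactly halfway between representable
    # doubles, and round-to-even picks 1e16 + 2:
    print(big + 1.0 - big)   # 2.0, not 1.0
    print(big + 0.5 - big)   # 0.0: the 0.5 vanished entirely
    ```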

    John Nagle
     
    John Nagle, Feb 23, 2011
    #4
  5. On Wed, 23 Feb 2011 13:26:05 -0800, John Nagle wrote:

    > The IEEE 754 compliant FPU on most machines today, though, has an 80-bit
    > internal representation. If you do a sequence of operations that retain
    > all the intermediate results in the FPU registers, you get 16 more bits
    > of precision than if you store after each operation.


    That's a big if though. Which languages support such a thing? C doubles
    are 64 bit, same as Python.


    --
    Steven
     
    Steven D'Aprano, Feb 24, 2011
    #5
  6. Ethan Furman (Guest)

    Steven D'Aprano wrote:
    > On Wed, 23 Feb 2011 13:26:05 -0800, John Nagle wrote:
    >
    >> The IEEE 754 compliant FPU on most machines today, though, has an 80-bit
    >> internal representation. If you do a sequence of operations that retain
    >> all the intermediate results in the FPU registers, you get 16 more bits
    >> of precision than if you store after each operation.

    >
    > That's a big if though. Which languages support such a thing? C doubles
    > are 64 bit, same as Python.


    Assembly! :)

    ~Ethan~
     
    Ethan Furman, Feb 24, 2011
    #6
  7. On Thu, 24 Feb 2011 04:56:46 -0800
    Ethan Furman <> wrote:
    > >> The IEEE 754 compliant FPU on most machines today, though, has an 80-bit
    > >> internal representation. If you do a sequence of operations that retain
    > >> all the intermediate results in the FPU registers, you get 16 more bits
    > >> of precision than if you store after each operation.

    > >
    > > That's a big if though. Which languages support such a thing? C doubles
    > > are 64 bit, same as Python.

    >
    > Assembly! :)


    Really? Why would you need that level of precision just to gather all
    the students into the auditorium?

    --
    D'Arcy J.M. Cain <> | Democracy is three wolves
    http://www.druid.net/darcy/ | and a sheep voting on
    +1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.
     
    D'Arcy J.M. Cain, Feb 24, 2011
    #7
  8. Mel (Guest)

    D'Arcy J.M. Cain wrote:
    > On Thu, 24 Feb 2011 04:56:46 -0800
    > Ethan Furman <> wrote:


    >> > That's a big if though. Which languages support such a thing? C doubles
    >> > are 64 bit, same as Python.

    >>
    >> Assembly! :)

    >
    > Really? Why would you need that level of precision just to gather all
    > the students into the auditorium?


    You would think so, but darned if some of them don't wind up in a
    *different* *auditorium*!

    Mel.
     
    Mel, Feb 24, 2011
    #8
  9. Robert Kern (Guest)

    On 2/24/11 5:55 AM, Steven D'Aprano wrote:
    > On Wed, 23 Feb 2011 13:26:05 -0800, John Nagle wrote:
    >
    >> The IEEE 754 compliant FPU on most machines today, though, has an 80-bit
    >> internal representation. If you do a sequence of operations that retain
    >> all the intermediate results in the FPU registers, you get 16 more bits
    >> of precision than if you store after each operation.

    >
    > That's a big if though. Which languages support such a thing? C doubles
    > are 64 bit, same as Python.


    C double *variables* are, but as John suggests, C compilers are allowed (to my
    knowledge) to keep intermediate results of an expression in the larger-precision
    FPU registers. The final result does get shoved back into a 64-bit double when
    it is at last assigned back to a variable or passed to a function that takes a
    double.

    --
    Robert Kern

    "I have come to believe that the whole world is an enigma, a harmless enigma
    that is made terrible by our own mad attempt to interpret it as though it had
    an underlying truth."
    -- Umberto Eco
     
    Robert Kern, Feb 24, 2011
    #9
  10. On Thu, 24 Feb 2011 10:40:45 -0600, Robert Kern wrote:

    > On 2/24/11 5:55 AM, Steven D'Aprano wrote:
    >> On Wed, 23 Feb 2011 13:26:05 -0800, John Nagle wrote:
    >>
    >>> The IEEE 754 compliant FPU on most machines today, though, has an
    >>> 80-bit internal representation. If you do a sequence of operations
    >>> that retain all the intermediate results in the FPU registers, you get
    >>> 16 more bits of precision than if you store after each operation.

    >>
    >> That's a big if though. Which languages support such a thing? C doubles
    >> are 64 bit, same as Python.

    >
    > C double *variables* are, but as John suggests, C compilers are allowed
    > (to my knowledge) to keep intermediate results of an expression in the
    > larger-precision FPU registers. The final result does get shoved back
    > into a 64-bit double when it is at last assigned back to a variable or
    > passed to a function that takes a double.


    So...

    (1) you can't rely on it, because it's only "allowed" and not mandatory;

    (2) you may or may not have any control over whether or not it happens;

    (3) it only works for calculations that are simple enough to fit in a
    single expression; and

    (4) we could say the same thing about Python -- there's no prohibition on
    Python using extended precision when performing intermediate results, so
    it too could be said to be "allowed".


    It seems rather unfair to me to single Python out as somehow lacking
    (compared to which other languages?) and to gloss over the difficulties
    in "If you do a sequence of operations that retain all the intermediate
    results..." Yes, *if* you do so, you get more precision, but *how* do you
    do so? Such a thing will be language or even implementation dependent,
    and the implication that it just automatically happens without any effort
    seems dubious to me.

    But I could be wrong, of course. It may be that Python, alone of all
    modern high-level languages, fails to take advantage of 80-bit registers
    in FPUs *wink*



    --
    Steven
     
    Steven D'Aprano, Feb 25, 2011
    #10
  11. On Fri, 2011-02-25 at 00:33 +0000, Steven D'Aprano wrote:
    > On Thu, 24 Feb 2011 10:40:45 -0600, Robert Kern wrote:
    >
    > > On 2/24/11 5:55 AM, Steven D'Aprano wrote:
    > >> On Wed, 23 Feb 2011 13:26:05 -0800, John Nagle wrote:
    > >>
    > >>> The IEEE 754 compliant FPU on most machines today, though, has an
    > >>> 80-bit internal representation. If you do a sequence of operations
    > >>> that retain all the intermediate results in the FPU registers, you get
    > >>> 16 more bits of precision than if you store after each operation.
    > >>
    > >> That's a big if though. Which languages support such a thing? C doubles
    > >> are 64 bit, same as Python.

    > >
    > > C double *variables* are, but as John suggests, C compilers are allowed
    > > (to my knowledge) to keep intermediate results of an expression in the
    > > larger-precision FPU registers. The final result does get shoved back
    > > into a 64-bit double when it is at last assigned back to a variable or
    > > passed to a function that takes a double.

    >
    > So...
    >
    > (1) you can't rely on it, because it's only "allowed" and not mandatory;
    >
    > (2) you may or may not have any control over whether or not it happens;
    >
    > (3) it only works for calculations that are simple enough to fit in a
    > single expression; and
    >
    > (4) we could say the same thing about Python -- there's no prohibition on
    > Python using extended precision when performing intermediate results, so
    > it too could be said to be "allowed".
    >
    >
    > It seems rather unfair to me to single Python out as somehow lacking
    > (compared to which other languages?) and to gloss over the difficulties
    > in "If you do a sequence of operations that retain all the intermediate
    > results..." Yes, *if* you do so, you get more precision, but *how* do you
    > do so? Such a thing will be language or even implementation dependent,
    > and the implication that it just automatically happens without any effort
    > seems dubious to me.
    >
    > But I could be wrong, of course. It may be that Python, alone of all
    > modern high-level languages, fails to take advantage of 80-bit registers
    > in FPUs *wink*
    >
    >
    >
    > --
    > Steven


    Maybe I'm wrong, but wouldn't compiling Python with a compiler that
    supports extended precision for intermediates allow Python to use
    extended precision for its immediates? Or does Python use its own
    floating-point math?
     
    Westley Martínez, Feb 25, 2011
    #11
  12. On 2011-02-25, Steven D'Aprano <> wrote:

    >> C double *variables* are, but as John suggests, C compilers are allowed
    >> (to my knowledge) to keep intermediate results of an expression in the
    >> larger-precision FPU registers. The final result does get shoved back
    >> into a 64-bit double when it is at last assigned back to a variable or
    >> passed to a function that takes a double.

    >
    > So...
    >
    > (1) you can't rely on it, because it's only "allowed" and not mandatory;
    >
    > (2) you may or may not have any control over whether or not it happens;
    >
    > (3) it only works for calculations that are simple enough to fit in a
    > single expression; and


    (3) is sort of an interesting one.

    If a C compiler could eliminate stores to temporary variables (let's
    say inside a MAC loop) it might get a more accurate result by leaving
    temporary results in an FP register. But, IIRC the C standard says
    the compiler can only eliminate stores to variables if it doesn't
    change the output of the program. So I think the C standard actually
    forces the compiler to convert results to 64-bits at the points where
    a store to a temporary variable happens. It's still free to leave the
    result in an FP register, but it has to toss out the extra bits so
    that it gets the same result as it would have if the store/load took
    place.

    > (4) we could say the same thing about Python -- there's no
    > prohibition on Python using extended precision when performing
    > intermediate results, so it too could be said to be "allowed".


    Indeed. Though C-python _will_ (AFAIK) store results to variables
    everywhere the source code says to, and C is allowed to skip those
    stores, C is still required to produce the same results as if the
    store had been done.

    IOW, I don't see that there's any difference between Python and C
    either.
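
    [Editor's illustration: because CPython rounds to 64 bits after every
    operation, even the grouping of additions is observable. A minimal
    sketch:]

    ```python
    # Float addition is not associative once each step is rounded:
    a = (0.1 + 0.2) + 0.3
    b = 0.1 + (0.2 + 0.3)

    print(a == b)    # False
    print(a)         # 0.6000000000000001
    print(b)         # 0.6
    ```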

    --
    Grant
     
    Grant Edwards, Feb 25, 2011
    #12
  13. On 2011-02-25, Westley Martínez <> wrote:

    > Maybe I'm wrong, but wouldn't compiling Python with a compiler that
    > supports extended precision for intermediates allow Python to use
    > extended precision for its immediates?


    I'm not sure what you mean by "immediates", but I don't think so. For
    the C compiler to do an optimization like we're talking about, you
    have to give it the entire expression in C for it to compile. From
    the POV of the C compiler, C-Python never does more than one FP
    operation at a time when evaluating Python bytecode, and there aren't
    any intermediate values to store.

    > Or does Python use its own floating-point math?


    No, but the C compiler has no way of knowing what the Python
    expression is.

    --
    Grant
     
    Grant Edwards, Feb 25, 2011
    #13
  14. On Fri, 2011-02-25 at 00:57 +0000, Grant Edwards wrote:
    > On 2011-02-25, Westley Martínez <> wrote:
    >
    > > Maybe I'm wrong, but wouldn't compiling Python with a compiler that
    > > supports extended precision for intermediates allow Python to use
    > > extended precision for its immediates?

    >
    > I'm not sure what you mean by "immediates", but I don't think so. For
    > the C compiler to do an optimization like we're talking about, you
    > have to give it the entire expression in C for it to compile. From
    > the POV of the C compiler, C-Python never does more than one FP
    > operation at a time when evaluating Python bytecode, and there aren't
    > any intermediate values to store.
    >
    > > Or does Python use its own floating-point math?

    >
    > No, but the C compiler has no way of knowing what the Python
    > expression is.
    >
    > --
    > Grant
    >
    >
    >


    I meant to say intermediate. I think I understand what you're saying.
    Regardless, the point is the same; floating-point numbers are different
    from real numbers and their limitations have to be taken into account
    when operating on them.
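
    [Editor's illustration: at the bit level, the two results from the
    original post differ by exactly one unit in the last place, which the
    standard library's `struct` module can confirm. A minimal sketch;
    `bytes.hex()` needs Python 3.5+:]

    ```python
    import struct

    def bits(x):
        """IEEE 754 bit pattern of a Python float, as 16 hex digits."""
        return struct.pack(">d", x).hex()

    a = 3.9798408154464964   # 2.0*e*sqrt(3.0) - 2.0*e, from the thread
    b = 3.979840815446496    # 2.0*e*(sqrt(3.0) - 1.0)

    print(bits(a))
    print(bits(b))
    print(a - b)             # one ulp at this magnitude, about 4.4e-16
    ```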
     
    Westley Martínez, Feb 25, 2011
    #14
  15. On 2/25/2011, Grant Edwards wrote:
    > On 2011-02-25, Steven D'Aprano<> wrote:
    >
    >>> C double *variables* are, but as John suggests, C compilers are allowed
    >>> (to my knowledge) to keep intermediate results of an expression in the
    >>> larger-precision FPU registers. The final result does get shoved back
    >>> into a 64-bit double when it is at last assigned back to a variable or
    >>> passed to a function that takes a double.

    >>
    >> So...
    >>
    >> (1) you can't rely on it, because it's only "allowed" and not mandatory;
    >>
    >> (2) you may or may not have any control over whether or not it happens;
    >>
    >> (3) it only works for calculations that are simple enough to fit in a
    >> single expression; and

    >
    > <snip>
    >


    In 1975, I was writing the arithmetic and expression handling for an
    interpreter. My instruction primitives could add two bytes; anything
    more complex was done in my code. So I defined a floating point format
    (decimal, of course) and had extended versions of it available for
    intermediate calculations. I used those extended versions, in logs for
    example, whenever the user of our language could not see the
    intermediate results.

    When faced with the choice of whether to do the same inside explicit
    expressions, like (a*b) - (c*d), I deliberately chose *not* to do such
    optimizations, in spite of the fact that it would improve both
    performance and (sometimes) accuracy.

    I wrote down my reasons at the time, and they had to do with "least
    surprise." If a computation for an expression gave a different result
    than the same one decomposed into separate variables, the developer
    would have a hard time knowing when results might change, and when they
    might not. Incidentally, the decimal format also assured "least
    surprise," since the times when quantization error entered in were
    exactly the same times as if one were doing the calculation by hand.

    I got feedback from a customer who was getting errors in a complex
    calculation (involving trig), and wanted help in understanding why.
    While his case might have been helped by intermediate values having
    higher accuracy, the real solution was to reformulate the calculation to
    avoid subtracting two large numbers that differed by very little. By
    applying a little geometry before writing the algorithm, I was able to
    change his accuracy from maybe a millionth of an inch to something
    totally unmeasurable.

    I still think the choice was appropriate for a business language, if not
    for scientific use.
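
    [Editor's illustration: the reformulation strategy described above --
    restructuring the algebra so nearly equal quantities are never
    subtracted -- has textbook examples. One such example (assumed here
    for illustration; the customer's trig problem isn't shown) is the
    quadratic formula:]

    ```python
    import math

    def stable_quadratic_roots(a, b, c):
        """Roots of a*x**2 + b*x + c = 0, arranged to avoid cancellation.

        Naively computing -b + sqrt(b*b - 4*a*c) cancels catastrophically
        when b*b >> 4*a*c.  Computing the larger-magnitude root first and
        recovering the other from the product of the roots (c/a) avoids
        the dangerous subtraction.
        """
        d = math.sqrt(b * b - 4 * a * c)
        q = -0.5 * (b + math.copysign(d, b))
        return q / a, c / q

    # With b*b >> 4*a*c the naive "-b + sqrt(...)" form would lose the
    # small root almost entirely; the reformulation keeps full precision:
    print(stable_quadratic_roots(1.0, -1.0e8, 1.0))   # roots near 1e8 and 1e-8
    ```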

    DaveA
     
    Heather Brown, Feb 25, 2011
    #15
  16. Ben (Guest)

    On Feb 25, 12:33 am, Steven D'Aprano <steve
    > wrote:
    > On Thu, 24 Feb 2011 10:40:45 -0600, Robert Kern wrote:
    > > On 2/24/11 5:55 AM, Steven D'Aprano wrote:
    > >> On Wed, 23 Feb 2011 13:26:05 -0800, John Nagle wrote:

    >
    > >>> The IEEE 754 compliant FPU on most machines today, though, has an
    > >>> 80-bit internal representation.  If you do a sequence of operations
    > >>> that retain all the intermediate results in the FPU registers, you get
    > >>> 16 more bits of precision than if you store after each operation.

    >
    > >> That's a big if though. Which languages support such a thing? C doubles
    > >> are 64 bit, same as Python.

    >
    > > C double *variables* are, but as John suggests, C compilers are allowed
    > > (to my knowledge) to keep intermediate results of an expression in the
    > > larger-precision FPU registers. The final result does get shoved back
    > > into a 64-bit double when it is at last assigned back to a variable or
    > > passed to a function that takes a double.

    >
    > So...
    >
    > (1) you can't rely on it, because it's only "allowed" and not mandatory;
    >
    > (2) you may or may not have any control over whether or not it happens;
    >
    > (3) it only works for calculations that are simple enough to fit in a
    > single expression; and
    >
    > (4) we could say the same thing about Python -- there's no prohibition on
    > Python using extended precision when performing intermediate results, so
    > it too could be said to be "allowed".
    >
    > It seems rather unfair to me to single Python out as somehow lacking
    > (compared to which other languages?) and to gloss over the difficulties
    > in "If you do a sequence of operations that retain all the intermediate
    > results..." Yes, *if* you do so, you get more precision, but *how* do you
    > do so? Such a thing will be language or even implementation dependent,
    > and the implication that it just automatically happens without any effort
    > seems dubious to me.
    >
    > But I could be wrong, of course. It may be that Python, alone of all
    > modern high-level languages, fails to take advantage of 80-bit registers
    > in FPUs *wink*
    >
    > --
    > Steven


    And note that x64 machines use SSE for all their floating-point maths,
    which is 64-bit max precision anyway.

    Ben
     
    Ben, Mar 9, 2011
    #16
  17. On Feb 25, 12:52 am, Grant Edwards <> wrote:
    > So I think the C standard actually
    > forces the compiler to convert results to 64-bits at the points where
    > a store to a temporary variable happens.


    I'm not sure that this is true. IIRC, C99 + Annex F forces this, but
    C99 by itself doesn't.

    > Indeed.  Though C-python _will_ (AFAIK) store results to variables
    > everywhere the source code says to


    Agreed.

    That doesn't rescue Python from the pernicious double-rounding
    problem, though: it still bugs me that you get different results for
    e.g.,

    >>> 1e16 + 2.99999

    1.0000000000000002e+16

    depending on the platform. OS X, Windows, 64-bit Linux give the
    above; 32-bit Linux generally gives 1.0000000000000004e+16 instead,
    thanks to using the x87 FPU with its default 64-bit precision.
    (Windows uses the x87 too, but changes the precision to 53-bit
    precision.)

    In theory this is prohibited too, under C99 + Annex F.
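
    [Editor's illustration: the two candidate answers can be checked
    against the exact sum with the standard library's `fractions` module,
    which converts floats exactly. A minimal sketch:]

    ```python
    from fractions import Fraction

    # Exact value of the sum, using the exact rational value of the
    # double nearest 2.99999:
    exact = Fraction(10**16) + Fraction(2.99999)

    lo = Fraction(1.0000000000000002e16)   # 53-bit (SSE) result
    hi = Fraction(1.0000000000000004e16)   # x87 double-rounded result

    print(lo < exact < hi)          # True: the exact sum falls between them
    print(exact - lo < hi - exact)  # True: lo is the correctly rounded answer
    ```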


    --
    Mark
     
    Mark Dickinson, Mar 9, 2011
    #17
