When (32-bit) double precision isn't precise enough

Discussion in 'C Programming' started by DAVID SCHULMAN, Sep 10, 2003.

  1. I've been trying to perform a calculation that has been running into
    an underflow (insufficient precision) problem in Microsoft Excel, which
    calculates using at most 15 significant digits. For this purpose, that
    isn't enough.

    I was reading a book about some of the financial scandals of the
    1990s called "Inventing Money: The Story of Long-Term Capital Management
    and the Legends Behind it" by Nicholas Dunbar. On page 95, he mentions
    that the 1987 stock-market crash was designated in economists' computer
    models as a "20-sigma event". That is, their models (which obviously were
    badly flawed!) put such an event in the exponential "tails" of a normal
    Gaussian distribution, outside the range +/- 20 standard deviations from
    the mean. In other words - vastly unlikely.

    So I wanted to know: just how unlikely is that? Where did they get
    the data to support that idea, anyway?

    I put together an Excel spreadsheet, attempting to calculate this
    probability. But after about 8 sigma I ran into the limits of Excel's
    15-digit precision and could go no further. Since the odds of an 8-sigma event
    are about 819 trillion to 1 against, a 20-sigma event is obviously going to
    be so unlikely that it's not even worth talking about; would probably
    never happen in the lifetime of the universe. Further evidence that those
    computer models had some serious problems.

    To perform that calculation, I used Excel's built-in error function
    ERF(), and here's the output:

     s    Confidence Interval    Probability                 Odds against

    1.0   68.268949131685700%    31.7310508683143000000%                       3
    1.5   86.638554269553900%    13.3614457304461000000%                       7
    2.0   95.449972950730900%     4.55002704926911000000%                     22
    2.5   98.758066814134400%     1.24193318586556000000%                     81
    3.0   99.730020387427800%     0.26997961257220200000%                    370
    3.5   99.953474183672100%     0.04652581632794690000%                  2,149
    4.0   99.993665751560700%     0.00633424843932140000%                 15,787
    4.5   99.999320465373800%     0.00067953462623560100%                147,160
    5.0   99.999942669685300%     0.00005733031473997840%              1,744,278
    5.5   99.999996202087500%     0.00000379791249560668%             26,330,254
    6.0   99.999999802682500%     0.00000019731752898267%            506,797,346
    6.5   99.999999991968000%     0.00000000803199728949%         12,450,203,405
    7.0   99.999999999744000%     0.00000000025596191833%        390,683,116,666
    7.5   99.999999999993600%     0.00000000000638378239%     15,664,694,356,071
    8.0   99.999999999999900%     0.00000000000000000000%    818,836,295,885,545
    8.5   100.00000000000000%     0.00000000000000000000%    #DIV/0!
    ...   ...                     ...                         ...
     s    =ERF(An/SQRT(2))        =1-Bn                       =1/Cn

    (The 'n' in the cell formulas above represents the row number).

    Evidently this calculation hits the limit of Excel's computational
    precision at about 8 sigma.

    For what it's worth, ERF(z) is defined as 2/sqrt(pi) * INT[(0,z) exp(-t²)
    dt], and the area under the "normal" Bell curve from -z to z is just
    ERF(z/sqrt(2)). There's a discussion of it here:
    http://mathworld.wolfram.com/Erf.html, here:
    http://mathworld.wolfram.com/ConfidenceInterval.html, and here:
    http://jove.prohosting.com/~skripty/page_295.htm.
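
    (Put another way: the "Probability" column in the table above is the
    two-sided tail P(|Z| > s) = 1 - ERF(s/SQRT(2)), and the "Odds against"
    column is just 1/P(|Z| > s), rounded.)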

    So I decided to try to tackle this in C. I downloaded a package
    called the GNU Scientific Library (http://www.gnu.org/software/gsl/) -
    there's a nice precompiled binary for Microsoft Visual C++ at
    http://www.network-theory.co.uk/gsl/freedownloads.html. I wrote a little
    piece of code to try it out:

    ==========================================================================
    #include <stdio.h>
    #include <gsl/gsl_math.h>
    #include <gsl/gsl_sf_erf.h>

    int main(void)
    {
        double odds;
        double index;
        double sigma;
        double result;

        for (index = 1.0; index <= 8.5; index += 0.5)
        {
            sigma = index / M_SQRT2;
            result = 1 - gsl_sf_erf (sigma);
            odds = 1 / result;
            /* '\345' is the sigma character in the original console code page;
               result and odds are doubles, so %E (not %LE) is the matching conversion */
            printf("P(%2.1f \345) = %.18E \t %-1.18E\n", index, result, odds);
        }
        return 0;
    }
    ==========================================================================

    And here's its output:

    P(1.0 s) = 3.173105078629142600E-001 3.151487187534375500E+000
    P(1.5 s) = 1.336144025377161700E-001 7.484223115226846800E+000
    P(2.0 s) = 4.550026389635841700E-002 2.197789450799282900E+001
    P(2.5 s) = 1.241933065155231800E-002 8.051963733448130300E+001
    P(3.0 s) = 2.699796063260206900E-003 3.703983473449563900E+002
    P(3.5 s) = 4.652581580710801700E-004 2.149344364311446000E+003
    P(4.0 s) = 6.334248366623995700E-005 1.578719276732396800E+004
    P(4.5 s) = 6.795346249477418600E-006 1.471595358480670300E+005
    P(5.0 s) = 5.733031437360480700E-007 1.744277893686913900E+006
    P(5.5 s) = 3.797912495606681200E-008 2.633025382119182500E+007
    P(6.0 s) = 1.973175289826656400E-009 5.067973459610119500E+008
    P(6.5 s) = 8.031997289492665000E-011 1.245020340467724800E+010
    P(7.0 s) = 2.559619183273298400E-012 3.906831166662759400E+011
    P(7.5 s) = 6.383782391594650100E-014 1.566469435607129100E+013
    P(8.0 s) = 1.221245327087672200E-015 8.188362958855447500E+014
    P(8.5 s) = 0.000000000000000000E+000 1.#INF00000000000000E+000


    Same limits as in Excel, it seems. GSL's gsl_sf_erf function takes a
    double-precision argument and returns double. My Visual C++ compiler (v.
    6.0) specifies that both double and long double use an 8-byte
    representation: "The long double contains 80 bits: 1 for sign, 15 for
    exponent, and 64 for mantissa. Its range is +/- 1.2E4932 with at least 19
    digits of precision."

    Strangely enough, I only seem to be getting 17 digits of precision.
    Regardless, I estimate that this calculation will require at least 38
    additional digits of precision. Interesting: according to IEEE 754, the
    condition for "positive underflow" (single precision) shouldn't happen
    for positive numbers greater than about 1.4E-045 or so. For double
    precision, it should happen only for positive numbers less than about
    1.0E-308.
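
    A quick way to check what the compiler actually provides is to print the
    <float.h> limits. Here is a minimal sketch (on this compiler, long double
    is reported as the same 8-byte format as double):

    ==========================================================================
    #include <stdio.h>
    #include <float.h>

    int main(void)
    {
        /* Sizes and decimal-digit limits of the floating types in use */
        printf("sizeof(double)      = %u\n", (unsigned) sizeof(double));
        printf("sizeof(long double) = %u\n", (unsigned) sizeof(long double));
        printf("DBL_DIG  = %d   DBL_EPSILON  = %e\n",  DBL_DIG,  DBL_EPSILON);
        printf("LDBL_DIG = %d   LDBL_EPSILON = %Le\n", LDBL_DIG, LDBL_EPSILON);
        return 0;
    }
    ==========================================================================

    For IEEE doubles, DBL_DIG is 15 (the guaranteed decimal digits), while up
    to 17 digits are needed to print a double exactly, which accounts for the
    "15 digits in Excel, 17 digits from printf" observation above.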

    So here's what I'd like to know.

    Are there 64-bit implementations of something like GSL which would
    produce more precise output on appropriate OS/hardware platforms like
    WinXP-64, Solaris v. 7-9, Tru64 Unix, Linux, etc? Has anyone
    implemented a 128-bit long long double datatype or equivalent?

    Can a calculation like this be performed with some kind of arbitrary-
    precision numeric package (something like Michael Ring's MAPM or PHP
    BCMath), or evaluated directly by Mathematica or the java.math package?

    Or maybe I'm just confused and there's an easier way to do this which
    I'm just not seeing.

    By the way, I don't know of anyone who's bothered to tabulate the
    values of this function nearly this far: most such texts (Zwillinger's CRC
    Handbooks, Abramowitz & Stegun, Gradshteyn & Ryzhik, etc) only go to about
    4.0 sigma.

    Any ideas?

    -- Dave Schulman ()
    DAVID SCHULMAN, Sep 10, 2003
    #1

  2. Jirka Klaue

    DAVID SCHULMAN wrote:
    [...]
    > ==========================================================================
    >
    >      And here's its output:
    >
    > P(8.5 s) = 0.000000000000000000E+000     1.#INF00000000000000E+000
    [...]
    >      Can a calculation like this be performed with some kind of arbitrary-
    > precision numeric package (something like Michael Ring's MAPM or PHP
    > BCMath), or evaluated directly by Mathematica or the java.math package?

    Mathematica, Mathcad, MATLAB, and Maple can all do such stuff.
    Here is what Maple spits out:
    > Digits := 190: 1 - erf(20.); 1 / %;

        0.539586561160790 * 10^(-175)

        0.185327076687888932317857224343679932565592017483138279103445269\
          2808207933709765329642460892658867968117131478014046442587\
          5897092558267970813224336350626158308236331344839925105294\
          06688155039 * 10^176
    
    Jirka
    Jirka Klaue, Sep 10, 2003
    #2

  3. "DAVID SCHULMAN" <> writes:

    [Long explanation snipped. The poster wants to calculate the exclusion
    probability corresponding to 20 standard deviations of a normal
    distribution.]

    In C99, you can use the `erfc' function. Many C89 implementations provide
    it as an extension.

    #include <stdio.h>
    #include <math.h>

    int main (void)
    {
    printf ("%e\n", erfc (20.0 / sqrt (2.0)));
    return 0;
    }

    This program prints `5.507248e-89'.

    Note that although the /mathematical/ definition of `erfc(x)' is
    `1.0 - erf(x)', this is not a good way to actually calculate the function.
    The `erf' function would have to return a value which is 5.507248e-89 less
    than 1.0, which would require at least 294 significant bits to represent.
    Usually, floating point numbers are much less precise in C, so `erf' just
    returns the closest value it can represent, 1.0.
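
    For example, a minimal side-by-side check (C99; the exact digits printed
    depend on the implementation's erf/erfc):

    #include <stdio.h>
    #include <math.h>

    int main (void)
    {
        double x = 20.0 / sqrt (2.0);
        /* erf(x) rounds to exactly 1.0 in double, so the subtraction cancels to 0 */
        printf ("1 - erf(x) = %e\n", 1.0 - erf (x));
        /* erfc computes the tail directly, with no cancellation against 1.0 */
        printf ("erfc(x)    = %e\n", erfc (x));
        return 0;
    }

    The first line prints 0.000000e+00; the second, approximately 5.507248e-89.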

    > So I decided to try to tackle this in C. I downloaded a package
    > called the GNU Scientific Library (http://www.gnu.org/software/gsl/) -
    > there's a nice precompiled binary for Microsoft Visual C++ at
    > http://www.network-theory.co.uk/gsl/freedownloads.html. I wrote a little
    > piece of code to try it out:
    >
    > ==========================================================================
    > #include <stdio.h>
    > #include <gsl/gsl_math.h>
    > #include <gsl/gsl_sf_erf.h>
    >
    > int main(void)
    > {
    > double odds;
    > double index;
    > double sigma;
    > double result;
    >
    > for (index = 1.0; index <= 8.5; index += 0.5)
    > {
    > sigma = index / M_SQRT2;
    > result = 1 - (gsl_sf_erf (sigma));
    > odds = 1 / result;
    > printf("P(%2.1f \345) = %.18LE \t %-1.18lE\n", index, result, odds);
    > }
    > return 0;
    > }
    > ==========================================================================


    While I'm not familiar with the GNU Scientific Library, I would guess that
    it also provides an `erfc' function if it provides an `erf' function.
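
    If it does, a sketch along the lines of the original program, but using
    the complementary and log-complementary error functions (assuming the
    names gsl_sf_erfc and gsl_sf_log_erfc), would look like this:

    #include <stdio.h>
    #include <gsl/gsl_math.h>
    #include <gsl/gsl_sf_erf.h>

    int main (void)
    {
        double x = 20.0 / M_SQRT2;
        /* complementary error function: no cancellation against 1.0 */
        printf ("erfc(20/sqrt(2))      = %.15e\n", gsl_sf_erfc (x));
        /* natural log of erfc: usable even where erfc itself would underflow */
        printf ("log(erfc(20/sqrt(2))) = %.15e\n", gsl_sf_log_erfc (x));
        return 0;
    }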

    > My Visual C++ compiler (v. 6.0) specifies that both double and long
    > double use an 8-byte representation: "The long double contains 80 bits:
    > 1 for sign, 15 for exponent, and 64 for mantissa. Its range is +/-
    > 1.2E4932 with at least 19 digits of precision."


    As explained above `erf(20.0/sqrt(2.0))' can only be represented as a
    value different from 1.0 with at least 294 bits for the mantissa.

    Martin
    Martin Dickopp, Sep 10, 2003
    #3
  4. jacob navia

    Use the lcc-win32 C compiler.
    Here is a small program that calculates with 350-bit precision. The math
    functions take a 'q' (qfloat) suffix; lcc-win32 provides a 350-bit extended
    float type, qfloat.

    #include <math.h>
    #include <stdio.h>
    #include <qfloat.h>

    int main(void)
    {
        printf("%.100qe\n", erfcq(20/sqrtq(2.0q)));
        return 0;
    }

    Output
    ---------
    5.5072482372124673901512455617149306656149954695186611353
    987433091698372417656070673834468842358561507e-89

    Lcc-win32 can be downloaded at no charge from
    http://www.cs.virginia.edu/~lcc-win32


    "DAVID SCHULMAN" <> wrote in message
    news:t4J7b.18514$...
    > I've been trying to perform a calculation that has been running into
    > an underflow (insufficient precision) problem in Microsoft Excel, which
    > calculates using at most 15 significant digits. For this purpose, that
    > isn't enough.
    >
    > I was reading a book about some of the financial scandals of the
    > 1990s called "Inventing Money: The Story of Long-Term Capital Management
    > and the Legends Behind it" by Nicholas Dunbar. On page 95, he mentions
    > that the 1987 stock-market crash was designated in economists' computer
    > models as a "20-sigma event". That is, their models (which obviously were
    > badly flawed!) put such an event in the exponential "tails" of a normal
    > Gaussian distribution, outside the range +/- 20 standard deviations from
    > the mean. In other words - vastly unlikely.
    >
    > So I wanted to know: just how unlikely is that? Where did they get
    > the data to support that idea, anyway?
    >
    > I put together an Excel spreadsheet, attempting to calculate this
    > probability. But after about 8 sigma I ran into the limits of Excel's
    > 15-digit
    > precision and could go no further. Since the odds of an 8-sigma event
    > are about 819 trillion to 1 against, a 20-sigma event is obviously going to
    > be so unlikely that it's not even worth talking about; would probably
    > never happen in the lifetime of the universe. Further evidence that those
    > computer models had some serious problems.
    >
    > To perform that calculation, I used Excel's built-in error function
    > ERF(), and here's the output:
    >
    > s Confidence Interval Probability Odds against
    >
    > 1.0 68.268949131685700% 31.7310508683143000000% 3
    > 1.5 86.638554269553900% 13.3614457304461000000% 7
    > 2.0 95.449972950730900% 4.55002704926911000000% 22
    > 2.5 98.758066814134400% 1.24193318586556000000% 81
    > 3.0 99.730020387427800% 0.26997961257220200000% 370
    > 3.5 99.953474183672100% 0.04652581632794690000% 2,149
    > 4.0 99.993665751560700% 0.00633424843932140000% 15,787
    > 4.5 99.999320465373800% 0.00067953462623560100% 147,160
    > 5.0 99.999942669685300% 0.00005733031473997840% 1,744,278
    > 5.5 99.999996202087500% 0.00000379791249560668% 26,330,254
    > 6.0 99.999999802682500% 0.00000019731752898267% 506,797,346
    > 6.5 99.999999991968000% 0.00000000803199728949% 12,450,203,405
    > 7.0 99.999999999744000% 0.00000000025596191833% 390,683,116,666
    > 7.5 99.999999999993600% 0.00000000000638378239% 15,664,694,356,071
    > 8.0 99.999999999999900% 0.00000000000000000000% 818,836,295,885,545
    > 8.5 100.00000000000000% 0.00000000000000000000% #DIV/0!
    > . . . .
    > s =ERF(An/SQRT(2)) =1-Bn =1/Cn
    >
    > (The 'n' in the cell formulas above represents the row number).
    >
    > Evidently this calculation hits the limit of Excel's computational
    > precision at about 8 sigma.
    >
    > For what it's worth, ERF(z) is defined as 2/pi * INT[(0,z) exp(-t²)
    > dt], and the area under the "normal" Bell curve from -z to z is just
    > ERF(z/sqrt(2)). There's a discussion of it here:
    > http://mathworld.wolfram.com/Erf.html, here:
    > http://mathworld.wolfram.com/ConfidenceInterval.html, and here:
    > http://jove.prohosting.com/~skripty/page_295.htm.
    >
    > So I decided to try to tackle this in C. I downloaded a package
    > called the GNU Scientific Library (http://www.gnu.org/software/gsl/) -
    > there's a nice precompiled binary for Microsoft Visual C++ at
    > http://www.network-theory.co.uk/gsl/freedownloads.html. I wrote a little
    > piece of code to try it out:
    >
    > ==========================================================================
    > #include <stdio.h>
    > #include <gsl/gsl_math.h>
    > #include <gsl/gsl_sf_erf.h>
    >
    > int main(void)
    > {
    > double odds;
    > double index;
    > double sigma;
    > double result;
    >
    > for (index = 1.0; index <= 8.5; index += 0.5)
    > {
    > sigma = index / M_SQRT2;
    > result = 1 - (gsl_sf_erf (sigma));
    > odds = 1 / result;
    > printf("P(%2.1f \345) = %.18LE \t %-1.18lE\n", index, result, odds);
    > }
    > return 0;
    > }
    > ==========================================================================
    >
    > And here's its output:
    >
    > P(1.0 s) = 3.173105078629142600E-001 3.151487187534375500E+000
    > P(1.5 s) = 1.336144025377161700E-001 7.484223115226846800E+000
    > P(2.0 s) = 4.550026389635841700E-002 2.197789450799282900E+001
    > P(2.5 s) = 1.241933065155231800E-002 8.051963733448130300E+001
    > P(3.0 s) = 2.699796063260206900E-003 3.703983473449563900E+002
    > P(3.5 s) = 4.652581580710801700E-004 2.149344364311446000E+003
    > P(4.0 s) = 6.334248366623995700E-005 1.578719276732396800E+004
    > P(4.5 s) = 6.795346249477418600E-006 1.471595358480670300E+005
    > P(5.0 s) = 5.733031437360480700E-007 1.744277893686913900E+006
    > P(5.5 s) = 3.797912495606681200E-008 2.633025382119182500E+007
    > P(6.0 s) = 1.973175289826656400E-009 5.067973459610119500E+008
    > P(6.5 s) = 8.031997289492665000E-011 1.245020340467724800E+010
    > P(7.0 s) = 2.559619183273298400E-012 3.906831166662759400E+011
    > P(7.5 s) = 6.383782391594650100E-014 1.566469435607129100E+013
    > P(8.0 s) = 1.221245327087672200E-015 8.188362958855447500E+014
    > P(8.5 s) = 0.000000000000000000E+000 1.#INF00000000000000E+000
    >
    >
    > Same limits as in Excel, it seems. GSL's gsl_sf_erf function takes a
    > double-precision argument and returns double. My Visual C++ compiler (v.
    > 6.0) specifies that both double and long double use an 8-byte
    > representation: "The long double contains 80 bits: 1 for sign, 15 for
    > exponent, and 64 for mantissa. Its range is +/- 1.2E4932 with at least 19
    > digits of precision."
    >
    > Strangely enough, I only seem to be getting 17 digits of precision.
    > Regardless, I estimate that this calculation will require at least 38
    > additional digits of precision. Interesting: according to IEEE 754, the
    > condition for "positive underflow" (single precision) shouldn't happen
    > for positive numbers greater than about 1.4E-045 or so. For double
    > precision, it should happen only for positive numbers less than about
    > 1.0E-308.
    >
    > So here's what I'd like to know.
    >
    > Are there 64-bit implementations of something like GSL which would
    > produce more precise output on appropriate OS/hardware platforms like
    > WinXP-64, Solaris v. 7-9, Tru64 Unix, Linux, etc? Has anyone
    > implemented a 128-bit long long double datatype or equivalent?
    >
    > Can a calculation like this be performed with some kind of arbitrary-
    > precision numeric package (something like Michael Ring's MAPM or PHP
    > BCMath), or evaluated directly by Mathematica or the java.math package?
    >
    > Or maybe I'm just confused and there's an easier way to do this which
    > I'm just not seeing.
    >
    > By the way, I don't know of anyone who's bothered to tabulate the
    > values of this function nearly this far: most such texts (Zwillinger's CRC
    > Handbooks, Abramowitz & Stegun, Gradshteyn & Ryzhik, etc) only go to about
    > 4.0 sigma.
    >
    > Any ideas?
    >
    > -- Dave Schulman ()
    >
    >
    jacob navia, Sep 10, 2003
    #4
  5. "DAVID SCHULMAN" <> wrote in message
    news:t4J7b.18514$...
    > I've been trying to perform a calculation that has been running into
    > an underflow (insufficient precision) problem in Microsoft Excel, which
    > calculates using at most 15 significant digits. For this purpose, that
    > isn't enough.
    >
    > I was reading a book about some of the financial scandals of the
    > 1990s called "Inventing Money: The Story of Long-Term Capital Management
    > and the Legends Behind it" by Nicholas Dunbar. On page 95, he mentions
    > that the 1987 stock-market crash was designated in economists' computer
    > models as a "20-sigma event". That is, their models (which obviously were
    > badly flawed!) put such an event in the exponential "tails" of a normal
    > Gaussian distribution, outside the range +/- 20 standard deviations from
    > the mean. In other words - vastly unlikely.
    >
    > So I wanted to know: just how unlikely is that? Where did they get
    > the data to support that idea, anyway?
    >
    > I put together an Excel spreadsheet, attempting to calculate this
    > probability. But after about 8 sigma I ran into the limits of Excel's
    > 15-digit
    > precision and could go no further. Since the odds of an 8-sigma event
    > are about 819 trillion to 1 against, a 20-sigma event is obviously going

    to
    > be so unlikely that it's not even worth talking about; would probably
    > never happen in the lifetime of the universe. Further evidence that those
    > computer models had some serious problems.
    >
    > To perform that calculation, I used Excel's built-in error function
    > ERF(), and here's the output:
    >
    > s Confidence Interval Probability Odds against
    >
    > 1.0 68.268949131685700% 31.7310508683143000000% 3
    > 1.5 86.638554269553900% 13.3614457304461000000% 7
    > 2.0 95.449972950730900% 4.55002704926911000000% 22
    > 2.5 98.758066814134400% 1.24193318586556000000% 81
    > 3.0 99.730020387427800% 0.26997961257220200000% 370
    > 3.5 99.953474183672100% 0.04652581632794690000% 2,149
    > 4.0 99.993665751560700% 0.00633424843932140000% 15,787
    > 4.5 99.999320465373800% 0.00067953462623560100% 147,160
    > 5.0 99.999942669685300% 0.00005733031473997840% 1,744,278
    > 5.5 99.999996202087500% 0.00000379791249560668% 26,330,254
    > 6.0 99.999999802682500% 0.00000019731752898267% 506,797,346
    > 6.5 99.999999991968000% 0.00000000803199728949% 12,450,203,405
    > 7.0 99.999999999744000% 0.00000000025596191833% 390,683,116,666
    > 7.5 99.999999999993600% 0.00000000000638378239% 15,664,694,356,071
    > 8.0 99.999999999999900% 0.00000000000000000000% 818,836,295,885,545
    > 8.5 100.00000000000000% 0.00000000000000000000% #DIV/0!
    > . . . .
    > s =ERF(An/SQRT(2)) =1-Bn =1/Cn
    >
    > (The 'n' in the cell formulas above represents the row number).
    >
    > Evidently this calculation hits the limit of Excel's computational
    > precision at about 8 sigma.
    >
    > For what it's worth, ERF(z) is defined as 2/pi * INT[(0,z) exp(-t²)
    > dt], and the area under the "normal" Bell curve from -z to z is just
    > ERF(z/sqrt(2)). There's a discussion of it here:
    > http://mathworld.wolfram.com/Erf.html, here:
    > http://mathworld.wolfram.com/ConfidenceInterval.html, and here:
    > http://jove.prohosting.com/~skripty/page_295.htm.
    >
    > So I decided to try to tackle this in C. I downloaded a package
    > called the GNU Scientific Library (http://www.gnu.org/software/gsl/) -
    > there's a nice precompiled binary for Microsoft Visual C++ at
    > http://www.network-theory.co.uk/gsl/freedownloads.html. I wrote a little
    > piece of code to try it out:
    >
    > ==========================================================================
    > #include <stdio.h>
    > #include <gsl/gsl_math.h>
    > #include <gsl/gsl_sf_erf.h>
    >
    > int main(void)
    > {
    > double odds;
    > double index;
    > double sigma;
    > double result;
    >
    > for (index = 1.0; index <= 8.5; index += 0.5)
    > {
    > sigma = index / M_SQRT2;
    > result = 1 - (gsl_sf_erf (sigma));
    > odds = 1 / result;
    > printf("P(%2.1f \345) = %.18LE \t %-1.18lE\n", index, result, odds);
    > }
    > return 0;
    > }
    > ==========================================================================
    >
    > And here's its output:
    >
    > P(1.0 s) = 3.173105078629142600E-001 3.151487187534375500E+000
    > P(1.5 s) = 1.336144025377161700E-001 7.484223115226846800E+000
    > P(2.0 s) = 4.550026389635841700E-002 2.197789450799282900E+001
    > P(2.5 s) = 1.241933065155231800E-002 8.051963733448130300E+001
    > P(3.0 s) = 2.699796063260206900E-003 3.703983473449563900E+002
    > P(3.5 s) = 4.652581580710801700E-004 2.149344364311446000E+003
    > P(4.0 s) = 6.334248366623995700E-005 1.578719276732396800E+004
    > P(4.5 s) = 6.795346249477418600E-006 1.471595358480670300E+005
    > P(5.0 s) = 5.733031437360480700E-007 1.744277893686913900E+006
    > P(5.5 s) = 3.797912495606681200E-008 2.633025382119182500E+007
    > P(6.0 s) = 1.973175289826656400E-009 5.067973459610119500E+008
    > P(6.5 s) = 8.031997289492665000E-011 1.245020340467724800E+010
    > P(7.0 s) = 2.559619183273298400E-012 3.906831166662759400E+011
    > P(7.5 s) = 6.383782391594650100E-014 1.566469435607129100E+013
    > P(8.0 s) = 1.221245327087672200E-015 8.188362958855447500E+014
    > P(8.5 s) = 0.000000000000000000E+000 1.#INF00000000000000E+000
    >
    >
    > Same limits as in Excel, it seems. GSL's gsl_sf_erf function takes a
    > double-precision argument and returns double. My Visual C++ compiler (v.
    > 6.0) specifies that both double and long double use an 8-byte
    > representation: "The long double contains 80 bits: 1 for sign, 15 for
    > exponent, and 64 for mantissa. Its range is +/- 1.2E4932 with at least 19
    > digits of precision."
    >
    > Strangely enough, I only seem to be getting 17 digits of precision.
    > Regardless, I estimate that this calculation will require at least 38
    > additional digits of precision. Interesting: according to IEEE 754, the
    > condition for "positive underflow" (single precision) shouldn't happen
    > for positive numbers greater than about 1.4E-045 or so. For double
    > precision, it should happen only for positive numbers less than about
    > 1.0E-308.
    >
    > So here's what I'd like to know.
    >
    > Are there 64-bit implementations of something like GSL which would
    > produce more precise output on appropriate OS/hardware platforms like
    > WinXP-64, Solaris v. 7-9, Tru64 Unix, Linux, etc? Has anyone
    > implemented a 128-bit long long double datatype or equivalent?
    >
    > Can a calculation like this be performed with some kind of arbitrary-
    > precision numeric package (something like Michael Ring's MAPM or PHP
    > BCMath), or evaluated directly by Mathematica or the java.math package?
    >
    > Or maybe I'm just confused and there's an easier way to do this which
    > I'm just not seeing.
    >
    > By the way, I don't know of anyone who's bothered to tabulate the
    > values of this function nearly this far: most such texts (Zwillinger's CRC
    > Handbooks, Abramowitz & Stegun, Gradshteyn & Ryzhik, etc) only go to about
    > 4.0 sigma.
    >
    > Any ideas?
    >
    > -- Dave Schulman ()
    >
    >

    &&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&

    Whatever method you used to compute those values of the normal
    integral, they are not very accurate.

    Here are the true values to 15 places, listed with those you posted:

     s     True, to 15 places       Your posted values
    1.0 3.173105078629141e-01 3.173105078629142600E-001
    1.5 1.336144025377161e-01 1.336144025377161700E-001
    2.0 4.550026389635841e-02 4.550026389635841700E-002
    2.5 1.241933065155227e-02 1.241933065155231800E-002
    3.0 2.699796063260189e-03 2.699796063260206900E-003
    3.5 4.652581580710501e-04 4.652581580710801700E-004
    4.0 6.334248366623984e-05 6.334248366623995700E-005
    4.5 6.795346249460121e-06 6.795346249477418600E-006
    5.0 5.733031437583878e-07 5.733031437360480700E-007
    5.5 3.797912493177544e-08 3.797912495606681200E-008
    6.0 1.973175290075396e-09 1.973175289826656400E-009
    6.5 8.032001167718236e-11 8.031997289492665000E-011
    7.0 2.559625087771670e-12 2.559619183273298400E-012
    7.5 6.381783345821792e-14 6.383782391594650100E-014
    8.0 1.244192114854357e-15 1.221245327087672200E-015
    8.5 1.895906964440664e-17 0.000000000000000000E+000
    9.0 2.257176811907681e-19
    9.5 2.098903015072521e-21
    10.0 1.523970604832105e-23
    10.5 8.638012635618461e-26
    11.0 3.821319148997350e-28
    11.5 1.319154289222600e-30
    12.0 3.552964224200000e-33

    Why most of those who deal with the normal integral in probability
    theory are still stuck with the historical baggage of the error function
    is a puzzle to me, as is the poor quality of the results one gets from
    standard library implementations of erf(). (One of the most common
    is based on Algorithm AS 66, Applied Statistics (1973), Vol. 22, p. 424, by Hill,
    which gives only 6-8 digit accuracy).

    Here is a listing of my method:

    /*
       Marsaglia Complementary Normal Distribution Function
       cPhi(x) = integral from x to infinity of exp(-.5*t^2)/sqrt(2*pi), x < 15
       15-digit accuracy for x < 15, returns 0 for x > 15.
    */
    #include <math.h>

    double cPhi(double x){
        long double v[] = {0., .65567954241879847154L,
            .42136922928805447322L, .30459029871010329573L,
            .23665238291356067062L, .19280810471531576488L,
            .16237766089686746182L, .14010418345305024160L,
            .12313196325793229628L, .10978728257830829123L,
            .99028596471731921395e-1L, .90175675501064682280e-1L,
            .82766286501369177252e-1L, .76475761016248502993e-1L,
            .71069580538852107091e-1L, .66374235823250173591e-1L};
        long double h, a, b, z, t, sum, pwr;
        int i, j;
        if (x >  15.) return (0.);
        if (x < -15.) return (1.);
        j = fabs(x) + 1.;        /* expansion point z = 1, 2, ..., 15 */
        z = j;
        h = fabs(x) - z;         /* offset from the expansion point, in [-1,0) */
        a = v[j];                /* a = R(z) from the table above */
        b = z*a - 1.;            /* b = R'(z) */
        pwr = 1.;
        sum = a + h*b;
        for (i = 2; i < 60; i += 2){     /* Taylor series of R(z+h), two terms at a time */
            a = (a + z*b)/i;
            b = (b + z*a)/(i + 1);
            pwr = pwr*h*h;
            t = sum;
            sum = sum + pwr*(a + h*b);
            if (sum == t) break;
        }
        sum = sum*exp(-.5*x*x - .91893853320467274178L);   /* multiply by phi(x) */
        if (x < 0.) sum = 1. - sum;
        return ((double) sum);
    }
    /* end of listing */
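
    A minimal driver (a sketch, not part of the original listing) that prints
    the two-sided tail 2*cPhi(s) for the same values of s as the table above:

    #include <stdio.h>

    double cPhi(double x);   /* as listed above */

    int main(void)
    {
        double s;
        for (s = 1.0; s <= 12.0; s += 0.5)
            printf("%4.1f  %.15e\n", s, 2.0 * cPhi(s));   /* P(|Z| > s) */
        return 0;
    }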

    The method is based on defining phi(x) = exp(-x^2/2)/sqrt(2*pi) and

    R(x) = cPhi(x)/phi(x).

    The function R(x) is well-behaved, and the terms of its Taylor
    series are readily obtained by a two-term recursion: since
    cPhi'(x) = -phi(x) and phi'(x) = -x*phi(x), the ratio satisfies
    R'(x) = x*R(x) - 1, and expanding R about a point z turns that equation
    into the coupled recursion for the coefficients a and b used in the loop
    above. With an accurate representation of R(x) at, say, x = 0, 1, 2, ..., 15,
    a simple evaluation of the Taylor series at intermediate points provides
    up to 15 digits of accuracy.
    An article describing the method will be in the new version of
    my Diehard CDROM. A new version of the Diehard tests
    of randomness (but not yet the new DVDROM) is at
    http://www.csis.hku.hk/~diehard/

    One other point about your posting:

    As is often done, you mistake odds for probabilities.
    The odds for an event should be represented as the ratio
    of p to 1-p, not 1/p. Thus if a bookie estimates the
    probability that a certain horse will win as .2, then
    the odds are 2 to 3 for and 3 to 2 against.
    Of course when p is close to zero, the ratio 1/p is close
    to (1-p)/p, but it is probably a sound practice to maintain
    the distinction between odds and probabilities.

    George Marsaglia
    George Marsaglia, Sep 11, 2003
    #5
  6. John L

    "George Marsaglia" <> wrote in message news:...
    >
    >
    > As is often done, you mistake odds for probabilities.
    > The odds for an event should be represented as the ratio
    > of p to 1-p, not 1/p. Thus if a bookie estimates the
    > probability that a certain horse will win as .2, then
    > the odds are 2 to 3 for and 3 to 2 against.
    > Of course when p is close to zero, the ratio 1/p is close
    > to (1-p)/p, but it is probably a sound practice to maintain
    > the distinction between odds and probabilities.
    >


    Typo? P = 0.2 = 1/5 <=> odds of 4-1 against.

    John.
    John L, Sep 12, 2003
    #6
