Avoiding NaN and Inf on floating point division

Discussion in 'C Programming' started by ardi, Jan 4, 2014.

  1. ardi

    ardi Guest


    Am I right supposing that if a floating point variable x is normal (not denormal/subnormal) it is guaranteed that for any non-NaN and non-Inf variablecalled y, the result y/x is guaranteed to be non-NaN and non-Inf?

    If affirmative, I've two doubts about this. First, how efficient can one expect the isnormal() macro to be? I mean, should one expect it to be much slower than doing an equality comparison to zero (x==0.0) ? Or should theperformance be similar?

    Second, how could I "emulate" isnormal() on older systems that lack it? Forexample, if I compile on IRIX 6.2, which AFAIK lacks isnormal(), is there some workaround which would also guarantee me that the division doesn't generate NaN nor Inf?

    Also, if the isnormal() macro can be slow, is there any other approach which would also give me the guarantee I'm asking for? Maybe comparing to some standard definition which holds the smallest normal value available for each data type? Are such definitions standardized in some way such that I can expect to find them in some standard header on most OSs/compilers? Would I be safe to test it this way rather than with the isnormal() macro?

    Thanks a lot!


    P.S: Yes, I realize a floating point division can result in a meaningless value even if it's non-NaN and non-Inf, because you might be dividing by a very small (yet normal) which should be zero but it isn't because of the math operations performed on it previously. But this is another problem. I'm asking for an scenario where you don't care if the result is meaningless or not, you just need to be sure it isn't NaN nor Inf.
    ardi, Jan 4, 2014
    1. Advertisements

  2. ardi

    jacob navia Guest

    Le 04/01/2014 12:07, ardi a écrit :
    it is guaranteed that for any non-NaN and non-Inf variable called y, the
    result y/x is guaranteed

    to be non-NaN and non-Inf?


    #include <stdio.h>
    int main(void)
    double a=1e300,b=1e-300,c;
    jacob navia, Jan 4, 2014
    1. Advertisements

  3. ardi

    ardi Guest

    Ooops!!! I believe this means I forgot you can also get Inf from overflow.... if a number is very big and a division turns it even larger, it can overflow, and then it becomes Inf even if the denominator is a normal value.

    This effectively breaks my quest for "healthy divisions". I guess I'm back to my old arbitrary epsilon checking approach (i.e.: check the denominator for fabs(x)>epsilon for deciding whether the division can be performed or not, where epsilon is left as an exercise for the reader ;-)


    ardi, Jan 4, 2014
  4. No. Assuming what goes by the name of IEEE floating point, you will get
    NaN when y == x == 0, and Inf from all sorts of values for x and y
    (DBL_MAX/0.5 for example).

    An excellent starting point is to search the web for Goldberg's paper
    "What Every Computer Scientist Should Know About Floating-Point
    Arithmetic". It will pay off the time spent in spades.
    I'd expect it to be fast. Probably not as fast as a test for zero, but
    it can be done by simple bit testing.

    However, you say "if affirmative" and the answer to your question is
    "no" so maybe all the rest is moot.
    There are lots of ways. For example, IEEE double precision sub-normal
    numbers have an absolute value less less than DBL_MIN (defined in
    float.h). You can also test normality by looking at the bits. For
    example, a sub-normal IEEE number has zero bits in the exponent field
    and a non-zero fraction.
    The guarantee you want is that a division won't generate NaN or +/-Inf?
    The simplest method is to do the division and test the result, but maybe
    one or more of your systems generates a signal that you want to avoid?
    I think you should a bit more about what you are trying to do.

    It's generally easy to test if you'll get a NaN from the division of
    non-NaN numbers (you only get NaN from 0/0 and the four signed cases of
    Inf/Inf), but pre-testing for Inf is harder.
    Your C library should have float.h and that should define FLT_MIN,
    DBL_MIN and LDBL_MIN but I don't think that helps you directly.

    Ben Bacarisse, Jan 4, 2014
  5. ardi

    Tim Prince Guest

    Maybe you could simply edit the glibc or OpenBSD implementation into
    your working copy of your headers, if you aren't willing to update your
    compiler or run-time library.


    Is your compiler so old that it doesn't implement inline functions?
    That's the kind of background you need to answer your own question about
    speed. Then you may need to use an old-fashioned macro (with its
    concerns about double evaluation of expressions).
    Tim Prince, Jan 4, 2014
  6. ardi

    James Kuyper Guest

    How could that be true? If the mathematical value of y/x were greater
    than DBL_MAX, or smaller than -DBL_MAX, what do you expect the floating
    point value of y/x to be? What you're really trying to do is prevent
    floating point overflow, and a test for isnormal() is not sufficient.
    You must also check whether fabs(x) > fabs(y)/DBL_MAX (assuming that x
    and y are both doubles).

    As far as the C standard is concerned, the accuracy of floating point
    math is entirely implementation-defined, and it explicitly allows the
    implementation-provided definition to be "the accuracy is unknown"
    ( Therefore, a fully conforming implementation of C is
    allowed to implement math that is so inaccurate that DBL_MIN/DBL_MAX >
    DBL_MAX. In practice, you wouldn't be able to sell such an
    implementation to anyone who actually needed to perform floating point
    math - but that issue is outside the scope of the standard.

    However, if an implementation pre-#defines __STDC_IEC_559__, it is
    required to conform to the requirements of Annex F (, which
    are based upon but not completely identical to the requirements of IEC
    60559:1989, which in turn is essentially equivalent to IEEE 754:1985.
    That implies fairly strict requirements on the accuracy; for the most
    part, those requirements are as strict as they reasonably could be.
    The performance is inherently system-specific; for all I know there
    might be floating point chips where isnormal() can be implemented by a
    single floating point instruction; but at the very worst it shouldn't be
    much more complicated than a few mask and shift operations on the bytes
    of a copy of the argument.
    Find a precise definition of the floating point format implemented on
    that machine (which might not fully conform to IEEE requirements), and
    you can then implement isnormal() by performing a few mask and shift
    operations on the individual bytes of the argument.
    If you can find a alternative way of implementing the equivalent of
    isnormal() that is significantly faster than calling the macro provided
    by a given version of the C standard library, then you should NOT use
    that alternative; what you should do is drop that version of the C
    standard library and replace it with one that's better-implemented.
    Yes, that's what FLT_MIN, DBL_MIN, and LDBL_MIN are for.
    It could be safe, if you handle correctly the possibility that the value
    is a NaN. Keep in mind that all comparisons with a NaN fail, so
    x>=DBL_MIN is not the same as !(x<DBL_MIN). If x is a NaN, the first
    expression is false, while the second is true.
    James Kuyper, Jan 4, 2014
  7. ardi

    Tim Prince Guest

    1/x is well-behaved when x is normal (only possible flag raised is
    inexact). That is an important enough consideration to be part of
    IEEE754 design, but not guaranteed in C without IEEE754 (the latter
    being a reasonable expectation of a good quality platform, but there are
    still exceptions). As others pointed out, your goal seems to be well
    beyond that.
    Tim Prince, Jan 4, 2014
  8. That's not always an option. What you should probably do in
    that case is (a) consider carefully whether your faster version
    is actually correct, and (b) contact the maintainers of your
    Keith Thompson, Jan 4, 2014
  9. Yes, but it seems that it might not be so far off for rounding to allow
    fabs(y)/(fabs(y)/DBL_MAX) to overflow, such that your test doesn't
    guarantee no overflow.

    -- glen
    glen herrmannsfeldt, Jan 4, 2014
  10. I haven't looked at IEEE754 in that much detail, but on many floating
    point systems the exponent range is such that the smallest normal
    floating point value will overflow on computing 1/x. If the exponent
    range is symmetric, there is a factor of the base (2 or 10) to
    -- glen
    glen herrmannsfeldt, Jan 4, 2014
  11. ardi

    Tim Prince Guest

    IEEE754 specifically requires an asymmetric range such that 1/TINY(x)
    (Fortran) or 1/FLT_MIN, 1/DBL_MIN, ... don't overflow. At the other
    end, of course, there are large numbers whose reciprocal is sub-normal
    and will "flush-to-zero" with Intel default compiler options. As far as
    I know, CPUs other than Intel(r) Xeon Phi(tm) introduced in the last 3
    years support sub-normal numbers with reasonable efficiency (it was
    decades to fulfill the promise made when the standard was instituted).
    Tim Prince, Jan 4, 2014
  12. (snip, I wrote)
    So that is where the change in bias came from. IEEE754 is pretty similar
    to VAX (other than the byte ordering), but the bias is off by one.

    If you do it in the obvious way, there is one more value of negative
    exponent than positive exponent, and an additional factor of (almost)
    the base between the largest and smallest significand.

    But it tends to take a lot of extra hardware to do denormals fast.
    For people doing floating point in FPGAs, where it is already pretty
    inefficient to generate a floating point add/subtract unit.
    (The barrel shifter for pre/post normalization is huge, compared
    to the actual add/subtract.) Then a lot more logic to get denormals.

    Otherwise, denormals give a fraction of additional bit of exponent
    range for a large additional cost of logic. Better to add one more
    exponent bit instead.

    -- glen
    glen herrmannsfeldt, Jan 5, 2014
  13. ardi

    James Kuyper Guest

    That is a real problem, though a marginal one. The best solution to that
    problem is to allow the overflow to happen, and test for it afterwards,
    but the OP seems uninterested in that option.
    James Kuyper, Jan 5, 2014
  14. What you can do is call the function frexp(). This will give you an exponent.
    Then chuck out any numbers outside or a reasonable range. IEEE doubles have
    11 bits of exponent so, -1024 to 1023. You're highly unlikely to need anything
    bigger or smaller than +/- 100, it's got to be either corrupt data or an
    intermediate in a calculation which was unstable and has lost precision.
    Malcolm McLean, Jan 5, 2014
  15. ardi

    Tim Prince Guest

    This looks attractive from the point of view of not requiring integer
    bit field examination of a memory copy of x. I have a concern about how
    to prevent compiler optimization from breaking it, as that's the point
    of leaving a standard intrinsic to the implementation to know how.
    Tim Prince, Jan 5, 2014
    1. Advertisements

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments (here). After that, you can post your question and our members will help you out.