Exact integer-valued floats

Discussion in 'Python' started by Steven D'Aprano, Sep 21, 2012.

  1. Python floats can represent exact integer values (e.g. 42.0), but above a
    certain value (see below), not all integers can be represented. For
    example:

    py> 1e16 == 1e16 + 1 # no such float as 10000000000000001.0
    True
    py> 1e16 + 3 == 1e16 + 4 # or 10000000000000003.0
    True

    So some integers are missing from the floats. For large enough values,
    the gap between floats is rather large, and many numbers are missing:

    py> 1e200 + 1e10 == 1e200
    True

    The same applies for large enough negative values.

    The question is, what is the largest integer number N such that every
    whole number between -N and N inclusive can be represented as a float?

    If my tests are correct, that value is 9007199254740992.0 = 2**53.

    Have I got this right? Is there a way to work out the gap between one
    float and the next?

    (I haven't tried to exhaustively check every float because, even at one
    nanosecond per number, it will take over 200 days.)


    --
    Steven
    Steven D'Aprano, Sep 21, 2012
    #1

  2. Ian Kelly

    On Fri, Sep 21, 2012 at 11:29 AM, Steven D'Aprano
    <> wrote:
    > The question is, what is the largest integer number N such that every
    > whole number between -N and N inclusive can be represented as a float?
    >
    > If my tests are correct, that value is 9007199254740992.0 = 2**53.
    >
    > Have I got this right? Is there a way to work out the gap between one
    > float and the next?


    That looks mathematically correct. The "gap" between floats is the
    equivalent of a difference of 1 bit in the significand. For a
    floating point number represented as (sign * c * 2 ** q), where c is
    an integer, the gap between floats is equal to 2 ** q. There are 53
    bits of precision in a double-precision float (technically an implicit
    1 followed by 52 bits), so q becomes greater than 0 at 2 ** 53.
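    One way to see q directly is float.hex(), which prints the significand
    and the power of two:

    py> (2.0**53).hex()
    '0x1.0000000000000p+53'
    py> (2.0**53 + 2).hex()  # the next representable float, one bit higher
    '0x1.0000000000001p+53'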

    Cheers,
    Ian
    Ian Kelly, Sep 21, 2012
    #2

  3. Jussi Piitulainen

    Steven D'Aprano writes:

    > Python floats can represent exact integer values (e.g. 42.0), but above a
    > certain value (see below), not all integers can be represented. For
    > example:
    >
    > py> 1e16 == 1e16 + 1 # no such float as 10000000000000001.0
    > True
    > py> 1e16 + 3 == 1e16 + 4 # or 10000000000000003.0
    > True
    >
    > So some integers are missing from the floats. For large enough values,
    > the gap between floats is rather large, and many numbers are missing:
    >
    > py> 1e200 + 1e10 == 1e200
    > True
    >
    > The same applies for large enough negative values.
    >
    > The question is, what is the largest integer number N such that every
    > whole number between -N and N inclusive can be represented as a float?
    >
    > If my tests are correct, that value is 9007199254740992.0 = 2**53.
    >
    > Have I got this right? Is there a way to work out the gap between one
    > float and the next?


    There is a way to find the distance between two IEEE floats in "ulps",
    or "units in the last position", computable from the bit pattern using
    integer arithmetic. I think it's then also possible to find the next
    float by adding one.

    I don't have a link at hand, I'm too tired to search at the moment,
    and I'm no expert on floats, but you might find an answer by looking
    for ulps.
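    A minimal sketch of that idea, assuming IEEE-754 doubles and handling
    only positive finite floats:

    import struct

    def next_float(x):
        # Reinterpret the double's bits as a 64-bit integer; for positive
        # finite floats, adding 1 to that integer yields the next
        # representable float, one ulp away.
        bits = struct.unpack('<q', struct.pack('<d', x))[0]
        return struct.unpack('<d', struct.pack('<q', bits + 1))[0]

    py> next_float(2.0**53) - 2.0**53  # the gap (ulp) at 2**53
    2.0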

    > (I haven't tried to exhaustively check every float because, even at one
    > nanosecond per number, it will take over 200 days.)


    Come to think of it, the difference between adjacent floats is exactly
    one ulp. Just use the right unit :)
    Jussi Piitulainen, Sep 21, 2012
    #3
  4. Nobody

    On Fri, 21 Sep 2012 17:29:13 +0000, Steven D'Aprano wrote:

    > The question is, what is the largest integer number N such that every
    > whole number between -N and N inclusive can be represented as a float?
    >
    > If my tests are correct, that value is 9007199254740992.0 = 2**53.
    >
    > Have I got this right? Is there a way to work out the gap between one
    > float and the next?


    CPython's "float" type uses C's "double". For a system where C's "double"
    is IEEE-754 double precision, N=2**53 is the correct answer.

    An IEEE-754 double precision value consists of a 53-bit integer whose
    first bit is a "1", multiplied or divided by a power of two.

    http://en.wikipedia.org/wiki/IEEE_754-1985

    The largest 53-bit integer is 2**53-1. 2**53 can be represented as
    2**52 * 2**1. 2**53+1 cannot be represented in this form. 2**53+2 can be
    represented as (2**52+1) * 2**1.
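    In Python, math.frexp() exposes that decomposition directly:

    py> import math
    py> math.frexp(2.0**53 + 2)  # (2**52 + 1) * 2**1, scaled into [0.5, 1)
    (0.5000000000000001, 54)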

    For values x where 2**52 <= x < 2**53, the interval between
    representable values (aka Unit in the Last Place or ULP) is 1.0.
    For 2**51 <= x < 2**52, the ULP is 0.5.
    For 2**53 <= x < 2**54, the ULP is 2.0.
    And so on.
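    Each interval is easy to confirm interactively:

    py> 2.0**51 + 0.5 == 2.0**51  # ULP 0.5: the sum is exactly representable
    False
    py> 2.0**53 + 1.0 == 2.0**53  # ULP 2.0: the 1 rounds away (tie to even)
    True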
    Nobody, Sep 21, 2012
    #4
  5. Dennis Lee Bieber

    On 21 Sep 2012 17:29:13 GMT, Steven D'Aprano
    <> declaimed the following in
    gmane.comp.python.general:

    >
    > The question is, what is the largest integer number N such that every
    > whole number between -N and N inclusive can be represented as a float?
    >

    Single precision commonly has 7 significant (decimal) digits. Double
    precision runs somewhere between 13 and 15 (decimal) significant digits.

    > If my tests are correct, that value is 9007199254740992.0 = 2**53.
    >


    For an encoding of a double precision using one sign bit and an
    8-bit exponent, you have 53 bits available for the mantissa. This
    ignores the possibility of an implied msb in the mantissa (encodings
    which normalize to put the leading 1-bit at the msb can on some machines
    remove that 1-bit and shift the mantissa one more place; effectively
    giving a 54-bit mantissa). Something like an old XDS Sigma-6 used
    non-binary exponents (the exponent was a power of 16, i.e., 2^4) and used
    a "non-normalized" mantissa (the mantissa could have up to three leading
    0-bits); this affected the decimal significance...


    --
    Wulfraed Dennis Lee Bieber AF6VN
    HTTP://wlfraed.home.netcom.com/
    Dennis Lee Bieber, Sep 21, 2012
    #5
  6. Hans Mulder

    On 21/09/12 22:26:26, Dennis Lee Bieber wrote:
    > On 21 Sep 2012 17:29:13 GMT, Steven D'Aprano
    > <> declaimed the following in
    > gmane.comp.python.general:
    >
    >>
    >> The question is, what is the largest integer number N such that every
    >> whole number between -N and N inclusive can be represented as a float?
    >>

    > Single precision commonly has 7 significant (decimal) digits. Double
    > precision runs somewhere between 13 and 15 (decimal) significant digits
    >
    >> If my tests are correct, that value is 9007199254740992.0 = 2**53.


    The expression 2 / sys.float_info.epsilon produces exactly that
    number. That's probably not a coincidence.
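    Indeed, on a machine with IEEE-754 doubles:

    py> import sys
    py> 2 / sys.float_info.epsilon  # epsilon is 2**-52, so this is 2**53
    9007199254740992.0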

    > For an encoding of a double precision using one sign bit and an
    > 8-bit exponent, you have 53 bits available for the mantissa.


    If your floats have 64 bits, and you use 1 bit for the sign and 8 for
    the exponent, you'll have 55 bits available for the mantissa.

    > This
    > ignores the possibility of an implied msb in the mantissa (encodings
    > which normalize to put the leading 1-bit at the msb can on some machines
    > remove that 1-bit and shift the mantissa one more place; effectively
    > giving a 54-bit mantissa).


    My machine has 64-bit floats, using 1 bit for the sign, 11 for the
    exponent, leaving 52 for the mantissa. The mantissa has an implied
    leading 1, so it's nominally 53 bits.

    You can find this number in sys.float_info.mant_dig
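    For example, on a platform with IEEE-754 doubles:

    py> import sys
    py> sys.float_info.mant_dig
    53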

    > Something like an old XDS Sigma-6 used
    > non-binary exponents (the exponent was a power of 16, i.e., 2^4) and used
    > a "non-normalized" mantissa (the mantissa could have up to three leading
    > 0-bits); this affected the decimal significance...


    Your Sigma-6 must have sys.float_info.radix == 16 then.


    Hope this helps,

    -- HansM
    Hans Mulder, Sep 21, 2012
    #6
  7. Paul Rubin

    Steven D'Aprano <> writes:
    > Have I got this right? Is there a way to work out the gap between one
    > float and the next?


    Yes, 53-bit mantissa as people have mentioned. That tells you what ints
    can be exactly represented. But, arithmetic in some situations can have
    a 1-ulp error. So I wonder if it's possible that if n is large enough,
    you might have something like n+1==n even if the integers n and n+1 have
    distinct floating point representations.
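    A spot check just below the boundary, where n and n+1 both have exact
    representations, suggests the addition itself stays exact there:

    py> n = 2.0**53 - 1
    py> n + 1 == n
    False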
    Paul Rubin, Sep 21, 2012
    #7
  8. Dennis Lee Bieber

    On Fri, 21 Sep 2012 23:04:14 +0200, Hans Mulder <>
    declaimed the following in gmane.comp.python.general:

    > On 21/09/12 22:26:26, Dennis Lee Bieber wrote:


    >
    > > For an encoding of a double precision using one sign bit and an
    > > 8-bit exponent, you have 53 bits available for the mantissa.

    >
    > If your floats have 64 bits, and you use 1 bit for the sign and 8 for
    > the exponent, you'll have 55 bits available for the mantissa.
    >

    Mea Culpa -- doing mental arithmetic too fast

    > > This
    > > ignores the possibility of an implied msb in the mantissa (encodings
    > > which normalize to put the leading 1-bit at the msb can on some machines
    > > remove that 1-bit and shift the mantissa one more place; effectively
    > > giving a 54-bit mantissa).

    >
    > My machine has 64-bits floats, using 1 bit for the sign, 11 for the
    > exponent, leaving 52 for the mantissa. The mantissa has an implied
    > leading 1, so it's nominally 53 bits.
    >
    > You can find this number in sys.float_info.mant_dig
    >
    > > Something like an old XDS Sigma-6 used
    > > non-binary exponents (the exponent was a power of 16, i.e., 2^4) and used
    > > a "non-normalized" mantissa (the mantissa could have up to three leading
    > > 0-bits); this affected the decimal significance...

    >
    > Your Sigma-6 must have sys.float_info.radix == 16 then.
    >
    >
    > Hope this helps,
    >
    > -- HansM

    --
    Wulfraed Dennis Lee Bieber AF6VN
    HTTP://wlfraed.home.netcom.com/
    Dennis Lee Bieber, Sep 22, 2012
    #8
  9. Steven D'Aprano

    On Fri, 21 Sep 2012 15:23:41 -0700, Paul Rubin wrote:

    > Steven D'Aprano <> writes:
    >> Have I got this right? Is there a way to work out the gap between one
    >> float and the next?

    >
    > Yes, 53-bit mantissa as people have mentioned. That tells you what ints
    > can be exactly represented. But, arithmetic in some situations can have
    > a 1-ulp error. So I wonder if it's possible that if n is large enough,
    > you might have something like n+1==n even if the integers n and n+1 have
    > distinct floating point representations.


    I don't think that is possible for IEEE 754 floats, where integer
    arithmetic is exact. But I'm not entirely sure, which is why I asked.

    For non IEEE 754 floating point systems, there is no telling how bad the
    implementation could be :(


    --
    Steven
    Steven D'Aprano, Sep 22, 2012
    #9
  10. Dennis Lee Bieber

    On 22 Sep 2012 01:36:59 GMT, Steven D'Aprano
    <> declaimed the following in
    gmane.comp.python.general:

    >
    > For non IEEE 754 floating point systems, there is no telling how bad the
    > implementation could be :(


    Let's see what can be found...

    http://www.bitsavers.org/pdf/sds/sigma/sigma6/901713B_Sigma_6_Reference_Man_Jun71.pdf
    pages 50-54
    A sign bit, 7-bit offset base-16 exponent/characteristic, 24-bit
    mantissa/fraction (for short floats; long floats have a 56-bit mantissa).

    IBM 360: Same as Sigma-6 (no surprise; hearsay is the Sigma was
    designed by renegade IBM folk; even down to using EBCDIC internally --
    but with a much different interrupt system [224 individual interrupt
    vectors as I recall, vs the IBM's 7 vectors and polling to find what
    device]).



    Motorola Fast Floating Point (software library for the Amiga, and
    apparently also used on early Palm units):
    Sign bit, 7-bit binary exponent, 24-bit mantissa


    VAX http://nssdc.gsfc.nasa.gov/nssdc/formats/VAXFloatingPoint.htm
    (really nasty looking, as bit 6 is most significant, running down to bit
    0, THEN bits 32-16 with 16 the least significant; extend for longer
    formats)
    F-float: sign bit, 8-bit exponent, 23 bits mantissa
    D-float: as above but 55 bits mantissa
    G-float: sign bit, 11-bit exponent, 52 bits mantissa
    H-float: sign bit, 15-bit exponent, 112 bits mantissa

    --
    Wulfraed Dennis Lee Bieber AF6VN
    HTTP://wlfraed.home.netcom.com/
    Dennis Lee Bieber, Sep 22, 2012
    #10
  11. Nobody

    On Fri, 21 Sep 2012 15:23:41 -0700, Paul Rubin wrote:

    > Steven D'Aprano <> writes:
    >> Have I got this right? Is there a way to work out the gap between one
    >> float and the next?

    >
    > Yes, 53-bit mantissa as people have mentioned. That tells you what ints
    > can be exactly represented. But, arithmetic in some situations can have a
    > 1-ulp error. So I wonder if it's possible that if n is large enough, you
    > might have something like n+1==n even if the integers n and n+1 have
    > distinct floating point representations.


    Not for IEEE-754. Or for any sane implementation, for that matter. OTOH,
    you can potentially get n != n due to the use of extended precision for
    intermediate results.

    For IEEE-754, addition, subtraction, multiplication, division, remainder
    and square root are "exact" in the sense that the result is as if the
    arithmetic had been performed with an infinite number of bits then rounded
    afterwards. For round-to-nearest, the result will be the closest
    representable value to the exact value.
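    The familiar example: the exact sum of 0.1 and 0.2 is not representable,
    so the result is the nearest double to the true sum:

    py> 0.1 + 0.2
    0.30000000000000004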

    Transcendental functions suffer from the "table-maker's dilemma", and the
    result will be one of the two closest representable values to the exact
    value, but not necessarily *the* closest.
    Nobody, Sep 22, 2012
    #11
  12. Dave Angel

    On 09/22/2012 05:05 PM, Tim Roberts wrote:
    > Dennis Lee Bieber <> wrote:
    >> On 22 Sep 2012 01:36:59 GMT, Steven D'Aprano wrote:
    >>> For non IEEE 754 floating point systems, there is no telling how bad the
    >>> implementation could be :(

    >> Let's see what can be found...
    >>
    >> IBM 360: Same as Sigma-6 (no surprise; hearsay is the Sigma was
    >> designed by renegade IBM folk; even down to using EBCDIC internally --
    >> but with a much different interrupt system [224 individual interrupt
    >> vectors as I recall, vs the IBM's 7 vectors and polling to find what
    >> device]).

    > The Control Data 6000/Cyber series had sign bit and 11-bit exponent, with
    > either a 48-bit mantissa or a 96-bit mantissa, packed into one or two
    > 60-bit words. Values were not automatically normalized, so there was no
    > assumed 1 bit, as in IEEE-754.


    And it's been a long time (about 39 years), but as I recall the CDC 6400
    (at least) had no integer multiply or divide. You had to convert to
    float first. The other oddity about the CDC series is that it's the last
    machine I've encountered that used ones-complement for ints, with two
    values for zero.


    --

    DaveA
    Dave Angel, Sep 23, 2012
    #12
  13. Hans Mulder

    On 23/09/12 01:06:08, Dave Angel wrote:
    > On 09/22/2012 05:05 PM, Tim Roberts wrote:
    >> Dennis Lee Bieber <> wrote:
    >>> On 22 Sep 2012 01:36:59 GMT, Steven D'Aprano wrote:
    >>>> For non IEEE 754 floating point systems, there is no telling how bad the
    >>>> implementation could be :(
    >>> Let's see what can be found...
    >>>
    >>> IBM 360: Same as Sigma-6 (no surprise; hearsay is the Sigma was
    >>> designed by renegade IBM folk; even down to using EBCDIC internally --
    >>> but with a much different interrupt system [224 individual interrupt
    >>> vectors as I recall, vs the IBM's 7 vectors and polling to find what
    >>> device]).

    >> The Control Data 6000/Cyber series had sign bit and 11-bit exponent, with
    >> either a 48-bit mantissa or a 96-bit mantissa, packed into one or two
    >> 60-bit words. Values were not automatically normalized, so there was no
    >> assumed 1 bit, as in IEEE-754.

    >
    > And it's been a long time (about 39 years), but as I recall the CDC 6400
    > (at least) had no integer multiply or divide. You had to convert to
    > float first.


    You didn't have to convert if your ints would fit in 48 bits.
    If that was the case, then using the float multiply and divide
    instructions would do the trick.

    > The other oddity about the CDC series is it's the last machine I've
    > encountered that used ones-complement for ints, with two values for zero.


    Floats still have 0.0 and -0.0, even in IEEE-754.


    Was Python ever ported to such unusual hardware?


    -- HansM
    Hans Mulder, Sep 23, 2012
    #13
