# Exact integer-valued floats

Discussion in 'Python' started by Steven D'Aprano, Sep 21, 2012.

1. ### Steven D'Aprano

Python floats can represent exact integer values (e.g. 42.0), but above a
certain value (see below), not all integers can be represented. For
example:

py> 1e16 == 1e16 + 1 # no such float as 10000000000000001.0
True
py> 1e16 + 3 == 1e16 + 4 # or 10000000000000003.0
True

So some integers are missing from the floats. For large enough values,
the gap between floats is rather large, and many numbers are missing:

py> 1e200 + 1e10 == 1e200
True

The same applies for large enough negative values.

The question is, what is the largest integer number N such that every
whole number between -N and N inclusive can be represented as a float?

If my tests are correct, that value is 9007199254740992.0 = 2**53.

Have I got this right? Is there a way to work out the gap between one
float and the next?

(I haven't tried to exhaustively check every float because, even at one
nanosecond per number, it will take over 200 days.)
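
A spot-check of the boundary (a sketch in Python, assuming IEEE-754
doubles) avoids the exhaustive scan: every integer up to 2**53 should
survive a round-trip through float, and 2**53 + 1 should be the first
that does not.

```python
# Spot-check the 2**53 boundary instead of scanning every float.
N = 2**53

# N - 1 and N itself round-trip exactly through float.
assert int(float(N - 1)) == N - 1
assert int(float(N)) == N

# N + 1 is the first positive integer with no float of its own:
# it rounds to the nearest representable value, which is N.
assert float(N + 1) == float(N)
assert int(float(N + 1)) == N
```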

--
Steven

Steven D'Aprano, Sep 21, 2012

2. ### Ian Kelly

On Fri, Sep 21, 2012 at 11:29 AM, Steven D'Aprano
<> wrote:
> The question is, what is the largest integer number N such that every
> whole number between -N and N inclusive can be represented as a float?
>
> If my tests are correct, that value is 9007199254740992.0 = 2**53.
>
> Have I got this right? Is there a way to work out the gap between one
> float and the next?

That looks mathematically correct. The "gap" between floats is the
equivalent of a difference of 1 bit in the significand. For a
floating point number represented as (sign * c * 2 ** q), where c is
an integer, the gap between floats is equal to 2 ** q. There are 53
bits of precision in a double-precision float (technically an implicit
1 followed by 52 bits), so q becomes greater than 0 at 2 ** 53.
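
The gap 2 ** q can be read off directly with math.frexp (a sketch,
assuming IEEE-754 doubles with a 53-bit significand):

```python
import math

def gap(x):
    # Decompose |x| = m * 2**e with 0.5 <= m < 1 (what math.frexp
    # returns), so the spacing of floats just above |x| is 2**(e - 53).
    m, e = math.frexp(abs(x))
    return 2.0 ** (e - 53)

print(gap(1.0))      # 2**-52, about 2.22e-16
print(gap(2.0**52))  # 1.0
print(gap(2.0**53))  # 2.0
```

One caveat: at an exact power of two the gap on the downward side is
half this value; the sketch reports the spacing on the upper side.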

Cheers,
Ian

Ian Kelly, Sep 21, 2012

3. ### Jussi Piitulainen

Steven D'Aprano writes:

> Python floats can represent exact integer values (e.g. 42.0), but above a
> certain value (see below), not all integers can be represented. For
> example:
>
> py> 1e16 == 1e16 + 1 # no such float as 10000000000000001.0
> True
> py> 1e16 + 3 == 1e16 + 4 # or 10000000000000003.0
> True
>
> So some integers are missing from the floats. For large enough values,
> the gap between floats is rather large, and many numbers are missing:
>
> py> 1e200 + 1e10 == 1e200
> True
>
> The same applies for large enough negative values.
>
> The question is, what is the largest integer number N such that every
> whole number between -N and N inclusive can be represented as a float?
>
> If my tests are correct, that value is 9007199254740992.0 = 2**53.
>
> Have I got this right? Is there a way to work out the gap between one
> float and the next?

There is a way to find the distance between two IEEE floats in "ulps",
or "units in the last place", computable from the bit pattern using
integer arithmetic. I think it's then also possible to find the next
float up or down from a given one.

I don't have a link at hand, I'm too tired to search at the moment,
and I'm no expert on floats, but you might find an answer by looking
for ulps.

> (I haven't tried to exhaustively check every float because, even at one
> nanosecond per number, it will take over 200 days.)

Come to think of it, the difference between adjacent floats is exactly
one ulp. Just use the right unit.
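
The bit-pattern trick can be sketched with the struct module:
reinterpreting a double's bits as an integer maps adjacent floats to
adjacent integers (for non-negative values; negatives need a sign
adjustment not shown here).

```python
import struct

def float_to_ordinal(x):
    # Reinterpret the IEEE-754 double's 64 bits as a signed integer.
    # For x >= 0, adjacent floats get adjacent ordinals.
    (n,) = struct.unpack('<q', struct.pack('<d', x))
    return n

def ulps_between(a, b):
    return abs(float_to_ordinal(a) - float_to_ordinal(b))

print(ulps_between(1.0, 1.0 + 2**-52))       # 1: adjacent floats
print(ulps_between(2.0**53, 2.0**53 + 2.0))  # 1: still adjacent
```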

Jussi Piitulainen, Sep 21, 2012
4. ### Nobody

On Fri, 21 Sep 2012 17:29:13 +0000, Steven D'Aprano wrote:

> The question is, what is the largest integer number N such that every
> whole number between -N and N inclusive can be represented as a float?
>
> If my tests are correct, that value is 9007199254740992.0 = 2**53.
>
> Have I got this right? Is there a way to work out the gap between one
> float and the next?

CPython's "float" type uses C's "double". For a system where C's "double"
is IEEE-754 double precision, N=2**53 is the correct answer.

An IEEE-754 double precision value consists of a 53-bit integer whose
first bit is a "1", multiplied or divided by a power of two.

http://en.wikipedia.org/wiki/IEEE_754-1985

The largest 53-bit integer is 2**53-1. 2**53 can be represented as
2**52 * 2**1. 2**53+1 cannot be represented in this form. 2**53+2 can be
represented as (2**52+1) * 2**1.

For values x where 2**52 <= x < 2**53, the interval between
representable values (aka Unit in the Last Place or ULP) is 1.0.
For 2**51 <= x < 2**52, the ULP is 0.5.
For 2**53 <= x < 2**54, the ULP is 2.0.
And so on.
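
On Python 3.9 and later the same table can be checked with
math.nextafter (a sketch; earlier Pythons lack that function):

```python
import math  # math.nextafter requires Python 3.9+

# The ULP doubles with each power-of-two range, crossing 1.0 at 2**52.
for k in (51, 52, 53):
    x = 2.0 ** k
    ulp = math.nextafter(x, math.inf) - x
    print(k, ulp)   # 51 -> 0.5, 52 -> 1.0, 53 -> 2.0
```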

Nobody, Sep 21, 2012
5. ### Dennis Lee Bieber

On 21 Sep 2012 17:29:13 GMT, Steven D'Aprano
<> declaimed the following in
gmane.comp.python.general:

>
> The question is, what is the largest integer number N such that every
> whole number between -N and N inclusive can be represented as a float?
>

Single precision commonly has 7 significant (decimal) digits. Double
precision runs somewhere between 13 and 15 (decimal) significant digits.

> If my tests are correct, that value is 9007199254740992.0 = 2**53.
>

For an encoding of a double precision using one sign bit and an
8-bit exponent, you have 53 bits available for the mantissa. This
ignores the possibility of an implied msb in the mantissa (encodings
which normalize to put the leading 1-bit at the msb can on some machines
remove that 1-bit and shift the mantissa one more place; effectively
giving a 54-bit mantissa). Something like an old XDS Sigma-6 used
non-binary exponents (the exponent was a power of 16, i.e. 2^4) and a
"non-normalized" mantissa -- the mantissa could have up to three leading
0-bits; this affected the decimal significance...

--
Wulfraed Dennis Lee Bieber AF6VN
HTTP://wlfraed.home.netcom.com/

Dennis Lee Bieber, Sep 21, 2012
6. ### Hans Mulder

On 21/09/12 22:26:26, Dennis Lee Bieber wrote:
> On 21 Sep 2012 17:29:13 GMT, Steven D'Aprano
> <> declaimed the following in
> gmane.comp.python.general:
>
>>
>> The question is, what is the largest integer number N such that every
>> whole number between -N and N inclusive can be represented as a float?
>>

> Single precision commonly has 7 significant (decimal) digits. Double
> precision runs somewhere between 13 and 15 (decimal) significant digits
>
>> If my tests are correct, that value is 9007199254740992.0 = 2**53.

The expression 2 / sys.float_info.epsilon produces exactly that
number. That's probably not a coincidence.
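
It isn't: sys.float_info.epsilon is the gap from 1.0 to the next float,
which is 2**-52 for IEEE-754 doubles, so 2 divided by it is exactly
2**53 (a one-liner to confirm):

```python
import sys

# epsilon is 2**-52 for IEEE-754 doubles, so 2 / epsilon == 2**53.
assert sys.float_info.epsilon == 2.0 ** -52
print(2 / sys.float_info.epsilon)   # 9007199254740992.0 == 2.0**53
```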

> For an encoding of a double precision using one sign bit and an
> 8-bit exponent, you have 53 bits available for the mantissa.

If your floats have 64 bits, and you use 1 bit for the sign and 8 for
the exponent, you'll have 55 bits available for the mantissa.

> This
> ignores the possibility of an implied msb in the mantissa (encodings
> which normalize to put the leading 1-bit at the msb can on some machines
> remove that 1-bit and shift the mantissa one more place; effectively
> giving a 54-bit mantissa).

My machine has 64-bits floats, using 1 bit for the sign, 11 for the
exponent, leaving 52 for the mantissa. The mantissa has an implied
leading 1, so it's nominally 53 bits.

You can find this number in sys.float_info.mant_dig

> Something like an old XDS Sigma-6 used
> non-binary exponents (the exponent was a power of 16, i.e. 2^4) and a
> "non-normalized" mantissa -- the mantissa could have up to three leading
> 0-bits; this affected the decimal significance...

Hope this helps,

-- HansM

Hans Mulder, Sep 21, 2012
7. ### Paul Rubin

Steven D'Aprano <> writes:
> Have I got this right? Is there a way to work out the gap between one
> float and the next?

Yes, 53-bit mantissa as people have mentioned. That tells you what ints
can be exactly represented. But, arithmetic in some situations can have
a 1-ulp error. So I wonder if it's possible that if n is large enough,
you might have something like n+1==n even if the integers n and n+1 have
distinct floating point representations.
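
The two cases are easy to probe at the boundary (a sketch, assuming
IEEE-754 doubles with correctly rounded addition):

```python
# Just below 2**53, n and n + 1 are both representable, and the
# correctly rounded addition keeps them distinct.
n = float(2**53 - 2)
assert n + 1 != n

# At 2**53 the integer n + 1 has no float of its own, so n + 1 == n
# (the sum rounds back to n under round-half-to-even).
m = float(2**53)
assert m + 1 == m
```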

Paul Rubin, Sep 21, 2012
8. ### Dennis Lee Bieber

On Fri, 21 Sep 2012 23:04:14 +0200, Hans Mulder <>
declaimed the following in gmane.comp.python.general:

> On 21/09/12 22:26:26, Dennis Lee Bieber wrote:

>
> > For an encoding of a double precision using one sign bit and an
> > 8-bit exponent, you have 53 bits available for the mantissa.

>
> If your floats have 64 bits, and you use 1 bit for the sign and 8 for
> the exponent, you'll have 55 bits available for the mantissa.
>

Mea Culpa -- doing mental arithmetic too fast

> > This
> > ignores the possibility of an implied msb in the mantissa (encodings
> > which normalize to put the leading 1-bit at the msb can on some machines
> > remove that 1-bit and shift the mantissa one more place; effectively
> > giving a 54-bit mantissa).

>
> My machine has 64-bits floats, using 1 bit for the sign, 11 for the
> exponent, leaving 52 for the mantissa. The mantissa has an implied
> leading 1, so it's nominally 53 bits.
>
> You can find this number in sys.float_info.mant_dig
>
> > Something like an old XDS Sigma-6 used
> > non-binary exponents (the exponent was a power of 16, i.e. 2^4) and a
> > "non-normalized" mantissa -- the mantissa could have up to three leading
> > 0-bits; this affected the decimal significance...

>
>
>
> Hope this helps,
>
> -- HansM

--
Wulfraed Dennis Lee Bieber AF6VN
HTTP://wlfraed.home.netcom.com/

Dennis Lee Bieber, Sep 22, 2012
9. ### Steven D'Aprano

On Fri, 21 Sep 2012 15:23:41 -0700, Paul Rubin wrote:

> Steven D'Aprano <> writes:
>> Have I got this right? Is there a way to work out the gap between one
>> float and the next?

>
> Yes, 53-bit mantissa as people have mentioned. That tells you what ints
> can be exactly represented. But, arithmetic in some situations can have
> a 1-ulp error. So I wonder if it's possible that if n is large enough,
> you might have something like n+1==n even if the integers n and n+1 have
> distinct floating point representations.

I don't think that is possible for IEEE 754 floats, where integer
arithmetic is exact. But I'm not entirely sure, which is why I asked.

For non IEEE 754 floating point systems, there is no telling how bad the
implementation could be

--
Steven

Steven D'Aprano, Sep 22, 2012
10. ### Dennis Lee Bieber

On 22 Sep 2012 01:36:59 GMT, Steven D'Aprano
<> declaimed the following in
gmane.comp.python.general:

>
> For non IEEE 754 floating point systems, there is no telling how bad the
> implementation could be

Let's see what can be found...

http://www.bitsavers.org/pdf/sds/sigma/sigma6/901713B_Sigma_6_Reference_Man_Jun71.pdf
pages 50-54
A sign bit, 7-bit offset base 16 exponent/characteristic, 24-bit
mantissa/fraction (for short; long float is 56 bit mantissa).

IBM 360: Same as Sigma-6 (no surprise; hearsay is the Sigma was
designed by renegade IBM folk; even down to using EBCDIC internally --
but with a much different interrupt system [224 individual interrupt
vectors as I recall, vs the IBM's 7 vectors and polling to find what
device]).

Motorola Fast Floating Point (software library for the Amiga, and
apparently also used on early Palm units)
Sign bit, 7-bit binary exponent, 24-bit mantissa

VAX http://nssdc.gsfc.nasa.gov/nssdc/formats/VAXFloatingPoint.htm
(really nasty looking, as bit 6 is the most significant, running down to bit
0, THEN bits 32-16 with 16 the least significant; extend for longer
formats)
F-float: sign bit, 8-bit exponent, 23 bits mantissa
D-float: as above but 55 bits mantissa
G-float: sign bit, 11-bit exponent, 52 bits mantissa
H-float: sign bit, 15-bit exponent, 112 bits mantissa

--
Wulfraed Dennis Lee Bieber AF6VN
HTTP://wlfraed.home.netcom.com/

Dennis Lee Bieber, Sep 22, 2012
11. ### Nobody

On Fri, 21 Sep 2012 15:23:41 -0700, Paul Rubin wrote:

> Steven D'Aprano <> writes:
>> Have I got this right? Is there a way to work out the gap between one
>> float and the next?

>
> Yes, 53-bit mantissa as people have mentioned. That tells you what ints
> can be exactly represented. But, arithmetic in some situations can have a
> 1-ulp error. So I wonder if it's possible that if n is large enough, you
> might have something like n+1==n even if the integers n and n+1 have
> distinct floating point representations.

Not for IEEE-754. Or for any sane implementation, for that matter. OTOH,
you can potentially get n != n due to the use of extended precision for
intermediate results.

For IEEE-754, addition, subtraction, multiplication, division, remainder
and square root are "exact" in the sense that the result is as if the
arithmetic had been performed with an infinite number of bits then rounded
afterwards. For round-to-nearest, the result will be the closest
representable value to the exact value.

Transcendental functions suffer from the "table-maker's dilemma", and the
result will be one of the two closest representable values to the exact
value, but not necessarily *the* closest.

Nobody, Sep 22, 2012
12. ### Dave Angel

On 09/22/2012 05:05 PM, Tim Roberts wrote:
> Dennis Lee Bieber <> wrote:
>> On 22 Sep 2012 01:36:59 GMT, Steven D'Aprano wrote:
>>> For non IEEE 754 floating point systems, there is no telling how bad the
>>> implementation could be

>> Let's see what can be found...
>>
>> IBM 360: Same as Sigma-6 (no surprise; hearsay is the Sigma was
>> designed by renegade IBM folk; even down to using EBCDIC internally --
>> but with a much different interrupt system [224 individual interrupt
>> vectors as I recall, vs the IBM's 7 vectors and polling to find what
>> device]).

> The Control Data 6000/Cyber series had sign bit and 11-bit exponent, with
> either a 48-bit mantissa or a 96-bit mantissa, packed into one or two
> 60-bit words. Values were not automatically normalized, so there was no
> assumed 1 bit, as in IEEE-754.

And it's been a long time (about 39 years), but as I recall the CDC 6400
(at least) had no integer multiply or divide. You had to convert to
float first. The other oddity about the CDC series is that it's the last
machine I've encountered that used ones-complement for ints, with two
values for zero.

--

DaveA

Dave Angel, Sep 23, 2012
13. ### Hans Mulder

On 23/09/12 01:06:08, Dave Angel wrote:
> On 09/22/2012 05:05 PM, Tim Roberts wrote:
>> Dennis Lee Bieber <> wrote:
>>> On 22 Sep 2012 01:36:59 GMT, Steven D'Aprano wrote:
>>>> For non IEEE 754 floating point systems, there is no telling how bad the
>>>> implementation could be
>>> Let's see what can be found...
>>>
>>> IBM 360: Same as Sigma-6 (no surprise; hearsay is the Sigma was
>>> designed by renegade IBM folk; even down to using EBCDIC internally --
>>> but with a much different interrupt system [224 individual interrupt
>>> vectors as I recall, vs the IBM's 7 vectors and polling to find what
>>> device]).

>> The Control Data 6000/Cyber series had sign bit and 11-bit exponent, with
>> either a 48-bit mantissa or a 96-bit mantissa, packed into one or two
>> 60-bit words. Values were not automatically normalized, so there was no
>> assumed 1 bit, as in IEEE-754.

>
> And it's been a long time (about 39 years), but as I recall the CDC 6400
> (at least) had no integer multiply or divide. You had to convert to
> float first.

You didn't have to convert if your ints would fit in 48 bits.
If that was the case, then using the float multiply and divide
instructions would do the trick.

> The other oddity about the CDC series is it's the last machine I've
> encountered that used ones-complement for ints, with two values for zero.

Floats still have 0.0 and -0.0, even in IEEE-754.

Was Python ever ported to such unusual hardware?

-- HansM

Hans Mulder, Sep 23, 2012