Inconsistent float behaviour on "inf" - current state?

  • Thread starter =?iso-8859-15?Q?Peter_Kn=F6rrich?=
  • Start date
?

=?iso-8859-15?Q?Peter_Kn=F6rrich?=

Hello,

I've found another inconsistency, and looking through the
list archives I can find mentions of funky stuff like

print float('inf')

giving

Infanity

with Python 1.5.2 on Solaris 7 in 2001, and a couple closed bug
reports on why "float('inf')" isn't supported everywhere, and I
understand the underlying problem:

libc simply isn't that good - never mind standard - with regard to floats.

However, seeing how Python does a great job with long ints
by using something better than standard libc, I fail to see how
reliance on libc should be a reason not to provide at least consistency:

Why does conversion from a long constant not use the same
library function that conversion from a long string would use?

a = float(1234....298299300) -> OverflowError: long int too large
to convert to float
a = float('1234....298299300') -> inf, which led to silly errors in
SQL (table has no column 'inf')

a = inf -> NameError: name 'inf' is not defined
a = float('inf') -> inf

Now, I could understand how that may need some changes on lexing
the source code, so I'm not actually opening more bugs having
found workarounds for my own troubles here (quality guys insisting
on entering silly numbers, hah, try that with a 50 character input
field limit!).

Still, I'm interested to hear what the current state of affairs is
on using possibly inconsistent libc for floats?

Bye,

Peter Knörrich
 
S

Scott David Daniels

Peter said:
.... why "float('inf')" isn't supported everywhere, and I
understand the underlying problem:
libc simply isn't that good - never mind standard - with regard to floats.
I think you believe there is an underlying model of floating point
numbers that all machines share. That is not true. Many systems
(even those that support IEEE 754) make distinct choices about what
particular bit patterns in memory represent what particular values.

You can take a pickle of a data structure in CPython on one machine,
carry it to a different machine with a different CPU & FPU and a
different OS, and reasonably expect that the unpickled data structure
running on the CPython on the other machine will produce essentially the
same value. That is _amazing_. You might not have the same number
of bits of precision, you might not have the same range of exponents,
and the base of the exponent might even be different (well, ... perhaps
nobody is running Python on 360s -- I may exaggerate here). This is not
accomplished by lots of clever coding of distinct implementations across
the range of machines CPython runs on. It is accomplished by staying
close to the middle of the road in the C89 standard. And it breaks down
for infinities and the various denormals and NaNs that you can at least
detect in C99.

C89, the standard we go by for CPython, provides pretty good support
for floating point numbers in "reasonable" ranges with "reasonable"
amounts of precision. That is to say, you can calculate with C doubles
and get pretty similar results for the same C program running on quite
different compilers, operating systems, and hardware _if_ you "walk the
middle of the road" (staying away from rough edges, boundary values,
things like calculating sin(1e300) [trigs work better w/in -2pi to 2pi]
and the like. C89 does not mention the Not-A-Number group at all, and
only says that infinity is something that an "implementation may
support." That "may" means that a C89 compiler is free to do nothing
for Infinities, and even if it does something, it it doesn't have to
behave like some other C89 compiler's version of what to do with
Infinities. So it is no wonder that values like plus and minus
infinity "behave funny" across different CPython systems.

In the eighties (when C89 was being worked on), it was not clear whether
IEEE 754 would be _the_ standard or _a_ standard for floating point. It
would have been pointless to force all C implementers (and they included
Lifeboat Systems who did C compilers for Z80s as well as Cray Research,
who did C compilers for slightly larger machines) to specify how to make
positive and negative zeroes, how and when to produce "denormalized"
numbers, how to produce positive and negative infinity (and whether they
are distinct from each other), how create the various signaling and
quiet NaNs when may systems did not have IEEE 754 floating point
hardware.

C is kind of a "glorified assembler," and (as such) it reveals the
hardware it works on. That is why the C expression ((-7) % 4) can be
1 or -1; different hardware solves that problem differently, and C
doesn't want to slow down all divisions on some hardware just to get
a consistent result. If you stick to positive numbers for division,
both kinds of hardware get the same result, and the programmer can
put the test in if he needs it. It would not be in the spirit of C
to dictate a floating point behavior that would require a lot of
code generated on each operation, and without common hardware, the
infinity / denormal / NaN behavior would mean code at every floating
point operation. The C standards body was not interested in that work.

--Scott David Daniels
(e-mail address removed)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,755
Messages
2,569,537
Members
45,022
Latest member
MaybelleMa

Latest Threads

Top