4 hundred quadrillionth?

Dave Angel

Lawrence said:
In message <[email protected]>, Christian
Heimes wrote:

It used to be worse in the days before IEEE 754 became widespread. Anybody
remember a certain Prof William Kahan from Berkeley, and the foreword he
wrote to the Apple Numerics Manual, 2nd Edition, published in 1988? It's
such a classic piece that I think it should be posted somewhere...
I remember the professor. He was responsible for large parts of the
Intel 8087 specification, which later got mostly codified as IEEE 754.
In those days, the 8087 was a couple hundred extra dollars, so few
machines had one. And the software simulation was horribly slow (on a
4.77 MHz machine). So most compilers would have two math libraries. If
you wanted 8087 equivalence, and the hardware wasn't there, it was dog
slow. On the other hand, if you specified the other math package, it
didn't benefit at all from the presence of the 8087.
 
L

Lawrence D'Oliveiro

Erik Max Francis said:
I only see used versions of it available for purchase. Care to hum a
few bars?

Part I of this book is mainly for people who perform scientific,
statistical, or engineering computations on Apple® computers. The rest is
mainly for producers of software, especially of language processors, that
people will use on Apple computers to perform computations in those fields
and in finance and business too. Moreover, if the first edition was any
indication, people who have nothing to do with Apple computers may well buy
this book just to learn a little about an arcane subject, floating-point
arithmetic on computers, and will wish they had an Apple.

Computer arithmetic has two properties that add to its mystery:

* What you see is often not what you get, and
* What you get is sometimes not what you wanted.

Floating-point arithmetic, the kind computers use for protracted work with
approximate data, is intrinsically approximate because the alternative,
exact arithmetic, could take longer than most people are willing to wait--
perhaps forever. Approximate results are customarily displayed or printed to
show only as many of their leading digits as matter instead of all digits;
what you see need not be exactly what you've got. To complicate matters,
whatever digits you see are /decimal/ digits, the kind you saw first in
school and the kind used in hand-held calculators. Nowadays almost no
computers perform their arithmetic with decimal digits; most of them use
/binary/, which is mathematically better than decimal where they differ, but
different nonetheless. So, unless you have a small integer, what you see is
rarely just what you have.
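
[Aside, not part of Kahan's foreword: the gap he describes between decimal
display and binary storage is easy to probe in modern Python, where
Decimal.from_float prints the exact value behind a literal. A minimal
sketch, assuming IEEE 754 doubles:]

>>> from decimal import Decimal
>>> Decimal.from_float(0.1)   # the exact value of the double nearest 1/10
Decimal('0.1000000000000000055511151231257827021181583404541015625')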

In the mid 1960's, computer architects discovered shortcuts that made
arithmetic run faster at the cost of what they reckoned to be a slight
increase in the level of rounding error; they thought you could not object
to slight alterations in the rightmost digits of numbers since you could not
see those digits anyway. They had the best intentions, but they accomplished
the opposite of what they intended. Computer throughputs were not improved
perceptibly by those shortcuts, but a few programs that had previously been
trusted unreservedly turned treacherous, failing in mysterious ways on
extremely rare occasions.

For instance, a very Important Bunch of Machines launched in 1964 were found
to have two anomalies in their double-precision arithmetic (though not in
single): First, multiplying a number /Z/ by 1.0 would lop off /Z/'s last
digit. Second, the difference between two nearly equal numbers, whose digits
mostly canceled, could be computed wrong by a factor almost as big as 16
instead of being computed exactly as is normal. The anomalies introduced a
kind of noise in the feedback loops by which some programs had compensated
for their own rounding errors, so those programs lost their high accuracies.
These anomalies were not "bugs"; they were "features" designed into the
arithmetic by designers who thought nobody would care. Customers did care;
the arithmetic was redesigned and repairs were retrofitted in 1967.

Not all Capriciously Designed Computer arithmetics have been repaired. One
family of computers has enjoyed notoriety for two decades by allowing
programs to generate tiny "partially underflowed" numbers. When one of these
creatures turns up as the value of /T/ in an otherwise innocuous statement
like

if T = 0.0 then Q := 0.0 else Q := 702345.6 / (T + 0.00189 / T);

it causes the computer to stop execution and emit a message alleging
"Division by Zero". The machine's schizophrenic attitude toward zero comes
about because the test for T = 0.0 is carried out by the adder, which
examines at least 13 of /T/'s leading digits, whereas the divider and
multiplier examine only 12 to recognize zero. Doing so saved less than a
dollar's worth of transistors and maybe a picosecond of time, but at the
cost of some disagreement about whether a very tiny number /T/ is zero or
not. Fortunately, the divider agrees with the multiplier about whether /T/
is zero, so programmers could prevent spurious divisions by zero by slightly
altering the foregoing statement as follows:

if 1.0 * T = 0.0 then Q := 0.0 else Q := 702345.6 / (T + 0.00189 / T);

Unfortunately, the Same Computer designer responsible for "partial
underflow" designed another machine that can generate "partially
underflowed" numbers /T/ for which this statement malfunctions. On that
machine, /Q/ would be computed unexceptionably except that the product 1.0 *
T causes the machine to stop and emit a message alleging "Overflow". How
should a programmer rewrite that innocuous statement so that it will work
correctly on both machines? We should be thankful that such a task is not
encountered every day.

Anomalies related to roundoff are extremely difficult to diagnose. For
instance, the machine on which 1.0 * T can overflow also divides in a
peculiar way that causes quotients like 240.0 / 80.0, which ought to produce
small integers, sometimes to produce nonintegers instead, sometimes slightly
too big, sometimes slightly too small. The same machine multiplies in a
peculiar way, and it subtracts in a peculiar way that can get the difference
wrong by almost a factor of 2 when it ought to be exact because of
cancellation.

Another peculiar kind of subtraction, but different, afflicts the machines
that are schizophrenic about zero. Sets of three values /X/, /Y/ and /Z/
abound for which the statement

if (X = Y) and ((X - Z) > (Y - Z)) then writeln('Strange!');

will print "Strange!" on those machines. And many machines will print
"Strange!" for unlucky values /X/ and /Y/ in the statement

if (X - Y = 0.0) and (X > Y) then writeln('Strange!');

because of underflow.
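
[Aside, not part of Kahan's foreword: on an IEEE 754 system with gradual
underflow, the difference of two distinct numbers never collapses to zero,
so the second statement above cannot print "Strange!". A quick check in
modern Python, assuming IEEE 754 doubles:]

>>> tiny = 5e-324                  # smallest positive subnormal double
>>> x = 1e-320 + tiny
>>> y = 1e-320
>>> x - y == 0.0                   # gradual underflow keeps the difference nonzero
False
>>> (x - y == 0.0) and (x > y)     # the underflow trap described above
False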

/These strange things cannot happen on current Apple computers./

I do not wish to suggest that all but Apple computers have had quirky
arithmetics. A few other computer companies, some Highly Prestigious, have
Demonstrated Exemplary Concern for arithmetic integrity over many years. Had
their concern been shared more widely, numerical computation would now be
easier to understand. Instead, because so many computers in the 1960's and
1970's possessed so many different arithmetic anomalies, computational lore
has become encumbered with a vast body of superstition purporting to cope
with them. One such superstitious rule is "/Never/ ask whether floating-
point numbers are exactly equal".

Presumably the reasonable thing to do instead is to ask whether the numbers
differ by less than some tolerance; and this /is/ truly reasonable provided
you know what tolerance to choose. But the word /never/ is what turns the
rule from reasonable into mere superstition. Even if every floating-point
comparison in your program involved a tolerance, you would wish to predict
which path execution would follow from various input data, and whether the
different comparisons were mutually consistent. For instance, the predicates
X < Y - TOL and Y - TOL > X seem equivalent to the naked eye, but computers
exist (/not/ made by Apple!) on which one can be true and the other false
for certain values of the variables. To ask "Which?" violates the
superstitious rule.
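
[Aside, not part of Kahan's foreword: modern Python spells the reasonable
version of the rule math.isclose, which makes the tolerance an explicit
parameter instead of a superstition. A minimal sketch:]

>>> import math
>>> a = 0.1 + 0.2
>>> a == 0.3                             # exact comparison: False, as expected
False
>>> math.isclose(a, 0.3, rel_tol=1e-9)   # comparison to a stated tolerance
True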

There have been several attempts to avoid superstition by devising
mathematical rules called /axioms/ that would be valid for all commercially
significant computers and from which a programmer might hope to be able to
deduce whether his program will function correctly on all those computers.
Unfortunately, such attempts cannot succeed without failing! The paradox
arises because any such rules, to be valid universally, have to encompass so
wide a range of anomalies as to constitute the specifications for a
hypothetical computer far worse arithmetically than any ever actually built.
In consequence, many computations provably impossible on that hypothetical
computer would be quite feasible on almost every actual computer. For
instance, the axioms must imply limits to the accuracy with which
differential equations can be solved, integrals evaluated, infinite series
summed, and areas of triangles calculated; but these limits are routinely
surpassed nowadays by programs that run on most commercially significant
computers, although some computers may require programs that are so special
that they would be useless on any other machine.

Arithmetic anarchy is where we seemed headed until a decade ago when work
began upon IEEE Standard 754 for binary floating-point arithmetic. Apple's
mathematicians and engineers helped from the very beginning. The resulting
family of coherent designs for computer arithmetic has been adopted more
widely, and by more computer manufacturers, than any other single design.
Besides the undoubted benefits that flow from any standard, the principal
benefit derived from the IEEE standard in particular is this:

/Program importability:/ Almost any application of floating-point
arithmetic, designed to work on a few different families of computers in
existence before the IEEE Standard and programmed in a higher-level
language, will, after recompilation, work at least about as well on an Apple
computer or on any other machine that conforms to IEEE Standard 754 as on
any nonconforming computer with comparable capacity (memory, speed, and word
size).

The Standard Apple Numerics Environment (SANE) is the most thorough
implementation of IEEE Standard 754 to date. The fanatical attention to
detail that permeates SANE's implementation largely relieves Apple computer
users from having to know any more about those details than they like. If
you come to an Apple computer from some other computer that you were fond
of, you will find the Apple computer's arithmetic at least about as good,
and quite likely rather better. An Apple computer can be set up to mimic the
worthwhile characteristics of almost any reasonable past computer
arithmetic, so existing libraries of numerical software do not have to be
discarded if they can be recompiled. SANE also offers features that are
unique to the IEEE Standard, new capabilities that previous generations of
computer users could only yearn for; but to learn what they are, you will
have to read this book.

As one of the designers of IEEE Standard 754, I can only stand in awe of the
efforts that Apple has expended to implement that standard faithfully both
in hardware and in software, including language processors, so that users of
Apple computers will actually reap tangible benefits from the Standard. And
I thank Apple for letting me explain in this foreword why we needed that
standard.

Professor W. Kahan
Mathematics Department and
Electrical Engineering and
Computer Science Department
University of California at Berkeley
December 16, 1987
 
Lawrence D'Oliveiro

Dennis Lee said:
By decreeing that the value of PI is 3?

Interesting kind of mindset, that assumes that the opposite of "real" must
be "integer" or a subset thereof...
 
Steven D'Aprano

Interesting kind of mindset, that assumes that the opposite of "real"
must be "integer" or a subset thereof...


(0) "Opposite" is not well-defined unless you have a dichotomy. In the
case of number fields like the reals, you have more than two options, so
"opposite of real" isn't defined.

(1/3) Why do you jump to the conclusion that "pi=3" implies that only
integers are defined? One might have a mapping where every real number is
transferred to the closest multiple of 1/3 (say), rather than the closest
integer. That would still give "pi=3", without being limited to integers.

(1/2) If you "get rid of real numbers", then obviously you must have a
smaller set of numbers, not a larger. Any superset of reals will include
the reals, and therefore you haven't got rid of them at all, so we can
eliminate supersets of the reals from consideration if your description
of Chaitin's work is accurate.

(2/3) There is *no* point (2/3).

(1) I thought about numbering my points as consecutive increasing
integers, but decided that was an awfully boring convention. A shiny
banananana for the first person to recognise the sequence.
 
Dennis Lee Bieber

Interesting kind of mindset, that assumes that the opposite of "real" must
be "integer" or a subset thereof...

No, but since PI (and e) are both transcendentals, there is NO
representation (except by the symbols themselves) which is NOT an
approximation.

* My HP50g tends to annoy me at times -- I suspect I should change a
configuration option -- as dividing 22 by 3 displays

22
--
3

and requires me to hit the ->num button to get the value I really
wanted.
--
Wulfraed Dennis Lee Bieber KD6MOG
(e-mail address removed) (e-mail address removed)
HTTP://wlfraed.home.netcom.com/
(Bestiaria Support Staff: (e-mail address removed))
HTTP://www.bestiaria.com/
 
Lawrence D'Oliveiro

Dave Angel said:
I remember the professor. He was responsible for large parts of the
Intel 8087 specification, which later got mostly codified as IEEE 754.

The 8087 was poorly designed. It was stack-based, which caused all kinds of
performance problems that never really went away, though I think Intel tried
to patch over them with various SSE extensions. I believe AMD64 does have
proper floating-point registers, at last.

Apple's implementation of IEEE 754 was so rigorous that, when Motorola
introduced the 68881, which implemented a few of the "shortcuts" that Kahan
reviled in his foreword, Apple added a patch to its SANE library to restore
correct results, with the usual controversy over whether the performance
loss was worth it. If you didn't think it was, you could always use the
68881 instructions directly.
 
Lawrence D'Oliveiro

Steven said:
(1/2) If you "get rid of real numbers", then obviously you must have a
smaller set of numbers, not a larger.

Chaitin is trying to use only computable numbers. Pi is computable, as is e,
sqrt(2), the Feigenbaum constant, and many others familiar to us all.

Trouble is, they only make up 0% of the reals. It's the other 100% he wants
to get rid of.
 
Steven D'Aprano

Chaitin is trying to use only computable numbers. Pi is computable, as
is e, sqrt(2), the Feigenbaum constant, and many others familiar to us
all.

Trouble is, they only make up 0% of the reals. It's the other 100% he
wants to get rid of.


+1 QOTD
 
Luis Zarrabeitia

In py3k Eric Smith and Mark Dickinson have implemented Gay's floating
point algorithm for Python so that the shortest repr that will round
trip correctly is what is used as the floating point repr....

Little question: what was the goal of such a change? (is there a pep for me to
read?) Shouldn't str() do that, and leave repr as is?

While I agree that the change gets rid of the weekly newbie question
about "python's lack of precision", I'd find it more difficult to explain
why 0.2 * 3 != 0.6 without showing them what 0.2 /really/ means.
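
[Aside: for reference, the behaviour Luis is describing, as Python 3.1's
new repr displays it:]

>>> 0.2 * 3 == 0.6
False
>>> 0.2 * 3          # the product is the double just above 0.6
0.6000000000000001
>>> 0.6
0.6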
 
Mark Dickinson

Luis Zarrabeitia said:
Little question: what was the goal of such a change? (is there a pep for me to
read?) Shouldn't str() do that, and leave repr as is?

It's a good question. I was prepared to write a PEP if necessary, but
there was essentially no opposition to this change either in the
python-dev thread that Ned already mentioned, in the bugs.python.org
feature request (see http://bugs.python.org/issue1580; set aside
half-an-hour or so if you want to read this one) or amongst the people
we spoke to at PyCon 2009, so in the end Eric and I just went ahead
and merged the changes. It didn't hurt that Guido supported the idea.

I think the main goal was to see fewer complaints from newbie users
about 0.1 displaying as 0.10000000000000001. There's no real reason
to produce 17 digits here. Neither 0.1 nor 0.10000000000000001
displays the true value of the float---both are approximations, so why
not pick the approximation that actually displays nicely? The only
requirement is that float(repr(x)) recovers x exactly, and since the
literal 0.1 produced the float in the first place, it's clear that taking
repr(0.1) to be '0.1' satisfies this requirement.
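
[Aside: the round-trip requirement is easy to spot-check; random sampling
is not a proof, but it shows the property holding in practice on IEEE 754
doubles:]

>>> import random
>>> xs = [random.uniform(-1e16, 1e16) for _ in range(100000)]
>>> all(float(repr(x)) == x for x in xs)
True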

The problem is particularly acute with the use of the round function,
where newbies complain that round is buggy because it's not rounding
to 2 decimal places:

>>> round(2.45, 2)
2.4500000000000002

With the new float repr, the result of rounding a float to 2 decimal
places will always display with at most 2 places after the point.
(Well, possibly except when that float is very large.)
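
[Aside: the round example above, redisplayed under the new repr:]

>>> round(2.45, 2)   # Python 3.1
2.45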

Of course, there are still going to be complaints that the following
is rounding in the wrong direction:

>>> round(0.075, 2)
0.07

I'll admit to feeling a bit uncomfortable about the fact that the new
repr goes a little bit further towards hiding floating-point
difficulties from numerically-naive users.

The main things I like about the new representation are that its
definition is saner (give me the shortest string that rounds
correctly, versus format to 17 places and then somewhat arbitrarily
strip all trailing zeros) and that it's more consistent than the old.
With the current 2.6/3.0 repr (on my machine; your results may vary):

>>> 0.04
0.040000000000000001

With Python 3.1:

>>> 0.04
0.04

A cynical response would be to say that the Python 2.6 repr lies only
some of the time; with Python 3.1 it lies *all* of the time. But
actually all of the above outputs are lies; it's just that the second
set of lies is more consistent and better looking.

There are also a number of significant 'hidden' benefits to using
David Gay's code instead of the system C library's functions, though
those benefits are mostly independent of the choice to use the short
float repr:

- the float repr is much more likely to be consistent across platforms
(or at least across those platforms using IEEE 754 doubles, which
seems to be 99.9 percent of them)

- the C library double<->string conversion functions are buggy on many
platforms (including at least OS X, Windows and some flavours of
Linux). While I won't claim that Gay's code (or our adaptation of
it) is bug-free, I don't know of any bugs (reports welcome!) and at
least when bugs are discovered it's within our power to fix them.
Here's one example: because of a bug in the OS X implementation of
strtod, there are floats x for which x == eval(repr(x)) evaluates to
False.

- similar to the last point: on many platforms string formatting is
not correctly rounded, in the sense that e.g. '%.6f' % x does not
necessarily produce the closest decimal with 6 places after the
decimal point to x. This is *not* a platform bug, since there's no
requirement of correct rounding in the C standards. However, David
Gay's code does provide correctly rounded string -> double and
double -> string conversions, so Python's string formatting will now
always be correctly rounded. A small thing, but it's nice to have.

- since round() and string formatting now both use Gay's code, we
can finally guarantee that round and string formatting give
equivalent results: e.g., that the digits in round(x, 2) are the
same as the digits in '%.2f' % x. That wasn't true before: round
could round up while '%.2f' % x rounded down (or vice versa), leading
to confusion and at least one semi-bogus bug report. (There's a short
illustration after this list.)

- a lot of internal cleanup has become possible as a result of not
having to worry about all the crazy things that platform string <->
double conversions can do. This makes the CPython code smaller,
clearer, easier to maintain, and less likely to contain bugs.
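
[Aside: a short illustration of the correct-rounding and
round/format-consistency points above. 2.675 is our example value; its
nearest double is slightly below 2.675, so both results correctly show
2.67:]

>>> '%.2f' % 2.675    # correctly rounded string formatting
'2.67'
>>> round(2.675, 2)   # round() produces the same digits
2.67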

Luis Zarrabeitia said:
While I agree that the change gets rid of the weekly newbie question
about "python's lack of precision", I'd find it more difficult to explain
why 0.2 * 3 != 0.6 without showing them what 0.2 /really/ means.

There are still plenty of ways to show what 0.2 really means. My
favourite is to use the Decimal.from_float method:

>>> from decimal import Decimal
>>> Decimal.from_float(0.2)
Decimal('0.200000000000000011102230246251565404236316680908203125')

This is only available in 2.7 and 3.1, but then the repr change isn't
happening until 3.1 (and it almost certainly won't be backported to
2.7, by the way), so that's okay. But there's also float.hex,
float.as_integer_ratio, and Fraction.from_float to show the exact
value that's stored for a float:

>>> from fractions import Fraction
>>> Fraction.from_float(0.2)
Fraction(3602879701896397, 18014398509481984)
>>> (0.2).hex()
'0x1.999999999999ap-3'

Hmm. That was a slightly unfortunate choice of example: the hex form
of 0.2 looks uncomfortably similar to 1.9999999.... An interesting
cross-base accident.

This is getting rather long. Perhaps I should put the above comments
together into a 'post-PEP' document.

Mark
 
Luis Zarrabeitia

It's a good question. I was prepared to write a PEP if necessary, but
there was essentially no opposition to this change either in the
python-dev thread that Ned already mentioned, in the bugs.python.org
feature request (see http://bugs.python.org/issue1580; set aside
half-an-hour or so if you want to read this one) or amongst the people
we spoke to at PyCon 2009, so in the end Eric and I just went ahead
and merged the changes. It didn't hurt that Guido supported the idea.

Thank you for the reply and the explanation.
After reading the thread, I was sold on the idea. It still feels weird not
being able to introduce the idea of float vs. real to newbies /that
quickly/, but now I understand the tradeoff.

(Now, I'm going to miss showing that to my students who almost inevitably
complain about the uselessness of the 'floating point representation' chapter
in the numerical analysis course :D).
Mark Dickinson said:
There are still plenty of ways to show what 0.2 really means. My
favourite is to use the Decimal.from_float method:

>>> Decimal.from_float(0.2)
Decimal('0.200000000000000011102230246251565404236316680908203125')

Oh, thank you. That was the next thing I was going to ask.
 
Aahz

This is getting rather long. Perhaps I should put the above comments
together into a 'post-PEP' document.

Yes, you should. A better explanation of floating point benefits everyone
when it's widely available. I even learned a little bit here, and I've been
following this stuff for a while (though I'm by no means any kind of
numerical expert).
--
Aahz ([email protected]) <*> http://www.pythoncraft.com/

"In many ways, it's a dull language, borrowing solid old concepts from
many other languages & styles: boring syntax, unsurprising semantics,
few automatic coercions, etc etc. But that's one of the things I like
about it." --Tim Peters on Python, 16 Sep 1993
 
