pickle broken: can't handle NaN or Infinity under win32

Ivan Van Laningham

Tim Peters

[Tim Peters]
....
Across platforms with a 754-conforming libm, the most portable way [to
distinguish +0.0 from -0.0 in standard C] is via using atan2(!):
>>> pz = 0.0
>>> mz = -pz
>>> from math import atan2
>>> atan2(pz, pz)
0.0
>>> atan2(mz, mz)
-3.1415926535897931
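
For anyone who wants to try this at home, here is a minimal sketch of two ways to tell the zeroes apart in Python. It assumes a Python recent enough to have math.copysign; the helper names are made up for illustration and do not come from the thread.

```python
import math

def is_negative_zero_copysign(x):
    # copysign moves the sign bit of x onto 1.0, so -0.0 yields -1.0.
    return x == 0.0 and math.copysign(1.0, x) < 0.0

def is_negative_zero_atan2(x):
    # The atan2 trick: atan2(+0.0, +0.0) is 0.0, atan2(-0.0, -0.0) is -pi.
    return x == 0.0 and math.atan2(x, x) < 0.0

pz = 0.0
mz = -pz

print(pz == mz)  # the two zeroes compare equal
print(is_negative_zero_copysign(pz), is_negative_zero_copysign(mz))
print(is_negative_zero_atan2(pz), is_negative_zero_atan2(mz))
```

Both helpers agree; copysign is the more direct route where it is available, while atan2 is the one that works from standard C's math library alone.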

[Ivan Van Laningham]
Never fails. Tim, you gave me the best laugh of the day.

Well, I try, Ivan. But lest the point be missed <wink>, 754 doesn't
_want_ +0 and -0 to act differently in "almost any" way. The only
good rationale I've seen for why it makes the distinction at all is in
Kahan's paper "Branch Cuts for Complex
Elementary Functions, or Much Ado About Nothing's Sign Bit". There
are examples in that where, when working with complex numbers, you can
easily stumble into getting real-world dead-wrong results if there's
only one flavor of 0. And, of course, atan2 exists primarily to help
convert complex numbers from rectangular to polar form.

Odd bit o' trivia: following "the rules" for signed zeroes in 754
makes exponentiation c**n ambiguous, where c is a complex number with
c.real == c.imag == 0.0 (but the zeroes may be signed), and n is a
positive integer. The signs on the zeroes coming out can depend on
the exact order in which multiplications are performed, because the
underlying multiplication isn't associative despite that it's exact.
I stumbled into this in the 80's when KSR's Fortran compiler failed a
federal conformance test, precisely because the test did atan2 on the
components of an all-zero complex raised to an integer power, and I
had written one of the few 754-conforming libms at the time. They
wanted 0, while my atan2 dutifully returned -pi. I haven't had much
personal love for 754 esoterica since then ...
 
Steven D'Aprano

> Well, I try, Ivan. But lest the point be missed <wink>, 754 doesn't
> _want_ +0 and -0 to act differently in "almost any" way. The only
> good rationale I've seen for why it makes the distinction at all is in
> Kahan's paper "Branch Cuts for Complex
> Elementary Functions, or Much Ado About Nothing's Sign Bit". There
> are examples in that where, when working with complex numbers, you can
> easily stumble into getting real-world dead-wrong results if there's
> only one flavor of 0. And, of course, atan2 exists primarily to help
> convert complex numbers from rectangular to polar form.

It isn't necessary to look at complex numbers to see the difference
between positive and negative zero. Just look at a graph of y=1/x. In
particular, look at the behaviour of the graph around x=0. Now tell me
that the sign of zero doesn't make a difference.

Signed zeroes also preserve 1/(1/x) == x for all x, admittedly at the cost
of y==x iff 1/y == 1/x (which fails for y=-0 and x=+0). Technically, -0
and +0 are not the same (for some definition of "technically"); but
practicality beats purity and it is more useful to have -0==+0 than the
alternative.
> Odd bit o' trivia: following "the rules" for signed zeroes in 754
> makes exponentiation c**n ambiguous, where c is a complex number with
> c.real == c.imag == 0.0 (but the zeroes may be signed), and n is a
> positive integer. The signs on the zeroes coming out can depend on
> the exact order in which multiplications are performed, because the
> underlying multiplication isn't associative despite that it's exact.

That's an implementation failure. Mathematically, the sign of 0**n should
depend only on whether n is odd or even. If c**n is ambiguous, then that's
a bug in the implementation, not the standard.
> I stumbled into this in the 80's when KSR's Fortran compiler failed a
> federal conformance test, precisely because the test did atan2 on the
> components of an all-zero complex raised to an integer power, and I
> had written one of the few 754-conforming libms at the time. They
> wanted 0, while my atan2 dutifully returned -pi. I haven't had much
> personal love for 754 esoterica since then ...

Sounds to me that the Feds wanted something broken and you gave them
something that was working. No wonder they failed you :)
 
Tim Peters

[Tim Peters']
[Steven D'Aprano]
> It isn't necessary to look at complex numbers to see the difference
> between positive and negative zero. Just look at a graph of y=1/x. In
> particular, look at the behaviour of the graph around x=0. Now tell me
> that the sign of zero doesn't make a difference.

OK, I looked, and it made no difference to me. Really. If I had an
infinitely tall monitor, maybe I could see a difference, but I don't
-- the sign of 0 on the nose makes no difference to the behavior of
1/x for any x other than 0. On my finite monitor, I see it looks like
the line x=0 is an asymptote, and the graph approaches minus infinity
on that line from the left and positive infinity from the right; the
value of 1/0 doesn't matter to that.
> Signed zeroes also preserve 1/(1/x) == x for all x,

No, signed zeros "preverse" that identity for exactly the set {+Inf,
-Inf}, and that's all. That's worth something, but 1/(1/x) == x isn't
generally true in 754 anyway. Most obviously, when x is subnormal,
1/x overflows to an infinity (the 754 exponent range isn't symmetric
around 0 -- subnormals make it "heavy" on the negative side), and then
1/(1/x) is a zero, not x. 1/(1/x) == x doesn't hold for a great many
normal x either (pick a pile at random and check -- you'll find
counterexamples quickly).
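
A rough sketch of that check in Python, for the curious. The function name, sample size and seed are arbitrary choices made for illustration. Note that Python raises ZeroDivisionError for 1/0.0 rather than returning an infinity, so the signed-zero case itself can't be shown with plain float division.

```python
import random

# Smallest positive subnormal double: its reciprocal overflows to inf.
tiny = 5e-324
recip = 1.0 / tiny   # inf
back = 1.0 / recip   # 0.0, not tiny

def roundtrip_failures(n, seed=0):
    # Count sampled x in [1, 2] for which 1/(1/x) != x.
    rng = random.Random(seed)
    xs = [rng.uniform(1.0, 2.0) for _ in range(n)]
    return sum(1 for x in xs if 1.0 / (1.0 / x) != x)

print(recip, back)
print(roundtrip_failures(1000), "of 1000 normal samples fail")
```

The failures among normal x are unavoidable: above sqrt(2) the doubles are spaced more coarsely than their reciprocals can distinguish, so several x values round to the same 1/x.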
> admittedly at the cost of y==x iff 1/y == 1/x (which fails for y=-0 and x=+0).
>
> Technically, -0 and +0 are not the same (for some definition of "technically"); but
> practicality beats purity and it is more useful to have -0==+0 than the alternative.

Can just repeat that the only good rationale I've seen is in Kahan's
paper (previously referenced).
> That's an implementation failure. Mathematically, the sign of 0**n should
> depend only on whether n is odd or even. If c**n is ambiguous, then that's
> a bug in the implementation, not the standard.

As I said, these are complex zeroes, not real zeroes. The 754
standard doesn't say anything about complex numbers. In rectangular
form, a complex zero contains two real zeroes. There are 4
possibilities for a complex zero if the components are 754
floats/doubles:

+0+0i
+0-0i
-0+0i
-0-0i

Implement Cartesian complex multiplication in the obvious way:

(a+bi)(c+di) = (ac-bd) + (ad+bc)i

Now use that to raise the four complex zeroes above to various integer
powers, trying different ways of grouping the multiplications. For
example, x**4 can be computed as

((xx)x)x

or

(xx)(xx)

or

x((xx)x)

etc. You'll discover that, in some cases, for fixed x and n, the
signs of the zeroes in the result depend on how the multiplications were
grouped. The 754 standard says nothing about any of this, _except_
for the results of multiplying and adding 754 zeroes. Multiplication
of signed zeroes in 754 is associative. The problem is that the
extension to Cartesian complex multiplication isn't associative under
these rules in some all-zero cases, mostly because the sum of two
signed zeroes is (under 3 of the rounding modes) +0 unless both
addends are -0. Try examples and you'll discover this for yourself.
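
One way to try those examples is the sketch below. It represents a complex zero as a plain (real, imag) tuple of floats so that the multiplication is exactly the textbook formula above and nothing else; the function names are invented for this illustration.

```python
import math
from itertools import product

def cmul(x, y):
    # Textbook Cartesian product: (a+bi)(c+di) = (ac-bd) + (ad+bc)i
    a, b = x
    c, d = y
    return (a * c - b * d, a * d + b * c)

def signs(z):
    # +1.0 or -1.0 for each component, read off the sign bit.
    return (math.copysign(1.0, z[0]), math.copysign(1.0, z[1]))

def left_to_right(x):
    # x**4 grouped as ((xx)x)x
    return cmul(cmul(cmul(x, x), x), x)

def square_of_square(x):
    # x**4 grouped as (xx)(xx)
    xx = cmul(x, x)
    return cmul(xx, xx)

# Walk through all four signed complex zeroes.
for x in product((0.0, -0.0), repeat=2):
    print(signs(x), signs(left_to_right(x)), signs(square_of_square(x)))
```

Under the default round-to-nearest mode, -0+0i and -0-0i come out with different signs for the two groupings, while +0+0i and +0-0i do not.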

I was part of NCEG (the Numerical C Extension Group) at the time I
stumbled into this, and they didn't have any trouble following it
<wink>. It was a surprise to everyone at the time that Cartesian
multiplication of complex zeroes lost associativity when applying 754
rules in the obvious way, and no resolution was reached at that time.
> Sounds to me that the Feds wanted something broken and you gave them
> something that was working. No wonder they failed you :)

Yup, and they did a lot of that <0.9 wink>. Luckily(?), Fortran is so
eager to allow optimizations that failure due to numeric differences
in conformance tests rarely withstood challenge.
 
Ivan Van Laningham

Steven D'Aprano

Tim said:
[Steven D'Aprano]
>> It isn't necessary to look at complex numbers to see the difference
>> between positive and negative zero. Just look at a graph of y=1/x. In
>> particular, look at the behaviour of the graph around x=0. Now tell me
>> that the sign of zero doesn't make a difference.


> OK, I looked, and it made no difference to me. Really. If I had an
> infinitely tall monitor, maybe I could see a difference, but I don't
> -- the sign of 0 on the nose makes no difference to the behavior of
> 1/x for any x other than 0. On my finite monitor, I see it looks like
> the line x=0 is an asymptote, and the graph approaches minus infinity
> on that line from the left and positive infinity from the right; the
> value of 1/0 doesn't matter to that.

Well, I didn't say that the value of zero made a
difference for _other_ values of x. Perhaps you and I
are interpreting the same graph differently. To me, the
behaviour of 1/x around x = 0 illustrates why +0 and -0
are different, but it isn't worth arguing about.
> No, signed zeros "preverse" that identity for exactly the set {+Inf,
> -Inf}, and that's all.

Preverse? Did you mean "pervert"? Or just a mispelt
"preserve"? Sorry, not trying to be a spelling Nazi, I
genuinely don't understand what you are trying to get
across here.

> That's worth something, but 1/(1/x) == x isn't
> generally true in 754 anyway.

Of course it isn't. Sorry, I was thinking like a
mathematician instead of a programmer again.

(Note to self: write out 1000 lines, "Real number !=
floating point number".)

[snip]
> As I said, these are complex zeroes, not real zeroes. The 754
> standard doesn't say anything about complex numbers. In rectangular
> form, a complex zero contains two real zeroes. There are 4
> possibilities for a complex zero if the components are 754
> floats/doubles:
>
> +0+0i
> +0-0i
> -0+0i
> -0-0i
>
> Implement Cartesian complex multiplication in the obvious way:
>
> (a+bi)(c+di) = (ac-bd) + (ad+bc)i

Yes, but that supports what I said: it is an
_implementation_ issue. A different implementation
might recognise a complex zero and return the correctly
signed complex zero without actually doing the
multiplication.

Assuming mathematicians can decide on which complex
zero is the correct one.

> Now use that to raise the four complex zeroes above to various integer
> powers, trying different ways of grouping the multiplications. For
> example, x**4 can be computed as
>
> ((xx)x)x
>
> or
>
> (xx)(xx)
>
> or
>
> x((xx)x)
>
> etc. You'll discover that, in some cases, for fixed x and n, the
> signs of the zeroes in the result depend on how the multiplications were
> grouped. The 754 standard says nothing about any of this, _except_
> for the results of multiplying and adding 754 zeroes. Multiplication
> of signed zeroes in 754 is associative. The problem is that the
> extension to Cartesian complex multiplication isn't associative under
> these rules in some all-zero cases, mostly because the sum of two
> signed zeroes is (under 3 of the rounding modes) +0 unless both
> addends are -0. Try examples and you'll discover this for yourself.

Yes, point taken. But that is just another example of
where floats fail to be sufficiently close to real
numbers. In a "perfect" representation, this would not
be a factor.

However... now that I think of it... in polar form,
there are an uncountably infinite number of complex zeroes.

z = 0*cis(0), z = 0*cis(0.1), z = 0*cis(-0.21), ...

I think I won't touch that one with a fifty foot pole.
 
Robert Kern

Tim said:
> OK, I looked, and it made no difference to me. Really. If I had an
> infinitely tall monitor, maybe I could see a difference, but I don't
> -- the sign of 0 on the nose makes no difference to the behavior of
> 1/x for any x other than 0. On my finite monitor, I see it looks like
> the line x=0 is an asymptote, and the graph approaches minus infinity
> on that line from the left and positive infinity from the right; the
> value of 1/0 doesn't matter to that.

Well, the value of 1/0 is undefined. Occasionally, it's useful to report
+inf as the value of 1.0/+0.0 because practically we're more concerned
with limiting behavior from an assumed limiting process than being
correct. By the same token, we might also be concerned with the limiting
behavior coming from the other direction (a different limiting process),
so we might want 1.0/-0.0 to give -inf (although it's still actually
undefined, no different from the first expression, and inf is really the
same thing as -inf, too).

Although I haven't read the paper you cited, it seems to me that the
branch cut issue is the same thing. If you're on the cut itself, the
value, practically, depends on which end of the branch you're deciding
to approach the point from. It's arbitrary; there's no correct answer;
but signed zeros give a way to express some of the desired, useful but
wrong answers.
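
A small illustration of the branch cut point, sketched with Python's cmath module. It assumes a platform with hardware and system-level support for signed zeros, which is the condition the cmath documentation gives for this behaviour; the variable names are just for this example.

```python
import cmath

above = complex(-4.0, 0.0)    # on the cut, approached from above
below = complex(-4.0, -0.0)   # on the cut, approached from below

print(above == below)                          # the inputs compare equal
print(cmath.sqrt(above), cmath.sqrt(below))    # imaginary parts differ in sign
print(cmath.phase(above), cmath.phase(below))  # +pi versus -pi
```

The two inputs are the same number as far as == can tell, yet sqrt and phase land on opposite sides of the cut depending on the sign of the zero imaginary part.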

And floating point is about nothing if not being usefully wrong.

--
Robert Kern
(e-mail address removed)

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
 
Michael Hudson

Terry Reedy said:
> I believe that changes have been made to marshal/unmarshal in 2.5 CVS with
> respect to NAN/INF to eliminate annoying/surprising behavior differences
> between corresponding .py and .pyc files. Perhaps these revisions would be
> relevant to pickle changes.

If you use a binary protocol for pickle, yes.
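
As a quick check of the binary-protocol case, here is a sketch that round-trips the awkward values through pickle protocol 2. It reflects how a current CPython behaves and says nothing about the win32 builds the thread title complains about; the roundtrip helper is a made-up name.

```python
import math
import pickle

def roundtrip(obj, protocol=2):
    # Protocol 2 is binary; floats are stored as 8-byte IEEE 754 doubles.
    return pickle.loads(pickle.dumps(obj, protocol=protocol))

values = [float("inf"), float("-inf"), float("nan"), -0.0]
restored = roundtrip(values)

print(restored)
print(math.isnan(restored[2]))            # NaN survives the trip
print(math.copysign(1.0, restored[3]))    # so does the sign of -0.0
```

Because the binary float opcode stores the raw double, the platform's text formatting of infinities and NaNs never enters the picture.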

Cheers,
mwh
 
