zero up memory


Malcolm McLean

In the case that I'm most familiar with, IEEE 754 double precision,
sub-normal numbers have an exponent of 0.
Within this scheme, could you provide an example of the "many possible
representations for each number" that you're referring to?

IEEE denormalised numbers have an exponent of 0 (represents 2^-1023),
but no implicit leading set bit. So very small numbers can be
represented.
Now imagine that instead of an implicit set bit, we have a real bit,
which can be toggled on or off, and a floating exponent. Now we can
represent very many numbers in several ways. We can shift the mantissa
right, clear the leading bit, and increment the exponent by one,
without changing the value, assuming that the lost low bit of the
mantissa was clear.

So my question was, how to do efficient tests for equality, on such an
architecture?
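
For concreteness, a minimal C sketch of such a toy format (everything here is hypothetical, not any real ISA): an explicit leading bit, a free exponent, and an equality test that canonicalises both operands before comparing field by field.

    /* Hypothetical format: explicit leading bit, no hidden bit, so many
       bit patterns can denote the same value. */
    #include <stdint.h>
    #include <stdbool.h>

    struct toyfp {
        int      sign;      /* 0 or 1 */
        int      exponent;  /* unbiased, range is illustrative */
        uint64_t mantissa;  /* explicit leading bit */
    };

    static struct toyfp canonical(struct toyfp x)
    {
        if (x.mantissa == 0) {          /* every zero becomes the same zero */
            x.sign = 0;
            x.exponent = 0;
            return x;
        }
        while (!(x.mantissa >> 63)) {   /* shift left until the top bit is set */
            x.mantissa <<= 1;
            x.exponent--;
        }
        return x;
    }

    static bool toy_eq(struct toyfp a, struct toyfp b)
    {
        a = canonical(a);
        b = canonical(b);
        return a.sign == b.sign &&
               a.exponent == b.exponent &&
               a.mantissa == b.mantissa;
    }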
 

James Kuyper

IEEE denormalised numbers have an exponent of 0 (represents 2^-1023),
but no implicit leading set bit. So very small numbers can be
represented.

Am I right in assuming that you concede the feasibility of efficiently
implemented equality comparisons involving IEEE 754 sub-normals?
Now imagine that instead of an implicit set bit, we have a real bit,
which can be toggled on or off, and a floating exponent. Now we can
represent very many numbers in several ways. We can shift the mantissa
right, clear the leading bit, and increment the exponent by one,
without changing the value, assuming that the lost low bit of the
mantissa was clear.

You're no longer talking about IEEE 754 sub-normals, I presume.
Incrementing the exponent on a sub-normal would make it non-zero, so it
would no longer be considered sub-normal. As a normal number, it would
be interpreted as having an implicit leading bit, which would change the
value. Therefore, I presume you're talking about some alternative
floating point representation.

....
So my question was, how to do efficient tests for equality, on such an
architecture?

I don't know. Has anyone other than you been talking about such an
architecture on this thread?

There is some ambiguity about terminology here. I've heard terms such as
unnormalized, denormalized, and sub-normal all used to refer to the same
basic idea. According to the documentation I've seen, sub-normal seems
to be IEEE's preferred term. When you read Ben's first message about
"unnormalized numbers" it's quite clear that he was talking about what
IEEE calls sub-normals, for which the exponent must be 0; and not about
anything resembling the architecture you describe.
 

Malcolm McLean

Am I right in assuming that you concede the feasibility of efficiently
implemented equality comparisons involving IEEE 754 sub-normals?

I'm answering this:

Ben Bacarisse

In systems where the mantissa is not normalised (i.e. there is no
implicit leading 1) then zero is specified quite naturally with a zero
mantissa. A zero sign bit and exponent (even if biased) makes no
difference. Systems like this are common: IEEE decimal floating point
and IBM's hexadecimal floating point come to mind.

With IEEE 754 sub-normals, the problem doesn't arise, because the
exponent is fixed.
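
A quick way to see that, assuming IEEE 754 binary64 doubles and that copying the object representation into a uint64_t is meaningful on the implementation at hand:

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>
    #include <float.h>

    int main(void)
    {
        double d = DBL_MIN / 4.0;   /* well into the sub-normal range */
        uint64_t bits;
        memcpy(&bits, &d, sizeof bits);
        /* The 11-bit exponent field of any sub-normal is 0, so each
           sub-normal value has exactly one bit pattern. */
        printf("exponent field = %llu\n",
               (unsigned long long)((bits >> 52) & 0x7FF));
        return 0;
    }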
 

Kleuske

That doesn't address the question, which is how the processor performs a
floating-point equality test, the one specified by C's "==" operator.

In fact it did, if only in passing. The comparison is a bit-wise one, IIRC,
and using '==' for floats and doubles is strongly discouraged where I work.

I think it might be a universal best practice.
 

Kleuske

In fact it did, if only in passing. The comparison is a bit-wise one,
IIRC, and using '==' for floats and doubles is strongly discouraged
where I work.

"Strongly discouraged" as in "Do it again, and you're fired".
 

Ben Bacarisse

James Kuyper said:
When you read Ben's first message about
"unnormalized numbers" it's quite clear that he was talking about what
IEEE calls sub-normals, for which the exponent must be 0; and not about
anything resembling the architecture you describe.

Actually I wasn't. I was referring to a representation with no "hidden
bit". It's linked to normalisation because, when the base is > 2 there
can be leading zeros in the mantissa even when the number is
"normalised" (mantissa shifted as far left as possible).

IBM machines use/used such a format. The result is that there are lots
of zeros: a zero mantissa does not mean 0.1xxx... but 0, so the exponent
has no effect on the value.
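
A decoding sketch of a System/360-style 32-bit hex float (sign bit, 7-bit excess-64 exponent, 24-bit fraction, no hidden bit; the helper function is only illustrative) shows why any exponent paired with a zero fraction is a zero:

    #include <stdio.h>
    #include <stdint.h>
    #include <math.h>

    static double ibm_hex_to_double(uint32_t w)
    {
        int      sign = (w >> 31) & 1;
        int      exp  = (w >> 24) & 0x7F;     /* excess-64, base 16 */
        uint32_t frac = w & 0x00FFFFFF;       /* 24 fraction bits   */
        double   val  = frac / 16777216.0;    /* frac * 16^-6       */
        val = val * pow(16.0, exp - 64);
        return sign ? -val : val;
    }

    int main(void)
    {
        /* Same zero fraction, three different exponents: all decode to 0. */
        printf("%g %g %g\n",
               ibm_hex_to_double(0x00000000),
               ibm_hex_to_double(0x40000000),
               ibm_hex_to_double(0x7F000000));
        return 0;
    }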
 

Ben Bacarisse

Malcolm McLean said:
I'm answering this:

Ben Bacarisse

In systems where the mantissa is not normalised (i.e. there is no
implicit leading 1) then zero is specified quite naturally with a zero
mantissa. A zero sign bit and exponent (even if biased) makes no
difference. Systems like this are common: IEEE decimal floating point
and IBM's hexadecimal floating point come to mind.

s/is not normalised/has no hidden bit/

I should not have conflated this with normalisation, though my excuse is
that such formats can have leading zero bits in the mantissa so they
"look" un-normalised.

It does not alter the issue that there are then a very large number of
representations of zero.
 

James Kuyper

Actually I wasn't. I was referring to a representation with no "hidden
bit". It's linked to normalisation because, when the base is > 2 there
can be leading zeros in the mantissa even when the number is
"normalised" (mantissa shifted as far left as possible).

IBM machines use/used such a format. The result is that there are lots
of zeros: a zero mantissa does not mean 0.1xxx... but 0, so the exponent
has no effect on the value.

I read "unnormalized", and didn't pay close enough attention to realize
that you were talking about something more exotic than ordinary IEEE 754
sub-normals. After Malcolm's last message I looked up "decimal floating
point" and learned that it has precisely the feature he was talking
about. I apologize for my confusion, and any confusion I may have caused.
 

Ben Pfaff

Robert Wessel said:
FWIW, a slightly closer-to-common example is the 80-bit real format on
x87. While still binary, it does not have the implicit (hidden) one
bit of the 32 and 64 bit formats (IOW, the leading one bit is actually
stored), and thus has a bunch of "pseudo-zeros" (Intel terminology)
where the mantissa is zero but the exponent is not. Early versions of
the x87 treated those mostly as true zeros, but since the 387 (I
think), pseudo-zeros have been disallowed.

I've never understood why the 80-bit format differs from the 32-
and 64-bit formats in this respect.
 

Keith Thompson

Kleuske said:
In fact it did, if only in passing. The comparison is a bit-wise one, IIRC,
and using '==' for floats and doubles is strongly discouraged where I work.

I think it might be a universal best practice.

But that has nothing to do with the original question, which was about
how "==" is implemented, *not* whether it's a good idea to use it.

Floating-point "==" is *not* simply bitwise comparison. If the values
stored in x and y are positive and negative zero, then "x == y" must
yield 1. If x and y are both NaNs with the same representation (for
implementations that support NaNs), then "x == y" must yield 0.
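
Both cases are easy to demonstrate, assuming the implementation supports signed zeros and quiet NaNs (and defines NAN in <math.h>):

    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        double pz = 0.0, nz = -0.0;
        double n1 = NAN,  n2 = n1;      /* two copies of the same NaN */
        printf("%d\n", pz == nz);       /* 1: different bits, equal   */
        printf("%d\n", n1 == n2);       /* 0: same bits, not equal    */
        return 0;
    }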

BTW, there are cases where floating-point "==" is well defined. For
example, if this:

if (1.0 + 1.0 == 2.0) puts("ok");

doesn't print "ok", then there's something wrong with your
implementation (certainly if __STDC_IEC_559__ is defined; I'm
less certain of that if it isn't). A policy that forbids using
floating-point "==" in all cases is overkill.

(Not that overkill is necessarily inappropriate in all cases.)
 

Malcolm McLean

In fact it did, if only in passing. The comparison is a bit-wise one, IIRC,
and using '==' for floats and doubles is strongly discouraged where I work.

I think it might be a universal best practice.

If you're comparing reals for equality, usually it means that there's
something logically wrong in what you're doing, and bunging in a +/-
epsilon is only masking the problem.

But there are exceptions of course. An example would be if you want to
check that two data items are actually copies of each other.
 

Sjouke Burry

If you're comparing reals for equality, usually it means that there's
something logically wrong in what you're doing, and bunging in a +/-
epsilon is only masking the problem.

But there are exceptions of course. An example would be if you want to
check that two data items are actually copies of each other.

Also I have used == for inserting a flag value in data.
They compare OK, because the flag and the test value are not
the result of a calculation, and are far out of range of the data.
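
A sketch of that idiom; the sentinel value and names are illustrative only, not from real code:

    #include <stdio.h>

    #define MISSING (-9999.0)   /* hypothetical out-of-range flag value */

    int main(void)
    {
        /* The flag is assigned, never computed, so "==" recovers it exactly. */
        double readings[5] = { 1.25, 3.5, MISSING, 2.75, MISSING };
        for (int i = 0; i < 5; i++) {
            if (readings[i] == MISSING)
                printf("reading %d is missing\n", i);
        }
        return 0;
    }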
 

BartC

Robert Wessel said:
It is odd. The standard actually allows that for the extended
formats. My theory has always been that this was a more convenient
internal representation for the 8087 (IOW, with an explicit one bit),
and they just ended up exposing that as an enhancement along the way.

The 32/64-bit formats are designed to be stored in memory, so there could be
millions of such numbers, and it was useful to have an extra bit of
precision in all those stored numbers by not storing that '1' bit. (Using
33/65-bits would be a trifle inconvenient.)

With the internal 80-bit format, that was not important, as there are only 8
or so such numbers at a time (one per x87 register). The ability to store that
format externally was probably just a bonus as you say.

And in the 80-bit format, I believe 64 bits of that are mantissa, with an
explicit '1' bit. Presumably the internal circuits were 64 bits wide. There
would have been little benefit to storing only 63 bits in the user register;
you'd get a slightly better exponent range perhaps, but it's already wide
enough for most purposes. Or you'd only need to have 79-bit wide registers!
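
For what it's worth, a sketch that shows the explicit integer bit, assuming long double is the x87 80-bit extended format (as with gcc on x86/x86-64) and a little-endian layout:

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>

    int main(void)
    {
        long double x = 1.0L;
        unsigned char bytes[sizeof x];
        memcpy(bytes, &x, sizeof x);

        /* On little-endian x86 the low 8 bytes hold the 64-bit significand;
           for a normal number its top (integer) bit is stored explicitly. */
        uint64_t mantissa;
        memcpy(&mantissa, bytes, 8);
        printf("explicit integer bit = %d\n", (int)(mantissa >> 63)); /* 1 */
        return 0;
    }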
 

Nick Keighley

"Strongly discouraged" as in "Do it again, and you're fired".

It sometimes makes sense. For instance, floating-point variables can be
used to hold integers. As long as you're careful with / and %, you can
safely do == tests. If I initialise a variable to 0.0 and do no
operations on it, it will test equal to 0.0. Hence I can test whether it
has only been initialised.
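
A small sketch of that idiom (assuming doubles, in which small integers are exact):

    #include <stdio.h>

    int main(void)
    {
        double count = 0.0;          /* "not yet set" */

        if (count == 0.0)
            printf("still at its initial value\n");

        count = count + 1.0;         /* small integer values stay exact */
        count = count + 1.0;
        if (count == 2.0)
            printf("exactly 2\n");
        return 0;
    }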
 

Nick Keighley

If you're comparing reals for equality, usually it means that there's
something logically wrong in what you're doing, and bunging in a +/-
epsilon is only masking the problem.

Why? If I want to test whether something is near zero, what is wrong with
epsilon tests?
 

Kleuske

But that has nothing to do with the original question, which was about
how "==" is implemented, *not* whether it's a good idea to use it.

That's the good thing about USENET threads. You are usually allowed to
digress a little.

Floating-point "==" is *not* simply bitwise comparison. If the values
stored in x and y are positive and negative zero, then "x == y" must
yield 1. If x and y are both NaNs with the same representation (for
implementations that support NaNs), then "x == y" must yield 0.

Ok. Bitwise comparison with a few caveats. Point was, using '==' on
reals is generally a bad idea. There are exceptions to the rule, as
there always are, but still.

BTW, there are cases where floating-point "==" is well defined. For
example, if this:

if (1.0 + 1.0 == 2.0) puts("ok");

doesn't print "ok", then there's something wrong with your
implementation (certainly if __STDC_IEC_559__ is defined; I'm less
certain of that if it isn't). A policy that forbids using
floating-point "==" in all cases is overkill.

AFAIK, defining '__STDC_IEC_559__' isn't a requirement, and unless the
above is hardcoded, you can never guarantee the values behave nicely.

(Not that overkill is necessarily inappropriate in all cases.)

hmmm...
 

Kleuske

If you're comparing reals for equality, usually it means that there's
something logically wrong in what you're doing, and bunging in a +/-
epsilon is only masking the problem.

I usually don't use reals at all, only when absolutely required, but that's
another matter.

In fields where they are required, I suppose the logic dictates you first put
your equations in a form that preserves as much precision as possible. But
that's maths. Since you probably have many different calculations, an
appropriate epsilon cannot sensibly be a single constant: it depends not only
on the maths applied but also on the physical tolerances of the input. Any
"equality checking" then degrades to checking whether or not something falls
in some specified range, which is a different kettle of fish.

In that way, you are (probably) right.

But there are exceptions of course. An example would be if you want to
check that two data items are actually copies of each other.

In one of my hobby projects, however, the user performs maths in a toy
language, and, rather than stunning the user with the absence of an
equality operator for reals, I decided to implement one using the above
method and an epsilon the user can specify.
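
Something along these lines, as a sketch only; the names and default epsilon are illustrative, not taken from the actual project:

    #include <math.h>
    #include <stdbool.h>

    static double user_epsilon = 1e-9;   /* settable from the toy language */

    /* "Equal" in the toy language means within the user's epsilon. */
    bool toy_real_eq(double a, double b)
    {
        return fabs(a - b) <= user_epsilon;
    }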
 

Keith Thompson

Kleuske said:
In one of my hobby projects, however, the user performs maths in a toy
language, and, rather than stunning the user with the absence of an
equality operator for reals, I decided to implement one using the above
method and an epsilon the user can specify.

The loss of transitivity can be a problem, though. With such a
definition of equality, x == y && y == z doesn't imply x == z.
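
A concrete illustration, with an arbitrary epsilon of 0.1:

    #include <stdio.h>
    #include <math.h>

    #define EPS 0.1
    #define NEARLY_EQ(a, b) (fabs((a) - (b)) <= EPS)

    int main(void)
    {
        double x = 0.00, y = 0.08, z = 0.16;
        printf("%d %d %d\n",
               NEARLY_EQ(x, y),   /* 1 */
               NEARLY_EQ(y, z),   /* 1 */
               NEARLY_EQ(x, z));  /* 0: "equality" is not transitive */
        return 0;
    }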
 

Malcolm McLean

Why? If I want to test whether something is near zero, what is wrong with
epsilon tests?

There are two questions you might be asking:

1) is the answer actually zero but my floating point unit not good
enough to give a value of zero?

2) is the value so close to zero that I can regard it as zero?

Both questions are likely to involve you in difficulties and
subtleties. If you can answer those difficulties and subtleties, then
a test involving epsilon might be appropriate. Just reflexively
bunging in fabs(x - y) < epsilon where the logic says x == y is
unlikely to be a good idea.
 
