C/C++ pitfalls related to 64-bits (unsigned long & double)


James Kuyper

....
documented for the architecture in numeric_limits. Still wondering here
about the "up for grabs" part. It seems to imply some edge condition that
isn't accounted for.

It sounds to me like he's not sure how to detect situations where the
conversion would be unsafe.
 

Ben Bacarisse

James Kuyper said:
It sounds to me like he's not sure how to detect situations where the
conversion would be unsafe.

That's not how I took it. I think the condition "even though double has
more bits than, say, an integer" is meant to suggest that exact
conversion is possible. Of course, that's not a literal consequence of
double having more bits (it must have a nearly equal number of mantissa
bits) but it's the only way I could make sense of "it can do so in
theory". All such things are guesses, of course, so I anticipate being
wrong.
 

BGB

It's not up for grabs in C (and C++ is essentially the same in this
regard). If the integer can be represented exactly in the floating
point type, it must be.

theoretically, yes.

given a double has 52 mantissa bits (53 with the implicit leading bit),
and int is 32 bits, and it is possible to convert exactly, it should
always be reliable.


but, I have seen it not work exactly right, albeit in rare cases (in my
experience, usually on 64-bit Linux systems with AMD chips).

I haven't seen the issue on Win64 though that I can remember, nor with
32-bit code, so I don't know.

it "might" have something to do with SSE for all I know (since 32-bit
code typically uses x87, and 64-bit typically uses SSE), or maybe
something to do with GCC, or similar.

might require researching, like trying to figure how exactly AMD chips
implement the "cvtsi2sd" and "cvtsd2si" instructions or similar... (I am
not even particularly sure which side of the conversion would have been
introducing a loss of accuracy, or if the cause could be something else
"in the middle" somewhere).

That's true if the truncated value can be represented as an int. If
not, the behaviour is undefined. For example, in the example that
triggered this thread my implementation produces zero as the result.

I didn't notice that part of the thread until after I posted.

If the int is "in range" and the conversion still fails, you don't have
a conforming C implementation.

could be.


all I know is I am fairly sure I have seen it happen in the past (unless
I remember seeing an issue probably with other causes, like maybe
arithmetic was being done somewhere was messing it up or similar).

I never really went and did an in depth investigation of the problem, as
it was fixed easily enough when I ran into it (and is not too much
different than other epsilon-type stuff when working with floating-point
types).

like, the whole: "if(fabs(a-b)<0.000001)" thing to compare for equality
or similar.

I can't see how this helps. If v is representable exactly as a double,
the round trip has no effect so this code is not needed. Can you give me
a use-case?

generally, I had seen it in my 3D engine, where in some cases integers
ended up getting converted to doubles and back, and sometimes they would
get "bumped" in this way. adding a small adjustment seemed to fix the
problem.

mostly since then I have been working under the assumption of trying to
avoid conversions to/from floating point types when possible (partly
also as I had made the past observation that these conversions have also
tended to be costly).


or such...
 

Malcolm McLean

all I know is I am fairly sure I have seen it happen in the past (unless
I remember seeing an issue probably with other causes, like maybe
arithmetic was being done somewhere was messing it up or similar).
Sounds like a hardware bug. The C compiler can't necessarily work
round those.
 

BGB

That's not how I took it. I think the condition "even though double has
more bits than, say, an integer" is meant to suggest that exact
conversion is possible. Of course, that's not a literal consequence of
double having more bits (it must have a nearly equal number of mantissa
bits) but it's the only way I could make sense of "it can do so in
theory". All such things are guesses, of course, so I anticipate being
wrong.

as I understand it, the entire range of 32 bit integers can be exactly
represented by a double.

theoretically, it should be a matter of sticking the bits into the
mantissa and adjusting the exponent as needed (so that the value is
normalized).


the issue is that, assuming my memory is correct, I had seen systems
where it didn't always work, but it was more like "once in a great
while", rather than the conversion being consistently wrong.

this wasn't being an issue with large values either, but more like a
value of "1000" would occasionally end up as "999" and similar, but it
did seem to always tend towards 0, so it wasn't like it was getting
"1001" or similar.

IIRC, when measuring, it was typically off by a tiny amount.
I am not certain whether or not any arithmetic was being performed on
the values.


what I remember about the configuration I saw it on:
Linux x86-64 (Fedora 11 IIRC), compiling with GCC;
CPU: AMD Athlon X2 (I forget which core).

at the time I was also compiling for Win64 ("Windows XP x64") using
MSVC, but did not see the issue with this configuration.


IIRC, there was a difference, namely that GCC was doing conversions
directly using "cvtsi2sd" and "cvtsd2si", whereas MSVC was doing the
conversion via internal function calls (this particular difference
seemed fairly common between GCC and MSVC, where GCC would typically
directly use math instructions, but MSVC would call functions to do
stuff like this, even with compiler optimizations turned on).

but, this is not to say my memory is being entirely accurate though (all
this was several years ago).
 

Ben Bacarisse

BGB said:
theoretically, yes.

given a double has 52 mantissa bits (53 with the implicit leading
bit), and int is 32 bits, and it is possible to convert exactly, it
should always be reliable.

but, I have seen it not work exactly right, albeit in rare cases (in
my experience, usually on 64-bit Linux systems with AMD chips).

I haven't seen the issue on Win64 though that I can remember, nor with
32-bit code, so I don't know.

it "might" have something to do with SSE for all I know (since 32-bit
code typically uses x87, and 64-bit typically uses SSE), or maybe
something to do with GCC, or similar.

might require researching, like trying to figure how exactly AMD chips
implement the "cvtsi2sd" and "cvtsd2si" instructions or similar... (I
am not even particularly sure which side of the conversion would have
been introducing a loss of accuracy, or if the cause could be
something else "in the middle" somewhere).

Since this behaviour is required for C implementations to be conforming,
deviation from it is important. Was there perhaps a bug report filed?

generally, I had seen it in my 3D engine, where in some cases integers
ended up getting converted to doubles and back, and sometimes they
would get "bumped" in this way. adding a small adjustment seemed to
fix the problem.

Can you add a test to the code to print v when

(int)(double)v != v &&
(v >= 0 ? (int)(v+0.0001) : (int)(v-0.0001)) == v

? That way we might get an example of the problem you are reporting.

<snip>
 

BGB

Since this behaviour is required for C implementations to be conforming,
deviation from it is important. Was there perhaps a bug report filed?

not at the time, I merely thought of it as an interesting occurrence and
worked around it.

Can you add a test to the code to print v when

(int)(double)v != v &&
(v >= 0 ? (int)(v+0.0001) : (int)(v-0.0001)) == v

? That way we might get an example of the problem you are reporting.

I would have to find an example of it again...

I remember seeing the problem a few years ago in some code of mine, but
don't have any recent memory of bugs resulting from it (but, then again,
this could also be due to code paranoia...).

just went and tried to recreate it, with mixed results:
a raw conversion does not show any issues (seems to always be reliable);
if I add a value to the double, and subtract the same value, then it
starts acting up.

testing the code below in Fedora 13 x86-64 within VMware (yes, not the
raw HW, but I would otherwise have to reboot).

#include <stdio.h>
#include <stdlib.h>  /* rand() */

int main()
{
    double d;
    int i, j, k;

    for(i=0; i<100000000; i++)
    {
        j=rand()*rand()*i;
        d=j;
        d=d+1.0; //(1)
        d=d-1.0; //(1)
        k=d; //(2)
        k=(d>=0?(int)(d+0.0001):(int)(d-0.0001)); //(2)
        if(j!=k)
            printf("%d %d\n", j, k);
    }
    return 0;
}

1: if these lines are commented out, then the printf is never called,
but if uncommented (along with using different constant values), then I
start seeing messages (with it off-by-one, rounded towards 0).

2: if I switch to the second form, which makes the fudging, then the
messages disappear (they still appear with the first form).

so, it would seem to be mostly an issue in this case of whether or not
one does any arithmetic on the values (not sure whether or not this
still counts). CPU is an "AMD Athlon II X4 630".

or, at least, this is what I am seeing here...


here is the inner part of the loop (in ASM):
.L4:
    movl      $0, %eax
    call      rand
    movl      %eax, %ebx
    movl      $0, %eax
    call      rand
    imull     %ebx, %eax
    imull     -20(%rbp), %eax
    movl      %eax, -24(%rbp)
    cvtsi2sd  -24(%rbp), %xmm0
    movsd     %xmm0, -32(%rbp)
    movsd     -32(%rbp), %xmm1
    movsd     .LC0(%rip), %xmm0
    addsd     %xmm1, %xmm0
    movsd     %xmm0, -32(%rbp)
    movsd     -32(%rbp), %xmm0
    movsd     .LC0(%rip), %xmm1
    subsd     %xmm1, %xmm0
    movsd     %xmm0, -32(%rbp)
    movsd     -32(%rbp), %xmm0
    cvttsd2si %xmm0, %eax
    movl      %eax, -36(%rbp)
    movl      -24(%rbp), %eax
    cmpl      -36(%rbp), %eax
    je        .L3
    movl      $.LC1, %eax
    movl      -36(%rbp), %edx
    movl      -24(%rbp), %ecx
    movl      %ecx, %esi
    movq      %rax, %rdi
    movl      $0, %eax
    call      printf
.L3:
....

.LC0:
    .long 0
    .long 1072693248
 

BGB


oh yeah, here is an example of the output (with a slight tweak to show
the value held by the double):
1073741824 1073741823 41CFFFFFFFFFFFFF
1073741824 1073741823 41CFFFFFFFFFFFFF
262144 262143 410FFFFFFFFFFFFF
262144 262143 410FFFFFFFFFFFFF
262144 262143 410FFFFFFFFFFFFF
67108864 67108863 418FFFFFFFFFFFFF
4 3 400FFFFFFFFFFFFF
1073741824 1073741823 41CFFFFFFFFFFFFF
1073741824 1073741823 41CFFFFFFFFFFFFF
1073741824 1073741823 41CFFFFFFFFFFFFF
4194304 4194303 414FFFFFFFFFFFFF
4194304 4194303 414FFFFFFFFFFFFF
16384 16383 40CFFFFFFFFFFFFF
67108864 67108863 418FFFFFFFFFFFFF
16384 16383 40CFFFFFFFFFFFFF

hmm, a lot of the same values seem to repeat...
 

Eric Sosman

In comp.lang.c++ Eric Sosman said:
[...] Since hardware that offers 64 bits of precision in the
floating-point format used for `double', some loss of precision in
`b = a' must be expected.
Oh, drat. There was supposed to be an "is fairly rare" just
before the comma ...

x87 hardware isn't that rare. Depending on the implementation,
the compiler might do the calculation in temporary real form,
with all 64 bits.

... which wouldn't help, as the eventual result must be
converted to plain `double'. Yes, there are systems that support
a floating-point format with >64 bits' precision, but are there
any that use such a wide precision for `double'? `long double',
maybe, but plain `double'?

Hands up: Who's got a C implementation where

sizeof(double) * CHAR_BIT > 64

? Or, more accurately to the O.P.'s question, where

DBL_MANT_DIG * log(FLT_RADIX) / log(2) >= 64

?
 

Ben Bacarisse

BGB said:
as I understand it, the entire range of 32 bit integers can be exactly
represented by a double.

In the architecture in question, yes.
theoretically, it should be a matter of sticking the bits into the
mantissa and adjusting the exponent as needed (so that the value is
normalized).

the issue is that, assuming my memory is correct, I had seen systems
where it didn't always work, but it was more like "once in a great
while", rather than the conversion being consistently wrong.

this wasn't being an issue with large values either, but more like a
value of "1000" would occasionally end up as "999" and similar, but it
did seem to always tend towards 0, so it wasn't like it was getting
"1001" or similar.

It would seem you are talking about a hardware bug. I'd say it was one
if it weren't for the fact that you are sure you recall correctly.

Had it been me, I'd have documented it. You can get famous for finding
Intel floating points bugs! Maybe it's not too late (see my other
post).

Can you recall which part of the round-trip was going wrong? Did
cvtsi2sd turn 1000 into something less than 1000 or did cvtsd2si turn
1000 into 999?
IIRC, when measuring, it was typically off by a tiny amount.
I am not certain whether or not any arithmetic was being performed on
the values.

Oh, if there might have been arithmetic being done, how do you know the
conversion was not being done as it should? Maybe the arithmetic was
rounding in some way you did not expect?

<snip>
 

BGB

In the architecture in question, yes.


It would seem you are talking about a hardware bug. I'd say it was one
if it weren't for the fact that you are sure you recall correctly.

Had it been me, I'd have documented it. You can get famous for finding
Intel floating points bugs! Maybe it's not too late (see my other
post).

Can you recall which part of the round-trip was going wrong? Did
cvtsi2sd turn 1000 into something less than 1000 or did cvtsd2si turn
1000 into 999?

I don't remember, I think my thoughts at the time were "well, I am
getting values which are off by a tiny amount, oh well, I will fudge it".

it was a situation roughly along the lines of integers being converted
to doubles, maybe having arithmetic done on them (mostly still with
integer values), and converted back to integers later.

since it was off by a tiny amount, I just added code to fix it.

Oh, if there might have been arithmetic being done, how do you know the
conversion was not being done as it should? Maybe the arithmetic was
rounding in some way you did not expect?

it is possible, in my test elsewhere, it seems I can only really
recreate the issue if a value is added and then subtracted again from
the same value (in double form).

so, this may not be a conversion bug, but more of an "integer arithmetic
with doubles isn't exact" issue (leads to values ever slightly smaller
than what they would need to be).


fudging it does fix the problem, which was either-way, the original
intent of the "add a tiny amount to fudge it to the correct value"
kludge (I was not worried about the exact cause of the inexactness, I
just added something to compensate for it).

theoretically, the epsilon could probably be a bit smaller though...
 

Ben Bacarisse

I've set followup-to: since the code is all C.

BGB said:
On 2/13/2012 6:31 PM, BGB wrote:

Yes, it looks like conversion is not the issue.

That suggests that the round-trip conversion is happening as expected.

And this is no longer a mystery. If the +1.0 and -1.0 is producing a
non-integer result, then, yes, this fudge factor will repair it.

I see nothing on my Intel hardware (gcc version 4.6.1).
oh yeah, here is an example of the output (with a slight tweak to show
the value held by the double):
1073741824 1073741823 41CFFFFFFFFFFFFF
1073741824 1073741823 41CFFFFFFFFFFFFF
262144 262143 410FFFFFFFFFFFFF
262144 262143 410FFFFFFFFFFFFF
262144 262143 410FFFFFFFFFFFFF
67108864 67108863 418FFFFFFFFFFFFF
4 3 400FFFFFFFFFFFFF
1073741824 1073741823 41CFFFFFFFFFFFFF
1073741824 1073741823 41CFFFFFFFFFFFFF
1073741824 1073741823 41CFFFFFFFFFFFFF
4194304 4194303 414FFFFFFFFFFFFF
4194304 4194303 414FFFFFFFFFFFFF
16384 16383 40CFFFFFFFFFFFFF
67108864 67108863 418FFFFFFFFFFFFF
16384 16383 40CFFFFFFFFFFFFF

hmm, a lot of the same values seem to repeat...

They are more interesting in hex. (0x4, 0x4000, 0x40000 and so on).

If the result is reliable with 4 (i.e. if you take the loop out, set
j = 4 and find that j != k) it will be simple to see if it is the +1.0 or
-1.0 that leads to the loss of precision.

However, this is not a problem for the C implementation because the
accuracy of floating point arithmetic is implementation defined. It may
be a problem for the chip, in that the specification might be that this
should not happen, but such errors are very rare, so it's more likely to
be by design.

You said elsewhere that there might have been arithmetic being done.
Had that come right away, we could have cut the whole discussion!
 

Miles Bader

BGB said:
so, this may not be a conversion bug, but more of an "integer
arithmetic with doubles isn't exact" issue (leads to values ever
slightly smaller than what they would need to be).

Integer arithmetic with doubles _is_ exact though, if the integers can
be exactly-represented as doubles (which appears to be the case here).

-miles
 

Ben Bacarisse

Miles Bader said:
Integer arithmetic with doubles _is_ exact though, if the integers can
be exactly-represented as doubles (which appears to be the case here).

I agree it should be, but is it a bug if it isn't? IEEE floating-point
mandates that the results of the basic arithmetic operators be exactly
rounded (i.e. the result is the closest representable number to the
mathematical result) but maybe the hardware in question does not claim
to conform to the IEEE spec.

Neither C nor C++ require such accuracy on their own (though an
implementation can claim to be using IEEE conforming floating-point) so
it's not a bug as far as the language is concerned either.
 

BGB

Integer arithmetic with doubles _is_ exact though, if the integers can
be exactly-represented as doubles (which appears to be the case here).

yes.

the issue may be partly a matter of HW though, as I am seeing it on my
HW (using an AMD chip), but apparently someone else is not seeing it
(with an Intel chip), but there does seem to be a pattern in the values
(apparently: 0x4, 0x40, 0x400, 0x4000, ..., so for whatever reason
integer results which should land on one of these values are off by a
tiny amount...).

it could be a minor issue of the "arithmetic with doubles may not be
exact even if the doubles represent integers" variety, which is odd, but
whatever (it can be compensated for by fudging the value).

in theory though, these sorts of calculations should probably be exact.
 

Eric Sosman

Integer arithmetic with doubles _is_ exact though, if the integers can
be exactly-represented as doubles (which appears to be the case here).

C doesn't actually guarantee this. It guarantees exact
conversion to an F-P type for all values the type can represent
exactly (for example, 42 must convert to exactly 42.0, not to
42.0000000000000010173 or some such), but it does not guarantee
that 42.0 (exact) plus 1.0 (exact) equals 43.0 (exact).

5.2.4.2.2p4: "The accuracy of the floating-point
operations (+, -, *, /) [...] is implementation-
defined. The implementation may state that the
accuracy is unknown."

C implementations that define __STDC_IEC_559__ provide
additional guarantees that may make your statement true -- for
those implementations. But as far as I can tell, it is not a
guarantee for C with "J. Random Floating-Point."
 

James Kuyper

Integer arithmetic with doubles _is_ exact though, if the integers can
be exactly-represented as doubles ...


For IEEE double precision, 2^100 and 1 are both exactly representable,
so is the result of multiplying or dividing them, but their sum and
difference are not exactly representable.
 

Miles Bader

BGB said:
yes.

the issue may be partly a matter of HW though, as I am seeing it on my
HW (using an AMD chip), but apparently someone else is not seeing it
(with an Intel chip), but there does seem to be a pattern in the values
(apparently: 0x4, 0x40, 0x400, 0x4000, ..., so for whatever reason
integer results which should land on one of these values is off by a
tiny amount...).

Any conventional PC-type system these days is going to use IEEE FP,
and if the system claims to support IEEE FP, it has to be exact. If
it isn't, it's a bug.

I did run your program on my AMD system (phenom I), and it showed no
output. It would be interesting to see somebody with an identical CPU
to yours try it...
it could be a minor issue of the "arithmetic with doubles may not be
exact even if the doubles represent integers" variety, which is odd, but
whatever (it can be compensated for by fudging the value).

I'm not sure you could call it a minor issue. A lot of software
assumes that FP arithmetic is exact for integer values within a
certain range, and isn't going to do any fudging (because it shouldn't
be necessary, and would have a severe performance impact), so such a
system where fudging is necessary would have ... problems.

-miles
 

Miles Bader

Eric Sosman said:
C doesn't actually guarantee this. It guarantees exact
conversion to an F-P type for all values the type can represent
exactly (for example, 42 must convert to exactly 42.0, not to
42.0000000000000010173 or some such), but it does not guarantee
that 42.0 (exact) plus 1.0 (exact) equals 43.0 (exact).

Not C, but C-on-a-system-using-IEEE-FP, which is basically everything
mainstream. [In practice it's a pretty good bet that even wackier FP
hardware actually maintains the same constraint.]

Although C-the-language hedges its bets for extreme portability (and
to some degree, history: things were a lot more wild-n-wooly when C
was created), people writing the actual applications tend to be a bit
more practical, and _do_ assume things that aren't guaranteed by the
language, if the likelihood of that assumption being violated is
infinitesimal. I think this is a reasonable stance where the cost of
not making such assumptions is non-trivial.

-miles
 

Miles Bader

James Kuyper said:
For IEEE double precision, 2^100 and 1 are both exactly representable,
so is the result of multiplying or dividing them, but their sum and
difference are not exactly representable.

Well "the integers" should include the answer of course!

-miles
 
