double vs float

  • Thread starter subramanian100in

subramanian100in

In the post with heading "Learning C - Scanf or Getch, or Getchar not
working correctly after first loop" that appears in today's list in
comp.lang.c,

it has been mentioned in the answer to this post, that double should
be preferred to float when space is not a constraint.

Can someone explain why double should be preferred to float ?
 
Erik de Castro Lopo

Can someone explain why double should be preferred to float ?

IEEE floats have 24 bits of mantissa while IEEE doubles have 53 bits
of mantissa (counting the implicit leading bit in both cases).

That means that calculations using doubles are less likely to
be affected by rounding and truncation problems than calculations
using floats.

One very good read is "What Every Computer Scientist Should
Know About Floating-Point Arithmetic":

http://docs.sun.com/source/806-3568/ncg_goldberg.html

Erik
--
+-----------------------------------------------------------+
Erik de Castro Lopo
+-----------------------------------------------------------+
"The RIAA is obsessed to the point of comedy with the frustration
of having its rules broken, without considering whether such rules
might be standing in the way of increased revenues. Indeed,
Napster and Gnutella may turn out to be the two best music-marketing
gimmicks yet devised, if only the RIAA would take its head out of
its ass long enough to realise it."
-- Thomas C Greene on www.theregister.co.uk
 
santosh

In the post with heading "Learning C - Scanf or Getch, or Getchar not
working correctly after first loop" that appears in today's list in
comp.lang.c,

it has been mentioned in the answer to this post, that double should
be preferred to float when space is not a constraint.

Can someone explain why double should be preferred to float ?

Usually it's because double corresponds more closely with the system's
native floating point type than float. Also, it gives more range and
precision than float, which is an important point when using floating-
point types.
 
Jack Klein

Usually it's because double corresponds more closely with the system's
native floating point type than float. Also, it gives more range and
precision than float, which is an important point when using floating-
point types.

What system is it that has "a" native floating point type, only one
that is, that corresponds more closely to double than float?

Your answer makes no sense at all, for several reasons:

1. If a platform only has a single floating point type supported in
hardware, and it meets the requirements for C's double, there is
nothing at all preventing the C implementation from using it for float
as well.

2. I can't think of any hardware architecture off-hand that has
hardware floating point for something suitable for a double, such as
64 bits or more, that does not also have hardware support for a
narrower floating point type.

On the other hand, I know of quite a few platforms where the opposite
is true, including 32-bit controllers and DSPs, namely that they have
hardware 32-bit floating point, which meets the C requirements for
float, but not a wider hardware floating point type.
 
santosh

After reading your points I realise I may have been incorrect in my
assumptions.

Jack said:
What system is it that has "a" native floating point type, only one
that is, that corresponds more closely to double than float?

Well, the IA32 architecture has an 80-bit hardware floating point type,
to which the C double would correspond more closely than a float.

Your answer makes no sense at all, for several reasons:

1. If a platform only has a single floating point type supported in
hardware, and it meets the requirements for C's double, there is
nothing at all preventing the C implementation from using it for float
as well.

Except for the fact that one of float's purposes is to reduce memory
requirements.

2. I can't think of any hardware architecture off-hand that has
hardware floating point for something suitable for a double, such as
64 bits or more, that does not also have hardware support for a
narrower floating point type.

On the other hand, I know of quite a few platforms where the opposite
is true, including 32-bit controllers and DSPs, namely that they have
hardware 32-bit floating point, which meets the C requirements for
float, but not a wider hardware floating point type.

I stand corrected.
 
Malcolm McLean

In the post with heading "Learning C - Scanf or Getch, or Getchar not
working correctly after first loop" that appears in today's list in
comp.lang.c,

it has been mentioned in the answer to this post, that double should
be preferred to float when space is not a constraint.

Can someone explain why double should be preferred to float ?

For the vast majority of calculations, the time and space needed to hold the
variable are non-issues. In real [pun] applications doubles virtually always
have sufficient precision, whereas floats often need careful coding. For
instance, if you are calculating the mean of a list of numbers in single
precision, it might well be important to add the smallest values first to
reduce round-off errors; it is most unlikely that this would be necessary in
double precision.

Then the maths library functions use doubles.

The big exception for the programs I write is geometry routines. Here size
is an issue because we are often storing many millions of 3d points,
precision isn't very important - no one minds if a polygon is off by a pixel
as long as it joins to its neighbour without showing white - and speed is
often crucial. However it is a real nuisance having two representations of
real numbers in the machine.
 
Ernie Wright

Malcolm said:
Can someone explain why double should be preferred to float ?

[...]
The big exception for the programs I write is geometry routines. Here
size is an issue because we are often storing many millions of 3d
points, precision isn't very important - no one minds if a polygon is
off by a pixel as long as it joins to its neighbour without showing
white - and speed is often crucial.

All true, except for the "no one minds" part.

A great many 3D artists using commercial software will eventually
encounter the effects of limited precision. The most common effect is
a quantization of point positions that makes smooth polygonal surfaces
look like they're made of Lego bricks. Some highly optimized raytracers
are also very sensitive to roundoff error in the calculation of polygon
normals; a common symptom is cracks in raytraced shadows.

Most artists don't understand why this happens, or why the software
allows it to happen. I've been trying to explain it to them for a very
long time:

http://groups.google.com/group/comp.graphics.apps.lightwave/msg/e6c1f01b8bf470b5

Float is still a good tradeoff in this case. An even better use of
float is in 2D image formats,

http://en.wikipedia.org/wiki/High_dynamic_range_imaging

where double would increase storage and transmission costs without any
visible benefit.

- Ernie http://home.comcast.net/~erniew
 
pete

Jack said:
What system is it that has "a" native floating point type, only one
that is, that corresponds more closely to double than float?

Type float is subject to "the default argument promotions".
Type double isn't.
 
