double precise enough for unsigned short?

Discussion in 'C Programming' started by Andy, May 31, 2010.

  1. Andy

    Andy Guest

    Hello,

    I will be storing a series of unsigned shorts (C language)
    as doubles. I used gcc 3.2 to print out a bunch of
    sizeof statements, and unsigned shorts are 4 bytes.
    Doubles are 8 bytes, with the fractional part being 52 bits.
    Everything seems to indicate that double has more than
    enough precision to represent all possible values of
    unsigned short. I'm coding with this assumption in mind.
    Just to be sure, can anyone confirm this? My simplistic
    reason for assuming this is that 32 bits of the fractional
    part of the double can be made nonfractional by shifting
    by 32. That's enough nonfractional bits to represent all
    32 bits of an unsigned short. To indicate this 32-bit
    shift, the exponent part of the double must be 6 bits. This
    easily fits into the 11 bits for the exponent of a double.

    Despite this justification, I just want to be sure that I'm
    not missing something conceptually about the IEEE
    number representation. I recall vaguely that the numbers
    represented by double are not uniformly distributed, but
    the above reasoning seems to indicate that they are good
    enough for unsigned shorts.

    Thanks.
     
    Andy, May 31, 2010
    #1

  2. Andy

    SG Guest

    On 31 May, 23:12, Andy wrote:
    >
    > I will be storing a series of unsigned shorts (C language)
    > as doubles.  I used gcc 3.2 to print out a bunch of
    > sizeof statements, and unsigned shorts are 4 bytes.


    Interesting. What platform is this?

    > Doubles are 8 bytes, with the fractional part being 52 bits.
    > Everything seems to indicate that double has more than
    > enough precision to represent all possible values of
    > unsigned short.  I'm coding with this assumption in mind.
    > Just to be sure, can anyone confirm this?


    There is (as far as I know) no upper bound on what values can be
    represented with unsigned shorts. Typically shorts are 16 bits wide
    but could be wider.

    If I remember correctly, the C standard requires the macro DBL_DIG
    (see float.h) to evaluate to an int of at least 10. So you have at
    least 10 significant digits. For IEEE 754 doubles (52-bit stored
    fraction, 53 bits of precision) you have even more precision. But 10
    digits is enough to store 16 bit integers losslessly. And it might be
    just enough to store 32 bit integers losslessly (not 100% sure about
    that). But IEEE 754 doubles certainly will do so.

    > Despite this justification, I just want to be sure that I'm
    > not missing something conceptually about the IEEE
    > number representation.


    No, I don't think that you missed something -- except maybe that short
    is not restricted in size. In cases where short is 64 bits (or more)
    wide, an IEEE-754 double won't be able to losslessly hold all short
    values.

    Cheers!
    SG
     
    SG, May 31, 2010
    #2

  3. Andy

    bart.c Guest

    "Andy" <> wrote in message
    news:hu18ns$9ho$...
    > Hello,
    >
    > I will be storing a series of unsigned shorts (C language)
    > as doubles. I used gcc 3.2 to print out a bunch of
    > sizeof statements, and unsigned shorts are 4 bytes.
    > Doubles are 8 bytes, with the fractional part being 52 bits.
    > Everything seems to indicate that double has more than
    > enough precision to represent all possible values of
    > unsigned short. I'm coding with this assumption in mind.
    > Just to be sure, can anyone confirm this? My simplistic
    > reason for assuming this is that 32 bits of the fractional
    > part of the double can be made nonfractional by shifting
    > by 32. That's enough nonfractional bits to represent all
    > 32 bits of an unsigned short. To indicate this 32-bit
    > shift, the exponent part of the double must be 6 bits. This
    > easily fits into the 11 bits for the exponent of a double.
    >
    > Despite this justification, I just want to be sure that I'm
    > not missing something conceptually about the IEEE
    > number representation. I recall vaguely that the numbers
    > represented by double are not uniformly distributed, but
    > the above reasoning seems to indicate that they are good
    > enough for unsigned shorts.


    You might just try testing every possible value:

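    #include <stdio.h>
    #include <limits.h>

    int main(void){
        unsigned short i,j;
        double x;

        i=0;
        do {
            x=i;    /* unsigned short -> double */
            j=x;    /* double -> unsigned short */
            if (i!=j)
                printf("Can't represent %u as double\n",(unsigned)i);
            ++i;    /* wraps to 0 after USHRT_MAX, which ends the loop */
        } while (i!=0);
        return 0;
    }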

    (Turn off optimisation)

    --
    Bartc
     
    bart.c, May 31, 2010
    #3
  4. Andy <> wrote:
    > I will be storing a series of unsigned shorts (C language)
    > as [IEEE] doubles.


    Why?

    --
    Peter
     
    Peter Nilsson, Jun 1, 2010
    #4
  5. Andy

    bart.c Guest

    "Richard Heathfield" <> wrote in message
    news:...
    > In <hmWMn.34334$8g7.14401@hurricane>, bart.c wrote:


    >> You might just try testing every possible value:
    >> unsigned short i,j;
    >> double x;
    >>
    >> i=0;
    >> do {
    >> x=i;
    >> j=x;
    >> if (i!=j)
    >> printf("Can't represent %u as double\n",i);
    >> ++i;
    >> } while (i!=0);
    >> }

    >
    > If there is genuine cause for concern, unsigned short is sufficiently
    > large to make your program rather tedious to run.


    16-bit shorts are instant. 32-bit ones took a few minutes on my computer,
    but the test is only done once.

    With 64-bit shorts, you already know they can't be represented.

    > It would make more
    > sense to binary-search. Start at USHRT_MAX, and play the max/min
    > game.


    You mean, to find the point in the range where it stops working? More
    sensible then to just test 0 and USHRT_MAX then.

    You need to test every possible value if there is any doubt about being
    able to represent each one (for example if the conversion to and from
    double is more elaborate than simple assignment).

    >> (Turn off optimisation)

    >
    > By all means turn off machine optimisation if you think it'll make a
    > difference,


    It's just to stop the compiler possibly using a register floating point
    value (which on my machine has more precision than double).

    > but there's no need to turn off design optimisation.


    --
    Bartc
     
    bart.c, Jun 1, 2010
    #5
  6. Andy

    SG Guest

    On 1 Jun., 11:33, bart.c wrote:
    >
    > You need to test every possible value if there is any doubt about being
    > able to represent each one (for example if the conversion to and from
    > double is more elaborate than simple assignment).


    I think it's safe to assume that the difference between consecutive
    representable numbers is a monotonically increasing function. In that
    case you only need to check USHRT_MAX and USHRT_MAX-1.

    But something of the form

    DBL_EPSILON * USHRT_MAX < some_threshold

    for some value of some_threshold (between 0.4 and 1.0) is probably a
    better way to test this.

    > >> (Turn off optimisation)

    >
    > > By all means turn off machine optimisation if you think it'll make a
    > > difference,

    >
    > It's just to stop the compiler possibly using a register floating point
    > value (which on my machine has more precision than double).


    Aren't there rules that prohibit a compiler from using such
    intermediate high precision representations beyond the evaluation of a
    full expression? I'm not sure. But I think I read something like that
    somewhere ... If so, it should suffice to store the double value in a
    variable in one expression and read/use the variable's value in
    another expression a la

    double x = ...;
    ...x... // new expression

    (?)


    Cheers!
    SG
     
    SG, Jun 1, 2010
    #6
  7. Andy

    Eric Sosman Guest

    On 6/1/2010 4:35 AM, Richard Heathfield wrote:
    > In<hmWMn.34334$8g7.14401@hurricane>, bart.c wrote:
    >
    > <snip>
    >>
    >> You might just try testing every possible value:
    >>
    >> #include<stdio.h>
    >> #include<limits.h>
    >>
    >> int main(void){
    >> unsigned short i,j;
    >> double x;
    >>
    >> i=0;
    >> do {
    >> x=i;
    >> j=x;
    >> if (i!=j)
    >> printf("Can't represent %u as double\n",i);
    >> ++i;
    >> } while (i!=0);
    >> }

    >
    > If there is genuine cause for concern, unsigned short is sufficiently
    > large to make your program rather tedious to run. It would make more
    > sense to binary-search. Start at USHRT_MAX, and play the max/min
    > game.


    Not sure how you'd apply binary search to this problem, since
    there's no obvious "order": each test tells you that a given value
    does or does not garble when converted to double and back, and gives
    no obvious clue about what happens to larger and smaller values.
    Did you have some other order relation in mind?

    Anyhow, I think the only value one need test is USHRT_MAX itself,
    unless FLT_RADIX is something really, really weird. For the common
    FLT_RADIX values of 2 and 16 (and even for the uncommon value 10),
    USHRT_MAX will need at least as many non-zero fraction digits as
    any other unsigned short value.

    --
    Eric Sosman
     
    Eric Sosman, Jun 1, 2010
    #7
  8. Andy

    bart.c Guest

    "Richard Heathfield" <> wrote in message
    news:...
    > bart.c wrote:
    >>
    >> "Richard Heathfield" <> wrote in message
    >> news:...

    >
    > <snip>
    >
    >>> If there is genuine cause for concern, unsigned short is sufficiently
    >>> large to make your program rather tedious to run.

    >>
    >> 16-bit shorts are instant. 32-bit ones took a few minutes on my computer,
    >> but the test is only done once.
    >>
    >> With 64-bits shorts, you already know they can't be represented.

    >
    > Why? If an implementation has 64-bit short ints, maybe it also has 256-bit
    > doubles.


    I think a 64-bit short/64-bit double implementation would be far more
    common. But given 64-bit short/256-bit double, then probably a different
    approach is needed, or just more confidence that all 64-bit values are in
    fact representable.

    --
    Bartc
     
    bart.c, Jun 1, 2010
    #8
  9. Richard Heathfield <> writes:
    > Eric Sosman wrote:
    >> On 6/1/2010 4:35 AM, Richard Heathfield wrote:
    >>> In<hmWMn.34334$8g7.14401@hurricane>, bart.c wrote:
    >>>
    >>> <snip>
    >>>>
    >>>> You might just try testing every possible value:
    >>>>

    [snip]
    >>>
    >>> If there is genuine cause for concern, unsigned short is sufficiently
    >>> large to make your program rather tedious to run. It would make more
    >>> sense to binary-search. Start at USHRT_MAX, and play the max/min
    >>> game.

    >>
    >> Not sure how you'd apply binary search to this problem, since
    >> there's no obvious "order:" each test tells you that a given value
    >> does or does not garble when converted to double and back, and gives
    >> no obvious clue about what happens to larger and smaller values.
    >> Did you have some other order relation in mind?

    >
    > Oh, it was just a faulty (albeit probably fairly accurate!) assumption -
    > that double will be able to represent integers precisely, up to a
    > certain point, but not (reliably) beyond that point. If there is such a
    > point for unsigned short ints, it would be nice to know what it is.


    But since the non-representable values aren't consecutive, a simple
    binary search might not find the smallest one.

    It's probably safe to assume that if N-1 and N are both exactly
    representable, then all values from 0 to N are exactly representable.
    Given that assumption, you could do a binary search to find the
    smallest non-representable value.

    > (Having said that, I suspect that the range of contiguous integer values
    > starting from 0 and exactly representable by double is likely to exceed
    > USHRT_MAX on any "normal" implementation, so there is probably no such
    > point.)


    Well, I've used a system (Cray T90) where both short and double
    were 64 bits. But short may have had padding bits; I never had a
    chance to check.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Jun 1, 2010
    #9
  10. On Jun 1, 12:12 am, Andy <> wrote:
    >
    > I will be storing a series of unsigned shorts (C language)
    > as doubles.  
    >

    Double is usually capable of representing every possible short, long
    or int. However some very small compilers have very low-precision
    floating point representation (this is not compliant), and some very
    large systems have shorts with 64 bits and doubles with 64 bits (this
    is compliant, but normally you would only use the lower two bytes of a
    short).
     
    Malcolm McLean, Jun 1, 2010
    #10
  11. Andy

    Guest

    On Jun 1, 10:58 am, Malcolm McLean <>
    wrote:
    > On Jun 1, 12:12 am, Andy <> wrote:
    >
    > > I will be storing a series of unsigned shorts (C language)
    > > as doubles.  

    >
    > Double is usually capable of representing every possible short, long
    > or int. However some very small compilers have very low-precision
    > floating point representation (this is not compliant), and some very
    > large systems have shorts with 64 bits and doubles with 64 bits (this
    > is compliant, but normally you would only use the lower two bytes of a
    > short).



    Certainly not true for any of the 64 bit *nix platforms I'm aware of,
    which have 64 bit doubles and 64 bit longs. And that would be the
    case for any other LP64 or ILP64 system (and I'm sure there's a 64-bit
    *nix implementation out there somewhere that *isn't* LP64, but I don't
    know it). Windows, OTOH, is LLP64 (32 bit longs, 64 bit long longs).
     
    , Jun 1, 2010
    #11
  12. "" <> writes:
    > On Jun 1, 10:58 am, Malcolm McLean <>
    > wrote:
    >> On Jun 1, 12:12 am, Andy <> wrote:
    >>
    >> > I will be storing a series of unsigned shorts (C language)
    >> > as doubles.  

    >>
    >> Double is usually capable of representing every possible short, long
    >> or int. However some very small compilers have very low-precision
    >> floating point representation (this is not compliant), and some very
    >> large systems have shorts with 64 bits and doubles with 64 bits (this
    >> is compliant, but normally you would only use the lower two bytes of a
    >> short).

    >
    > Certainly not true for any of the 64 bit *nix platforms I'm aware of,
    > which have 64 bit doubles and 64 bit longs. And that would be the
    > case for any other LP64 or ILP64 system (and I'm sure there's a 64-bit
    > *nix implementation out there somewhere that *isn't* LP64, but I don't
    > know it). Windows, OTOH, is LLP64 (32 bit longs, 64 bit long longs).


    As I mentioned elsethread, the Cray T90 is (was?) a Unix system with
    64-bit shorts and 64-bit doubles. The more recent SV1 has the same
    characteristics, and other Cray vector systems are probably similar.
    But short could well have a significant number of padding bits.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Jun 1, 2010
    #12
