long double precision

Discussion in 'C++' started by vi, Nov 16, 2007.

  1. vi

    vi Guest

    Hello
    I have a question concerning the precision of long double. It may be
    a stupid question; I apologize if so.

    Here is a piece of code:


    #include <iomanip>
    #include <iostream>
    using namespace std;

    int main() {
        long double toto = 0.123456789123456789123456789123456789;
        cout << sizeof(long double) << endl;
        cout << setprecision(21) << toto << endl;
        double titi = 0.123456789123456789123456789123456789;
        cout << sizeof(double) << endl;
        cout << setprecision(21) << titi << endl;
        return 0;
    }

    and the result
    16
    0.1234567891234567838
    8
    0.1234567891234567838

    I don't understand why long double and double have the same precision
    in the output. They are clearly different sizes in memory, so does the
    problem come from the initialisation or from writing the output?

    Thanks in advance for your reply,
     
    vi, Nov 16, 2007
    #1

  2. vi

    Markus Moll Guest

    Hi

    vi wrote:

    > Hello
    > I have a question concerning the precision of long double. It may be
    > a stupid question; I apologize if so.
    >
    > here a piece of code
    >
    >
    > #include <iomanip>
    > #include <iostream>
    > using namespace::std;
    > int main () {
    > long double toto=0.123456789123456789123456789123456789;


    The above literal is a double literal: it is rounded to double precision
    before it is converted to long double, so the same value is assigned
    to both toto and titi.

    Use 0.123456...L or 0.123456...l to make the literal a long double
    literal. (Unlike with integers, the type of an unsuffixed floating-point
    literal cannot be picked from its value: 0.1 is likely not exactly
    representable in any of the floating-point types, but you would expect
    its type to be double, not the type with the greatest precision.)
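To see the effect of the suffix in isolation, here is a minimal sketch. The names `from_double_literal`, `from_long_double_literal`, `suffix_matters` and `long_double_is_wider` are invented for this illustration, and whether the two constants actually differ depends on how wide long double is on your platform.

```cpp
#include <limits>

// Same digits as in the original post, once without and once with the L suffix.
// Without a suffix the literal is a double, so it is rounded to double first.
constexpr long double from_double_literal = 0.123456789123456789123456789123456789;
constexpr long double from_long_double_literal = 0.123456789123456789123456789123456789L;

// True when the suffix changed the stored value, i.e. when the long double
// literal kept digits that the double literal had already rounded away.
constexpr bool suffix_matters() {
    return from_double_literal != from_long_double_literal;
}

// True when long double has a wider significand than double on this platform.
constexpr bool long_double_is_wider() {
    return std::numeric_limits<long double>::digits >
           std::numeric_limits<double>::digits;
}
```

Where long double is no wider than double, `suffix_matters()` should come out false. Where it is wider, printing both constants with `setprecision(21)` would be expected to show different trailing digits.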

    > cout << sizeof(long double) << endl;
    > cout << setprecision(21) << toto << endl;
    > double titi=0.123456789123456789123456789123456789;
    > cout << sizeof(double) << endl;
    > cout << setprecision(21) << titi << endl;
    > return 0;
    > }


    Markus
     
    Markus Moll, Nov 16, 2007
    #2

  3. Markus Moll wrote:
    > vi wrote:
    >
    >> Hello
    >> I have a question concerning the precision of long double. It may be
    >> a stupid question; I apologize if so.
    >>
    >> here a piece of code
    >>
    >>
    >> #include <iomanip>
    >> #include <iostream>
    >> using namespace::std;
    >> int main () {
    >> long double toto=0.123456789123456789123456789123456789;

    >
    > The above literal is a double literal. Therefore, the same value is
    > assigned to both toto and titi.
    >
    > Use 0.123456...L or 0.123456...l to denote that the literal is a long
    > double literal (unlike with integers, the type to be chosen is not
    > immediately clear. 0.1 is likely not representable in any of the
    > floating point types, but you would expect its type to be double, not
    > the type with the greatest precision).


    The problem may actually be simpler: the Standard does not guarantee
    that 'long double' has more precision than 'double'. With Microsoft
    Visual C++ on Windows, for example, the two have the same precision.

    >
    >> cout << sizeof(long double) << endl;
    >> cout << setprecision(21) << toto << endl;
    >> double titi=0.123456789123456789123456789123456789;
    >> cout << sizeof(double) << endl;
    >> cout << setprecision(21) << titi << endl;
    >> return 0;
    >> }

    >
    > Markus


    V
    --
    Please remove capital 'A's when replying by e-mail
    I do not respond to top-posted replies, please don't ask
     
    Victor Bazarov, Nov 16, 2007
    #3
  4. vi

    Markus Moll Guest

    Hi

    Victor Bazarov wrote:

    > Markus Moll wrote:
    >> Use 0.123456...L or 0.123456...l to denote that the literal is a long
    >> double literal (unlike with integers, the type to be chosen is not
    >> immediately clear. 0.1 is likely not representable in any of the
    >> floating point types, but you would expect its type to be double, not
    >> the type with the greatest precision).

    >
    > The problem may actually be simpler: the Standard does not guarantee
    > that 'long double' has more precision than 'double'. BTW, it is the
    > case with Microsoft Visual C++ on Windows, for example.


    Phew... as the OP said that his long double was twice the size of a double,
    I assumed that the precision would also be greater. However, of course it's
    possible that all the space is wasted or used for the exponent (or for
    redundant sign-bits for error-correction or something like this :p)
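The gap between size and precision can be checked directly. The sketch below is only an illustration, and the helper names `storage_bits` and `significand_bits` are made up here; it compares what `sizeof` reports with what `std::numeric_limits` says the type actually carries.

```cpp
#include <climits>
#include <limits>

// Bits of storage the type occupies, i.e. what sizeof reports, in bits.
template <typename T>
constexpr int storage_bits() {
    return static_cast<int>(sizeof(T) * CHAR_BIT);
}

// Bits of significand the type carries, counting any implicit leading bit.
template <typename T>
constexpr int significand_bits() {
    return std::numeric_limits<T>::digits;
}
```

With a `sizeof(long double)` of 16, as in the original post, `storage_bits<long double>()` would be 128. On x86 with GCC the significand of that type is typically 64 bits, so a good part of the storage is padding rather than precision.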

    What does MSVC++ say about sizeof(long double) vs sizeof(double)?

    Markus
     
    Markus Moll, Nov 16, 2007
    #4
  5. Markus Moll wrote:
    > [..]
    > What does MSVC++ say about sizeof(long double) vs sizeof(double)?


    8 vs 8

    V
    --
    Please remove capital 'A's when replying by e-mail
    I do not respond to top-posted replies, please don't ask
     
    Victor Bazarov, Nov 16, 2007
    #5
  6. vi

    vi Guest

    Hello
    Great, it works with L!
    Thanks


    On 16 nov, 11:47, Markus Moll <> wrote:
    > Hi
    >
    > vi wrote:
    > > Hello
    > > I have a question concerning the precision of long double. It may be
    > > a stupid question; I apologize if so.

    >
    > > here a piece of code

    >
    > > #include <iomanip>
    > > #include <iostream>
    > > using namespace::std;
    > > int main () {
    > > long double toto=0.123456789123456789123456789123456789;

    >
    > The above literal is a double literal. Therefore, the same value is assigned
    > to both toto and titi.
    >
    > Use 0.123456...L or 0.123456...l to denote that the literal is a long double
    > literal (unlike with integers, the type to be chosen is not immediately
    > clear. 0.1 is likely not representable in any of the floating point types,
    > but you would expect its type to be double, not the type with the greatest
    > precision).
    >
    > > cout << sizeof(long double) << endl;
    > > cout << setprecision(21) << toto << endl;
    > > double titi=0.123456789123456789123456789123456789;
    > > cout << sizeof(double) << endl;
    > > cout << setprecision(21) << titi << endl;
    > > return 0;
    > > }

    >
    > Markus
     
    vi, Nov 16, 2007
    #6
  7. Victor Bazarov wrote:
    > Markus Moll wrote:
    >> [..]
    >> What does MSVC++ say about sizeof(long double) vs sizeof(double)?

    >
    > 8 vs 8


    MSVC++ has all kinds of odd settings which conform to the standard but
    differ from most other compilers. Another one is that, if I'm not
    mistaken, 'long' is 32 bits (sizeof(long) == 4) even on 64-bit platforms
    when compiling a 64-bit binary. (So if you ever programmed assuming
    'long' will be 64 bits on a 64-bit system, then you are in for a surprise.)
    Makes one wonder how you seek a file larger than 4GB, given that fseek
    takes a long as parameter.

    (Btw, *why* does it take a long as parameter? Shouldn't it take
    size_t? It's not like what MSVC++ does is wrong or against the standard.
    It just makes it impossible to seek large files with standard code.)
     
    Juha Nieminen, Nov 16, 2007
    #7
  8. Juha Nieminen wrote:
    > Victor Bazarov wrote:
    >> Markus Moll wrote:
    >>> [..]
    >>> What does MSVC++ say about sizeof(long double) vs sizeof(double)?

    >>
    >> 8 vs 8

    >
    > MSVC++ has all kinds of odd settings which conform to the standard but
    > differ from most other compilers. Another one is that, if I'm not
    > mistaken, 'long' is 32 bits (sizeof(long) == 4) even on 64-bit
    > platforms when compiling a 64-bit binary. (So if you ever programmed
    > assuming 'long' will be 64 bits on a 64-bit system, then you are in
    > for a surprise.)
    > Makes one wonder how you seek a file larger than 4GB,


    2GB, actually. 'long' is signed, the largest value is 2^31-1. You
    must be thinking 'unsigned long', but that's not what 'fseek' is
    taking (as you correctly pointed out).
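The limit can be read straight from `LONG_MAX`. The following is a small sketch with hypothetical helper names (`max_fseek_offset`, `exceeds_fseek_range`), assuming only that the offset parameter of `fseek` is a `long`.

```cpp
#include <climits>

// Largest offset a single fseek call can express, since its offset is a long.
constexpr long max_fseek_offset() {
    return LONG_MAX;
}

// True when an offset of 'bytes' cannot be represented as a long.
constexpr bool exceeds_fseek_range(unsigned long long bytes) {
    return bytes > static_cast<unsigned long long>(LONG_MAX);
}
```

With a 32-bit `long`, `exceeds_fseek_range(3ULL << 30)` (3 GB) is true; with a 64-bit `long` it is false.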

    > given that
    > fseek takes a long as parameter.
    >
    > (Btw, *why* does it take a long as parameter? Shouldn't it take
    > size_t? It's not like what MSVC++ does is wrong or against the
    > standard. It just makes it impossible to seek large files with
    > standard code.)


    (a) It takes 'long' because when the C Library was standardised (1989)
    there was probably no concern about files larger than what 'long'
    can address, and besides, as the files grow, so will 'long', right?
    [Well, Microsoft told them all, didn't it?] (b) If you need to seek
    in files larger than 'long' allows, use either 'fsetpos' or some
    OS-specific means. (c) size_t is not a very suitable thing for that,
    since 'size_t' is for the sizes of objects. I would rather think
    that 'ptrdiff_t' is a better choice. (d) Don't use the C Library for
    file I/O, use the C++ Library; there you'll deal with the special type
    for the position, 'std::basic_streambuf::pos_type'. And if it's
    not large enough, complain to the compiler vendor.
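As a rough sketch of point (d), seeking through the C++ Library takes a `std::streamoff` rather than a `long`. The helpers `seek_from_start` and `char_at` are invented for this example, and an in-memory `std::istringstream` stands in for a file; how large an offset really works still depends on the implementation.

```cpp
#include <cassert>
#include <ios>
#include <istream>
#include <sstream>
#include <string>

// Moves the read position 'offset' bytes from the start of the stream.
// The offset is a std::streamoff, not a long. Returns false on failure.
inline bool seek_from_start(std::istream& in, std::streamoff offset) {
    assert(offset >= 0);  // an offset from the beginning cannot be negative
    in.clear();           // a stream in a failed state would ignore the seek
    in.seekg(offset, std::ios_base::beg);
    return !in.fail();
}

// Illustration on an in-memory stream: returns the character at 'offset',
// or '\0' if the seek or the read fails.
inline char char_at(const std::string& text, std::streamoff offset) {
    std::istringstream in(text);
    char c = '\0';
    if (!seek_from_start(in, offset) || !in.get(c)) return '\0';
    return c;
}
```

The same `seek_from_start` call would apply to a `std::ifstream` opened in binary mode.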

    V
    --
    Please remove capital 'A's when replying by e-mail
    I do not respond to top-posted replies, please don't ask
     
    Victor Bazarov, Nov 16, 2007
    #8
  9. vi

    BobR Guest

    Markus Moll wrote in message...
    > Victor Bazarov wrote:
    > > [snip]
    > > The problem may actually be simpler: the Standard does not guarantee
    > > that 'long double' has more precision than 'double'. BTW, it is the
    > > case with Microsoft Visual C++ on Windows, for example.

    >
    > Phew... as the OP said that his long double was twice the size of a double,
    > I assumed that the precision would also be greater. However, of course it's
    > possible that all the space is wasted or used for the exponent (or for
    > redundant sign-bits for error-correction or something like this :p)


    // #include <iostream>, <limits>
    std::cout <<" dbl digits ="
    <<(std::numeric_limits<double>::digits)<<std::endl;
    std::cout<<" LD digits ="
    <<(std::numeric_limits<long double>::digits)<<std::endl;

    /* - output - (GCC(MinGW), win98se)
    dbl digits =53
    LD digits =64
    */
    See what you get from those lines.
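Those are binary digits. For choosing a `setprecision` value, the decimal counterparts may be more useful. A small sketch follows, with made-up helper names; note that `max_digits10` was only added in C++11, so it postdates this thread.

```cpp
#include <limits>

// Decimal digits that survive a text -> T -> text round trip unchanged.
template <typename T>
constexpr int guaranteed_decimal_digits() {
    return std::numeric_limits<T>::digits10;
}

// Decimal digits needed so that T -> text -> T recovers the exact value.
template <typename T>
constexpr int round_trip_decimal_digits() {
    return std::numeric_limits<T>::max_digits10;
}
```

For the 53 and 64 binary digits shown above, these come to 15 and 17 for double, and 18 and 21 for the 64-bit significand long double.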

    >
    > What does MSVC++ say about sizeof(long double) vs sizeof(double)?


    I asked My Second Virtual Cousin (twice added), and he said nothing! <G>

    In Assembler, I used to use eight-byte(dd) and ten-byte(dt) types. That's
    not even close to "twice the size" (If we're talking number of bits).
    [ assembler == a386 ]

    --
    Bob R
    POVrookie
     
    BobR, Nov 16, 2007
    #9
  10. vi

    James Kanze Guest

    On Nov 16, 9:55 pm, "Victor Bazarov" <> wrote:
    > Juha Nieminen wrote:


    > > (Btw, *why* does it take a long as parameter? Shouldn't it take
    > > size_t? It's not like what MSVC++ does is wrong or against the
    > > standard. It just makes it impossible to seek large files with
    > > standard code.)


    > (a) It takes 'long' because when C Library was standardised (1989)
    > there was no concern probably with the files larger than what 'long'
    > can service, and besides, as the files grow, so will 'long', right?


    I don't think that's true. It's been a while, and maybe I'm
    remembering wrong, but I think the problem with using long was
    already known back then. I *think* (that is, I'm far from sure)
    that the "answer" was supposed to be fgetpos and fsetpos; fseek,
    with long, was maintained for reasons of compatibility with
    existing code.

    Whatever the case, fsetpos and fgetpos didn't take; people
    continued using fseek. And C++ went in yet another direction,
    and ended up requiring the impossible in the standard. (The
    standard requires round-trip conversions between streamoff and
    streampos, but it also requires streampos to contain more
    information.)

    IMHO, the real problem is more fundamental: text files and seek
    simply don't mix, and any attempts by the standard to make it
    work are bound to have problems. C (and indirectly C++) sort of
    addresses those problems by limiting the possibilities of
    seeking in a file opened in text mode. The fact that filebuf
    does code translation even in binary mode reintroduces them in
    C++. And somewhere in all that, implementations seem to have
    forgotten that neither streampos nor streamoff are required to
    be integral types. (Or perhaps rather, they don't dare change
    them from their historical types for fear of breaking existing
    code.)

    With regards to size_t: size_t is related to memory size or
    addressability, not file size: there's certainly nothing
    impossible about a 16 bit system allowing files of more than 4
    GB. Posix uses off_t in its standard (but requires it to be an
    integral type---of course, Posix systems have to support long
    long as well). The logical solution is a different type(def).
    Like in fgetpos and fsetpos.

    > [Well, Microsoft told them all, didn't it?] (b) If you need to seek
    > in files larger than 'long' allows, use either 'fsetpos' or some OS
    > specific means. (c) size_t is not a very suitable thing for that,
    > since 'size_t' is for the sizes of objects. I would rather think
    > that 'ptrdiff_t' is a better choice. (d) Don't use C Library for
    > file I/O, use C++ Library, there you'll deal with the special type
    > for the position, 'std::basic_streambuf::pos_type'. And if it's
    > not large enough, complain to the compiler vendor.


    Who also has to deal with existing code:). How many times have
    we seen people implicitly converting streampos (i.e.
    std::streambuf::pos_type) to some integral type?

    Systems have the same problem, with regards to existing code,
    and Sun, for example, offers three or four different options to
    handle it at the Posix level (not all of which are strictly
    Posix-conformant, obviously).

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Nov 17, 2007
    #10
  11. vi

    James Kanze Guest

    On Nov 16, 10:55 pm, "BobR" <> wrote:
    > Markus Moll wrote in message...
    > > What does MSVC++ say about sizeof(long double) vs sizeof(double)?


    > I asked My Second Virtual Cousin (twice added), and he said nothing! <G>


    > In Assembler, I used to use eight-byte(dd) and ten-byte(dt) types. That's
    > not even close to "twice the size" (If we're talking number of bits).
    > [ assembler == a386 ]


    And g++ on a PC, at least in some configurations, uses 8 and 12
    bytes.

    At the hardware level, there are 10 bytes of information in an
    Intel long double. But 10 bytes results in some awkward
    alignments. Microsoft, from what I understand, punts on the
    question by ignoring the hardware long double type (which is
    conforming, if not very useful). G++ originally chose to use 12
    bytes (with 2 garbage bytes) because of alignment
    considerations on some older machines; with modern hardware,
    unless you have 16-byte alignment, you might as well go with
    10. (I'm not sure, but there may also be options to control
    this.)

    Once again, there is no right answer.

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Nov 17, 2007
    #11
  12. James Kanze <> writes:

    > On Nov 16, 10:55 pm, "BobR" <> wrote:
    >> Markus Moll wrote in message...
    >> > What does MSVC++ say about sizeof(long double) vs sizeof(double)?

    >
    >> I asked My Second Virtual Cousin (twice added), and he said nothing! <G>

    >
    >> In Assembler, I used to use eight-byte(dd) and ten-byte(dt) types. That's
    >> not even close to "twice the size" (If we're talking number of bits).
    >> [ assembler == a386 ]

    >
    > And g++ on a PC, at least in some configurations, uses 8 and 12
    > bytes.
    >
    > At the hardware level, there are 10 bytes of information in an
    > Intel long double.


    True, with some additional considerations. The commonly used IEEE 754
    floating point formats are

    single precision: 32 bits including 1 sign bit, 23 significand bits
    (with an implicit leading 1, for 24 total), and 8 exponent bits

    double precision: 64 bits including 1 sign bit, 52 significand bits
    (with an implicit leading 1, for 53 total), and 11 exponent bits

    double extended precision: 80 bits including 1 sign bit, 64 significand
    bits (no implicit leading 1), and 15 exponent bits.
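As a worked example of the single-precision layout, the sketch below pulls the three fields out of a 32-bit pattern. All function names are hypothetical, and `bits_of` assumes that `float` is a 32-bit IEEE 754 type, which the C++ standard does not require.

```cpp
#include <cstdint>
#include <cstring>

// Field extractors for a 32-bit IEEE 754 single-precision bit pattern.
constexpr std::uint32_t sign_field(std::uint32_t bits)     { return bits >> 31; }
constexpr std::uint32_t exponent_field(std::uint32_t bits) { return (bits >> 23) & 0xFFu; }
constexpr std::uint32_t fraction_field(std::uint32_t bits) { return bits & 0x7FFFFFu; }

// Unbiased exponent of a normal number; the bias is 2^(8-1) - 1 = 127.
constexpr int unbiased_exponent(std::uint32_t bits) {
    return static_cast<int>(exponent_field(bits)) - 127;
}

// Copies the bytes of a float into an integer of the same size.
inline std::uint32_t bits_of(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits wide");
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

For instance, 1.0f has the pattern 0x3F800000: sign 0, exponent field 127 (so an unbiased exponent of 0), and a fraction of 0, with the leading 1 implied.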

    The native format of the x87 FPU is "double extended". We run into this
    from time to time when floating point computations compiled with more
    optimization give slightly different results. The issue is that one
    optimization is to hold intermediate results in 80-bit FPU registers
    instead of rounding them down to fit in 64-bit memory locations. gcc
    offers the -ffloat-store option to suppress this optimization for that
    very reason.
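One way to ask the compiler whether it may keep such wider intermediates is the `FLT_EVAL_METHOD` macro. This is a sketch only: the macro comes from C99 and is available through `<cfloat>` since C++11, so it postdates this thread, and the two function names are invented for the example.

```cpp
#include <cfloat>

// Interprets a FLT_EVAL_METHOD value: 0 means operations are evaluated in the
// type of their operands; 1 and 2 mean a wider type is used; negative values
// mean the method is indeterminable, so extra precision cannot be ruled out.
constexpr bool may_keep_excess_precision(int method) {
    return method != 0;
}

// What the current compiler and flags report for this translation unit.
constexpr bool this_build_may_keep_excess_precision() {
    return may_keep_excess_precision(FLT_EVAL_METHOD);
}
```

A value of 2 corresponds to the x87 behaviour described above, where intermediate results are held in extended precision.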

    In addition, the x87 control register allows one to select between
    extended (the default), double and single precision. Try this (on
    Linux/gcc/glibc), for example:

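    #include <fpu_control.h>

    /* Switch the x87 precision-control field from extended to double. */
    void set_double(void)
    {
        fpu_control_t cw;                          /* glibc's control-word type */
        _FPU_GETCW(cw);                            /* read the current control word */
        cw = (cw & ~_FPU_EXTENDED) | _FPU_DOUBLE;  /* clear the field, select double */
        _FPU_SETCW(cw);                            /* write it back */
    }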

    and similarly for "set_extended". Note that this will make all kinds of
    trouble for you because libm depends on having the FPU in extended
    precision mode.

    Now comes the real kicker: the SSE/SSE2/SSE3 vector co-processors do not
    support double extended precision; only single and double precision.
    gcc and icc are both using these co-processors pretty extensively now,
    so you're not guaranteed to get even intermediate results done in
    double-extended arithmetic.

    Going back on-topic, if you want to peek at the binary representations
    of IEEE floating-point numbers, you might enjoy the template included
    below. I supply typedefs for single and double precision; writing the
    typedef for double extended precision is left as an exercise for the
    reader.

    // IEEE Floating-Point template
    // Copyright (C) 2007 Charles M. "Chip" Coldwell <>

    // This program is free software: you can redistribute it and/or modify
    // it under the terms of the GNU General Public License as published by
    // the Free Software Foundation, either version 3 of the License, or
    // (at your option) any later version.

    // This program is distributed in the hope that it will be useful,
    // but WITHOUT ANY WARRANTY; without even the implied warranty of
    // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
    // GNU General Public License for more details.

    // You should have received a copy of the GNU General Public License
    // along with this program. If not, see <http://www.gnu.org/licenses/>.

    #ifndef IEEEFLOAT_HH
    #define IEEEFLOAT_HH

    template
    <typename _float_t, typename _sint_t, typename _uint_t, int _mbits, int _ebits>
    class ieee_float {
    public:
        typedef _uint_t uint_t;
        typedef _sint_t sint_t;
        typedef _float_t float_t;
        enum { mbits = _mbits, ebits = _ebits };

        // Bit-field layout is implementation-defined; the two orders below
        // are the ones this code relies on.
    #ifdef __BIG_ENDIAN__
        uint_t s:1;      // sign
        uint_t e:ebits;  // biased exponent
        uint_t m:mbits;  // stored significand bits
    #else
        uint_t m:mbits;  // stored significand bits
        uint_t e:ebits;  // biased exponent
        uint_t s:1;      // sign
    #endif

        static const uint_t mdenom = ((uint_t)1 << mbits);
        static const uint_t ebias = ((uint_t)1 << (ebits - 1)) - 1;

        sint_t sign(void) const { return 1 - 2*s; }
        sint_t exponent(void) const { return e ? e - ebias : -(ebias - 1); }
        uint_t mantissa(void) const { return ((uint_t)(!!e) << mbits) | m; }
        bool infinity(void) const { return (e == ((1 << ebits) - 1)) && (m == 0); }
        bool nan(void) const { return (e == ((1 << ebits) - 1)) && (m != 0); }
        bool denormal(void) const { return (e == 0) && (m != 0); }

        // These casts reinterpret the object's own storage, so they depend on
        // the compiler tolerating this kind of type punning.
        ieee_float(float_t f) { *reinterpret_cast<float_t *>(this) = f; }
        ieee_float(uint_t u) { *reinterpret_cast<uint_t *>(this) = u; }

        operator float_t() const { return *reinterpret_cast<const float_t *>(this); }
        operator uint_t() const { return *reinterpret_cast<const uint_t *>(this); }
    };

    typedef ieee_float<float, int, unsigned, 23, 8>
        single_precision;
    typedef ieee_float<double, long long, unsigned long long, 52, 11>
        double_precision;

    #endif

    Chip

    --
    Charles M. "Chip" Coldwell
    "Turn on, log in, tune out"
    GPG Key ID: 852E052F
    GPG Key Fingerprint: 77E5 2B51 4907 F08A 7E92 DE80 AFA9 9A8F 852E 052F
     
    Charles Coldwell, Apr 20, 2008
    #12