integer literals

Discussion in 'C++' started by Armen Tsirunyan, Sep 26, 2010.

  1. Hi all.
    Consider the following program
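    #include <iostream>
    #include <typeinfo>

    // Print the implementation-defined name of the deduced argument type.
    template <class T>
    void f(T x)
    {
        std::cout << typeid(x).name() << std::endl;
    }

    int main()
    {
        f(2000000000);   // fits in a 32-bit int
        f(3000000000u);  // 'u' suffix: an unsigned type
        f(3000000000);   // no suffix, too large for a 32-bit int or long
    }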

    I am using Microsoft Visual Studio 2008 and on my machine int and long
    are both 32 bits.
    As far as I understood from the 2003 C++ standard, this program should
    print

    int
    unsigned int
    unsigned int

    however it prints

    int
    unsigned int
    unsigned long

    Is this a bug in MSVC 9.0, or have I misinterpreted the standard?
    Also, please note that I am aware that the standard imposes no
    considerable requirements on type_info::name(). So please let's not
    say "the output is correct since name can be anything".
    Also, if this indeed is a bug in MSVC (this is a bit off-topic now), is
    there any use in reporting that bug to them? I mean, do they care? :)
    Thank you in advance for your comments,
    Armen Tsirunyan.
     
    Armen Tsirunyan, Sep 26, 2010
    #1

  2. Kai-Uwe Bux Guest

    Armen Tsirunyan wrote:

    > Hi all.
    > Consider the following program
    > #include <iostream>
    > #include <typeinfo>
    >
    > template <class T>
    > void f(T x)
    > {
    > std::cout << typeid(x).name() << std::endl;
    > }
    > int main()
    > {
    > f(2000000000);
    > f(3000000000u);
    > f(3000000000);
    > }
    >
    > I am using Microsoft Visual Studio 2008 and on my machine int and long
    > are both 32 bits.
    > As far as I understood from the 2003 C++ standard, this program should
    > print
    >
    > int
    > unsigned int
    > unsigned int
    >
    > however it prints
    >
    > int
    > unsigned int
    > unsigned long
    >
    > Is this a bug of MSVC9.0 or I have misinterpreted the standard?


    From the standard [2.13.1/2]

    The type of an integer literal depends on its form, value, and suffix. If
    it is decimal and has no suffix, it has the first of these types in which
    its value can be represented: int, long int; if the value cannot be
    represented as a long int, the behavior is undefined. If it is octal or
    hexadecimal and has no suffix ...

    The last literal seems to be decimal, without suffix, and not representable
    as long int. My reading is that the program has UB.

    Now, as a matter of implementation quality, I would strongly expect a
    compilation error. The standard also gives license for that (one could even
    read it as a requirement) [5/5]:

    If during the evaluation of an expression, the result is not
    mathematically defined or not in the range of representable
    values for its type, the behavior is undefined, unless such an expression
    is a constant expression (5.19), in which case the program is ill-formed.


    > Also, please note that I am aware that the standard imposes no
    > considerable requirements on type_info::name(). So please let's not
    > say "the output is correct since name can be anything".
    > Also, if this indeed is a bug of MSVC (this is a bit off-top now) is
    > there any use reporting that bug to them, I mean do they care? :)
    > Thank you in advance for your comments,
    > Armen Tsirunyan.



    Best

    Kai-Uwe Bux
     
    Kai-Uwe Bux, Sep 26, 2010
    #2


  3. > From the standard [2.13.1/2]
    >
    >   The type of an integer literal depends on its form, value, and suffix. If
    >   it is decimal and has no suffix, it has the first of these types in which
    >   its value can be represented: int, long int; if the value cannot be
    >   represented as a long int, the behavior is undefined. If it is octal or
    >   hexadecimal and has no suffix ...
    >
    > The last literal seems to be decimal, without suffix, and not representable
    > as long int. My reading is that the program has UB.
    >


    Yes, you are right, I just saw the part where it said the first of
    int, unsigned int, long, unsigned long, and missed the part which said
    that this referred to octal and hexadecimal literals :)

    > Now, as a matter of implementation quality, I would strongly expect a
    > compilation error.


    How would you expect a compilation error if the standard says it's
    undefined behavior?

    > The standard also gives license for that (one could even
    > read it as a requirement) [5/5]:
    >


    License for what? For treating undefined behavior as a compilation
    error?
    Is this true or false? "Undefined behavior refers to syntactically
    well-formed programs. The behavior of the program is undefined, not
    the behavior of the compiler."

    Thanks,
    Armen Tsirunyan
     
    Armen Tsirunyan, Sep 26, 2010
    #3
  4. Armen Tsirunyan <>, on 26/09/2010 06:06:12, wrote:

    >
    >> From the standard [2.13.1/2]
    >>
    >> The type of an integer literal depends on its form, value, and suffix. If
    >> it is decimal and has no suffix, it has the first of these types in which
    >> its value can be represented: int, long int; if the value cannot be
    >> represented as a long int, the behavior is undefined. If it is octal or
    >> hexadecimal and has no suffix ...
    >>
    >> The last literal seems to be decimal, without suffix, and not representable
    >> as long int. My reading is that the program has UB.
    >>

    >
    > Yes, you are right, I just saw the part where it said the first of
    > int, unsigned int, long, unsigned long, and missed the part which said
    > that this referred to octal and hexadecimal literals :)
    >
    >> Now, as a matter of implementation quality, I would strongly expect a
    >> compilation error.

    >
    > How would you expect a compilation error if the standard says it's
    > undefined behavior?
    >
    >> The standard also gives license for that (one could even
    >> read it as a requirement) [5/5]:
    >>

    >
    > License for what? For treating undefined behavior as a compilation
    > error?
    > Is this true or false? "Undefined behavior refers to syntactically
    > well-formed programs. The behavior of the program is undefined, not
    > the behavior of the compiler."


    Kai-Uwe referred to this part in particular:
    "unless such an expression is a constant expression (5.19), in which
    case the program is ill-formed."

    Since in your case we have a constant expression which cannot be
    represented by a long int, a reading of the standard could lead to the
    compiler rejecting your compilation unit as ill-formed. Honestly, I
    think this is the only reasonable reading.

    Please keep attribution lines in place when quoting the messages you're
    replying to.

    --
    Francesco S. Carta
    http://fscode.altervista.org
     
    Francesco S. Carta, Sep 26, 2010
    #4
  5. Pete Becker <>, on 26/09/2010 09:28:10, wrote:

    > On 2010-09-26 09:16:30 -0400, Francesco S. Carta said:
    >
    >> [...]
    >>
    >> Kai-Uwe referred to this part in particular:
    >> "unless such an expression is a constant expression (5.19), in which
    >> case the program is ill-formed."
    >>
    >> Since in your case we have a constant expression which cannot be
    >> represented by a long int, a reading of the standard could lead to the
    >> compiler rejecting your compilation unit as ill-formed. Honestly, I
    >> think this is the only reasonable reading.
    >>

    >
    > Right, in a compiler that strictly enforces C++03's rules. But most
    > compilers these days have long long and unsigned long long, and apply
    > the C++0x rules for interpreting integer literals; they're the obvious
    > analog to the C++03 rules.


    My MinGW 4.4.0, despite having long long, which (I think) should be
    considered more fitting (as the original literal reported by the OP has
    no "u" suffix), interprets it as an unsigned long. Furthermore, it
    reports a warning saying that "this decimal constant is unsigned only
    in ISO C90".

    I have no idea why it is using a C90 rule here.

    --
    Francesco S. Carta
    http://fscode.altervista.org
     
    Francesco S. Carta, Sep 26, 2010
    #5
  6. Pete Becker <>, on 26/09/2010 10:23:31, wrote:

    > On 2010-09-26 10:16:35 -0400, Francesco S. Carta said:
    >
    >> [...]

    >>
    >> My MinGW 4.4.0, despite having long long which (I think) should be
    >> considered more fitting (as the original literal reported by the OP
    >> has no "u" suffix), interprets it as an unsigned long, furthermore it
    >> reports a warning telling that "this decimal constant is unsigned only
    >> in ISO C90".
    >>
    >> I have no idea about why it is using a C90 rule here.

    >
    > Hmm, I don't either. The rule in C++0x is the first that fits from int,
    > long int, long long int. Well, maybe gcc hasn't implemented the C++0x
    > rules yet.


    That really seems to be so: using compiler flags such as -std=c++0x or
    -std=gnu++0x doesn't change anything; that warning still gets raised and
    that literal still gets interpreted as an unsigned long.

    --
    Francesco S. Carta
    http://fscode.altervista.org
     
    Francesco S. Carta, Sep 26, 2010
    #6
  7. * Francesco S. Carta, on 26.09.2010 16:40:
    > Pete Becker <>, on 26/09/2010 10:23:31, wrote:
    >
    >> Hmm, I don't either. The rule in C++0x is the first that fits from int,
    >> long int, long long int. Well, maybe gcc hasn't implemented the C++0x
    >> rules yet.

    >
    > That really seems to be so: using compiler flags such as -std=c++0x or
    > -std=gnu++0x don't change anything, that warning still gets raised and that
    > literal still gets interpreted as an unsigned long.


    I can confirm that for MinGW g++ 4.4.1.


    Cheers,

    - Alf

    --
    blog at <url: http://alfps.wordpress.com>
     
    Alf P. Steinbach /Usenet, Sep 26, 2010
    #7
  8. Pete Becker <> wrote:
    > behavior, such as might arise upon use of an erroneous program
    > construct or erroneous data, for which this International Standard
    > imposes no requirements. Undefined behavior may also be expected
    > when this International Standard omits the description of any explicit
    > definition of behavior. [Note: permissible undefined behavior ranges
    > from ignoring the situation completely with unpredictable results, to
    > behaving during translation or program execution in a documented
    > manner characteristic of the environment (with or without the
    > issuance of a diagnostic message), to terminating a translation
    > or execution (with the issuance of a diagnostic message). Many
    > erroneous program constructs do not engender undefined behavior;
    > they are required to be diagnosed. --end note ]


    Btw, what's the standard definition of "ill-formed"?
     
    Juha Nieminen, Sep 26, 2010
    #8
  9. >   Btw, what's the standard definition of "ill-formed"?

    The 2003 standard, clause 1.3.14, defines "well-formed program" as:
    a C++ program constructed according to the syntax rules, diagnosable
    semantic rules, and the One Definition Rule.

    I guess a program is ill-formed if it is not well-formed. Am I right?
     
    Armen Tsirunyan, Sep 26, 2010
    #9
  10. Kai-Uwe Bux Guest

    Juha Nieminen wrote:

    > Pete Becker <> wrote:
    >> behavior, such as might arise upon use of an erroneous program
    >> construct or erroneous data, for which this International Standard
    >> imposes no requirements. Undefined behavior may also be expected
    >> when this International Standard omits the description of any
    >> explicit definition of behavior. [Note: permissible undefined
    >> behavior ranges from ignoring the situation completely with
    >> unpredictable results, to behaving during translation or program
    >> execution in a documented manner characteristic of the environment
    >> (with or without the issuance of a diagnostic message), to
    >> terminating a translation or execution (with the issuance of a
    >> diagnostic message). Many erroneous program constructs do not
    >> engender undefined behavior; they are required to be diagnosed.
    >> --end note ]

    >
    > Btw, what's the standard definition of "ill-formed"?


    Not well-formed :)

    From the standard [1.3.4]:

    1.3.4 ill-formed program [defns.ill.formed]
    input to a C++ implementation that is not a well-formed program (1.3.14).


    Best

    Kai-Uwe Bux

    Just in case you now wonder what a well-formed program is:

    1.3.14 well-formed program [defns.well.formed]
    a C++ program constructed according to the syntax rules, diagnosable
    semantic rules, and the One Definition Rule (3.2).
     
    Kai-Uwe Bux, Sep 26, 2010
    #10
  11. Kai-Uwe Bux <> wrote:
    >> Btw, what's the standard definition of "ill-formed"?

    >
    > Not well-formed :)
    >
    > From the standard [1.3.4]:
    >
    > 1.3.4 ill-formed program [defns.ill.formed]
    > input to a C++ implementation that is not a well-formed program (1.3.14).
    >
    >
    > Best
    >
    > Kai-Uwe Bux
    >
    > Just in case you now wonder what a well-formed program is:
    >
    > 1.3.14 well-formed program [defns.well.formed]
    > a C++ program constructed according to the syntax rules, diagnosable
    > semantic rules, and the One Definition Rule (3.2).


    I seem to remember some constructs which are ill-formed but which would
    nevertheless still compile (although I can't remember any concrete examples
    right now). That definition would seem to imply that an ill-formed program
    cannot even compile.
     
    Juha Nieminen, Sep 26, 2010
    #11
  12. Kai-Uwe Bux Guest

    Juha Nieminen wrote:

    > Kai-Uwe Bux <> wrote:
    >>> Btw, what's the standard definition of "ill-formed"?

    >>
    >> Not well-formed :)
    >>
    >> From the standard [1.3.4]:
    >>
    >> 1.3.4 ill-formed program [defns.ill.formed]
    >> input to a C++ implementation that is not a well-formed program
    >> (1.3.14).
    >>
    >>
    >> Best
    >>
    >> Kai-Uwe Bux
    >>
    >> Just in case you now wonder what a well-formed program is:
    >>
    >> 1.3.14 well-formed program [defns.well.formed]
    >> a C++ program constructed according to the syntax rules, diagnosable
    >> semantic rules, and the One Definition Rule (3.2).

    >
    > I seem to remember some constructs which are ill-formed but which would
    > nevertheless still compile (although I can't remember any concrete
    > examples right now). That definition would seem to imply that an
    > ill-formed program cannot even compile.


    It does not imply that: off the top of my head, violations of the One
    Definition Rule do not require a diagnostic. Maybe there are more cases
    like this.


    Best

    Kai-Uwe Bux
     
    Kai-Uwe Bux, Sep 26, 2010
    #12
  13. James Kanze Guest

    On Sep 26, 1:49 pm, Kai-Uwe Bux <> wrote:
    > Armen Tsirunyan wrote:


    > > Consider the following program
    > > #include <iostream>
    > > #include <typeinfo>


    > > template <class T>
    > > void f(T x)
    > > {
    > > std::cout << typeid(x).name() << std::endl;
    > > }
    > > int main()
    > > {
    > > f(2000000000);
    > > f(3000000000u);
    > > f(3000000000);
    > > }


    > > I am using Microsoft Visual Studio 2008 and on my machine
    > > int and long are both 32 bits. As far as I understood from
    > > the 2003 C++ standard, this program should print


    > > int
    > > unsigned int
    > > unsigned int


    > > however it prints


    > > int
    > > unsigned int
    > > unsigned long


    > > Is this a bug of MSVC9.0 or I have misinterpreted the standard?


    > From the standard [2.13.1/2]


    > The type of an integer literal depends on its form, value, and suffix. If
    > it is decimal and has no suffix, it has the first of these types in which
    > its value can be represented: int, long int; if the value cannot be
    > represented as a long int, the behavior is undefined. If it is octal or
    > hexadecimal and has no suffix ...


    > The last literal seems to be decimal, without suffix, and not representable
    > as long int. My reading is that the program has UB.


    > Now, as a matter of implementation quality, I would strongly expect a
    > compilation error.


    As a matter of implementation quality, I would expect the
    compiler to implement long long, and use that.

    (The intent is clear: in no case should a decimal constant be
    interpreted as signed. The existing practice is just as clear:
    use unsigned long if long doesn't fit. The compromise is to
    make it undefined behavior, and so allow both.)

    --
    James Kanze
     
    James Kanze, Sep 27, 2010
    #13
  14. Kai-Uwe Bux Guest

    James Kanze wrote:

    > On Sep 26, 1:49 pm, Kai-Uwe Bux <> wrote:
    >> Armen Tsirunyan wrote:

    >
    >> > Consider the following program
    >> > #include <iostream>
    >> > #include <typeinfo>

    >
    >> > template <class T>
    >> > void f(T x)
    >> > {
    >> > std::cout << typeid(x).name() << std::endl;
    >> > }
    >> > int main()
    >> > {
    >> > f(2000000000);
    >> > f(3000000000u);
    >> > f(3000000000);
    >> > }

    >
    >> > I am using Microsoft Visual Studio 2008 and on my machine
    >> > int and long are both 32 bits. As far as I understood from
    >> > the 2003 C++ standard, this program should print

    >
    >> > int
    >> > unsigned int
    >> > unsigned int

    >
    >> > however it prints

    >
    >> > int
    >> > unsigned int
    >> > unsigned long

    >
    >> > Is this a bug of MSVC9.0 or I have misinterpreted the standard?

    >
    >> From the standard [2.13.1/2]

    >
    >> The type of an integer literal depends on its form, value, and suffix.
    >> If it is decimal and has no suffix, it has the first of these types in
    >> which its value can be represented: int, long int; if the value cannot
    >> be represented as a long int, the behavior is undefined. If it is octal
    >> or hexadecimal and has no suffix ...

    >
    >> The last literal seems to be decimal, without suffix, and not
    >> representable as long int. My reading is that the program has UB.

    >
    >> Now, as a matter of implementation quality, I would strongly expect a
    >> compilation error.

    >
    > As a matter of implementation quality, I would expect the
    > compiler to implement long long, and use that.


    That would be a very bad C++03 compiler. I would expect it to use long long
    when invoked as a C++0x compiler. And then, I would expect the compiler to
    barf at some higher literals.

    > (The intent is clear: in no case should a decimal constant be
    > interpreted as signed.


    Huh? Is that the reason why f(2000000000) prints int? You mean *un*signed,
    right?

    > The existing practice is just as clear:
    > use unsigned long is long doesn't fit. The compromise is to
    > make it undefined behavior, and so allow both.)


    But from the point of view of a programmer, undefined behavior is not a
    license but a prohibition. It is a license only from the point of view of
    the implementor. I agree that an implementor might go for UB and ignore
    [5/5] -- I just consider that an indication of laziness on the
    implementor's part.


    Best

    Kai-Uwe Bux
     
    Kai-Uwe Bux, Sep 27, 2010
    #14
  15. James Kanze Guest

    On Sep 27, 5:34 pm, Kai-Uwe Bux <> wrote:
    > James Kanze wrote:
    > > On Sep 26, 1:49 pm, Kai-Uwe Bux <> wrote:


    [...]
    > > As a matter of implementation quality, I would expect the
    > > compiler to implement long long, and use that.


    > That would be a very bad C++03 compiler. I would expect it to
    > use long long when invoked as a C++0x compiler. And then,
    > I would expect the compiler to barf at some higher literals.


    That would be a very usable C++ compiler. If invoked in strict
    mode, an error is called for, but from a usability point of
    view, I would expect support for long long to be the default,
    and an option (or combination of options) for strict mode
    except for long long. (It's not often that I suggest a default
    other than strict conformance. But in this case, it seems more
    appropriate.)

    > > (The intent is clear: in no case should a decimal constant be
    > > interpreted as signed.


    > Huh? is that the reason why f(2000000000) prints int? You mean
    > *un*signed, right?


    Yes. I miscounted my negations.

    > > The existing practice is just as clear:
    > > use unsigned long is long doesn't fit. The compromise is to
    > > make it undefined behavior, and so allow both.)


    > But from the point of view of a programmer, undefined behavior
    > is not a license but a prohibition. It is a license only from
    > the point of view of the implementor. I agree that an
    > implementor might go for UB and ignore [5/5] -- I just
    > consider that an indication of laziness on the implementors
    > part.


    In this case, I don't think it's laziness so much as doing what
    we've always done, to avoid breaking code.

    --
    James Kanze
     
    James Kanze, Sep 27, 2010
    #15