is 0xE-2 a valid expression?

jacob navia

jameskuyper wrote:
Sorry - I got that backwards. Removing that diagnostic renders lcc-win
non-conforming. If, after generating the required diagnostic, your
compiler then chose to break up the pre-processing token into multiple
tokens, that would be perfectly conforming, since the standard does
not define the behaviour when a pp-number fails to parse as a valid
token in phase 7.

Good. That means that (if and only if) in conforming mode I should
emit a "warning, obscure specs ignored".

I do not know what to say actually. The wording I had was:

Error tg3.c: 3 '0xE-2' is a preprocessing number but an invalid integer constant.

But this is completely obscure. The message of gcc is clearer
(invalid suffix -2 in integer constant).

What wording would be OK with this obscure error?
 
jameskuyper

jacob said:
jameskuyper wrote:

Good. That means that (if and only if) in conforming mode I should
emit a "warning, obscure specs ignored".

I do not know what to say actually. The wording I had was:

Error tg3.c: 3 '0xE-2' is a preprocessing number but an invalid integer constant.

But this is completely obscure. The message of gcc is clearer
(invalid suffix -2 in integer constant).

What wording would be OK with this obscure error?

For me, your original diagnostic would be perfect. It tells me
precisely why it's wrong, since I know precisely what a preprocessing
number is, and how it can fail to be a valid integer constant. The gcc
diagnostic is probably more appropriate for people less familiar with
standardese, which is probably at least 99% of C developers. It tells
them what they need to know in order to fix it, even if they don't
understand why it's wrong.
 
Francois Grieu

jameskuyper said:
If, after generating the required diagnostic, [a] compiler then
chose to break up the pre-processing token into multiple tokens,
that would be perfectly conforming, since the standard does
not define the behaviour when a pp-number fails to parse as a
valid token in phase 7.

What makes the diagnostic required, and why can't the compiler
make that diagnostic empty?

BTW, I wish the title was: is 0x5A3E-2 a valid expression?
since this tends to elicit more (wrong) "yes".

François Grieu
 
jameskuyper

Francois said:
jameskuyper said:
If, after generating the required diagnostic, [a] compiler then
chose to break up the pre-processing token into multiple tokens,
that would be perfectly conforming, since the standard does
not define the behaviour when a pp-number fails to parse as a
valid token in phase 7.

What makes the diagnostic required, and why can't the compiler
make that diagnostic empty?

0xE-2 gets parsed as a single preprocessing token during translation
phase 3, and must therefore be converted to a single token during
translation phase 7, but does not have the correct syntax to be parsed
as a single token of any type. At least one diagnostic must be
generated for any program that contains any syntax errors (5.1.1.3p1).
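
To make the phase 3 tokenization concrete, here is a minimal sketch. The function names are made up for illustration and are not from any post in this thread. The failing form is left in a comment, since a conforming compiler has to diagnose it.

```c
#include <assert.h>

/* int bad = 0xE-2;
 * One pp-number "0xE-2", which is not a valid constant: diagnostic required. */

/* White space ends the pp-number after 0xE, so this is 14 - 2. */
int with_space(void) { return 0xE - 2; }

/* 'F' is not one of e, E, p, P, so '-' cannot join the pp-number: 15 - 2. */
int hex_f(void) { return 0xF-2; }

/* Hypothetical self-check: both spellings above are ordinary subtractions. */
void check_tokenization(void)
{
    assert(with_space() == 12);
    assert(hex_f() == 13);
}
```

The second function shows why the problem is easy to miss: the same spelling without a space is fine for every hexadecimal digit except E.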

However, the standard says nothing about the actual content of any
diagnostics except the ones produced by an #error directive. A
diagnostic is not required to correctly identify the location where
the problem was detected, nor is it required to identify the rule that
has been broken. The diagnostic need not be written in any language
that anyone knows how to read. Only a single diagnostic is required,
regardless of how many different syntax errors or constraint
violations a program contains. An implementation can meet all of the
diagnostic requirements by generating a single diagnostic saying
"Doodely Doo!" for every program it processes, regardless of the
contents of that program.

An empty diagnostic is permitted. However, implementations are
required to document which subset of the implementation's message
output constitutes the diagnostics (3.11p1), so an empty diagnostic must
still be distinguishable from having no diagnostics.
 
Richard Tobin

I do not know what to say actually. The wording I had was:

Error tg3.c: 3 '0xE-2' is a preprocessing number but an invalid integer
constant.

But this is completely obscure. The message of gcc is clearer
(invalid suffix -2 in integer constant).

What wording would be OK with this obscure error?

"Hexadecimal constant cannot have exponent".

-- Richard
 
Seebs

But is there an error in the program in the first place?

Yes. 0xE-2 is a single pp-token, therefore a single token, but it's not
a valid token. The rationale fragment posted pretty much covers it.

-s
 
Keith Thompson

"Hexadecimal constant cannot have exponent".

But a floating hexadecimal constant can have an exponent:

0x1.2p3

The standard's grammar uses the term "hexadecimal-constant"
for integer constants that are hexadecimal, and
"hexadecimal-floating-constant" for floating constants that are
hexadecimal, but the unqualified term "Hexadecimal constant" might
be confusing for someone who knows that C99 supports hexadecimal
floating constants but isn't intimately familiar with the grammar.

Another case to consider is:

0x1.2e-3

which, again, is a valid pp-number but not a valid constant.
gcc's message for this is "error: hexadecimal floating constants
require an exponent", which is potentially misleading; the programmer
probably intended the 'e' to introduce an exponent, not to be
a hexadecimal digit. (That's why floating-point hexadecimal
constants use 'p' rather than 'e' to introduce the exponent.)
It might be worth going to some extra effort to detect this kind
of special case and make a better guess at what the user meant.
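
For reference, a small sketch of the valid hexadecimal floating forms, assuming a C99 or later compiler. The function names are hypothetical; the invalid spelling is kept in a comment because it requires a diagnostic.

```c
#include <assert.h>

/* 0x1.2p3: the hexadecimal significand 0x1.2 is 1.125, and p3 scales it by 2 cubed. */
double hex_float(void) { return 0x1.2p3; }

/* A sign may directly follow p, because p- is allowed inside a pp-number. */
double hex_half(void) { return 0x1p-1; }

/* double bad = 0x1.2e-3;
 * One pp-number, but 'e' is a hex digit here and there is no p exponent. */

/* Hypothetical self-check; both values are exactly representable. */
void check_hex_floats(void)
{
    assert(hex_float() == 9.0);
    assert(hex_half() == 0.5);
}
```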
 
Seebs

I assume MSVC is non-conforming, as it didn't emit any diagnostic?

MSVC is made by the people whose documentation claims that, in C,
parentheses control order of evaluation.

I think the subordinate clause in your question is redundant.

I cannot conceive of circumstances under which I would expect that a
Microsoft product genuinely conformed to any standard. (And yes, I'm
aware of the bastardized "standardization" of one of their own internal
XML formats.)

-s
 
Keith Thompson

Kenneth Brody said:
Given the 1-line source file:

int i = 0xE-2;

gcc version "egcs-2.91.66" says:

usenet.c:1: missing white space after number `0xE'

Microsoft Visual C 14.00.50727.762 and 15.00.21022.08 say:

Nothing. They compile without warning, even with max warning level,
and treat it as "0xE minus 2".

I assume MSVC is non-conforming, as it didn't emit any diagnostic?

It looks that way, yes.

The definition of a pp-number changed slightly between C90 and C99,
but 0xE-2 is a pp-number in both versions of the standard.

To summarize what the grammar says: In C90 a pp-number is a decimal
digit, or a '.' followed by a decimal digit, followed by zero or more
of the following:
decimal digit
underscore
uppercase or lowercase letters
e-
e+
E-
E+
.

C99 adds to this list:
p-
p+
P-
P+
plus universal-character-name and any other implementation-defined
character that can appear in an identifier.

(Yes, E and P are letters, and can therefore appear in a pp-number;
the point of the special cases is that '-' and '+' can appear in
a pp-number only if they immediately follow one of e, E, p, P.)

All valid numeric (integer or floating) constants are pp-numbers,
but not all pp-numbers are valid numeric constants.
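
The grammar summarized above can be sketched as a small scanner. This is only an illustration under stated assumptions: `pp_number_len` is a made-up name, it handles the basic source characters only (no universal-character-names or implementation-defined identifier characters), and it follows the C99 list including p and P. Note that the grammar requires a digit after a leading '.'.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Returns the length of the longest pp-number at the start of s, or 0 if none. */
size_t pp_number_len(const char *s)
{
    size_t i;

    /* A pp-number starts with a digit, or with '.' followed by a digit. */
    if (isdigit((unsigned char)s[0])) {
        i = 1;
    } else if (s[0] == '.' && isdigit((unsigned char)s[1])) {
        i = 2;
    } else {
        return 0;
    }

    for (;;) {
        unsigned char c = (unsigned char)s[i];
        unsigned char prev = (unsigned char)s[i - 1];

        if (isalnum(c) || c == '_' || c == '.') {
            i++;    /* digit, letter, underscore, or '.' */
        } else if ((c == '+' || c == '-') &&
                   (prev == 'e' || prev == 'E' || prev == 'p' || prev == 'P')) {
            i++;    /* a sign is allowed only directly after e, E, p, or P */
        } else {
            break;  /* anything else ends the pp-number */
        }
    }
    return i;
}

/* Hypothetical self-check for the two spellings discussed in this thread. */
void check_pp_number(void)
{
    assert(pp_number_len("0xE-2") == 5);  /* the whole thing is one pp-number */
    assert(pp_number_len("0xF-2") == 3);  /* stops before '-' */
}
```

A scanner like this only finds the extent of the preprocessing token. Whether that token is also a valid integer or floating constant is a separate check made later, which is exactly where 0xE-2 fails.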
 
Eric Sosman

Kenneth said:
Given the 1-line source file:

int i = 0xE-2;

gcc version "egcs-2.91.66" says:

usenet.c:1: missing white space after number `0xE'

Microsoft Visual C 14.00.50727.762 and 15.00.21022.08 say:

Nothing. They compile without warning, even with max warning level,
and treat it as "0xE minus 2".

I assume MSVC is non-conforming, as it didn't emit any diagnostic?

Not sure. 6.6p10 has what might be an escape clause:

"An implementation may accept other forms of constant
expressions."

... and it's not (immediately) clear that "other forms" implies
"other syntactically valid forms."
 
Tim Rentsch

Eric Sosman said:
Not sure. 6.6p10 has what might be an escape clause:

"An implementation may accept other forms of constant
expressions."

... and it's not (immediately) clear that "other forms" implies
"other syntactically valid forms."

I puzzled over this question for a little while. I now think
an answer is supplied by 5.1.1.3p1, regarding Diagnostics:

A conforming implementation shall produce at least one
diagnostic message (identified in an implementation-defined
manner) if a preprocessing translation unit or translation
unit contains a violation of any syntax rule or constraint,
__even if the behavior is also explicitly specified as
undefined or implementation-defined__. [emphasis added]

Note especially the 'implementation-defined' part. Seems pretty
airtight.

Given that, what's the point of allowing other forms of constant
expressions? I believe 6.6p10 is included to allow a small
window of cases to be accepted without getting a diagnostic,
namely, those cases that otherwise are syntactically correct and
have no constraint violations, and have a constraint that some
portion of an expression or a statement be a constant-expression
(or integer constant expression). In such cases, the constraint
can be met by an implementation-defined form of CE or ICE, hence
there are no constraint violations, hence no diagnostic is
required.
 
Seebs

Given that, what's the point of allowing other forms of constant
expressions?

Historically, probably, to let people do things like hex float constants.
But even now, there's nothing wrong with accepting and supporting other
kinds of constant expressions -- just that some of them need a diagnostic
in conforming mode. I guess the point would be local convenience or color.

-s
 
Tim Rentsch

Seebs said:
Historically, probably, to let people do things like hex float constants.
But even now, there's nothing wrong with accepting and supporting other
kinds of constant expressions -- just that some of them need a diagnostic
in conforming mode. I guess the point would be local convenience or color.

Right, but why bother to have the standard specifically allow it,
since they could be accepted anyway (as one form of undefined
behavior)? It makes sense to mention it only if having it there
allows some inputs to be accepted without having to give a
diagnostic.
 
Keith Thompson

Seebs said:
Historically, probably, to let people do things like hex float constants.

Or 0b11001001. Or 123_456_789.

But even now, there's nothing wrong with accepting and supporting other
kinds of constant expressions -- just that some of them need a diagnostic
in conforming mode. I guess the point would be local convenience or color.

Note also that the standard allows additional forms of constant
expressions, not just constants, potentially permitting things like:

const int x = 2;
const int y = sqrt(2.0);
x + y; /* implementation-defined constant expression */

Still, I'm not sure this isn't already covered by the general
permission to provide extensions.
 
Seebs

Right, but why bother to have the standard specifically allow it,
since they could be accepted anyway (as one form of undefined
behavior)? It makes sense to mention it only if having it there
allows some inputs to be accepted without having to give a
diagnostic.

I would say:
* No one actually expects you to ever run a compiler in a completely
strictly conforming mode unless you're prepping to move to a new
compiler.
* I think it's a very strong hint to the implementation to DEFINE
any additional things, rather than just silently accepting them.

Basically, if you provided such a thing, and did not define it or yield
a diagnostic, you're clearly a Bad Compiler.

If you define it in your docs as an additional form of constant, I think you
still have to have a mode where you emit a diagnostic to be considered
conforming in that mode, but no one is expected to use that mode.

Basically, don't place the emphasis on "accept" -- place it on "implementation
defined", which specifically requires that the decision be documented.

-s
 
Keith Thompson

Tim Rentsch said:
Right, but why bother to have the standard specifically allow it,
since they could be accepted anyway (as one form of undefined
behavior)? It makes sense to mention it only if having it there
allows some inputs to be accepted without having to give a
diagnostic.

Well, it's not exactly a formal specification. 8-)}

This isn't the only example of redundancy in the standard. Another is
the explicit permission to accept alternative declarations for main(),
something that's already covered by the general permission to provide
extensions.

As for not having to give a diagnostic, somebody else already
mentioned C99 5.1.1.3p1:

A conforming implementation shall produce at least one
diagnostic message (identified in an implementation-defined
manner) if a preprocessing translation unit or translation
unit contains a violation of any syntax rule or constraint,
even if the behavior is also explicitly specified as undefined
or implementation-defined.

I don't see much wiggle room. (Well, I might make an argument
based on the wording of 6.6p10, "An implementation may accept
other forms of constant expressions", which doesn't *explicitly*
specify any behavior as undefined or implementation-defined, but
I think that's too picky even for me.)
 
Tim Rentsch

Seebs said:
I would say:
* No one actually expects you to ever run a compiler in a completely
strictly conforming mode unless you're prepping to move to a new
compiler.

Really? I run gcc using -ansi -pedantic or -std=c99 -pedantic
basically all the time.

* I think it's a very strong hint to the implementation to DEFINE
any additional things, rather than just silently accepting them.

I'm sure some implementors would take it that way, but I haven't
seen any text in the actual Standard that suggests that's what
the authors intended. It isn't listed in Annex J.3, for example.

Basically, if you provided such a thing, and did not define it or yield
a diagnostic, you're clearly a Bad Compiler.

I don't see any text in the Standard that indicates it takes this
position. As a personal reaction I might agree, but I don't see
where the Standard either requires it or expects it.

If you define it in your docs as an additional form of constant, I think you
still have to have a mode where you emit a diagnostic to be considered
conforming in that mode, but no one is expected to use that mode.

I don't see anything in the Standard that requires implementations
to document other forms of constant expressions that they accept.
I also don't see anything in the Standard that requires a diagnostic
for use of an implementation-specific constant expression (assuming
no constraint violations or syntax errors are present otherwise).

Basically, don't place the emphasis on "accept" -- place it on "implementation
defined", which specifically requires that the decision be documented.

I think if you look again you will see that accepting other
forms of constant expressions is not implementation-defined
behavior.
 
Tim Rentsch

Seebs said:
Right, but why bother to have the standard specifically allow it,
since they could be accepted anyway (as one form of undefined
behavior)? It makes sense to mention it only if having it there
allows some inputs to be accepted without having to give a
diagnostic.
[snip]

If you define it in your docs as an additional form of constant, I think you
still have to have a mode where you emit a diagnostic to be considered
conforming in that mode, but no one is expected to use that mode.

Basically, don't place the emphasis on "accept" -- place it on "implementation
defined", which specifically requires that the decision be documented.

An amendment to my last posting.

It's possible (and not unreasonable) to consider additional
forms of constant expressions accepted as "extensions",
and if so they must be documented. I don't see any text
in the Standard that clearly says such things are extensions,
but certainly that's one reasonable interpretation. Under
that interpretation these other forms would have to be
documented.

However, that doesn't change the situation with respect
to needing a diagnostic.
 
Tim Rentsch

Keith Thompson said:
Well, it's not exactly a formal specification. 8-)}

This isn't the only example of redundancy in the standard. Another is
the explicit permission to accept alternative declarations for main(),
something that's already covered by the general permission to provide
extensions.

It's not clear to me that strictly speaking this is a redundancy.
I don't think it was meant as a redundancy. In any case that
question isn't important to support the point I'm making.

As for not having to give a diagnostic, somebody else already
mentioned C99 5.1.1.3p1:

A conforming implementation shall produce at least one
diagnostic message (identified in an implementation-defined
manner) if a preprocessing translation unit or translation
unit contains a violation of any syntax rule or constraint,
even if the behavior is also explicitly specified as undefined
or implementation-defined.

I don't see much wiggle room. (Well, I might make an argument
based on the wording of 6.6p10, "An implementation may accept
other forms of constant expressions", which doesn't *explicitly*
specify any behavior as undefined or implementation-defined, but
I think that's too picky even for me.)

Ahh, but here's the thing. If another form of conditional-expression
is accepted as a constant expression, it's possible for a program to
have _no_ violation of any syntax rule or constraint even though it
is using one of the other forms of constant expression (for example,
as an array bound, bit-field width, or 'case' expression).

Is 6.6p10 talking about implementation-defined behavior? I think it's
pretty clear that it isn't. 6.6p10 does not describe it as such, and
it isn't mentioned in the summary of I-D behaviors given in Annex J.3.

Hence, with no explicit statement of undefined or implementation-defined
behavior, and with the expression in question being a constant
expression, and with no violations of syntax rules or constraints
otherwise, a constraint that specifies some expression must be
a constant expression is satisfied, and therefore no diagnostic
is required.
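
For readers who want to see the three contexts named above (array bound, bit-field width, 'case' expression), here is a minimal sketch. It uses only a standard integer constant expression, since any implementation-specific form would by definition not be portable; all names are hypothetical.

```c
#include <assert.h>

/* A standard integer constant expression: 14 - 2, spelled with spaces. */
enum { WIDTH = 0xE - 2 };

/* Bit-field width: constrained to be an integer constant expression. */
struct flags { unsigned int low : WIDTH; };

/* Array bound at file scope: must also be an integer constant expression. */
static int table[WIDTH];

/* Case label: the same constraint applies. */
int classify(int n)
{
    switch (n) {
    case WIDTH:
        return 1;
    default:
        return 0;
    }
}

/* Hypothetical self-check tying the three uses together. */
void check_constant_contexts(void)
{
    assert(sizeof table / sizeof table[0] == 12);
    assert(classify(WIDTH) == 1);
}
```

Each of these is a constraint that an expression be constant. If an implementation accepts some other form as a constant expression in one of these positions, the argument above is that the constraint is met and so nothing triggers 5.1.1.3p1.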
 
