lcc-win32 conformance question

Ben Bacarisse

Chris Torek said:
<[email protected]> wrote:

Not from this example. The reason is that names prefixed with
various kinds of underscores (including single underscores followed
by uppercase, and double underscores) are reserved to the
implementation, and once you use an "implementation keyword", the
entire "contract", as it were, that the Standard provides to the
C programmer is terminated, at least in principle.

That is an interesting (and new to me) viewpoint. It means that the
standard does require that a conforming implementation should be able
to be told to reject all extensions. Obviously an implementation may
offer this help, but I can't be sure that turning on all the standards
flags will do it.

Allowing all bets to be off after a #pragma is not so worrying to me --
I can grep for those. I can't search for examples of __?words that are
not "standard" with anything like the same ease.
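
For what it's worth, a search for reserved-pattern names can be approximated with a small scanner. What follows is only a rough sketch: the names `count_suspect_identifiers`, `standard_names` and `scanner_demo` are made up for illustration, the whitelist of Standard-defined names is deliberately incomplete, and the scanner makes no attempt to skip comments, string literals or macro expansions.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, incomplete whitelist of reserved-pattern names that
   C99 itself defines. */
static const char *const standard_names[] = {
    "__func__", "__FILE__", "__LINE__", "__DATE__", "__TIME__",
    "__STDC__", "__STDC_VERSION__", "__STDC_HOSTED__", "__VA_ARGS__",
    "_Bool", "_Complex", "_Imaginary", "_Pragma"
};

static int is_ident_start(int c) { return isalpha(c) || c == '_'; }
static int is_ident_char(int c)  { return isalnum(c) || c == '_'; }

static int is_standard_name(const char *s, size_t len)
{
    size_t i;
    for (i = 0; i < sizeof standard_names / sizeof standard_names[0]; i++) {
        if (strlen(standard_names[i]) == len &&
            memcmp(standard_names[i], s, len) == 0)
            return 1;
    }
    return 0;
}

/* Counts identifiers spelled __x or _X that are not on the whitelist.
   Naive: comments and string literals are scanned like ordinary code. */
size_t count_suspect_identifiers(const char *src)
{
    size_t count = 0;
    const char *p = src;

    while (*p != '\0') {
        if (is_ident_start((unsigned char)*p)) {
            const char *start = p;
            size_t len;
            while (is_ident_char((unsigned char)*p))
                p++;
            len = (size_t)(p - start);
            if (len >= 2 && start[0] == '_' &&
                (start[1] == '_' || isupper((unsigned char)start[1])) &&
                !is_standard_name(start, len))
                count++;
        } else if (isdigit((unsigned char)*p)) {
            /* Skip a number as a unit so its suffix is not read as a name. */
            while (is_ident_char((unsigned char)*p) || *p == '.')
                p++;
        } else {
            p++;
        }
    }
    return count;
}

/* Usage sketch: the declaration discussed in this thread has one hit. */
void scanner_demo(void)
{
    assert(count_suspect_identifiers(
               "__attribute__((unused)) static int a;") == 1);
}
```

A real tool would want to run after preprocessing, or at least understand comments and literals, which is exactly why this is harder than grepping for #pragma.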
 
Chris Torek

(Incidentally, I am not sure whether one can at least expect
the file *up to that point* to respect C syntax.)

That is an interesting (and new to me) viewpoint. It means that the
standard does require that a conforming implementation should be able
to be told to reject all extensions. Obviously an implementation may
offer this help, but I can't be sure that turning on all the standards
flags will do it.

Allowing all bets to be off after a #pragma is not so worrying to me --
I can grep for those. I can't search for examples of __?words that are
not "standard" with anything like the same ease.

Indeed. However, it is worse (and yet better) than that. Consider,
for instance, the following:

#include "my.h"

This *looks* innocuous enough, until we find that my.h starts with:

#include <system-magic.h>

which, through a chain of 97812 other "#include"s, eventually does:

#pragma lisp

so everything after including "my.h" is Lisp!

This problem never actually occurs in practice, because people are
not quite that crazy. :)
 
ymuntyan

Actually what I wanted was the reference to the standard,
and I found it (surprisingly, it's where "diagnostic" is
defined). I have always thought that syntax extensions are
allowed, that's why I insisted on "why?". E.g. I was sure
that gcc doesn't have to warn about __attribute__ even in
conforming mode. Now I have a question about conformance
of one unnamed compiler [which does not complain about a
translation unit consisting entirely of the one line]
__attribute__((unused)) static int a;
Is this unnamed compiler non-conforming?

Not from this example. The reason is that names prefixed with
various kinds of underscores (including single underscores followed
by uppercase, and double underscores) are reserved to the
implementation, and once you use an "implementation keyword", the
entire "contract", as it were, that the Standard provides to the
C programmer is terminated, at least in principle.

Thus, the following does not require a diagnostic:

#pragma comment !
! hi there.
static int a;

because "#pragma" can also do pretty much anything (although in
C99 there are now standardized #pragma operations). (Perhaps, as
in this case, the "#pragma" turns "!" into a comment-to-end-of-line
character. "#pragma fortran" might be used to switch to Fortran
code, which would change the syntax even more.)

This is, of course, the danger of doing anything that departs from
Standard C: the moment you abandon the standard, even for an instant,
the standard can abandon you, possibly "forever". :)
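
As an aside on the "standardized #pragma operations" mentioned above: C99 reserves the `#pragma STDC` prefix for pragmas whose meaning the Standard itself defines, and FP_CONTRACT is one of them. The following is a minimal sketch only; the function names `mul_add` and `pragma_demo` are invented for illustration, and a compiler that does not implement the pragma may warn about it and ignore it.

```c
#include <assert.h>

/* C99 standard pragma: ask the implementation not to contract
   a * b + c into a single fused operation in this file. */
#pragma STDC FP_CONTRACT OFF

/* Hypothetical helper, named here only for the example. */
double mul_add(double a, double b, double c)
{
    return a * b + c;
}

/* Usage sketch: the operands are small integers, so the result is
   exact whether or not the pragma is honoured. */
void pragma_demo(void)
{
    assert(mul_add(2.0, 3.0, 1.0) == 7.0);
}
```

Unlike "#pragma lisp", a `#pragma STDC` line does not take the program outside the Standard.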

I believe I understand what you're saying, yet I don't
quite understand where that fine line between "syntax
error" and "not a syntax error" is drawn. Namely, why
is // a deviation from the C syntax which must be diagnosed,
while __attribute__ isn't? Or why __attribute__ is fine
while 234i literal isn't?

So, the question is: where exactly does the standard say
that implementation is free to use any __-prefixed identifier
as a magic word which may completely change semantics of the
program? It does say that such identifiers are reserved,
yet they still are identifiers.

Yevgen
 
ymuntyan

(e-mail address removed) said:



Are you claiming that failure to issue a required diagnostic message for a
syntax error is somehow not a conformance problem?

No, and I have not claimed that. I didn't know that diagnostic
was required in that case, that's true. Yet I didn't even claim
it wasn't required.

Yevgen
 
Ian Collins

I believe I understand what you're saying, yet I don't
quite understand where that fine line between "syntax
error" and "not a syntax error" is drawn. Namely, why
is // a deviation from the C syntax which must be diagnosed,
while __attribute__ isn't? Or why __attribute__ is fine
while 234i literal isn't?

__attribute__ is in the implementation's namespace, so the
implementation is free to treat it how it sees fit. 234i is not.

So, the question is: where exactly does the standard say
that implementation is free to use any __-prefixed identifier
as a magic word which may completely change semantics of the
program? It does say that such identifiers are reserved,
yet they still are identifiers.

Reserved, for the implementation. So it can do what it likes with them.
 
ymuntyan

__attribute__ is in the implementation's namespace, so the
implementation is free to treat it how it sees fit. 234i is not.

"namespace" here is not the standard term, so if you
mean "it's magic", then I already got that ;)
So, 234i is not C, "int __attribute__((something)) something;"
isn't C either, both are invalid according to the grammar.
The former is not magical, the latter is. Does "reserved" really
clearly imply this magic/not-magic distinction? I do think
it's a good rule, "__ => anything may happen", but it doesn't
seem to be what the standard really says, or is it?

Reserved, for the implementation. So it can do what it likes with them.

Yevgen
 
jacob navia

Ben said:
Do you mean lcc-win32? If so, I have found problems with compound
literals, VLA parameters, complex numbers and designated initialisers
so some of the C99 parts are a bit rough round the edges. There have
been a few other reports of conformance issues, but they may well have
been fixed. For example, it used to consider long * and int * to be
compatible types.

It is probably fair to say the standards conformance is not a top
priority for that compiler.

All the bugs you mentioned were fixed as quickly as I could.

That is not, of course, TOP PRIORITY. Some bugs were
fixed in less than a few hours. But that is NOT TOP PRIORITY.

What, then, would be TOP PRIORITY?

A few seconds?
 
Flash Gordon

Richard Heathfield wrote, On 27/04/08 07:13:
(e-mail address removed) said:


Thank you for explaining this. I was under the impression that you knew.


No, you didn't claim that. I didn't realise you didn't know the situation
pertaining to the diagnostic obligations of conforming implementations.
Now that I do realise this, I can answer your above point more
pertinently:

I would guess that all the posters who knew that the given issue is to do
with non-conformance (and why) assumed that everyone else in the
discussion also knew that it was to do with non-conformance (and why).

And *that* assumption (i.e. the assumption that everyone else knew) now
appears to be ill-founded.

So: to clarify, the appearance of a // in a C90 program is either:

(a) part of a string literal " like // this "; or
(b) part of a comment /* like // this */; or
(c) a division followed immediately by a comment like//*this*/that; or
(d) a syntax error.

I think that covers all the possibilities.

You missed multi-byte character constants like this '//' ;-)

As you can see, it isn't *necessarily* a syntax error - but that is the
most likely of the four possibilities when someone has been using // for
comment syntax without malice aforethought. And syntax errors *must* be
diagnosed; the Standard requires it. So if a compiler finds a // that it
can't twist into some syntactically legitimate interpretation (e.g. (a)
through (c) above), it must issue a diagnostic message.

The example given, of course, clearly did not fit in to any of the
legitimate interpretations.
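
Case (c) is the one that makes // a genuine incompatibility between C90 and C99, and it can be turned into a small probe. This is only a sketch; the function names are invented for the example. Under C90 rules the initializer reads as 4 / 2, while a translator that treats // as a comment introducer sees just 4, with the semicolon arriving on the next line.

```c
#include <assert.h>

/*
 * C90 reading: 4 divided by 2, with a block comment in between -> 2
 * C99 reading: 4, and the rest of the line is a comment        -> 4
 */
int slash_slash_probe(void)
{
    int value = 4 //* C90 sees a division followed by this comment */ 2
        ;
    return value;
}

/* Returns 1 when the translator treated // as starting a comment. */
int treats_double_slash_as_comment(void)
{
    return slash_slash_probe() == 4;
}

/* Usage sketch: only two outcomes are possible. */
void slash_slash_demo(void)
{
    int v = slash_slash_probe();
    assert(v == 2 || v == 4);
}
```

This is why a C90 implementation that accepts // comments as an extension has to take care not to change the meaning of code like the initializer above.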
 
Flash Gordon

"namespace" here is not the standard term, so if you
mean "it's magic", then I already got that ;)

"namespace" is not a standard term, but "name spaces" is a standard term,
So, 234i is not C, "int __attribute__((something)) something;"
isn't C either, both are invalid according to the grammar.

Not necessarily. Consider the following TU which uses the same syntax:

#define __attribute__(a)
int __attribute__((something)) something;

So whether it is valid according to the grammar depends on the
definition of __attribute__. Note that the implementation is
specifically allowed to have predefined macros starting with a double
underscore. Note also that predefined macros are not subject to
#undef/#define according to the standard.
The former is not magical, the latter is. Does "reserved" really
clearly imply this magic/not-magic distinction? I do think
it's a good rule, "__ => anything may happen", but it doesn't
seem to be what the standard really says, or is it?

<snip>

It is an interesting point. Conforming implementations are explicitly
allowed to have extensions provided they do not affect the behaviour of
any strictly conforming program, and no strictly conforming program can
use identifiers starting with __ except where they are explicitly
defined by the standard.
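
To illustrate the "except where they are explicitly defined by the standard" part: `__STDC_VERSION__` is one such identifier. C99 defines it as 199901L and Amendment 1 to C90 defines it as 199409L, while C90 as originally published does not define it at all. A minimal sketch, with function names made up for the example:

```c
#include <assert.h>

/* Reports which revision of the language the translator claims.
   Returns 0 when __STDC_VERSION__ is not defined (original C90). */
long language_version(void)
{
#if defined(__STDC_VERSION__)
    return __STDC_VERSION__;
#else
    return 0L;
#endif
}

/* Usage sketch: either the macro is absent, or it is at least the
   value introduced by Amendment 1. */
void version_demo(void)
{
    long v = language_version();
    assert(v == 0L || v >= 199409L);
}
```

A program can use this identifier and still be strictly conforming, which is not true of __attribute__.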
 
Chris Torek

So, 234i is not C, "int __attribute__((something)) something;"
isn't C either, both are invalid according to the grammar.
The former is not magical, the latter is. Does "reserved" really
clearly imply this magic/not-magic distinction?

It is not the "reserved" per se, but rather the fact that use
of a reserved identifier results in undefined behavior.

In my C99 draft, this is in section 7.1.3, "Reserved identifiers",
where we have the following text:

...
- All identifiers that begin with an underscore and
either an uppercase letter or another underscore are
always reserved for any use.
...
[#2] No other identifiers are reserved. If the program
declares or defines an identifier that is reserved in that
context (other than as allowed by 7.1.8), the behavior is
undefined.[134]

(Note that you must add more C99 text before interpreting the above
too literally, since (e.g.) __func__ fits this pattern, but has
Standard-defined semantics, so a C programmer can use __func__ in
the Standard-defined way without losing all of the other C99
guarantees.)
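
For instance, here is a minimal sketch of __func__ used in the Standard-defined way under C99. The function names `report_name` and `func_demo` are invented for the example.

```c
#include <assert.h>
#include <string.h>

/* __func__ behaves as if each function began with
   static const char __func__[] = "function-name";
   so returning a pointer to it is safe. */
const char *report_name(void)
{
    return __func__;
}

/* Usage sketch: the string is the name of the enclosing function. */
void func_demo(void)
{
    assert(strcmp(report_name(), "report_name") == 0);
}
```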

(Note also that the text above says "declares or defines", so one
might attempt to call gcc non-conformant at this point, since "int
__attribute__" is not yet a complete definition: we would have to
see a semicolon, comma, or equal-sign-and-expression in order to
get the definition to happen. Perhaps simpler, we could point to
gcc's __extension__, which gcc allows to appear in a context that
does not even *resemble* a declaration or definition. I would
argue, though, that the draft wording I quoted above fails to
capture the "true intent" of the Standard, which allows the compiler
to "pre-define" any reserved identifier. Perhaps, for instance,
the preprocessor starts out by doing "#define __extension__" in
such a way that the occurrence of __extension__ declares a second
reserved identifier, thus triggering the undefined behavior. [This
is not in fact what gcc does, but we cannot observe this fact with
any strictly conforming code -- even "#ifdef __extension__" is not
sufficient, since gcc *could* #define __extension__ only on those
source lines where there are no "#ifdef" tests for it -- so from
a black-box point of view, at least, gcc can get away with this.])

(If the goal is to find "gcc bugs", it is simpler to find "real"
failures in gcc's conformance, in which compiling with -std=c99
fails to implement C99 in various ways. There may well be various
conformance bugs with -std=c89 aka -ansi as well. And of course
gcc obviously fails to conform with "-ansi" unless one *also*
includes "-pedantic", since "-ansi", by design, does not turn on
all the required diagnostics. But this little aside is not relevant
to my real point, which is: compilers are allowed, and even
encouraged, to use Standard C's "undefined behavior" in various
well-controlled, documented-in-the-implementation-manual ways to
provide features that are otherwise missing from Standard C, so
that real programmers can get real work done.)
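
One common way to use such a documented extension while keeping the rest of the code portable is to hide it behind a macro. This is a sketch of that idiom, not something taken from any particular code base: the names `DEMO_UNUSED`, `read_a` and `unused_demo` are hypothetical, and the `__GNUC__` test assumes a compiler that defines that macro when it supports the gcc attribute syntax.

```c
#include <assert.h>

/* Use the extension only where the compiler says it speaks the gcc
   dialect; elsewhere the macro expands to nothing. */
#if defined(__GNUC__)
#define DEMO_UNUSED __attribute__((unused))
#else
#define DEMO_UNUSED
#endif

/* The declaration from this thread, with the attribute isolated. */
DEMO_UNUSED static int a;

/* Objects with static storage duration start out as zero. */
int read_a(void)
{
    return a;
}

/* Usage sketch. */
void unused_demo(void)
{
    assert(read_a() == 0);
}
```

On an implementation without the extension the translation unit is then plain Standard C, so the usual diagnostic guarantees apply to it again.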
 
Keith Thompson

Richard Heathfield said:
So: to clarify, the appearance of a // in a C90 program is either:

(a) part of a string literal " like // this "; or
(b) part of a comment /* like // this */; or
(c) a division followed immediately by a comment like//*this*/that; or
(d) a syntax error.

I think that covers all the possibilities.

Not quite.

(e) part of a multi-character character constant, such as '//' or
L'//'; or
(f) part of a header name, such as #include <foo//bar.h> or
#include "foo//bar.h" (no, the latter is not actually a string
literal)
(g) part of a sequence of preprocessor tokens that are discarded, as in:
#define IGNORE(x) /* nothing */
IGNORE(//)
or
#if 0
// no, this isn't a comment
#endif
; or
(h) part of a #error directive (though this causes the translation
unit to be rejected); or
(i) part of an implementation-defined #pragma directive.

I *think* that covers all the cases, but I wouldn't swear to it.
As you can see, it isn't *necessarily* a syntax error - but that is the
most likely of the four possibilities when someone has been using // for
comment syntax without malice aforethought. And syntax errors *must* be
diagnosed; the Standard requires it. So if a compiler finds a // that it
can't twist into some syntactically legitimate interpretation (e.g. (a)
through (c) above), it must issue a diagnostic message.

Yes. The requirement is stated in C90 5.1.1.3:

A conforming implementation shall produce at least one diagnostic
message (identified in an implementation-defined manner) for every
translation unit that contains a violation of any syntax rule or
constraint. Diagnostic messages need not be produced in other
circumstances.

(C99 has a similar rule, also in 5.1.1.3, though there's some
additional wording; of course, C99's syntax rules permit // comments.)
 
Keith Thompson

Ian Collins said:
__attribute__ is in the implementation's namespace, so the
implementation is free to treat it how it sees fit. 234i is not.

Yes, but ...

__attribute__ is an identifier, a valid token, so by itself it's not
necessarily a syntax error. 234i is not a valid token.
Reserved, for the implementation. So it can do what it likes with them.

Within limits.

I'll quote from the C99 standard, because my copy of it is easier to
copy-and-paste from than my copy of the C90 standard. C90 has similar
rules.

C99 4p6:

A conforming implementation may have extensions (including
additional library functions), provided they do not alter the
behavior of any strictly conforming program.

with a footnote:

This implies that a conforming implementation reserves no
identifiers other than those explicitly reserved in this
International Standard.

For example, since the identifier __attribute__ is reserved to the
implementation, no strictly conforming program may use it; therefore,
a conforming implementation may provide an extension that affects only
programs that use the identifier __attribute__.

But must the implementation still issue a diagnostic if the use of an
extension violates a constraint or syntax rule? In my opinion, the
answer is yes; the standard permits extensions, but that doesn't
override the requirement to issue a diagnostic. Doug Gwyn, a member
of the C standard committee, has expressed the same opinion over on
comp.std.c.

So, a conforming C90 implementation may allow // comments as an
extension (as long as a division symbol immediately followed by a
/*comment*/ is *not* treated as a // comment), *but* it must still
issue at least one diagnostic for any translation unit that uses a //
comment. It doesn't have to reject such a translation unit; the
standard's only requirement to reject a translation unit is for the
#error directive. The diagnostic could be "Thank you for using this
extension, and have a nice day".

And, of course, it needn't issue the diagnostic in a non-conforming
mode; after all, the standard cannot possibly impose *any*
requirements on an implementation that doesn't claim to conform to it.
 
ymuntyan

So, 234i is not C, "int __attribute__((something)) something;"
isn't C either, both are invalid according to the grammar.
The former is not magical, the latter is. Does "reserved" really
clearly imply this magic/not-magic distinction?

It is not the "reserved" per se, but rather the fact that use
of a reserved identifier results in undefined behavior.

In my C99 draft, this is in section 7.1.3, "Reserved identifiers",
where we have the following text:

...
- All identifiers that begin with an underscore and
either an uppercase letter or another underscore are
always reserved for any use.
...
[#2] No other identifiers are reserved. If the program
declares or defines an identifier that is reserved in that
context (other than as allowed by 7.1.8), the behavior is
undefined.[134]

(Note that you must add more C99 text before interpreting the above
too literally, since (e.g.) __func__ fits this pattern, but has
Standard-defined semantics, so a C programmer can use __func__ in
the Standard-defined way without losing all of the other C99
guarantees.)

(Note also that the text above says "declares or defines", so one
might attempt to call gcc non-conformant at this point, since "int
__attribute__" is not yet a complete definition: we would have to
see a semicolon, comma, or equal-sign-and-expression in order to
get the definition to happen. Perhaps simpler, we could point to
gcc's __extension__, which gcc allows to appear in a context that
does not even *resemble* a declaration or definition. I would
argue, though, that the draft wording I quoted above fails to
capture the "true intent" of the Standard, which allows the compiler
to "pre-define" any reserved identifier. Perhaps, for instance,
the preprocessor starts out by doing "#define __extension__" in
such a way that the occurrence of __extension__ declares a second
reserved identifier, thus triggering the undefined behavior.

I won't buy that. UB or not, if there is a syntax error then
a diagnostic must be emitted. But, gcc may predefine __attribute__
and __extension__ to be empty (it's smart, __attribute__ is a
function-like thing with one argument), so a strictly conforming
programmer can't tell whether __attribute__ was expanded to nothing
or not, even if a real programmer knows that file.i is a preprocessed
file which is fed to the C parser which treats __attribute__ as a
keyword ;)
I guess the answer here is: given a __word used in a real
compiler, one can make up how that __word could be made a macro
which makes the program a valid C text, like in this case with
__extension__ and __attribute__. The "__ => magic happens" is
wrong, but it doesn't matter since there is only a small number
of __words actually accepted by compilers.
[This
is not in fact what gcc does, but we cannot observe this fact with
any strictly conforming code -- even "#ifdef __extension__" is not
sufficient, since gcc *could* #define __extension__ only on those
source lines where there are no "#ifdef" tests for it -- so from
a black-box point of view, at least, gcc can get away with this.])

(If the goal is to find "gcc bugs",

Not really. The goal is to understand if gcc is a nicer compiler
than the standard is not a text for humans. gcc wins again :)

[snip]

Yevgen
 
Keith Thompson

Keith Thompson said:
Not quite.

(e) part of a multi-character character constant, such as '//' or
L'//'; or
(f) part of a header name, such as #include <foo//bar.h> or
#include "foo//bar.h" (no, the latter is not actually a string
literal)
(g) part of a sequence of preprocessor tokens that are discarded, as in:
#define IGNORE(x) /* nothing */
IGNORE(//)
or
#if 0
// no, this isn't a comment
#endif
; or
(h) part of a #error directive (though this causes the translation
unit to be rejected); or
(i) part of an implementation-defined #pragma directive.

I *think* that covers all the cases, but I wouldn't swear to it.
[...]

(j) a comment followed immediately by a division like/*this*//that
(the reverse of case (c)).
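
Cases (a) and (j) mean the same thing in C90 and C99, which a short sketch can show. The function names are invented for the example.

```c
#include <assert.h>
#include <string.h>

/* Case (j): the block comment ends at its closing delimiter, leaving
   a single division operator behind, so this is 8 / 2. */
int comment_then_division(void)
{
    return 8/*this*//2;
}

/* Case (a): inside a string literal the two slashes are ordinary
   characters and are counted like any others. */
size_t slashes_in_literal(void)
{
    return strlen(" like // this ");
}

/* Usage sketch. */
void enumerate_demo(void)
{
    assert(comment_then_division() == 4);
    assert(slashes_in_literal() == 14);
}
```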
 
ymuntyan

(e-mail address removed) wrote, On 27/04/08 07:07:





"namespace" is not a standard term, but "name spaces" is a standard term,

Which isn't what was '"namespace" here'.
Not necessarily. Consider the following TU which uses the same syntax:

#define __attribute__(a)
int __attribute__((something)) something;

So whether it is valid according to the grammar depends on the
definition of __attribute__. Note that the implementation is
specifically allowed to have predefined macros starting with a double
underscore. Note also that predefined macros are not subject to
#undef/#define according to the standard.

Right. But I fed file.i to gcc for exactly that reason -
__attribute__ is part of gcc grammar, and it won't warn
about it. I do realize now that it doesn't matter.
<snip>

It is an interesting point. Conforming implementations are explicitly
allowed to have extensions provided they do not affect the behaviour of
any strictly conforming program, and no strictly conforming program can
use identifiers starting with __ except where they are explicitly
defined by the standard.

Well, strictly conforming programs are not interesting:
they are fine and there is nothing to worry about. It's
non-strictly-conforming programs which we want the compiler
to tell us about, like those containing // comments.
It's an interesting point that an implementation can get
away with anything as long as it *could* be that it behaved
in certain way :)

Yevgen
 
ymuntyan

Yes, but ...

__attribute__ is an identifier, a valid token, so by itself it's not
necessarily a syntax error. 234i is not a valid token.



Within limits.

I'll quote from the C99 standard, because my copy of it is easier to
copy-and-paste from than my copy of the C90 standard. C90 has similar
rules.

C99 4p6:

A conforming implementation may have extensions (including
additional library functions), provided they do not alter the
behavior of any strictly conforming program.

with a footnote:

This implies that a conforming implementation reserves no
identifiers other than those explicitly reserved in this
International Standard.

For example, since the identifier __attribute__ is reserved to the
implementation, no strictly conforming program may use it; therefore,
a conforming implementation may provide an extension that affects only
programs that use the identifier __attribute__.

A program which uses // comments isn't strictly conforming
either, so a conforming implementation may provide an extension...
But must the implementation still issue a diagnostic if the use of an
extension violates a constraint or syntax rule? In my opinion, the
answer is yes; the standard permits extensions, but that doesn't
override the requirement to issue a diagnostic. Doug Gwyn, a member
of the C standard committee, has expressed the same opinion over on
comp.std.c.

Sounds right, and it doesn't explain why __attribute__ may get
away without a warning :)

[snip]

Yevgen
 
Ben Bacarisse

jacob navia said:
All the bugs you mentioned were fixed as quickly as I could.

That is good, but if the fixes don't show up in the compiler and you
don't reply to the postings about them, how can anyone tell? Do you
post a list of fixes and when they will get into the released
compiler?

In one case (of VLA array parameters) you did not reply to the posting
about it (I may have missed it of course) so I was not even sure you
confirmed it as a bug. It is certainly still there in the compiler as
are a lot of the others (I have more if you really want them).
That is not, of course, TOP PRIORITY. Some bugs were
fixed in less than a few hours. But that is NOT TOP PRIORITY.

What, then, would be TOP PRIORITY?

Top priority would be to finish C99 features before adding
extensions. I know one of the extensions is your intended way to
implement one part of C99 (complex numbers) but that is not true of
the others, is it?

I am not criticising this decision. It makes perfect commercial
sense to include extensions your customers rely on, but if standards
compliance were your top priority, it would be done first, almost by
definition.
 
Bartc

Keith Thompson said:
Keith Thompson said:
Not quite.

(e) part of a multi-character character constant, such as '//' or
L'//'; or
(f) part of a header name, such as #include <foo//bar.h> or
#include "foo//bar.h" (no, the latter is not actually a string
literal)
(g) part of a sequence of preprocessor tokens that are discarded, as in:
#define IGNORE(x) /* nothing */
IGNORE(//)
or
#if 0
// no, this isn't a comment
#endif
; or
(h) part of a #error directive (though this causes the translation
unit to be rejected); or
(i) part of an implementation-defined #pragma directive.

I *think* that covers all the cases, but I wouldn't swear to it.
[...]

(j) a comment followed immediately by a division like/*this*//that
(the reverse of case (c)).

Why would this be a problem? It's clearly a /*...*/ comment followed by
division; no ambiguity. Otherwise you can have other cases such as /*
comment 1 *//* comment 2 */
 
Keith Thompson

Bartc said:
Keith Thompson said:
I *think* that covers all the cases, but I wouldn't swear to it.
[...]

(j) a comment followed immediately by a division like/*this*//that
(the reverse of case (c)).

Why would this be a problem? It's clearly a /*...*/ comment followed by
division; no ambiguity. Otherwise you can have other cases such as /*
comment 1 *//* comment 2 */

It's not a problem, and I didn't say it was. "//" in a string literal
isn't a problem either. We were enumerating the circumstances in
which // can appear in a C90 program.
 
Keith Thompson

CBFalconer said:
N869.txt is ideal for the purpose, and easily available as:

<http://cbfalconer.home.att.net/download/n869_txt.bz2>

which is only about 210 kB bzip2 compressed.

No, it's very bad for the purpose. N869.txt is a draft of the C99
standard. The problem I was having was that I don't have a copy of
the C90 standard from which I can easily copy-and-paste. (I already
have a PDF copy of the C99 standard; I can copy-and-paste from it with
no problem.)
 
