Walter Roberson
I have run into a peculiarity with SGI's C compiler (7.3.1.2m). I have been
reading carefully over the ANSI X3.159-1989 specification, but I cannot
seem to find a justification for the behaviour. Could someone point
me to the appropriate section, or else confirm the behaviour as a bug?
For a particular project, I am using the C preprocessor phase only.
I am not using the standalone program 'cpp' because proper functioning
of my project depends upon being able to splice preprocessor tokens,
which is not supported in the standalone 'cpp'.
I am having the compiler stop after preprocessing by using SGI's C -P
option:
-P Runs only the preprocessor and puts the result for each source
file in a corresponding .i file. The .i file has no inline
directives in it.
It should be noted that my source is *not* C code -- I am using the
preprocessor to generate data files based upon templates.
The point I am having trouble with can be illustrated fairly simply,
by running these two lines (saved as look.c) through the preprocessing phase:
#define eye L@@K
I eye
$ cpp -P look.c
I L@@K
That's with the standalone cpp program, and is the output I expect. But,
$ cc -P look.c
$ cat look.i
I L@ @K
And
$ cat look2.c
I L@@K
$ cc -P look2.c
$ cat look2.i
I L@@K
In short, certain combinations of symbols, when macro-replaced into
source, get separated by single space characters. Not every combination
is so treated: -~ and ~$ are left alone, for example. It does not seem to
depend on whether the characters are C operators, as it happens
especially for ` and @ and $, none of which are operators.
The workaround I have found is:
$ cat look3.c
#define eye L@##@K
I eye
$ cc -P look3.c
$ cat look3.i
I L@@K
The closest I have found to the whitespace-introducing behaviour is
the ANSI description of translation phases, 2.1.1.2, for phase 3:
3. The source file is decomposed into preprocessing tokens and
sequences of white-space characters (including comments). A source
file shall not end in a partial preprocessing token or comment.
Each comment is replaced by one space character. New-line characters
are retained. Whether each nonempty sequence of white-space characters
other than new-line is retained or replaced by one space character
is implementation-defined.
Okay, so there's implementation-defined behaviour for each *nonempty*
sequence of white-space characters, but L@@K has only the -empty-
sequence between the two @.
I see nothing in the discussion of macro replacement that would
lead to spaces being introduced (other than the behaviour of # in
function-like macro replacements).
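As a cross-check on that last point, here is a minimal sketch that uses # to make the preprocessor report its own spelling of the expansion, independent of whatever writes the .i file. By my reading of 3.8.3.2, # puts a single space into the string only where white space actually separated the argument's tokens, so a conforming preprocessor should give L@@K with no space. The names STR, XSTR, eye_spelling, eye_has_space and report_eye are made up for this illustration, and it assumes the compiler accepts stray @ characters in a macro body so long as they only ever end up inside a string literal.

```c
#include <stdio.h>
#include <string.h>

/* Two-level stringification: XSTR macro-expands its argument first,
   then STR turns the resulting token sequence into a string literal. */
#define STR(x)  #x
#define XSTR(x) STR(x)

/* The macro from look.c. */
#define eye L@@K

/* The expansion of `eye`, spelled as the preprocessor sees it. */
const char *eye_spelling(void)
{
    return XSTR(eye);
}

/* Returns 1 if the spelling contains a space character, 0 otherwise. */
int eye_has_space(void)
{
    return strchr(eye_spelling(), ' ') != NULL;
}

/* Prints the spelling in brackets so that any inserted space is visible. */
void report_eye(void)
{
    printf("[%s] space=%d\n", eye_spelling(), eye_has_space());
}
```

Calling report_eye() from main shows whether the space is already present in the token sequence or is only added when the -P output is written. I have not claimed a result for the SGI compiler here; that is the thing to be tested.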
The only excuse I can think of is that, as ` and @ and $ are not
C operators, they are perhaps not considered to be valid preprocessing
tokens outside of string literals and character constants,
in which case the behaviour would become undefined?