Is the output of the preprocessor deterministic ?

spibou

Is the output of the C preprocessor deterministic? What I mean
by that is: given two compilers which conform to the same standard,
will their preprocessors produce identical output when given the
same file as input? If not, how much variation is allowed? Is it
just a bit more or less white space here and there, or could
there be larger differences?

If the output is not deterministic, is it possible that the output
of the preprocessor of one compiler cannot be processed correctly
by another compiler?

Spiros Bousbouras
 
Tom St Denis

spibou said:
Is the output of the C preprocessor deterministic? What I mean
by that is: given two compilers which conform to the same standard,
will their preprocessors produce identical output when given the
same file as input? If not, how much variation is allowed? Is it
just a bit more or less white space here and there, or could
there be larger differences?

If the output is not deterministic, is it possible that the output
of the preprocessor of one compiler cannot be processed correctly
by another compiler?

Shooting in the dark...

But whitespace shouldn't stop one compiler from working with the output
from a competitor.

I think, though, you'll find it isn't going to be the same. For example,
if you run cpp from GCC on a Linux box and compare its output to what MSVC
would give, the outputs will differ, especially since the headers
themselves are different.

Tom
 
spibou

Tom said:
Shooting in the dark...

But whitespace shouldn't stop one compiler from working with the output
from a competitor.

I think, though, you'll find it isn't going to be the same. For example,
if you run cpp from GCC on a Linux box and compare its output to what MSVC
would give, the outputs will differ, especially since the headers
themselves are different.

I specified that the input has to be the same. By that I also meant
same headers.
 
Guest

spibou said:
Is the output of the C preprocessor deterministic? What I mean
by that is: given two compilers which conform to the same standard,
will their preprocessors produce identical output when given the
same file as input?

Even if you ignore that there is no standard file format for
preprocessed output, and that preprocessed output need not be
obtainable at all, then no, output may be very different.

spibou said:
If not, how much variation is allowed? Is it
just a bit more or less white space here and there, or could
there be larger differences?

Much larger. Some easy examples:

#ifndef __i386__
#error
#endif

#if 18446744073709551615U + 1
#error
#endif

#define f(x) g
#define g(x) f
f(1)(2)(3) /* either f(3) or g */

#define s(x) #x
s("\u0040") /* either "\"\u0040\"" or "\"\\u0040\"" */
 
Al Balmer

#include <stdio.h>

#ifdef putchar
#define STRING "putchar is a macro"
#else
#define STRING "putchar is not a macro"
#endif

He said "same headers", not "headers with the same name."

But Harald has a couple of examples.
 
spibou

Harald said:
#define f(x) g
#define g(x) f
f(1)(2)(3) /* either f(3) or g */

OK, let's concentrate on this one for the time being. The
Sun compiler gives " g (2)(3) " (without the quotes). GNU
gives "g". Could someone explain to me the intermediate
steps which give these results? Also for "f(3)", if that's
possible.

By the way, is there a tool or compiler option
which will show all the intermediate steps in
macro expansions? I'm asking for something similar
to Lisp's macroexpand-1.

Spiros Bousbouras
 
Barry Schwarz

spibou said:
I specified that the input has to be the same. By that I also meant
same headers.

Headers are system specific. The compiler writer is allowed to take
advantage of unique system attributes. It is unreasonable to expect
one compiler to even function with another's headers. What about the
case where headers are not files but built into the compiler?

 
spibou

Barry said:
Headers are system specific. The compiler writer is allowed to take
advantage of unique system attributes. It is unreasonable to expect
one compiler to even function with another's headers. What about the
case where headers are not files but built into the compiler?

Then you can consider the case where you have two compilers on the same
platform. The essence of my original question is whether macro expansion
rules are deterministic; it is not about unique system attributes and
differences in the environment.

Spiros Bousbouras
 
Gordon Burditt

spibou said:
Is the output of the C preprocessor deterministic? What I mean
by that is: given two compilers which conform to the same standard,
will their preprocessors produce identical output when given the
same file as input?

There is no standardized format for the output of the preprocessor,
which is in tokens. Some may output plain text. Some may output
XML. Some may output some weird binary format. Some may not even
HAVE a separate output format nor exist as a separate program.

spibou said:
If not, how much variation is allowed? Is it
just a bit more or less white space here and there, or could
there be larger differences?

The expansion of __TIME__ and __DATE__ may be different for successive
runs of the preprocessor. Many preprocessors have predefined symbols
indicating the type of system, OS, or compiler it is, and different
preprocessors may have different predefined symbols.

Differences in system header files may result in different output.
It is possible that some preprocessors cannot handle non-precompiled
system headers, and that others have no provision for precompiled
headers.

System headers may contain special magic, some of which might be
handled by the preprocessor. If a different preprocessor tries
to handle the magic, it may report errors, or handle it incorrectly.

spibou said:
If the output is not deterministic, is it possible that the output
of the preprocessor of one compiler cannot be processed correctly
by another compiler?

Due to possible:
- Lack of output from the preprocessor stage
- Inability to input to the compiler without preprocessing AGAIN
- Differences in preprocessor/compiler formats for preprocessed tokens
- Differences in system header files

it's very possible that output from one preprocessor can't be compiled
by another compiler, and even more likely that it won't link and run
correctly.

Gordon L. Burditt
 
Guest

spibou said:
OK, let's concentrate on this one for the time being. The
Sun compiler gives " g (2)(3) " (without the quotes). GNU
gives "g". Could someone explain to me the intermediate
steps which give these results? Also for "f(3)", if that's
possible.

< f(1) >(2)(3) becomes < g >(2)(3)

g is not already being expanded, so it must be considered for
expansion:

< g(2) >(3) becomes < f >(3)

g was the result of the expansion of f(1), but the ( that follows it
wasn't. The standard doesn't specify whether the final f is considered
the result of the expansion of f(1) in this case. If the compiler
considers it as such, f(3) is not expanded further. If it doesn't,
f(3) is expanded once more to g.

(It was a slightly modified example from the C99 rationale, by the
way.)

spibou said:
By the way, is there a tool or compiler option
which will show all the intermediate steps in
macro expansions? I'm asking for something similar
to Lisp's macroexpand-1.

No idea, but if there is such a thing, I'm interested too.
 
Mark McIntyre

spibou said:
Is the output of the C preprocessor deterministic? What I mean
by that is: given two compilers which conform to the same standard,
will their preprocessors produce identical output when given the
same file as input?

It's trivial to show examples which differ:

#ifdef _WIN32
#error noway hose
#else
int x=0;
#endif

But I've a feeling that isn't what you're thinking of!

spibou said:
If not, how much variation is allowed?

The C preprocessor is largely a simple textual replacement engine.
Such replacements will be identical. The results of conditional code
etc. will obviously vary if it's intended to.

spibou said:
If the output is not deterministic, is it possible that the output
of the preprocessor of one compiler cannot be processed correctly
by another compiler?

Sure - you could define a constant which was impossibly large for one
compiler, such as a 32-bit value in a 16-bit environment.
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
 
Mark McIntyre

Al Balmer said:
He said "same headers", not "headers with the same name."

Then the question was strictly meaningless - you can't generally use
the same headers on different implementations.
 
Al Balmer

Mark McIntyre said:
Then the question was strictly meaningless - you can't generally use
the same headers on different implementations.

But that wasn't the question. The question, as the subject line says,
was "Is the output of the preprocessor deterministic ?". It's a
perfectly legitimate question. There's nothing that would preclude a
single preprocessor being used with more than one implementation.
Forget about header files. The question was whether every preprocessor
is required to produce the same output, given the same input. We can
stipulate that the input is syntactically correct, of course.
 
Mark McIntyre

Al Balmer said:
But that wasn't the question.

So what? I was commenting on your remark about headers.

Al Balmer said:
Forget about header files.

Why? That's what I was commenting on.
 
Keith Thompson

Gordon Burditt said:
There is no standardized format for the output of the preprocessor,
which is in tokens. Some may output plain text. Some may output
XML. Some may output some weird binary format. Some may not even
HAVE a separate output format nor exist as a separate program.
[snip]

The question above, "Is the output of the C preprocessor
deterministic?", was posted by Spiros Bousbouras.
I mention this because Gordon Burditt is too rude to attribute
quotations.
 
Gordon Burditt

Al Balmer said:
But that wasn't the question. The question, as the subject line says,
was "Is the output of the preprocessor deterministic ?". It's a
perfectly legitimate question. There's nothing that would preclude a
single preprocessor being used with more than one implementation.
Forget about header files. The question was whether every preprocessor
is required to produce the same output, given the same input. We can
stipulate that the input is syntactically correct, of course.

No, if the input uses __DATE__ or __TIME__ .

Possibly no unless you count command-line options that predefine
or pre-un-define macros as part of the input. Also possibly no
unless you count command-line options that change what system
header files are used as part of the input.

Possibly no if the name of the file given to the compiler is not
considered part of the input and variations of the name are used
(e.g. /home/gordon/foo.c vs. ./foo.c vs. foo.c) and the compiler
sticks the file name into the preprocessed output (__FILE__ is used
or the preprocessor sticks filename info into the preprocessed
output for use with error messages from the compiler).

Gordon L. Burditt
 
Kenny McCormack

Keith Thompson said:
Gordon Burditt said:
There is no standardized format for the output of the preprocessor,
which is in tokens. Some may output plain text. Some may output
XML. Some may output some weird binary format. Some may not even
HAVE a separate output format nor exist as a separate program.
[snip]

The question above, "Is the output of the C preprocessor
deterministic?", was posted by Spiros Bousbouras.
I mention this because Gordon Burditt is too rude to attribute
quotations.

Uh oh. Cat fight!
 
Walter Roberson

spibou said:
Is the output of the C preprocessor deterministic? What I mean
by that is: given two compilers which conform to the same standard,
will their preprocessors produce identical output when given the
same file as input?

Not necessarily. It isn't uncommon for preprocessors to put
in # directives that help trace which line of which header file
that one has reached. Those # directives are not necessarily
in a standard format.

The C standard does not require the possibility of
"preprocessing only": preprocessing is only defined in terms
of one of the translation phases, and the translation phases
are defined by the standard as being logical phases that may
be internally combined, and which may pass information to each
other through any mechanism they like.
 
Al Balmer

Mark McIntyre said:
So what? I was commenting on your remark about headers.

No, you weren't. You said "Then the question was strictly
meaningless". I didn't ask any question, so you could not have been
commenting about that. The question was asked by the OP.

Mark McIntyre said:
Why? That's what I was commenting on.

Then start your own thread.
 
