The preprocessor is just a pass


Sam of California

Is it accurate to say that "the preprocessor is just a pass in the parsing
of the source file"?

I responded to that comment by saying that the preprocessor is not just a
pass. It processes statements that the compiler does not process. The good
people in the alt.comp.lang.learn.c-c++ newsgroup insist that the
preprocessor is just one of many passes. The preprocessor processes a
grammar unique to the preprocessor and only that grammar.

The discussion is at:

What in fact is the preprocessor?
http://groups.google.com/group/alt....2923b657312/871df10e2b29fbc2#871df10e2b29fbc2
 

John Harrison

Sam said:
Is it accurate to say that "the preprocessor is just a pass in the parsing
of the source file"?

I responded to that comment by saying that the preprocessor is not just a
pass. It processes statements that the compiler does not process. The good
people in the alt.comp.lang.learn.c-c++ newsgroup insist that the
preprocessor is just one of many passes. The preprocessor processes a
grammar unique to the preprocessor and only that grammar.

The discussion is at:

What in fact is the preprocessor?
http://groups.google.com/group/alt....2923b657312/871df10e2b29fbc2#871df10e2b29fbc2

What I don't understand is your statement

"The preprocessor is not just a pass. It processes statements that the
compiler does not process. The language is very clear that the
preprocessor's statements are totally different from the compiler."

The second sentence is true; the third sentence is true. I don't
understand how the first sentence follows.

The preprocessor is just a pass, as far as I am concerned. By 'just a
pass' I mean that the preprocessor can be totally separated from the
other phases (or passes) that precede and follow it, i.e. the output of
each pass is the input to the next pass that follows it. Maybe you have
a different definition of 'just a pass'.

john
 

Sam of California

John Harrison said:
"The preprocessor is not just a pass. It processes statements that the
compiler does not process. The language is very clear that the
preprocessor's statements are totally different from the compiler."

The second sentence is true; the third sentence is true. I don't
understand how the first sentence follows.

Of course the first sentence is vague. It can be interpreted in many ways. I
clarify the first sentence with the subsequent sentences.
 

John Harrison

Sam said:
Of course the first sentence is vague. It can be interpreted in many ways. I
clarify the first sentence with the subsequent sentences.

Well it seems to me that you are defining 'a pass' in a certain way. And
as a consequence of the way you have defined 'a pass' it is true that
preprocessing is not just a pass.

No doubt those who disagreed with you (me included) defined 'a pass' in
a different way, so they are right as well.

This kind of argument about definitions is very boring, so I'm not
taking any further part, unless you have a substantive point to make. At
the moment I don't see it.

john
 

Gennaro Prota

Is it accurate to say that "the preprocessor is just a pass in the parsing
of the source file"?

I responded to that comment by saying that the preprocessor is not just a
pass.

How can a processor be a pass? It is something which performs a pass,
at most.

Informally, I use the terms "preprocessing" or "preprocessing phase"
to identify --roughly-- the sequence of what the ISO standard defines
as phase 3 and phase 4 of translation -- but only when there isn't any
need to be more precise (it is also the name of one of the directories
in the Breeze source tree, for instance). In any case, the standard
doesn't use the term "preprocessor", nor "preprocessing" as a
standalone noun (it uses expressions such as "preprocessing
directive", though, which may make somewhat reasonable the personal
terminology choice explained above. It arose exactly because I didn't
like to use the term "preprocessor").
 

Phlip

Sam said:
Is it accurate to say that "the preprocessor is just a pass in the parsing
of the source file"?

Very casually, yes.
I responded to that comment by saying that the preprocessor is not just a
pass. It processes statements that the compiler does not process. The good
people in the alt.comp.lang.learn.c-c++ newsgroup insist that the
preprocessor is just one of many passes. The preprocessor processes a
grammar unique to the preprocessor and only that grammar.

Historically, the preprocessor was a separate program. It read
preprocessor-ready code and wrote processed C code, without the extra #
statements and such. Then the C compiler read the raw C code.

Nowadays we naturally use only one compiling program. The vestiges of the
CPP filter have migrated into it, including a separate lexing system. So the
CPP does respect "" quotes and // comment markers, but does not respect
delimiters like {}. It is unaware they delimit blocks.
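A quick sketch of both points (a made-up fragment, names invented for
illustration):

    void f()
    {
    #define LIMIT 10        // defined "inside" a block...
    }

    int g()
    {
        return LIMIT;       // ...yet visible here: macros ignore {} scope
    }

    // But the CPP does understand quotes and comments: a #define in a
    // comment is not a directive, and neither is one inside a string:
    const char* s = "#define NOT_A_DIRECTIVE";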

Explaining the system gets easier if you treat the preprocessor as a second
pass thru the text of the program. The historical note helps.
 

James Kanze

How can a processor be a pass? It is something which performs a pass,
at most.
Informally, I use the terms "preprocessing" or "preprocessing phase"
to identify --roughly-- the sequence of what the ISO standard defines
as phase 3 and phase 4 of translation -- but only when there isn't any
need to be more precise (it is also the name of one of the directories
in the Breeze source tree, for instance). In any case, the standard
doesn't use the term "preprocessor", nor "preprocessing" as a
standalone noun (it uses expressions such as "preprocessing
directive", though, which may make somewhat reasonable the personal
terminology choice explained above. It arose exactly because I didn't
like to use the term "preprocessor").

To quote the standard (§2.1/7): "[In phase 7] Each preprocessing
token is converted into a token." I've always understood
everything which preceded this (i.e. phases 1-6) to be
"preprocessing", and the "preprocessor" whatever does the
"preprocessing". With regards to "passes", I rather suspect
that very few compilers today use a separate pass for this; it's
generally integrated one way or another into the tokenization of
the input. Which doesn't mean that we can't speak of the
preprocessor, just because it isn't a separate pass.
 

Marcin Kalicinski

Is it accurate to say that "the preprocessor is just a pass in the parsing
of the source file"?

The C++ preprocessing step is very well defined, and every C++ programmer
knows precisely what it does. On the other hand, a "pass" is not a well
defined term. So, the question "is preprocessing a pass" makes no sense. The
question you have to answer first is "what is a pass". Then, the answer to
your original question will follow automatically.

Alternatively, you can spend two weeks discussing whether the preprocessor is
something you don't know, or if it isn't it. Good luck.
 

Sam of California

Marcin Kalicinski said:
The C++ preprocessing step is very well defined

Yes, thank you.
On the other hand, a "pass" is not a well defined term.
Yes.

So, question "is preprocessing a pass" makes no sense. The question you
have to answer first is "what is a pass". Then, the answer to your
original question will follow automatically.

Yes. The problem is that everyone is putting emphasis on "pass" and
commenting on that only. Then they say, using various terminology, that my
question is nonsense (makes no sense).

Everyone is ignoring the important part, except for brief comments such as
yours. You say that the "C++ preprocessing step is very well defined", and
that is the important part that is ignored or only briefly commented on.

I am sorry if I fail at using the correct terminology. What I am saying is
that describing the preprocessor as (just) a pass is saying that there is
nothing unique about preprocessor statements or directives. If "preprocessor
statements" is incorrect terminology then what should I say? Is
"preprocessor directives" correct? Is it correct to say that the
preprocessor's grammar is separate from all the rest of the grammar for
C/C++?
 

John Harrison

I am sorry if I fail at using the correct terminology. What I am saying is
that describing the preprocessor as (just) a pass is saying that there is
nothing unique about preprocessor statements or directives. If "preprocessor
statements" is incorrect terminology then what should I say? Is
"preprocessor directives" correct? Is it correct to say that the
preprocessor's grammar is separate from all the rest of the
grammar for C/C++?

Yes that last statement is true. In fact the tokens that the
preprocessing grammar operates on are different from the tokens that the
main C++ grammar operates on.

But you introduced this terminology 'just a pass'. I still don't think
that 'the preprocessor's grammar is separate from all the rest of the
grammar' justifies the statement 'the preprocessor is not just a pass',
quite the opposite I would say. The very fact that the two grammars are
unrelated *encourages* me to describe preprocessing as just a pass.
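For instance (an illustrative fragment, not from the original thread):

    #include <vector>   // <vector> is a single header-name
                        // preprocessing token; to the C++ grammar,
                        // '<', 'vector' and '>' would be three tokens.

    int a = 0xE + 2;    // fine: three tokens, a == 16
    // int b = 0xE+2;   // ill-formed: "0xE+2" is one
                        // preprocessing-number, and it converts to
                        // no valid C++ token.

(The pp-number rule slurps a sign after 'e' or 'E', which is what glues
"0xE+2" together.)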

At least I'm sure you can agree that using this terminology is confusing,
just look at this thread and the last. So if you want to find out more
about preprocessing I suggest you drop it.

john
 

James Kanze

The C++ preprocessing step is very well defined, and every C++ programmer
knows precisely what it does.

Except that no two know exactly the same thing. The standard
speaks of phases of translation. Taken literally, "the C++
preprocessing step" would be phase 4. Generally, however, one
tends to speak of the preprocessor for everything through phase
6.
On the other hand, a "pass" is not a well defined term.

It is in compiler technology. It's a separate phase which
treats the entire program. Most compilers today use four
passes: a front-end, which does preprocessing, tokenizing and
parsing; a "middle-end", which does more or less processor
independent optimizations, such as common subexpression
elimination; a back-end, which does code generation; and a
peephole optimizer, which does peephole optimization. But of
course there are a lot of variants: most compilers will skip the
"middle-end" unless you've asked for optimization, and many
merge the back-end and the peephole optimizer into a single
pass.

The original C compilers, way back when, did use a separate pass
for the preprocessor, but I rather doubt that any compiler does
so today.
 

Gennaro Prota

How can a processor be a pass; something which performs a pass, at
most.
Informally, I use the terms "preprocessing" or "preprocessing phase"
to identify --roughly-- the sequence of what the ISO standard defines
as phase 3 and phase 4 of translation -- but only when there isn't any
need to be more precise (it is also the name of one of the directories
in the Breeze source tree, for instance). In any case, the standard
doesn't use the term "preprocessor", nor "preprocessing" as a
standalone noun (it uses expressions such as "preprocessing
directive", though, which may make somewhat reasonable the personal
terminology choice explained above. It arose exactly because I didn't
like to use the term "preprocessor").

To quote the standard (§2.1/7): "[In phase 7] Each preprocessing
token is converted into a token." I've always understood
everything which preceded this (i.e. phases 1-6) to be
"preprocessing", and the "preprocessor" whatever does the
"preprocessing".

Yes, that's another possibility. As I said, neither term is defined
by the standard, and everything is quite vague. People use the terms
quite informally.

Speaking of terminology and personal preferences, I've always felt
that the standard could have given names to the phases, rather than
just numbering them. In that case, I'd see something like (off the top
of my head: don't focus too much on the names):

Character Mapping (1)
Line Splicing (2)
Pre-tokenization (3)
Preprocessing (4)
Execution Character Set Mapping (5)
Literal Concatenation (6)
Tokenization (7a)
Syntactical and Semantic Analysis (7b)
Translation (7c)
Instantiation (8)
Linking (9)

I have noted in parentheses how they --more or less-- correspond to
the numbers used in the standard; in practice though if one used the
names then the separation would be slightly different and likely end
up with 10 or 12 items (in effect, what I've always felt odd isn't that
much that there are no names; rather it's the strange grouping of
things --see especially 7-- which in turn becomes manifest if you try
to give a *fitting* name to those groups).

Judging by my perception of the word "preprocessor", I see (3) and (4)
as definitely in its area of concern, though (3) is probably somehow
an "implementation detail"; (1) seems something which is logically
preceding and (2) is somehow borderline.
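To make the names concrete, here is a small fragment traced through a
few of them (an illustrative sketch, glossing over details):

    #define NAME "world"
    const char* s = "hello, " NAME;

    // (2) Line Splicing: backslash-newlines, if any, are removed.
    // (3) Pre-tokenization: each line becomes preprocessing tokens.
    // (4) Preprocessing: NAME expands, leaving "hello, " "world".
    // (6) Literal Concatenation: the adjacent literals merge into
    //     "hello, world".
    // (7a) Tokenization: preprocessing tokens become real tokens.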
With regards to "passes", I rather suspect
that very few compilers today use a separate pass for this; it's
generally integrated one way or another into the tokenization of
the input. Which doesn't mean that we can't speak of the
preprocessor, just because it isn't a separate pass.

One could speak of it as a conceptual entity (for those who like to
show off: the "abstract machine" doing preprocessing :)). But the
facts are... the one rigorous specification we have, the standard,
doesn't define the term.

To sum it up, the original question is simply ill-posed. The
translation of a C++ program conceptually happens in phases, as
described in the standard. One may decide to call preprocessing some
specific sub-sequence, and compilation some other, but there's no such
official terminology. To some, a compiler is what performs phases from
(7a) to (8), inclusive, and a linker is what performs (9). Others mean by
"compiler", or "translator", the executor of the whole translation.
 

James Kanze

On Sun, 27 May 2007 08:11:04 -0700, Sam of California wrote:
Is it accurate to say that "the preprocessor is just a pass in the parsing
of the source file"?
I responded to that comment by saying that the preprocessor is not just a
pass.
How can a processor be a pass? It is something which performs a pass,
at most.
Informally, I use the terms "preprocessing" or "preprocessing phase"
to identify --roughly-- the sequence of what the ISO standard defines
as phase 3 and phase 4 of translation -- but only when there isn't any
need to be more precise (it is also the name of one of the directories
in the Breeze source tree, for instance). In any case, the standard
doesn't use the term "preprocessor", nor "preprocessing" as a
standalone noun (it uses expressions such as "preprocessing
directive", though, which may make somewhat reasonable the personal
terminology choice explained above. It arose exactly because I didn't
like to use the term "preprocessor").
To quote the standard (§2.1/7): "[In phase 7] Each preprocessing
token is converted into a token." I've always understood
everything which preceded this (i.e. phases 1-6) to be
"preprocessing", and the "preprocessor" whatever does the
"preprocessing".
Yes, that's another possibility. As I said, neither term is defined
by the standard, and everything is quite vague. People use the terms
quite informally.

There's also the historical context to be considered. In
Johnson's pcc, the preprocessor was a separate pass, before the
compiler front-end pass. Roughly speaking, the preprocessor
read your code, broke it up into preprocessing tokens, did what
it did, and then spit out text. The front end then read this
text, and broke it up into language tokens, and parsed it. Line
breaks had significance in the pre-processor, but not in the
front-end.

Based on this, it seems logical to make the break at the point
where preprocessor tokens are converted into language tokens,
and all white space (including new-lines) ceases to have any
significance.
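The asymmetry is easy to show (an illustrative sketch):

    // To the preprocessor a newline ends the directive, so a long
    // macro needs an explicit continuation:
    #define MAX(a, b) \
        ((a) > (b) ? (a) : (b))

    // To the language proper, newlines are just white space; this
    // is one statement however it wraps:
    int x = MAX(1,
                2);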
Speaking of terminology and personal preferences, I've always felt
that the standard could have given names to the phases, rather than
just numbering them.

The sole role of the phases in the standard is to define the
order in which the different actions take place. Numbers are
very good for defining order. Everyone knows that 1 comes
before 2, but it must be explicitly stated that character
mapping comes before line splicing.
In that case, I'd see something like (off the top
of my head: don't focus too much on the names):
Character Mapping (1)
Line Splicing (2)
Pre-tokenization (3)
Preprocessing (4)
Execution Character Set Mapping (5)
Literal Concatenation (6)
Tokenization (7a)
Syntactical and Semantic Analysis (7b)
Translation (7c)
Instantiation (8)
Linking (9)
I have noted in parentheses how they --more or less-- correspond to
the numbers used in the standard; in practice though if one used the
names then the separation would be slightly different and likely end
up with 10 or 12 items (in effect, what I've always felt odd isn't that
much that there are no names; rather it's the strange grouping of
things --see especially 7-- which in turn becomes manifest if you try
to give a *fitting* name to those groups).

The order of the different operations in 7 is implicit---you
can't translate without having "syntactically and semantically
analyzed", and you can't syntactically and semantically analyse
without having language tokens. Since there are no alternatives
in the order, there's no need to separate into separate phases
to define the order.
Judging by my perception of the word "preprocessor", I see (3) and (4)
as definitely in its area of concern, though (3) is probably somehow
an "implementation detail"; (1) seems something which is logically
preceding and (2) is somehow borderline.

So what are phases 1 and 2: a prepreprocessor?
One could speak of it as a conceptual entity (for those who like to
show off: the "abstract machine" doing preprocessing :)). But the
facts are... the one rigorous specification we have, the standard,
doesn't define the term.
To sum it up, the original question is simply ill-posed. The
translation of a C++ program conceptually happens in phases, as
described in the standard. One may decide to call preprocessing some
specific sub-sequence, and compilation some other, but there's no such
official terminology. To some, a compiler is what performs phases from
(7a) to (8), inclusive, and a linker is what performs (9). Others mean by
"compiler", or "translator", the executor of the whole translation.

The traditional break (in C) has been: preprocessor: phases 1
through 6, compiler: phase 7, linker: phase 9. (Phase 8 is
concerned with instantiating templates, and doesn't have a place
in traditional C.) If you're talking about passes, however,
most compilers today will use a single pass for everything
through your 7b, above, then up to three passes for 7c, and
linking remains separate. Where phase 8 fits in varies, but I
suspect that a lot of modern compilers cram it into the first
pass as well.
 

Gennaro Prota

To quote the standard (§2.1/7): "[In phase 7] Each preprocessing
token is converted into a token." I've always understood
everything which preceded this (i.e. phases 1-6) to be
"preprocessing", and the "preprocessor" whatever does the
"preprocessing".
Yes, that's another possibility. As I said, neither term is defined
by the standard, and everything is quite vague. People use the terms
quite informally.

There's also the historical context to be considered. In
Johnson's pcc, the preprocessor was a separate pass, before the
compiler front-end pass. Roughly speaking, the preprocessor
read your code, broke it up into preprocessing tokens, did what
it did, and then spit out text. The front end then read this
text, and broke it up into language tokens, and parsed it. Line
breaks had significance in the pre-processor, but not in the
front-end.

Yes. There's also another context (many refer to this one, I guess):
that of separate preprocessor executables. Borland CPP32 is an
example (and actually the most conformant preprocessor I know of --for
some odd reason the one built into the compiler isn't even close).
They usually take your source code as input and produce a textual file
which can be fed to the compiler proper (which is just what most
compilers can do with the appropriate command line switches, too); but
as far as I remember (it's not that I really use them) they won't
output UCNs, for instance, which means that conceptually the compiler
proper has to perform (1) again. So, I see (1) as something which
logically comes "first": it is more or less necessary each time you
read a textual file, even if you start with phase 7 directly. (But let
me know if my point is clear :))
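To make the point about (1) concrete (an illustrative, conforming
fragment, though older compilers were shaky about extended
identifiers):

    int caf\u00e9 = 0;   // an identifier spelled with a UCN
    // Writing 'café' with the raw character instead relies on
    // phase (1) to map the é to \u00e9 first; a standalone
    // preprocessor that emits the raw character leaves that
    // mapping to be redone by whatever reads its output.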
[...]
The order of the different operations in 7 is implicit---you
can't translate without having "syntactically and semantically
analyzed", and you can't syntactically and semantically analyse
without having language tokens. Since there are no alternatives
in the order, there's no need to separate into separate phases
to define the order.

I see, but I'd have liked it anyway, stylistically/logically.
So what are phases 1 and 2: a prepreprocessor?

Sort of :) Seriously, I think of "preprocessing" as the
transformation which happens on the source text by executing all the
preprocessing directives. That is *my own* perception of the word, as
I said, and it probably originates from the fact that, before I had
the standard, I thought that all that preprocessing is about was
executing #includes and expanding #defines.
 

JohnQ

(My post is pretty much "an aside" to the highly technical nature of the
thread. How should I actually post in that way? Putting an "Aside:" in front
of the "Re:"?).

On Sun, 27 May 2007 08:11:04 -0700, Sam of California wrote:
Is it accurate to say that "the preprocessor is just a pass in the
parsing of the source file"?
I responded to that comment by saying that the preprocessor is not
just a pass.
How can a processor be a pass? It is something which performs a pass,
at most.
Informally, I use the terms "preprocessing" or "preprocessing phase"
to identify --roughly-- the sequence of what the ISO standard defines
as phase 3 and phase 4 of translation -- but only when there isn't any
need to be more precise (it is also the name of one of the directories
in the Breeze source tree, for instance). In any case, the standard
doesn't use the term "preprocessor", nor "preprocessing" as a
standalone noun (it uses expressions such as "preprocessing
directive", though, which may make somewhat reasonable the personal
terminology choice explained above. It arose exactly because I didn't
like to use the term "preprocessor").
To quote the standard (§2.1/7): "[In phase 7] Each preprocessing
token is converted into a token." I've always understood
everything which preceded this (i.e. phases 1-6) to be
"preprocessing", and the "preprocessor" whatever does the
"preprocessing".
Yes, that's another possibility. As I said, neither term is defined
by the standard, and everything is quite vague. People use the terms
quite informally.

"There's also the historical context to be considered. In
Johnson's pcc, the preprocessor was a separate pass, before the
compiler front-end pass. Roughly speaking, the preprocessor
read your code, broke it up into preprocessing tokens, did what
it did, and then spit out text. The front end then read this
text, and broke it up into language tokens, and parsed it. Line
breaks had significance in the pre-processor, but not in the
front-end."

I'm all for "the good ol' days" if it makes compiler system construction
easier. Do you high-end engineers feel that only the current "state of the
art" machinery is worthy of building upon? (If you know me yet, you know
what I think: that if it's really complex, then it's not foundational).

"Based on this, it seems logical to make the break at the point
where preprocessor tokens are converted into language tokens,
and all white space (including new-lines) ceases to have any
significance."

See, now that's something I could grok if I wanted to learn about compiler
construction and wanted to develop a compiler. (I hope people still want to
build "simple" compilers, because I surely don't want to do it!)

All "remote references" aside, aren't things like "optimizing compilers" for
scientific computing and the like only now? I mean, I want to build my
program with multiple threads (!) (yes, and with C++!). Pretty risky once
you turn on the optimizations huh?

<Thoughts about "saving the preprocessor's life".... no wait, "giving it its
life back!", omitted>.
Speaking of terminology and personal preferences, I've always felt
that the standard could have given names to the phases, rather than
just numbering them.

"The sole role of the phases in the standard is to define the
order in which the different actions take place. Numbers are
very good for defining order. Everyone knows that 1 comes
before 2, but it must be explicitly stated that character
mapping comes before line splicing."

Isn't that a programmer's dream (!): a sequential list of things to program.
(I admit, a bit boring, but fine work when the brain is only at half
capacity).
In that case, I'd see something like (off the top
of my head: don't focus too much on the names):
Character Mapping (1)
Line Splicing (2)
Pre-tokenization (3)
Preprocessing (4)
Execution Character Set Mapping (5)
Literal Concatenation (6)
Tokenization (7a)
Syntactical and Semantic Analysis (7b)
Translation (7c)
Instantiation (8)
Linking (9)

Damn, I'm learning too much about this stuff now and feel like I'm "going
backwards" again! :p (No worries though, I'm never going to write a
compiler!)
To sum it up, the original question is simply ill-posed. The
translation of a C++ program conceptually happens in phases, as
described in the standard. One may decide to call preprocessing some
specific sub-sequence, and compilation some other, but there's no such
official terminology. To some, a compiler is what performs phases from
(7a) to (8), inclusive, and a linker is what performs (9). Others mean by
"compiler", or "translator", the executor of the whole translation.

"The traditional break (in C) has been: preprocessor: phases 1
through 6, compiler: phase 7, linker: phase 9. (Phase 8 is
concerned with instantiating templates, and doesn't have a place
in traditional C.) If you're talking about passes, however,
most compilers today will use a single pass for everything
through your 7b, above, then up to three passes for 7c, and
linking remains separate. Where phase 8 fits in varies, but I
suspect that a lot of modern compilers cram it into the first
pass as well."

And if one wanted to get that kind of info formally, where would someone get
that? Certainly not the dragon compiler book (?). (Or would "one" just hire
you or your company to use that knowledge?)

John
 

James Kanze

On May 30, 6:22 am, "JohnQ" <[email protected]>
wrote:

[It would help if you'd use standard Internet protocol
citations. It's hard to follow who says what in your
postings.]
"James Kanze" <[email protected]> wrote in message

[...]
All "remote references" aside, aren't things like "optimizing
compilers" for scientific computing and the like only now?

They're for anyone who needs the performance. Modern RISC
processors are designed so that you almost always need some
optimization.
I mean, I want to build my program with multiple threads (!)
(yes, and with C++!). Pretty risky once you turn on the
optimizations huh?

Not really. Even after optimizing, the compiler must respect
the appropriate guarantees. And not optimizing doesn't give you
any additional guarantees.

[...]
"The traditional break (in C) has been: preprocessor: phases 1
through 6, compiler: phase 7, linker: phase 9. (Phase 8 is
concerned with instantiating templates, and doesn't have a place
in traditional C.) If you're talking about passes, however,
most compilers today will use a single pass for everything
through your 7b, above, then up to three passes for 7c, and
linking remains separate. Where phase 8 fits in varies, but I
suspect that a lot of modern compilers cram it into the first
pass as well."
And if one wanted to get that kind of info formally, where
would someone get that?

I don't know. The "traditional break" is just one of those
things you knew if you were programming under Unix back in the
1980's. What modern compilers do is a result of observation,
using a number of different modern compilers. (I seem to recall
having seen it actually mentioned in the documentation for g++,
but I could easily be wrong.)
Certainly not the dragon compiler book (?). (Or would "one" just hire
you or your company to use that knowledge?)

What would a company want to do with that kind of knowledge?
It's really purely anecdotal, and certainly has no effect on
how you write C++ or develop programs. (That doesn't mean that
it is uninteresting. Just that it doesn't have any real
financial value that a company would pay for.)
 

JohnQ

On May 30, 6:22 am, "JohnQ" <[email protected]>
wrote:

" [It would help if you'd use standard Internet protocol
citations. It's hard to follow who says what in your
postings.]"

For some reason, my newsgroup reader doesn't insert the > symbols in certain
responses (on yours for instance!). I don't know why!
"James Kanze" <[email protected]> wrote in message

[...]
All "remote references" aside, aren't things like "optimizing
compilers" for scientific computing and the like only now?

"They're for anyone who needs the performance. Modern RISC
processors are designed so that you almost always need some
optimization."

I was suggesting that most programs don't really need it because processors
are so fast these days and the majority of programs aren't of the
"scientific-computing" kind.
I mean, I want to build my program with multiple threads (!)
(yes, and with C++!). Pretty risky once you turn on the
optimizations huh?

"Not really. Even after optimizing, the compiler must respect
the appropriate guarantees. And not optimizing doesn't give you
any additional guarantees."

Preventing some reordering of statements within blocks by not turning on
optimization is what I was thinking.

[...]
"The traditional break (in C) has been: preprocessor: phases 1
through 6, compiler: phase 7, linker: phase 9. (Phase 8 is
concerned with instantiating templates, and doesn't have a place
in traditional C.) If you're talking about passes, however,
most compilers today will use a single pass for everything
through your 7b, above, then up to three passes for 7c, and
linking remains separate. Where phase 8 fits in varies, but I
suspect that a lot of modern compilers cram it into the first
pass as well."
And if one wanted to get that kind of info formally, where
would someone get that?

I don't know. The "traditional break" is just one of those
things you knew if you were programming under Unix back in the
1980's. What modern compilers do is a result of observation,
using a number of different modern compilers. (I seem to recall
having seen it actually mentioned in the documentation for g++,
but I could easily be wrong.)
Certainly not the dragon compiler book (?). (Or would "one" just hire
you or your company to use that knowledge?)

"What would a company want to do with that kind of knowledge."

Build a compiler for a new language.

"It's really purely annecdotal, and certainly has no effect on
how you write C++ or develop programs. (That doesn't mean that
it is uninteresting. Just that it doesn't have any real
financial value, that a company would pay for.)"


Building a compiler for a new language?

John
 

James Kanze

On May 30, 6:22 am, "JohnQ" <[email protected]>
wrote:
" [It would help if you'd use standard Internet protocol
citations. It's hard to follow who says what in your
postings.]"
For some reason, my newsgroup reader doesn't insert the >
symbols in certain responses (on yours for instance!). I don't
know why!

Then it's time to change newsreaders. Even Google gets this
right.
All "remote references" aside, aren't things like "optimizing
compilers" for scientific computing and the like only now?
"They're for anyone who needs the performance. Modern RISC
processors are designed so that you almost always need some
optimization."
I was suggesting that most programs don't really need it
because processors are so fast these days and the majority of
programs aren't of the "scientific-computing" kind.

Most programs are IO bound, and of course, they won't use
optimization. But you don't have to be in scientific computing
to find exceptions.
"Not really. Even after optimizing, the compiler must respect
the appropriate guarantees. And not optimizing doesn't give you
any additional guarantees."
Preventing some reordering of statements within blocks by not
turning on optimization is what I was thinking.

But you don't have any more guarantees. I'm most familiar with
the Posix environment, and Posix compliant compilers give the
guarantees you need, regardless of optimization. If you don't
want things like write order rearranged, you need special
hardware instructions---the compiler inserts these where needed
according to the guarantees it gives, and not otherwise.
Regardless of the level of optimization.
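In today's C++ the same point can be sketched with the standard
library (a modern rendering; the thread itself predates std::thread):

    #include <mutex>

    int shared = 0;
    std::mutex m;

    void producer()
    {
        std::lock_guard<std::mutex> lock(m);
        shared = 42;   // the compiler may reorder ordinary stores,
    }                  // but not across the lock/unlock: the mutex
                       // implies the needed compiler and hardware
                       // barriers, at -O0 and -O3 alike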
[...]
"What would a company want to do with that kind of knowledge."
Build a compiler for a new language.

If you want to write a compiler, you hire someone who knows
about compilers, and modern machine architecture. It is a
specialized field.
"It's really purely annecdotal, and certainly has no effect on
how you write C++ or develop programs. (That doesn't mean that
it is uninteresting. Just that it doesn't have any real
financial value, that a company would pay for.)"
Building a compiler for a new language?

I don't know many companies in that business:). But seriously,
you don't hire beginners for that. You hire people who know
compilers.
 

Gennaro Prota

" [It would help if you'd use standard Internet protocol
citations. It's hard to follow who says what in your
postings.]"

For some reason, my newsgroup reader doesn't insert the > symbols in certain
responses (on yours for instance!). I don't know why!

<OT>
You might want to google for "GNKSA" and "Outlook Express 6"
</OT>
 

JohnQ

On May 30, 6:22 am, "JohnQ" <[email protected]>
wrote:
" [It would help if you'd use standard Internet protocol
citations. It's hard to follow who says what in your
postings.]"
For some reason, my newsgroup reader doesn't insert the >
symbols in certain responses (on yours for instance!). I don't
know why!

"Then it's time to change newsreaders. Even Google gets this
right."

I'll keep an eye on it now, but I think it may only be happening
in response to your posts.
All "remote references" aside, aren't things like "optimizing
compilers" for scientific computing and the like only now?
"They're for anyone who needs the performance. Modern RISC
processors are designed so that you almost always need some
optimization."
I was suggesting that most programs don't really need it
because processors are so fast these days and the majority of
programs aren't of the "scientific-computing" kind.

"Most programs are IO bound, and of course, they won't use
optimization. But you don't have to be in scientific computing
to find exceptions."

So, optimization can be avoided is what you're saying.
"Not really. Even after optimizing, the compiler must respect
the appropriate guarantees. And not optimizing doesn't give you
any additional guarantees."
Preventing some reordering of statements within blocks by not
turning on optimization is what I was thinking.

"But you don't have any more guarantees. I'm most familiar with
the Posix environment, and Posix compliant compilers give the
guarantees you need, regardless of optimization. If you don't
want things like write order rearranged, you need special
hardware instructions---the compiler inserts these where needed
according to the guarantees it gives, and not otherwise.
Regardless of the level of optimization."

That's the theory at least. Better safe than sorry. Turning on
optimization for testing whether funny things happen is a good
idea though.
[...]
"What would a company want to do with that kind of knowledge."
Build a compiler for a new language.

"If you want to write a compiler, you hire someone who knows
about compilers, and modern machine architecture. It is a
specialized field."

Or learn it yourself if you are young enough.
"It's really purely annecdotal, and certainly has no effect on
how you write C++ or develop programs. (That doesn't mean that
it is uninteresting. Just that it doesn't have any real
financial value, that a company would pay for.)"
Building a compiler for a new language?

"I don't know many companies in that business:). But seriously,
you don't hire beginners for that. You hire people who know
compilers."

Isn't it dependent on how complex the language is? Are front
ends or back ends harder to create? Don't both of those exist
premade already (EDG?).

John
 
