Macro expansion

Eric Sosman · Nov 19, 2010

Seebs said:
Not so!

The proposed action is re-running the preprocessor.

Since the first pass consumed all the #defines, there won't be any
more, unless the code was expanding to things which looked like #defines,
in which case it almost certainly didn't compile now.

So the further passes wouldn't *do* anything. The macro names
would no longer be macro names, because there'd be no macro definitions
to create macro names.

The proposal is actually completely harmless to just about any real
code -- except for possible issues with #line, I don't think it would
have ANY effect on real programs.

Ah. On rereading sandeep's proposal, I see that I was misled
by his use of the word "recursively." With that word in view, I
somehow imagined his proposal involved recursion. Silly me.

I withdraw my objection, and join my voice to your question:
In what way is his proposal different from a no-op? He suggests
(1) running the preprocessor repeatedly, which does nothing, and
(2) condemning the repeated runs, which just makes us feel shame.

Jon · Nov 19, 2010

Seebs said:
I think at this point, it is safe to say that sandeep is just
trolling us.

Think about the Stopped Clock. No one could possibly come up with
this many suggestions which are this bad without doing it on purpose.

You must think very highly of yourself to keep on suggesting that people
have nothing better to do with their time than troll here. You sound like
a broken record.

Jon · Nov 19, 2010

Peter said:
In what way is that a _serious_ limitation?

Perhaps, but how useful are they? The preprocessing phase
serves a very simple purpose. It was never meant to be a
language in its own right. Most other languages don't even
have one.

Of course though, (C++) templates are just glorified macro processing.
Some such concoction is decidedly a necessary capability of "modern"
languages. Snippets, text replacement, macros, templates... all allude to
the higher-level concept of parametric code generation. How deeply it is
integrated into the language is an important design consideration.

Keith Thompson · Nov 19, 2010

Seebs said:
As-written, no currently valid C program would be changed by it.

(Because all the #defines would be gone in the resulting code, so
there'd be no macros to expand.)

Hmm?

A C program's source is what it is before the preprocessor is invoked,
macro definitions and all. I don't see how sandeep's proposal would
change that.

Here's a currently valid C program:

#include <stdio.h>
#define foo foo
int main(void)
{
    char *foo = "hello, world";
    puts(foo);
    return 0;
}

Under sandeep's proposal, if I understand it correctly, it would be
invalid because it would send the compiler into infinite recursion.

Keith Thompson · Nov 19, 2010

Seebs said:
Not so!

The proposed action is re-running the preprocessor.

Since the first pass consumed all the #defines, there won't be any
more, unless the code was expanding to things which looked like #defines,
in which case it almost certainly didn't compile now.

So the further passes wouldn't *do* anything. The macro names
would no longer be macro names, because there'd be no macro definitions
to create macro names.

The proposal is actually completely harmless to just about any real
code -- except for possible issues with #line, I don't think it would
have ANY effect on real programs.

Ah, yes, I see what you mean.

For reference, here's sandeep's original proposal:

One of the serious limitations of C is the fact that macro
expansion is not done recursively. Many powerful constructions
would become possible if this was allowed. It is also something
that often trips up novices in the language.

I would suggest that in the next release of the ISO Standard,
it is decried that the preprocessor shall be run repeatedly. In
fact it shall be run n times until the nth run does not change
the source file.

You're right, feeding the preprocessor's output back to the
preprocessor repeatedly means that all #defines are removed on the
first pass. Any references to __LINE__ are resolved on the first
pass, so line numbers should be (mostly?) unaffected. There are some
corner cases involving predefined macros, and as you say #line might
be affected somehow (I'm not going to take the time to explore that).

Note that this is still rather ill-defined. Which translation
phases constitute "the preprocessor"? Phase 1 maps source file
characters to the source character set; repeating that could
certainly cause problems on some systems. Phase 5 translates the
source character set to the execution character set in character
constants and string literals; repeating this could also cause
problems. Phase 6 concatenates adjacent string literals; because
of this, the second pass could see longer logical source lines
than the first did, possibly exceeding an implementation limit.
It can also introduce new trigraphs: "??" "/"

I've guessed that what sandeep really meant (or would have meant
if he'd thought it through) was that, rather than re-running the
entire preprocessor phase on the entire source file, the "Rescanning
and further replacement" specified in 6.10.3.4 would simply not
avoid re-expanding the name of the macro being expanded, i.e.,
drop paragraph 2.

I'm still interested in seeing a concrete example where this is
useful -- either that, or an explanation of what he really has
in mind.

Seebs · Nov 19, 2010

You're trying to show how easy it is to change the standard.
You're failing because you're wrong; it isn't easy.

I honestly have a hard time believing this is someone doing a
Master's degree in any field related to computer science, unless
it's trolling as part of a psychology experiment on software
developers.

Although... you never know...

http://chronicle.com/article/The-Shadow-Scholar/125329/

Still, it's hard to imagine anyone whose understanding of C is
so persistently, consistently, awful getting to the point of working
on a thesis about C.

-s

Seebs · Nov 19, 2010

Keith Thompson said:
Hmm?

A C program's source is what it is before the preprocessor is invoked,
macro definitions and all. I don't see how sandeep's proposal would
change that.

Here's a currently valid C program:

#include <stdio.h>
#define foo foo
int main(void)
{
    char *foo = "hello, world";
    puts(foo);
    return 0;
}

Under sandeep's proposal, if I understand it correctly, it would be
invalid because it would send the compiler into infinite recursion.

He said to run it repeatedly until there are no changes. Obviously, he
doesn't mean running it on the same source file repeatedly, he means running
it on the output.

So.

Pass 1 produces:
#line 1 "stdio.h"
<magic>

int main(void)
{
    char *foo = "hello, world";
    puts(foo);
    return 0;
}

pass 2 doesn't change this in any way.

To make it more clear, consider:
#define foo foo

int foo;

pass 1 produces:
[blank line]

int foo;

pass 2 doesn't change this. There are no longer any #defines for the
repeated runs of the preprocessor to expand.

-s

BartC · Nov 20, 2010

sandeep, can you provide a concrete example of something that
produces one expansion in C as it currently exists, and would
produce a better expansion under your proposal?

Possibly something like this (if you imagine a macro and #if directives can
be written like this):

#define fib(n)\
#if(n<=1)
1
#else
fib((n)-1)+fib((n)-2)
#endif

Original source: fib(5)
After pass1: fib(4)+fib(3)
After pass2: fib(3)+fib(2) + fib(2)+fib(1)
After pass3: fib(2)+fib(1)+fib(1)+fib(0) + fib(1)+fib(0) + 1
After pass4: fib(1)+fib(0) +1+1+1 + 1+1 + 1
After pass5: 1+1 +1+1+1 + 1+1 + 1
After folding: 8

There are a few problems implementing this (#if probably doesn't work that
way, for a start), but it's the sort of thing that might be possible:

int a = fib(5);

compiles to:

int a = 8;

Not sure what would happen though if you tried fib(fib(5))...

As it works now:

#define fib(n) ((n)<=1 ? fib((n)-1)+fib((n)-2))

expands fib(5) to: (5<=1 ? fib(4)+fib(3)), where fib() is undefined.

Stefan Ram · Nov 20, 2010

BartC said:
#define fib(n) ((n)<=1 ? fib((n)-1)+fib((n)-2))
expands fib(5) to: (5<=1 ? fib(4)+fib(3)), where fib() is undefined.

#define fib0 0
#define fib1 1
#define fib2 1
#define fib3 2
#define fib4 3
#define fib5 5
#define fib6 8
#define fib7 13
#define fib8 21
#define fib9 34
#define fib10 55
#define fib11 89
#define fib12 144
#define fib13 233
#define fib14 377
#define fib15 610
#define fib16 987
#define fib17 1597
#define fib18 2584
#define fib19 4181
#define fib20 6765
#define fib21 10946
#define fib22 17711
#define fib23 28657

(ISO/IEC 9899:1999 (E) only requires int to represent
positive values up to 32767, IIRC.)

BartC · Nov 20, 2010

Hmm, that should end with :1 (fib0 is 1) or :n (fib0 is 0). But it won't
work anyway...

#define fib0 0 ....
#define fib23 28657

(Or possibly up to fib46 if you assume 32-bits).

The advantage of such a macro is not bothering with tabulating all the
values. Assuming some script is written to create the list, you might as
well turn the script into a macro instead.

(Of course, as written, when calculating fib(46+), exceeding 32-bits is the
least of the problems, as it might never get that far...)

Stefan Ram · Nov 20, 2010

BartC said:
The advantage of such a macro is not bothering with tabulating all the
values. Assuming some script is written to create the list, you might as
well turn the script into a macro instead.

When I compiled

static int f( int const i ){ return i < 2 ? 1 : f( i - 1 )+ f( i - 2 ); }
int main( void ){ return f( 10 ); }

, I was a little bit disappointed that gcc has not compiled
the »f( 10 )« as if it would have been written as »55«.

Does anyone know how to make gcc do this?

(Another means to calculate a fibonacci number at compile time
might be to use C++ and template metaprogramming.)

Sjouke Burry · Nov 20, 2010

Stefan said:
When I compiled

static int f( int const i ){ return i < 2 ? 1 : f( i - 1 )+ f( i - 2 ); }
int main( void ){ return f( 10 ); }

, I was a little bit disappointed that gcc has not compiled
the »f( 10 )« as if it would have been written as »55«.

Does anyone know how to make gcc do this?

(Another means to calculate a fibonacci number at compile time
might be to use C++ and template metaprogramming.)

f0 1
f1 1
f2 2
f3 3
f4 5
f5 8
f6 13
f7 21
f8 34
f9 55
f10 89 what do you mean:55?????????????????
and btw, you could use a print statement
instead of "return f(10)".
Returning it as an error code might not
print anything.

Stefan Ram · Nov 20, 2010

Sjouke Burry said:
f10 89 what do you mean:55?????????????????

f was intended to give the fibonacci number, but was
not written correctly by me. So, f( 10 ) was supposed
to be 55, but indeed is 89.

Ian Collins · Nov 20, 2010

Stefan said:
When I compiled

static int f( int const i ){ return i< 2 ? 1 : f( i - 1 )+ f( i - 2 ); }
int main( void ){ return f( 10 ); }

, I was a little bit disappointed that gcc has not compiled
the »f( 10 )« as if it would have been written as »55«.

Does anyone know how to make gcc do this?

(Another means to calculate a fibonacci number at compile time
might be to use C++ and template metaprogramming.)

That's close to what is being suggested here. Template expansions have
a built in mechanism (specialisation) to prevent infinite recursion, but
the preprocessor does not.

Jon · Nov 20, 2010

Seebs said:
I honestly have a hard time believing this is someone doing a
Master's degree in any field related to computer science, unless
it's trolling as part of a psychology experiment on software
developers.

Although... you never know...

http://chronicle.com/article/The-Shadow-Scholar/125329/

Still, it's hard to imagine anyone whose understanding of C is
so persistently, consistently, awful getting to the point of working
on a thesis about C.

The irony is that anyone in this day and age would write a thesis about
something that can be summed up succinctly in less than a paragraph! (OK,
I took some liberty: he said "development of ISO standard", and I swapped
that out for "C".) Perhaps the committee thang never got its tabloidal
write-up; I wouldn't know 'bout dat.

Gene · Nov 20, 2010

sandeep said:
One of the serious limitations of C is the fact that macro expansion is
not done recursively. Many powerful constructions would become possible
if this was allowed. It is also something that often trips up novices in
the language.

I would suggest that in the next release of the ISO Standard, it is
decried that the preprocessor shall be run repeatedly. In fact it shall
be run n times until the nth run does not change the source file.

Man I _really_ disagree with this. "Powerful" in this case means
write-only code. If you have ever tried to debug TeX/LaTeX (a
dementedly complex macro-based language) or a configure script (based
on the fairly powerful m4 macro processor), you will know that the
restrictions in the C pre-processor are by design. Lisp macros are
better because the language itself generates the macro expansions, but
even Lisp macros can get out of control pretty easily. Hence, serious
language designers gave up on pre-processing in the 80s. Please, C
standards people, don't do it!

Paul Mensonides · Nov 20, 2010

f was intended to give the fibonacci number, but was not written
correctly by me. So, f( 10 ) was supposed to be 55, but indeed is 89.

FYI:

// fib.c
#include <stdio.h>
#include <chaos/preprocessor.h>

#define FIB(n) \
    CHAOS_PP_ARBITRARY_DEMOTE( \
        CHAOS_PP_VARIADIC_ELEM( \
            2, \
            CHAOS_PP_EXPR(CHAOS_PP_WHILE_X( \
                20, \
                FIB_P, FIB_O, \
                CHAOS_PP_ARBITRARY_DEC( \
                    CHAOS_PP_IIF(CHAOS_PP_IS_VARIADIC(n))( \
                        n, \
                        CHAOS_PP_ARBITRARY_PROMOTE(n) \
                    ) \
                ), \
                (0), (1) \
            )) \
        ) \
    ) \
    /**/
#define FIB_P(s, n, a, b) \
    CHAOS_PP_ARBITRARY_BOOL(n) \
    /**/
#define FIB_O(s, n, a, b) \
    CHAOS_PP_ARBITRARY_DEC(n), b, CHAOS_PP_ARBITRARY_ADD(a, b) \
    /**/

int main(void) {
    printf(CHAOS_PP_STRINGIZE(FIB(500)) "\n");
    return 0;
}

The program above computes the 500th Fibonacci number with the
preprocessor. It takes about 20 seconds to compile with GCC on the Linux
VM that I'm running on a mid-range machine.

I did the above on Linux, but it will also work under Cygwin with MinGW
(either the Cygwin-dependent version or the standalone).

$ gcc -std=c99 -I $CHAOS_ROOT -o fib fib.c
$ ./fib
139423224561697880139724382870407283950070256587697307264108962948325571622863290691557658876222521294125

The input to the macro FIB is an integer literal (without suffixes) in
the range 1-512. Values larger than that have to be input in Chaos'
arbitrary precision format such as (5)(1)(3) for 513. The output is a
single pp-number token. For FIB(500), as in the above, that pp-number is
far greater than is representable in a 64-bit value, so it has to be
stringized before the underlying compiler gets a hold of it.

Note, BTW, that I'm not suggesting that computing Fibonacci numbers with
the preprocessor is advisable. I'm merely mentioning that everything
mentioned in this thread is already possible with a standard C or C++
preprocessor without iteratively invoking the preprocessor in a LaTeX-
like style.

-----

Chaos has to be pulled from the repository and then put wherever you
want....

$ cd <whereever>
$ cvs -z3 -d

server:[email protected]:/cvsroot/chaos-pp co -P chaos-pp
$ export CHAOS_ROOT=<whereever>

Chaos is a preprocessor metaprogramming library originally derived from
the Boost Preprocessor Library. While Boost is a collection of C++
libraries (rather than C), the Boost Preprocessor Library is C/C++.
Chaos is a vastly more powerful derivative that doesn't mess around with
workarounds for broken preprocessors. Thus, it takes a very good
preprocessor to handle it (such as GCC's, EDG's, and a few others).

Regards,
Paul Mensonides

Mark Wooding · Nov 20, 2010

Sjouke Burry said:
f0 1
f1 1
f2 2
f3 3
f4 5
f5 8
f6 13
f7 21
f8 34
f9 55
f10 89 what do you mean:55?????????????????

Usually, one sets F(0) = 0 and F(1) = 1. Then F(n) is the coefficient
of x in the linear representative of x^n modulo x^2 - x - 1 (see HAKMEM
item 12).

-- [mdw]

Stefan Ram · Nov 20, 2010

Gene said:
Man I _really_ disagree with this. "Powerful" in this case means
write-only code. If you have ever tried to debug TeX/LaTeX (a
dementedly complex macro-based language) or a configure script (based
on the fairly powerful m4 macro processor), you will know that the
restrictions in the C pre-processor are by design. Lisp macros are
better because the language itself generates the macro expansions, but
even Lisp macros can get out of control pretty easily. Hence, serious
language designers gave up on pre-processing in the 80s.

Are there any sources where one can read more about those
deliberations/discussions of these language designers with
regard to preprocessors? Are there any text books reporting
about these discussions and their results?

Also, if /powerful/ preprocessors are write-only, does this
also mean that any kind of preprocessor (even a /less/
powerful preprocessor) should be avoided?

Could it be that the problem is not the macros themselves,
but lack of proper documentation? After all, code (without a
preprocessor) can do fairly complex activities at runtime
and one often only can cope with this using proper
documentation.

A sophisticated preprocessor allows one to build a whole new
language. A new programmer on the project needs to
learn this language, of course. Thus, he needs a tutorial
and a reference manual for this new language, and time to
learn it. This new language is of course
worth the effort only if it helps to get more things done
more efficiently, by enough to outweigh all
this effort.

Can we define what a preprocessor is? Is a compiler a
preprocessor for its target language?

The first C++ implementation (cfront) is called
»preprocessor« here:

»the first C++ implementation was preprocessor«

http://www.drdobbs.com/184415273

It also relates AOP (a technique of the 90s?) to
preprocessors:

»The first [AOP ]implementations were preprocessors«

Stefan Ram · Nov 20, 2010

Does anyone know how to make gcc do this?

Now, I have simplified the recursion:

static int f( int const i ){ return i < 10 ? 720 : 1 + f( i - 1 ); }
int main( void ){ return f( 10 ); }

At least in this case, gcc will eliminate f (with certain options)
and generate:

main:
(...)
MOVL $721, %EAX
(...)
RET

This shows that it might be possible for a C compiler to
calculate the Fibonacci numbers at compile time, too,
without the preprocessor.
