Preprocessor unique names

the.theorist

I was implementing the traditional FOREACH macro, and noticed that in
C99 mode the usual int used for iteration must be declared prior to
the for loop. This means that my macro must declare a variable in an
outer scope. In order to avoid scope pollution, I'm wondering if there
is a way to coerce the preprocessor into creating unique names. Let me
demonstrate what I mean:

My macro:
#define FOREACH(p, set) \
int _i = 0; \
for (p = (set)->pset[_i]; \
_i<(set)->num; \
++_i) \

It's clear that in any scope I use this macro, an int variable by the
name _i will be created. Though I'm reasonably sure that this will not
be a problem if I use the macro multiple times in the same scope, or
choose to nest the FOREACH's, I'm still curious about how I could get
the preprocessor to generate a unique name for _i, so as to avoid any
possible aliasing.

My first thought was to use the __LINE__ macro and ##
concatenation. My plan was for the iteration variable to be named
_i610 if the FOREACH macro occurred on line 610 in the source. For my
purposes, using line numbers will be unique enough. Unfortunately, the
preprocessor doesn't expand the __LINE__ and the variable name came
out as _i__LINE__. So my question for the group: Is there a way to get
the C preprocessor to generate unique names?
 
Nate Eldredge

Hm? I think you definitely have a problem if you write

FOREACH(p, set) foo(p);
FOREACH(p, set) bar(p);

The variable _i will be declared twice in the same scope, which is
definitely an error. Moreover, you should not be using names beginning with
underscores; they are reserved.

Anyway, your macro looks bogus, since p doesn't change during iteration
of the loop.

One approach might be to make p be a pointer, and use it as the
iterator. E.g.

#define FOREACH(p, set) \
for (p = (set)->pset; p < (set)->pset + (set)->num; p++)

junk *p;
FOREACH(p, set) foo(*p);

Then no temporary variable is needed.
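
To make the pointer version concrete, here is a minimal sketch. The struct layout (an int array plus a count) is an assumption, since the thread never shows the real type; `struct int_set` and `sum_twice` are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Hypothetical container; the real type is not shown in the thread. */
struct int_set {
    int   *pset;  /* pointer to the first element */
    size_t num;   /* number of elements */
};

/* p walks the array itself, so the macro needs no hidden counter. */
#define FOREACH(p, set) \
    for ((p) = (set)->pset; (p) < (set)->pset + (set)->num; ++(p))

/* Uses FOREACH twice in one scope on purpose: nothing is redeclared. */
int sum_twice(const struct int_set *s)
{
    int total = 0;
    int *p;
    FOREACH(p, s) total += *p;
    FOREACH(p, s) total += *p;
    return total;
}
```

Note that the macro evaluates `set` more than once, so the argument should not have side effects.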

the.theorist said:
My first thought was to use the __LINE__ macro and ##
concatenation. My plan was for the iteration variable to be named
_i610 if the FOREACH macro occurred on line 610 in the source. For my
purposes, using line numbers will be unique enough. Unfortunately, the
preprocessor doesn't expand the __LINE__ and the variable name came
out as _i__LINE__. So my question for the group: Is there a way to get
the C preprocessor to generate unique names?

AFAIK, no.
 
Keith Thompson

the.theorist said:
I was implementing the traditional FOREACH macro, and noticed that in
C99 mode the usual int used for iteration must be declared prior to
the for loop.

You mean C90 mode. In C99, you can declare the variable in the loop:

for (int i = 0; i < 10; i ++) { /* ... */ }
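
As a rough sketch of what that buys the macro in C99: the counter can live inside the for statement, so repeated and nested uses never collide. The element type, `struct int_set`, the counter name `foreach_i`, and `count_pairs` are all assumptions made for the example; here p receives each element by value, as in the original macro.

```c
/* Hypothetical container; the real type is not shown in the thread. */
struct int_set {
    int *pset;  /* elements */
    int  num;   /* number of elements */
};

/* C99 only: the counter is declared in the for header, so its scope
   ends with the loop. The comma expression assigns the current element
   to p and then yields 1 so the loop condition stays true. */
#define FOREACH(p, set) \
    for (int foreach_i = 0; \
         foreach_i < (set)->num && ((p) = (set)->pset[foreach_i], 1); \
         ++foreach_i)

/* Nested use: the inner counter simply shadows the outer one. */
int count_pairs(const struct int_set *a, const struct int_set *b)
{
    int x, y, pairs = 0;
    FOREACH(x, a)
        FOREACH(y, b)
            if (x < y) ++pairs;
    return pairs;
}
```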

Your macro expands to a declaration followed by a statement, which
means that, in C90, it can't legally follow a statement within a
block. (C90 requires all declarations to precede all statements in a
block.)

The simplest solution is to create a new scope. Add curly braces to
the macro definition. You'll also want to use the do ... while(0)
idiom to make sure the macro can be used anywhere a statement can
appear. See question 10.4 of the comp.lang.c FAQ,
<http://www.c-faq.com/>.

Ah, I just noticed that the macro expansion includes only the header
of the for statement. One solution is to define a separate macro to
mark the end of the loop, and require the user to write FOREACH and
END_FOREACH in matched pairs.

Note also that identifiers starting with underscores are reserved.
Identifiers starting with a double underscore, or an underscore and an
uppercase letter, are always reserved. Other identifiers starting
with underscores are reserved for use at file scope, so you can
probably get away with this usage. But there's no need to start the
identifier with an underscore anyway.

So here's one possibility (untested):

#define FOREACH(p, set) \
do { \
int i = 0; \
for ((p) = (set)->pset; i<(set)->num; ++i, ++(p)) {

#define END_FOREACH } } while (0);

Invoked as:

FOREACH(p, set)
/* body of loop */
END_FOREACH

I'm not happy about the fact that the user doesn't provide the curly
braces -- but then you can add an extra set anyway:

FOREACH (p, set) {
/* body of loop */
} END_FOREACH

I'm not sure whether the final semicolon should be part of the macro
definition or provided by the user.
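
For reference, a compilable sketch of the matched-pair idea follows. It is one possible reading, not a tested library macro: `struct int_set`, the counter name `foreach_i`, and `sum_set` are hypothetical, p is advanced together with the counter, END_FOREACH closes both the loop body and the do block, and the final semicolon is left to the caller.

```c
/* Hypothetical container; the real type is not shown in the thread. */
struct int_set {
    int *pset;  /* elements */
    int  num;   /* number of elements */
};

/* Opens a do-block (so the counter declaration is legal in C90 even
   after a statement) and then the loop body. */
#define FOREACH(p, set)                              \
    do {                                             \
        int foreach_i;                               \
        for (foreach_i = 0, (p) = (set)->pset;       \
             foreach_i < (set)->num;                 \
             ++foreach_i, ++(p)) {

/* Closes the loop body and the do-block; the caller adds the ';'. */
#define END_FOREACH  } } while (0)

int sum_set(const struct int_set *s)
{
    int total;
    int *p;
    total = 0;          /* a statement before FOREACH is fine */
    FOREACH(p, s)
        total += *p;
    END_FOREACH;
    return total;
}
```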

the.theorist said:
My first thought was to use the __LINE__ macro and ##
concatenation. My plan was for the iteration variable to be named
_i610 if the FOREACH macro occurred on line 610 in the source. For my
purposes, using line numbers will be unique enough. Unfortunately, the
preprocessor doesn't expand the __LINE__ and the variable name came
out as _i__LINE__. So my question for the group: Is there a way to get
the C preprocessor to generate unique names?

If you really needed a unique name, you could pass __LINE__ as an
argument to the macro; it's inconvenient, but it should work (as long
as you don't use the macro twice on the same line). Or there *might*
be some clever trick that lets you use __LINE__ from the calling site
without making the user type it on each invocation.
 
Nate Eldredge

Is the following not conforming, then?

#define CONCAT_3_( a, b ) a##b
#define CONCAT_2_( a, b ) CONCAT_3_( a, b )
#define CONCAT( a, b ) CONCAT_2_( a, b )

#define FOO int CONCAT(x,__LINE__);

void f()
{
/* two uniquely-named variables */
FOO
FOO
}

I didn't think of that. But it will only work as long as the two
occurrences of FOO are on distinct lines. If you try to use it twice in
a macro, for instance, there will be a problem. So they're not,
strictly speaking, unique names.
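
The same-line limitation can be shown without provoking a compile error by stringizing the generated name instead of declaring it. This is only an illustrative sketch; `STR`, `LINE_NAME`, and the two function names are hypothetical helpers, not part of the code above.

```c
#include <string.h>

#define CONCAT_2_(a, b) a##b
#define CONCAT(a, b)    CONCAT_2_(a, b)
#define STR_2_(x)       #x
#define STR(x)          STR_2_(x)

/* Spelling of the identifier CONCAT(x, __LINE__) produces at the use site. */
#define LINE_NAME STR(CONCAT(x, __LINE__))

/* Both names are generated on one source line, so they are identical. */
int same_line_names_collide(void)
{
    const char *first = LINE_NAME; const char *second = LINE_NAME;
    return strcmp(first, second) == 0;
}

/* On separate lines the generated names differ. */
int different_lines_differ(void)
{
    const char *first = LINE_NAME;
    const char *second = LINE_NAME;
    return strcmp(first, second) != 0;
}
```

GCC, Clang, and MSVC also provide a non-standard `__COUNTER__` macro that yields a new integer on each expansion; it avoids the same-line problem but is an extension, not ISO C.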
 
the.theorist

Nate Eldredge said:
Hm? I think you definitely have a problem if you write

FOREACH(p, set) foo(p);
FOREACH(p, set) bar(p);

The variable _i will be declared twice in the same scope, which is
definitely an error. Moreover, you should not be using names beginning with
underscores; they are reserved.

Anyway, your macro looks bogus, since p doesn't change during iteration
of the loop.

Good catch! That's a bug. I meant to reassign p to the next element of
(set)->pset.

Nate Eldredge said:
One approach might be to make p be a pointer, and use it as the
iterator. E.g.

#define FOREACH(p, set) \
for (p = (set)->pset; p < (set)->pset + (set)->num; p++)

junk *p;
FOREACH(p, set) foo(*p);

I think I'll use this approach for two reasons:
(1) though I don't much care for pointer arithmetic, it's the way
things are done in C, and it's straightforward enough for any other C
programmer (including myself a few months from now).
(2) it elegantly avoids the naming issue, and reduces the number of
variables in the code.
 
the.theorist

Keith Thompson said:
You mean C90 mode. In C99, you can declare the variable in the loop:

for (int i = 0; i < 10; i ++) { /* ... */ }

Another good catch! gcc said "‘for’ loop initial declaration used
outside C99 mode". That 'outside' bit is important. Because of the way
the error is phrased, the term C99 was stuck in my head when I made
the original post. Sorry about the confusion.

Keith Thompson said:
Your macro expands to a declaration followed by a statement, which
means that, in C90, it can't legally follow a statement within a
block. (C90 requires all declarations to precede all statements in a
block.)

I've noticed this in older code that I've worked on before. For this
project, it hasn't been a problem (yet). But, this is now another
reason why I'm gonna go with the pointer arithmetic version of the
macro.

Keith Thompson said:
The simplest solution is to create a new scope. Add curly braces to
the macro definition. You'll also want to use the do ... while(0)
idiom to make sure the macro can be used anywhere a statement can
appear. See question 10.4 of the comp.lang.c FAQ,
<http://www.c-faq.com/>.

That's got really good stuff in it! I should just read a random bit of
that every day, to keep all the C idioms fresh in my skull.

Keith Thompson said:
Ah, I just noticed that the macro expansion includes only the header
of the for statement. One solution is to define a separate macro to
mark the end of the loop, and require the user to write FOREACH and
END_FOREACH in matched pairs.

I'm just, esthetically, too much against using matching macros to
adopt this approach.
But I have seen it used before.
 
the.theorist

Is the following not conforming, then?

#define CONCAT_3_( a, b ) a##b
#define CONCAT_2_( a, b ) CONCAT_3_( a, b )
#define CONCAT( a, b ) CONCAT_2_( a, b )

#define FOO int CONCAT(x,__LINE__);

void f()
{
/* two uniquely-named variables */
FOO
FOO
}

Actually, I tried this code, and, a bit to my surprise, it worked!

I then tried removing the call to CONCAT_3_:
#define CONCAT_2_( a, b ) a##b
#define CONCAT( a, b ) CONCAT_2_( a, b )
which also worked!

However, when I removed the call to CONCAT_2_:
#define CONCAT( a, b ) a##b
I received (using 'gcc -E') the familiar:
void f()
{

int x__LINE__;
int x__LINE__;
}

This seems to me a bit interesting. A certain amount of indirection
seems to be required for the preprocessor to perform macro expansions.
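
The reason for the extra level: a macro argument that is an operand of ## (or #) is pasted exactly as written, without being macro-expanded first. Passing it through one more function-like macro means the argument is fully expanded on the way in, so __LINE__ has already become a number by the time ## sees it. One level of indirection is enough, which matches the result above. Putting this together with the original macro gives something like the sketch below; `struct int_set`, `FOREACH_IMPL`, the `foreach_i_` prefix, and `sum_both` are hypothetical names, and the macro relies on C99 mixed declarations and statements.

```c
/* Hypothetical container; the real type is not shown in the thread. */
struct int_set {
    int *pset;  /* elements */
    int  num;   /* number of elements */
};

#define CONCAT_2_(a, b) a##b             /* pastes its arguments as written */
#define CONCAT(a, b)    CONCAT_2_(a, b)  /* expands a and b, then pastes */

/* i is the (already unique) counter name chosen by FOREACH. */
#define FOREACH_IMPL(p, set, i) \
    int i; \
    for (i = 0; i < (set)->num && ((p) = (set)->pset[i], 1); ++i)

/* Counter is named foreach_i_<line>, e.g. foreach_i_610 on line 610. */
#define FOREACH(p, set) \
    FOREACH_IMPL(p, set, CONCAT(foreach_i_, __LINE__))

/* Two uses in one scope, on different lines, declare different counters. */
int sum_both(const struct int_set *a, const struct int_set *b)
{
    int v, total = 0;
    FOREACH(v, a) total += v;
    FOREACH(v, b) total += v;
    return total;
}
```

Because the expansion starts with a declaration, this form cannot be the sole body of an unbraced if or loop, and two uses on the same line still collide.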
 
