A variation of #

Andrey Vul

To stringify pp-constants, we use #foo notation.
Is there a method of charifying constants defined on the preprocessor
level, assuming the precondition of length being 1 is true?
That is, does there exist a pp-builtin CHARIFY such that CHARIFY(x) =
'x' ?
 
Ian Collins

Something like

#define CHAR(x) #x [0]

?
 
Andrey Vul

Can't be used inside switch or anything else requiring a const
expression.
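
To make the objection concrete, here is a minimal sketch. The parentheses around the macro body and the helper name starts_with_x are additions for illustration, not part of the original suggestion. The subscripted string literal works in an ordinary expression, but it is not an integer constant expression, so it cannot portably serve as a case label.

```c
/* The suggested macro, parenthesized so it is safe inside larger expressions. */
#define CHAR(x) (#x [0])

/* Fine in an ordinary expression: CHAR(x) expands to ("x" [0]). */
static int starts_with_x(const char *s)
{
    return s[0] == CHAR(x);
}

/*
 * Not portable as a case label: a subscripted string literal is not an
 * integer constant expression in standard C, so a compiler may reject
 *
 *     switch (c) { case CHAR(x): break; }
 */
```

Note also that #x stringifies the argument exactly as written, so passing another macro's name would yield the first letter of that name rather than of its expansion.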
 
Andrey Vul

Can't be used inside switch.
 
James Kuyper

I presume there's some reason why the constants cannot, themselves, be
defined as character constants?

There's no simple way I can think of to define a macro that generates a
character constant whose value depends upon the expansion of another
macro. I did, however, manage to come up with this ugly monstrosity,
which cannot possibly qualify as "simple", which I've inserted into some
test code:

#include <stdlib.h>
#define TRICAT(a, b, c) a ## b ## c
#define CHAR(a, b, c) TRICAT(a, b, c)

#define CASE1 x

int main(int argc, char *argv[])
{
    if(argc <= 1)
        return EXIT_FAILURE;
    switch (*argv[1])
    {
    case CHAR('
        , CASE1, '
        ): break;
    default:
        return EXIT_SUCCESS;
    }
    return EXIT_FAILURE;
}

The key thing that makes this work is that ' immediately followed by a
newline is not recognized as any other type of preprocessing token, so
it counts as a preprocessing token of its own, under the classification
"each non-white-space character that cannot be one of the above" (6.4p1).

The two-level expansion allows CASE1 to be expanded to x before being
concatenated with the ' tokens.
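
The effect of the extra macro level can be seen in isolation with the # operator, which has well-defined behavior. This is only a sketch of the expansion order: the names STR_, STR and the two helper functions are hypothetical and not from the post.

```c
#include <string.h>

#define STR_(x) #x        /* # sees the argument exactly as written */
#define STR(x)  STR_(x)   /* x is fully macro-expanded before STR_ is invoked */

#define CASE1 x

/* One level: the macro name itself is stringified, giving "CASE1". */
static int one_level_gives_name(void)
{
    return strcmp(STR_(CASE1), "CASE1") == 0;
}

/* Two levels: CASE1 is replaced by x first, giving "x". */
static int two_levels_give_value(void)
{
    return strcmp(STR(CASE1), "x") == 0;
}
```

The same rule applies to ##: an argument that is an operand of # or ## is not macro-expanded, which is why CHAR forwards its arguments to TRICAT instead of pasting them directly.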

If you can convince the relevant person to define the relevant constants
as character constants, that would make things a lot easier.
 
James Kuyper

lawrence.jones said:
Nice try, but no cigar. The ## operator requires that the result be a
valid pp-token and, depending on the order in which the operators are
evaluated, you either end up with 'x or x', neither of which are valid
pp-tokens.

I hadn't considered that issue, but taking a look at 6.10.3.3p3, I found
that it says "each instance of a ## preprocessing token
in the replacement list (not from an argument) is deleted and the
preceding preprocessing token is concatenated with the following
preprocessing token. ... If the result is not a valid preprocessing
token, the behavior is undefined." It seems to me ambiguous whether "the
result" it refers to is the result of each concatenation, or the
combined result of all of the concatenations.

I'm not going to argue strongly over this. I think this is a horribly
ugly solution to what should be a simple problem; if it actually fails
to be a solution by reason of having undefined behavior, I'm not going
to lose any sleep over it.
 
Peter Nilsson

Andrey Vul said:
To stringify pp-constants, we use #foo notation.
Is there a method of charifying constants defined
on the preprocessor level, assuming the precondition
of length being 1 is true? That is, does there exist
a pp-builtin CHARIFY such that CHARIFY(x) = 'x' ?

This is a good example of a question on a possible
solution to an unstated problem. As James Kuyper asks,
what is the real problem that you're trying to solve?

In short, if you're already hardcoding x, why can't
you hardcode 'x'?

There are ways of synchronising elements using the
preprocessor. But different methods have different
advantages and disadvantages. The following will work
for some characters but not for others...

#define char_x 'x'
#define char_y 'y'

#define CHARIFY(x) char_ ## x

switch (blah)
{
case CHARIFY(x):
..
case CHARIFY(y):
..
}
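
A complete version of this lookup approach might look like the sketch below. The extra CHARIFY_ level, the CASE1 macro and the classify function are assumptions added for illustration: the extra level lets a macro such as CASE1 be expanded before pasting.

```c
/* One definition per supported character. */
#define char_x 'x'
#define char_y 'y'

#define CHARIFY_(c) char_ ## c
#define CHARIFY(c)  CHARIFY_(c)   /* extra level: expand the argument first */

#define CASE1 x

/* Each case label expands to an ordinary character constant. */
static int classify(char ch)
{
    switch (ch) {
    case CHARIFY(CASE1): return 1;   /* expands to char_x, then 'x' */
    case CHARIFY(y):     return 2;   /* expands to char_y, then 'y' */
    default:             return 0;
    }
}
```

The limitation is that char_ ## c must form a valid identifier, so this covers letters, digits and underscore but not characters such as + or space, and each supported character needs its own char_ definition.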
 
Andrey Vul

That looks like it shouldn't even compile!

In the end, I figured it's best to define as a char in the first place
and define the string as { 'f', 'o', 'o', 'b', 'a', 'r', ... } instead
of "foo" "bar" "...".

It's not like I'm making use of null-termination in the string
anyways.
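
A rough sketch of that arrangement follows. The macro names CH_F, CH_O, CH_B, the array word and both functions are hypothetical, chosen only to show the same constants serving as case labels and as array initializers.

```c
#include <stddef.h>

/* The constants are character constants from the start. */
#define CH_F 'f'
#define CH_O 'o'
#define CH_B 'b'

/* Built from the same macros; no terminating null character is stored. */
static const char word[] = { CH_F, CH_O, CH_O, CH_B };

static size_t word_length(void)
{
    return sizeof word;   /* counts exactly the listed characters */
}

/* Character constants are integer constant expressions, so they work as labels. */
static int classify(char c)
{
    switch (c) {
    case CH_F: return 1;
    case CH_O: return 2;
    case CH_B: return 3;
    default:   return 0;
    }
}
```

Because the array is not null-terminated, it must be handled with explicit lengths (for example sizeof word) rather than with the str* functions.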
 
James Kuyper

Andrey Vul said:
That looks like it shouldn't even compile!

It does compile, and works as intended - with gcc, with maximal
conformance and warning levels turned on. However, whether or not it's
required to compile is, at best, ambiguous. See Lawrence Jones's message
on this same thread - he's a fairly authoritative source, which doesn't
mean he's necessarily right - but I wouldn't recommend betting a lot of
money against him.

I'm curious - is there any specific feature that bothers you about that
code, other than the issue that Larry raised?

Andrey Vul said:
In the end, I figured it's best to define as a char in the first place
and define the string as { 'f', 'o', 'o', 'b', 'a', 'r', ... } instead
of "foo" "bar" "...".

Yes, that's by far the better approach, even if there were no ambiguity
about my "solution".
 
Thad Smith

lawrence.jones said:
Nice try, but no cigar. The ## operator requires that the result be a
valid pp-token and, depending on the order in which the operators are
evaluated, you either end up with 'x or x', neither of which are valid
pp-tokens.

So if we are concatenating 3 pp-tokens, not only do the individual items (a,b,c)
and the final concatenation need to be valid pp-tokens, but also all potential
_intermediates_, as determined by the unspecified processor order. That seems a
needless and complicating restriction.

If I understand correctly, CHAR(<,<,=) will work because both << and <= are
valid pp-tokens, but CHAR(., ., .) will not because .. isn't.

Finally the result of CHAR(%:, %, :) is undefined/unspecified, since order of
concatenation determines whether the intermediate is %:% ## : or %: ## %: where
%:% is not a pp-token. I noticed that CHAR(C,3,_) is only required to work
because 3_ is a valid pp-number!
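
The two well-defined cases can be checked with a short sketch. It uses TRICAT directly rather than the two-level CHAR, and the function shift_left_assign and the constant value 7 are made up for illustration. In both cases every possible intermediate result is a valid pp-token, whichever ## is evaluated first.

```c
#define TRICAT(a, b, c) a ## b ## c

/* < ## < ## = : intermediates are << or <=, the final token is <<=. */
static int shift_left_assign(int value, int count)
{
    value TRICAT(<, <, =) count;   /* pastes to: value <<= count; */
    return value;
}

/* C ## 3 ## _ : intermediates are C3 (identifier) or 3_ (pp-number). */
static const int TRICAT(C, 3, _) = 7;   /* declares C3_ */
```

By contrast, TRICAT(., ., .) has the invalid intermediate .. under either evaluation order, so it is undefined even though ... is itself a valid token.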

It seems that the rule could be replaced by a greedy algorithm that concatenates
all ##-connected pp-tokens before requiring a pp-token result. As far as I can
see such replacement would have no incompatibility with the current standard and
would add little additional wording and processing. It would simplify the
concept by not relying on happenstance intermediates as noted above.

Thad
 
lawrence.jones

James Kuyper said:
I hadn't considered that issue, but taking a look at 6.10.3.3p3, I found
that it says "each instance of a ## preprocessing token
in the replacement list (not from an argument) is deleted and the
preceding preprocessing token is concatenated with the following
preprocessing token. ... If the result is not a valid preprocessing
token, the behavior is undefined." It seems to me ambiguous whether "the
result" it refers to is the result of each concatenation, or the
combined result of all of the concatenations.

I think it's pretty clear from context that it's talking about each
replacement individually. For a token-based preprocessor, you either
need each result to be a token or you need the concept of executing a
bunch of operators in parallel (which we don't have in C).
 
Harald van Dijk

lawrence.jones said:
Nice try, but no cigar.  The ## operator requires that the result be a
valid pp-token and, depending on the order in which the operators are
evaluated, you either end up with 'x or x', neither of which are valid
pp-tokens.

Even if ## were redefined so that intermediate results of a ## b ## c
are ignored, the behaviour is explicitly undefined.

6.4:
preprocessing-token:
[...]
each non-white-space character that cannot be one of the above

6.4p3:
If a ' or a " character matches the last category, the behavior is
undefined.
 
James Kuyper

Yes, I forgot about that item, too.
 
