Preprocessor limitation workarounds

D

D Yuniskis

Hi,

I frequently have to write code that must run on a variety
of different platforms (portability). Many of these are
legacy systems with tools that are no longer supported.
(i.e., I often am working in a C89 context).

Almost always, the code needs data types to be tweaked to
best fit the characteristics of the target machine (embedded
system work so resources are almost always scarce). This is
an area that is ripe for error!

What I try to do is rely on the preprocessor and limits.h to
have the preprocessor "pick" the appropriate data types for me.
Often, this works. But, when it doesn't work, it doesn't
work at all!

For example, I might look for the smallest type that can
support a particular datum by doing something like:

#if (MY_MAX_VALUE < UCHAR_MAX)
typedef unsigned char mytype_t;
#elif (MY_MAX_VALUE < USHRT_MAX)
typedef unsigned short mytype_t;
#elif (MY_MAX_VALUE < ULONG_MAX)
....
#endif

Similarly, I may use type limits to determine just how
big I can let things get in my algorithms (e.g., "if
I use an unsigned long and I need to represent a maximum
value of MAX_VALUE then how many bits can I get to the
right of the binary point if I scale the datum accordingly?")
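
For illustration, the same limits.h-driven trick can answer the binary-point
question. A minimal sketch, assuming MAX_VALUE is a constant expression the
preprocessor can evaluate and that the scaled datum must fit an unsigned
long (FRAC_BITS is a hypothetical name):

#include <limits.h>

/* MAX_VALUE * 2^FRAC_BITS must still fit in an unsigned long; testing
   against a right-shifted ULONG_MAX keeps every expression in range. */
#if   (MAX_VALUE <= ULONG_MAX >> 16)
#define FRAC_BITS 16
#elif (MAX_VALUE <= ULONG_MAX >> 8)
#define FRAC_BITS 8
#elif (MAX_VALUE <= ULONG_MAX >> 4)
#define FRAC_BITS 4
#else
#define FRAC_BITS 0     /* no headroom: integer part only */
#endif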

The problem I have is when MY_MAX_VALUE -- or some
arithmetic expression which effectively becomes MY_MAX_VALUE --
exceeds the capabilities of a 32-bit int (for older compilers).

For example, extend the first example to include:

....
#elif (MY_MAX_VALUE < ULLONG_MAX)
typedef unsigned long long mytype_t;
#else
#error "Unable to represent MY_MAX_VALUE with standard data types"
#endif

If, for example, MY_MAX_VALUE won't fit in a ULONG, then it
is too big for "older" preprocessors to handle predictably.
(Note that I can always #define ULLONG_MAX to be 0 to ensure
this conditional is never active. But, that doesn't say
how the preprocessor will behave when computing the value
MY_MAX_VALUE when said value exceeds a "long").

[Note that MY_MAX_VALUE is often an expression that might
evaluate to one of these "too large for preprocessor" values]
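
For concreteness, a minimal sketch of the guard mentioned in the
parenthetical above (the same idea reappears later in the thread as a
"targettypes.h"). Note it only keeps the #elif line well formed on
pre-long-long toolchains; it does not settle the overflow question about
MY_MAX_VALUE itself:

#include <limits.h>

/* On toolchains whose <limits.h> predates long long, give the missing
   limit a value of 0 so lines that mention it still parse; any in-range
   MY_MAX_VALUE then fails the "< ULLONG_MAX" test, as intended. */
#ifndef ULLONG_MAX
#define ULLONG_MAX 0
#endif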

I currently work around this by having a dependency in my makefile
that causes an executable to run which "does the math" and makes
the choice for me. But, this presents other problems as it relies
on there being a compiler for the *host* available at build time
(I do mainly cross builds).

Any suggestions on alternative techniques for getting this support
at compile time? (Run-time checks are "too late" :> )

Thanks!
--don
 
T

Tim Prince

D Yuniskis wrote:

Maybe reserve the compile time data type optimization for the targets
where the C99 headers are present.
If you were to test executable code, I would think you would have to run
it on the target, rather than the build host.
 
G

Gene

Well, if you stand back and look at it, there are only two feasible
sources of information about the target: the stuff encoded in header
files for the target and the stuff you somehow create yourself. You
are already exploiting the first kind (I suppose) as far as possible.
The second kind--target information you create yourself--can be
further divided into information you create manually by parsing
technical specs and, say, transcribing them to your own "config.h" for
each architecture and automatically created information. It looks
like you are not interested in the first kind, building your own
header files by hand. You want something automatic. Computing
information automatically about the target environment means a program
running in the target environment. There are only two ways to create
that situation: run a program in the actual target environment or run
it in an emulated target. The former has the problem of getting the
results of the program run from the target environment back into the
cross-compilation environment. That could be difficult if the
embedded target doesn't do i/o in an accessible manner. That leaves
us with an emulator-based solution as the last possible option. You
didn't give enough information to go any farther.
 
D

D Yuniskis

Hi Gene,

I think you missed the point of my post; or, perhaps my
post wasn't clear enough. :<

I *have* all the information that I need to characterize
my target. It is contained in the include files FOR THE
TARGET. E.g., ./target/include/limits.h in the examples
that I cited above.

I don't have to run an executable *on* the target in order
to make decisions based on these values. E.g., I can
build an executable that runs on the host, uses the host's
<stdio.h> to talk to The Developer (at compile time) and
*still* use the *target's* characterizations by referencing
the *target's* "target/include/limits.h".
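
A minimal sketch of that arrangement (the directory layout and the -I
switch are illustrative only, and it assumes the target's limits.h is
self-contained enough for the host compiler to digest):

/* picktype.c -- built FOR THE HOST, e.g.
 *     cc -I./target/include -o picktype picktype.c
 * so that <limits.h> below resolves to the *target's* copy. */
#include <stdio.h>
#include <limits.h>

int main(void)
{
    /* casts are fine for a sketch; a real helper would do its
       arithmetic in a type wide enough for the target's limits */
    printf("target UCHAR_MAX = %lu\n", (unsigned long)UCHAR_MAX);
    printf("target USHRT_MAX = %lu\n", (unsigned long)USHRT_MAX);
    printf("target ULONG_MAX = %lu\n", (unsigned long)ULONG_MAX);
    return 0;
}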

For example, assume I am interfacing to a graphic display and
I need to represent coordinates in that display (assume it
is square). And, that I want to use the smallest data
type possible (perhaps I have tables of coordinates that
I don't want to waste memory storing).

I could say:

#if (MAX_COORD_VALUE < UCHAR_MAX)
typedef unsigned char coord_t;
#elif (MAX_COORD_VALUE < USHRT_MAX)
typedef unsigned short coord_t;
#elif (MAX_COORD_VALUE < ULONG_MAX)
typedef unsigned long coord_t;
#elif (MAX_COORD_VALUE < ULLONG_MAX)
typedef unsigned long long coord_t;
#else
# error "can't represent MAX_COORD_VALUE"
#endif

and the compiler *might* be able to figure this out for me.

If MAX_COORD_VALUE evaluates to (note it may not be a simple
constant!) something like "100", then, for example, a uchar
would be most economical (space wise) as "coord_t". If
it evaluates to something like 10 000, then a ushort might
be a better fit.

The problem manifests when it would *like* to evaluate to
something like 5 000 000 000 -- too big for the preprocessor
(legacy compilers) to evaluate correctly using "long int"
math. So, the conditional is defective even if the compiler
might have support for long longs.

[consider, MAX_COORD_VALUE may be something like
#define MAX_COORD_VALUE (DOTS_PER_INCH * DISPLAY_WIDTH)
or even
#define MAX_COORD_VALUE (DOTS_PER_IN * DISPLAY_HEIGHT * ASPECT_RATIO)
The latter intended as a suggestion of how much more complex
these *expressions* can become.
]

However, I can write a program on my (64 bit?) host that
does this math correctly (even if the host doesn't support
long longs -- most do -- I could make a set of routines that
manipulate 8-byte arrays that I treat as 64 bit "numbers")
*using* the ULLONG_MAX value from the *target*'s include
files, computing the results and returning an "answer"
that the CROSS COMPILER for the target can then reliably
interpret (e.g., my host executable could *write* a small
include file that is included in the other files that are
referenced in my application).

So, I don't need access to a running target just to *build*
the application correctly.

But, this is clumsy. I would like to know if there are
any work-arounds that would let me rely solely on the
cross compiler (even legacy compilers) and still get
the "right answers".
 
N

Nick

D Yuniskis said:
The problem manifests when it would *like* to evaluate to
something like 5 000 000 000 -- too big for the preprocessor
(legacy compilers) to evaluate correctly using "long int"
math. So, the conditional is defective even if the compiler
might have support for long longs.

Could you divide your numbers in advance, and the limits.h values at
compile time?

So instead of
#if ULONG_MAX > 8000000
you'd have
#if ULONG_MAX/8 > 1000000
?

Or are you saying that ULONG_MAX itself might be too big for the
compiler to do arithmetic on?
 
D

D Yuniskis

Hi Nick,
Could you divide your numbers in advance, and the limits.h values at
compile time?

So instead of
#if ULONG_MAX > 8000000
you'd have
#if ULONG_MAX/8 > 1000000
?

This is what I have done in some *specific* cases where I knew
I could get away with "cheating".

I might need to pick a data type that I can use for an intermediate
result in some particular computation. For example, in my "big math"
library, I've opted to do carry detection "simply" (i.e., if
result of an addition exceeds the maximum value for the *base*
data type, then you have a carry-out). So, the expression that
I use to find a data type that will hold this intermediate value
is "MAX_VALUE + MAX_VALUE" (i.e., if using ULONGs, then MAX_VALUE
is ULONG_MAX). But, MAX_VALUE+MAX_VALUE would then exceed the
capabilities of the preprocessor (legacy compilers).

So, I will do something like test MAX_VALUE against ?????_MAX/2
(where ?????_MAX are the various limits.h constants).
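
For illustration, a sketch of that pre-divided form, assuming MAX_VALUE
itself is within the preprocessor's range (accum_t is a hypothetical name
for the intermediate type):

#include <limits.h>

/* MAX_VALUE + MAX_VALUE fits in a type exactly when MAX_VALUE is no more
   than half that type's maximum, and the halved limit cannot overflow. */
#if   (MAX_VALUE <= USHRT_MAX / 2)
typedef unsigned short accum_t;
#elif (MAX_VALUE <= ULONG_MAX / 2)
typedef unsigned long accum_t;
#else
#error "MAX_VALUE + MAX_VALUE will not fit any guaranteed C89 type"
#endif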

Note, however, that I *can't* compare (MAX_VALUE+MAX_VALUE)/2 to
?????_MAX/2 because the first expression can't be (guaranteed)
evaluated in the preprocessor -- you have to cheat by doing the
optimizations *before* you "write it down".
Or are you saying that ULONG_MAX itself might be too big for the
compiler to do arithmetic on?

ULONG_MAX is always (?) safe, IIRC. But, ULLONG_MAX might not
be. Or, even if ULLONG is "undefined", the expression against
which I am comparing may be "too big".

E.g., imagine MAX_VALUE is a 30 bit number. And, imagine I
am trying to find a data type that will hold the result of
*squaring* such a value. MAX_VALUE*MAX_VALUE exceeds the
capabilities of the preprocessor to compute. So, how do I
know how it will handle that expression in a #if?
 
N

Nick

D Yuniskis said:
ULONG_MAX is always (?) safe, IIRC. But, ULLONG_MAX might not
be. Or, even if ULLONG is "undefined", the expression against
which I am comparing may be "too big".

E.g., imagine MAX_VALUE is a 30 bit number. And, imagine I
am trying to find a data type that will hold the result of
*squaring* such a value. MAX_VALUE*MAX_VALUE exceeds the
capabilities of the preprocessor to compute. So, how do I
know how it will handle that expression in a #if?

If you are only trying to find out which of a series of (widely spaced)
types you can use, how about using a rough and ready approximation of
the square root of the type's maximum value and then comparing it to
MAX_VALUE?

Unwind a loop for, say, three iterations of Heron's method. You should
be able to do all that with pre-processor arithmetic without ever
getting bigger than your _MAX started at.

You're going to need some good comments on it though!
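
An unrolled Heron loop is one route; a division-based variant of the same
"shrink the limit instead of growing the value" idea is exact and avoids
the square root entirely (a sketch, not what Nick literally proposes, and
it assumes MAX_VALUE is at least 1 and within the preprocessor's range):

#include <limits.h>

/* MAX_VALUE * MAX_VALUE <= TYPE_MAX holds exactly when
   MAX_VALUE <= TYPE_MAX / MAX_VALUE, and the divided limit never
   overflows, so the preprocessor can evaluate the test even when the
   square itself is out of range. */
#if   (MAX_VALUE <= USHRT_MAX / MAX_VALUE)
typedef unsigned short square_t;      /* hypothetical type name */
#elif (MAX_VALUE <= ULONG_MAX / MAX_VALUE)
typedef unsigned long square_t;
#else
#error "MAX_VALUE squared will not fit any guaranteed C89 type"
#endif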
 
D

D Yuniskis

Nick said:
If you are only trying to find out which of a series of (widely spaced)
types you can use, how about using a rough and ready approximation of
the square root of the type's maximum value and then comparing it to
MAX_VALUE?

Unwind a loop for, say, three iterations of Heron's method. You should
be able to do all that with pre-processor arithmetic without ever
getting bigger than your _MAX started at.

Yes, you can come up with *specific* approaches to each particular
arithmetic problem. But, you have to come up with a different
approach for each *and* hope your implementation isn't buggy.
It's just way too much work and too subject to mistakes.

E.g., imagine if the preprocessor math was only *8* bits -- think
of all the extra work you'd be doing just to use it! (silly example).
I'm hoping to find an approach that gets around the preprocessor
entirely.

Currently, I'm looking at bc(1) scripts to see if I can coax
them into generating information that could be #include'd
into the header files. Sure, it means porting bc(1) but
at least it gets me a solution (that is more versatile than
trying to write a dedicated "program" to compute everything).

I think C0X is a bit more forgiving in this regard (?). But,
you're still stuck with "creative expressionism" if you want
to do anything around the edges.
You're going to need some good comments on it though!

Yes. Makes include file maintenance more tedious than the
code itself!
 
K

Kaz Kylheku

I'm looking into doing exactly this same kind of thing in a program I have been
working on. I need to know the best integral type to which a pointer can be
converted and which can be converted back. (There are C99 types for this but
they are nonportable).

I also want to support cross-compiling my program. So the stupid hacks
used by Autoconf of compiling little test programs and running them
are out of the question. This stupid approach is a complete non-starter.

Like you, I do cross builds. At work I developed and maintain a cross-compiled
Linux distro, which runs on MIPS and Intel. Programs that don't cross build
cleanly are a pet peeve.

I've come up with this idea. What you can do is write a test translation unit
in which some global arrays have sizes which are tied to the quantities you
want to measure. You can compile this translation unit with your toolchain and
then use the toolchain's ``nm'' utility to dump out the size information.

For example:

/* generate this from your script, call it conftest.c */
#include "conftest.h"
char sizeof_pointer[sizeof(char *)];
char sizeof_int[sizeof(int)];
char sizeof_long[sizeof(long)];
#ifdef HAVE_LONGLONG
char sizeof_longlong[sizeof(longlong_t)];
#endif

Regarding this LONGLONG: the idea is that, already, a set of previous tests
was run to detect whether the compiler supports some kind of wider type
than long. It could be ``long long'' or ``__int64'' or whatever; another
test has already settled this matter and provided the definition in
the generated header "conftest.h".

Now your script compiles the unit like this:

$ /path/to/toolchain/bin/cc -c conftest.c

If this successfully compiles, you can then run ``nm -t d -P conftest.o''

$ /path/to/toolchain/bin/nm -t d -P conftest.o
sizeof_int C 00000004 00000004
sizeof_long C 00000004 00000004
sizeof_longlong C 00000008 00000008
sizeof_pointer C 00000004 00000004

This works even if cc and nm are cross-tools for a different architecture from
that of your build machine.

Our script would take these values and conclude that either int or long
could be used as the type which can hold a pointer.

Because we tied the quantities that we want to the /sizes/ of storage
denoted by symbols in a symbol table, we don't actually have to resolve and
link anything. We can get the compiled code to reveal things to us without
having to scan header files from the cross toolchain, and without making
programs that have to be run (which we can't do without an emulator for
the target architecture!)

You'd have to adjust the trick for your target systems. Maybe nm doesn't
take the -P (POSIX mode) parameter on some of them. Or maybe some of your
target toolchains are not Unix-like at all, or there isn't even a POSIX
or Bourne-compatible command interpreter. Details.
 
B

Ben Pfaff

Kaz Kylheku said:
I'm looking into doing exactly this same kind of thing in a
program I have been working on. I need to know the best
integral type to which a pointer can be converted and which can
be converted back. (There are C99 types for this but they are
nonportable).

I also want to support cross-compiling my program. So the
stupid hacks used by Autoconf of compiling little test programs
and running them are out of the question. This stupid approach
is a complete non-starter.

You appear to be behind the times regarding Autoconf. Its
"stupid hacks" for measuring the sizes of types now support
cross-compilation. It compiles a program whose compilation will
succeed if the type's size is in a given range, and fail with an
error if it is not in the expected range, and then uses binary
search to narrow down the answer until it knows the exact answer.
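
The probe itself can be as small as a one-line translation unit whose
array size goes negative when the asserted bound is false. A sketch of
that flavor of test (not Autoconf's actual code, and BOUND stands for a
value the driver script would inject):

/* probe.c -- compiles if and only if sizeof(long) <= BOUND on the
   target, so a script can binary-search on compile success/failure. */
#define BOUND 4
int size_check[(sizeof(long) <= BOUND) ? 1 : -1];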
 
G

Gene

I understood your point.

I was saying that if you've exploited all possible information in
limits.h and other header files available to the cross-compiler, and
you've exhausted the arithmetic of the cpp, then you're completely out
of beer unless you can get output from a program running on the target
or emulated target.

I did not think of the wonderful hack mentioned by other posters:
extract information one bit at a time from the compiler itself by
creating programs that are semantically correct only if a type meets a
boolean size criterion. Wow. I should have thought of this. It's
roughly the same hack you use to show general NP-hard problems are at
least as hard as corresponding NP-complete decision problems, and I've
just been teaching this to undergrads.
 
D

D Yuniskis

Hi Kaz,

Kaz said:
I'm looking into doing exactly this same kind of thing in a program I have been
working on. I need to know the best integral type to which a pointer can be
converted and which can be converted back. (There are C99 types for this but
they are nonportable).

I think yours is a similar but very different problem.
You are really just concerned about sizeof() being "wide enough"
for a (particular) pointer to fit. And, the sizes of pointers
are few.

OTOH, I am trying to see if a *value* will fit into a particular
data type. And, there are lots of potential *values* that I
might come up with :>

E.g., 65536 is an unfortunate value. But 65535 isn't (typically).
I also want to support cross-compiling my program. So the stupid hacks
used by Autoconf of compiling little test programs and running them
are out of the question. This stupid approach is a complete non-starter.

Like you, I do cross builds. At work I developed and maintain a cross-compiled
Linux distro, which runs on MIPS and Intel. Programs that don't cross build
cleanly are a pet peeve.

Agreed. Though, in my case, the target may be a little 8 bit MCU
or a 64 bit machine.
I've come up with this idea. What you can do is write a test translation unit
in which some global arrays have sizes which are tied to the quantities you
want to measure. You can compile this translation unit with your toolchain and
then use the toolchain's ``nm'' utility to dump out the size information.

I don't see that this buys *me* much/anything. E.g., the
"quantity I want to measure" is (FOO*(FOO-1)+2). And, FOO might
be 5, today, or 5 000 000 tomorrow. Your approach seems like
it would easily gag on numbers that large.
For example:

/* generate this from your script, call it conftest.c */
#include "conftest.h"
char sizeof_pointer[sizeof(char *)];
char sizeof_int[sizeof(int)];
char sizeof_long[sizeof(long)];
#ifdef HAVE_LONGLONG
char sizeof_longlong[sizeof(longlong_t)];
#endif

Regarding this LONGLONG: the idea is that, already, a set of previous tests
was run to detect whether the compiler supports some kind of wider type
than long. It could be ``long long'' or ``__int64'' or whatever; another
test has already settled this matter and provided the definition in
the generated header "conftest.h".

I'm achieving similar results by doing something like

#ifndef ULLONG_MAX
# define ULLONG_MAX (0)
#endif

and structuring my #ifs so "0" inherently falls out of the mix.
(see above example)
Now your script compiles the unit like this:

$ /path/to/toolchain/bin/cc -c conftest.c

If this successfully compiles, you can then run ``nm -t d -P conftest.o''

$ /path/to/toolchain/bin/nm -t d -P conftest.o
sizeof_int C 00000004 00000004
sizeof_long C 00000004 00000004
sizeof_longlong C 00000008 00000008
sizeof_pointer C 00000004 00000004

This works even if cc and nm are cross-tools for a different architecture from
that of your build machine.

Our script would take these values and conclude that either int or long
could be used as the type which can hold a pointer.

Because we tied the quantities that we want to the /sizes/ of storage
denoted by symbols in a symbol table, we don't actually have to resolve and
link anything. We can get the compiled code to reveal things to us without
having to scan header files from the cross toolchain, and without making
programs that have to be run (which we can't do without an emulator for
the target architecture!)

You'd have to adjust the trick for your target systems. Maybe nm doesn't
take the -P (POSIX mode) parameter on some of them. Or maybe some of your
target toolchains are not Unix-like at all, or there isn't even a POSIX
or Bourne-compatible command interpreter. Details.

I don't think it will help me in this case. What I *really*
want is a preprocessor that can handle arbitrary precision
(even if restricted to integer only) math. m4 chokes. bc
handles the math but doesn't tie in nicely to the compiler's
source expectations...
 
D

D Yuniskis

Ben said:
You appear to be behind the times regarding Autoconf. Its
"stupid hacks" for measuring the sizes of types now support
cross-compilation. It compiles a program whose compilation will
succeed if the type's size is in a given range, and fail with an
error if it is not in the expected range, and then uses binary
search to narrow down the answer until it knows the exact answer.

But this just looks at sizeof()?

E.g., can I say, "what type will 'FOO*(FOO-1)+2' most tightly
fit into when FOO is ____?"
 
N

Nick

D Yuniskis said:
I don't see that this buys *me* much/anything. E.g., the
"quantity I want to measure" is (FOO*(FOO-1)+2). And, FOO might
be 5, today, or 5 000 000 tomorrow. Your approach seems like
it would easily gag on numbers that large.
[snip]

I don't think it will help me in this case. What I *really*
want is a preprocessor that can handle arbitrary precision
(even if restricted to integer only) math. m4 chokes. bc
handles the math but doesn't tie in nicely to the compiler's
source expectations...

I've given further thought, and given that you are limited in what you
can run that understands the target architecture (essentially, limited
to the compiler) I think you have to use the standard pre-processor.

In that case, I think you might need to do something like this:
Write a configuration program in whatever language you use that does all
the calculations on your constants - so (FOO*(FOO-1)+2) as above.

It should also read in an entire system of names for every plausible
type in your target C dialects, in increasing size for each type and
signedness.

Then it can loop round and find the best size for the results of each
evaluation, and spit out a small header file to be included.

There's a bit of work in defining the names, but no more than you'll
need to do in your current architecture specific header files, and you
can probably use a much more reader-friendly syntax (reading it from a
configuration file, probably).

I can't see a better way.
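
A bare-bones host-side sketch of the sort of generator Nick describes.
Every name and table entry here is illustrative; the limits must come
from (or be generated against) the *target's* headers, and the sketch
assumes the build host's compiler has a 64-bit unsigned long long to do
the arithmetic in:

#include <stdio.h>

struct candidate {
    const char *name;            /* target type name                  */
    unsigned long long max;      /* that type's maximum on the target */
};

static const struct candidate table[] = {
    { "unsigned char",      255ULL },
    { "unsigned short",     65535ULL },
    { "unsigned long",      4294967295ULL },
    { "unsigned long long", 18446744073709551615ULL },
};

int main(void)
{
    /* e.g. the evaluated MAX_COORD_VALUE expression, done on the host */
    unsigned long long value = 5000000000ULL;
    size_t i;

    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (value <= table[i].max) {
            printf("typedef %s coord_t;\n", table[i].name);
            return 0;
        }
    }
    fprintf(stderr, "no suitable target type\n");
    return 1;
}

Redirecting its stdout into the small generated header the makefile
already depends on keeps the rest of the build unchanged.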
 
E

Eric Sosman

D said:
E.g., can I say, "what type will 'FOO*(FOO-1)+2' most tightly
fit into when FOO is ____?"

For C90 (untested),

#include <limits.h>
#define FOO some_number /* see below */
#if FOO <= (UCHAR_MAX - 2) / FOO + 1 \
+ ((UCHAR_MAX - 2) % FOO != 0)
typedef unsigned char Footype;
#elif ...
...
#else
#error Infoosable
#endif

This assumes that "some_number" is an integer constant expression
not exceeding ULONG_MAX -- if it's not an integer, or if it's
larger, you're sunk without a trace.

You could use a similar technique for C99 (increasing the FOO
limit to UINTMAX_MAX from <stdint.h>), but as far as I know there's
no portable way to enumerate all the integer types; in C99 they're
an open-ended set. Once you find a type T that's wide enough, it
may not be easy to know whether there's a narrower T2 that would
also work.

Personally, I'd write a helper program. Simply being able to
take a logarithm would be an enormous help.
 
D

D Yuniskis

Hi Eric,

Eric said:
For C90 (untested),

#include <limits.h>
#define FOO some_number /* see below */
#if FOO <= (UCHAR_MAX - 2) / FOO + 1 \
+ ((UCHAR_MAX - 2) % FOO != 0)
typedef unsigned char Footype;
#elif ...
...
#else
#error Infoosable

Ha! ;-) Are you sure that isn't *un*foosable?
#endif

This assumes that "some_number" is an integer constant expression
not exceeding ULONG_MAX -- if it's not an integer, or if it's
larger, you're sunk without a trace.

I think you missed one of the points:

#define FOO (10000000000/10000000000)

is an integer constant expression that *evaluates* to something
not exceeding ULONG_MAX (i.e., "1"). But, the preprocessor
can't *evaluate* it (because all of its component expressions
must also satisfy this criterion).

You also have the problem of *testing* for ULLONGs on platforms
that don't support them; i.e., my original example works if
you include a targettypes.h that defines:
#ifndef ULLONG_MAX
# define ULLONG_MAX (0)
#endif
You could use a similar technique for C99 (increasing the FOO
limit to UINTMAX_MAX from <stdint.h>), but as far as I know there's
no portable way to enumerate all the integer types; in C99 they're

Exactly -----------------------^^^^^^^^^^^^^^^^^^^^^
an open-ended set. Once you find a type T that's wide enough, it
may not be easy to know whether there's a narrower T2 that would
also work.

Personally, I'd write a helper program. Simply being able to
take a logarithm would be an enormous help.

That's what I currently do. But, it hides the dependencies
(i.e. the expressions that govern the choices of data types)
from the code -- seeing them in a header actually helps
document how the code will use these data types.

And, it's another significant effort to get that code right
*and* keep it running.

Finally, it means you need a *native* compiler (and toolchain)
in addition to the cross compiler that you are *really* using.
This isn't always easy (e.g., I support some applications with
DOS-based tools but don't have a compiler that will produce
DOS executables! :-/ )

If I have to invest time in something that will be running on
the host, then I would like it to be a one-time investment.
E.g., build a tool that *can* do the preprocessing that I
need, port it to DOS (and other host platforms that I work under)
and then forget it. My current approach requires me to
be able to rebuild the "helper program" each time I come
up with more conditions that drive the choice of data types.
 
D

D Yuniskis

Hi Nick,
D Yuniskis said:
I don't see that this buys *me* much/anything. E.g., the
"quantity I want to measure" is (FOO*(FOO-1)+2). And, FOO might
be 5, today, or 5 000 000 tomorrow. Your approach seems like
it would easily gag on numbers that large.
[snip]

I don't think it will help me in this case. What I *really*
want is a preprocessor that can handle arbitrary precision
(even if restricted to integer only) math. m4 chokes. bc
handles the math but doesn't tie in nicely to the compiler's
source expectations...

I've given further thought, and given that you are limited in what you
can run that understands the target architecture (essentially, limited
to the compiler) I think you have to use the standard pre-processor.

Well, I can also use the header files!
In that case, I think you might need to do something like this:
Write a configuration program in whatever language you use that does all
the calculations on your constants - so (FOO*(FOO-1)+2) as above.

As I have mentioned in other replies, I'd rather invest the
time writing something that gives me "augmented preprocessor
functionality" than something that grinds out special cases.
E.g., using bc to evaluate expressions and then wrapping
its output appropriately to create the header files.
It should also read in an entire system of names for every plausible
type in your target C dialects, in increasing size for each type and
signedness.

This is actually an interesting idea! I had just assumed I'd stick to
the largest (practical) set of common data types and handle those
that could conceivably be "missing" with #ifndef's.
Then it can loop round and find the best size for the results of each
evaluation, and spit out a small header file to be included.

There's a bit of work in defining the names, but no more than you'll
need to do in your current architecture specific header files, and you
can probably use a much more reader-friendly syntax (reading it from a
configuration file, probably).

I can't see a better way.

<frown> Sure there is -- leave the problem to the next bloke! :>
 
E

Eric Sosman

D said:
Hi Eric,



Ha! ;-) Are you sure that isn't *un*foosable?


I think you missed one of the points:

#define FOO (10000000000/10000000000)

is an integer constant expression that *evaluates* to something
not exceeding ULONG_MAX (i.e., "1"). But, the preprocessor
can't *evaluate* it (because all of its component expressions
must also satisfy this criterion).

No, I didn't miss the limitation. I stated the assumption
that FOO was "an integer constant expression," but your example
is a C90 I.C.E. only if ULONG_MAX is at least 10000000000. (In
C99, UINTMAX_MAX is guaranteed to exceed 10000000000, so you're
home.)
You also have the problem of *testing* for ULLONGs on platforms
that don't support them; i.e., my original example works if
you include a targettypes.h that defines:
#ifndef ULLONG_MAX
# define ULLONG_MAX (0)
#endif

You can do this if you like, but it's unnecessary. Any
unrecognized tokens in an #if or #elif expression are taken to
have the value zero.
That's what I currently do. But, it hides the dependencies
(i.e. the expressions that govern the choices of data types)
from the code -- seeing them in a header actually helps
document how the code will use these data types.

And, it's another significant effort to get that code right
*and* keep it running.

Finally, it means you need a *native* compiler (and toolchain)
in addition to the cross compiler that you are *really* using.

Perhaps I'm missing something, but I don't see what you're,
er, missing. Your cross-compiler knows the <limits.h> values
for the execution environment, and can do integer arithmetic in
accordance with those values, even if they're different from the
values used by the host platform. In a pinch, you can cross-
compile the helper and run *it* on the target platform.
 
