INT64_C etc.


Christopher Key

Hello,

I'm struggling to understand the exact requirement for the INTn_C /
UINTn_C macros, beyond being a tidier alternative to putting a cast on
the front.

Reading through n1124, it seems that any integer literals without
suffixes will have a type sufficiently large to hold them (with the
exception of base 10 literals never having unsigned type, but this only
needs a U suffix to overcome).

Is there therefore any significant difference between the following on a
standard 32-bit int platform?

(INT64_C(1000000000)) /* 10^9 */
and
((int64_t) 1000000000) /* 10^9 */

or

(INT64_C(10000000000)) /* 10^10 */
and
(10000000000) /* 10^10 */


I'm wondering if this is something left over from previous C standards.
If I set my compiler to C89 rather than C99, it starts spitting out
warnings about values being too large for long. Were the rules for
integer literals different, or did they just stop at long?

If this is the case, is it therefore necessary to wrap values larger
than 2^32-1 (2^31-1 for base 10 literals) in INTn_C macros to ensure the
code compiles correctly on a C89 compiler?
 

James Kuyper

Christopher said:
Hello,

I'm struggling to understand the exact requirement for the INTn_C /
UINTn_C macros, beyond being a tidier alternative to putting a cast on
the front.

Reading through n1124, it seems that any integer literals without
suffixes will have a type sufficiently large to hold them (with the
exception of base 10 literals never having unsigned type, but this only
needs a U suffix to overcome).

Is there therefore any significant difference between the following on a
standard 32-bit int platform?

(INT64_C(1000000000)) /* 10^9 */
and
((int64_t) 1000000000) /* 10^9 */

Keep in mind that the behavior of the conversion macros is connected to
the corresponding least-sized types, not the exact-sized types, so
INT64_C() corresponds to int_least64_t, not int64_t.
or

(INT64_C(10000000000)) /* 10^10 */
and
(10000000000) /* 10^10 */

The key difference is that the result of an INTn_C macro is required to
be suitable for use in a #if preprocessing directive, a context where
casts are not merely meaningless, but syntax errors. In the following
directive:

#if ((int64_t) 1000000000) > INT_MAX

the identifier int64_t is not recognized as a type name; preprocessing
occurs during translation phase 4, and type names don't become
meaningful until translation phase 7.

"After all replacements due to macro expansion and the defined unary
operator have been performed, all remaining identifiers (including those
lexically identical to keywords) are replaced with the pp-number 0"
(6.10.1p4).

Therefore, the above #if directive is equivalent to

#if ((0) 1000000000) > INT_MAX

which is a syntax error.

Since the INTn_C macros can't use casts, how do they change the type of
the operand? The only method I can see is by appending 'L' or 'LL' (the
UINTn_C macros presumably add 'U' as well). This only works for
least-sized types whose promoted type is either int, long, or long long.
If any of them has a promoted type which is an extended integer type,
the implementation must use either "magic" or implementation-specific
extensions to create a constant of the correct type that is usable in
#if directives.
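
As a rough illustration (only a sketch of one plausible approach, not
taken from any real implementation), a library whose int_least64_t is
long long could get away with nothing more than token pasting:

/* Hypothetical definitions, for illustration only; a real <stdint.h>
   may do something different, especially where extended integer types
   are involved. */
#define INT64_C(c)   c ## LL   /* int_least64_t promotes to long long */
#define UINT64_C(c)  c ## ULL  /* uint_least64_t promotes to unsigned long long */
#define INT16_C(c)   c         /* int_least16_t promotes to plain int */

Because each expansion is still an ordinary integer constant, it
remains valid inside a #if directive.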

There is one other difference between the INTn_C macros and the
corresponding cast expressions. The macros are supposed to expand to
constants having the promoted type of the corresponding size-named
type, which is not necessarily the same as its actual type. Thus, on an
implementation where INT_LEAST16_MAX < INT_MAX, INT16_C(12345) will have
the type int, whereas ((int_least16_t) 12345) will have the type
int_least16_t.
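
You can see the difference with sizeof on a typical implementation
where int is 32 bits and int_least16_t is 16 bits (a sketch; the exact
sizes will of course vary from platform to platform):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* INT16_C(12345) has the promoted type (here, int), while the
       cast expression has type int_least16_t itself. */
    printf("%zu\n", sizeof (INT16_C(12345)));        /* typically 4 */
    printf("%zu\n", sizeof ((int_least16_t) 12345)); /* typically 2 */
    return 0;
}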
I'm wondering if this is something left over from previous C standards.

No, the INTn_C macros, just like the corresponding size-named types,
are new in C99.
If I set my compiler to C89 rather than C99, it starts spitting out
warnings about values being too large for long. Were the rules for
integer literals different, or did they just stop at long?

Neither <stdint.h>, int64_t, nor INT64_C was part of the C standard
library in C89. If your code #includes <stdint.h>, and a C89 compiler
accepts the code despite that fact, it's probably the case that the C99
stdint.h is being treated, in your C89 code, as a user-defined header
file. There's no guarantee that this will actually work, and it would
appear that in your case it does not. I'd have to know exactly what the
definition of INT64_C is on your system, to be able to hazard a guess as
to why it is failing in this particular fashion.
If this is the case, is it therefore necessary to wrap values larger
than 2^32-1 (2^31-1 for base 10 literals) in INTn_C macros to ensure the
code compiles correctly on a C89 compiler?

There's nothing portable you can do to ensure that code using INTn_C
will compile correctly on a C89 compiler.
 

Keith Thompson

Christopher Key said:
Reading through n1124, it seems that any integer literals without
suffixes will have a type sufficiently large to hold them (with the
exception of base 10 literals never having unsigned type, but this only
needs a U suffix to overcome).
[...]

A minor point: n1256 is more up to date; it includes all three Technical
Corrigenda, while n1124 includes only the first two.

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf
 

Christopher Key

James said:
Keep in mind that the behavior of the conversion macros is connected to
the corresponding least-sized types, not the exact-sized types, so
INT64_C() corresponds to int_least64_t, not int64_t.


The key difference is that the result of an INTn_C macro is required to
be suitable for use in a #if preprocessing directive, a context where
casts are not merely meaningless, but syntax errors. In the following
directive:

#if ((int64_t) 1000000000) > INT_MAX

the identifier int64_t is not recognized as a type name; preprocessing
occurs during translation phase 4, and type names don't become
meaningful until translation phase 7.

"After all replacements due to macro expansion and the defined unary
operator have been performed, all remaining identifiers (including those
lexically identical to keywords) are replaced with the pp-number 0"
(6.10.1p4).

Therefore, the above #if directive is equivalent to

#if ((0) 1000000000) > INT_MAX

which is a syntax error.

Since the INTn_C macros can't use casts, how do they change the type of
the operand? The only method I can see is by appending 'L' or 'LL' (the
UINTn_C macros presumably add 'U' as well). This only works for
least-sized types whose promoted type is either int, long, or long long.
If any of them has a promoted type which is an extended integer type,
the implementation must use either "magic" or implementation-specific
extensions to create a constant of the correct type that is usable in
#if directives.

There is one other difference between the INTn_C macros and the
corresponding cast expressions. The macros are supposed to expand to
constants having the promoted type of the corresponding size-named
type, which is not necessarily the same as its actual type. Thus, on an
implementation where INT_LEAST16_MAX < INT_MAX, INT16_C(12345) will have
the type int, whereas ((int_least16_t) 12345) will have the type
int_least16_t.

Thanks very much for the detailed response, James; that's given me a far
better understanding of these macros. Moreover, I can see that they
are specifically required, e.g. problems may well arise from,

#define A 1000000000 /* 10^9 */
#define B 10000000000 /* 10^10 */
#if A * 10 == B
....
#else
....
#endif

which can only be fixed with,
#define A INT64_C(...)
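
Spelled out, and assuming <stdint.h> is available (so this is only a
sketch of the idea), the fixed version would look something like:

#include <stdint.h>

#define A INT64_C(1000000000)   /* 10^9  */
#define B INT64_C(10000000000)  /* 10^10 */

#if A * 10 == B
/* ... */
#else
/* ... */
#endif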


One issue is still puzzling me slightly, although I suspect it's more an
issue of form than anything.

If I want to use the INTn_C macros to define negative values, is it best
to use,

INTn_C(-x)

or

-INTn_C(x)


The reason for asking is that the first form is somewhat misleading, and
doesn't work for all values of x, even if -x is a valid value for the
corresponding int_leastn_t type.


I ran into problems trying to use code like,

INT64_C(-9223372036854775808) /* -2^63 */

After some head scratching, I realised that this gets expanded to
something like,

(-9223372036854775808L)

and in very crude terms, the 'L' is parsed before the '-'. As
9223372036854775808 > 2^63-1, we end up with an unsigned type to which
the unary minus is then applied, which is not as intended.
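
To illustrate with a made-up stand-in (MY_INT64_C below is only my
guess at the sort of definition my headers use; the real INT64_C may
well differ):

/* Suppose the implementation pastes an 'L' suffix onto the constant: */
#define MY_INT64_C(c) c ## L

/* MY_INT64_C(-9223372036854775808) then expands to
 *
 *     - 9223372036854775808L
 *
 * i.e. the suffix attaches to the digits alone, and the unary minus
 * is applied afterwards, once the constant already has its
 * (unintended) type. */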

Hence, I say that INT64_C(-1) is misleading, as (in even cruder terms)
it implies that the unary minus is parsed before the value is
interpreted as any sort of integral type.

Clearly writing,

-INT64_C(9223372036854775808)

suffers from exactly the same problem, but it is at least intuitively
clear why it should be failing.


Kind Regards,

Chris



P.S.

I coded round the above by using something like,

(-INT64_C(9223372036854775807) - 1)

as used by some system headers.
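
In full, something along these lines (MY_INT_LEAST64_MIN is just a
name invented for illustration; <stdint.h> already provides INT64_MIN
and INT_LEAST64_MIN where they apply):

#include <stdint.h>
#include <stdio.h>

/* -(2^63 - 1) - 1 == -2^63, built without ever writing
   9223372036854775808 as a single constant.  Assumes int_least64_t
   can hold -2^63, as discussed in the P.P.S. below. */
#define MY_INT_LEAST64_MIN (-INT64_C(9223372036854775807) - 1)

int main(void)
{
    printf("%jd\n", (intmax_t) MY_INT_LEAST64_MIN);
    return 0;
}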



P.P.S

I'm aware that such values aren't portable to platforms using 1's
complement / sign + magnitude representations. I am, however, explicitly
testing for such platforms with

#if INT_LEAST64_MIN == -INT64_C(0x7fffffffffffffff)
....
#else
/* int_least64_t can express values at least as small as
-9223372036854775808 */
#endif
 

Christopher Key

Keith said:
Christopher Key said:
Reading through n1124, it seems that any integer literals without
suffixes will have a type sufficiently large to hold them (with the
exception of base 10 literals never having unsigned type, but this only
needs a U suffix to overcome).
[...]

A minor point: n1256 is more up to date; it includes all three Technical
Corrigenda, while n1124 includes only the first two.

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf

Thanks, bookmarks updated.
 
