Getting lengths of short, int, etc

Tim Streater · Aug 28, 2010

I'm porting a couple of C apps to OS X and I notice that one of the apps
has the following in a .h file:

#define UBYTE unsigned char // 8-bit unsigned
#define BYTE signed char // 8-bit signed
#define UWORD unsigned short // 16-bit unsigned
#define WORD short // 16-bit signed
#define ULONG unsigned int // 32-bit unsigned
#define LONG int // 32-bit signed
#define ULONG64 unsigned long long // 64-bit unsigned
#define LONG64 long long // 64-bit signed

The other seems to use the naked type definitions such as short,
unsigned long, etc.

Now, as both these apps have been ported at least once before, and while
I've made the first of them apparently work, I can imagine there may be
problems due to word lengths and types. Is there an easy way to determine
the number of bits the compiler allocates to each data type?

Felix Palmen · Aug 28, 2010

* Tim Streater said:
Now, as both these apps have been ported at least once before, and while
I've made the first of them apparently work, I can imagine there may be
problems due to word lengths and types. Is there an easy way to determine
the number of bits the compiler allocates to each data type?

The GNU autotools approach to this is to compile little test programs
and let them output sizeof(<type>). So, if you really need to know sizes
of integers, I'd say let the build system find out for you.
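
A minimal sketch of such a test program, assuming a compiler that accepts
`long long`. The names `BITS_OF` and `report_sizes` are made up for this
illustration; this is not what autoconf itself generates.

```c
#include <limits.h>
#include <stdio.h>

/* Bits of storage a type occupies (this counts padding bits, if any). */
#define BITS_OF(T) ((unsigned)(sizeof(T) * CHAR_BIT))

/* Print one "type bits" line per type; a build script could parse this. */
void report_sizes(FILE *out)
{
    fprintf(out, "char %u\n", BITS_OF(char));
    fprintf(out, "short %u\n", BITS_OF(short));
    fprintf(out, "int %u\n", BITS_OF(int));
    fprintf(out, "long %u\n", BITS_OF(long));
    fprintf(out, "long long %u\n", BITS_OF(long long));
}
```

Calling `report_sizes(stdout)` from `main` prints the table for whichever
compiler built the program, which is all a configure-style check needs.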

Regards, Felix

Lew Pitcher · Aug 28, 2010

I'm porting a couple of C apps to OS X and I notice that one of the apps
has the following in a .h file:

#define UBYTE unsigned char // 8-bit unsigned
#define BYTE signed char // 8-bit signed
#define UWORD unsigned short // 16-bit unsigned
#define WORD short // 16-bit signed
#define ULONG unsigned int // 32-bit unsigned
#define LONG int // 32-bit signed
#define ULONG64 unsigned long long // 64-bit unsigned
#define LONG64 long long // 64-bit signed

The other seems to use the naked type definitions such as short,
unsigned long, etc.

Now, as both these apps have been ported at least once before, and while
I've made the first of them apparently work, I can imagine there may be
problems due to word lengths and types. Is there an easy way to determine
the number of bits the compiler allocates to each data type?

For C99-compliant compilers, you don't have to; the <stdint.h> header will
contain bit-length-qualified typedefs for integer types, making
bit-length-dependent code portable through their use.

For instance, your compiler might offer
uint_8t, uint_16t, int_16t, uint_32t, int32_t, uint_64t and int_64t
corresponding to
an 8bit unsigned integer type,
a 16bit unsigned integer type,
a 16bit signed integer type,
a 32bit unsigned integer type,
a 32bit signed integer type,
a 64bit unsigned integer type, and
a 64bit signed integer type.
You would use these /instead/ of your handcoded UBYTE ... macros.
If the platform doesn't offer a 32bit unsigned type (for instance), stdint.h
will not include uint_32t, and any code which depends on this type will not
compile.
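
As a sketch of that replacement, assuming a C99 <stdint.h> and using the
standard spellings (the width comes before the underscore, as in uint8_t).
The typedef `ulong_is_32_bits` is a made-up name for a compile-time check.

```c
#include <limits.h>
#include <stdint.h>

/* Exact-width typedefs in place of the hand-coded macros. */
typedef uint8_t  UBYTE;
typedef int8_t   BYTE;
typedef uint16_t UWORD;
typedef int16_t  WORD;
typedef uint32_t ULONG;
typedef int32_t  LONG;
typedef uint64_t ULONG64;
typedef int64_t  LONG64;

/* Compile-time check usable before C11: the array size is -1 if false. */
typedef char ulong_is_32_bits[(sizeof(ULONG) * CHAR_BIT == 32) ? 1 : -1];
```

If any of these exact-width types is missing on the target, the header
fails to compile, which is the behaviour described above.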

For pre-C99, there is no direct equivalent to stdint.h. However, you should
be able to determine the sizes of all the data types /for a specific
compiler/ through its documentation. If the compiler's documentation fails
to give you the correct values, you can always fall back on
the "experimental" method, and code, compile and run a program that reports
the size of each integral type.

Again, *this is compiler-specific* information, and can only be determined
case by case against each compiler. While there are specific /minimum
requirements/ for integral types, there are no limits on how the compiler
supports these minimums; the compiler /might/ support all 8bit integral
types through 32bit integers; sizeof (char) will report 1 (as per the
definition), but CHAR_BIT might evaluate to 32.
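
To make the distinction concrete, here is a small sketch that separates
what is guaranteed from what varies. The function names are invented for
illustration; the limits tested are the minimum magnitudes the C standard
requires of <limits.h>.

```c
#include <limits.h>

/* Returns 1 when the guarantees every conforming compiler makes hold. */
int minimums_hold(void)
{
    return sizeof(char) == 1          /* by definition, whatever CHAR_BIT is */
        && CHAR_BIT >= 8
        && SCHAR_MAX >= 127
        && SHRT_MAX >= 32767
        && INT_MAX >= 32767
        && LONG_MAX >= 2147483647L;
}

/* Storage bits of int: this is the part that differs between compilers. */
unsigned int_storage_bits(void)
{
    return (unsigned)(sizeof(int) * CHAR_BIT);
}
```

On a conforming compiler `minimums_hold()` is always 1, while
`int_storage_bits()` can legitimately be 16, 32, 64 or something else.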

HTH

Ben Pfaff · Aug 28, 2010

Lew Pitcher said:
For C99-compliant compilers, you don't have to; the <stdint.h> header will
contain bit-length-qualified typedefs for integer types, making
bit-length-dependent code portable through their use.

For instance, your compiler might offer
uint_8t, uint_16t, int_16t, uint_32t, int32_t, uint_64t and int_64t

You've got the names switched around. They are (u)int<N>_t; that
is, the number goes before the underscore, not after.

Tim Rentsch · Aug 28, 2010

Tim Streater said:
I'm porting a couple of C apps to OS X and I notice that one of the apps
has the following in a .h file:

#define UBYTE unsigned char // 8-bit unsigned
#define BYTE signed char // 8-bit signed
#define UWORD unsigned short // 16-bit unsigned
#define WORD short // 16-bit signed
#define ULONG unsigned int // 32-bit unsigned
#define LONG int // 32-bit signed
#define ULONG64 unsigned long long // 64-bit unsigned
#define LONG64 long long // 64-bit signed

The other seems to use the naked type definitions such as short,
unsigned long, etc.

Now, as both these apps have been ported at least once before, and while
I've made the first of them apparently work, I can imagine there may be
problems due to word lengths and types. Is there an easy way to determine
the number of bits the compiler allocates to each data type?

The program below may yield the information you're seeking.

/* Find widths and maximum values of regular integer types. */

#include <limits.h>

#define TYPE_WIDTH(T) ((unsigned) IMAX_BITS( TYPE_MAX(T) ) + ((T)-1 < 1))

#define TYPE_MAX(T) ((T) TM_WT_( T, UNSIGNED_MAX_MAX ))

#if __STDC_VERSION__ > 199900L
#include <stdint.h>
#include <inttypes.h>

#if defined UINTMAX_MAX
#define UNSIGNED_MAX_MAX UINTMAX_MAX
#define UMAX_FORMAT PRIuMAX
typedef uintmax_t unsigned_max;

#elif defined ULLONG_MAX
#define UNSIGNED_MAX_MAX ULLONG_MAX
#define UMAX_FORMAT "llu"
typedef unsigned long long unsigned_max;

#else
#define UNSIGNED_MAX_MAX ULONG_MAX
#define UMAX_FORMAT "lu"
typedef unsigned long unsigned_max;

#endif

#elif defined ULLONG_MAX
#define UNSIGNED_MAX_MAX ULLONG_MAX
#define UMAX_FORMAT "llu"
typedef unsigned long long unsigned_max;

#else
#define UNSIGNED_MAX_MAX ULONG_MAX
#define UMAX_FORMAT "lu"
typedef unsigned long unsigned_max;

#endif

#define IMAX_BITS(m) ((m) /((m)%0x3fffffffL+1) /0x3fffffffL %0x3fffffffL *30 \
+ (m)%0x3fffffffL /((m)%31+1)/31%31*5 + 4-12/((m)%31+3))

#if IMAX_BITS(UNSIGNED_MAX_MAX) > 16384
#error number of bits in UNSIGNED_MAX_MAX unreasonably high

#elif IMAX_BITS(UNSIGNED_MAX_MAX) > 8192
#define TM_WT_(T,v) TM_D_(T,v)

#elif IMAX_BITS(UNSIGNED_MAX_MAX) > 4096
#define TM_WT_(T,v) TM_C_(T,v)

#elif IMAX_BITS(UNSIGNED_MAX_MAX) > 2048
#define TM_WT_(T,v) TM_B_(T,v)

#elif IMAX_BITS(UNSIGNED_MAX_MAX) > 1024
#define TM_WT_(T,v) TM_A_(T,v)

#elif IMAX_BITS(UNSIGNED_MAX_MAX) > 512
#define TM_WT_(T,v) TM_9_(T,v)

#elif IMAX_BITS(UNSIGNED_MAX_MAX) > 256
#define TM_WT_(T,v) TM_8_(T,v)

#elif IMAX_BITS(UNSIGNED_MAX_MAX) > 128
#define TM_WT_(T,v) TM_7_(T,v)

#elif IMAX_BITS(UNSIGNED_MAX_MAX) > 64
#define TM_WT_(T,v) TM_6_(T,v)

#elif IMAX_BITS(UNSIGNED_MAX_MAX) > 32
#define TM_WT_(T,v) TM_5_(T,v)

#elif IMAX_BITS(UNSIGNED_MAX_MAX) > 16
#define TM_WT_(T,v) TM_4_(T,v)

#else
#error number of bits in UNSIGNED_MAX_MAX impossibly low

#endif

#define TM_D_(T,v) ( TM_OK(T,v>>8191) ? TM_C_(T,v) : TM_C_(T,(v>>8192)) )
#define TM_C_(T,v) ( TM_OK(T,v>>4095) ? TM_B_(T,v) : TM_B_(T,(v>>4096)) )
#define TM_B_(T,v) ( TM_OK(T,v>>2047) ? TM_A_(T,v) : TM_A_(T,(v>>2048)) )
#define TM_A_(T,v) ( TM_OK(T,v>>1023) ? TM_9_(T,v) : TM_9_(T,(v>>1024)) )
#define TM_9_(T,v) ( TM_OK(T,v>> 511) ? TM_8_(T,v) : TM_8_(T,(v>> 512)) )
#define TM_8_(T,v) ( TM_OK(T,v>> 255) ? TM_7_(T,v) : TM_7_(T,(v>> 256)) )
#define TM_7_(T,v) ( TM_OK(T,v>> 127) ? TM_6_(T,v) : TM_6_(T,(v>> 128)) )
#define TM_6_(T,v) ( TM_OK(T,v>> 63) ? TM_5_(T,v) : TM_5_(T,(v>> 64)) )
#define TM_5_(T,v) ( TM_OK(T,v>> 31) ? TM_4_(T,v) : TM_4_(T,(v>> 32)) )
#define TM_4_(T,v) ( TM_OK(T,v>> 15) ? TM_3_(T,v) : TM_3_(T,(v>> 16)) )
#define TM_3_(T,v) ( TM_OK(T,v>> 7) ? TM_2_(T,v) : TM_2_(T,(v>> 8)) )
#define TM_2_(T,v) ( TM_OK(T,v>> 3) ? TM_1_(T,v) : TM_1_(T,(v>> 4)) )
#define TM_1_(T,v) ( TM_OK(T,v>> 1) ? TM_0_(T,v) : TM_0_(T,(v>> 2)) )
#define TM_0_(T,v) ( TM_OK(T,v ) ? v : v>> 1 )

#define TM_OK(T,v) ( (T)(v) > 0 && (T)(v) == (v) )

#include <stddef.h>
#include <stdio.h>

char test_array[ TYPE_MAX(char) ];

#define PRINT_WIDTH(T) \
    printf( "Width of %20s is %5u\n", #T, TYPE_WIDTH(T) )

#define PRINT_MAX(T) \
    printf( "Maximum value of %20s is %25" UMAX_FORMAT "\n", \
        #T, (unsigned_max) TYPE_MAX(T) \
    )

int
main(){
#if __STDC_VERSION__ > 199900L
    PRINT_WIDTH(_Bool);
    printf( "\n" );
#endif

    PRINT_WIDTH(char);
    PRINT_WIDTH(signed char);
    PRINT_WIDTH(unsigned char);
    printf( "\n" );

    PRINT_WIDTH(short);
    PRINT_WIDTH(signed short);
    PRINT_WIDTH(unsigned short);
    printf( "\n" );

    PRINT_WIDTH(int);
    PRINT_WIDTH(signed int);
    PRINT_WIDTH(unsigned int);
    printf( "\n" );

    PRINT_WIDTH(long);
    PRINT_WIDTH(signed long);
    PRINT_WIDTH(unsigned long);
    printf( "\n" );

    PRINT_WIDTH(ptrdiff_t);
    PRINT_WIDTH(size_t);
    printf( "\n" );

#if __STDC_VERSION__ > 199900L || defined ULLONG_MAX
    PRINT_WIDTH(long long);
    PRINT_WIDTH(signed long long);
    PRINT_WIDTH(unsigned long long);
    printf( "\n" );
#endif

#if __STDC_VERSION__ > 199900L
    PRINT_MAX(_Bool);
    printf( "\n" );
#endif

    PRINT_MAX(char);
    PRINT_MAX(signed char);
    PRINT_MAX(unsigned char);
    printf( "\n" );

    PRINT_MAX(short);
    PRINT_MAX(signed short);
    PRINT_MAX(unsigned short);
    printf( "\n" );

    PRINT_MAX(int);
    PRINT_MAX(signed int);
    PRINT_MAX(unsigned int);
    printf( "\n" );

    PRINT_MAX(long);
    PRINT_MAX(signed long);
    PRINT_MAX(unsigned long);
    printf( "\n" );

    PRINT_MAX(ptrdiff_t);
    PRINT_MAX(size_t);
    printf( "\n" );

#if __STDC_VERSION__ > 199900L || defined ULLONG_MAX
    PRINT_MAX(long long);
    PRINT_MAX(signed long long);
    PRINT_MAX(unsigned long long);
    printf( "\n" );
#endif

    printf( "\n" );
    printf( " sizeof test_array is %25" UMAX_FORMAT "\n",
        (unsigned_max) sizeof test_array
    );
    printf( "\n" );

    printf( " Value of UNSIGNED_MAX_MAX is %25" UMAX_FORMAT "\n",
        (unsigned_max) UNSIGNED_MAX_MAX
    );
    printf( "\n" );

    printf( " UMAX_FORMAT is %25s\n", UMAX_FORMAT );
    printf( "\n" );

    return 0;
}

Eric Sosman · Aug 28, 2010

I'm porting a couple of C apps to OS X and I notice that one of the apps
has the following in a .h file:

#define UBYTE unsigned char // 8-bit unsigned
#define BYTE signed char // 8-bit signed
#define UWORD unsigned short // 16-bit unsigned
#define WORD short // 16-bit signed
#define ULONG unsigned int // 32-bit unsigned
#define LONG int // 32-bit signed
#define ULONG64 unsigned long long // 64-bit unsigned
#define LONG64 long long // 64-bit signed

The other seems to use the naked type definitions such as short,
unsigned long, etc.

Now, as both these apps have been ported at least once before, and while
I've made the first of them apparently work, I can imagine there may be
problems due to word lengths and types. Is there an easy way to determine
the number of bits the compiler allocates to each data type?

Others have mentioned the <stdint.h> header from C99. Since
the `long long' type didn't officially enter C until C99, it may
be that you're using a C99-conforming system and you're all set.
On the other hand, `long long' is also provided by some C90 compilers
as an extension to the language, so the mere presence of `long long'
doesn't absolutely prove that <stdint.h> is available ...

No matter which version of the Standard your implementation
follows, the <limits.h> header can answer the question as you've
asked it:

#include <limits.h>
int bits_in_a_T = sizeof(T) * CHAR_BIT;

This approach yields an answer, but unfortunately it's not an answer
you can test in the preprocessor with #if and so on: The preprocessor
operates before types come into existence, so sizeof(T) can't be
evaluated. For a preprocessor-time test you can ask a slightly
different question:

#include <limits.h>
#if UCHAR_MAX == 255
#define UBYTE unsigned char
#else
#error "No 8-bit unsigned type"
#endif
#if USHRT_MAX == 65535
#define UWORD unsigned short
#else
#error "No 16-bit unsigned type"
#endif
...

Note that this is not exactly the question you posed -- but on the
other hand, it's usually the question that *should* have been posed.

Keith Thompson · Aug 28, 2010

Lew Pitcher said:
For pre-C99, there is no direct equivalent to stdint.h. However, you should
be able to determine the sizes of all the data types /for a specific
compiler/ through its documentation. If the compiler's documentation fails
to give you the correct values, you can always fall back on
the "experimental" method, and code, compile and run a program that reports
the size of each integral type.

You might take a look at Doug Gwyn's "q8" at
<http://www.lysator.liu.se/c/q8/index.html>.

Tim Streater · Aug 29, 2010

christian.bau said:
I like the

#define LONG int

which means that long* and LONG* are incompatible pointers, and quite
possibly sizeof (long) > sizeof (LONG). And if anyone tries to pass
the address of a long to a function that expects a LONG*, and gives in
to the temptation to cast like (LONG *)&longValue, then hilarity
starts.

I also like the

#define BYTE signed char

which will confuse the hell out of anyone who compares (BYTE) 0x70 and
(BYTE) 0x80 and finds that the first one has the larger value.
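
A short sketch of that comparison, assuming an 8-bit two's-complement
signed char (the usual case, but the result of converting 0x80 is
implementation-defined). The function name is invented for illustration.

```c
#define BYTE signed char  /* as in the quoted header */

/* Nonzero when (BYTE)0x70 compares greater than (BYTE)0x80. */
int byte_0x70_is_larger(void)
{
    BYTE a = (BYTE)0x70;  /* 112 fits in a signed 8-bit type */
    BYTE b = (BYTE)0x80;  /* 128 does not fit; implementation-defined,
                             -128 on common two's-complement compilers */
    return a > b;
}
```

With the assumption above the function returns 1, because the second value
has wrapped to a negative number.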

"WORD" is of course the obvious sign of what is going on. It's code
written by bloody Windows programmers who tried to write portable code
and don't have a clue.

I'd change them to typedef's and typedef them to the types uint8_t,
int8_t, uint16_t etc. If someone used "unsigned WORD" somewhere in the
code, you'll get an error and fix it.

Thanks, and thanks to the others who responded; I've saved all your
comments. I haven't done any C for 20 years so this should be interesting.

Of the two programs, the one that seems to work is from circa 1989,
written for the VAX. I had to promote a short to an int, and change all
the function parameter declarations from this style:

int wiggy (x, y)
int x;
int y;
{
// code
}

to the required one. Then it compiled with some warnings and runs - but
I haven't tested it very much yet.
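
For reference, the same function in prototype style might look like the
sketch below. The body is a hypothetical stand-in, since the original code
is not shown.

```c
/* Declaration: callers now get their argument types checked. */
int wiggy(int x, int y);

/* Prototype-style definition: the types sit in the parameter list. */
int wiggy(int x, int y)
{
    return x + y;  /* placeholder body, not the original program's code */
}
```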

The other one, whose .h file I quoted, dates from May 1995 (unknown
host, possibly Amiga) but was ported to the PC in Feb 1999. One of the
things the PC guy did was to add an option to have a lot of strings that
the program puts out be in lower rather than upper case - by converting
the original strings. But the OS X compiler appears to put these in
read-only memory, so this is already fun.
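
One common way around this, sketched under the assumption that the strings
can be held in arrays rather than pointed at directly. All names here are
invented for illustration and are not taken from the program in question.

```c
#include <ctype.h>
#include <stddef.h>

/* Lower-case a writable buffer in place and return it. */
char *lower_in_place(char *s)
{
    size_t i;
    for (i = 0; s[i] != '\0'; i++) {
        /* tolower() needs a value representable as unsigned char */
        s[i] = (char)tolower((unsigned char)s[i]);
    }
    return s;
}

/* An array initialised from a literal is a modifiable copy of it... */
char greeting[] = "HELLO, WORLD";

/* ...whereas this pointer refers to the literal itself; writing through
   it is undefined behaviour, so it is declared const. */
const char *fixed_greeting = "HELLO, WORLD";
```

`lower_in_place(greeting)` is fine, while passing the literal itself is what
crashes when the compiler places literals in read-only memory.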

Cheers,

Geoff · Aug 29, 2010

"WORD" is of course the obvious sign of what is going on. It's code
written by bloody Windows programmers who tried to write portable code
and don't have a clue.

Possibly. It's missing the infamous DWORD.

Thomas Richter · Aug 29, 2010

christian.bau said:
I like the

#define LONG int

which means that long* and LONG* are incompatible pointers, and quite
possibly sizeof (long) > sizeof (LONG).

Which might be just the intention. Note that on GNU 64 bit systems, LONG
would be a 32 bit entity, while long would be 64 bit wide. If you use
the above types consistently (!) then there is no problem.
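
A sketch of what consistent use can look like where LONG-based code meets
code that uses plain long: convert the value, never the pointer. The
function names are hypothetical.

```c
#define LONG int  /* as in the quoted header */

/* Hypothetical callee that writes through a LONG pointer. */
static void store_answer(LONG *out)
{
    *out = 42;
}

/* Bridge to code that uses plain long: go through a LONG temporary. */
long answer_as_long(void)
{
    LONG tmp;
    store_answer(&tmp);  /* pointer type matches exactly, no cast needed */
    return tmp;          /* converting the value int -> long is always safe */
}
```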

And if anyone tries to pass
the address of a long to a function that expects a LONG*, and gives in
to the temptation to cast like (LONG *)&longValue, then hilarity
starts.

And your point is?

I also like the

#define BYTE signed char

which will confuse the hell out of anyone who compares (BYTE) 0x70 and
(BYTE) 0x80 and finds that the first one has the larger value.

And your point is?

"WORD" is of course the obvious sign of what is going on. It's code
written by bloody Windows programmers who tried to write portable code
and don't have a clue.

Unlikely. Win has DWORD, not LONG. In fact, the above definitions are
rather in the tradition of AmigaOS coding, and not understanding or
following them yourself does not mean that the author has no clue.
The author probably just had a tradition different from yours. There, LONG
was always understood to be 32 bits wide, BYTE a signed 8-bit type, and
so on. Thus, it was perfectly understood what these types would be in
such an environment. Probably not by you, but that was not the question
to begin with.

I'd change them to typedef's and typedef them to the types uint8_t,
int8_t, uint16_t etc. If someone used "unsigned WORD" somewhere in the
code, you'll get an error and fix it.

Changing to typedefs is good advice, but int8_t etc. are not very
usable nowadays. There are still compilers out there, even for very
popular platforms, that do not support C99. Unfortunately, for such
platforms, autoconf - which I would otherwise strongly recommend - is
of no value either. Thus, check the compiler documentation, and insert
the proper types by hand.

So long,
Thomas

Ian Collins · Aug 29, 2010

Which might be just the intention. Note that on GNU 64 bit systems, LONG
would be a 32 bit entity, while long would be 64 bit wide. If you use
the above types consistently (!) then there is no problem.

And your point is?

The point is sizeof(long) != sizeof(LONG) which is confusing at best,
downright stupid at worst.

And your point is?

The point is 0x80 is greater than 0x70. People tend to assume a byte is
an unsigned 8 bit unit.

Changing to typedefs is good advice, but int8_t etc. are not very
usable nowadays. There are still compilers out there, even for very
popular platforms, that do not support C99.

But most platforms do have <stdint.h> in their system headers. Checking
for and substituting one's own version is trivial.
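
A minimal sketch of such a check, assuming only <limits.h> on the pre-C99
path. The names `port_int32` and `port_uint32` are made up here so that
they cannot clash with anything a system header defines.

```c
#include <limits.h>

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#include <stdint.h>            /* C99 or later: use the real types */
typedef int32_t  port_int32;
typedef uint32_t port_uint32;
#elif INT_MAX == 2147483647    /* fallback: choose by range */
typedef int          port_int32;
typedef unsigned int port_uint32;
#elif LONG_MAX == 2147483647L
typedef long          port_int32;
typedef unsigned long port_uint32;
#else
#error "no 32-bit integer type found"
#endif
```

The fallback branches test the value range rather than the storage size,
which is what the preprocessor is able to see.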

Thomas Richter · Aug 29, 2010

Ian said:
The point is sizeof(long) != sizeof(LONG) which is confusing at best,
downright stupid at worst.

Why? A "long" is a compiler dependent quantity, a LONG is not. I
personally don't consider this overly confusing. I rather find it more
confusing to have a long 32 bit wide on some, 64 bit wide on other
platforms (and even completely other widths on more exotic systems). I
prefer to think - but this is, as I said, a tradition - that LONGs are 32
bits wide.

The point is 0x80 is greater than 0x70. People tend to assume a byte is
an unsigned 8 bit unit.

Actually, no, I don't. Neither do other popular languages (java, for
example). It is really a matter of what you're used to.

But most platforms do have <stdint.h> in their system headers. Checking
for and substituting one's own version is trivial.

True enough, but going through existing source and changing it there
is not trivial, nor is establishing a different coding tradition if
there is already an in-house tradition.

But again, as I said, there is nothing to argue about such traditions, I
just don't agree with the overly harsh reaction above.

Greetings,

Thomas

Seebs · Aug 29, 2010

Why? A "long" is a compiler dependent quantity, a LONG is not. I
personally don't consider this overly confusing. I rather find it more
confusing to have a long 32 bit wide on some, 64 bit wide on other
platforms (and even completely other widths on more exotic systems). I
prefer to think - but this is, as I said, a tradition - that LONGs are 32
bits wide.

Uh.

That would sure confuse the heck out of me. Sure, the all-caps would make
me check it, but I expect a "long" to be a size that is "long" to the
underlying hardware, and I don't expect it to have a constant size.
If I want int32_t, I know where to find it.

But again, as I said, there is nothing to argue about such traditions, I
just don't agree with the overly harsh reaction above.

I saw no overly harsh reaction. If anything, I saw a reaction that was
a little less firm than I would normally be.

-s

Tim Streater · Aug 29, 2010

Tim Rentsch said:
The program below may yield the information you're seeking.

/* Find widths and maximum values of regular integer types. */

[snip program]

Thanks - that worked a treat.

Ian Collins · Aug 29, 2010

Why? A "long" is a compiler dependent quantity, a LONG is not. I
personally don't consider this overly confusing. I rather find it more
confusing to have a long 32 bit wide on some, 64 bit wide on other
platforms (and even completely other widths on more exotic systems). I
prefer to think - but this is, as I said, a tradition - that LONGs are 32
bits wide.

If you want a fixed-width type, use one by name, not an historical
reference. Pity the poor newcomer to the code base.

True enough, but going through existing source and changing it there
is not trivial, nor is establishing a different coding tradition if
there is already an in-house tradition.

You asserted "int8_t etc. are not very usable nowadays", which is nonsense.

But again, as I said, there is nothing to argue about such traditions, I
just don't agree with the overly harsh reaction above.

Tradition has nothing to do with falsehoods.

Felix Palmen · Aug 29, 2010

* christian.bau said:
Would you like the version that harsh and impolite?

Yes, please! That's just SO much more fun to read.

Malcolm McLean · Aug 30, 2010

If your compiler doesn't support int32_t etc., then you are not using
a C compiler. So what are you posting on comp.lang.c for?

Even on its own terms this is wrong (the types are not required to
exist, because they can't easily be supported on odd architectures).
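
A sketch of how code can cope with that, assuming a C99 <stdint.h>: the
macro INT32_MAX is defined only when int32_t exists, while int_least32_t
is always required. The names `at_least_32` and `HAVE_EXACT_INT32` are
invented for illustration.

```c
#include <limits.h>
#include <stdint.h>

#ifdef INT32_MAX
/* The exact-width type is available on this implementation. */
typedef int32_t at_least_32;
#define HAVE_EXACT_INT32 1
#else
/* Odd architecture: fall back to the smallest type with >= 32 bits. */
typedef int_least32_t at_least_32;
#define HAVE_EXACT_INT32 0
#endif
```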

However, plenty of people have to use old or non-standard C compilers.
As long as the language is recognisably C, it is on-topic; the
newsgroup predates standardisation.
