Getting lengths of short, int, etc

Discussion in 'C Programming' started by Tim Streater, Aug 28, 2010.

  1. Tim Streater

    Tim Streater Guest

    I'm porting a couple of C apps to OS X and I notice that one of the apps
    has the following in a .h file:

    #define UBYTE unsigned char // 8-bit unsigned
    #define BYTE signed char // 8-bit signed
    #define UWORD unsigned short // 16-bit unsigned
    #define WORD short // 16-bit signed
    #define ULONG unsigned int // 32-bit unsigned
    #define LONG int // 32-bit signed
    #define ULONG64 unsigned long long // 64-bit unsigned
    #define LONG64 long long // 64-bit signed

    The other seems to use the naked type definitions such as short,
    unsigned long, etc.

    Now, as both these apps have been ported at least once before, and while
    I've made the first of them apparently work, I can imagine there may be
    problems due to word lengths and types. Is there an easy way to determine
    the numbers of bits the compiler allocates to each data type?
    Tim Streater, Aug 28, 2010

  2. Tim Streater

    Felix Palmen Guest

    The GNU autotools approach to this is to compile little test programs
    and let them output sizeof(<type>). So, if you really need to know sizes
    of integers, I'd say let the build system find out for you.

    Regards, Felix
    Felix Palmen, Aug 28, 2010

  3. Tim Streater

    Lew Pitcher Guest

    For C99-compliant compilers, you don't have to; the <stdint.h> header will
    contain bit-length-qualified typedefs for integer types, making
    bit-length-dependent code portable through their use.

    For instance, your compiler might offer
    uint_8t, uint_16t, int_16t, uint_32t, int32_t, uint_64t and int_64t
    corresponding to
    an 8bit unsigned integer type,
    a 16bit unsigned integer type,
    a 16bit signed integer type,
    a 32bit unsigned integer type,
    a 32bit signed integer type,
    a 64bit unsigned integer type, and
    a 64bit signed integer type.
    You would use these /instead/ of your handcoded UBYTE ... macros.
    If the platform doesn't offer a 32bit unsigned type (for instance), stdint.h
    will not include uint_32t, and any code which depends on this type will not
    compile.

    For pre-C99, there is no direct equivalent to stdint.h. However, you should
    be able to determine the sizes of all the data types /for a specific
    compiler/ through its documentation. If the compiler's documentation fails
    to give you the correct values, you can always fall back on
    the "experimental" method, and code, compile and run a program that reports
    the sizeof of each integral type.

    Again, *this is compiler-specific* information, and can only be determined
    case by case against each compiler. While there are specific /minimum
    requirements/ for integral types, there are no limits on how the compiler
    supports these minimums; the compiler /might/ support all 8bit integral
    types through 32bit integers; sizeof (char) will report 1, (as per the
    definition), but CHAR_BIT might evaluate to 32.

    Lew Pitcher, Aug 28, 2010
  4. Tim Streater

    Ben Pfaff Guest

    You've got the names switched around. They are (u)int<N>_t; that
    is, the number goes before the underscore, not after.
    Ben Pfaff, Aug 28, 2010
  5. Tim Streater

    Tim Rentsch Guest

    The program below may yield the information you're seeking.

    /* Find widths and maximum values of regular integer types. */

    #include <limits.h>

    #define TYPE_WIDTH(T) ((unsigned) IMAX_BITS( TYPE_MAX(T) ) + ((T)-1 < 1))

    #define TYPE_MAX(T) ((T) TM_WT_( T, UNSIGNED_MAX_MAX ))

    /* Pick the widest unsigned type available and its printf format. */
    #if __STDC_VERSION__ > 199900L
    #include <stdint.h>
    #include <inttypes.h>

    #if defined UINTMAX_MAX
    #define UNSIGNED_MAX_MAX UINTMAX_MAX
    #define UMAX_FORMAT PRIuMAX
    typedef uintmax_t unsigned_max;

    #elif defined ULLONG_MAX
    #define UNSIGNED_MAX_MAX ULLONG_MAX
    #define UMAX_FORMAT "llu"
    typedef unsigned long long unsigned_max;

    #else
    #define UNSIGNED_MAX_MAX ULONG_MAX
    #define UMAX_FORMAT "lu"
    typedef unsigned long unsigned_max;

    #endif

    #elif defined ULLONG_MAX
    #define UNSIGNED_MAX_MAX ULLONG_MAX
    #define UMAX_FORMAT "llu"
    typedef unsigned long long unsigned_max;

    #else
    #define UNSIGNED_MAX_MAX ULONG_MAX
    #define UMAX_FORMAT "lu"
    typedef unsigned long unsigned_max;

    #endif

    /* Number of bits in m, where m has the form (1<<k)-1. */
    #define IMAX_BITS(m) ((m) /((m)%0x3fffffffL+1) /0x3fffffffL %0x3fffffffL *30 \
    + (m)%0x3fffffffL /((m)%31+1)/31%31*5 + 4-12/((m)%31+3))

    /* Choose the first step of the binary search from the width of
       UNSIGNED_MAX_MAX, so that no shift exceeds that width. */
    #if IMAX_BITS(UNSIGNED_MAX_MAX) > 16384
    #error number of bits in UNSIGNED_MAX_MAX unreasonably high

    #elif IMAX_BITS(UNSIGNED_MAX_MAX) > 8192
    #define TM_WT_(T,v) TM_D_(T,v)

    #elif IMAX_BITS(UNSIGNED_MAX_MAX) > 4096
    #define TM_WT_(T,v) TM_C_(T,v)

    #elif IMAX_BITS(UNSIGNED_MAX_MAX) > 2048
    #define TM_WT_(T,v) TM_B_(T,v)

    #elif IMAX_BITS(UNSIGNED_MAX_MAX) > 1024
    #define TM_WT_(T,v) TM_A_(T,v)

    #elif IMAX_BITS(UNSIGNED_MAX_MAX) > 512
    #define TM_WT_(T,v) TM_9_(T,v)

    #elif IMAX_BITS(UNSIGNED_MAX_MAX) > 256
    #define TM_WT_(T,v) TM_8_(T,v)

    #elif IMAX_BITS(UNSIGNED_MAX_MAX) > 128
    #define TM_WT_(T,v) TM_7_(T,v)

    #elif IMAX_BITS(UNSIGNED_MAX_MAX) > 64
    #define TM_WT_(T,v) TM_6_(T,v)

    #elif IMAX_BITS(UNSIGNED_MAX_MAX) > 32
    #define TM_WT_(T,v) TM_5_(T,v)

    #elif IMAX_BITS(UNSIGNED_MAX_MAX) > 16
    #define TM_WT_(T,v) TM_4_(T,v)

    #else
    #error number of bits in UNSIGNED_MAX_MAX impossibly low

    #endif

    /* Binary search for the largest value of the form (1<<k)-1 that
       survives a round trip through type T. */
    #define TM_D_(T,v) ( TM_OK(T,v>>8191) ? TM_C_(T,v) : TM_C_(T,(v>>8192)) )
    #define TM_C_(T,v) ( TM_OK(T,v>>4095) ? TM_B_(T,v) : TM_B_(T,(v>>4096)) )
    #define TM_B_(T,v) ( TM_OK(T,v>>2047) ? TM_A_(T,v) : TM_A_(T,(v>>2048)) )
    #define TM_A_(T,v) ( TM_OK(T,v>>1023) ? TM_9_(T,v) : TM_9_(T,(v>>1024)) )
    #define TM_9_(T,v) ( TM_OK(T,v>> 511) ? TM_8_(T,v) : TM_8_(T,(v>> 512)) )
    #define TM_8_(T,v) ( TM_OK(T,v>> 255) ? TM_7_(T,v) : TM_7_(T,(v>> 256)) )
    #define TM_7_(T,v) ( TM_OK(T,v>> 127) ? TM_6_(T,v) : TM_6_(T,(v>> 128)) )
    #define TM_6_(T,v) ( TM_OK(T,v>> 63) ? TM_5_(T,v) : TM_5_(T,(v>> 64)) )
    #define TM_5_(T,v) ( TM_OK(T,v>> 31) ? TM_4_(T,v) : TM_4_(T,(v>> 32)) )
    #define TM_4_(T,v) ( TM_OK(T,v>> 15) ? TM_3_(T,v) : TM_3_(T,(v>> 16)) )
    #define TM_3_(T,v) ( TM_OK(T,v>> 7) ? TM_2_(T,v) : TM_2_(T,(v>> 8)) )
    #define TM_2_(T,v) ( TM_OK(T,v>> 3) ? TM_1_(T,v) : TM_1_(T,(v>> 4)) )
    #define TM_1_(T,v) ( TM_OK(T,v>> 1) ? TM_0_(T,v) : TM_0_(T,(v>> 2)) )
    #define TM_0_(T,v) ( TM_OK(T,v ) ? v : v>> 1 )

    #define TM_OK(T,v) ( (T)(v) > 0 && (T)(v) == (v) )

    #include <stddef.h>
    #include <stdio.h>

    /* Shows that TYPE_MAX(T) is usable as a constant expression. */
    char test_array[ TYPE_MAX(char) ];

    #define PRINT_WIDTH(T) \
    printf( "Width of %20s is %5u\n", #T, TYPE_WIDTH(T) )

    #define PRINT_MAX(T) \
    printf( "Maximum value of %20s is %25" UMAX_FORMAT "\n", \
    #T, (unsigned_max) TYPE_MAX(T) \
    )

    int main( void )
    {
    #if __STDC_VERSION__ > 199900L
        printf( "\n" );
    #endif

        PRINT_WIDTH(signed char);
        PRINT_WIDTH(unsigned char);
        printf( "\n" );

        PRINT_WIDTH(signed short);
        PRINT_WIDTH(unsigned short);
        printf( "\n" );

        PRINT_WIDTH(signed int);
        PRINT_WIDTH(unsigned int);
        printf( "\n" );

        PRINT_WIDTH(signed long);
        PRINT_WIDTH(unsigned long);
        printf( "\n" );

        printf( "\n" );

    #if __STDC_VERSION__ > 199900L || defined ULLONG_MAX
        PRINT_WIDTH(long long);
        PRINT_WIDTH(signed long long);
        PRINT_WIDTH(unsigned long long);
        printf( "\n" );
    #endif

    #if __STDC_VERSION__ > 199900L
        printf( "\n" );
    #endif

        PRINT_MAX(signed char);
        PRINT_MAX(unsigned char);
        printf( "\n" );

        PRINT_MAX(signed short);
        PRINT_MAX(unsigned short);
        printf( "\n" );

        PRINT_MAX(signed int);
        PRINT_MAX(unsigned int);
        printf( "\n" );

        PRINT_MAX(signed long);
        PRINT_MAX(unsigned long);
        printf( "\n" );

        printf( "\n" );

    #if __STDC_VERSION__ > 199900L || defined ULLONG_MAX
        PRINT_MAX(long long);
        PRINT_MAX(signed long long);
        PRINT_MAX(unsigned long long);
        printf( "\n" );
    #endif

        printf( "\n" );
        printf( " sizeof test_array is %25" UMAX_FORMAT "\n",
                (unsigned_max) sizeof test_array
        );
        printf( "\n" );

        printf( " Value of UNSIGNED_MAX_MAX is %25" UMAX_FORMAT "\n",
                (unsigned_max) UNSIGNED_MAX_MAX
        );
        printf( "\n" );

        printf( " UMAX_FORMAT is %25s\n", UMAX_FORMAT );
        printf( "\n" );

        return 0;
    }
    Tim Rentsch, Aug 28, 2010
  6. Tim Streater

    Eric Sosman Guest

    Others have mentioned the <stdint.h> header from C99. Since
    the `long long' type didn't officially enter C until C99, it may
    be that you're using a C99-conforming system and you're all set.
    On the other hand, `long long' is also provided by some C90 compilers
    as an extension to the language, so the mere presence of `long long'
    doesn't absolutely prove that <stdint.h> is available ...

    No matter which version of the Standard your implementation
    follows, the <limits.h> header can answer the question as you've
    asked it:

    #include <limits.h>
    int bits_in_a_T = sizeof(T) * CHAR_BIT;

    This approach yields an answer, but unfortunately it's not an answer
    you can test in the preprocessor with #if and so on: The preprocessor
    operates before types come into existence, so sizeof(T) can't be
    evaluated. For a preprocessor-time test you can ask a slightly
    different question:

    #include <limits.h>
    #if UCHAR_MAX == 255
    #define UBYTE unsigned char
    #else
    #error "No 8-bit unsigned type"
    #endif
    #if USHRT_MAX == 65535
    #define UWORD unsigned short
    #else
    #error "No 16-bit unsigned type"
    #endif

    Note that this is not exactly the question you posed -- but on the
    other hand, it's usually the question that *should* have been posed.
    Eric Sosman, Aug 28, 2010
  7. You might take a look at Doug Gwyn's "q8" at
    Keith Thompson, Aug 28, 2010
  8. Tim Streater

    Tim Streater Guest

    Thanks, and thanks to the others who responded; I've saved all your comments. I
    haven't done any C for 20 years so this should be interesting.

    Of the two programs, the one that seems to work is from circa 1989,
    written for the VAX. I had to promote a short to an int, and change all
    the function parameter declarations from this style:

    int wiggy (x, y)
    int x;
    int y;
    {
    // code
    }

    to the required one. Then it compiled with some warnings and runs - but
    I haven't tested it very much yet.

    The other one, whose .h file I quoted, dates from May 1995 (unknown
    host, possibly Amiga) but was ported to the PC in Feb 1999. One of the
    things the PC guy did was to add an option to have a lot of strings that
    the program puts out be in lower rather than upper case - by converting
    the original strings. But the OS X compiler appears to put these in
    read-only memory, so this is already fun.

    Tim Streater, Aug 29, 2010
  9. Tim Streater

    Geoff Guest

    Possibly. It's missing the infamous DWORD.
    Geoff, Aug 29, 2010
  10. Which might be just the intention. Note that on GNU 64 bit systems, LONG
    would be a 32 bit entity, while long would be 64 bit wide. If you use
    the above types consistently (!) then there is no problem.
    And your point is?
    And your point is?
    Unlikely. Win has DWORD, not LONG. In fact, the above definitions are
    rather in the tradition of AmigaOS coding, and just not understanding or
    following them by yourself does not mean that the author has no clue.
    The author probably just had a tradition different from yours. LONG was
    there always understood to be 32 bit wide, BYTE a signed 8 bit type and
    so on. Thus, it was perfectly understood what these types would be in
    such an environment. Probably not by you, but that was not the question
    to begin with.
    Changing to typedefs is good advice, but int8_t etc. are not very
    usable nowadays. There are still compilers out there, even for very
    popular platforms, that do not support C99. Unfortunately, for such
    platforms, autoconf - which I would otherwise strongly recommend - is
    of no value either. Thus, check the compiler documentation, and insert
    the proper types by hand.

    So long,
    Thomas Richter, Aug 29, 2010
  11. Tim Streater

    Ian Collins Guest

    The point is sizeof(long) != sizeof(LONG) which is confusing at best,
    downright stupid at worst.
    The point is 0x80 is greater than 0x70. People tend to assume a byte is
    an unsigned 8 bit unit.
    But most platforms do have <stdint.h> in their system headers. Checking
    for and substituting one's own version is trivial.
    Ian Collins, Aug 29, 2010
  12. Why? A "long" is a compiler dependent quantity, a LONG is not. I
    personally don't consider this overly confusing. I rather find it more
    confusing to have a long 32 bit wide on some, 64 bit wide on other
    platforms (and even completely other widths on more exotic systems). I
    prefer to think - but this is as said a tradition - that LONGs are 32
    bit wide.
    Actually, no, I don't. Neither do other popular languages (java, for
    example). It is really a matter of what you're used to.
    True enough, but not going through existing source and changing it
    there, or establishing a different coding tradition if there is already
    an in-house tradition.

    But again, as I said, there is nothing to argue about such traditions, I
    just don't agree with the overly harsh reaction above.


    Thomas Richter, Aug 29, 2010
  13. Tim Streater

    Seebs Guest


    That would sure confuse the heck out of me. Sure, the all-caps would make
    me check it, but I expect a "long" to be a size that is "long" to the
    underlying hardware, and I don't expect it to have a constant size.
    If I want int32_t, I know where to find it.
    I saw no overly harsh reaction. If anything, I saw a reaction that was
    a little less firm than I would normally be.

    Seebs, Aug 29, 2010
  14. Tim Streater

    Tim Streater Guest

    [snip program]

    Thanks - that worked a treat.
    Tim Streater, Aug 29, 2010
  15. Tim Streater

    Ian Collins Guest

    If you want a fixed width type, use one by name, not an historical
    reference. Pity the poor newcomer to the code base.
    You asserted "int8_t etc. are not very usable nowadays", which is nonsense.
    Tradition has nothing to do with falsehoods.
    Ian Collins, Aug 29, 2010
  16. Tim Streater

    Felix Palmen Guest

    Yes, please! That's just SO much more fun to read :)
    Felix Palmen, Aug 29, 2010
  17. Even on its own terms this is wrong (the types are not required to
    exist, because they can't easily be supported on odd architectures).

    However plenty of people have to use old or non-standard C compilers.
    As long as the language is recognisably C, it is on-topic; the
    newsgroup predates standardisation.
    Malcolm McLean, Aug 30, 2010