Suitable type names needed for portable integers

Discussion in 'C Programming' started by James Harris, Aug 15, 2013.

  1. James Harris

    James Harris Guest

    I am working on code that is to run on 16-bit, 32-bit and 64-bit machines
    and am looking for a way to declare specific types of integers. Could you
    suggest suitable names for the following?

    1. Integers which will be 16-bit on 16-bit machines and 32-bit on both
    32-bit machines and 64-bit machines.

    2. Integers which will be 16-bit on 16-bit machines, 32-bit on 32-bit
    machines and 64-bit on 64-bit machines.

    Both of these types need at least signed and unsigned variants so the need
    is for exactly four type names:

    a name for signed 16, 32, 32
    a name for unsigned 16, 32, 32

    a name for signed 16, 32, 64
    a name for unsigned 16, 32, 64

    I intend to define these names explicitly so that they are not subject to
    the defaults of any given compiler. So I don't want to use int, for example.
    I was going to use sint and uint as two of the names but one compiler
    predefines uint ... which is a pain. So I am looking for something else. All
    that's needed are four names. Any suggestions? Any precedent?

    James Harris, Aug 15, 2013

  2. James Harris

    James Kuyper Guest

    I suspect that you've made a decision about how to achieve a goal, and
    are trying to figure out how to implement that decision, when what you
    really should be doing is reconsidering that decision. I suspect I could
    recommend a better way of achieving that goal, if I knew what it was. So
    far, you've described your decision, but not the goal it's intended to
    achieve.

    The <stdint.h> header that was added in C99 provides three pairs of
    families of type names: intN_t, int_leastN_t, and int_fastN_t.

    For example, int_fast16_t is the fastest signed integer type that has at
    least 16 bits. uint_least32_t is the smallest unsigned integer type
    that has at least 32 bits. int64_t is a signed integer type with
    exactly 64 bits.
    The "least" and "fast" types are required to be supported for N=8, 16,
    32, and 64. The exact-sized types are optional. The relevant names are
    reserved for all values of N - you're not likely to need int_least39_t,
    but if you find an implementation that uses that name, they're required
    to use it to describe the smallest type with at least 39 bits.
    You can determine which of the optional types is supported by using, for
    example:

    #ifdef INT32_MAX

    Would any of these types serve your purpose? Note that, in particular,
    int_fast16_t could be a 64 bit type on a 64 bit machine.
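A minimal sketch of how the three families might be combined, assuming a C99 (or later) compiler. The names `table_entry`, `loop_counter`, `exact32`, `HAS_EXACT_32`, `BITS_OF` and `sum_low16` are made up for illustration and do not come from the thread.

```c
#include <limits.h>
#include <stdint.h>

/* The "least" and "fast" families are always available in C99. */
typedef uint_least32_t table_entry;   /* smallest type with >= 32 bits */
typedef int_fast16_t   loop_counter;  /* fastest type with >= 16 bits  */

/* Exact-width types are optional, so test for the limit macro first. */
#ifdef INT32_MAX
typedef int32_t exact32;
#define HAS_EXACT_32 1
#else
typedef int_least32_t exact32;        /* fallback: may be wider than 32 bits */
#define HAS_EXACT_32 0
#endif

/* Size of a type in bits, padding bits included. */
#define BITS_OF(type) (sizeof(type) * CHAR_BIT)

/* Sum the low 16 bits of each entry, counting with the "fast" type. */
static uint_least32_t sum_low16(const table_entry *entries, loop_counter count)
{
    uint_least32_t total = 0;
    loop_counter i;

    for (i = 0; i < count; i++)
        total += entries[i] & 0xffffu;
    return total;
}
```

Note that `BITS_OF(loop_counter)` may well report 64 on a 64-bit target, which is the caveat about int_fast16_t mentioned above.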
    James Kuyper, Aug 15, 2013

  3. James Harris

    James Harris Guest

    It's no secret. I'll try to explain the main points briefly. The code is for
    an operating system that will run on different machines - x86 and otherwise.
    I am in the process of moving development from assembly to C and wanting to
    take advantage of the resulting ability to compile the code for different
    target machines. That's the biggest thing C gives me over assembly, i.e. the
    ability to write source code once and have it compile to different forms of
    machine code.

    For example, I should be able to write a device driver or a memory manager
    in C and have it run on an x86 whether that machine is running in 16-bit
    mode, 32-bit mode or 64-bit mode. Therefore I have specific requirements
    over how wide certain integers are, as follows.

    Some integer sizes are mandated by the data they represent - for example,
    data read from a 16-bit port might have to be 16-bit unsigned regardless of
    the CPU's mode (16-, 32- or 64-bit) - and I have specific types for those as
    not all compilers supported stdints. That stuff is done. But there are other
    integers that can or should be the appropriate size for the target mode. For
    example, some processor or OS tables hold 32-bit entries in 32-bit mode and
    64-bit entries in 64-bit mode but other tables hold 32-bit entries in either
    mode. Both may use 16-bit entries in 16-bit mode. So, as I say, I have some
    very specific requirements for integer sizes.

    Additionally (and sorry for the detail but you did ask) in 64-bit mode there
    are some cases where 64-bit integers are required and others where 64-bit
    integers could be used but 32-bit integers would be better. Hence I was
    asking about a suitable name for integers that would be 16, 32 and 32 wide
    and another name for integers that would be 16, 32 and 64 wide.

    Does that explain enough to make sense of the original query?

    Thanks for the suggestions. The goal is different but I could possibly use a
    similar principle. For the case in point the number of bits should not be
    specified but some suffix on the word "int" might be appropriate. Possibly
    something like these for unsigned integers in the three modes

    uint_small: which becomes 16, 32, or 32
    uint_large: which becomes 16, 32, or 64

    That conveys the idea but at the moment those suffixes do not seem ideal. As
    the integers would only differ in 64-bit mode _small and _large seem a
    little incongruous in the other modes. (Remember that the source would use
    those names regardless of the mode the code will run in.) I'll give it some
    more thought but this seems like a good road to go down.

    James Harris, Aug 16, 2013
  4. James Harris

    BartC Guest

    C treats upper- and lower-case variations of a name as distinct. So you
    could use SINT (or INT or Int) and UINT.

    I have also used 'word' to mean unsigned, leaving 'int' to mean signed (I
    think 'int' is always signed unless you add 'unsigned').

    With the type that's capped at 32-bits, what is the purpose of that? Perhaps
    they could be given a more relevant name than just an 'int'-based one.

    Also, is the 'uint' of that compiler a macro? I thought you could override
    macro names. Actually I think you can override built-in names too for that
    matter:

    #define int int32_t

    But use with some care...
    BartC, Aug 16, 2013
  5. James Harris

    James Harris Guest

    Yes, sint is a bit ugly. At least it's less familiar. It has three
    advantages, though. First, as long as use of int is banned the use of sint
    or uint forces me to think about what type of integer is needed at each
    point. Unsigned ints - especially small ones - are surprisingly common in the
    stuff I am working with. Second, the names are both the same size so
    declarations naturally line up. And third, it makes source easier to search
    for the specific types. Otherwise, uints also show up when searching for int.
    Please take a look at my reply to James Kuyper.
    Maybe. Christian made the same point. I already do that in one or two cases
    and will probably do it in some others but if I did it in all cases the
    number of such names could get unfeasibly large. That would make the code
    confusing, especially in places where such integers get combined with each
    other - which does happen. There are simply some cases where the generic "an
    integer" is what we want to say.
    Surprisingly it isn't in either of the 16-bit C compilers which I normally
    expect to give me trouble but crops up in gcc. The file in question is


    The relevant section is

    #ifdef __USE_MISC
    /* Old compatibility names for C types. */
    typedef unsigned long int ulong;
    typedef unsigned short int ushort;
    typedef unsigned int uint;

    James Harris, Aug 16, 2013
  6. James Harris

    James Harris Guest

    These are entries, not indices.

    And the point is the other way round from your suggestion. If I followed the
    principle of what you say and used different names such as thisTableEntry
    and thatTableEntry I would have to use different source code. That's not
    good. Instead, where practical I would rather have one type name such as
    TableEntry and make it suitable for the environment which is being built

    James Harris, Aug 16, 2013
  7. size_t and ssize_t may be acceptable here.
    Maximum object size almost always equals 2^width of address bus = processing
    register width. Not always of course. It depends if it has to work absolutely
    everywhere. But there's no totally portable solution, because C can't guarantee
    that "a 64 bit machine" actually means anything on some architecture someone
    could invent.
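A rough sketch of that suggestion, assuming the object-size types track the machine mode on the targets of interest. `ssize_t` is a POSIX name rather than ISO C, so the sketch uses `ptrdiff_t` as the standard signed stand-in; it is commonly, though not necessarily, the same width as `size_t`. The `_guess` names and the helper function are hypothetical.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical names: a "large" pair that follows the object-size types. */
typedef size_t    ui_large_guess;  /* unsigned, from <stddef.h> (ISO C)        */
typedef ptrdiff_t si_large_guess;  /* signed, ISO C stand-in for POSIX ssize_t */

/* Width in bits of the unsigned candidate on the current target. */
static unsigned ui_large_guess_bits(void)
{
    return (unsigned)(sizeof(ui_large_guess) * CHAR_BIT);
}
```

Printing `ui_large_guess_bits()` under each compiler and memory model is a quick way to see whether the assumption holds before relying on it.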
    Malcolm McLean, Aug 16, 2013
  8. James Harris

    James Harris Guest

    Yes, I could use int for the shorter (16 32 32) variant and ssize_t for the
    longer (16 32 64) at least with the compilers I am using at the moment.
    However, I have an almost instinctive distrust that such names can be relied
    on to do the expected thing on all compilers. I've had a number of occasions
    where something that worked well on one compiler failed to do what I wanted
    on another. And such differences may not be picked up at compile time. That
    kind of bug can be hard to find.

    So I currently define all the types I need explicitly and avoid names that
    compilers might use. If you don't mind or are interested I'll post the full
    setup below. Maybe someone has suggestions to improve this. At least it will
    explain the bigger picture. Asbestos pants on!

    For x86 there are four directories as follows


    Each of the x86_NN directories has a file called os_bits_log2.h. Aside from
    #ifdefs etc and comments those three files have exactly one line. Their
    contents are, respectively, nothing more than

    #define OS_BITS_LOG2 4
    #define OS_BITS_LOG2 5
    #define OS_BITS_LOG2 6

    These are saying that for a 16-bit target the log2 of the number of bits is
    4 and 2 ** 4 is 16 and similar for the other widths.

    The common directory (src/comn) includes a file called os_bits.h which is as
    follows:

    #ifndef OS_BITS_H
    #define OS_BITS_H

    #include "os_bits_log2.h"

    #define OS_BITS (1 << OS_BITS_LOG2) /* 16, 32 or 64 */
    #define OS_BYTES (OS_BITS >> 3) /* 2, 4 or 8 */

    #endif /* OS_BITS_H */


    A build of a 16-bit system will set the include directories up in this
    order: first, src/x86_16, second src/comn. Similar for 32-bit and 64-bit
    builds. A makefile matches compiler, switches and include paths.
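A mismatch between include paths and compiler switches could be caught at compile time. The following is only a sketch under stated assumptions: `OS_BITS_LOG2` is defined inline as a stand-in for the per-target header (the value 5 is just an example), and `OS_STATIC_CHECK` is a hypothetical C90-compatible helper, not something from the posted code.

```c
#include <limits.h>

/* Stand-in for the per-target os_bits_log2.h; 5 is only an example value. */
#define OS_BITS_LOG2 5

#define OS_BITS  (1 << OS_BITS_LOG2)  /* 16, 32 or 64 */
#define OS_BYTES (OS_BITS >> 3)       /* 2, 4 or 8    */

/*
 * Hypothetical C90-compatible compile-time check: the array size becomes -1,
 * which the compiler must reject, when the condition is false.
 */
#define OS_STATIC_CHECK(name, cond) typedef char name[(cond) ? 1 : -1]

/* The preprocessor can reject unsupported widths before any code is seen. */
#if OS_BITS != 16 && OS_BITS != 32 && OS_BITS != 64
#error "OS_BITS must be 16, 32 or 64"
#endif

/* Fails to compile if bytes are not 8 bits wide on this target. */
OS_STATIC_CHECK(os_bytes_matches_bits, OS_BYTES * CHAR_BIT == OS_BITS);
```

The same macro could later compare `sizeof` of a chosen type against `OS_BYTES`, so that a wrong makefile combination stops the build instead of producing a subtly broken binary.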

    Still with me? :)

    The above sets up OS_BITS. That is used toward the end of the main types
    header which is called os_types.h. Its current incarnation is below. As I
    say, I would welcome an expert view of this. Even if there are no comments
    this does serve to illustrate the approach I am currently taking. I may
    change it. This is fairly new stuff to me but so far it seems to work well.


    #ifndef OS_TYPES_H
    #define OS_TYPES_H

    #include "os_bits.h"
    #include <limits.h>

    /* We need to define
     *
     *   si8 and ui8
     *   si16 and ui16
     *   si32 and ui32
     *   si64 and ui64
     *   si_small and ui_small (32-bit on 64-bit machines)
     *   si_large and ui_large (64-bit on 64-bit machines)
     */

    /*
     * Define 8-bit integers
     */

    #if SCHAR_MAX == 0x7f
    typedef signed char si8;
    typedef unsigned char ui8;
    #define si8_FMT_DEC "i"
    #define ui8_FMT_DEC "u"
    #else
    #error "No type candidate for si8 and ui8"
    #endif

    /*
     * 16-bit integers
     */

    #if INT_MAX == 0x7fff
    typedef signed int si16;
    typedef unsigned int ui16;
    #define si16_FMT_DEC "i"
    #define ui16_FMT_DEC "u"
    #elif SHRT_MAX == 0x7fff
    typedef signed short si16;
    typedef unsigned short ui16;
    #define si16_FMT_DEC "i"
    #define ui16_FMT_DEC "u"
    #else
    #error "No type candidate for si16 and ui16"
    #endif

    /*
     * 32-bit integers
     */

    /* Using small shifts to stop Open Watcom compiler warning */
    #if (INT_MAX >> 8 >> 8) == 0x7fff
    typedef signed int si32;
    typedef unsigned int ui32;
    #define si32_FMT_DEC "i"
    #define ui32_FMT_DEC "u"
    #elif (LONG_MAX >> 16) == 0x7fff
    typedef signed long si32;
    typedef unsigned long ui32;
    #define si32_FMT_DEC "li"
    #define ui32_FMT_DEC "lu"
    #else
    #error "No type candidate for si32 and ui32"
    #endif

    /*
     * 64-bit integers
     */

    /* Using small shifts to stop Open Watcom compiler warning */
    #if (INT_MAX >> 8 >> 8 >> 8 >> 8 >> 8 >> 8) == 0x7fff
    typedef signed int si64;
    typedef unsigned int ui64;
    #define si64_FMT_DEC "i"
    #define ui64_FMT_DEC "u"
    #elif (((LONG_MAX >> 16) >> 16) >> 16) == 0x7fff
    typedef signed long si64;
    typedef unsigned long ui64;
    #define si64_FMT_DEC "li"
    #define ui64_FMT_DEC "lu"
    #elif (((LLONG_MAX >> 16) >> 16) >> 16) == 0x7fff
    typedef signed long long si64;
    typedef unsigned long long ui64;
    #define si64_FMT_DEC "lli"
    #define ui64_FMT_DEC "llu"
    #else
    typedef struct {ui32 low; si32 high;} si64; /* Limited use */
    typedef struct {ui32 low; ui32 high;} ui64; /* Limited use */
    /* #warning "Using structures for si64 and ui64" */
    #define si64_FMT_DEC "%(unprintable %x)"
    #define ui64_FMT_DEC "%(unprintable %x)"
    #endif

    /*
     * Define the potential number of digits when printed
     */

    #define si8_DIG_DEC 3
    #define ui8_DIG_DEC 3

    #define si16_DIG_DEC 5
    #define ui16_DIG_DEC 5

    #define si32_DIG_DEC 10
    #define ui32_DIG_DEC 10

    #define si64_DIG_DEC 19
    #define ui64_DIG_DEC 20

    /*
     * Define small and large integer types. These are normally the same but
     * on at least x86_64 the small integers are half the normal width. Use
     * large ints by default. Use small ones only where it is known that
     * large ints are unnecessary.
     */

    #if OS_BITS == 16
    typedef si16 si_small;
    typedef ui16 ui_small;
    #define si_small_MAX 0x7fff
    #define ui_small_MAX 0xffffU
    #define si_small_DIG_DEC si16_DIG_DEC
    #define ui_small_DIG_DEC ui16_DIG_DEC
    #define si_small_FMT_DEC si16_FMT_DEC
    #define ui_small_FMT_DEC ui16_FMT_DEC

    typedef si16 si_large;
    typedef ui16 ui_large;
    #define si_large_MAX 0x7fff
    #define ui_large_MAX 0xffffU
    #define si_large_DIG_DEC si16_DIG_DEC
    #define ui_large_DIG_DEC ui16_DIG_DEC
    #define si_large_FMT_DEC si16_FMT_DEC
    #define ui_large_FMT_DEC ui16_FMT_DEC

    #elif OS_BITS == 32
    typedef si32 si_small;
    typedef ui32 ui_small;
    #define si_small_MAX (0x7fffffff)
    #define ui_small_MAX (0xffffffffU)
    #define si_small_DIG_DEC si32_DIG_DEC
    #define ui_small_DIG_DEC ui32_DIG_DEC
    #define si_small_FMT_DEC si32_FMT_DEC
    #define ui_small_FMT_DEC ui32_FMT_DEC

    typedef si32 si_large;
    typedef ui32 ui_large;
    #define si_large_MAX (0x7fffffff)
    #define ui_large_MAX (0xffffffffU)
    #define si_large_DIG_DEC si32_DIG_DEC
    #define ui_large_DIG_DEC ui32_DIG_DEC
    #define si_large_FMT_DEC si32_FMT_DEC
    #define ui_large_FMT_DEC ui32_FMT_DEC

    #elif OS_BITS == 64
    typedef si32 si_small;
    typedef ui32 ui_small;
    #define si_small_MAX (0x7fffffff)
    #define ui_small_MAX (0xffffffffU)
    #define si_small_DIG_DEC si32_DIG_DEC
    #define ui_small_DIG_DEC ui32_DIG_DEC
    #define si_small_FMT_DEC si32_FMT_DEC
    #define ui_small_FMT_DEC ui32_FMT_DEC

    typedef si64 si_large;
    typedef ui64 ui_large;
    #define si_large_MAX (0x7fffffffffffffff)
    #define ui_large_MAX (0xffffffffffffffffU)
    #define si_large_DIG_DEC si64_DIG_DEC
    #define ui_large_DIG_DEC ui64_DIG_DEC
    #define si_large_FMT_DEC si64_FMT_DEC
    #define ui_large_FMT_DEC ui64_FMT_DEC

    #else
    #error "OS bit width not correctly specified"
    #endif

    /* Time stamp counter type */

    typedef ui64 tsc64;
    typedef ui32 tsc32;

    #endif /* OS_TYPES_H */
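One way the `_FMT_DEC` and `_DIG_DEC` macros might be used together is sketched below. It carries reduced stand-ins for the header's definitions and assumes `int` is 32 bits wide; `si32_BUF_LEN` and `format_si32` are hypothetical names added for illustration.

```c
#include <stdio.h>

/* Reduced stand-ins for the os_types.h definitions, assuming int is 32-bit. */
typedef signed int si32;
#define si32_FMT_DEC "i"
#define si32_DIG_DEC 10

/* Room for the digits, a leading minus sign and the terminating NUL. */
#define si32_BUF_LEN (si32_DIG_DEC + 2)

/*
 * Format value as decimal into buf, which must hold si32_BUF_LEN chars.
 * The conversion specifier is joined onto "%" by string literal
 * concatenation. Returns the number of characters written, excluding
 * the NUL.
 */
static int format_si32(char *buf, si32 value)
{
    return sprintf(buf, "%" si32_FMT_DEC, value);
}
```

Because the format macros hold only the conversion specifier, the caller supplies the `%`, which leaves room for flags or a field width such as `"%8" si32_FMT_DEC`.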

    James Harris, Aug 16, 2013
  9. James Harris

    James Harris Guest

    You know, starting to explain a big project in some detail inevitably leads
    to further questions about details not yet explained. In this case the
    inclusion of 16-bit mode is very deliberate largely because it is easily
    available and significantly different from the other modes. It is its
    difference that makes it an attractive target because it helps to improve
    the machine-independence of the design. The three x86 modes are conveniently
    available as they are present on any PC and in emulators and are thus easy
    to test. I am also looking at an Arm variant for the same reason - to ensure
    the design maintains a distance from a particular machine. It's not a
    perfect approach because even x86 and Arm share common characteristics. But
    I don't intend to support real oddities such as machines with 40-bit words.

    I will do this in some cases but not all. For example, in a list manager I
    have it would be madness to duplicate the code for different types of
    integer even though they are really all the same.

    Further, there are many cases where code wants a plain integer which does
    not represent a specific OS component. Consider something as universal as

    for (i = start; i < past; i++)

    In this, on 64-bit mode, sometimes i could be 32-bit. At other times it
    would need to be 64-bit. It all depends on how the index is to be used. So
    it does make sense to permit both types and to support them in a way that
    would be transparent on machines with smaller registers or address busses.
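As a sketch of the two cases, assuming a 64-bit build where the small type is 32 bits and the large type is 64 bits. The typedefs use `<stdint.h>` only for brevity, and both function names are hypothetical.

```c
#include <stdint.h>

/* Stand-ins for a 64-bit build; real code would take these from os_types.h. */
typedef uint32_t ui_small;
typedef uint64_t ui_large;

/* The byte count is known to fit in 32 bits, so a small index is enough. */
static ui_small sum_bytes(const unsigned char *bytes, ui_small count)
{
    ui_small total = 0;
    ui_small i;

    for (i = 0; i < count; i++)
        total += bytes[i];
    return total;
}

/*
 * Addresses may lie above 4 GiB, so the index must be large.
 * page_size must be nonzero.
 */
static ui_large count_pages(ui_large start, ui_large past, ui_large page_size)
{
    ui_large addr;
    ui_large pages = 0;

    for (addr = start; addr < past; addr += page_size)
        pages++;
    return pages;
}
```

On a 16-bit or 32-bit build both typedefs would name the same type, so the source stays unchanged and only the 64-bit build sees a difference.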

    James Harris, Aug 16, 2013
  10. Sure, but I'd say `INT` is a poor name for a type. If it's always the
    same as `int`, then it's useless; just use `int`. If it's *not* always
    the same as `int`, then the name is misleading.
    Yes, `int` is synonymous with `signed int` (except that for a bit field,
    it's implementation-defined whether `int` means `signed int` or
    `unsigned int`).
    It's more likely to be a typedef, and you can't override those. You can
    avoid them by not `#include`ing the header that defines them, but it
    might be included indirectly or you might need other declarations in the
    same header.

    No conforming compiler may define the name `uint` by default, since it's
    in the user namespace -- but not all compilers are conforming by default.
    I believe macros that redefine keywords cause undefined behavior.
    They're certainly dangerous. For example, if `int32_t` is a typedef,
    then the above macro definition makes `unsigned int` a syntax error.

    Don't do that.
    Keith Thompson, Aug 16, 2013
  11. Microsoft's C compiler does not support C99. As of VS2010, it does
    support <stdint.h> as an extension, but I wouldn't count on it in
    general for non-C99 compilers.

    If you need to support systems without <stdint.h>, it isn't that
    complicated. At least for the types defined there (which doesn't
    include all the types you want), my advice is this: Don't reinvent your
    own wheel. Reinvent <stdint.h>.

    For example, you can write your own header "my_stdint.h", something
    like this:

    #ifndef H_MY_STDINT
    #define H_MY_STDINT
    #if __STDC_VERSION__ >= 199901L
    #include <stdint.h>
    #else
    typedef uint8_t ...
    typedef int8_t ...
    /* etc. */
    #endif
    #endif

    Then your code can use `#include "my_stdint.h"` rather than `#include
    <stdint.h>`.

    Tweak the `#if` if you want to use <stdint.h> on non-C99 compilers that
    support it as an extension.

    Writing the "/* etc */" is left as an exercise, but one that's already
    been done numerous times, including this: <>.
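For readers who want a starting point, here is a partial sketch of such a wrapper header. It is an assumption-laden illustration, not the header referred to above: it covers only the 8-, 16- and 32-bit exact-width types, picks them by comparing `<limits.h>` maxima, and does not verify the other properties the exact-width types are supposed to have (two's complement, no padding bits).

```c
#ifndef H_MY_STDINT
#define H_MY_STDINT

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#include <stdint.h>
#else
#include <limits.h>

#if UCHAR_MAX == 0xff
typedef signed char   int8_t;
typedef unsigned char uint8_t;
#else
#error "No 8-bit type available"
#endif

#if USHRT_MAX == 0xffff
typedef short          int16_t;
typedef unsigned short uint16_t;
#elif UINT_MAX == 0xffff
typedef int          int16_t;
typedef unsigned int uint16_t;
#else
#error "No 16-bit type available"
#endif

#if UINT_MAX == 0xffffffffUL
typedef int          int32_t;
typedef unsigned int uint32_t;
#elif ULONG_MAX == 0xffffffffUL
typedef long          int32_t;
typedef unsigned long uint32_t;
#else
#error "No 32-bit type available"
#endif

#endif /* C99 check */
#endif /* H_MY_STDINT */
```

The `defined(__STDC_VERSION__)` test is an addition that avoids relying on an undefined macro evaluating to 0 in `#if`, which some older compilers warn about.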

    Keith Thompson, Aug 16, 2013
  12. James Harris

    James Kuyper Guest

    That is often the case, but it is possible to redefine a keyword with
    defined behavior.

    "The above tokens (case sensitive) are reserved (in translation phases 7
    and 8) for use as keywords, and shall not be used otherwise." 6.4.1p2.
    Note that the reservation does not apply until phase 7.

    "The program shall not have any macros with names lexically identical to
    keywords currently defined prior to the inclusion of the header or when
    any macro defined in the header is expanded." (7.1.2p4)

    If you #define a keyword, and then #undef it if necessary before the
    next #include of a standard header, or the next use of something that
    might be a macro defined in a standard header, the behavior is
    well-defined (assuming there's no other problem). I wouldn't recommend
    it though.

    Given that any standard library function might be a function-like macro,
    it's hard to imagine how BartC's suggested #define could be useful if
    used only in accord with those restrictions.
    James Kuyper, Aug 16, 2013
  13. You're right. C11 7.1.2p4.
    Agreed. It's difficult to avoid expanding a macro defined in a standard
    header; most standard library functions can be defined as macros.
    Keith Thompson, Aug 16, 2013
  14. James Harris

    Ian Pilcher Guest

    How about int_word_t for the type that always matches the word size
    of the processor?

    The 16/32/32 case is tougher, but maybe something like int_word32_t
    to indicate that it matches the word size up to 32 bits? This has the
    advantage of giving you a naming convention for your 16/32/64/64 type
    when you port your OS to a 128-bit architecture. :)
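A possible shape for those two names is sketched below. It assumes a C99 `<stdint.h>` that provides `uintptr_t`, and it assumes that pointer width equals the processor word size, which does not hold everywhere (16-bit x86 with far pointers is one counterexample). `WORD_BITS` and the unsigned variants are hypothetical additions.

```c
#include <stdint.h>

/* Pick the word-sized pair from the pointer width (an assumption). */
#if UINTPTR_MAX == 0xffffu
typedef int_least16_t  int_word_t;
typedef uint_least16_t uint_word_t;
#define WORD_BITS 16
#elif UINTPTR_MAX == 0xffffffffu
typedef int_least32_t  int_word_t;
typedef uint_least32_t uint_word_t;
#define WORD_BITS 32
#elif UINTPTR_MAX == 0xffffffffffffffffu
typedef int_least64_t  int_word_t;
typedef uint_least64_t uint_word_t;
#define WORD_BITS 64
#else
#error "Unsupported word size"
#endif

/* The capped variant follows the word size up to 32 bits, then stops. */
#if WORD_BITS <= 32
typedef int_word_t  int_word32_t;
typedef uint_word_t uint_word32_t;
#else
typedef int_least32_t  int_word32_t;
typedef uint_least32_t uint_word32_t;
#endif
```

A project that cannot rely on `<stdint.h>` would need a different selector, such as a per-target macro set by the build.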
    Ian Pilcher, Aug 16, 2013
  15. James Harris

    James Harris Guest

    LOL - yes, it pays to plan ahead!

    How would uint_word and uint_short look to you? Or for closer matching to
    other names I already have defined, ui_word and ui_short?

    By the way, what's the standard deal with the _t? I see it sometimes used on
    type names and sometimes not.

    James Harris, Aug 16, 2013

  16. As far as the C standard is concerned, there's nothing special about a
    _t suffix. It means "type", but there's no particular consistency about
    its use.

    I think POSIX reserves identifiers ending with _t.
    Keith Thompson, Aug 17, 2013
  17. James Harris

    James Harris Guest

    Ah. Does that mean we should not use _t suffix on our type names?

    AIUI names such as stripe, town, memo and isomer are reserved from being
    externs. If so there's too much reserving of names going on! :-(
    James Harris, Aug 17, 2013
  18. James Harris

    James Kuyper Guest

    The type names ending with _t have file scope, and belong to the name
    space for ordinary identifiers. The corresponding POSIX reservation
    applies only at file scope, so you can use _t as a type name if it has
    any other scope. However, the POSIX reservation includes all identifiers
    in the ordinary name space - that includes variable names, function
    names, and enumeration constants. You can freely use names ending in _t
    as label names, and as tags or member names for structures, unions, or
    enumerations.
    James Kuyper, Aug 17, 2013

  19. Yes, I'm a little confused actually. <stdint.h> is the obvious choice,
    and I'm wondering why OP isn't just using it.
    Edward A. Falk, Aug 19, 2013
  20. James Harris

    James Kuyper Guest

    His target systems include ones where C99 is not supported. Of course,
    many of those systems support <stdint.h>, or some variant thereof, as a
    C90 extension. Even on the ones that don't, it would be feasible to
    provide your own, and there exist well-known versions in the public
    domain. However, these points have been made to him, and don't seem to
    have affected his thinking.

    The fundamental problem, I think, is that he doesn't trust the C
    compiler or <stdint.h> to have made what he considers to be the correct
    choice. Using C necessarily involves giving up a certain amount of
    control over the generated code, compared to assembly language. That's
    part of what makes it a higher-level language. It's very low-level for a
    high-level language, but it is still, definitely, a high-level language,
    at least by comparison with assembler.

    I've seen such attitudes before, both in people moving to C from
    assembler, and in C programmers who are temperamentally better suited to
    being assembly language programmers than C programmers. The first group
    needs to learn to let go and let C do its thing. The second group needs
    to switch to some other language better suited to their temperaments.
    James Kuyper, Aug 19, 2013