Standard integer types vs <stdint.h> types

Discussion in 'C Programming' started by euler70, Jan 17, 2008.

  1. euler70

    euler70 Guest

    char and unsigned char have specific purposes: char is useful for
    representing characters of the basic execution character set and
    unsigned char is useful for representing the values of individual
    bytes. The remainder of the standard integer types are general
    purpose. Their only requirement is to satisfy a minimum range of
    values, and also int "has the natural size suggested by the
    architecture of the execution environment". What are the reasons for
    using these types instead of the int_fastN_t types of <stdint.h>?

    If I want just a signed integer with at least 16 width, then why
    choose int instead of int_fast16_t? int_fast16_t is the "fastest"
    signed integer type that has at least 16 width, while int is simply a
    signed integer type that has at least 16 width. It seems that choosing
    int_fast16_t is at least as good as choosing int. This argument can be
    made for N=8,16,32,64, and for the corresponding unsigned types.
    <stdint.h> also offers int_leastN_t types, which are useful when
    keeping a small storage size is the greatest concern.

    The only benefit of using the standard integer types I can see is that
    their ranges may expand as the C standard progresses, so code that
    uses them might stay useful over time. For example fseek of <stdio.h>
    uses a long for its offset parameter, so if the range of long grows
    then fseek will automatically offer a wider range of offsets.

    It's also interesting to note (or maybe not) that in the Windows
    world, the idea of long being general purpose has somewhat been
    destroyed and long has become a type that must have exactly 32 bits.
     
    euler70, Jan 17, 2008
    #1


  2. Yes, we're rapidly going down the path of destroying the C basic integer
    types.

    Once you start inventing types like int_fast16_t people will use them, and
    the language becomes more and more difficult to read.

    My own view is that you should be able to stick to char for characters and
    int for integers, in almost every situation. However this is only tenable if
    you can use int as both an arbitrary array index and a fast type.
     
    Malcolm McLean, Jan 18, 2008
    #2

  3. Flash Gordon

    Flash Gordon Guest

    Malcolm McLean wrote, On 18/01/08 09:09:
    Yes. Also pointer to char is the type normally taken by "string"
    functions including (but not limited to) those in the standard library.


    Well, stdint.h was only introduced in the 1999 standard, a standard that
    is not fully implemented by many compilers and not at all by at least
    one major player.

    Yes, this is true, and an excellent reason for using them. It does limit
    portability to implementations that do not support this part of C99,
    however it is always possible to write your own version of stdint.h for
    those systems.


    Yes. Also due to the possibly smaller size they can be *faster* for some
    purposes. For example if the smaller size means everything is kept in
    cache instead of having to be fetched.

    Yes, that is a potential advantage.

    There is an argument that fseek should have used another type...

    Of course, that is why there is fsetpos.

    Yes, and it is not entirely the fault of MS. It is the programmers who
    assumed that it would always be exactly 32 bits and/or assumed it would
    always be the same size as int. Not breaking such 3rd party code, as I
    understand it, was the reason for MS keeping long as 32 bits on Win64.
    Note that this is an opinion that seems to be almost unique to Malcolm.
    A lot of what Malcolm disagrees with was part of the original standard
    published in 1989 so he has a very strange idea of "rapidly".

    Note that some others think the fixed-width types are a mistake,
    although some of us disagree, there being arguments on both sides. More
    people (I think) would have liked names like int32_t to denote the fast
    types, with int_exact or int_fixed for the optional fixed-size types.
    In *your* opinion.
    Which is not its purpose and does not agree with the way modern
    processors work since often a 32 bit integer will be faster than a 64
    bit integer (even on 64 bit hardware) yet you need a 64 bit integer for
    an arbitrary index. Similar things were true for earlier HW in terms of
    16/32 bits. So Malcolm's view is that C has to meet two contradictory
    requirements at the same time.
     
    Flash Gordon, Jan 18, 2008
    #3
  4. Bart C

    Bart C Guest

    My own view is the opposite. How can one program without knowing the bitsize
    of one's datatypes? But I've probably spent too many years low-level
    programming.

    I gave one example in c.l.c recently of a long int datatype that was either
    32 or 64-bits depending on compiler -- on the same processor. So if you port
    a supposedly portable program from one compiler to another, it could run
    much slower (or faster).

    Anyway for interfacing with other (binary) software, the exact sizes of
    datatypes becomes important. Or (in my case) trying to interface from
    another language to C.
     
    Bart C, Jan 18, 2008
    #4
  5. Flash Gordon said:
    Here at least, I can agree with you.
    I would rather not have seen these types at all, since they seem to me to
    enshrine poor practice. But I recognise that there are some spheres of
    programming in which they could prove useful.
    I think he has a point. At the very least, it becomes *uglier* to read. C
    as it stands, if well-written, is at least a relatively elegant language,
    not just technically and syntactically but also visually. All these
    stretched-out underscore-infested type names will be a real nuisance when
    scanning quickly through unfamiliar code.

    <snip>
     
    Richard Heathfield, Jan 18, 2008
    #5
  6. Bart C said:
    We know the minimum value range of our data types - why would we need to
    know more than that?
    Okay, that's one reason. Any more? Huh? Huh? :)
     
    Richard Heathfield, Jan 18, 2008
    #6
  7. If you've got to interface with assembly, yes, there is no real option other
    than to check through the calling code making sure that the bits are as you
    want them. I don't think there's an answer; sometimes you want integers the
    size of registers, sometimes fixed sizes, and the assembly code is as likely
    to know as the calling C code.

    However normally you don't. The integer represents something, which is
    normally an index into an array (even chars are really usually indices into
    glyph tables). So what you need is an integer that can index the biggest
    array possible, and is also fast. Which on some architectures is a bit of a
    contradiction, because the vast majority of your arrays will never grow to
    more than a hundred items or so, whilst there is a flat memory space that is
    many gigabytes in size.
     
    Malcolm McLean, Jan 18, 2008
    #7
  8. Flash Gordon

    Flash Gordon Guest

    Bart C wrote, On 18/01/08 11:03:

    A lot of the time, very easily.
    It could run much slower or faster due to the quality of implementation
    of the standard library as well.
    Often you *still* don't need to know. I can interface C and C++ without
    needing to know, all I need to know is that the compilers that I am
    using for the two languages use the same sizes, not what they are. If it
    is simply a binary library designed for C on my platform then again all
    I need to know is the types (which should be specified in a header file)
    and not their sizes.

    There *are* times when you need to know, and there are times when you
    need to know a type has at least a specific range, but a lot of the time
    you do not care if it is larger.
     
    Flash Gordon, Jan 18, 2008
    #8
  9. pete

    pete Guest

    Two reasons:
    1 int is everywhere.
    int_fast16_t isn't available on all C implementations.
    2 int exactly fits the description of what you said you wanted
    "want just a signed integer with at least 16 width"
     
    pete, Jan 18, 2008
    #9
  10. Flash Gordon

    Flash Gordon Guest

    Richard Heathfield wrote, On 18/01/08 11:29:
    That, in my opinion, is an argument over the chosen names rather than
    the addition of the types. Personally I don't find underscores in names
    a problem for scanning, especially once I have learnt the patterns.
     
    Flash Gordon, Jan 18, 2008
    #10
  11. Bart C

    Bart C Guest

    For ... performance?

    If I know I need, say, 32-bits, but not 64-bits which would be an overkill,
    what do I use? int could be only 16. long int is at least 32 but could be
    64.

    I would have to choose long int but at the risk of being inefficient (I
    might need a few hundred million of them).

    If I distribute an application as source code I don't have control of
    the final size unless I specify the exact compiler and version.

    So it's necessary to use alternatives, like the stuff in stdint.h.
     
    Bart C, Jan 18, 2008
    #11
  12. Bart C said:
    <shrug> For some, maybe. I generally find that simply selecting good
    algorithms is sufficient to give me "good-enough" performance. Yeah,
    absolutely, there are some situations where you need to hack every last
    spare clock out, but they are not as common as people like to imagine. I'd
    rather write clear code than fast code (which doesn't mean I don't like my
    code to be fast). And in any case, when you start selecting types based on
    their performance, it won't be long before you discover that what's faster
    on one machine could well turn out to be slower on another.

    If you'd said that *you* find it necessary, okay, I'd have to accept that,
    obviously. I don't think I've ever found it necessary, though.
     
    Richard Heathfield, Jan 18, 2008
    #12
  13. euler70

    euler70 Guest

    [snip]

    The problem with signed char and [unsigned] (short|int|long|long long)
    is that they are too general purpose. They are at an awkward
    not-very-useful spot between the super-high-level "give me an integer
    object in which I can store any integer value" and the super-low-level
    "give me this many bits that I can play with". As a result, what seems
    to have happened in practice is that different camps have created
    their own de-facto purposes for some of these types. For example in
    the world of Windows, long is essentially a type that has exactly 32
    bits. Elsewhere, long may be the de-facto 64-bit type.

    For portable code, this can become detrimental to efficiency. If I
    want a fast type with at least 32-bit range, C90 says I should choose
    long. This might end up being a good choice for one compiler, but on
    another compiler, where a different de-facto purpose makes long
    significantly less efficient than another available at-least-32-bit
    type, half of the intent behind my choice of long has been ruined.

    If you argue against the preceding paragraph by saying "you should not
    be so concerned about efficiency", then I think your reasoning is a
    very short step away from concluding that we can discard our worries
    about type ranges and efficiency and simply use only intmax_t and
    uintmax_t everywhere. Surely this is not the level of abstraction that
    C is intended for.

    This is the reasoning that has led me to conclude that the
    int_(fast|least)N types are more useful than signed char and
    [unsigned] (short|int|long|long long). They allow me to state my
    entire intent instead of stating only half of it and hoping the other
    half works out. Having types that allow me to say "I want the
    (fastest|smallest) type that gives me at least N bits" is more useful
    than having types that only allow me to say "I want a type that gives
    me at least N bits".
     
    euler70, Jan 18, 2008
    #13
  14. James Kuyper

    James Kuyper Guest



    Keep in mind that the C type system grew over decades. The committee
    considers backwards compatibility to be very important (IMO, correctly),
    but it has also attempted to alleviate some of the problems associated
    with the original language design. As a result of those conflicting
    goals, C has a lot of internal inconsistencies.

    If it had been designed from scratch with something similar to the
    current result in mind, we would probably have only the size-named types
    from stdint.h, they wouldn't require a special header, and they'd
    probably have simpler, shorter names. Aside from the fact that their
    names are easier to type, char, short, int, and long don't have any
    inherent advantage over the size-named types.

    If and when C99 gets fully adopted by most mainstream compilers and
    programmers, the main remaining reason for using char, short, int, or
    long, will be that your code must be compatible with an interface
    defined in terms of those types. That applies to almost the entire C
    standard library, as well as large parts of most of the other libraries
    I've used.
     
    James Kuyper, Jan 18, 2008
    #14

  15. I absolutely agree.
    I think we are creating a mess with all these integer types and conventions
    on how they should be used.

    However generally I want an integer which can index or count an array, and
    is fast, and is signed, because intermediate calculations might go below
    zero.
    This is usually semi-achievable. There will be a fast type the same
    width as the address bus; the one bit of range lost to the sign we can
    ignore, as we are unlikely to want a char array taking up half of
    memory (we can always resort to "unsigned" for that special
    situation). But it might not be the fastest type, and most arrays
    will in fact be a lot smaller than the largest possible array.
     
    Malcolm McLean, Jan 18, 2008
    #15
  16. Flash Gordon said:
    Yes, it is. My argument against the new types *was* that they are
    unnecessary, but I accept that what I really mean is that *I* don't see a
    need for them in the kind of code I tend to write. If they will bring real
    benefits to other C programmers, well, they're a wart I can live with,
    since at least I won't have to come across it all that often, and then
    only in other people's code, not my own.

    But they could have found better names, surely? Abigail, for instance. Or
    Rhododendron.

    Yeah, all right, maybe not those precise names... :)
    Is ugliness a problem? I guess ugliness is in the eye of the beholder.
     
    Richard Heathfield, Jan 18, 2008
    #16
  17. said:

    It depends. :) We cannot and should not *ignore* efficiency.
    Nevertheless, there is more to life than speed. Correctness, clarity,
    generality and portability are all important too. But, as I said before,
    there *are* occasions when you need to push the hardware as fast as it
    will go. I do accept that.

    <snip>
     
    Richard Heathfield, Jan 18, 2008
    #17
  18. Flash Gordon

    Flash Gordon Guest

    Malcolm McLean wrote, On 18/01/08 12:19:
    Please be aware that Malcolm seems to be the only person who thinks this.

    The flaws in Malcolm's arguments have been pointed out many times so you
    should be able to find them using Google.
     
    Flash Gordon, Jan 18, 2008
    #18
  19. Flash Gordon

    Flash Gordon Guest

    Richard Heathfield wrote, On 18/01/08 13:37:
    <snip discussion of types in stdint.h>

    I think we reached this point before.
    Of course not those names, they should be Brenda, Heather... ;-)
    It is indeed. I'm so used to underscores in names that I don't see them
    as such; I just see N words grouped together.
     
    Flash Gordon, Jan 18, 2008
    #19
  20. Flash Gordon

    Flash Gordon Guest

    Malcolm McLean wrote, On 18/01/08 13:25:


    Do you realise you are agreeing to having several integer types? I
    thought you wanted there to be only a single one-size-fits-hardly-anyone
    integer type!
    Urm, he was just saying to use the new types which you say make things work!
    Unsigned arithmetic handles that nicely, but to meet your desires I
    suggest you use ptrdiff_t and size_t as appropriate. These types have
    been there since at least the 1989 standard was implemented.
    See above, the types you want have been supported for a long time. If
    you don't like the spelling you can typedef them to something else.
     
    Flash Gordon, Jan 18, 2008
    #20
