Intrinsic Minimums

Discussion in 'C++' started by JKop, Jul 22, 2004.

  1. JKop

    JKop Guest

    I've been searching the Standard for info about the minimum
    "bits" of the intrinsic types, but haven't been able to
    find it. Could anyone please point me to it?

    -JKop
     
    JKop, Jul 22, 2004
    #1

  2. Andre Kostur

    Andre Kostur Guest

    JKop <> wrote in news:j%ULc.5298$:

    > I've been searching the Standard for info about the minimum
    > "bits" of the intrinsic types, but haven't been able to
    > find it. Could anyone please point me to it?


    It's not defined. Best you've got is that sizeof(char) == 1, and
    sizeof(short) <= sizeof(int) <= sizeof(long).

    However, that's in bytes, not bits. It is implementation-defined as to how
    many bits are in a byte. sizeof(int) is the "natural size suggested by the
    architecture of the execution environment". (Section 3.9)

    And I think CHAR_BIT specifies the number of bits in a char... but that
    appears to be defined in the C Standard (in <limits.h>)
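
    For instance, a quick way to check what a given implementation uses (just
    a sketch; the numbers it prints are implementation-specific):

    #include <climits>   // CHAR_BIT and the C limit macros
    #include <iostream>

    int main()
    {
        std::cout << "bits per char : " << CHAR_BIT      << '\n'
                  << "sizeof(short) : " << sizeof(short) << '\n'
                  << "sizeof(int)   : " << sizeof(int)   << '\n'
                  << "sizeof(long)  : " << sizeof(long)  << '\n';
    }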
     
    Andre Kostur, Jul 22, 2004
    #2

  3. On Thu, 22 Jul 2004 22:10:35 GMT, Andre Kostur <> wrote:

    > JKop <> wrote in news:j%ULc.5298$:
    >
    >> I've been searching the Standard for info about the minimum
    >> "bits" of the intrinsic types, but haven't been able to
    >> find it. Could anyone please point me to it?

    >
    > It's not defined. Best you've got is that sizeof(char) == 1, and
    > sizeof(short) <= sizeof(int) <= sizeof(long).
    >
    > However, that's in bytes, not bits. It is implementation-defined as to
    > how many bits are in a byte. sizeof(int) is the "natural size suggested
    > by the architecture of the execution environment". (Section 3.9)
    >
    > And I think CHAR_BIT specifies the number of bits in a char... but that
    > appears to be defined in the C Standard (in <limits.h>)


    C requires short >= 16 bits, int >= 16 bits, long >= 32 bits. These
    minimums are implied by the constraints given on INT_MIN, INT_MAX etc. in
    <limits.h>. Presumably C++ inherits this from C.
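
    For example (a sketch; the exact values are implementation-specific, but
    they can't be any narrower than the C minimums):

    #include <climits>   // SHRT_MIN/MAX, INT_MIN/MAX, LONG_MIN/MAX, ...
    #include <iostream>

    int main()
    {
        // Guaranteed at least: SHRT_MAX >= 32767, INT_MAX >= 32767,
        // LONG_MAX >= 2147483647.
        std::cout << SHRT_MIN << " .. " << SHRT_MAX << '\n'
                  << INT_MIN  << " .. " << INT_MAX  << '\n'
                  << LONG_MIN << " .. " << LONG_MAX << '\n';
    }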

    john
     
    John Harrison, Jul 22, 2004
    #3
  4. JKop

    JKop Guest

    John Harrison posted:


    > C requires short >= 16 bits, int >= 16 bits, long >= 32 bits. These
    > minimums are implied by the constraints given on INT_MIN, INT_MAX etc.
    > in <limits.h>. Presumably C++ inherits this from C.
    >
    > john



    I'm writing a prog that'll use Unicode. To represent a
    Unicode character, I need a data type that can be set to
    65,536 distinct possible values; which in today's world
    of computing equates to 16 bits. wchar_t is the natural
    choice, but is there any guarantee in the standard that
    it'll be 16 bits? If not, then is unsigned short the way to
    go?

    This might sound a bit odd, but... if an unsigned short
    must be at least 16 bits, then does that *necessarily* mean
    that it:

    A) Must be able to hold 65,536 distinct values.
    B) And be able to store integers in the range 0 -> 65,535 ?

    Furthermore, does a signed short int have to be able to
    hold a value between:

    A) -32,767 -> 32,768

    B) -32,768 -> 32,767

    I've also heard that some systems are stupid enough (oops!
    I mean poorly enough designed) to have two values for zero,
    resulting in:

    -32,767 -> 32,767


    For instance, I don't care if someone tells me it's 3
    bits, just so long as it can hold 65,536 distinct values!


    -JKop
     
    JKop, Jul 22, 2004
    #4
  5. JKop

    JKop Guest

    I've just realized something:

    char >= 8 bits
    short int >= 16 bits
    int >= 16 bits
    long int >= 32 bits

    And:

    short int >= int >= long int


    On WinXP, it's like so:

    char : 8 bits
    short : 16 bits
    int : 32 bits
    long : 32 bits


    Anyway,

    Since there's a minimum, why haven't they just been given definite values,
    like:

    char : 8 bits
    short : 16 bits
    int : 32 bits
    long : 64 bits

    or maybe even names like:

    int8
    int16
    int32
    int64

    And so then if you want a greater amount of distinct possible values,
    there'll be standard library classes. For instance, if you want a 128-Bit
    integer, then you're looking for a data type that can store 3e+38 approx.
    distinct values. Well... if a 64-Bit integer can store 1e+19 approx values,
    then put two together and voila, you've got a 128-Bit number:

    class int128
    {
    private:

    int64 a;
    int64 b;
    //and so on
    };

    Or while I'm thinking about that, why not be able to specify whatever size
    you want, as in:

    int8 number_of_sisters;

    int16 population_my_town;

    int32 population_of_asia;

    int64 population_of_earth;

    or maybe even:

    int40 population_of_earth;


    Some people may find that this is a bit ludicrous, but you can do it
    already yourself with classes: if you want a 16,384 bit number, then all you
    need to do is:

    class int16384
    {
    private:
    unsigned char data[2048];

    //and so on

    };

    Or maybe even be able to specify how many distinct possible combinations you
    need. So for unicode:

    unsigned int<65536> username[15];


    This all seems so simple in my head - why can't it just be so!


    -JKop
     
    JKop, Jul 23, 2004
    #5
  6. Andre Kostur

    Andre Kostur Guest

    JKop <> wrote in news:IWXLc.5317$:

    >
    > I've just realized something:
    >
    > char >= 8 bits
    > short int >= 16 bits
    > int >= 16 bits
    > long int >= 32 bits
    >
    > And:
    >
    > short int >= int >= long int
    >
    >
    > On WinXP, it's like so:
    >
    > char : 8 bits
    > short : 16 bits
    > int : 32 bits
    > long : 32 bits


    That's one platform. There are also platforms with 9 bit chars, and 36
    bit ints..... (at least if I recall correctly, it was 36 bits...)
     
    Andre Kostur, Jul 23, 2004
    #6
  7. JKop wrote:
    > I'm writing a prog that'll use Unicode. To represent a
    > Unicode character, I need a data type that can be set to
    > 65,536 distinct possible values; which in today's world
    > of computing equates to 16 bits. wchar_t is the natural
    > choice, but is there any guarantee in the standard that
    > it'll be 16 bits? If not, then is unsigned short the way to
    > go?


    16 bits will always store 65,536 distinct values, regardless of what
    day's world the programmer is living in, and regardless of how the
    platform interprets those 65,536 values (eg, positive and negative 0).

    as far as my reading goes there are no explicit guarantees for the size
    of wchar_t. however, wchar_t will be (in essence) an alias for one of
    the other integer types. what i am not sure of is whether or not
    "integer types" includes any of the char derivatives. if not, then the
    underlying type for wchar_t must be either short, int, or long, which
    would therefore imply a minimum of 16 bits.

    could someone confirm or deny my interpretation here?

    now, on another less c++-y note, you have made the classic java
    "misunderestimation" of unicode. unicode characters may require up to 32
    bits, not 16
    (http://www.unicode.org/standard/principles.html#Encoding_Forms). given
    that gem, your *best* bet would appear to be not wchar_t, not short, but
    *long*.

    of course, you should have no problem extending the iostreams, strings,
    etc. for the new character type ^_^. enjoy.

    indi
     
    Mark A. Gibbs, Jul 23, 2004
    #7
  8. Old Wolf

    Old Wolf Guest

    JKop <> wrote:
    >
    > I'm writing a prog that'll use Unicode. To represent a
    > Unicode character, I need a data type that can be set to
    > 65,536 distinct possible values;


    There were more than 90,000 possible Unicode characters last
    time I looked (there are probably more now).

    If you use a 16-bit type to store this, you have to either:
    - Ignore characters whose code is > 65535, or
    - Use a multi-byte encoding such as UTF-16, and then all of
    your I/O functions will have to be UTF-16 aware.

    > which in today's world of computing equates to 16 bits.


    A bit of mathematical thought will convince you that you need
    at least 16 Binary digITs to represent 2^16 values.

    > wchar_t is the natural choice, but is there any guarantee
    > in the standard that it'll be 16 bits?


    No, in fact it's unlikely to be 16 bit. It's only guaranteed to be
    able to support "the largest character in all supported locales",
    and locales are implementation-dependent, so it could be 8-bit on
    a system with no Unicode support.

    On MS windows, some compilers (eg. gcc) have 32-bit wchar_t and
    some (eg. Borland, Microsoft) have 16-bit. On all other systems
    that I've encountered, it is 32-bit.

    This is quite annoying (for people whose language falls in the
    over-65535 region especially). One can only hope that MS will
    eventually come to their senses, or perhaps that someone will
    standardise a system of locales.

    If you want to write something that's portable to all Unicode
    platforms, you will have to use UTF-16 for your strings,
    unfortunately. This means you can't use all the standard library
    algorithms on them. Define a type "utf16_t" which is an unsigned short.

    The only other alternative is to use wchar_t and decode UTF-16
    to plain wchar_t (ignoring any characters outside the range of
    your wchar_t) whenever you receive a wchar_t string encoded as
    UTF-16. (and don't write code that's meant to be used by Chinese).
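
    For what it's worth, pulling one code point out of a UTF-16 sequence is
    only a few lines (a sketch, assuming well-formed input; utf16_t is the
    typedef suggested above):

    typedef unsigned short utf16_t;   // at least 16 bits

    // Advances p past one UTF-16 unit (or surrogate pair) and returns the
    // decoded code point. No error checking.
    unsigned long decode_utf16(const utf16_t*& p)
    {
        unsigned long c = *p++;
        if (c >= 0xD800 && c <= 0xDBFF)                           // high surrogate
            c = 0x10000 + ((c - 0xD800) << 10) + (*p++ - 0xDC00); // add low surrogate
        return c;
    }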

    > This might sound a bit odd, but... if an unsigned short
    > must be at least 16 bits, then does that *necessarily* mean
    > that it:
    >
    > A) Must be able to hold 65,536 distinct values.
    > B) And be able to store integers in the range 0 -> 65,535 ?


    Yes

    > Furthermore, does a signed short int have to be able to
    > hold a value between:
    >
    > A) -32,767 -> 32,768
    >
    > B) -32,768 -> 32,767


    No

    > I've also heard that some systems are stupid enough (oops!
    > I mean poorly enough designed) to have two values for zero,
    > resulting in:
    >
    > -32,767 -> 32,767


    Yes (these are all archaic though, from a practical point of
    view you can assume 2's complement, ie. -32768 to 32767).
    FWIW the 3 supported systems are (for x > 0):
    2's complement: -x == ~x + 1
    1's complement: -x == ~x
    sign-magnitude: -x == x | (the sign bit)
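
    As an 8-bit illustration, the value -1 would be stored as (bit patterns
    only; this isn't something you can portably test from within C++):

    two's complement : 11111111
    one's complement : 11111110
    sign-magnitude   : 10000001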
     
    Old Wolf, Jul 23, 2004
    #8
  9. Sharad Kala

    Sharad Kala Guest


    > C requires short >= 16 bits, int >= 16 bits, long >= 32 bits. These
    > minimums are implied by the constraints given on INT_MIN, INT_MAX etc. in
    > <limits.h>. Presumably C++ inherits this from C.


    Yes, that's correct.
    At the end of section 18.2.2, there is a specific reference to ISO C
    subclause 5.2.4.2.1, so that section is included by reference. It gives
    the definitions of CHAR_BIT, UCHAR_MAX etc.

    -Sharad
     
    Sharad Kala, Jul 23, 2004
    #9
  10. >
    > This might sound a bit odd, but... if an unsigned short
    > must be at least 16 bits, then does that *necessarily* mean
    > that it:
    >
    > A) Must be able to hold 65,536 distinct values.
    > B) And be able to store integers in the range 0 -> 65,535 ?
    >


    Yes, USHRT_MAX must be at least 65535, and all unsigned types must obey
    the laws of modulo 2-to-the-power-N arithmetic where N is the number of
    bits. I think that implies that the minimum value is 0, and that all
    values between 0 and 2 to-the-power-N - 1 must be represented.
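
    A one-line illustration of that modulo behaviour (the value printed is
    USHRT_MAX, which must be at least 65535):

    #include <iostream>

    int main()
    {
        unsigned short u = 0;
        --u;                      // wraps around modulo 2^N
        std::cout << u << '\n';
    }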

    > Furthermore, does a signed short int have to be able to
    > hold a value between:
    >
    > A) -32,767 -> 32,768
    >
    > B) -32,768 -> 32,767
    >
    > I've also heard that some systems are stupid enough (oops!
    > I mean poorly enough designed) to have two values for zero,
    > resulting in:
    >
    > -32,767 -> 32,767
    >


    That's correct. I seriously doubt you would meet such a system in practice
    (except in a museum).

    >
    > For instance, I don't care if someone tells me it's 3
    > bits, just so long as it can hold 65,536 distinct values!
    >
    >
    > -JKop


    john
     
    John Harrison, Jul 23, 2004
    #10
  11. JKop

    JKop Guest

    Mark A. Gibbs posted:

    > of course, you should have no problem extending the iostreams, strings,
    > etc. for the new character type ^_^. enjoy.


    You're absolutely correct

    basic_string<unsigned long> stringie;


    -JKop
     
    JKop, Jul 23, 2004
    #11
  12. JKop

    JKop Guest

    Old Wolf posted:

    > from a practical point of
    > view you can assume 2's complement, ie. -32768 to 32767).
    > FWIW the 3 supported systems are (for x > 0):
    > 2's complement: -x == ~x + 1
    > 1's complement: -x == ~x
    > sign-magnitude: -x == x | (the sign bit)
    >


    So wouldn't that be -32,767 -> 32,768?

    I assume that 1's complement is the one that has both positive and negative
    0.


    As for the sign-magnitude thingie, that's interesting!

    unsigned short blah = 65535;

    signed short slah = blah;

    slah == -32767 ? ?


    -JKop
     
    JKop, Jul 23, 2004
    #12
  13. "JKop" <> wrote in message
    news:ma6Mc.5340$...
    > Mark A. Gibbs posted:
    >
    > > of course, you should have no problem extending the iostreams,
    > > strings, etc. for the new character type ^_^. enjoy.
    >
    > You're absolutely correct
    >
    > basic_string<unsigned long> stringie;
    >


    I think you'll also need a char_traits class.

    basic_string<unsigned long, ul_char_traits> stringie;
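
    Something along these lines should do as a starting point (a sketch only;
    the name ul_char_traits and the choice of eof() value are made up):

    #include <cstring>   // std::memmove, std::memcpy, std::size_t
    #include <cwchar>    // std::mbstate_t
    #include <ios>       // std::streampos, std::streamoff
    #include <string>

    struct ul_char_traits
    {
        typedef unsigned long  char_type;
        typedef unsigned long  int_type;
        typedef std::streampos pos_type;
        typedef std::streamoff off_type;
        typedef std::mbstate_t state_type;

        static void assign(char_type& r, const char_type& c) { r = c; }
        static bool eq(const char_type& a, const char_type& b) { return a == b; }
        static bool lt(const char_type& a, const char_type& b) { return a < b; }

        static int compare(const char_type* a, const char_type* b, std::size_t n)
        {
            for (std::size_t i = 0; i < n; ++i)
                if (!eq(a[i], b[i])) return lt(a[i], b[i]) ? -1 : 1;
            return 0;
        }
        static std::size_t length(const char_type* s)
        {
            std::size_t n = 0;
            while (s[n] != 0) ++n;
            return n;
        }
        static const char_type* find(const char_type* s, std::size_t n,
                                     const char_type& c)
        {
            for (std::size_t i = 0; i < n; ++i)
                if (eq(s[i], c)) return s + i;
            return 0;
        }
        static char_type* move(char_type* d, const char_type* s, std::size_t n)
        { std::memmove(d, s, n * sizeof(char_type)); return d; }
        static char_type* copy(char_type* d, const char_type* s, std::size_t n)
        { std::memcpy(d, s, n * sizeof(char_type)); return d; }
        static char_type* assign(char_type* d, std::size_t n, char_type c)
        { for (std::size_t i = 0; i < n; ++i) d[i] = c; return d; }

        static char_type to_char_type(const int_type& i) { return i; }
        static int_type  to_int_type(const char_type& c) { return c; }
        static bool eq_int_type(const int_type& a, const int_type& b)
        { return a == b; }
        static int_type  eof() { return static_cast<int_type>(-1); }
        static int_type  not_eof(const int_type& i) { return i == eof() ? 0 : i; }
    };

    typedef std::basic_string<unsigned long, ul_char_traits> ulstring;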

    john
     
    John Harrison, Jul 23, 2004
    #13
  14. Rolf Magnus

    Rolf Magnus Guest

    JKop wrote:

    > John Harrison posted:
    >
    >
    >> C requires short >= 16 bits, int >= 16 bits, long >= 32 bits. These
    >> minimums are implied by the constraints given on INT_MIN, INT_MAX etc.
    >> in <limits.h>. Presumably C++ inherits this from C.
    >>
    >> john

    >
    >
    > I'm writing a prog that'll use Unicode. To represent a
    > Unicode character, I need a data type that can be set to
    > 65,536 distinct possible values;


    No, you need more for full unicode support.

    > which in today's world of computing equates to 16 bits. wchar_t is
    > the natural choice, but is there any guarantee in the standard that
    > it'll be 16 bits?


    It doesn't need to be exactly 16 bits. It can be more. In g++, it's 32
    bits.

    > If not, then is unsigned short the way to go?
    >
    > This might sound a bit odd, but... if an unsigned short
    > must be at least 16 bits, then does that *necessarily* mean
    > that it:
    >
    > A) Must be able to hold 65,536 distinct values.
    > B) And be able to store integers in the range 0 -> 65,535 ?


    It's actually rather the other way round. It must explicitly be able to
    hold at least the range from 0 to 65535, which implies a minimum of 16
    bits.

    > Furthermore, does a signed short int have to be able to
    > hold a value between:
    >
    > A) -32,767 -> 32,768
    >
    > B) -32,768 -> 32,767


    Neither.

    > I've also heard that some systems are stupid enough (oops!
    > I mean poorly enough designed) to have two values for zero,
    > resulting in:
    >
    > -32,767 -> 32,767


    That's the minimum range that a signed short int must support.
     
    Rolf Magnus, Jul 23, 2004
    #14
  15. Rolf Magnus

    Rolf Magnus Guest

    JKop wrote:

    >
    > I've just realized something:
    >
    > char >= 8 bits
    > short int >= 16 bits
    > int >= 16 bits
    > long int >= 32 bits


    Yes.

    > And:
    >
    > short int >= int >= long int


    Uhm, no. But I guess it's just a typo :)

    > On WinXP, it's like so:
    >
    > char : 8 bits
    > short : 16 bits
    > int : 32 bits
    > long : 32 bits
    >
    >
    > Anyway,
    >
    > Since there's a minimum, why haven't they just been given definite
    > values, like:
    >
    > char : 8 bits
    > short : 16 bits
    > int : 32 bits
    > long : 64 bits


    Because there are other platforms for which other sizes may fit better.
    There are even systems that only support data types with a multiple of
    24 bits as their size. C++ can still be implemented on those, because the
    size requirements in the standard don't have fixed values. Also, int is
    supposed (though not required) to be the machine's native type, i.e. the
    fastest one. On 64 bit platforms, it often isn't, though.

    > or maybe even names like:
    >
    > int8
    > int16
    > int32
    > int64


    C99 has something like this in the header <stdint.h>. It further defines
    smallest and fastest integers with a specific minimum size, like:

    int_fast16_t
    int_least32_t

    This is a good thing, because an exact size is only rarely needed. Most
    often, you don't care about the exact size as long as it's the fastest
    (or the smallest) type that provides at least a certain range.
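
    For instance (a sketch; <stdint.h> is a C99 header, so a C++ compiler is
    not required to ship it, even though many do):

    #include <stdint.h>   // C99 fixed-width / least / fast integer types
    #include <iostream>

    int main()
    {
        int_least32_t population_of_asia = 2000000000; // smallest type >= 32 bits
        int_fast16_t  counter = 0;                     // fastest type >= 16 bits
        std::cout << sizeof population_of_asia << ' '
                  << sizeof counter << '\n';
    }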

    > And so then if you want a greater amount of distinct possible values,
    > there'll be standard library classes. For instance, if you want a
    > 128-Bit integer, then you're looking for a data type that can store
    > 3e+38 approx. distinct values. Well... if a 64-Bit integer can store
    > 1e+19 approx values, then put two together and voila, you've got a
    > 128-Bit number:
    >
    > class int128
    > {
    > private:
    >
    > int64 a;
    > int64 b;
    > //and so on
    > };
    >
    > Or while I'm thinking about that, why not be able to specify whatever
    > size you want, as in:
    >
    > int8 number_of_sisters;
    >
    > int16 population_my_town;
    >
    > int32 population_of_asia;
    >
    > int64 population_of_earth;
    >
    > or maybe even:
    >
    > int40 population_of_earth;
    >
    >
    > Some people may find that this is a bit ludicrous, but you can do it
    > already yourself with classes: if you want a 16,384 bit number, then
    > all you need to do is:
    >
    > class int16384
    > {
    > private:
    > unsigned char data[2048];
    >
    > //and so on
    >
    > };
    >
    > Or maybe even be able to specify how many distinct possible
    > combinations you need. So for unicode:
    >
    > unsigned int<65536> username[15];
    >
    >
    > This all seems so simple in my head - why can't it just be so!


    It isn't as simple as you might think. If it were, you could just start
    writing a proof-of-concept implementation. :)
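
    (Even plain addition of the two halves already needs explicit carry
    handling, for instance. A sketch, assuming an unsigned 64-bit type is
    available under the made-up name uint64:)

    typedef unsigned long long uint64;   // assumed 64 bits; not guaranteed by C++98

    struct uint128 { uint64 hi, lo; };   // unsigned, to keep the carry simple

    uint128 add(uint128 a, uint128 b)
    {
        uint128 r;
        r.lo = a.lo + b.lo;                  // wraps modulo 2^64
        r.hi = a.hi + b.hi + (r.lo < a.lo);  // propagate the carry
        return r;
    }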
     
    Rolf Magnus, Jul 23, 2004
    #15
  16. (Old Wolf) wrote in message news:<>...
    > JKop <> wrote:
    > On MS windows, some compilers (eg. gcc) have 32-bit wchar_t and
    > some (eg. Borland, Microsoft) have 16-bit. On all other systems
    > that I've encountered, it is 32-bit.
    >
    > This is quite annoying (for people whose language falls in the
    > over-65535 region especially). One can only hope that MS will
    > eventually come to their senses, or perhaps that someone will
    > standardise a system of locales.


    It is hardly annoying for these people, because they died long before
    computers were invented. The non-BMP region contains mostly symbols for
    dead languages.
     
    Nemanja Trifunovic, Jul 23, 2004
    #16
