Machines where size of size_t is not equal to size of unsigned int/long

Discussion in 'C Programming' started by James Harris, Sep 30, 2013.

  1. James Harris

    James Harris Guest

    AIUI for many CPUs and CPU modes a size_t could be typedef'd to unsigned int
    or unsigned long. I wondered where that would not be the case. Anyone know
    which CPUs or modes would have a size_t which was not the same size as
    unsigned int or unsigned long?

    James
    James Harris, Sep 30, 2013
    #1

  2. tim prince

    tim prince Guest

    Re: Machines where size of size_t is not equal to size of unsigned int/long

    On 09/30/2013 07:00 AM, James Harris wrote:
    > AIUI for many CPUs and CPU modes a size_t could be typedef'd to unsigned int
    > or unsigned long. I wondered where that would not be the case. Anyone know
    > which CPUs or modes would have a size_t which was not the same size as
    > unsigned int or unsigned long?
    >
    > James
    >
    >

    Most of our work nowadays is on the AMD64/Intel64 linux, or
    corresponding Windows X64, where size_t is a 64-bit data type, but int
    is a 32-bit type. On Windows, long int also is a 32-bit type.
    I don't know how those software vendors who vowed long ago not to
    support platforms where size_t differs from unsigned int can survive.
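
    A quick way to see which situation a given implementation is in is a
    sketch like the one below. The function names are made up for
    illustration; the answers depend entirely on the compiler and ABI in use.

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Returns 1 if size_t has the same size as unsigned int or unsigned long
 * on this implementation, 0 otherwise. (Illustrative name.) */
int size_t_matches_int_or_long(void)
{
    return sizeof(size_t) == sizeof(unsigned int)
        || sizeof(size_t) == sizeof(unsigned long);
}

/* Prints the storage width in bits of the types under discussion. */
void print_type_widths(void)
{
    printf("unsigned int:       %zu bits\n", sizeof(unsigned int) * CHAR_BIT);
    printf("unsigned long:      %zu bits\n", sizeof(unsigned long) * CHAR_BIT);
    printf("unsigned long long: %zu bits\n", sizeof(unsigned long long) * CHAR_BIT);
    printf("size_t:             %zu bits\n", sizeof(size_t) * CHAR_BIT);
}
```

    With the sizes described above, size_t_matches_int_or_long() should
    return 1 on 64-bit Linux (size_t and unsigned long are both 64 bits) and
    0 on 64-bit Windows (int and long are both 32 bits).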

    --
    Tim Prince
    tim prince, Sep 30, 2013
    #2

  3. Jorgen Grahn

    Jorgen Grahn Guest

    Re: Machines where size of size_t is not equal to size of unsigned int/long

    On Mon, 2013-09-30, wrote:
    > On Monday, September 30, 2013 12:00:22 PM UTC+1, James Harris wrote:
    >> AIUI for many CPUs and CPU modes a size_t could be typedef'd to unsigned int
    >> or unsigned long. I wondered where that would not be the case. Anyone know
    >> which CPUs or modes would have a size_t which was not the same size as
    >> unsigned int or unsigned long?
    >>
    >> James

    >
    > It's not a question of "machines" it is a matter of "implementation".


    To be fair, "machine" is often a euphemism for "machine, plus the
    tradeoffs made by the ABI and/or compiler vendor".

    (But yes, it's useful to point out that there's a distinction.)

    /Jorgen

    --
    // Jorgen Grahn <grahn@ Oo o. . .
    \X/ snipabacken.se> O o .
    Jorgen Grahn, Sep 30, 2013
    #3
  4. Re: Machines where size of size_t is not equal to size of unsigned int/long

    On 30-Sep-13 15:56, wrote:
    > On Monday, September 30, 2013 12:00:22 PM UTC+1, James Harris wrote:
    >> AIUI for many CPUs and CPU modes a size_t could be typedef'd to
    >> unsigned int or unsigned long. I wondered where that would not be
    >> the case. Anyone know which CPUs or modes would have a size_t which
    >> was not the same size as unsigned int or unsigned long?

    >
    > It's not a question of "machines" it is a matter of "implementation".
    >
    > If you created a compiler for, let's say, a modern Intel processor
    > running in 64-bit mode, you'd reasonably use a 64 bit unsigned
    > integer type for size_t. You'd also likely make unsigned long long a
    > 64 bit unsigned integer.


    That's the minimum, and x86-64 has no hardware support for a wider
    integer type, so that's the only logical choice.

    > On the other hand, there is no particular reason why int and long
    > shouldn't both be 32 bit,


    There is disagreement on that even within the x86-64 world: Microsoft
    chose IL32LLP64, presumably to make porting from Win32 easier, but the
    POSIX world standardized on I32LP64.

    While obviously not x86-64, it's notable that most implementations for
    Alpha were ILP64, rather than I32LP64. IIRC, Windows NT was ILP32!

    S

    --
    Stephen Sprunk "God does not play dice." --Albert Einstein
    CCIE #3723 "God is an inveterate gambler, and He throws the
    K5SSS dice at every possible opportunity." --Stephen Hawking
    Stephen Sprunk, Oct 1, 2013
    #4
  5. James Harris

    James Harris Guest

    "Jorgen Grahn" <> wrote in message
    news:...
    > On Mon, 2013-09-30, wrote:
    >> On Monday, September 30, 2013 12:00:22 PM UTC+1, James Harris wrote:
    >>> AIUI for many CPUs and CPU modes a size_t could be typedef'd to unsigned
    >>> int
    >>> or unsigned long. I wondered where that would not be the case. Anyone
    >>> know
    >>> which CPUs or modes would have a size_t which was not the same size as
    >>> unsigned int or unsigned long?
    >>>
    >>> James

    >>
    >> It's not a question of "machines" it is a matter of "implementation".

    >
    > To be fair, "machine" is often an euphemism for "machine, plus the
    > tradeoffs made by the ABI and/or compiler vendor".
    >
    > (But yes, it's useful to point out that there's a distinction.)


    Agreed.

    People have pointed out the differences that could be found on x86-64. I
    appreciate the info, and it was a case I hadn't thought of, but I was
    principally wondering about CPUs which are still in use today where the
    size of their addresses cannot be made to match the size of any of their
    integer types, whatever the implementation chooses.

    The only one I can think of is old real-mode x86 using far pointers where an
    address is 20 bits but the integers can be only 16-bit or 32-bit.

    I suppose the same mismatch might occur where a machine has separate address
    and data registers and they have different sizes but would guess they are
    not common.

    Some machines used word sizes which were not a power of 2 but I don't know how
    they manipulated addresses. Presumably their addresses were often smaller
    than their word size and few or none of those are still in use.

    James
    James Harris, Oct 1, 2013
    #5
  6. Noob

    Noob Guest

    Re: Machines where size of size_t is not equal to size of unsigned int/long

    Stephen Sprunk wrote:

    > Christian Bau wrote:
    >
    >> If you created a compiler for, let's say, a modern Intel processor
    >> running in 64-bit mode, you'd reasonably use a 64 bit unsigned
    >> integer type for size_t. You'd also likely make unsigned long long a
    >> 64 bit unsigned integer.

    >
    > That's the minimum, and x86-64 has no hardware support for a wider
    > integer type, so that's the only logical choice.


    Errr...

    x86-64 does have limited support for 128-bit GP integers, in the form
    of add-with-carry, widening multiply, and shift right/left double.
    (The same way x86 has limited support for 64-bit GP integers.)

    Therefore, it would not be unreasonable for an implementation to pick

    CHAR_BIT = 8, sizeof(int) = 4, sizeof(long) = 8, sizeof(long long) = 16

    and define uint32_t, uint64_t, uint128_t accordingly.
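
    To make the add-with-carry point concrete, here is a minimal sketch of
    128-bit addition built from two 64-bit halves. The u128 type and
    u128_add() are hypothetical names, not part of any standard header; on
    x86-64 a compiler may well turn the carry propagation into an add/adc
    pair, but that is up to the compiler.

```c
#include <stdint.h>

/* Hypothetical 128-bit unsigned integer stored as two 64-bit halves. */
typedef struct {
    uint64_t lo;  /* least significant 64 bits */
    uint64_t hi;  /* most significant 64 bits */
} u128;

/* Adds two 128-bit values modulo 2^128. The carry out of the low half is
 * detected by unsigned wraparound: the sum is smaller than an operand
 * exactly when the addition overflowed. */
u128 u128_add(u128 a, u128 b)
{
    u128 r;
    uint64_t carry;

    r.lo = a.lo + b.lo;
    carry = (r.lo < a.lo) ? 1u : 0u;
    r.hi = a.hi + b.hi + carry;
    return r;
}
```

    For example, adding 1 to a value with lo == UINT64_MAX and hi == 0 gives
    lo == 0 and hi == 1.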

    Regards.
    Noob, Oct 1, 2013
    #6
  7. James Kuyper

    James Kuyper Guest

    Re: Machines where size of size_t is not equal to size of unsigned int/long

    On 10/01/2013 07:15 AM, James Harris wrote:
    ....
    > People have pointed out the differences that could be found on x86-64. I
    > appreciate the info, and it was a case I hadn't thought of, but I was
    > principally wondering about CPUs which are still in use today where the
    > size of their addresses cannot be made to match the size of any of their
    > integer types, whatever the implementation chooses.


    If that's what your actual question was about, you asked it very poorly.
    It seems to me that uintmax_t would be more relevant to your question
    than either unsigned int or unsigned long. On the machines you describe,
    size_t would probably be the same as uintmax_t, which might or might not
    be bigger than unsigned long, so asking about "not equal" also seems
    irrelevant. intptr_t is more relevant to the question you describe.
    intptr_t is optional, and on the machines you describe, could not be
    supported. So it would be more relevant to ask about "machines where
    intptr_t cannot be supported".

    > The only one I can think of is old real-mode x86 using far pointers where an
    > address is 20 bits but the integers can be only 16-bit or 32-bit.


    I don't understand how that's an example of what you say you're looking
    for. It might have required only 20 bits to uniquely specify a byte of
    addressable memory, but they were usually accessed as a 16-bit segment
    and a 16-bit offset, and could be stored in 32 bits, the same as
    unsigned long. With 8-bit bytes, they couldn't have been stored in 20
    bits. They could have been stored in 24-bit pointers, but I don't think
    that would have worked very well, and I'm not aware of any
    implementation that did so (though that could just be ignorance on my part).

    A system such as you describe would have to have addresses too big to
    fit in uintmax_t. Support for a 64-bit integer type is mandatory, even
    if only by software emulation. Therefore, addresses would have to be
    larger than that, and the implementor would have to have some good
    reason for not implementing an integer type of the same size.
    --
    James Kuyper
    James Kuyper, Oct 1, 2013
    #7
  8. Noob <root@127.0.0.1> writes:
    > Stephen Sprunk wrote:
    >> Christian Bau wrote:
    >>> If you created a compiler for, let's say, a modern Intel processor
    >>> running in 64-bit mode, you'd reasonably use a 64 bit unsigned
    >>> integer type for size_t. You'd also likely make unsigned long long a
    >>> 64 bit unsigned integer.

    >>
    >> That's the minimum, and x86-64 has no hardware support for a wider
    >> integer type, so that's the only logical choice.

    >
    > Errr...
    >
    > x86-64 does have limited support for 128-bit GP integers, in the form
    > of add-with-carry, widening multiply, and shift right/left double.
    > (The same way x86 has limited support for 64-bit GP integers.)
    >
    > Therefore, it would not be unreasonable for an implementation to pick
    >
    > CHAR_BIT = 8, sizeof(int) = 4, sizeof(long) = 8, sizeof(long long) = 16
    >
    > and define uint32_t, uint64_t, uint128_t accordingly.


    Support for division is still mandatory. Of course it could be done in
    software.

    Another reasonable choice would be 64-bit [unsigned] long long and
    128-bit intmax_t/int128_t, which would require the use of extended
    integer types.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Working, but not speaking, for JetHead Development, Inc.
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
    Keith Thompson, Oct 1, 2013
    #8
  9. Richard Damon <> wrote:
    > On 10/1/13 7:15 AM, James Harris wrote:


    (snip)
    >> The only one I can think of is old real-mode x86 using far
    >> pointers where an address is 20 bits but the integers can
    >> be only 16-bit or 32-bit.


    (snip)

    > Minor nit. In real-mode x86, while addresses only had 20 bits of
    > results, far pointers were 32 bits in length (4 bytes).


    In real mode on the 8086 and 8088, addresses were 20 bits.

    > Addresses had many redundant ways of being represented,
    > as the effective address was (segment << 4) + offset (with
    > both segment and offset being 16 bits).


    On later processors than the 8086 and 8088, the result of the
    addition was 21 bits. Because some programs depend on the result
    being 20 bits, extra hardware was added to zero A20 in real mode,
    but that could be turned off. Turning it off allowed real mode
    programs an extra 64K (almost) of memory.
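
    The address arithmetic described here can be sketched in a few lines of
    C. The function name and the a20_enabled flag are invented for the
    example; it models only the arithmetic, not any real hardware interface.

```c
#include <stdint.h>

/* Computes the real-mode linear address for a segment:offset pair.
 * The raw sum can reach 0x10FFEF, which needs 21 bits. With the A20 line
 * forced to zero (a20_enabled == 0) the result wraps to 20 bits, as it
 * did on the 8086/8088. */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset, int a20_enabled)
{
    uint32_t linear = ((uint32_t)segment << 4) + offset;

    if (!a20_enabled)
        linear &= 0xFFFFFu;  /* keep only address bits 0..19 */
    return linear;
}
```

    Both 0x1234:0x0005 and 0x1000:0x2345 map to 0x12345, which shows the
    redundant representations. FFFF:FFFF gives 0x10FFEF with A20 enabled and
    0x0FFEF with it masked, so the extra region is 0x100000..0x10FFEF, or
    65520 bytes, the "almost 64K" mentioned above.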

    For protected mode 80286, you had a 16 bit segment selector
    and 16 bit offset. The selector selected an entry into
    a segment descriptor table giving a 24 bit origin and 16 bit
    length for each addressable segment.

    -- glen
    glen herrmannsfeldt, Oct 1, 2013
    #9
  10. Thomas Jahns

    Thomas Jahns Guest

    Re: Machines where size of size_t is not equal to size of unsigned int/long

    On 09/30/13 22:56, wrote:
    > On the other hand, there is no particular reason why int and long
    > shouldn't both be 32 bit, in which case size_t would be unsigned
    > long long, and have a different size from both unsigned int and
    > unsigned long.


    Actually there is: legacy code from before even C89 is prone to assume a
    long can hold a pointer value. Definitely bad practice but happened to
    work almost universally back then.

    Regards, Thomas
    Thomas Jahns, Oct 8, 2013
    #10
  11. James Kuyper

    James Kuyper Guest

    Re: Machines where size of size_t is not equal to size of unsigned int/long

    On 10/08/2013 07:30 AM, Thomas Jahns wrote:
    > On 09/30/13 22:56, wrote:
    >> On the other hand, there is no particular reason why int and long
    >> shouldn't both be 32 bit, in which case size_t would be unsigned
    >> long long, and have a different size from both unsigned int and
    >> unsigned long.

    >
    > Actually there is: legacy code from before even C89 is prone to assume a
    > long can hold a pointer value. Definitely bad practice but happened to
    > work almost universally back then.


    True, but that's not a particularly compelling reason. A policy of
    accommodating legacy code that has built-in assumptions about things
    left unspecified by the standard would prevent you from ever creating an
    implementation significantly different from the ones where those
    assumptions were valid. You should not expect to be able to port legacy
    code containing such assumptions to new systems; either it must be
    forever restricted to the steadily decreasing number of systems matching
    all of its assumptions, or you must sooner or later bite the bullet and
    remove at least some of those assumptions. You shouldn't use them as an
    argument to justify restricting new implementations.
    --
    James Kuyper
    James Kuyper, Oct 8, 2013
    #11
  12. James Kuyper

    James Kuyper Guest

    Re: Machines where size of size_t is not equal to size of unsigned int/long

    On 10/08/2013 11:05 PM, Robert Wessel wrote:
    > On Tue, 08 Oct 2013 07:46:23 -0400, James Kuyper
    > <> wrote:
    >
    >> On 10/08/2013 07:30 AM, Thomas Jahns wrote:
    >>> On 09/30/13 22:56, wrote:
    >>>> On the other hand, there is no particular reason why int and long
    >>>> shouldn't both be 32 bit, in which case size_t would be unsigned
    >>>> long long, and have a different size from both unsigned int and
    >>>> unsigned long.
    >>>
    >>> Actually there is: legacy code from before even C89 is prone to assume a
    >>> long can hold a pointer value. Definitely bad practice but happened to
    >>> work almost universally back then.

    >>
    >> True, but that's not a particularly compelling reason. A policy of
    >> accommodating legacy code that has built-in assumptions about things
    >> left unspecified by the standard would prevent you from ever creating an
    >> implementation significantly different from the ones where those
    >> assumptions were valid. You should not expect to be able to port legacy
    >> code containing such assumptions to new systems; either it must be
    >> forever restricted to the steadily decreasing number of systems matching
    >> all of its assumptions, or you must sooner or later bite the bullet and
    >> remove at least some of those assumptions. You shouldn't use them as an
    >> argument to justify restricting new implementations.

    >
    >
    > Although it would be a foolish standards body that did something that
    > broke a non-standard but widely used assumption for no good reason.
    > Breaking a lot of existing code is generally a bad idea. A lot of the
    > cruft in the C standard is just that sort of accommodation.


    The comment I was responding to was not about a decision to be made by a
    standards body, but by an implementation. The assumption Thomas Jahns
    mentioned is, in C99 terms, that UINTPTR_MAX <= ULONG_MAX. He mentioned
    it in the context of legacy code that pre-dates C89, and therefore C99,
    so uintptr_t didn't even exist yet. However, the concept behind
    uintptr_t dates back to before C89. The standard allows that assumption
    to be true, and it allows it to be false (either because a type larger
    than unsigned long is needed, or because no supported integer type is
    big enough to meet the requirements for uintptr_t).
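
    Code that wants to know which case it is in can test this at compile
    time. The following is a rough sketch with made-up function names; it
    relies only on UINTPTR_MAX from <stdint.h> (defined only when uintptr_t
    is provided) and ULONG_MAX from <limits.h>.

```c
#include <limits.h>
#include <stdint.h>

/* Reports how the legacy "a long can hold a pointer" assumption fares here:
 *   1  uintptr_t exists and fits in unsigned long
 *   0  uintptr_t exists but is wider than unsigned long
 *  -1  the implementation does not provide uintptr_t at all */
int pointer_fits_in_unsigned_long(void)
{
#if !defined(UINTPTR_MAX)
    return -1;
#elif UINTPTR_MAX <= ULONG_MAX
    return 1;
#else
    return 0;
#endif
}

/* Converts a pointer to uintptr_t and back. Where uintptr_t exists, the
 * standard guarantees the result compares equal to the original pointer.
 * Returns 1 on a successful round trip, 0 if uintptr_t is unavailable. */
int round_trips_through_uintptr(void *p)
{
#if defined(UINTPTR_MAX)
    uintptr_t bits = (uintptr_t)p;
    return (void *)bits == p;
#else
    (void)p;
    return 0;
#endif
}
```

    Legacy code that stores pointers in a long is only safe where the first
    function returns 1; storing them in uintptr_t, where it exists, avoids
    depending on that.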

    It's individual implementors who decide whether or not that should be
    true for their implementation. That decision should be made on the basis
    of what's good for their intended customers, and sometimes it's better
    to break legacy code than to make the accommodations needed to avoid
    breaking it. As long as someone needs the legacy code to be compilable,
    someone will maintain a compiler that has a mode that will allow it to
    be compiled, but that doesn't mean that all compilers need to be able to
    do so, nor even that it be the default mode for that compiler.
    --
    James Kuyper
    James Kuyper, Oct 9, 2013
    #12
  13. On Wednesday, October 9, 2013 12:27:13 PM UTC+1, James Kuyper wrote:
    > On 10/08/2013 11:05 PM, Robert Wessel wrote:
    >
    > >> You should not expect to be able to port legacy code containing such
    > >> assumptions to new systems; either it must be forever restricted to the
    > >> steadily decreasing number of systems matching all of its assumptions, or
    > >> you must sooner or later bite the bullet and remove at least some of those
    > >> assumptions. You shouldn't use them as an argument to justify restricting
    > >> new implementations.

    >
    >
    > > Although it would be a foolish standards body that did something that
    > > broke a non-standard but widely used assumption for no good reason.
    > > Breaking a lot of existing code is generally a bad idea. A lot of the
    > > cruft in the C standard is just that sort of accommodation.

    >
    > It's individual implementors who decide whether or not that should be
    > true for their implementation. That decision should be made on the basis
    > of what's good for their intended customers, and sometimes it's better
    > to break legacy code than to make the accommodations needed to avoid
    > breaking it. As long as someone needs the legacy code to be compilable,
    > someone will maintain a compiler that has a mode that will allow it to
    > be compiled, but that doesn't mean that all compilers need to be able to
    > do so, nor even that it be the default mode for that compiler.
    >

    A real example of this happening is the MS Windows interface.

    Windows are defined by opaque handles, which can be PrivateWindow *s
    underneath, but originally they were longs, I suspect an index into a
    window table. To have any sort of encapsulation, you need to be able to hang
    a pointer off a window. But Microsoft didn't provide a "set user pointer"
    function. Instead they provided a "get/set window long", with a USER_DATA
    field nicely defined.

    So if a void * fitted into a long, you could hang a pointer off a window. It was
    wrong, but the only alternative was to specify some sort of memory handle
    scheme. Then you wouldn't have encapsulation, because your window widget would
    depend on an external malloc/handle wrapper. You could get round this by
    having separate malloc wrappers for each class, but then it gets even more
    messy, and all to avoid a cast from a long to a void *.

    So lots of widgets were built with this scheme. Now you want the code to mix
    with new code. There's limited use in having a widget that can't be taken and
    dropped into a new program. So just having one mode which defines long as
    the same size as void * doesn't help. Of course Microsoft put in a layer of
    typedefs, so the function actually takes a LONG. Then they provided a
    SetWindowLongPtr() function, which, it turns out, also needs a long. But these
    strategies haven't actually worked. They rarely do. Changing a typedef has
    too many effects to be a smooth process.

    There's no easy answer. The changes needed to most code are pretty trivial,
    you've just got to replace the call to get/set the user long with a call to
    the latest memory hook. But it still means editing and maintaining two versions
    of files.
    Malcolm McLean, Oct 9, 2013
    #13