Plauger, size_t and ptrdiff_t

robertwessel2

In another thread, a poster mentioned the Posix ssize_t definition
(signed version of size_t). My initial reaction was to wonder what the
point of the Posix definition was when ptrdiff_t was already defined as
such.

I got the idea that ptrdiff_t had to be the same size as size_t from
Plauger's "The Standard C Library," where he states "... It is always
the signed type that has the same number of bits as the unsigned type
chosen for size_t..." This language would not rule out one being int
and the other long so long as sizeof(int)==sizeof(long) for the
implementation.

Now I can't see anywhere in the standard that would require that, at
least not directly, and it seems that a size_t of unsigned int and a
ptrdiff_t of long (where int and long are different sizes) would be
possible. C99 defines SIZE_MAX as being at least 65535, and
PTRDIFF_MIN/MAX as being at least -/+65535.

So do size_t and ptrdiff_t have to be the same size (or base type) or
not?
 
Michael Mair

They don't. For most implementations, the stated connection
will hold.
size_t, ssize_t and ptrdiff_t are typedefed for their respective
_roles_ as well as for abstraction.

Cheers
Michael
 
Keith Thompson

There's no requirement in the standard for size_t and ptrdiff_t to be
the same size, but I don't know of any implementation where they
differ.

ptrdiff_t is "the signed integer type of the result of subtracting two
pointers"; size_t is "the unsigned integer type of the result of the
sizeof operator".

Suppose a system only supports objects up to 65535 bytes. The sizeof
operator can only yield values from 0 to 65535, so 16 bits are
sufficient, but pointer subtraction for pointers to elements of an
array of 65535 bytes could yield values from -65535 to +65535, so
ptrdiff_t would have to be at least 17 bits.
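
The arithmetic above can be checked with a small sketch. The helper names are illustrative only (they come from neither the thread nor the standard), and the sketch assumes an ordinary binary representation with one sign bit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of value bits needed to represent n. */
static unsigned bits_for_unsigned(uintmax_t n)
{
    unsigned bits = 0;
    while (n != 0) {
        bits++;
        n >>= 1;
    }
    return bits;
}

/* Hypothetical helper: width of a signed type able to hold every
   difference in the range -max_size .. +max_size (value bits plus
   one sign bit). */
static unsigned bits_for_signed_difference(uintmax_t max_size)
{
    return bits_for_unsigned(max_size) + 1;
}
```

For a maximum object size of 65535 bytes, the first helper gives the 16 bits that suffice for sizeof results and the second gives the 17 bits needed for the full range of pointer differences.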
 
P.J. Plauger

Keith Thompson said:
There's no requirement in the standard for size_t and ptrdiff_t to be
the same size, but I don't know of any implementation where they
differ.

ptrdiff_t is "the signed integer type of the result of subtracting two
pointers"; size_t is "the unsigned integer type of the result of the
sizeof operator".

Suppose a system only supports objects up to 65535 bytes. The sizeof
operator can only yield values from 0 to 65535, so 16 bits are
sufficient, but pointer subtraction for pointers to elements of an
array of 65535 bytes could yield values from -65535 to +65535, so
ptrdiff_t would have to be at least 17 bits.

Yes, but. X3J11 was painfully aware of this problem, which is why
we explicitly decided to allow some pointer differences to be
unrepresentable as a ptrdiff_t. I said what I said because that
was the reality at the time and it was certainly the intent of the
committee. Whether it got captured well in words...

BTW, even when you do get a ptrdiff_t overflow, on a (very common)
twos-complement machine with quiet wraparound on overflow there are
remarkably few cases where it matters.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
 
Jordan Abel

Keith Thompson said:
There's no requirement in the standard for size_t and ptrdiff_t to be
the same size, but I don't know of any implementation where they
differ.

How about an implementation where size_t is unsigned int and ptrdiff_t
is long? If all base types have only the minimum ranges, you can fit
size_t in an unsigned int but can't fit ptrdiff_t in an int.
 
Keith Thompson

P.J. Plauger said:
Yes, but. X3J11 was painfully aware of this problem, which is why
we explicitly decided to allow some pointer differences to be
unrepresentable as a ptrdiff_t. I said what I said because that
was the reality at the time and it was certainly the intent of the
committee. Whether it got captured well in words...

BTW, even when you do get a ptrdiff_t overflow, on a (very common)
twos-complement machine with quiet wraparound on overflow there are
remarkably few cases where it matters.

Ok, I see that the standard explicitly allows the result of ptr1-ptr2
not to fit in a ptrdiff_t (it's undefined behavior if it doesn't).
But I don't see anything that requires, or even encourages, size_t and
ptrdiff_t to be the same size.

If the maximum object size is, say, 65535 bytes, the standard
*permits* size_t and ptrdiff_t to be 16 bits, but is there any reason
(as far as the standard is concerned) not to make size_t 16 bits and
ptrdiff_t 32 bits?
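
Because an unrepresentable pointer difference is undefined behavior, code that may handle very large objects could test the span before subtracting. The following is only a sketch; the helper name is hypothetical, and it relies on nothing beyond the C99 limits macros mentioned earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the difference between the start of
   an object of `size` bytes and its one-past-the-end pointer is
   representable as a ptrdiff_t, 0 otherwise. */
static int span_fits_in_ptrdiff(size_t size)
{
    return (uintmax_t)size <= (uintmax_t)PTRDIFF_MAX;
}
```

On an implementation where ptrdiff_t is at least one bit wider than size_t, this check would always succeed; where both have the same width, it fails for sizes above PTRDIFF_MAX.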
 
Keith Thompson

Jordan Abel said:
How about an implementation where size_t is unsigned int and ptrdiff_t
is long? If all base types have only the minimum ranges, you can fit
size_t in an unsigned int but can't fit ptrdiff_t in an int.

Then that would be an implementation I don't know of. (It's legal as
far as I know.)
 
P.J. Plauger

Keith Thompson said:
Ok, I see that the standard explicitly allows the result of ptr1-ptr2
not to fit in a ptrdiff_t (it's undefined behavior if it doesn't).
But I don't see anything that requires, or even encourages, size_t and
ptrdiff_t to be the same size.

If the maximum object size is, say, 65535 bytes, the standard
*permits* size_t and ptrdiff_t to be 16 bits, but is there any reason
(as far as the standard is concerned) not to make size_t 16 bits and
ptrdiff_t 32 bits?

There's just Nixon's rule -- you could do it, but it would be wrong.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
 
Keith Thompson

P.J. Plauger said:
There's just Nixon's rule -- you could do it, but it would be wrong.

And why exactly would it be wrong? I see nothing in the standard that
even vaguely implies that size_t and ptrdiff_t should be the same
size, and there are realistic circumstances (see above) in which it
would make perfectly good sense, IMHO, for ptrdiff_t to be larger than
size_t. The standard's wording caters to implementations that choose
to make them the same size, but that's very different from encouraging
them to be the same size.

Your opinion does carry significant weight, but I'd be very interested
in knowing the reasoning behind it. In the meantime, I can't think of
any reason to write code that could break on systems where ptrdiff_t
is bigger than size_t.
 
P.J. Plauger

Keith Thompson said:
And why exactly would it be wrong?

Because there's next to nothing to be gained by it, and some people
would be surprised that ptrdiff_t is unnecessarily large. (I won't
even mention a third of a century of past practice.)

Keith Thompson said:
I see nothing in the standard that
even vaguely implies that size_t and ptrdiff_t should be the same
size,

There's nothing in the C Standard that prohibits 93 bit chars, either.

Keith Thompson said:
and there are realistic circumstances (see above) in which it
would make perfectly good sense, IMHO, for ptrdiff_t to be larger than
size_t.

Maybe by one bit, but as I said before even there it matters way less
than you might think.

Keith Thompson said:
The standard's wording caters to implementations that choose
to make them the same size, but that's very different from encouraging
them to be the same size.

Right. The C Standard is not intended to rule out silly implementations.

Keith Thompson said:
Your opinion does carry significant weight, but I'd be very interested
in knowing the reasoning behind it. In the meantime, I can't think of
any reason to write code that could break on systems where ptrdiff_t
is bigger than size_t.

Nor do I encourage that either.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
 
tmp123

P.J. Plauger said:
Because there's next to nothing to be gained by it, and some people
would be surprised that ptrdiff_t is unnecessarily large. (I won't
even mention a third of a century of past practice.)

....

Sorry, I do not understand something:

Result of calloc is an array that, according to calloc parameters, can
be up to size_t**2 bytes. ptrdiff_t must be long enough for pointer
differences in its elements. Thus, ptrdiff_t must be greater than
size_t (or both must have the maximum system value)

In other words: it is easy to think of a system that doesn't allow big
structure definitions (for example, a maximum of one "memory page" of
4096 bytes --> size_t=4096 ), but this system can accept big dynamic
arrays (ptrdiff_t --> +/-0x7FFFFFFF).

I suppose there are some more related rules I do not know.

Kind regards.
 
P.J. Plauger

tmp123 said:
Sorry, I do not understand something:
Result of calloc is an array that, according to calloc parameters, can
be up to size_t**2 bytes.

The calloc parameters might permit you to request that, but the system
can't deliver it. size_t by definition is an unsigned integer type
big enough to represent the count of bytes in the largest object you
can declare or allocate.

tmp123 said:
ptrdiff_t must be long enough for pointer
differences in its elements.

Actually, no. The difference between two pointers is permitted to
overflow when represented as an object of type ptrdiff_t.

tmp123 said:
Thus, ptrdiff_t must be greater than
size_t (or both must have the maximum system value)

Neither is true.

tmp123 said:
In other words: it is easy to think of a system that doesn't allow big
structure definitions (for example, a maximum of one "memory page" of
4096 bytes --> size_t=4096 ), but this system can accept big dynamic
arrays (ptrdiff_t --> +/-0x7FFFFFFF).

I suppose there are some more related rules I do not know.

Yes. See above.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
 
Keith Thompson

P.J. Plauger said:
Actually, no. The difference between two pointers is permitted to
overflow when represented as an object of type ptrdiff_t.

Agreed, but you seem to be arguing that it's *better* for an
implementation to allow pointer subtraction to overflow than to make
ptrdiff_t bigger than size_t. I respectfully disagree.

Note that this would be necessary only when the maximum object size is
greater than SIZE_MAX/2 *and* size_t is the largest available integer
type. Since C99 requires support for 64-bit integers, that's unlikely
to happen for a long time. But a system with a 32-bit size_t that
allows objects larger than 2 gigabytes (but not larger than 4
gigabytes) might reasonably have a 64-bit ptrdiff_t. (33 bits would
suffice, but I'm assuming hardware support for 32 and 64 bits.) (It
might also reasonably expand its size_t to 64 bits.)
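
The condition described above can be written down as a one-line predicate. This is a sketch with a hypothetical helper name; it assumes that a ptrdiff_t of the same width as size_t has a maximum of roughly half of size_t's maximum.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if objects of up to max_object_size
   bytes can produce pointer differences that overflow a ptrdiff_t of
   the same width as a size_t whose maximum is size_max. */
static int same_width_ptrdiff_can_overflow(uintmax_t max_object_size,
                                           uintmax_t size_max)
{
    return max_object_size > size_max / 2;
}
```

With a 32-bit size_t, a 3-gigabyte object meets the condition and a 1-gigabyte object does not.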
 
Jordan Abel

tmp123 said:
Result of calloc is an array that, according to calloc parameters, can
be up to size_t**2 bytes. ptrdiff_t must be long enough for pointer
differences in its elements. Thus, ptrdiff_t must be greater than
size_t (or both must have the maximum system value)

The system can [and many do. mine does.] fail all calls to calloc that
result in a size that won't fit in size_t.
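
A minimal sketch of the kind of check such a calloc could make is shown below. The wrapper name is hypothetical, and real library implementations may perform an equivalent test internally in a different way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper: refuse any request whose total byte count
   nmemb * size cannot be represented in a size_t. */
static void *checked_calloc(size_t nmemb, size_t size)
{
    if (size != 0 && nmemb > SIZE_MAX / size) {
        return NULL; /* the product would overflow size_t */
    }
    return calloc(nmemb, size);
}
```

The division avoids computing the product at all, so the test itself cannot wrap around.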
 
Jordan Abel

Keith Thompson said:
Agreed, but you seem to be arguing that it's *better* for an
implementation to allow pointer subtraction to overflow than to make
ptrdiff_t bigger than size_t. I respectfully disagree.

But a system could implement a "magic overflow" - i.e. the size of
size_t, char *, and ptrdiff_t are all the same and overflow acts like
typical twos-complement systems.

Then the [char *] pointer difference 0x01-0xFFF3 is 14 [not -65522], and
adding 14 to the pointer 0xFFF3 results in 0x01.
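
The wraparound arithmetic can be modeled with 16-bit unsigned integers standing in for addresses. This is only a model under that assumption (the function names are hypothetical); it does not perform real pointer subtraction, which would be undefined on overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a 16-bit address difference: the result wraps modulo 65536. */
static uint16_t wrapped_difference(uint16_t a, uint16_t b)
{
    return (uint16_t)(a - b);
}

/* Model of adding an offset to a 16-bit address, again modulo 65536. */
static uint16_t wrapped_add(uint16_t addr, uint16_t offset)
{
    return (uint16_t)(addr + offset);
}
```

Here wrapped_difference(0x0001, 0xFFF3) yields 14, and wrapped_add(0xFFF3, 14) yields 0x0001, matching the figures in the post.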
 
Malcolm

P.J. Plauger said:
Yes, but. X3J11 was painfully aware of this problem, which is why
we explicitly decided to allow some pointer differences to be
unrepresentable as a ptrdiff_t. I said what I said because that
was the reality at the time and it was certainly the intent of the
committee. Whether it got captured well in words...

BTW, even when you do get a ptrdiff_t overflow, on a (very common)
twos-complement machine with quiet wraparound on overflow there are
remarkably few cases where it matters.

My own view is that it is time the size_t, ptrdiff_t ugliness was put to
rest.

It is designed to solve a problem that almost never occurs in practise,
which is that the size of an object in memory exceeds the size of an
integer.
There is maybe a case for making malloc() take a special type as an
argument, but there is a much weaker case for then allowing the type to run
through the code, so that every count of objects (and most integers count
something), and hence every array index, has to be a size_t.

As this problem shows, the strategy doesn't even have the advantage of
making every program that uses the new types theoretically correct for all
values, unless you get into the nonsense of making ptrdiff_t a bit wider
than size_t.
 
Keith Thompson

tmp123 said:
ptrdiff_t must be long enough for pointer
differences in its elements.

P.J. Plauger said:
Actually, no. The difference between two pointers is permitted to
overflow when represented as an object of type ptrdiff_t.

Keith Thompson said:
Agreed, but you seem to be arguing that it's *better* for an
implementation to allow pointer subtraction to overflow than to make
ptrdiff_t bigger than size_t. I respectfully disagree.

Jordan Abel said:
But a system could implement a "magic overflow" - i.e. the size of
size_t, char *, and ptrdiff_t are all the same and overflow acts like
typical twos-complement systems.

Then the [char *] pointer difference 0x01-0xFFF3 is 14 [not -65522], and
adding 14 to the pointer 0xFFF3 results in 0x01.

Sure, a system could do that, and many do. This is clearly allowed by
the standard. I'm suggesting that, given that ptrdiff_t can overflow
for large objects, it's better to make ptrdiff_t big enough so it
can't overflow (even if it's not the same size as size_t) than to
force it to be the same size as size_t.

I know that most systems have ptrdiff_t and size_t the same size. I'm
just not convinced that there's any significant disadvantage in making
them different sizes *if* there's some benefit (avoiding overflow) in
doing so.
 
Keith Thompson

Malcolm said:
My own view is that it is time the size_t, ptrdiff_t ugliness was put to
rest.

It is designed to solve a problem that almost never occurs in practise,
which is that the size of an object in memory exceeds the size of an
integer.

I assume you mean that the size of an object (in bytes) exceeds the
maximum value of an integer. And I assume that by "integer" you
really mean "int".

So are you suggesting that we should use unsigned int to represent
object sizes?

On typical 64-bit systems, including several that I work on, type int
is 32 bits (and long is 64 bits). If int were made 64 bits, then
there would be a gap in the type system; char is 8 bits, int would be
64 bits, and short would be either 16 or 32 bits. (A C99 extended
integer type could solve this, but such types aren't commonly
implemented.) One such system in particular has 8 gigabytes of
physical memory. I don't know whether objects bigger than 4 gigabytes
are allowed, but there's no fundamental reason they shouldn't be --
and both size_t and ptrdiff_t are 64 bits.

Malcolm said:
There is maybe a case for making malloc() take a special type as an
argument, but there is a much weaker case for then allowing the type to run
through the code, so that every count of objects (and most integers count
something), and hence every array index, has to be a size_t.

As this problem shows, the strategy doesn't even have the advantage of
making every program that uses the new types theoretically correct for all
values, unless you get into the nonsense of making ptrdiff_t a bit wider
than size_t.

Even if you assume that making ptrdiff_t wider than size_t is
nonsense, it's not necessary in this case. Limiting object sizes to
values representable in 32 bits would be absurd; limiting them to
values representable in 64 bits will be more than enough for many
years.

What exactly are you proposing?
 
Malcolm

Keith Thompson said:
Even if you assume that making ptrdiff_t wider than size_t is
nonsense, it's not necessary in this case. Limiting object sizes to
values representable in 32 bits would be absurd; limiting them to
values representable in 64 bits will be more than enough for many
years.

What exactly are you proposing?

Change the standard so that malloc() "takes as an argument an integral type
of type int or higher".

Therefore if you have a machine with a huge memory and small ints, the
compiler writer is free to say that malloc() should take an unsigned long
long, or whatever.

However anyone can write code with ints as array indices, and know that it
is portable - the huge array user is the one making the non-portable
assumption.

Usually you never want to allocate more memory than can be held in an
integer, and in the exceptions you usually want to make only one, very
program- and platform-specific, allocation.
 
Keith Thompson

Malcolm said:
Change the standard so that malloc() "takes as an argument an integral type
of type int or higher".

Therefore if you have a machine with a huge memory and small ints, the
compiler writer is free to say that malloc() should take an unsigned long
long, or whatever.

However anyone can write code with ints as array indices, and know that it
is portable - the huge array user is the one making the non-portable
assumption.

You propose making malloc()'s argument type implementation-defined.
It already is; the only real difference is that the standard gives a
name to that argument type, namely size_t.

If you want to call malloc() with an int argument, you're already free
to do so; it will be converted to size_t as long as the prototype is
visible. If you want to index arrays with ints, that's also perfectly
legal.

I really don't see any advantage in what you propose.
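
The implicit conversion from int to size_t has one practical hazard: a negative int becomes a very large size_t. A sketch of a guard is shown below; the wrapper name is hypothetical and is not something proposed in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical wrapper for callers that keep byte counts in int.
   A negative count is rejected before it can be converted to a huge
   size_t value. */
static void *alloc_with_int_count(int count)
{
    if (count < 0) {
        return NULL;
    }
    return malloc((size_t)count);
}
```

For non-negative counts this behaves like calling malloc() directly with the int argument while the prototype is visible.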
 
