ptrdiff_t maximum

Keith Thompson

Shao Miller said:
It seems to me that 'SIZE_MAX' is pretty useful as the count of
possible pointer values (not object representations) given a
contiguous range of object storage, and 'ptrdiff_t' simply allows for
a sign. But hey! Suppose the hardware supports far fewer pointer
values (for address registers, perhaps) than it does values for the
next-largest arithmetic register... If 'size_t' is a typedef to a
standard integer type, then we really need 'SIZE_MAX' rather than
relying on finding the maximum value of the standard integer type. An
arbitrary choice for an alternative to a missing 'SIZE_MAX' in a C89
implementation might yield unpleasant results when we increment a
pointer one too many times. Fortunately, the Standard evolves.

Note that SIZE_MAX is an upper bound on the value that can be yielded by
sizeof, not necessarily the least upper bound. There is no defined
constant (that I know of) that actually tells you the maximum size of an
object, which could be substantially smaller than SIZE_MAX. SIZE_MAX is
merely the maximum value of the unsigned type chosen to represent sizes.
(Consider the old m68000, which had 24-bit addresses but 16-bit and
32-bit integers.)
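
Editorial sketch of the point above, assuming Keith's reading that SIZE_MAX is simply the largest value of the type size_t. On a C89 implementation with no SIZE_MAX, the same number can be obtained by converting -1 to size_t. The helper name size_type_max is hypothetical. Neither value says how large an object the implementation will actually support.

```c
#include <stddef.h>
#include <stdint.h>   /* C99: provides SIZE_MAX for the comparison below */

/* size_t is an unsigned type, so converting -1 to it wraps around to the
   largest value the type can hold. This is the maximum of the TYPE, not
   the size of the largest object that can really be created. */
static size_t size_type_max(void)
{
    return (size_t)-1;
}

/* Returns 1 when the C89-style fallback agrees with the C99 macro. */
static int fallback_matches_macro(void)
{
    return size_type_max() == SIZE_MAX;
}
```

Calling fallback_matches_macro() from main is one way to compare the two values on a given implementation.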
I don't fully understand why we have the optional 'intptr_t' and
'uintptr_t' types, _except_ that the sign of 'ptrdiff_t' allows a
possibly greater bit-width than 'size_t', but 'uintptr_t' seems to
have a greater bit-width than 'intptr_t'. Oh well.

On many systems, uintptr_t is likely to be the same type as size_t, but
that assumes a monolithic address space. You might, for example, have
64-bit addresses and 16 exabytes of address space, but only up to
4-gigabyte objects (32 bits), so uintptr_t could be 64 bits while size_t
is only 32 bits.

It's also a matter of documentation; if you're using uintptr_t, it's
clear that you're doing it because you want an unsigned integer
type that can represent addresses. (You'll also get a compile-time
error if there is no such integer type.)

There's no way to determine whether uintptr_t or intptr_t is preferred
for a given system, but I'm not sure what you'd do with that information
anyway.
I'd still be interested in anyone's response to:

In principle, objects themselves don't (necessarily) have types.
An object is merely a "region of data storage in the execution
environment, the contents of which can represent values". (Note:
"*can* represent", not "represents".) A type is imposed by the
lvalue (expression) used to refer to the object. For declared
objects, this is typically the type used in the object declaration;
for allocated objects, some type has to be imposed.

For any context where you care whether an object is an array or not,
you'll be using some type to access it. That type will clearly be
an array type or a non-array type.

I'm not at all sure that's sufficiently clear. If it isn't,
could you rephrase the question and/or provide an example?
 
Geoff

Indeed so. Thank you for the correction.

Output on x64 (i7), Windows 7, Visual Studio 2010 Pro.

sizeof test_array is 127

Maximum value of char is 127
Maximum value of signed char is 127
Maximum value of unsigned char is 255

Maximum value of signed short is 32767
Maximum value of unsigned short is 65535

Maximum value of signed int is 2147483647
Maximum value of unsigned int is 4294967295

Maximum value of signed long is 2147483647
Maximum value of unsigned long is 4294967295

Maximum value of ptrdiff_t is 9223372036854775807
Maximum value of size_t is 18446744073709551615

Value of UNSIGNED_MAX_MAX is 18446744073709551615

UMAX_FORMAT is llu
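
Editorial note on the output above: on this platform unsigned long stops at 4294967295 while size_t reaches 18446744073709551615, so "%lu" would be too narrow for a size_t. The program that produced the output is not shown. The following is only a sketch of one way to print such a value, assuming size_t is no wider than unsigned long long and that C99 snprintf is available. The name format_size is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not the program quoted above): writes the decimal
   form of value into out, which holds cap bytes. The value is widened to
   unsigned long long so that "%llu" matches the argument type exactly.
   Returns what snprintf returns: the number of characters needed. */
static int format_size(char *out, size_t cap, size_t value)
{
    assert(out != NULL);
    return snprintf(out, cap, "%llu", (unsigned long long)value);
}
```

Where it is supported, "%zu" with a plain size_t argument avoids the cast altogether.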
 
Shao Miller

Keith said:
Note that SIZE_MAX is an upper bound on the value that can be yielded by
sizeof, not necessarily the least upper bound. There is no defined
constant (that I know of) that actually tells you the maximum size of an
object, which could be substantially smaller than SIZE_MAX. SIZE_MAX is
merely the maximum value of the unsigned type chosen to represent sizes.
(Consider the old m68000, which had 24-bit addresses but 16-bit and
32-bit integers.)

D'oh. Uno suggested votes. I'm genuinely curious as to who might
perceive benefit from a theoretical 'OBJECT_MAX' integer constant
expression which yields the maximum size (in bytes/chars) of an object
which the implementation guarantees well-defined behaviour for. It
might help Mr. J. Kelly's 'trim', and has my vote.

Such could allow for assertions on integers to be added to pointers and
could even help to answer "will my implementation honour this 'calloc'
or reject it as absurd before looking for available space?" to which
experts could reply, "You're not using 'OBJECT_MAX', so it's difficult
to know."

It might be easy to simply make this 'SIZE_MAX'.
On many systems, uintptr_t is likely to be the same type as size_t, but
that assumes a monolithic address space. You might, for example, have
64-bit addresses and 16 exabytes of address space, but only up to
4-gigabyte objects (32 bits), so uintptr_t could be 64 bits while size_t
is only 32 bits.
Aha.


It's also a matter of documentation; if you're using uintptr_t, it's
clear that you're doing it because you want an unsigned integer
type that can represent addresses. (You'll also get a compile-time
error if there is no such integer type.)

Fortunately, we can check for 'INTPTR_MIN' and family to get a clue as
to 'intptr_t' availability.
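
Editorial sketch of that check, assuming a C99 <stdint.h>. The limit macro UINTPTR_MAX is defined only when the implementation provides uintptr_t, so it can act as a feature test. HAVE_UINTPTR_T and pointer_roundtrips are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#ifdef UINTPTR_MAX
#define HAVE_UINTPTR_T 1

/* A valid void * converted to uintptr_t and back must compare equal to
   the original pointer; that is the property the type is defined by. */
static int pointer_roundtrips(void *p)
{
    uintptr_t bits = (uintptr_t)p;
    return (void *)bits == p;
}
#else
#define HAVE_UINTPTR_T 0

/* No suitable integer type exists here; report that the round trip is
   unavailable instead of failing to compile. */
static int pointer_roundtrips(void *p)
{
    (void)p;
    return 0;
}
#endif
```

The same pattern works with INTPTR_MIN or INTPTR_MAX for intptr_t.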
There's no way to determine whether uintptr_t or intptr_t is preferred
for a given system, but I'm not sure what you'd do with that information
anyway.


In principle, objects themselves don't (necessarily) have types.

Thank goodness. That is definitely a characteristic about C which can
be useful for programmer compatibility deliberations. :)
An object is merely a "region of data storage in the execution
environment, the contents of which can represent values". (Note:
"*can* represent", not "represents".) A type is imposed by the
lvalue (expression) used to refer to the object. For declared
objects, this is typically the type used in the object declaration;
for allocated objects, some type has to be imposed.
Agreed.


For any context where you care whether an object is an array or not,
you'll be using some type to access it. That type will clearly be
an array type or a non-array type.

I'm not at all sure that's sufficiently clear. If it isn't,
could you rephrase the question and/or provide an example?
"Pure" pointers (to non-array object types) are a popular choice for
working with allocated array objects (whatever that means; we have no
type) from memory management functions, for some reason. Probably
because of their brevity in contrast to pointers to array types.

Do these "pure" pointers to allocated storage have anything to say about
the number of elements in the array object pointed-to? Similarly,

union forty_two_doubles {
    double d;
    char space[sizeof (double) * 42];
} foo;
double *pure = (double *)&foo;

Here we have nice alignment and clearly enough space for 42 'double's,
but different programmers reading the same Standard (or draft) have
different expectations for what kinds of pointer arithmetic are
well-defined upon 'pure'.

A: It doesn't point to a 'double', but to a 'union'!
B: It might point to 42 of them!
C: You must use '&foo.d'!
D [in a recognizable voice]: Nobody ought to write such code; it's worse
than stupid! Forget all about it. *Jedi*hand-wave*
E: The bounds are implementation-defined.
F: The bounds are undefined.
G: I only see a single 'double'. There's a single-element array of
doubles, at most.
H: It's aligned for 'double'. What does 'sizeof foo / sizeof (double)'
yield?
I: You have to cast to 'double(*)[sizeof foo / sizeof (double)]' first!

That is, someone might have an expectation that pointer arithmetic is
well-defined without any sign of array types anywhere, or with an
unrelated array-type. We do have the beautiful footnote 91 in
'n1256.pdf' near 6.5.6... But it might be nice if this was crystal
clear from a C Standard. Something like:

"An addressable object or an addressable contiguous range of objects
shall be considered to be an array object for any element type, only if
the object or range is suitably aligned for such an element type; the
count of elements in the array is the maximum number of contiguous whole
elements that would consume less than or equal to the size of the object
or range.(footnote)

(footnote) It follows from this and the alignment and size of 'char'
that any non-bit-field object not having 'register' storage-class can be
treated as 'array of char'."
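
Editorial sketch of the two pieces of that suggested wording, using the union from the example. WHOLE_ELEMENTS is a hypothetical macro that computes the element count the wording would assign. zero_bytes is a hypothetical helper showing the byte-wise view from the footnote, using unsigned char, which the existing Standard already permits for any object. Whether arithmetic on 'pure' beyond one element is well-defined is exactly the open question, so the sketch does not rely on it.

```c
#include <assert.h>
#include <stddef.h>

union forty_two_doubles {
    double d;
    char space[sizeof (double) * 42];
};

/* Number of whole objects of type T that fit in obj: the element count
   the suggested wording would give the "array of T" view of obj. */
#define WHOLE_ELEMENTS(obj, T) (sizeof (obj) / sizeof (T))

/* Treats the object as an array of bytes and clears each one. */
static void zero_bytes(void *obj, size_t size)
{
    unsigned char *bytes = obj;
    size_t i;

    assert(obj != NULL);
    for (i = 0; i < size; ++i)
        bytes[i] = 0;
}
```

For a 'union forty_two_doubles foo', WHOLE_ELEMENTS(foo, double) is at least 42, because a union is at least as large as its largest member.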

Unfortunately, such might warrant additional definition for what array
object a pointer points into in the case of multi-dimensional arrays.
For example, a pointer pointing one-past: If it also happens to point to
an element of the same pointed-to type, can we then add a non-zero
integer to this pointer without overflow? Ought the answer to be
implementation-defined, so programmers can find out?
 
Uno

I could be mistaken, but my impression is that the people responsible
for C Standards take votes and debate "ought" relative to a variety of
costs and potential benefits. Perhaps that's all Mr. K. Thompson meant
with that "should".

I suspect about the same. I think when Keith says "the standard should
say," he means, "x, y, z, or I advocate that the standard say." I
imagine Keith to be one of those guys who is very reasonable at meetings
like these.
 
Keith Thompson

Shao Miller said:
D'oh. Uno suggested votes. I'm genuinely curious as to who might
perceive benefit from a theoretical 'OBJECT_MAX' integer constant
expression which yields the maximum size (in bytes/chars) of an object
which the implementation guarantees well-defined behaviour for. It
might help Mr. J. Kelly's 'trim', and has my vote.

I vaguely recall seeing a proposed amendment or extension to C that
had such a constant.

But it wouldn't really help trim() at all. Suppose OBJECT_MAX is,
say, 2**24-1. That still doesn't let you avoid undefined behavior
if you call trim() with a pointer to a 1024-byte object containing no
zero bytes. As soon as trim() goes beyond the end of the object,
however small it is, the behavior is undefined.
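
Editorial sketch: the trim() under discussion is not shown in this thread, so the following is a hypothetical variant rather than Mr. Kelly's code. It assumes the caller can pass the capacity of the buffer, which is the only bound that keeps the scan inside the object. A global constant could not serve that purpose.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bounded trim: removes trailing whitespace from the string
   stored in buf, never reading more than cap bytes. Returns the new
   length, or (size_t)-1 if no terminating zero byte lies within cap. */
static size_t trim_right_n(char *buf, size_t cap)
{
    const char *end;
    size_t len;

    assert(buf != NULL);
    end = memchr(buf, '\0', cap);   /* search stays inside the buffer */
    if (end == NULL)
        return (size_t)-1;          /* unterminated: refuse to walk on */

    len = (size_t)(end - buf);
    while (len > 0 && isspace((unsigned char)buf[len - 1]))
        --len;
    buf[len] = '\0';
    return len;
}
```

With a 1024-byte buffer containing no zero bytes, trim_right_n(buf, 1024) reports failure instead of running past the end.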
Such could allow for assertions on integers to be added to pointers
and could even help to answer "will my implementation honour this
'calloc' or reject it as absurd before looking for available space?" to
which experts could reply, "You're not using 'OBJECT_MAX', so it's
difficult to know."

Or you could just call calloc() and see what it returns.
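
Editorial sketch of that advice: ask for the storage and inspect the result. The wrapper name zeroed_or_null is hypothetical. The message format assumes the sizes fit in unsigned long, which is only good enough for a diagnostic.

```c
#include <stdio.h>
#include <stdlib.h>

/* Requests count zero-initialised elements of the given size. A null
   result for a non-zero request means the implementation declined it,
   for whatever reason; the caller is told which request was refused. */
static void *zeroed_or_null(size_t count, size_t size)
{
    void *p = calloc(count, size);

    if (p == NULL && count != 0 && size != 0)
        fprintf(stderr, "calloc(%lu, %lu) was refused\n",
                (unsigned long)count, (unsigned long)size);
    return p;
}
```

The caller still has to check the returned pointer and free() it when done.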
It might be easy to simply make this 'SIZE_MAX'.

SIZE_MAX is the maximum value of the *type* size_t.

[...]
"Pure" pointers (to non-array object types) are a popular choice for
working with allocated array objects (whatever that means; we have no
type) from memory management functions, for some reason. Probably
because of their brevity in contrast to pointers to array types.

So a "pure" pointer is merely a pointer to an object type other than
an array type. I'm not sure we need a specific term for that.
Do these "pure" pointers to allocated storage have anything to say
about the number of elements in the array object pointed-to?

No. More precisely, the language provides no way to get the size
of an array from a pointer to its first element. Go beyond the
end of the array, and the language doesn't define what happens.
(The implementation *could* store bounds in the array, and could
even provide non-standard functions to let you retrieve them.)
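
Editorial sketch: since the pointer itself carries no element count, a common approach is to keep the count next to it. The type double_span and the accessor span_get are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* A pointer to the first element together with the element count that
   the language does not let us recover from the pointer alone. */
struct double_span {
    double *first;
    size_t  count;
};

/* Bounds-checked read: stores element i in *out and returns 1, or
   returns 0 without touching memory when i is out of range. */
static int span_get(struct double_span s, size_t i, double *out)
{
    assert(out != NULL);
    if (i >= s.count)
        return 0;
    *out = s.first[i];
    return 1;
}
```

The count has to be supplied by whoever created or allocated the array, since that is the only place the information exists.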

[...]
 
lawrence.jones

Keith Thompson said:
Note that SIZE_MAX is an upper bound on the value that can be yielded by
sizeof, not necessarily the least upper bound. There is no defined
constant (that I know of) that actually tells you the maximum size of an
object, which could be substantially smaller than SIZE_MAX.

In the Bounds-checking interfaces Technical Report (TR 24731-1), which
has been incorporated into the C1X draft as Annex K, there's an
RSIZE_MAX macro that is intended to serve this purpose.
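
Editorial sketch of how a program might pick up RSIZE_MAX when Annex K is present. The feature-test macros __STDC_LIB_EXT1__ and __STDC_WANT_LIB_EXT1__ come from Annex K itself. The fallback of SIZE_MAX >> 1 is an assumption made for this sketch, and SANE_SIZE_LIMIT and size_is_plausible are hypothetical names.

```c
#define __STDC_WANT_LIB_EXT1__ 1   /* request the Annex K declarations */
#include <stddef.h>
#include <stdint.h>

#if defined(__STDC_LIB_EXT1__) && defined(RSIZE_MAX)
/* The implementation supplies its own idea of a sane object size. */
#define SANE_SIZE_LIMIT ((size_t)RSIZE_MAX)
#else
/* Assumed fallback for this sketch: half the range of size_t. */
#define SANE_SIZE_LIMIT (SIZE_MAX >> 1)
#endif

/* Returns 1 when n is not larger than the chosen limit. A huge n often
   comes from a negative number converted to an unsigned type. */
static int size_is_plausible(size_t n)
{
    return n <= SANE_SIZE_LIMIT;
}
```

Many implementations do not define __STDC_LIB_EXT1__, in which case only the fallback branch is compiled.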
 
Peter Nilsson

Shao Miller said:
I'm genuinely curious as to who might
perceive benefit from a theoretical 'OBJECT_MAX' integer
constant expression which yields the maximum size
(in bytes/chars) of an object which the implementation
guarantees well-defined behaviour for.

It can't guarantee well-defined behaviour. Implementations
are only required to support one specific program (of their
choosing) that meets a set of minimum requirements. It
simply isn't possible to force every implementation to
conform to any otherwise legal program that contains an
instance of a specific limit.
It might help Mr. J. Kelly's 'trim', and has my vote.

There is no need for OBJECT_MAX or SIZE_MAX in a trim()
function.
Such could allow for assertions on integers to be added
to pointers and could even help to answer "will my
implementation honour this 'calloc' or reject it as absurd
before looking for available space?" to which experts could
reply, "You're not using 'OBJECT_MAX', so it's difficult
to know."

It might be easy to simply make this 'SIZE_MAX'.

Better still, PTRDIFF_MAX. [If you recall, it was pointer
differences that started the sub-threads.] But both are unfair
on implementations that support (say) 16 and 32-bit integers,
but only have an 8MB address space. They shouldn't limit size_t
to 16-bit, just because they can't guarantee support for a
program allocating a 2GB object.
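
Editorial sketch of why PTRDIFF_MAX matters for pointer differences: if an array has more elements than ptrdiff_t can represent, subtracting two pointers into it may not yield a representable result. The helper name count_fits_ptrdiff is hypothetical. The comparison is done in uintmax_t so that neither operand is truncated, whichever of the two types is wider.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 when an element count can also be expressed as a ptrdiff_t,
   so that the difference between any two pointers into an array of that
   many elements is representable. */
static int count_fits_ptrdiff(size_t count)
{
    return (uintmax_t)count <= (uintmax_t)PTRDIFF_MAX;
}
```

As the post above notes, passing this check says nothing about whether the implementation can actually provide an object that large.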
 
Shao Miller

Peter said:
It can't guarantee well-defined behaviour. Implementations
are only required to support one specific program (of their
choosing) that meets a set of minimum requirements. It
simply isn't possible to force every implementation to
conform to any otherwise legal program that contains an
instance of a specific limit.

I apologize if I have not communicated clearly. I did not intend to
suggest that an implementation _prevent_ undefined behaviour based on
some arbitrary size. I meant to suggest that _an_ object _with_ that
arbitrary size be guaranteed with well-defined operations.

That is, I did not mean that pointer addition should not overflow until
reaching some fixed limit for all objects. I would not appreciate that
at all.
There is no need for OBJECT_MAX or SIZE_MAX in a trim()
function.

I disagree. Multiple posters have explained that the moment 'trim'
walks beyond what genuinely constitutes the object, the behaviour is
undefined according to the C Standard, and an infinite loop is a valid
possible outcome.

Granting that, Mr. J. Kelly then seems to narrow his 'trim' target
implementations as being those which specifically do not result in an
infinite loop simply due to the aforementioned UB. Thus, he requires
some arbitrary limit to compare against; preferably one which would
further reduce the chance(s) of an infinite loop. A maximum array size
that an implementation would "normally" support seems like just the thing.
Such could allow for assertions on integers to be added
to pointers and could even help to answer "will my
implementation honour this 'calloc' or reject it as absurd
before looking for available space?" to which experts could
reply, "You're not using 'OBJECT_MAX', so it's difficult
to know."

It might be easy to simply make this 'SIZE_MAX'.

Better still, PTRDIFF_MAX. [If you recall, it was pointer
differences that started the sub-threads.]

Either is useful. Both are missing in C89, which I believe that Mr.
Kelly has expressed a desire to target in at least some regards.
But both are unfair
on implementations that support (say) 16 and 32-bit integers,
but only have an 8MB address space. They shouldn't limit size_t
to 16-bit, just because they can't guarantee support for a
program allocating a 2GB object.

Regardless of what type 'size_t' is, the macro 'SIZE_MAX' is the "limit
of size_t" and the "maximum limit" for the "integer type". Does that
require that a value greater than 'SIZE_MAX' cannot be stored into a
'size_t'? That seems to be what Mr. K. Thompson has suggested. Do you
suggest the same, that the macro cannot be an artificial limiter
projected onto some integer type? "Limit" and "value" are different
words, but equivalence in this context could be perceived as a fair
interpretation.
 
Shao Miller

In the Bounds-checking interfaces Technical Report (TR 24731-1), which
has been incorporated into the C1X draft as Annex K, there's an
RSIZE_MAX macro that is intended to serve this purpose.

Shao Round to Dr. Jones: C1X cheats very big![1]

[1] A motion picture reference. Yet another C1X bonus! Thanks for
pointing it out!
 
lawrence.jones

Shao Miller said:
In the Bounds-checking interfaces Technical Report (TR 24731-1), which
has been incorporated into the C1X draft as Annex K, there's an
RSIZE_MAX macro that is intended to serve this purpose.

Shao Round to Dr. Jones: C1X cheats very big![1]

[1] A motion picture reference. Yet another C1X bonus! Thanks for
pointing it out!

I should note, however, that the bounds checking interfaces are optional
(a "conditional feature" in standardese), so implementations don't have
to support them.
 
