size_t or int for malloc-type functions?

Keith Thompson · Jan 8, 2007

CBFalconer said:
Richard said:

CBFalconer said:

It's a conclusion that has to be drawn, because it does say
somewhere that size_t can describe the size of any object.


I looked for that, but all I found was that size_t is the type
returned by sizeof, and that cannot be applied to objects
allocated by [mc]alloc().


Regardless, I believe we can imply that conclusion. C can only
operate on objects. The size of an object is garnered by sizeof,
and is of type size_t. malloc doesn't return generic objects, it
returns pointers.

I disagree. Yes, malloc and calloc return pointers, but a (non-null)
pointer returned by *alloc points to (the first byte of) an object,
i.e. a "region of data storage in the execution environment, the
contents of which can represent values".

Given:

int *ptr = malloc(10 * sizeof *ptr);
assert(ptr != NULL);

*ptr is an object (of type int), and the full allocated space can be
treated as an object of type array of int; the latter cannot have
sizeof applied to it.

I think you're arguing on the basis of what (in your opinion) makes
sense, rather than what the standard actually says.

Richard Tobin · Jan 8, 2007

I looked for that, but all I found was that size_t is the type
returned by sizeof, and that cannot be applied to objects
allocated by [mc]alloc().


Regardless, I believe we can imply that conclusion. C can only
operate on objects. The size of an object is garnered by sizeof,
and is of type size_t. malloc doesn't return generic objects, it
returns pointers.

I didn't say it returned them. Are you suggesting that the pointer
returned by malloc (or calloc) doesn't point to an object?

-- Richard

kuyper · Jan 8, 2007

CBFalconer said:
Richard said:

CBFalconer said:

It's a conclusion that has to be drawn, because it does say
somewhere that size_t can describe the size of any object.


I looked for that, but all I found was that size_t is the type
returned by sizeof, and that cannot be applied to objects
allocated by [mc]alloc().


Regardless, I believe we can imply that conclusion.

It is certainly possible to imply it; it's been implied frequently by
many different people. The question is whether it's possible to infer
it. I've not seen valid arguments inferring that conclusion from actual
citations from the C standard.

... The size of an object is garnered by sizeof,

Citation please, for the applicability of sizeof to dynamically
allocated objects?

kuyper · Jan 8, 2007

Steve said:
With perfect 20/20 hindsight I agree, but...

I'd be much more inclined to agree with this if the Standard
explicitly said

If the product nmemb * size cannot be represented as a
size_t, the calloc function returns NULL.

I wish the Standard did say that. This is really a grey area, a
longstanding source of bugs and misunderstanding. (See e.g. the
related c.l.c. FAQ 7.16.)

It doesn't matter whether nmemb*size can be represented in a size_t;
the relevant issue is whether or not calloc() can allocate enough
memory for nmemb objects of the specified size. If it can't it must
return NULL; returning a pointer to enough memory to store nmemb*size
bytes doesn't meet calloc()'s required behavior, if nmemb*size is
smaller (by reason of unsigned wrap-around) than the size actually
requested.

Without the explicit statement (and as the arguments presented by
several knowledgeable posters in this thread prove), I think an
implementor could be excused for assuming that the Standard's
intent was

If the product nmemb * size cannot be represented as a
size_t, the behavior is undefined.

...although since the Standard doesn't say *that* explicitly,
either, it's sort of doubly or meta undefined!

Undefined behavior due to the absence of a definition only applies
when there is in fact an absence of a definition. When the standard
describes, for instance, the multiplication operator, it doesn't say
anything about how the results might vary from one day of the week to
the next. That doesn't mean that the behavior of 2*2 is undefined on
Wednesdays. The standard does provide a definition of the behavior of
the multiplication operator, which applies equally well to any day of
the week.

The standard says very clearly what calloc() must do if it cannot
allocate the requested amount of space. It doesn't provide exemptions
for the case where a naive approach to calculating the amount of space
needed would give a wrong result. The implication is not that the
behavior of calloc() is undefined in that case; the implication is that
calloc() is not allowed to use that naive method for calculating the
amount of space needed.
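
As a rough illustration of the point above, a calloc()-style function can
reject a request whose true size does not fit in size_t instead of
multiplying blindly. This is only a sketch: checked_calloc is a
hypothetical name, it is layered on malloc() and memset(), and it is not
taken from any real library.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper for illustration; not any real library's calloc(). */
void *checked_calloc(size_t nmemb, size_t size)
{
    /* The mathematical product nmemb * size would exceed SIZE_MAX,
       so the requested amount of space cannot be allocated. */
    if (size != 0 && nmemb > SIZE_MAX / size)
        return NULL;

    size_t total = nmemb * size;          /* cannot wrap after the check */
    void *p = malloc(total ? total : 1);  /* sidestep malloc(0) variations */
    if (p != NULL)
        memset(p, 0, total);              /* all-bits-zero, as calloc() does */
    return p;
}
```

With this check in place, checked_calloc(SIZE_MAX, 2) returns NULL rather
than a pointer to a block of SIZE_MAX - 1 bytes.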

kuyper · Jan 8, 2007

CBFalconer said:
Keith Thompson wrote: ....

Consider the implications. We are always yelling at people for not
remembering the size they allocated. If a size_t can't hold that
size, where are they to remember it? ...

Since the original memory allocation request was made using two size_t
values, the size of memory requested can be kept track of using two
size_t values.
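
One way to picture this is to keep the two arguments alongside the
pointer rather than their product. The struct and function names below
are hypothetical, made up purely for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record: remembers the request as two size_t values,
   so the total never has to fit in a single size_t. */
struct allocation {
    void  *ptr;     /* result of calloc(), possibly NULL */
    size_t nmemb;   /* number of elements requested */
    size_t size;    /* size of each element in bytes */
};

struct allocation allocation_new(size_t nmemb, size_t size)
{
    struct allocation a;
    a.ptr = calloc(nmemb, size);
    a.nmemb = nmemb;
    a.size = size;
    return a;
}
```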

... After all, size_t is
guaranteed to hold the size of any object.

Citation, please? Several people have made that claim, or closely
related claims, and I've repeatedly requested supporting citations;
none has yet been provided.

Keith Thompson · Jan 8, 2007

With perfect 20/20 hindsight I agree, but...

I'd be much more inclined to agree with this if the Standard
explicitly said

If the product nmemb * size cannot be represented as a
size_t, the calloc function returns NULL.

I wish the Standard did say that. This is really a grey area, a
longstanding source of bugs and misunderstanding. (See e.g. the
related c.l.c. FAQ 7.16.)

Without the explicit statement (and as the arguments presented by
several knowledgeable posters in this thread prove), I think an
implementor could be excused for assuming that the Standard's
intent was

If the product nmemb * size cannot be represented as a
size_t, the behavior is undefined.

...although since the Standard doesn't say *that* explicitly,
either, it's sort of doubly or meta undefined!

I agree that this kind of confusion probably explains why some
implementations have gotten this wrong. I don't agree that the
standard is at all ambiguous if you read it carefully. An implementer
can be excused for making this error, but it is an error (and some
implementers have managed to avoid it).

Spiros Bousbouras · Jan 8, 2007

Peter said:
They invented Fields. But note that the mathematical notion of
multiplicative inverses in Fields does not correspond to integer
division with rounding.

Division by zero is usually excluded. So division is rarely closed
in any case.

If by "division" we mean an operation which is roughly
speaking the inverse of multiplication then it may not
exist at all. The fact that "division by zero" is not defined
does not mean that it's not closed, it means that it's not
an operation in the algebraic sense. Not closed means
that a/b may not be an element of the set we're talking about.

That's too strong a statement. It is trivial to generate Rings whose
elements are normal integers, but whose addition and multiplication
functions are entirely different to the ordinary arithmetic of
integers. The elements are really just labels, it's the function
mappings that determine what is a ring. C's unsigned arithmetic uses
particular mappings that are just one example.

Yes, but it's unnecessary to consider rings in general in order to
understand unsigned integers.

Hear, hear for that. All the discussion about
rings does is to obfuscate the issue. It is also
off topic.

The important part is the notion that in the abstract, the
operators + and * are just function mappings, not 'computations'.
In other words + and * are in essence just pure lookup tables.

One of the 2 words in "function mappings" is redundant.
"Function" and "mapping" are usually synonymous. In
mathematical parlance "operator" generally also means
function. In a ring R, + and * are functions from the cartesian
product RxR into R. Using the usual functional notation we
could write +((a,b)) and *((a,b)) for any elements of the ring a,b
but for historical reasons we write instead a+b and a*b
respectively.

For unsigned integers, the mapping/table is 'onto', that is, for every
pair (a,b) there exists a corresponding element.

That's not what "onto" means. Just by virtue of the fact
that + and * are functions we know that for every pair
(a,b) there's going to be an image under + and an image
under * ie a+b and a*b are both going to be defined. Addition
being onto would mean that for every unsigned integer c there
are unsigned integers a and b such that c=a+b. This is trivially true.
Multiplication being onto would mean that for every unsigned
integer c there are unsigned integers a and b such that c=a*b. This
is also trivially true.

Signed integer
operators are not onto since there are pairs that do not have
mappings.

No, addition and multiplication are not binary operations
on the set of signed integers of some platform. Since they
are not even functions defined on the set of ordered pairs
of signed integers it is meaningless to ask if they are onto
functions.

Because of that, C's signed integer arithmetic is not
a ring. However, the typical two's complement implementation
completes the mapping and does form a ring.

I don't see what guarantees a two's complement representation
offers in the case of overflow.

Keith Thompson · Jan 8, 2007

Spiros Bousbouras said:
Peter Nilsson wrote: [...]

Because of that, C's signed integer arithmetic is not
a ring. However, the typical two's complement implementation
completes the mapping and does form a ring.


I don't see what guarantees a two's complement representation
offers in the case of overflow.

A two's complement *representation* doesn't. A "typical" two's
complement *implementation*, in which, for example, INT_MAX+1 yields
INT_MIN, does. (The standard makes this undefined behavior; the
typical wraparound behavior is one of many possible manifestations of
that undefined behavior.)

jacob navia · Jan 8, 2007

Spiros Bousbouras a écrit :

Hear, hear for that. All the discussion about
rings does is to obfuscate the issue. It is also
off topic.

Exactly. If you see my message, it meant that the
discussion proposed to avoid

c = malloc(a*b)

because of the overflow problems, and proposed to
change that to

c = calloc(a,b);

Then people started arguing that "overflow doesn't exist"
etc etc.

I think a consensus would be to accept calloc as a "safer"
alternative to just multiplying without overflow check.

Spiros Bousbouras · Jan 8, 2007

jacob said:
Spiros Bousbouras a écrit :

Exactly. If you see my message, it meant that the
discussion proposed to avoid

c = malloc(a*b)

because of the overflow problems, and proposed to
change that to

c = calloc(a,b);

Then people started arguing that "overflow doesn't exist"
etc etc.

I think a consensus would be to accept calloc as a "safer"
alternative to just multiplying without overflow check.

I have only read parts of the thread here and there so it
may have been explained already but why is calloc(a,b)
safer? Generally you don't know that the calloc implementation
checks for wraparound. Even if you have seen the source
code of the calloc you're using and know that it checks for
wraparound you can't know that a newer version of the
library will still do the same. It seems to me that if a and b
are large enough to make wraparound probable then the
only safe option is to check for that before calling malloc()
or calloc().
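
A caller-side check along these lines might look like the following
sketch. The helper names are hypothetical; the test is the usual
division-based one, which never performs the multiplication that could
wrap.

```c
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if a * b fits in size_t, 0 if it would wrap around. */
int size_mul_fits(size_t a, size_t b)
{
    return a == 0 || b <= SIZE_MAX / a;
}

/* Hypothetical helper: allocate room for n elements of s bytes each,
   or return NULL without calling malloc() if n * s would wrap. */
void *malloc_array(size_t n, size_t s)
{
    if (!size_mul_fits(n, s))
        return NULL;
    return malloc(n * s);
}
```

Because the check lives in the caller's own code, it does not depend on
what a particular library's calloc() happens to do.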

Keith Thompson · Jan 8, 2007

Spiros Bousbouras said:
jacob navia wrote: [...]

Exactly. If you see my message, it meant that the
discussion proposed to avoid

c = malloc(a*b)

because of the overflow problems, and proposed to
change that to

c = calloc(a,b);

Then people started arguing that "overflow doesn't exist"
etc etc.

I think a consensus would be to accept calloc as a "safer"
alternative to just multiplying without overflow check.


I have only read parts of the thread here and there so it
may have been explained already but why is calloc(a,b)
safer? Generally you don't know that the calloc implementation
checks for wraparound. Even if you have seen the source
code of the calloc you're using and know that it checks for
wraparound you can't know that a newer version of the
library will still do the same. It seems to me that if a and b
are large enough to make wraparound probable then the
only safe option is to check for that before calling malloc()
or calloc().

Because a calloc() implementation that blindly multiplies its two
arguments without checking whether the result wraps around is
non-conforming. If the user writes malloc(a*b), and the
multiplication wraps around, malloc() *must* attempt to allocate the
number of bytes specified by the (possibly wrapped) result of a*b; if
the user writes calloc(a, b), calloc() must attempt to allocate enough
memory for "a" objects of size "b" bytes each, and
must return NULL if it fails to do so. (But some calloc()
implementations do fail to check for wraparound, and are therefore
non-conforming, so in practice using calloc() isn't that much safer.)

But I question jacob's use of the word "consensus". I've been
following this discussion, and I haven't seen anyone else agree that
using calloc() is the proper solution; it imposes the additional
overhead of initializing the allocated memory to all-bits-zero, which
could be significant for the large allocations we're talking about.

Using calloc() rather than malloc() in this case merely detects the
error of attempting to allocate more than SIZE_MAX bytes. Detecting
errors is good, but avoiding them in the first place is better, and
using calloc() doesn't avoid the error, it merely diagnoses it better.

Richard · Jan 9, 2007

Spiros Bousbouras said:
I have only read parts of the thread here and there so it
may have been explained already but why is calloc(a,b)
safer?

Because calloc takes care of the math and necessary wraparound
checks. If your multiplication wraps before the call to malloc then
malloc will try to allocate that "corrupt" number of bytes.

Keith Thompson · Jan 9, 2007

Keith Thompson said:
Because a calloc() implementation that blindly multiplies its two
arguments without checking whether the result wraps around is
non-conforming. If the user writes malloc(a*b), and the
multiplication wraps around, malloc() *must* attempt to allocate the
number of bytes specified by the (possibly wrapped) result of a*b; if
the user writes calloc(a, b), calloc() must attempt to allocate enough
memory for "a" objects of size "b" bytes each, and
must return NULL if it fails to do so. (But some calloc()
implementations do fail to check for wraparound, and are therefore
non-conforming, so in practice using calloc() isn't that much safer.)

On re-reading the above, I see that it was potentially unclear.

calloc(a, b) must attempt to allocate a*b bytes, where a*b denotes the
result of multiplying a by b *without* reducing the result modulo
SIZE_MAX+1. For example, given:

size_t a = SIZE_MAX;
size_t b = SIZE_MAX;

a * b == 1 when the multiplication is done in type size_t, but if
calloc(SIZE_MAX, SIZE_MAX) allocates only 1 byte, it's non-conforming.
On the other hand, malloc(SIZE_MAX * SIZE_MAX) *must* attempt to
allocate 1 byte, because the wraparound occurs before malloc() is
called.
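
The arithmetic in this example can be checked directly. In the sketch
below (function names are hypothetical), wrapped_product computes what
malloc(a * b) would actually receive, and product_wrapped detects after
the fact that the result was reduced. It assumes size_t is at least as
wide as unsigned int, which holds on mainstream platforms.

```c
#include <stddef.h>
#include <stdint.h>

/* The value malloc(a * b) receives: the product reduced modulo
   SIZE_MAX + 1, because the multiplication is done in type size_t. */
size_t wrapped_product(size_t a, size_t b)
{
    return a * b;
}

/* Returns 1 if a * b was reduced, i.e. the true product exceeds SIZE_MAX.
   If no reduction happened, dividing the product by a gives back b. */
int product_wrapped(size_t a, size_t b)
{
    return a != 0 && wrapped_product(a, b) / a != b;
}
```

Here wrapped_product(SIZE_MAX, SIZE_MAX) is 1, since
(2^N - 1)^2 = 2^(2N) - 2^(N+1) + 1, which leaves remainder 1 modulo 2^N.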

CBFalconer · Jan 9, 2007

Richard said:
CBFalconer said:

I looked for that, but all I found was that size_t is the type
returned by sizeof, and that cannot be applied to objects
allocated by [mc]alloc().


Regardless, I believe we can imply that conclusion. C can only
operate on objects. The size of an object is garnered by sizeof,
and is of type size_t. malloc doesn't return generic objects, it
returns pointers.


I didn't say it returned them. Are you suggesting that the pointer
returned by malloc (or calloc) doesn't point to an object?

No, I am saying it doesn't create objects. It creates space into
which you may stuff an object, and returns a pointer to that
space. It isn't a valid object until it is stuffed (possibly by
calloc).

Please don't strip attributions for material you quote.

CBFalconer · Jan 9, 2007

Richard said:
Richard said:

It's a conclusion that has to be drawn, because it does say
somewhere that size_t can describe the size of any object.

I looked for that, but all I found was that size_t is the type
returned by sizeof, and that cannot be applied to objects
allocated by [mc]alloc().


Regardless, I believe we can imply that conclusion.


It is certainly possible to imply it; it's been implied frequently by
many different people. The question is whether it's possible to infer
it. I've not seen valid arguments inferring that conclusion from actual
citations from the C standard.

... The size of an object is garnered by sizeof,


Citation please, for the applicability of sizeof to dynamically
allocated objects?

struct foo { whatever };
struct foo *p;
p = malloc(sizeof *p);
size = sizeof(*p);

Note that you can't apply sizeof to void.

Douglas A. Gwyn · Jan 9, 2007

Jun Woong said:
if (n == 0 || ((size_t) -1) / n >= s)

I'd like to request that people please not assume that -1
is represented by all 1 value bits.
((size_t)0 - 1) is one nice way to get all 1 value bits of
the desired width. You could also use ~(size_t)0.

Keith Thompson · Jan 9, 2007

CBFalconer said:
Richard said:

CBFalconer said:

I looked for that, but all I found was that size_t is the type
returned by sizeof, and that cannot be applied to objects
allocated by [mc]alloc().


Regardless, I believe we can imply that conclusion. C can only
operate on objects. The size of an object is garnered by sizeof,
and is of type size_t. malloc doesn't return generic objects, it
returns pointers.


I didn't say it returned them. Are you suggesting that the pointer
returned by malloc (or calloc) doesn't point to an object?


No, I am saying it doesn't create objects. It creates space into
which you may stuff an object, and returns a pointer to that
space. It isn't a valid object until it is stuffed (possibly by
calloc).

But it does create objects. An object is a "region of data storage in
the execution environment, the contents of which can represent values"
(C99 3.14); the space allocated by malloc() or calloc() qualifies.

If you're saying it's not an object before a value is assigned to it,
but becomes one after a value has been assigned to it, consider this:

{
int obj;
}

Is "obj" an object? I say it is, even if it's uninitialized.

Keith Thompson · Jan 9, 2007

CBFalconer said:
(e-mail address removed) wrote: [...]

Citation please, for the applicability of sizeof to dynamically
allocated objects?


struct foo { whatever };
struct foo *p;
p = malloc(sizeof *p);
size = sizeof(*p);

Note that you can't apply sizeof to void.

Good point.

The general case of an array object allocated by calloc() is more
complicated. It's straightforward if the number of elements is a
constant:

struct foo *p = calloc(10, sizeof *p);
size = sizeof (struct foo[10]);

but more generally:

struct foo *p = calloc(n, sizeof *p);
size = sizeof(struct foo[n]);

I believe the latter is legal; "struct foo[n]" is a VLA type. And if
n * sizeof(struct foo) exceeds SIZE_MAX, I suppose applying sizeof
invokes undefined behavior.

But I don't believe UB is invoked *until* you attempt to apply sizeof.
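
For what it is worth, the VLA form can be tried out for small n. This
sketch assumes a compiler that supports variable length arrays (required
in C99, optional from C11 on); the members of struct foo and the
function name are placeholders, not anything from the posts above.

```c
#include <stddef.h>

struct foo { int id; double value; };   /* placeholder members */

/* Size in bytes of an array of n struct foo, computed via a VLA type.
   n must be greater than zero; the result is only meaningful when
   n * sizeof(struct foo) fits in size_t. */
size_t foo_array_bytes(size_t n)
{
    return sizeof(struct foo[n]);
}
```

For in-range values the result matches n * sizeof(struct foo); the
sketch says nothing about what happens once that product exceeds
SIZE_MAX.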

Keith Thompson · Jan 9, 2007

I'd like to request that people please not assume that -1
is represented by all 1 value bits.
((size_t)0 - 1) is one nice way to get all 1 value bits of
the desired width. You could also use ~(size_t)0.

Can (size_t)-1 and (size_t)0 - 1 differ? If so, how?

Richard Heathfield · Jan 9, 2007

Douglas A. Gwyn said:

I'd like to request that people please not assume that -1
is represented by all 1 value bits.

Are people assuming that? I thought they were deducing from the reduction
rule for unsigned types that (size_t)-1 is the largest possible value of a
size_t, and *therefore*, with unsigned integer types being pure binary
representations, it is a size_t with every value bit set (i.e. "all 1 value
bits" is a consequence, not a cause).

((size_t)0 - 1) is one nice way to get all 1 value bits of
the desired width. You could also use ~(size_t)0.

Could you please explain how (size_t)-1 might *not* have all value bits set,
in a conforming implementation?
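
The three spellings can be compared directly. Conversion of -1 to an
unsigned type is defined in terms of values (one more than the maximum
is added), not bit patterns, so the cast form yields SIZE_MAX however
negative numbers are represented. The function names are hypothetical,
and the ~ form assumes size_t is at least as wide as unsigned int.

```c
#include <stddef.h>
#include <stdint.h>

/* -1 converted to size_t: reduced modulo SIZE_MAX + 1. */
size_t max_by_cast(void)       { return (size_t)-1; }

/* Unsigned subtraction wrapping around from zero. */
size_t max_by_subtract(void)   { return (size_t)0 - 1; }

/* Bitwise complement of zero: every bit set. */
size_t max_by_complement(void) { return ~(size_t)0; }
```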
