Black magic, or insanity?

Keith Thompson

Robbie Brown said:
Helmut Tessarek said:
On 21.01.14 7:33 , Robbie Brown wrote:
check out my mail signature. it will also answer your question.

[...]
/*
Thou shalt not follow the NULL pointer for chaos and madness
await thee at its end.
*/

Good advice, but not actually relevant in this case.

The OP *expected* a segmentation fault on dereferencing a null pointer.
The problem was that the pointer object in question was uninitialized,
and therefore might or might not contain a null pointer value.

Yes, I'm starting to get the impression that, unlike other languages I
have used, C (or rather the C compiler perhaps) doesn't stop you from
doing all manner of exceptionally stupid things.

Yes. Another interesting, um, feature of C is that the syntax is what I
think of as "dense". What that means is that a single-character typo in
an otherwise correct C program can easily produce something that's
perfectly correct as far as the compiler is concerned, but has
completely different behavior.
For example, for no other reason than experimentation I tried to get my
head around pointers to pointers and came up with the following.
Trying hard not to make assumptions, just observations.

[Linux 3.2.0-23-generic x86_64 GNU/Linux]

int **arpi = (int**) malloc(sizeof(int*) * 5);

A good idiom for malloc that mostly avoids type mismatches is:

int **arpi = malloc(5 * sizeof *arpi);

Casting the result of malloc is unnecessary and can mask errors in some
cases. Applying sizeof to *arpi (more generally, to what the LHS points
to) ensures that you have the correct size and type without having to
repeat the type name.
*(arpi + 4) = malloc(sizeof(int));

Probably better written as:

arpi[4] = malloc(sizeof *(arpi[4]));
*(*(arpi + 4)) = 14;

*(arpi[4]) = 14;
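
Putting those pieces together, a minimal sketch of the idiom might look like this. The helper names new_int and new_slots are made up for illustration and are not part of the original program:

```c
#include <stdlib.h>

/* Hypothetical helper: allocate one int and store `value` in it.
   Returns NULL if the allocation fails. */
int *new_int(int value)
{
    int *p = malloc(sizeof *p);   /* size comes from the pointee type */
    if (p != NULL)
        *p = value;
    return p;
}

/* Hypothetical helper: allocate `count` pointer slots with the same idiom.
   The slots are left uninitialized. */
int **new_slots(size_t count)
{
    int **slots = malloc(count * sizeof *slots);
    return slots;
}
```

Because the size is always derived from what the pointer points to, changing the element type later only requires changing the declarations.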
If I run this through gdb I can see what I expected to see (there's that
word again, what other word can I use?).

arpi is a pointer to the first of 5 64 bit addresses.
the first 4 addresses contain 0x0000000000000000 I hope I understand
that these are uninitialized addresses ... or maybe they have been
initialized to 0 by some voodoo priest :) anyway

malloc returns a pointer to uninitialized memory. The contents might
happen to be all bits zero, but that's not guaranteed, and you shouldn't
rely on it. And the null pointer is very commonly represented as
all-bits-zero, but that's not guaranteed either.
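
If the slots need to start out as null pointers, one option is to assign NULL to each of them explicitly. The following is only a sketch, and the function name alloc_null_slots is an assumption made for the example:

```c
#include <stdlib.h>

/* Hypothetical helper: allocate `count` pointer slots and set each one
   to a null pointer explicitly, without relying on the allocated memory
   happening to be all-bits-zero. Returns NULL if malloc returns NULL. */
int **alloc_null_slots(size_t count)
{
    int **slots = malloc(count * sizeof *slots);
    if (slots == NULL)
        return NULL;
    for (size_t i = 0; i < count; i++)
        slots[i] = NULL;
    return slots;
}
```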
the fifth address contains the 64 bit address 0x0000000000602010
this seems reasonable as I malloc'd enough space for a pointer to int.
if I inspect the contents of 0x602010 I see 0x0e which is (I hope) what
I was expecting
Yes.

Then it got all strange again

I changed the first line to
int **arpi = (int**) malloc(sizeof(int) * 5);

now I malloc int instead of int*
Compile, run, inspect, same old results
I think this works because an int is probably 64 bits same as an address
(gross assumption)

Yes, that's the kind of type mismatch that can be avoided by the idiom I
suggested above.
Then it gets weirder
int **arpi = (int**) malloc(0);
Now realistically what should I 'expect' to happen

I sort of expected it not to compile ... wrong, it compiled

I'm not sure why you'd expect it not to compile. malloc is a library
function, not a built-in language feature. It takes an integer argument
(specifically an argument of the unsigned integer type size_t), and
you've called it with an integer value. Even if 0 were not a valid
argument value, it's of the right type (or rather, is implicitly
convertible to the right type), so there's nothing for the compiler to
complain about. The run time behavior may be another matter; as James
Kuyper already explained, the behavior of malloc(0) is
implementation-defined.
I sort of expected it to blow up ... wrong, ran and exited normally
I even found 0x0e lurking about almost where I hoped it would be.

gdb exposed the memory and it was obviously not right but it still ran.

It's likely that malloc(0) allocated some small amount of memory from
the heap (it could have returned a null pointer, but then your program
probably would have crashed). The actual amount of memory allocated for
malloc(N) is likely to be a bit bigger than N, but you can only safely
access the first N bytes (and only if malloc(N) actually succeeded).
But if you try to access memory beyond those first N bytes, you're
*probably* still accessing memory within your program's memory space.
The behavior is undefined, but that doesn't mean it's going to crash;
if you're *unlucky*, it will appear to "work".
 
Kaz Kylheku

Now to me, that just seems perverse. By what strange incantation of
inverse logic was the decision made to use a request for 0 bytes of
memory as meaning 'give me anything but 0 bytes'.

The actual logic is "return either null, or return a unique pointer".
The issue is not the number of bytes, but rather the important expectation that
malloc doesn't return the same pointer two or more times (when nothing is freed
in between), unless perhaps it is the null pointer.

Since pointers are basically addresses, the requirement for returning unique
pointers requires a non-zero amount of allocation.

Note that the blocks returned by malloc are often larger than what is
requested, though there isn't any portable way to find out how much larger.
This is done for the sake of alignment of the meta-data structures that
lie between the allocated blocks.

If there is a free-space block after the block you've just allocated, a common
strategy is to put a header structure into that free space, which places it
into a list of other such free space blocks. On many architectures, such a
structure has to be properly aligned since it contains word-sized quantities
like pointers.

Also, some malloc implementations simply have "buckets" of fixed-sizes of
blocks. For instance there might be a bucket for, say, 32 byte objects, one for 48 byte ones, then 64, 92, 128, ...

If you allocate a 49 byte object, you may actually get 64 bytes; you just don't
know.
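
As a rough illustration of the bucket idea, here is a sketch of a size-class lookup. The table of sizes and the name bucket_for are hypothetical; a real allocator chooses its own classes and does not expose them:

```c
#include <stddef.h>

/* Hypothetical size classes; a real allocator picks its own. */
static const size_t bucket_sizes[] = { 32, 48, 64, 96, 128 };

/* Return the size of the smallest bucket that can hold `request` bytes,
   or 0 if the request is larger than every bucket in the table. */
size_t bucket_for(size_t request)
{
    size_t count = sizeof bucket_sizes / sizeof bucket_sizes[0];
    for (size_t i = 0; i < count; i++) {
        if (request <= bucket_sizes[i])
            return bucket_sizes[i];
    }
    return 0;
}
```

With this table, bucket_for(49) yields 64, and even a request for 0 bytes lands in the smallest bucket.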

It is not reasonable to get a 16 byte object when you asked for zero.
I would have thought NULL was the perfect value to return in this case.

Yes, and so did some traditional C library implementors. So when it came time
to standardize the language, it was found that some libraries produced null,
whereas others returned something new.

This was simply captured in the standard: that programs being ported
among implementations could expect either behavior.
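
The uniqueness expectation can be checked with a small probe. This is only a sketch: the name zero_allocs_are_distinct is made up, and the function never dereferences the pointers it gets back:

```c
#include <stdlib.h>

/* Hypothetical probe: returns 1 if two live malloc(0) results are either
   null or distinct from each other, 0 otherwise. */
int zero_allocs_are_distinct(void)
{
    void *a = malloc(0);
    void *b = malloc(0);   /* a has not been freed yet */
    int ok = (a == NULL || b == NULL || a != b);

    free(a);               /* free(NULL) is a no-op, so this is safe either way */
    free(b);
    return ok;
}
```

Either behavior the standard permits, null or a fresh pointer, makes the probe return 1.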
 
Joe Pfeiffer

Robbie Brown said:
Helmut Tessarek said:
On 21.01.14 7:33 , Robbie Brown wrote:
check out my mail signature. it will also answer your question.

[...]
/*
Thou shalt not follow the NULL pointer for chaos and madness
await thee at its end.
*/

Good advice, but not actually relevant in this case.

The OP *expected* a segmentation fault on dereferencing a null pointer.
The problem was that the pointer object in question was uninitialized,
and therefore might or might not contain a null pointer value.

Yes, I'm starting to get the impression that, unlike other languages I
have used, C (or rather the C compiler perhaps) doesn't stop you from
doing all manner of exceptionally stupid things.

Years and years ago I came across ways to shoot yourself in the foot in
various programming languages (in assembly code, you started by building
a gun. In Pascal, you changed your mind and shot yourself in the head
when you realized you couldn't actually accomplish anything useful in
the language. And so forth.). For C, it simply stated "you shoot
yourself in the foot".

For me, that's always been simultaneously C's strongest and weakest
point: it will let you do what you say you want to do without arguing
with you about it.
 
Ken Brody

On 22/01/14 15:10, James Kuyper wrote: [...]
I should have mentioned that malloc(0) returns any non-null pointer

I assume there is a missing "if"? ("... if malloc(0) returns ...")
Now to me, that just seems perverse. By what strange incantation of inverse
logic was the decision made to use a request for 0 bytes of memory as
meaning 'give me anything but 0 bytes'.

I would have thought NULL was the perfect value to return in this case.
I suppose there is a good reason for it but I can't for the life of me think
what it could be. It's almost as if it were *designed* to confuse and
befuddle the unwary neophyte ........ no, surely not?

Consider the fact that, for non-zero lengths, a return of NULL means
failure. If malloc(0) returns NULL, did it really fail? (Valid arguments
can be made for both sides.)

I'm sure that, at the time the Standard was written, there were
implementations on both sides of the argument, and there was no compelling
reason to require one over the other. If there was any change to existing
implementations, it would have been to add the requirement that non-NULL
returns from malloc(0) must be different than any previous non-free()ed
return from malloc(), just as would be the case of non-zero malloc()s.

In short, you can think of "malloc(len)" where len==0 to be no different
than any other malloc(len) call -- if it succeeds, it returns a buffer of
the requested length.
 
James Kuyper

On 22/01/14 15:10, James Kuyper wrote: [...]
I should have mentioned that malloc(0) returns any non-null pointer

I assume there is a missing "if"? ("... if malloc(0) returns ...")
Correct.

...
In short, you can think of "malloc(len)" where len==0 to be no different
than any other malloc(len) call -- if it succeeds, it returns a buffer of
the requested length.

"... of at least the requested length.". malloc(n) is always permitted
to allocate more than n bytes. In the case of malloc(0), a non-null
return value is not only allowed to point at a larger allocation, it is
required to do so.
 
Keith Thompson

James Kuyper said:
On 22/01/14 15:10, James Kuyper wrote: [...]
I should have mentioned that malloc(0) returns any non-null pointer

I assume there is a missing "if"? ("... if malloc(0) returns ...")
Correct.

...
In short, you can think of "malloc(len)" where len==0 to be no different
than any other malloc(len) call -- if it succeeds, it returns a buffer of
the requested length.

"... of at least the requested length.". malloc(n) is always permitted
to allocate more than n bytes. In the case of malloc(0), a non-null
return value is not only allowed to point at a larger allocation, it is
required to do so.

But even if malloc(0) returns a non-null value, it's not necessarily
*quite* the same as a value returned by malloc() with some non-zero
argument:

If the size of the space requested is zero, the behavior is
implementation-defined: either a null pointer is returned, or the
behavior is as if the size were some nonzero value, except that the
returned pointer shall not be used to access an object.

So this:

char *p1 = malloc(1);
if (p1 != NULL) *p1 = 'x';

is well behaved, but this:

char *p0 = malloc(0);
if (p0 != NULL) *p0 = 'x';

has undefined behavior.

A reasonable implementation would probably either return NULL for
malloc(0), or treat malloc(0) as equivalent to malloc(1), but other
behaviors are permitted.
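
Code that wants a null result to mean failure, and nothing else, can sidestep the question by never passing 0 to malloc. A minimal sketch, assuming a hypothetical wrapper named malloc_nonzero:

```c
#include <stdlib.h>

/* Hypothetical wrapper: never passes 0 to malloc, so a null result
   always indicates an allocation failure. A request for 0 bytes is
   treated as a request for 1 byte. */
void *malloc_nonzero(size_t n)
{
    return malloc(n != 0 ? n : 1);
}
```

The cost is one possibly unused byte per zero-size request; in exchange, a non-null result always points to at least one byte that may be accessed.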
 
James Kuyper

But even if malloc(0) returns a non-null value, it's not necessarily
*quite* the same as a value returned by malloc() with some non-zero
argument:

If the size of the space requested is zero, the behavior is
implementation-defined: either a null pointer is returned, or the
behavior is as if the size were some nonzero value, except that the
returned pointer shall not be used to access an object.

So this:

char *p1 = malloc(1);
if (p1 != NULL) *p1 = 'x';

is well behaved, but this:

char *p0 = malloc(0);
if (p0 != NULL) *p0 = 'x';

has undefined behavior.

A reasonable implementation would probably either return NULL for
malloc(0), or treat malloc(0) as equivalent to malloc(1), but other
behaviors are permitted.

Yes, it's permitted to behave like malloc(n) where n is an arbitrary
positive number which could even, in principle, differ between one call
to malloc(0) and another. But every permitted variation for malloc(0)
that involves returning a non-null pointer is correctly described by the
phrase "allocates more than 0 bytes". The as-if rule provides a limited
amount of protection - the memory need not actually be allocated, since
the pointer cannot be safely used to access that memory. However, the
address returned must not point to memory allocated for any other
purpose that is visible from the user code, which is almost the same thing.
 
Eric Sosman

[...]
But even if malloc(0) returns a non-null value, it's not necessarily
*quite* the same as a value returned by malloc() with some non-zero
argument:

If the size of the space requested is zero, the behavior is
implementation-defined: either a null pointer is returned, or the
behavior is as if the size were some nonzero value, except that the
returned pointer shall not be used to access an object.

So this:

char *p1 = malloc(1);
if (p1 != NULL) *p1 = 'x';

is well behaved, but this:

char *p0 = malloc(0);
if (p0 != NULL) *p0 = 'x';

has undefined behavior.

True, but that's just a special case of

size_t n = ...;
char *pn = malloc(n);
if (pn != NULL) pn[n] = 'x';

... having undefined behavior.
 
Paul N

Yes, I'm starting to get the impression that, unlike other languages I
have used, C (or rather the C compiler perhaps) doesn't stop you from
doing all manner of exceptionally stupid things.

C is derived from BCPL, of which a book co-written by the author of the language (Martin Richards) says "The philosophy of BCPL is not one of the tyrant who thinks he knows best and lays down the law on what is and what is not allowed; rather, BCPL acts more as a servant offering his services to the best of his ability without complaint, even when confronted with apparent nonsense. The programmer is always assumed to know what he is doing and is not hemmed in by petty restrictions."
 
Kaz Kylheku

C is derived from BCPL, of which a book co-written by the author of the
language (Martin Richards) says "The philosophy of BCPL is not one of the
tyrant who thinks he knows best and lays down the law on what is and what is
not allowed; rather, BCPL acts more as a servant offering his services to the

BCPL is completely "typeless"; everything is a word. If you use a word as
a pointer, then it's a pointer. If you use it as a number, it's a number.

C has a comparatively "rich" type system, and its declarations and type
checking are the tyranny the above alludes to.
 
James Kuyper

On 1/22/2014 11:12 AM, Robbie Brown wrote:
On 22/01/14 15:10, James Kuyper wrote:
[...]
I should have mentioned that malloc(0) returns any non-null pointer

I assume there is a missing "if"? ("... if malloc(0) returns ...")
Correct.

...
In short, you can think of "malloc(len)" where len==0 to be no different
than any other malloc(len) call -- if it succeeds, it returns a buffer of
the requested length.

"... of at least the requested length.". malloc(n) is always permitted
to allocate more than n bytes. In the case of malloc(0), a non-null
return value is not only allowed to point at a larger allocation, it is
required to do so.

No, the allocation space may have 0 bytes of user data. Most mallocs
return a block which includes a few bytes before the block as memory
management. This often is exactly the size (or a multiple of) the
alignment requirement for allocations. Thus malloc(0) CAN return a
pointer to 0 bytes of usable memory while still making every allocation
unique.

The standard requires that "If the size of the space requested is zero,
the behavior is implementation-defined: either a null pointer is
returned, or the behavior is as if the size were some nonzero value,
...". This means that, since returning a non-null pointer to a block of
memory 0 bytes long is not permissible behavior for malloc(n) when n is
non-zero, it is therefore not permissible behavior for malloc(0).
However, since you can't access the memory allocated, the as-if rule
probably covers that.

Some mallocs() use other methods of memory management, such as rounding
all allocations up to the next power of 2, and reserving distinct blocks
of memory for each power of two. They can then figure out the size of
each allocation by determining which block it was allocated from, and
therefore don't need to store the allocation's size in a header. Such an
implementation cannot allocate 0 bytes when malloc(0) is called, because
then it could return the same pointer for multiple calls to malloc(0).
That would not be covered by the as-if rule: different allocations of
non-zero amounts of memory cannot have the same starting address,
therefore different calls to malloc(0) are also not allowed to return
equivalent pointer values.
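
To make the rounding concrete, here is a sketch of how such an allocator might map a request to a power-of-two class. The name pow2_class is hypothetical, and mapping a request of 0 to a class of 1 byte is one possible choice that keeps zero-size allocations distinct:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: round a request up to the next power of two.
   A request of 0 gets the smallest class (1 byte), so every allocation
   occupies a non-zero amount of space. Returns 0 if the result would
   not fit in size_t. */
size_t pow2_class(size_t n)
{
    size_t size = 1;           /* smallest class, also used for n == 0 */
    while (size < n) {
        if (size > SIZE_MAX / 2)
            return 0;          /* the next doubling would overflow */
        size *= 2;
    }
    return size;
}
```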
 
Keith Thompson

James Kuyper said:
On 01/22/2014 01:36 PM, Ken Brody wrote:
On 1/22/2014 11:12 AM, Robbie Brown wrote:
On 22/01/14 15:10, James Kuyper wrote:
[...]
I should have mentioned that malloc(0) returns any non-null pointer

I assume there is a missing "if"? ("... if malloc(0) returns ...")

Correct.

...
In short, you can think of "malloc(len)" where len==0 to be no different
than any other malloc(len) call -- if it succeeds, it returns a buffer of
the requested length.

"... of at least the requested length.". malloc(n) is always permitted
to allocate more than n bytes. In the case of malloc(0), a non-null
return value is not only allowed to point at a larger allocation, it is
required to do so.

No, the allocation space may have 0 bytes of user data. Most mallocs
return a block which includes a few bytes before the block as memory
management. This often is exactly the size (or a multiple of) the
alignment requirement for allocations. Thus malloc(0) CAN return a
pointer to 0 bytes of usable memory while still making every allocation
unique.

The standard requires that "If the size of the space requested is zero,
the behavior is implementation-defined: either a null pointer is
returned, or the behavior is as if the size were some nonzero value,
...". This means that, since returning a non-null pointer to a block of
memory 0 bytes long is not permissible behavior for malloc(n) when n is
non-zero, it is therefore not permissible behavior for malloc(0).
However, since you can't access the memory allocated, the as-if rule
probably covers that.

I think the "..." in your quotation hides something critical.

The full sentence is:

If the size of the space requested is zero, the behavior is
implementation-defined: either a null pointer is returned, or
the behavior is as if the size were some nonzero value, except
that the returned pointer shall not be used to access an object.

So the behavior of malloc(0), if it returns a non-null pointer, is *not*
necessarily the same as malloc(n) for some positive n. It can be, but
it can behave differently.

For example, the implementation could maintain a pool of addresses that
point outside the actual memory space, and dole them out only for
malloc(0) calls. As long as they're non-null, unique, and comparable
for equality to other addresses, the implementation is still conforming
(which would not be the case if the "except that" clause weren't there).

Certainly malloc(0) *can* behave exactly like malloc(1), but it doesn't
have to.
 
James Kuyper

I think the "..." in your quotation hides something critical.

The full sentence is:

If the size of the space requested is zero, the behavior is
implementation-defined: either a null pointer is returned, or
the behavior is as if the size were some nonzero value, except
that the returned pointer shall not be used to access an object.

So the behavior of malloc(0), if it returns a non-null pointer, is *not*
necessarily the same as malloc(n) for some positive n. It can be, but
it can behave differently.

For example, the implementation could maintain a pool of addresses that
point outside the actual memory space, and dole them out only for
malloc(0) calls.

That's certainly acceptable, so long as they are doled out with a
spacing of at least 1 byte; which is essentially an allocation of 1
byte, even if the byte itself is never used. However, they can't be
doled out with a spacing of 0 bytes, which is the possibility I was
concerned about.
 
