Black magic, or insanity?

Robbie Brown

I've been reviewing what I've learned about pointers.

I thought I'd do a few tests just to consolidate what I thought I'd
learned and frankly .. I'm dumfounded.

int main(int argc, char *argv[]){

//declare a pointer to int
int *ip;

//print ... what exactly, prints 'nil'
printf("ip is %p\n", ip);
//dereference the pointer, seg fault
printf("*ip is %d\n", *ip);

}

the output is what I expected

ip is (nil)
Segmentation fault (core dumped)

I then add the following statement after the last printf

int *ip2;

compile and exec and get the same output

ip is (nil)
Segmentation fault (core dumped)

Now then, the next bit is a total head****

If I modify the last statement so that it reads

int *ip2 = NULL;

so the code is now

int main(int argc, char *argv[]){

//declare a pointer to int
int *ip;

//print ... what exactly, prints 'nil'
printf("ip is %p\n", ip);
//dereference the pointer, seg fault
printf("*ip is %d\n", *ip);

int *ip2 = NULL;

}

then compile and exec I get the following

ip is 0x7fff0dfeb230
*ip is 1

WTF!!! ... how does initializing ip2 to NULL cause the
previous code to now display ... something.

Is this for real?
I mean seriously, this is just ... what

I have no idea

Dazed and confused.
 
Zoltan Kocsi

I've been reviewing what I've learned about pointers.

I thought I'd do a few tests just to consolidate what I thought I'd
learned and frankly .. I'm dumfounded.

int main(int argc, char *argv[]){

//declare a pointer to int
int *ip;

//print ... what exactly, prints 'nil'
printf("ip is %p\n", ip);
//dereference the pointer, seg fault
printf("*ip is %d\n", *ip);

}

the output is what I expected

ip is (nil)
Segmentation fault (core dumped)

Your expectation is completely wrong. The fact that ip is nil is due to
luck. You do not initialise it. Automatic variables (i.e. ones defined
inside a function without the 'static' keyword) are *not* initialised
by the compiler. Whatever junk is on the stack, that's the initial
value. If your compiler does any optimisation, then it's not even
the stack. Most likely ip was allocated in a register, which the start
code (which executes before your main() enters) happened to set to 0.
[ snip ]
WTF!!! ... how does initializing ip2 to NULL cause the
previous code to now display ... something.

Chances are, ip was now allocated in a different register, due to the
need of allocating space for ip2. The new register contained a valid
address.

Since you have not initialised the pointers and they were not in the
BSS, you could expect nothing, absolutely nothing about their values.

Any decent compiler should have given you a warning about the
uninitialised nature of ip. Also note that even zeroing the BSS is a
hosted environment thing, many embedded systems do not initialise the
memory before starting main() at all.

Zoltan
 
Ben Bacarisse

Robbie Brown said:
I've been reviewing what I've learned about pointers.

I thought I'd do a few tests just to consolidate what I thought I'd
learned and frankly .. I'm dumfounded.

You just need to re-adjust your expectations. All of your examples have
what C calls undefined behaviour. The language standard does not say
what should happen, so compilers can do pretty much what they like.
Having any expectation at all is going to lead to puzzlement.

If, on the other hand, you want to know what is actually going on, then
just look at the generated code, but keep in mind that this will tell
you about one version of one compiler with one set of command-line flags
on one system at some particular time. You probably won't learn much of
use.

<snip>
 
Robbie Brown

Any decent compiler should have given you a warning about the
uninitialised nature of ip.

Hmm, I'm using gcc version 4.6.3 ... is this a 'decent compiler'

gcc -std=gnu99 -Wall pointers.c -g -o pointers
gives no warnings about uninitialised anything.

I hear what you are saying though and have taken it on board.

Thanks for your time
 
Robbie Brown

You just need to re-adjust your expectations. All of your examples have
what C calls undefined behaviour. The language standard does not say
what should happen, so compilers can do pretty much what they like.
Having any expectation at all is going to lead to puzzlement.

I'm discovering this, fascinating stuff.

Thanks
 
Eric Sosman

Hmm, I'm using gcc version 4.6.3 ... is this a 'decent compiler'

gcc -std=gnu99 -Wall pointers.c -g -o pointers
gives no warnings about uninitialised anything.

Strange. Even a much older (4.4.1) gcc gives me

foo.c: In function 'main':
foo.c:7: warning: implicit declaration of function 'printf'
foo.c:7: warning: incompatible implicit declaration of built-in
function 'printf'
foo.c:7: warning: 'ip' is used uninitialized in this function

A truly ancient (3.4.4) version emits only the `printf' warning,
but if invoked with optimization at -O1 or higher it also squawks
"warning: 'ip' might be used uninitialized in this function" (note
"might be" rather than "is"; this could be a different warning).

Wild guess: The detection of uninitialized uses depends on data
developed while optimizing, and the default optimization level when
no -Ox is specified varies from one gcc version to another. Try
adding -O1 or -O2 (or even -O3) to your command line, to see if
the compiler will offer more commentary.
 
Kaz Kylheku

I've been reviewing what I've learned about pointers.

I thought I'd do a few tests just to consolidate what I thought I'd
learned and frankly .. I'm dumfounded.

int main(int argc, char *argv[]){

//declare a pointer to int
int *ip;

Since this is a non-static local variable that is uninitialized, it contains
data which is traditionally called "garbage" in programmer lingo.

In C standard formal terms, its value is "indeterminate", which means that
it is either an unspecified value or a trap representation.

By dumb luck, this indeterminate value could look like a valid pointer,
and dereference successfully.

The indeterminate garbage inside ip could be different upon different
executions of the program, and could be influenced by changes to seemingly
irrelevant parts of the program.
//print ... what exactly, prints 'nil'
printf("ip is %p\n", ip);

This is undefined behavior already: you're accessing the value of the
indeterminately-valued object ip.

//dereference the pointer, seg fault
printf("*ip is %d\n", *ip);

We have no basis for expecting a "seg fault" here. The behavior here is
also undefined for the same reason. Undefined means not defined by the ISO
standard document which describes the C language. (If there were a requirement
to produce a segmentation fault, that would be a definition of behavior; it
would not be "undefined".)

In the case of some undefined behaviors, we do have a basis for expecting
some particular behavior on a particular platform. That happens when the
language implementors give us a definition, or else we can otherwise deduce
the behavior from the structure of the platform, or from knowing something
about the compiler behavior, etc.
 
Keith Thompson

Robbie Brown said:
I've been reviewing what I've learned about pointers.

I thought I'd do a few tests just to consolidate what I thought I'd
learned and frankly .. I'm dumfounded.

int main(int argc, char *argv[]){

//declare a pointer to int
int *ip;

//print ... what exactly, prints 'nil'
printf("ip is %p\n", ip);
//dereference the pointer, seg fault
printf("*ip is %d\n", *ip);

}
[...]

This is not directly relevant to your question, but the "%p" printf
format expects an argument of type void*. You're giving it an argument
of type int*, which strictly speaking causes undefined behavior.

It's very very likely to work correctly on any system where void* and
int* have the same representation (which is the vast majority of
existing systems), but for maximum portability you should cast the
pointer value to void*:

printf("ip is %p\n", (void*)ip);

This is one of the few cases where casting, particularly pointer
casting, is a good habit.
 
Helmut Tessarek

Is this for real?
I mean seriously, this is just ... what

check out my mail signature. it will also answer your question.

--
Helmut K. C. Tessarek

/*
Thou shalt not follow the NULL pointer for chaos and madness
await thee at its end.
*/
 
Keith Thompson

Helmut Tessarek said:
On 21.01.14 7:33 , Robbie Brown wrote:
check out my mail signature. it will also answer your question.

[...]
/*
Thou shalt not follow the NULL pointer for chaos and madness
await thee at its end.
*/

Good advice, but not actually relevant in this case.

The OP *expected* a segmentation fault on dereferencing a null pointer.
The problem was that the pointer object in question was uninitialized,
and therefore might or might not contain a null pointer value.
 
Helmut Tessarek

Good advice, but not actually relevant in this case.

A lot of people already gave extensive explanations and I think the main point
is that anything can and will happen.

So I think 'chaos and madness' is quite relevant, if you mess with null
pointers (or pointers that are potentially null pointers). ;-)
The OP *expected* a segmentation fault on dereferencing a null pointer.
The problem was that the pointer object in question was uninitialized,
and therefore might or might not contain a null pointer value.

Yep, for me a(n) (uninitialized) pointer that is a potential null pointer
still falls in the category not to mess with.

Cheerio!

--
Helmut K. C. Tessarek

/*
Thou shalt not follow the NULL pointer for chaos and madness
await thee at its end.
*/
 
Robbie Brown

Helmut Tessarek said:
On 21.01.14 7:33 , Robbie Brown wrote:
check out my mail signature. it will also answer your question.

[...]
/*
Thou shalt not follow the NULL pointer for chaos and madness
await thee at its end.
*/

Good advice, but not actually relevant in this case.

The OP *expected* a segmentation fault on dereferencing a null pointer.
The problem was that the pointer object in question was uninitialized,
and therefore might or might not contain a null pointer value.

Yes, I'm starting to get the impression that, unlike other languages I
have used, C (or rather the C compiler perhaps) doesn't stop you from
doing all manner of exceptionally stupid things.

For example, for no other reason than experimentation I tried to get my
head around pointers to pointers and came up with the following.
Trying hard not to make assumptions, just observations.

[Linux 3.2.0-23-generic x86_64 GNU/Linux]

int **arpi = (int**) malloc(sizeof(int*) * 5);
*(arpi + 4) = malloc(sizeof(int));
*(*(arpi + 4)) = 14;

If I run this through gdb I can see what I expected to see (there's that
word again, what other word can I use?).

arpi is a pointer to the first of 5 64 bit addresses.
the first 4 addresses contain 0x0000000000000000 I hope I understand
that these are uninitialized addresses ... or maybe they have been
initialized to 0 by some voodoo priest :) anyway
the fifth address contains the 64 bit address 0x0000000000602010
this seems reasonable as I malloc'd enough space for a pointer to int.
if I inspect the contents of 0x602010 I see 0x0e which is (I hope) what
I was expecting

Then it got all strange again

I changed the first line to
int **arpi = (int**) malloc(sizeof(int) * 5);

now I malloc int instead of int*
Compile, run, inspect, same old results
I think this works because an int is probably 64 bits same as an address
(gross assumption)

Then it gets weirder
int **arpi = (int**) malloc(0);
Now realistically what should I 'expect' to happen

I sort of expected it not to compile ... wrong, it compiled
I sort of expected it to blow up ... wrong, ran and exited normally
I even found 0x0e lurking about almost where I hoped it would be.

gdb exposed the memory and it was obviously not right but it still ran.

This *is* fun isn't it?

Ah well, onwards and upwards.
 
James Kuyper

On 01/22/2014 07:06 AM, Robbie Brown wrote:
....
int **arpi = (int**) malloc(0);
Now realistically what should I 'expect' to happen

I sort of expected it not to compile ... wrong, it compiled
I sort of expected it to blow up ... wrong, ran and exited normally
I even found 0x0e lurking about almost where I hoped it would be.

gdb exposed the memory and it was obviously not right but it still ran.

What the C standard requires is that malloc(0) may return either
a) a null pointer
b) a pointer suitably aligned for any type, but which points at memory
that cannot be safely written to.
 
Malcolm McLean

Yes, I'm starting to get the impression that, unlike other languages I
have used, C (or rather the C compiler perhaps) doesn't stop you from
doing all manner of exceptionally stupid things.
All you really need to understand is that C allows you to write to "raw"
addresses. Often the bits in the pointer are the actual bits which go on the
address bus to fetch data to and from RAM. Other times there's a very low-level
layer of indirection which prevents programs from corrupting each other and,
possibly, damaging hardware.
Now if you write to a random address, it's very hard to say what will happen.
You might hit another variable, you might destroy your call stack, you might
send a byte to a memory-mapped port or put up a pixel on a memory-mapped
screen. The system might detect that what you are doing is illegal and issue
a segfault (this is the best, most desirable result from the point of view
of someone trying to write a useful program). You might even hit the pointer
itself.

That's all there really is to it. Some systems also put in protections against
reading from random addresses.
 
James Kuyper

On 01/22/2014 07:06 AM, Robbie Brown wrote:
...

What the C standard requires is that malloc(0) may return either
a) a null pointer
b) a pointer suitably aligned for any type, but which points at memory
that cannot be safely written to.

I should have mentioned that malloc(0) returns any non-null pointer
value, that value must be the result of malloc() having behaved exactly
the same as if it had been asked to allocate some non-zero amount of
memory. This implies that each non-null value returned by malloc(0) will
be unique, in the sense that it will not compare equal to any other valid
pointer to an object.
 
Robbie Brown

I should have mentioned that malloc(0) returns any non-null pointer
value, that value must be the result of malloc() having behaved exactly
the same as if it had been asked to allocate some non-zero amount of
memory. This implies that each non-null value returned by malloc(0) will
be unique, in the sense that it will not compare equal to any other valid
pointer to an object.

Now to me, that just seems perverse. By what strange incantation of
inverse logic was the decision made to use a request for 0 bytes of
memory as meaning 'give me anything but 0 bytes'.

I would have thought NULL was the perfect value to return in this case.
I suppose there is a good reason for it but I can't for the life of me
think what it could be. It's almost as if it were *designed* to confuse
and befuddle the unwary neophyte ........ no, surely not?
 
Malcolm McLean

On 22/01/14 15:10, James Kuyper wrote:

I would have thought NULL was the perfect value to return in this case.
I suppose there is a good reason for it but I can't for the life of me
think what it could be. It's almost as if it were *designed* to confuse
and befuddle the unwary neophyte ........ no, surely not?
Do you ask for a bag of no beans or no bag of beans?
Some took the former view, some the latter. It's a difficult problem how to
handle the empty case, you tend to want programs that treat it as part of
normal control flow, because that's likely to be more robust and correct.
But often treating it specially is more efficient and easier to think through.
 
Lowell Gilbert

Robbie Brown said:
Now to me, that just seems perverse. By what strange incantation of
inverse logic was the decision made to use a request for 0 bytes of
memory as meaning 'give me anything but 0 bytes'.

I would have thought NULL was the perfect value to return in this case.
I suppose there is a good reason for it but I can't for the life of me
think what it could be. It's almost as if it were *designed* to
confuse and befuddle the unwary neophyte ........ no, surely not?

Both usages were already extant by the time standardization came around,
so we're stuck with them. The logic by which the not-returning-null
approach came about was the idea that a valid return value should not be
the same as an error return. I don't see that as completely silly.
 
James Kuyper

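Correction to my previous post, which was missing a word. It should read:
"I should have mentioned that if malloc(0) returns any non-null pointer
value, ..."
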
Now to me, that just seems perverse. By what strange incantation of
inverse logic was the decision made to use a request for 0 bytes of
memory as meaning 'give me anything but 0 bytes'.

For some purposes, it's convenient to create objects of varying sizes,
without having to do special case handling for objects with a size of 0.
It's sometimes important that each such object be distinguishable.
Objects allocated by using malloc(0), if it returns a non-null value,
are distinguishable by their addresses. The cost of making that possible
is that those addresses cannot be used for any other purpose, which is
pretty much the same effect as if those addresses had been used to store
something. Portable code cannot rely upon this behavior, but unportable
code exists that relies upon the fact that malloc(0) has this behavior
on a particular implementation of C.
I would have thought NULL was the perfect value to return in this case.
I suppose there is a good reason for it but I can't for the life of me
think what it could be. It's almost as if it were *designed* to confuse
and befuddle the unwary neophyte ........ no, surely not?

No, the standard was designed to accommodate the wide variety of
existing implementations of C. This often results in confusion and
befuddlement, but that wasn't the purpose. There are arguments for
either way of implementing malloc(0), but I don't think anyone would
have chosen to allow both if they'd been free to ignore existing
implementations.
 
