Printing a NULL pointer

junky_fellow

Consider an implementation that doesn't use all bits 0 to represent
a NULL pointer. Suppose the NULL pointer is represented by 0x12345678.
On such an implementation, if the value of a NULL pointer is printed,
will it be all 0's or 0x12345678?

#include <stdio.h>

int main(void)
{
    char *ptr;
    ptr = 0;

    printf("\nptr=%p\n", ptr);
    return 0;
}

What would be the output, 0 or 0x12345678?
I think the user should be insulated from the internal representation
of a NULL pointer. Even if the implementation is
using 0x12345678 for the NULL pointer, the value printed should be all
bits zero.
 
Gordon Burditt

Consider an implementation that doesn't use all bits 0 to represent
a NULL pointer. Suppose the NULL pointer is represented by 0x12345678.
On such an implementation, if the value of a NULL pointer is printed,
will it be all 0's or 0x12345678?

Probably. It could just as well be printed as "(ullnay)".
What would be the output, 0 or 0x12345678?

It is quite possible on some implementations that the output of %p
always contains a colon.
I think the user should be insulated from the internal representation
of a NULL pointer.

If that's your opinion, fine. I don't think this is justified
by the standard or shared by implementors.
Even if the implementation is
using 0x12345678 for the NULL pointer, the value printed should be all
bits zero.

Gordon L. Burditt
 
Jean-Claude Arbaut

On 15/06/2005 07:50, in
(e-mail address removed),
« [email protected] » said:
Consider an implementation that doesn't use all bits 0 to represent
a NULL pointer. Suppose the NULL pointer is represented by 0x12345678.
On such an implementation, if the value of a NULL pointer is printed,
will it be all 0's or 0x12345678?

int main(void)
{

char *ptr;
ptr = 0;

printf("\nptr=%p\n",ptr);
}

What would be the output, 0 or 0x12345678?
I think the user should be insulated from the internal representation
of a NULL pointer. Even if the implementation is
using 0x12345678 for the NULL pointer, the value printed should be all
bits zero.

It seems dangerous if 0 is a valid address, and it probably will be
if 0x12345678 is NULL. In that case, I would prefer 0x12345678,
or even "(null)" or anything clearly announcing a NULL pointer.
 
Lawrence Kirby

Consider an implementation that doesn't use all bits 0 to represent
a NULL pointer. Suppose the NULL pointer is represented by 0x12345678.
On such an implementation, if the value of a NULL pointer is printed,
will it be all 0's or 0x12345678?

Either or neither. The implementation could choose to output it in some
other form entirely.
int main(void)
{

char *ptr;
ptr = 0;

printf("\nptr=%p\n",ptr);
}
What would be the output, 0 or 0x12345678?

The standard does not specify the form of output of %p. There is no
requirement that it takes the form of a hex number, although it can and
some implementations do that.
I think the user should be
insulated from the internal representation of a NULL pointer.

That is certainly not a requirement. All that is required is that scanf()
%p can recreate the pointer from the output of printf() %p.
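
For instance, a minimal sketch of that round trip, relying only on what the
standard promises (scanning back the text %p produced, within the same run
of the program):

#include <stdio.h>

int main(void)
{
    int x = 42;
    void *out = &x;     /* some valid pointer value */
    void *in = NULL;
    char buf[64];

    /* Print the pointer with %p, then read it back with %p. */
    sprintf(buf, "%p", out);
    if (sscanf(buf, "%p", &in) == 1 && in == out)
        printf("round trip ok: %s\n", buf);
    else
        printf("round trip failed: %s\n", buf);
    return 0;
}
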
Even if
the implementation is using 0x12345678 for the NULL pointer, the value printed
should be all bits zero.

If you want full transparency then direct correspondence to any particular
bit pattern should be avoided.

Lawrence
 
junky_fellow

Lawrence said:
Either or neither. The implementation could choose to output it in some other form entirely.


The standard does not specify the form of output of %p. There is no
requirement that it takes the form of a hex number, although it can and
some implementations do that.


That is certainly not a requirement. All that is required is that scanf()
%p can recreate the pointer from the output of printf() %p


If you want full transparency then direct correspondence to any particular
bit pattern should be avoided.

Lawrence

Is there any way by which the user can determine the internal
representation of a NULL pointer? I am asking this because
sometimes during debugging a memory dump is analysed. In that
case it would be difficult to tell whether it is a NULL pointer or not.
 
pete

Is there any way by which the user can determine the internal
representation of a NULL pointer? I am asking this because
sometimes during debugging a memory dump is analysed. In that
case it would be difficult to tell whether it is a NULL pointer or not.

/* BEGIN new.c */

#include <stdio.h>

int main(void)
{
    void *pointer = NULL;
    size_t byte;

    /* Dump the null pointer's object representation one byte at a time. */
    for (byte = 0; byte != sizeof pointer; ++byte) {
        printf("byte %lu is 0x%x\n",
               (long unsigned)byte,
               (unsigned)((unsigned char *)&pointer)[byte]);
    }
    puts("There may be more than one "
         "representation for a null pointer.");
    return 0;
}

/* END new.c */
 
Jean-Claude Arbaut

Is there any way by which the user can determine the internal
representation of a NULL pointer? I am asking this because
sometimes during debugging a memory dump is analysed. In that
case it would be difficult to tell whether it is a NULL pointer or not.

You can cast to an unsigned integer (if pointers and integers are
32 bits, that should work). If the compiler is very clever and
converts the NULL pointer to a 0 value, then you can try this:
Write a function "unsigned fun(unsigned x) { return x; }", and
compile, then in another file, declare this function as
"unsigned fun(char *p)" and pass it a NULL pointer. Hopefully the
result will be the internal representation of a NULL. I'm not
sure, since I've never tried this on a machine on which NULL != 0.
I made many assumptions here on integer and pointer sizes, so
it may not work at all. And on a machine where pointers and integers
are passed in a different way, it won't work either. Be careful,
that's just an idea.
 
pete

Consider an implementation that doesn't use all bits 0 to represent
a NULL pointer. Suppose the NULL pointer is represented by 0x12345678.
On such an implementation, if the value of a NULL pointer is printed,
will it be all 0's or 0x12345678?

It could be printed out as 0xdeadbeef.
 
CBFalconer

Consider an implementation that doesn't use all bits 0 to represent
a NULL pointer. Suppose the NULL pointer is represented by 0x12345678.
On such an implementation, if the value of a NULL pointer is printed,
will it be all 0's or 0x12345678?

int main(void)
{
char *ptr;
ptr = 0;
printf("\nptr=%p\n",ptr);
}

What would be the output, 0 or 0x12345678?

It's implementation-dependent. From N869:

p The argument shall be a pointer to void. The value
of the pointer is converted to a sequence of
printing characters, in an implementation-defined
manner.
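
To stay within that wording, it is common to cast whatever pointer you have
to void * before handing it to %p. A small sketch (char * happens to share
void *'s representation, but the cast makes the argument exactly what the
quoted text asks for):

#include <stdio.h>

int main(void)
{
    char *ptr = 0;
    int *ip = 0;

    /* Cast to void * so the argument matches what %p is specified for. */
    printf("ptr=%p\n", (void *)ptr);
    printf("ip =%p\n", (void *)ip);
    return 0;
}
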
 
Richard Tobin

Is there any way by which the user can determine the internal
representation of a NULL pointer? I am asking this because
sometimes during debugging a memory dump is analysed. In that
case it would be difficult to tell whether it is a NULL pointer or not.

In all real-world implementations the NULL pointer is all-bits zero.
(Someone will post a counter-example if I'm wrong.) So if you are
really debugging with a memory dump, rather than asking a theoretical
question, there is no problem.

-- Richard
 
Lawrence Kirby

You can cast to an unsigned integer (if pointers and integers are
32 bits, that should work).

This may work for debugging purposes if the compiler happens to do the
right thing. But there are no circumstances under which the standard
guarantees that converting from a pointer to an integer will produce a
useful result.
If the compiler is very clever and
converts the NULL pointer to a 0 value, then you can try this:
Write a function "unsigned fun(unsigned x) { return x; }", and
compile, then in another file, declare this function as
"unsigned fun(char *p)" and pass it a NULL pointer.

This is an extremely nasty and broken kludge. You get undefined behaviour
if the type used to call a function is not compatible with the type of
its definition.
Hopefully the
result will be the internal representation of a NULL. I'm not
sure, since I've never tried this on a machine on which NULL != 0.
I made many assumptions here on integer and pointer sizes, so
it may not work at all. And on a machine where pointers and integers
are passed in a different way, it won't work either. Be careful,
that's just an idea.

E.g. 68K systems where integers and pointers are typically passed in
different registers.

There's no need to resort to non-portable code; this can be done portably.
In C the representation of an addressable object can be inspected by
treating it as an array of unsigned char. For example:


type *ptr = NULL;   /* 'type' is a placeholder for any pointer type */
const unsigned char *p = (const unsigned char *)&ptr;
size_t i;

for (i = 0; i < sizeof ptr; i++)
    printf(" %.2x", (unsigned)p[i]);


Lawrence
 
Jean-Claude Arbaut

On 15/06/2005 16:52, in (e-mail address removed),
« Lawrence Kirby » said:
This may work for debugging purposes if the compiler happens to do the
right thing. But there are no circumstances under which the standard
guarantees that converting from a pointer to an integer will produce a
useful result.

I thought it was obvious here that my suggestions were non-standard.
Is there also a Standard way to explain obvious things? :)
This is an extremely nasty and broken kludge. You get undefined behaviour
if the type used to call a function is not compatible with the type of
its definition.

I knew you wouldn't like it ;-) It's not very dangerous here; I just
return an integer, and it's a valuable trick in some situations.
Oh, and when you link asm code to C, what do you think you do?
And if you tell me it's not Standard, then I'll answer that you can't
have a libc without asm... Perhaps asm is too nasty :-D
E.g. 68K systems where integers and pointers are typically passed in
different registers.

Thanks for the example. I had another processor in mind, but I've never
seen a C compiler for it. I think it's called Saturn, but I'm not sure.
There's no need to resort to non-portable code; this can be done portably.
In C the representation of an addressable object can be inspected by
treating it as an array of unsigned char. For example:


type *ptr = NULL;   /* 'type' is a placeholder for any pointer type */
const unsigned char *p = (const unsigned char *)&ptr;
size_t i;

for (i = 0; i < sizeof ptr; i++)
    printf(" %.2x", (unsigned)p[i]);


Much better, yes.
 
Anonymous 7843

That is certainly not a requirement. All that is required is that scanf()
%p can recreate the pointer from the output of printf() %p

C is broken then. Okay, not broken, but there is a hidden
assumption that needs to be called out.

If we cannot make any assumptions about the format of the pointer,
then we cannot reliably print it out and read it back in. Preceding
and succeeding data might "accidentally" consist of colons,
hexadecimal characters, the word "ullnay", etc., and we have no
sanctioned set of delimiters to protect the %p data from surrounding
garbage.

The only alternatives I can see are that:
- all %p data has a deterministic length (not necessarily fixed length);
  e.g. if it starts with 0x then there must be (e.g.) 8 more chars,
  else it must be 6 chars of "ullnay", else it isn't a %p
- or, the %p data can only be written to a file or char array alone;
  IOW, the EOF or \0 are the only delimiters.

Perhaps it is already codified and I'm not aware of it, but
modifiers like %-20.10p would disturb the length of the field
and make it unreadable.

At this point I'm suspecting that a lot of implementations
would screw up "%p%p" in a *scanf call, confusing the leading
zero of 0x as part of the preceding pointer data and ignoring
the result as if it were unsigned hex overflow.
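
One workaround, sketched below, is to put explicit whitespace between the two
%p fields and match it when scanning. This assumes the implementation's %p
output contains no embedded blanks, which, as noted above, the standard does
not actually promise:

#include <stdio.h>

int main(void)
{
    int a = 1, b = 2;
    void *p1 = &a, *p2 = &b;
    void *q1 = NULL, *q2 = NULL;
    char buf[128];

    /* A space separates the fields; "%p %p" skips it when scanning. */
    sprintf(buf, "%p %p", p1, p2);
    if (sscanf(buf, "%p %p", &q1, &q2) == 2 && q1 == p1 && q2 == p2)
        printf("both pointers recovered from \"%s\"\n", buf);
    else
        printf("recovery failed for \"%s\"\n", buf);
    return 0;
}
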
 
Keith Thompson

Is there any way by which the user can determine the internal
representation of a NULL pointer? I am asking this because
sometimes during debugging a memory dump is analysed. In that
case it would be difficult to tell whether it is a NULL pointer or not.

printf("%p\n", (void*)NULL);

is very likely to print a legible form of the representation of a null
pointer (which is very likely to be all-bits-zero). If "very likely"
isn't good enough, several followups have shown how to break the
representation down into a sequence of bytes.
 
Lawrence Kirby

On 15/06/2005 16:52, in (e-mail address removed),
« Lawrence Kirby » <[email protected]> wrote:

....
I thought it was obvious here that my suggestions were non-standard.
Is there also a Standard way to explain obvious things? :)

It is obvious if you know it is non-standard, probably not if you don't.
The best thing is simply not to do it; you rarely if ever need it unless
you are writing your own memory allocator.
I knew you wouldn't like it ;-) It's not very dangerous here,

Like everything else, it is not dangerous on implementations where it
works, but could be disastrous on implementations where it doesn't. The
approach should always be to look for alternatives that avoid doing things
like this. After all, what if it breaks on the next version of the compiler
you use? There's nothing wrong with the compiler; it is the code that is
faulty.
I just
return an integer and it's a valuable trick in some situations.

I find that hard to believe. It suggests that you haven't put enough
thought into finding a better solution.
Oh, and when you link asm code to C, what do you think you do?
And if you tell me it's not Standard, then I'll answer that you can't
have a libc without asm... Perhaps asm is too nasty :-D

If you link asm to C code you are sacrificing portability, which is fine
in some circumstances. However, you are (or should be) still basing the
code on specifications that define its behaviour. By doing things like
calling functions incorrectly you have NO specification of behaviour; you
are trusting to blind luck and hope that things will continue to work as
you have observed in the past. This is no way to program. Today's compiler
optimisers are too complex to predict with any certainty. You may use a
trick that worked many times, then one day with the same compiler you use
it in a situation which the compiler decides it can optimise, and suddenly
the trick no longer works. There was an example of this in another
thread, to do with overlaying structures. One of the reasons C leaves some
areas of behaviour undefined is to allow more aggressive optimisations.

Lawrence
 
Clark S. Cox III

C is broken then. Okay, not broken, but there is a hidden
assumption that needs to be called out.

If we cannot make any assumptions about the format of the pointer,
then we cannot reliably print it out and read it back in. Preceding
and succeeding data might "accidentally" consist of colons, hexadecimal
characters, the word "ullnay", etc., and we have no
sanctioned set of delimiters to protect the %p data from surrounding
garbage.

The only alternatives I can see are that:
- all %p data has a deterministic length (not necessarily fixed length);
  e.g. if it starts with 0x then there must be (e.g.) 8 more chars,
  else it must be 6 chars of "ullnay", else it isn't a %p
- or, the %p data can only be written to a file or char array alone;
  IOW, the EOF or \0 are the only delimiters.

Perhaps it is already codified and I'm not aware of it, but
modifiers like %-20.10p would disturb the length of the field
and make it unreadable.

At this point I'm suspecting that a lot of implementations
would screw up "%p%p" in a *scanf call, confusing the leading
zero of 0x as part of the preceding pointer data and ignoring
the result as if it were unsigned hex overflow.

But that's no different than any of the other printf/scanf specifiers.
That is, print two integers with "%d%d", and scan them back in.
 
Jean-Claude Arbaut

It is obvious if you know it is non-standard, probably not if you don't.
The best thing is simply not to do it; you rarely if ever need it unless
you are writing your own memory allocator.

I agree, but it doesn't hurt knowing it's possible (though not portable),
and knowing how things work.
Like everything else, it is not dangerous on implementations where it
works, but could be disastrous on implementations where it doesn't. The
approach should always be to look for alternatives that avoid doing things
like this.

Yes, sir! No irony; I completely agree it's not a style to use
too often, but if I remember correctly, the original post asked for
a way to see what NULL is. Just a toy program, after all.

Sometimes, it may be necessary: you mention a memory allocator,
but there may be other uses.
After all, what if it breaks on the next version of the compiler
you use? There's nothing wrong with the compiler; it is the code that is
faulty.

Some parts of a program are by nature very dependent on OS/compiler/proc.
It's good practice to try to avoid them; it's not good practice to
deny their existence. This newsgroup denies all that is not perfectly
described by the Standard. It's only this attitude I reject, not
the Standard itself.
I find that hard to believe. It suggests that you haven't put enough
thought into finding a better solution.

You are right, I must confess :)
If you link asm to C code you are sacrificing portability, which is fine
in some circumstances.

I agree (but do I need to say that? :)).
However you are (or should be) still basing the
code on specifications that define its behaviour. By doing things like
calling functions incorrectly you have NO specification of behaviour,

Well, the standard doesn't specify anything, but your knowledge of
the system you're programming on (including the compiler) gives you
all the information you need about the behaviour. When in doubt, it's always (?)
possible to have a look at the assembly output.
you
are trusting to blind luck and hope that things will continue to work as
you have observed in the past.

Nope. I never trust a compiler :) Even with beautiful C code, I'm not
confident in its optimizations, especially concerning floating point.
And this opinion is not going to change in the near future: "paranoia" is
old, but still an interesting test (just as an example).
This is no way to program. Today's compiler
optimisers are too complex to predict with any certainty.

Here I disagree. There are many predictable parts in gcc output; with enough
habit, you know when writing some code whether it will be well optimized or not,
and which kind of optimization occurs. If you think your compiler is
perfect, well, I hope it is! Gcc, to stay with something I am acquainted
with, won't use prefetch or Altivec instructions (I heard gcc 4 will, but
I'm still not convinced). Hence it's not so difficult to beat its optimizer.
On the other hand, it's much more difficult to beat xlc (it won't use Altivec,
but apart from that, it is very, very clever). I think a good practice is
always having a look at the assembly output, for functions that need good
optimization, or that use non-standard tricks. Obviously, that demands
some understanding of the OS/proc.

Oh, and I said "acquainted with gcc"; I wouldn't even try to make anyone
believe I know perfectly how it works.
You may use a
trick that worked many times, then one day with the same compiler you use
it in a situation which the compiler decides it can optimise and suddenly
the trick no longer works.

Not seen for the moment, but I am vigilant :)
There was an example of this in another
thread to do with overlaying structures.

I'll look for it!
One of the reasons C leaves some
areas of behaviour undefined is to allow more aggressive optimisations.

And why do many if not all compilers allow so many extensions? Some are
completely understandable, but many (including gcc's extensions) are too
tempting, and once used, the code is irrevocably stuck with one specific
compiler. It may seem strange for me to say that :) In fact I agree with
the principle of a standard (I've seen the necessity with the many
implementations of f77 hanging around), but I don't agree with denying specific
use of it when it's needed. Only when it's needed. And I'm glad you
pointed out a good way to do what was asked in this thread. I hope I've
clarified some points.

By the way, is it completely portable? Use of an array of chars to read
what is very often an integer may seem strange. It's fun: it's portable
only because the standard doesn't know whether a pointer is an int or
something else :) Actually, we only want to know if NULL is 0, so no
problem.
 
Keith Thompson

Clark S. Cox III said:
On 2005-06-15 14:07:30 -0400, (e-mail address removed) (Anonymous 7843) said: [...]
Perhaps it is already codified and I'm not aware of it, but
modifiers like %-20.10p would disturb the length of the field
and make it unreadable.
At this point I'm suspecting that a lot of implementations
would screw up "%p%p" in a *scanf call, confusing the leading
zero of 0x as part of the preceding pointer data and ignoring
the result as if it were unsigned hex overflow.

But that's no different than any of the other printf/scanf
specifiers. That is, print two integers with "%d%d", and scan them
back in.

The difference, though, is that we know the format of the output
produced by "%d", and can allow for it when using *scanf. We can't
necessarily know how to avoid similar problems with "%p".

On the other hand, I've never heard of this being a problem in
practice. If the output isn't intended to be re-scanned (which is
probably the case most of the time), you merely have to count on the
implementer to produce something legible. If you need to be able to
re-scan it, it probably suffices to surround it with white space. (If
it turned out to be a problem, the standard could always be amended to
forbid blanks in the result of a "%p" format.)
 
Anonymous 7843

But that's no different than any of the other printf/scanf specifiers.
That is, print two integers with "%d%d", and scan them back in.

Sorry, poor example.

But you know how %d will printf and how it will scanf back
in. You know that you can use spaces, letters, or
punctuation to separate a %d-generated number from other
data. You do *not* know (from the standard) what %p will or
won't print so you don't know what delimiters, if any, are
safe to use. So, you can't use delimiters.

And how the heck do you print a function pointer? It's not
guaranteed to fit in a void*, right?
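
A function pointer indeed need not be convertible to void *, so %p is out for
it. One hedge, sketched below, is to reuse the unsigned-char inspection shown
earlier in the thread on the function pointer object itself; the bytes you
get are of course entirely implementation-specific:

#include <stdio.h>

static int dummy(void) { return 0; }   /* any function will do */

int main(void)
{
    int (*fp)(void) = dummy;
    const unsigned char *bytes = (const unsigned char *)&fp;
    size_t i;

    /* Dump the function pointer's object representation byte by byte. */
    for (i = 0; i < sizeof fp; i++)
        printf(" %.2x", (unsigned)bytes[i]);
    putchar('\n');
    return 0;
}
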
 
