pointer to integer to pointer conversions

lithiumcat

Hi,

I bothered you a while back about storing integer values in void*. Now
in a completely unrelated context, I'm trying to store pointer values
in an integer type.

So the basic question is, is it possible to convert a pointer into an
integer, and then later (but on the same execution environment, ie the
program has not exited, thus it's the same architecture, same
compiler, same binary representations and so on) retrieve from the
integer the "same" pointer (that is, a pointer that points to the same
object and that would compare equal to the original pointer if it was
kept somewhere)?

On the document I found, named "ISO/IEC 9899:TC3", I have found on
paragraph 6.3.2.3 that pointers can be converted into integers and
vice versa, provided the integer type is large enough, the result
being implementation-defined. Does it mean the standard does not
guarantee that converting to an integer and back to a pointer yields
the same pointer? Or is it written somewhere else?

I gather that this document is about C99, is the answer different in
C89?

And supposing that conversion might work, should I make sure the
original pointer and the retrieved pointer have exactly the same type,
or can it work with one of them being of a given type and the other one
being a void* later converted into a pointer of the correct type?

In any case it works, is there a portable way to know what integer
types are large enough to hold a pointer value?

In case it matters, my situation is that I want, for debug purposes,
to output the "value" of a pointer to the user (me), and then read
that value back from the user. The most natural way to make a user
handle pointers was to print and read it as an integer. Of course I
don't really need portability in that case, but I have the feeling
that it might be useful knowledge later on. Or maybe I'm only
overemphasizing portability the same way too many people overemphasize
performance.
 
vippstar

Hi,

I bothered you a while back about storing integer values in void*. Now
in a completely unrelated context, I'm trying to store pointer values
in an integer type.
That's possible, using `intptr_t' or `uintptr_t'.
You have to include <stdint.h> to use them.
Which of the two you use to store the void pointer does not matter.
The type only matters if you use it in arithmetic.
ie, uintptr_t foo = malloc(123); foo = ~foo; free((void*)~free);
With intptr_t undefined behavior might be invoked in this example (if,
for example, malloc() returns NULL)
So the basic question is, is it possible to convert a pointer into an
integer, and then later (but on the same execution environment, ie the
program has not exited, thus it's the same architecture, same
compiler, same binary representations and so on) retrieve from the
integer the "same" pointer (that is, a pointer that points to the same
object and that would compare equal to the original pointer if it was
kept somewhere)?
Yep, quote from ISO 9899:1999, 7.18.1.4:
The following type designates a (un)signed integer type with the property
that any valid pointer to void can be converted to this type, then converted
back to pointer to void, and the result will compare equal to the original pointer.
(regarding intptr_t and uintptr_t)
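A minimal sketch of the round trip that 7.18.1.4 describes, assuming a C99 implementation that provides uintptr_t. The helper names (ptr_to_int, int_to_ptr, round_trip_ok) are made up for illustration and do not come from the standard or from this thread:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: store a pointer value in an integer.
   The parameter is void *, so any object pointer passed in is
   converted to void * first, which is what the guarantee covers. */
uintptr_t ptr_to_int(void *p)
{
    return (uintptr_t)p;
}

/* Hypothetical helper: recover the pointer from the stored integer. */
void *int_to_ptr(uintptr_t n)
{
    return (void *)n;
}

/* Returns 1 if a pointer to a local int survives the round trip. */
int round_trip_ok(void)
{
    int value = 42;
    int *original = &value;
    uintptr_t stored = ptr_to_int(original);   /* int * -> void * -> uintptr_t */
    int *restored = int_to_ptr(stored);        /* uintptr_t -> void * -> int * */

    assert(restored == original);
    return restored == original && *restored == 42;
}
```

Only values obtained from a valid pointer are covered; converting an arbitrary integer to a pointer remains implementation-defined.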
On the document I found, named "ISO/IEC 9899:TC3", I have found on
paragraph 6.3.2.3 that pointers can be converted into integers and
vice versa, provided the integer type is large enough, the result
being implementation-defined. Does it mean the standard does not
guarantee that converting to an integer and back to a pointer yields
the same pointer? Or is it written somewhere else?
From 6.3.2.3:
Any pointer type may be converted to an integer type. Except as previously specified, the
result is implementation-defined. If the result cannot be represented in the integer type,
the behavior is undefined. The result need not be in the range of values of any integer
type.
That means it's not safe to use any integer type other than intptr_t or uintptr_t!
Even uintmax_t can invoke undefined behavior, when there is no
intptr_t or uintptr_t provided (they are optional types), and when a
pointer is larger than the largest integer type in the implementation.
I gather that this document is about C99, is the answer different in
C89?
There's no answer in C89. In older code, unsigned long was used, but
that's not safe either.
It was obviously used in places where the details were known
(compiler, platform, etc).
And supposing that conversion might work, should I make sure the
original pointer and the retrieved pointer have exactly the same type,
or can it work with one them being of a given type and the other one
being a void* later converted into a pointer of the correct type?
You have to cast the pointer to `void *' before you assign it to a
uintptr_t or intptr_t.
In any case it works, is there a portable way to know what integer
types are large enough to hold a pointer value?
No.
In case it matters, my situation is that I want, for debug purposes,
to output the "value" of a pointer to the user (me), and then read
that value back from the user. The most natural way to make a user
handle pointers was to print and read it as an integer. Of course I
don't really need portability in that case, but I have the feeling
that it might be useful knowledge later on. Or maybe I'm only
overemphasizing portability the same way too many people overemphasize
performance.
You can also use the `p' conversion specifier in printf and scanf like
functions, which would probably be the best solution for your problem,
because it will also work in C89, and you don't have to worry about
the availability of uintptr_t/intptr_t.
Also, yes: It is possible to write a pointer to a file stream with %p,
then read it back.
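A rough sketch of that %p round trip, using an in-memory buffer instead of a file. The function names are hypothetical, and snprintf is a C99 function (a C89 version would use sprintf or fprintf). C99 only promises that the pointer read back compares equal if the text was produced earlier during the same program execution:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: print p with %p, read the text back with %p,
   and report whether the pointer read back compares equal to p. */
int percent_p_round_trip(void *p)
{
    char text[128];
    void *back = NULL;
    int n = snprintf(text, sizeof text, "%p", p);

    if (n < 0 || (size_t)n >= sizeof text)
        return 0;                      /* formatting failed or was truncated */
    if (sscanf(text, "%p", &back) != 1)
        return 0;                      /* text was not accepted as a pointer */
    return back == p;
}

/* Returns 1 if the address of a local object survives the trip. */
int percent_p_demo(void)
{
    int value = 7;
    int ok = percent_p_round_trip(&value);

    assert(ok == 0 || ok == 1);
    return ok;
}
```

The printed form is implementation-defined, so text typed by hand is only meaningful if it matches what the same implementation would have printed.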
 
lithiumcat

Thank you all for your answers, they were really helpful. It seems
that 6.3.2.3 wasn't all.

Although it's quite off-topic, could you tell me what that
"ISO/IEC 9899:TC3" is worth? Can I take it as a reference? If so, is
there a similar reference for C89?

Even in C89 you could printf a void* value with "%p". You
could also scanf it with "%p" (matching a void**), but the result
was not usefully defined. C99 (or maybe an intermediate revision)
tightened the language to require that the round-trip must succeed
if everything was valid to begin with, the program hasn't exited,
the pointed-to location is still valid, and so on.

Thanks a lot for the idea, I haven't even considered it. I've
actually never used any scanf-like function. I don't like using
things I don't understand, and these functions look a little bit
like magic to me. I will dig into that direction.

Now that I think about it, I made the mistake of only considering
giving the pointer to the user as an int, while actually I want to
give it as a printable string.
I can understand why you'd want to display pointer values as
a debugging aid, but it seems peculiar to want to read them back
again. Chancy, too: An innocent typo arouses the nasal demons.

I don't know if it's a usual or good way to do it, but I like to
test parts of my programs "by hand", without any input sanitization
and as little processing as possible (I want to test the part of
my program, not the draft-input I put on it for testing/debugging
purpose). In these cases, an "innocent typo" only means that I
can't deduce anything from the test, which isn't that bad (though
I should probably consider myself lucky to encounter only program
segfaults as manifestations of undefined behaviour, and not e.g.
large-scale nuclear explosions).
 
Keith Thompson

Thank you all for your answers, they were really helpful. It seems
that 6.3.2.3 wasn't all.

Although it's quite off-topic, could you tell me what that
"ISO/IEC 9899:TC3" is worth? Can I take it as a reference? If so, is
there a similar reference for C89?

I presume you're referring to n1256.pdf, available at
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf>.

That's not a completely official document, but it's good enough for
most practical purposes. It includes the official C99 standard plus
the changes made by the three Technical Corrigenda. Any changes
introduced by the TCs are marked with change bars. It's what I use
most of the time.

If you want something absolutely official, you can get a copy of the
C99 standard itself (without the TCs) by paying money to your national
standards body. I got mine from ANSI for something like $18; I think
it's gone up slightly since then. You should be able to get TC1, TC2,
and TC3 from the same source at no charge. Flipping back and forth
between the C99 standard and the changes in the TCs is tedious --
which is why n1256.pdf is so handy.

Good copies of the C89/C90 standard are a bit harder to come by. I
have a poor-quality PDF copy that I bought from ANSI for $18, but I
don't think it's available anymore. Some pre-C89 drafts are freely
available; I expect that someone will post URLs any minute now.

At least one person here prefers (I think it's) n869.txt. This is a
pre-C99 draft, *not* a C89/C90 draft. It has the advantage that it's
in plain text rather than PDF. It has the disadvantage that it's in
plain text rather than PDF. In particular, some semantically
significant formatting, particularly the use of italics, is lost --
and there were some changes between n869 and the final C99 standard.
I don't recommend it unless you have serious problems dealing with PDF
documents.
 
Flash Gordon

Keith Thompson wrote, On 06/05/08 20:46:
I presume you're referring to n1256.pdf, available at
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf>.

don't think it's available anymore. Some pre-C89 drafts are freely
available; I expect that someone will post URLs any minute now.

Links to, I think, all the relevant documents, including that one, can be
found at http://clc-wiki.net/wiki/c_standard
At least one person here prefers (I think it's) n869.txt. This is a
pre-C99 draft, *not* a C89/C90 draft. It has the advantage that it's

I don't recommend it unless you have serious problems dealing with PDF
documents.

I second that. If you want a C99 draft, the post-C99 drafts are a lot
more useful.
 
Richard Bos

Eric Sosman said:
C99 improves the situation, but only a little. If the integer
types intptr_t and uintptr_t exist, then any valid void* can be
converted to one of them and back again and survive the journey.
(There are no guarantees for invalid pointers, nor for converting
an arbitrary integer value to void* and back.) Note, though, that
these integer types are optional: If they exist they will work as
you desire, but on some "exotic" architecture they might be absent.

On the upside, if your architecture is exotic enough that it has a C99
implementation but no (u)intptr_t, it's probably not reliably possible
to do this in the first place. So if including <stdint.h> doesn't result
in a definition of UINTPTR_MAX, bailing out with an #error would
probably have been your best option anyway.
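
A minimal sketch of that bail-out, assuming a C99 compiler. The error text and the helper name store_pointer are hypothetical; the only thing taken from the standard is that <stdint.h> defines UINTPTR_MAX whenever it provides uintptr_t:

```c
#include <assert.h>
#include <stdint.h>

/* uintptr_t is optional in C99. Refuse to build where it is absent. */
#ifndef UINTPTR_MAX
#error "uintptr_t is not available: cannot store pointers in an integer here"
#endif

/* Hypothetical helper: only compiled when the guard above passes. */
uintptr_t store_pointer(void *p)
{
    uintptr_t n = (uintptr_t)p;

    assert((void *)n == p);   /* the round trip 7.18.1.4 promises */
    return n;
}
```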

BTW, I still don't understand why we have both intptr_t and uintptr_t.
We really only need either of those.

Richard
 
vippstar

On the upside, if your architecture is exotic enough that it has a C99
implementation but no (u)intptr_t, it's probably not reliably possible
to do this in the first place. So if including <stdint.h> doesn't result
in a definition of UINTPTR_MAX, bailing out with an #error would
probably have been your best option anyway.

BTW, I still don't understand why we have both intptr_t and uintptr_t.
We really only need either of those.
I explained that in my other post.
vippstar said:
That's possible, using `intptr_t' or `uintptr_t'.
You have to include <stdint.h> to use them.
Which of the two you use to store the void pointer does not matter.
The type only matters if you use it in arithmetic.
The type only matters if you use it in arithmetic.
ie, uintptr_t foo = malloc(123); foo = ~foo; free((void*)~free);
^ ^^^^
With intptr_t undefined behavior might be invoked in this example (if,
for example, malloc() returns NULL)

Ignore the small error I made (typing free instead of foo), and that I
did not cast malloc(123) to (uintptr_t), which might be necessary.
The point of this snippet is to show that applying ~ to a signed integer
with value 0 might produce a trap representation, which you can avoid by
using an unsigned integer. I cannot think of an example where signed is
preferred to unsigned, but there has to be at least one.
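
With both slips fixed, the snippet might look like the sketch below. The function name complement_round_trip is made up for illustration, and the code assumes uintptr_t exists and is at least as wide as unsigned int, so that ~ operates on an unsigned type and applying it twice restores the original value:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if the pointer recovered after two complements
   compares equal to the one malloc() returned. */
int complement_round_trip(void)
{
    void *block = malloc(123);
    uintptr_t foo = (uintptr_t)block;  /* explicit cast: no implicit conversion */
    void *back;
    int same;

    foo = ~foo;                        /* well defined on an unsigned type */
    back = (void *)~foo;               /* second ~ restores the stored value */
    same = (back == block);
    assert(same);

    free(back);                        /* also fine if malloc() returned NULL */
    return same;
}
```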
 
Stephen Sprunk

Richard Bos said:
On the upside, if your architecture is exotic enough that it has a C99
implementation but no (u)intptr_t, it's probably not reliably possible
to do this in the first place. So if including <stdint.h> doesn't result
in a definition of UINTPTR_MAX, bailing out with an #error would
probably have been your best option anyway.

BTW, I still don't understand why we have both intptr_t and uintptr_t.
We really only need either of those.

.... on any given system. I know of one system that defines pointers to be
signed, so storing them in a signed integer type makes sense, particularly
if one wants to (non-portably) manipulate them. On most other systems I'm
familiar with, pointers are unsigned, so you'd want to store them in an
unsigned integer type.

More practically speaking, I bet both exist simply because it's symmetric
and doesn't hurt. It also stops people from typing "unsigned intptr_t" if
that's what they want...

S
 
Peter Nilsson

Richard said:
BTW, I still don't understand why we have both intptr_t and
uintptr_t. We really only need either of those.

Because of 6.2.5p6, "For each of the signed integer types,
there is a corresponding (but different) unsigned integer..."
 
Peter Nilsson

^ ^^^^

Ignore the small error I made (typing free instead of foo), and
that I did not cast malloc(123) to (uintptr_t), which might be
necessary.

It is.
The point of this snip is to show that ~ in signed integer with
value 0 might be a trap representation, which you can avoid
by using an unsigned integer. I cannot think of an example
that signed is preferred to unsigned, but there has to be at
least one.

There are hashing techniques that involve negative values.
[Note that Java has no unsigned types.]
 
Keith Thompson

Stephen Sprunk said:
BTW, I still don't understand why we have both intptr_t and uintptr_t.
We really only need either of those.
[...]

More practically speaking, I bet both exist simply because it's
symmetric and doesn't hurt. It also stops people from typing
"unsigned intptr_t" if that's what they want...

There's also the fact that you can't legally apply "unsigned" to a
typedef.
 
Richard Bos

Peter Nilsson said:
Because of 6.2.5p6, "For each of the signed integer types,
there is a corresponding (but different) unsigned integer..."

That's the most reasonable argument I've yet seen.

Richard
 
