type-punning?

j.j.fishbat

Hi all

I have a program which, with inessential details removed,
looks like this:

--
#include <stdlib.h>
#include <stdio.h>

static int punme(void** dat,size_t newsize)
{
void *newdat = realloc(*dat,newsize);

if (! newdat) return 1;

*dat = newdat;

return 0;
}

int main (void)
{
char *dat = malloc(30);

int ret = punme((void**)&dat,40);

printf("punme returns %i\n",ret);

return 0;
}
--

My compiler (gcc -Wall -O3) tells me that "typepun.c:19:
warning: dereferencing type-punned pointer will break
strict-aliasing rules". The code works as expected, i.e., it
prints that punme returns 0.

Who is the idiot? Me or the compiler?

Many thanks!

Jim
 
vippstar

Who is the idiot? Me or the compiler?
You (cast) and assign a char ** to a void **. void ** is not like void
*.
Fixed your code:
static int punme(void* dat,size_t newsize) {
void **tmp = dat;
void *newdat = realloc(*tmp, newsize);

You also don't free the allocated memory.
 
Harald van Dijk

Heh, this looks familiar. :)
You (cast) and assign a char ** to a void **. void ** is not like void
*.
Fixed your code:

This doesn't fix the problem. This merely reorganises the code in a form
that the compiler may happen to not warn about.

The problem is that dat is defined as char *. You can't pretend it's
defined as a void *. The language doesn't let you. A fixed version looks
like

#include <stdlib.h>
#include <stdio.h>

static void *punme(void *dat, size_t newsize) {
return realloc(dat, newsize);
}

int main (void)
{
char *dat = malloc(30);
char *newdat = punme(dat, 40); /* or call realloc directly */
if (newdat != NULL) dat = newdat;
printf("punme would have returned %d\n", (newdat == NULL ? 1 : 0));
return 0;
}
 
Andrey Tarasevich

Who is the idiot? Me or the compiler?
...

You perform memory reinterpretation in your code. You have an lvalue
'dat' of type 'char*', which is reinterpreted as an lvalue of type
'void*' by using a conversion followed by a dereference '*(void**)
&dat'. In general, accessing the lvalue obtained by such
reinterpretation causes undefined behavior in C, unless the types are
"similar enough". In strict-aliasing mode (implied by -O3) GCC assumes
that such reinterpretations are not performed in the code. This is why
it issues the warning.
 
j.j.fishbat

Hi
This doesn't fix the problem. This merely reorganises the code in a form
that the compiler may happen to not warn about.

The problem is that dat is defined as char *. You can't pretend it's
defined as a void *. The language doesn't let you.

are you saying that

---
int foo(void* bar)
{
char *st = (char*)bar;
:
}

:
:

char *bar;
:

foo((void*)bar);
---

is not possible in C?

Are we using the same language?
A fixed version looks
like

#include <stdlib.h>
#include <stdio.h>

static void *punme(void *dat, size_t newsize) {
return realloc(dat, newsize);

}

int main (void)
{
char *dat = malloc(30);
char *newdat = punme(dat, 40); /* or call realloc directly */
if (newdat != NULL) dat = newdat;
printf("punme would have returned %d\n", (newdat == NULL ? 1 : 0));
return 0;

}

Part of the "inessential detail" omitted is that
I use the return value of punme() for other purposes,
I specifically want to modify dat (char*) which I pass
to punme() by reference.

Is there really no way at all to modify a generic
pointer by passing a reference to it to a function?
Are C programmers condemned to return structs
(one member of which is the void* modified) whenever
we want to do generic programming??

Thanks!
 
Andrey Tarasevich

are you saying that

---
int foo(void* bar)
{
char *st = (char*)bar;
:
}

:
:

char *bar;
:

foo((void*)bar);

This is not even remotely the same thing as your original code. This
version uses _conversion_. Your original version performed memory
reinterpretation.

Compare the following two examples:

Let's assume that sizeof(float) is the same as sizeof(int)

float f = 1.0;
int i;

Example 1. Conversion

i = (int) f;
/* Here we can expect i to be equal to 1 */
assert(i == 1);

Example 2. Memory reinterpretation

i = *(int*) &f;
/* Can we say anything about the value of i here? NO! */

The difference between these two examples is exactly the difference
between your two code samples.
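
To make the contrast concrete, here is a minimal sketch of both operations written as functions. The names convert_value and reinterpret_bits are made up for illustration, and the sketch assumes a 32-bit float. The reinterpretation is done with memcpy, which copies the bytes without dereferencing a cast pointer, so unlike *(int*) &f it does not break the aliasing rules. The bit pattern it returns is still implementation-specific.

```c
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: float occupies exactly 32 bits. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes 32-bit float");

/* Example 1: conversion. The value is converted, truncating toward zero. */
static int convert_value(float f)
{
    return (int)f;
}

/* Example 2: reinterpretation, done by copying bytes rather than by
   dereferencing a cast pointer. The result is the float's bit pattern. */
static uint32_t reinterpret_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

With IEEE 754 floats, convert_value(1.0f) yields 1, while reinterpret_bits(1.0f) yields 0x3F800000, which has nothing to do with the integer 1.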
Is there really no way at all to modify a generic
pointer by passing a reference to it to a function?

No.
 
Harald van Dijk

Is there really no way at all to modify a generic pointer by passing a
reference to it to a function?

No, there's really no way to do that, any more than it's possible to
modify a generic integer (char, short, int, long) by passing a reference
to it to a function. It's obvious that that won't work on most systems
with integers, because different integer types are represented
differently. When you consider that C allows different pointer types to be
represented differently too (even though most implementations don't do
so), I hope you can see why what you're asking for is not possible.
 
Kenneth Brody

The problem is that dat is defined as char *. You can't pretend it's
defined as a void *. The language doesn't let you.

are you saying that

---
int foo(void* bar)
{ [...]
:
char *bar;
:
foo((void*)bar);

That's not the same thing. In your original post, you had:

static int punme(void **dat,size_t newsize)
...
char *dat = malloc(30);
int ret = punme((void **)&dat,40);

Here, you are telling punme that it gets passed a pointer to "void *"
but you are passing a pointer to "char *". Note "pointer to" in both
of those pieces.

[...]
Is there really no way at all to modify a generic
pointer by passing a reference to it to a function?
Are C programmers condemned to return structs
(one member of which is the void* modified) whenever
we want to do generic programming??

But you are not modifying a "generic pointer". Or, at least, you
are not _passing_ a "generic pointer".

Consider the fact that a "void *" and a "char *" need not be stored
the same way. Yes, a function can take a "void *" and you can pass
it a "char *", but that is because the value will be converted to a
"void *" before being passed. However, what you are trying to do is
pass a pointer to "char *", while a pointer to "void *" is expected.
No conversion of your original "char *" will take place. This is
the same as if you had:

void foo(double *pt)
...
float f;
foo((double *)&f);

In your case, the program "works" because, in all likelihood, the
representation of "void *" and "char *" are the same.
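
For comparison, here is a sketch of how the float/double case can be handled with a real conversion instead of a pointer cast. The function names are hypothetical; halve stands in for foo and simply halves the value so that there is something to observe.

```c
/* Hypothetical stand-in for foo: modifies the double it is pointed at. */
static void halve(double *pt)
{
    *pt = *pt / 2.0;
}

/* Convert the float to a real double object, let the callee modify that,
   then convert the result back. No pointer cast is involved. */
static float halve_float(float f)
{
    double tmp = f;      /* float -> double conversion */
    halve(&tmp);         /* callee sees a genuine double */
    return (float)tmp;   /* double -> float conversion */
}
```

The same shape (convert into a temporary of the expected type, call, convert back) is what makes the pointer case work portably.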

--
+-------------------------+--------------------+-----------------------+
| Kenneth J. Brody | www.hvcomputer.com | #include |
| kenbrody/at\spamcop.net | www.fptech.com | <std_disclaimer.h> |
+-------------------------+--------------------+-----------------------+
Don't e-mail me at: <mailto:[email protected]>
 
lawrence.jones

Andrey Tarasevich said:
You perform memory reinterpretation in your code. You have an lvalue
'dat' of type 'char*', which is reinterpreted as an lvalue of type
'void*' by using a conversion followed by a dereference '*(void**)
&dat'. In general, accessing the lvalue obtained by such
reinterpretation causes undefined behavior in C, unless the types are
"similar enough".

In particular, (char *) and (void *) are required to have the same
representation and alignment requirements, so reinterpretation is almost
(but not quite) guaranteed to work in that case. The same is not true
of other pointer types, however, and reinterpretation involving other
pointer types *does* fail on some implementations.

-- Larry Jones

Apparently I was misinformed. -- Calvin
 
Harald van Dijk

Consider the fact that a "void *" and a "char *" need not be stored the
same way.

Actually, void * and char * are special in that they must be stored the
same way. It may be better to pretend they too might be stored
differently, as you still aren't allowed to access one as the other anyway
(except in special circumstances), and this guarantee does not extend to
other pointer types, but I'm mentioning it for completeness.
 
j.j.fishbat

Hi all
Consider the fact that a "void *" and a "char *" need not be stored
the same way.

OK, I am the idiot then. I had thought, for the past 10 years of
C programming, that a pointer to char was the same as a
pointer to int was the same as a pointer to foobar_t, the only
difference being what it pointed to. And that this fact enables
one to cast to void* to enable generic programming.

I see I'll have to rewrite some code.

Thank you again to all who replied!

Jim
 
Kenneth Brody

Well, you have probably been programming on a platform on which all
of the data pointer types are represented the same way. (And, I
must confess, so have I.) However, C does not guarantee such a
thing, and I believe others have posted here examples of systems
in which they are, in fact, represented differently.

Yes, you can convert from any pointer to "void *" and back again.
But, you can't, for example, convert from "int *" to "void *" and
then to "long *", and dereference that pointer without invoking
UB. (In fact, the converting to "long *" step itself may be UB.)

Note, too, that Mr. van D?k (sorry about the mangling of the name,
but my newsreader is obviously not UTF-8 compliant) says that the
standard does say that "void *" and "char *" are special cases
which must be represented the same way. If that is true, then
your original example will "work" as-is, but only because you
are converting "char **" to "void **". It's not guaranteed to
work with, for example, "int **" to "void **", in a "portable"
manner. Again, it will probably "work" on your systems, because
they are likely to be storing "void *" and "int *" in the same
representation.

 
Kaz Kylheku

Hi



are you saying that

---
int foo(void* bar)
{
  char *st = (char*)bar;
     :

}

 :
 :

char *bar;
 :

foo((void*)bar);

No, because here you are doing a well-defined conversion from char *
to void * and back. There is no type-punning going on.

(In fact, you have not even dereferenced a pointer at all!)

The problem is pretending that a char * object /is/ a void *, by
aiming a void ** pointer at it and dereferencing it. This is what your
original program does, and what is diagnosed by GCC (it breaks strict
aliasing rules).

What this means is that when the compiler is optimizing code, it won't
be making any consideration about data flow paths through the code
which are based on wrongful aliasing.

The C language allows this, by marking such wrongful aliasing as
undefined behavior. Undefined behavior means that the program has a
problem, and there are no requirements about how your language
implementation should respond. (Or any such requirements do not come
from the language standard, but from somewhere else, like the
implementor's commitment to some additional design documents about
their compiler.)

Here is an example.

Suppose you have a function in which you are working with, say,
variables of type double (it's some kind of number crunching). And
suppose that the same code also manipulates data through int * type
pointers.

The compiler can assume that the int * pointers are not aimed at any
of the double type objects, and optimize accordingly.

If an assignment occurs through the int *, the compiler can assume
that none of the values of type double are touched.
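
A minimal sketch of the kind of function this describes (the name store_and_reload is made up for illustration):

```c
/* Because int and double are incompatible types, the compiler may assume
   ip and dp never refer to the same object. */
static int store_and_reload(int *ip, double *dp)
{
    *ip = 1;
    *dp = 2.0;    /* assumed not to modify *ip */
    return *ip;   /* may therefore be compiled as "return 1" */
}
```

Called with pointers to two distinct objects, this returns 1 at any optimization level. Called as store_and_reload((int *) &d, &d), the behavior is undefined and the result may well depend on the optimization level.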

Similar reasoning applies to any two incompatible types:

char *a = "abc";
void **pa = (void **) &a; // pa is a type-punned pointer
*pa = "def";

puts(a);

What should this print, abc or def? Trick question: the behavior is
undefined, so anything can happen. But it's a realistic possibility
that in fact abc is printed, because an optimizer may simply discard
the possibility that the *pa = "def" assignment has any effect on the
value of a. The reason is that *pa and a have different,
incompatible types. Assignment to a void * lvalue is not supposed to
have an effect on the value of any char * object anywhere in the
program.

Recent versions of the GNU compiler can diagnose some cases of
type-punned pointers. Code like that is recognized as representing a
kind of common programming practice. It is supported by GCC as an
extension to undefined behavior. You must use the compiler option
-fno-strict-aliasing. The diagnostic message will then go away. Of
course, you are no longer programming in ISO C, but in a dialect which
supports aliasing.

And note that this is not perfectly well-defined either. GNU C does
not say, for instance, what will happen to the value of x if you do
this, even if -fno-strict-aliasing is enabled:

double x = 3.14;
int *p = (int *) &x;
*p = 42;

The -fno-strict-aliasing extension works with objects that have a
compatible representation. I.e. things that are de-facto compatible,
or highly compatible, at the implementation level. Things like
structurally equivalent structs, pointers (on most machines), signed
and unsigned versions of integral types, etc.
Is there really no way at all to modify a generic
pointer by passing a reference to it to a function?

Yes, there is a way. You can convert the pointer to void *, store it
in a void * object, then have the function modify that void * object.
Then, convert the modified void * back to the original type.

int wrap_realloc(void **pp, size_t nsz)
{
void *np = realloc(*pp, nsz);
if (np != 0) {
*pp = np;
return 1;
}
return 0; // leave *pp alone
}

Use it:

char *ptr = malloc(10);
/* check for errors etc */

void *modify = ptr;

if (wrap_realloc(&modify, 20)) {
/* success */
ptr = modify;
}

The problem with the void ** aliasing hack is that it doesn't do a
proper conversion like this. It just assumes that the original pointer
ptr can be treated as if it were a void *. The aliasing rules forbid
this even if a void * has exactly the same representation as a char *
in order to allow for aggressive optimization based on type.
 
Harald van Dijk

Note, too, that Mr. van D?k (sorry about the mangling of the name, but
my newsreader is obviously not UTF-8 compliant)

<OT> It could be that your system font simply doesn't contain the
character, but the newsreader still is UTF-8 compliant. :) </OT>

says that the standard
does say that "void *" and "char *" are special cases which must be
represented the same way. If that is true,

6.2.5p27:
"A pointer to void shall have the same representation and alignment
requirements as a pointer to a character type."
then your original example
will "work" as-is, but only because you are converting "char **" to
"void **".

No, and I had tried to make that explicit. I wrote "you still aren't
allowed to access one as the other anyway (except in special
circumstances)". Those special circumstances would be using memcpy or a
union to copy between distinct pointer types. Using *(char **) &p when p
is defined as void *p, or vice versa, is not one of those special
circumstances, and when you do that, the behaviour is still undefined,
just like the behaviour is undefined if you use *(int **) &p on systems
where int * has the same representation and alignment requirements as
void *.
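
For the char * / void * pair, the memcpy route might look like the sketch below. The function name resize_char_ptr is hypothetical, and the sketch leans on the 6.2.5p27 guarantee quoted above, so it should not be used for int * or other pointer types whose representation may differ from that of void *.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper. char_ptr_obj must be the address of a char * object
   (assumption: only char * is supported, since it shares its representation
   with void *). Returns 0 on success, 1 on failure, like punme. */
static int resize_char_ptr(void *char_ptr_obj, size_t newsize)
{
    void *old;
    void *newdat;

    /* Copy the stored pointer out byte by byte; no void ** is dereferenced. */
    memcpy(&old, char_ptr_obj, sizeof old);

    newdat = realloc(old, newsize);
    if (newdat == NULL) return 1;   /* the caller's pointer is left unchanged */

    /* Copy the new pointer back into the caller's char * object. */
    memcpy(char_ptr_obj, &newdat, sizeof newdat);
    return 0;
}
```

A caller would write char *dat = malloc(30); followed by int ret = resize_char_ptr(&dat, 40); with no cast needed, because any object pointer converts implicitly to void *. The price is that the compiler can no longer check that the argument really is the address of a char *.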
 
Ben Bacarisse

Is there really no way at all to modify a generic
pointer by passing a reference to it to a function?

Two people have said "no" here and that is because they have, quite
reasonably, taken this phrase in the context of your previous example.
However, I think you can do what you ask, provided you stick to the
narrowest interpretation of the quote above. (If not, I will get shot
down by the experts and we will both be wiser.)

What you can't do is leave the safety guarantees provided by the void
* type. So, you /can/ do this (example only, I don't recommend it!):

int my_realloc(void **ptr, size_t sz)
{
void *newp = realloc(*ptr, sz);
if (newp == NULL) {
free(*ptr);
return 0;
}
*ptr = newp;
return 1;
}


later...

char *sp = malloc(20);
void *tmp;
...
tmp = sp;
if (!my_realloc(&tmp, 30))
printf("Failed!\n");
else sp = tmp;
...

and you can do the same even for the types that do not explicitly use
the same representation: int *, or struct pointers, etc. The
assignments to a void * object make it work in a portable way.

This may not be close enough to what you want to be of any use, but I
hope it clarifies some things.
 
Default User

Harald said:
<OT> It could be that your system font simply doesn't contain the
character, but the newsreader still is UTF-8 compliant. :) </OT>

It's a char set problem. Here's the header from his message:

Content-Type: text/plain; charset=us-ascii




Brian
 
