Help with void's


none

Hello, I was reading some of the FAQ last night and got to this one,
http://c-faq.com/ptrs/genericpp.html . Now i'd just like to make sure
i'm understanding it correctly, so please correct me where i'm wrong.
The code below is just a quick example, and makes assumptions to keep
it simple (i.e malloc never fails).

#include <stdio.h>
#include <stdlib.h>

/* This is wrong since *ptr was not converted */
void foo( void **p )
{
    int **ptr = (int**)p;
    *ptr = malloc( sizeof(int)*6 );
    return;
}

/* This is ok since *p will be implicitly casted to return
   pointer-to-int while remaining pointer-to-void */
int* bar( void **p )
{
    *p = malloc( sizeof(int)*6 );
    return *p;
}

/* This is fine since v is the pointer-to-void from malloc */
void foo2( void *v )
{
    int *i = v;
    printf("%d\n", *i);
}

int main(void)
{
    int *i=NULL;
    void *v = i;

    foo(&v);
    free(v);

    i = bar(&v);
    *i = 8;
    foo2(v);

    free(i);

    return 0;
}
 

Micah Cowan

Hello, I was reading some of the FAQ last night and got to this one,
http://c-faq.com/ptrs/genericpp.html . Now i'd just like to make sure
i'm understanding it correctly, so please correct me where i'm wrong.
The code below is just a quick example, and makes assumptions to keep
it simple (i.e malloc never fails).

#include <stdio.h>
#include <stdlib.h>

/* This is wrong since *ptr was not converted */
void foo( void **p )
{
    int **ptr = (int**)p;
    *ptr = malloc( sizeof(int)*6 );
    return;
}

It's wrong in the worst possible way, in that the compiler will
silently let you do what you asked it to.

Your comment regarding *ptr not being converted is correct: no
conversion (to int*) happened. If void* and int* happen to have the
same representation, then it will probably work (but it's not
portable). Otherwise: bang!

/* This is ok since *p will be implicitly casted to return
   pointer-to-int while remaining pointer-to-void */
int* bar( void **p )
{
    *p = malloc( sizeof(int)*6 );
    return *p;
}

You mean implicitly converted. "Implicit cast" is an oxymoron: a cast
is an inherently explicit conversion operation.

But yes: a pointer of type void* can always be assigned to any other
pointer type.

/* This is fine since v is the pointer-to-void from malloc */
void foo2( void *v )
{
    int *i = v;
    printf("%d\n", *i);
}

This appears to be fine.
 

Me

Congratulations, you got them all right. You placed higher than the
majority of Microsoft programmers:
http://blogs.msdn.com/oldnewthing/archive/2004/03/26/96777.aspx#100845

I don't like the explanation in the FAQ though. It has this lengthy
explanation about sizes and representations but the reason it's
disallowed is because of aliasing. Here is a simple example (assume int
and long have the same size/representation on this implementation):

int intv;

long longv = 10;
memcpy(&intv, &longv, sizeof(int));

would correctly assign 10 to intv on this machine (since the character
types are special cased to alias all objects (and memcpy basically does
an array of character copy)) but:

*(long*)&intv = 10;

would be undefined because of aliasing. Aliasing exists because of
optimization. It allows the compiler to study the types of the
variables and make certain assumptions based on what the standard
guarantees.

In any event, since we lied to the compiler here, the compiler sees
this as writing to a long variable. What's likely to happen in real
life on an optimizing compiler that takes advantage of "type based
alias analysis" (http://www.nullstone.com/htmls/category/aliastyp.htm
is the first hit on google I found if you want to read more) is that
the compiler is free to move anything that uses the intv variable
around as if this assignment to it never happened. You can visualize
this as writing the value 10 to some random location in memory at some
point in time and leaving intv uninitialized. I say "at some point in
time" because if the compiler does rearranging:

*(long*)&intv = 10;
intv = 5;

The compiler can pull the = 5 line above the = 10 line and if this
random location in memory just happens to be intv, intv becomes 10
instead of 5. But all of this is just one of the many effects of
undefined behavior.
 

Richard G. Riley

"Me"posted the following on 2006-03-11:
Congratulations, you got them all right. You placed higher than the
majority of Microsoft programmers:
http://blogs.msdn.com/oldnewthing/archive/2004/03/26/96777.aspx#100845

I don't like the explanation in the FAQ though. It has this lengthy
explanation about sizes and representations but the reason it's
disallowed is because of aliasing. Here is a simple example (assume int
and long have the same size/representation on this implementation):

int intv;

long longv = 10;
memcpy(&intv, &longv, sizeof(int));

would correctly assign 10 to intv on this machine (since the character
types are special cased to alias all objects (and memcpy basically does
an array of character copy)) but:

*(long*)&intv = 10;

would be undefined because of aliasing. Aliasing exists because of
optimization. It allows the compiler to study the types of the
variables and make certain assumptions based on what the standard
guarantees.

In any event, since we lied to the compiler here, the compiler sees
this as writing to a long variable. What's likely to happen in real
life on an optimizing compiler that takes advantage of "type based
alias analysis"
(http://www.nullstone.com/htmls/category/aliastyp.htm

Very interesting: a bug based on this is one I would not like to have
to dig out.

is the first hit on google I found if you want to read more) is that
the compiler is free to move anything that uses the intv variable
around as if this assignment to it never happened. You can visualize
this as writing the value 10 to some random location in memory at some
point in time and leaving intv uninitialized. I say "at some point in
time" because if the compiler does rearranging:

*(long*)&intv = 10;
intv = 5;

The compiler can pull the = 5 line above the = 10 line and if this
random location in memory just happens to be intv, intv becomes 10
instead of 5. But all of this is just one of the many effects of
undefined behavior.

Not knowing too much about compiler technology, I have to ask: does it
really not still tag both lines as using a memory address involving
intv and therefore guarantee execution order?

Is there a difference with, e.g.,

foo(*(long*)&intv,10);
intv =5;

or may the compiler still make the same assumptions?
 

Richard Heathfield

Me said:
Congratulations, you got them all right. You placed higher than the
majority of Microsoft programmers:

That's hardly difficult...

...but that discussion pertains to C++ code, not C code. Just as the expert
on crocodile eyebrows tends to refer matters pertaining to alligator
eyebrows to his esteemed colleague even if he's pretty sure he understands
the issue, so we leave C++ matters to comp.lang.c++.
 

Mark F. Haigh

Me said:
I don't like the explanation in the FAQ though. It has this lengthy
explanation about sizes and representations but [...]

[OT]

I don't like the comp.lang.c FAQ either. I've been absent from
comp.lang.c for a couple months due to a demanding contract (involves
running different OSs on a single 16-core chip simultaneously without
virtualization), but before vanishing I registered the complangc.net
domain and toyed with the idea of hosting a collaborative content
publishing site there. The idea is to have a dictatorship of the
regulars be able to add and modify FAQs, and also publish papers on C
that are informative to the community.

The idea would be to supplement comp.lang.c with a centralized,
high-quality source of organized topical information. In no way would
it usurp or displace the function of the comp.lang.c newsgroup. If I
had to sum it up, the idea is, "If it's hosted at complangc.net, it's
100% correct."

Whether or not it will actually come to fruition is yet to be
determined.

Here is a simple example (assume int
and long have the same size/representation on this implementation):

int intv;

long longv = 10;
memcpy(&intv, &longv, sizeof(int));

would correctly assign 10 to intv on this machine (since the character
types are special cased to alias all objects (and memcpy basically does
an array of character copy)) but:

*(long*)&intv = 10;

would be undefined because of aliasing. Aliasing exists because of
optimization. It allows the compiler to study the types of the
variables and make certain assumptions based on what the standard
guarantees.

In any event, since we lied to the compiler here, the compiler sees
this as writing to a long variable. What's likely to happen in real
life on an optimizing compiler that takes advantage of "type based
alias analysis" (http://www.nullstone.com/htmls/category/aliastyp.htm
is the first hit on google I found if you want to read more) is that
the compiler is free to move anything that uses the intv variable
around as if this assignment to it never happened. You can visualize
this as writing the value 10 to some random location in memory at some
point in time and leaving intv uninitialized. I say "at some point in
time" because if the compiler does rearranging:

*(long*)&intv = 10;
intv = 5;

The compiler can pull the = 5 line above the = 10 line and if this
random location in memory just happens to be intv, intv becomes 10
instead of 5. But all of this is just one of the many effects of
undefined behavior.

You're correct. Now you can say you saw it with your own eyes:

[Mildly OT]

Here's a case in point on gcc 3.2 or later. Note that the __asm__
statement is not standard C and exists in this example only to force the
compiler to reload 'a' from memory before the printf so we get the
actual value.

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

int main(void)
{
    uint32_t a = 0xC0FFEE;
    uint16_t *b = (uint16_t *) &a;   /* type-punned pointer */
    const char *p = "not taken.";

    b[1] = 0xDEAD;                   /* modifies 'a' through a uint16_t lvalue */
    if(a == 0xC0FFEE)
        p = "taken!";

    __asm__ volatile ("" ::: "memory");   /* gcc extension, see note above */
    printf("a = %#" PRIx32 "\nbranch was %s\n", a, p);
    return 0;
}

[mark@icepick ~]$ gcc -Wall -ansi -pedantic -O2 foo.c -o foo -save-temps
foo.c: In function 'main':
foo.c:10: warning: dereferencing type-punned pointer will break strict-aliasing rules
[mark@icepick ~]$ ./foo
a = 0xdeadffee
branch was taken!

Very unsurprising, but some people continue to be surprised by it.


Mark F. Haigh
 

Mike

You mean implicitly converted. "Implicit cast" is an oxymoron: a cast
is an inherently explicit conversion operation.

Hmm, seems I still need work on my terminology :). Thanks for the
reply.
 

Me

Richard said:
"Me"posted the following on 2006-03-11:

Is there a difference with, e.g.,

foo(*(long*)&intv,10);
intv =5;

or may the compiler still make the same assumptions?

I have no clue what that example means but I think what you're asking
is for the compiler to rely on static analysis. This has a few
problems, the major issue being code that crosses 2 translation units.
In this case, the static analysis is delayed until link time. For
static linking, this can be done but with dynamic linking this is very
difficult or impossible depending on the platform.

There are a few cases. Let's call static analysis S and type-based
alias analysis A:

(neither) - This is a very crappy compiler. Most compilers in debug
mode do this.
S - Ignore the type system. The types are used only for type checking
and to set size/alignment/representation. You're basically running an
optimizing BCPL compiler with friendlier syntax on non-word
aligned/sized objects at this point. Your average C programmer assumes
this is correct.
A - Don't do any object tracking at all. I don't know of any compilers
that do this. It basically assumes the programmers know what they're
doing and if they lie, their program is likely to crash (if you're
lucky). This is what the C standard assumes is correct.
SA - This is what most compilers do with full optimizations. Depending
on the compiler, this is what happens:

if it detects you lying:
- it can warn you. This is good for tools like lint that detect broken
code.
- it can side with the aliasing and assume the programmer knows what
they're doing (case A). This leads to the fastest code at the expense
of crashing.
- it can side with the object tracking that assumes the programmer is
an idiot (case S). The code is slower than siding with the aliasing
case but it works.

if it detects no lie:
- rely on both type and object information to generate the fastest code
possible.

if it can't decide if you lied or not:
- make a conservative choice to allow for bad programmers (case S)
- make an aggressive choice to allow for faster code (case A)

What you're suggesting is for the compiler to do SA but be
conservative. It's a bad idea to rely on this because you're just
wallpapering over the problem and as your code piles up over time, this
bug becomes way more expensive/difficult to detect and fix. The sad
reality is that the majority of code you find makes this kind of
mistake because its authors are ignorant of what the standard says and
compilers in the past weren't aggressive with optimization based on the
type system.
 
