Help with void's

Discussion in 'C Programming' started by none@mail.com, Mar 11, 2006.

  1. Guest

    Hello, I was reading some of the FAQ last night and got to this one:
    http://c-faq.com/ptrs/genericpp.html . Now I'd just like to make sure
    I'm understanding it correctly, so please correct me where I'm wrong.
    The code below is just a quick example, and makes assumptions to keep
    it simple (e.g. malloc never fails).

    #include <stdio.h>
    #include <stdlib.h>

    /* This is wrong since *ptr was not converted */
    void foo( void **p )
    {
        int **ptr = (int**)p;
        *ptr = malloc( sizeof(int)*6 );
        return;
    }

    /* This is ok since *p will be implicitly casted to return
       pointer-to-int while remaining pointer-to-void */
    int* bar( void **p )
    {
        *p = malloc( sizeof(int)*6 );
        return *p;
    }

    /* This is fine since v is the pointer-to-void from malloc */
    void foo2( void *v )
    {
        int *i = v;
        printf("%d\n", *i);
    }

    int main(void)
    {
        int *i=NULL;
        void *v = i;

        foo(&v);
        free(v);

        i = bar(&v);
        *i = 8;
        foo2(v);

        free(i);

        return 0;
    }
     
    , Mar 11, 2006
    #1

  2. Micah Cowan Guest

    writes:

    > Hello, I was reading some of the FAQ last night and got to this one,
    > http://c-faq.com/ptrs/genericpp.html . Now i'd just like to make sure
    > i'm understanding it correctly, so please correct me where i'm wrong.
    > The code below is just a quick example, and makes assumptions to keep
    > it simple (i.e malloc never fails).
    >
    > #include <stdio.h>
    > #include <stdlib.h>
    >
    > /* This is wrong since *ptr was not converted */
    > void foo( void **p )
    > {
    > int **ptr = (int**)p;
    > *ptr = malloc( sizeof(int)*6 );
    > return;
    > }


    It's wrong in the worst possible way, in that the compiler will
    silently let you do what you asked it to.

    Your comment regarding *ptr not being converted is correct: no
    conversion (to int*) happened. If void* and int* happen to have the
    same representation, then it will probably work (but it's not
    portable). Otherwise: bang!

    > /* This is ok since *p will be implicitly casted to return
    > pointer-to-int while remaining pointer-to-void */
    > int* bar( void **p )
    > {
    > *p = malloc( sizeof(int)*6 );
    > return *p;
    > }


    You mean implicitly converted. "Implicit cast" is an oxymoron: a cast
    is an inherently explicit conversion operation.

    But yes: a pointer of type void* can always be assigned to any other
    pointer type.

    > /* This is fine since v is the pointer-to-void from malloc */
    > void foo2( void *v )
    > {
    > int *i = v;
    > printf("%d\n", *i);
    > }


    This appears to be fine.

    >
    > int main(void)
    > {
    > int *i=NULL;
    > void *v = i;
    >
    > foo(&v);
    > free(v);
    >
    > i = bar(&v);
    > *i = 8;
    > foo2(v);
    >
    > free(i);
    >
    > return 0;
    > }
     
    Micah Cowan, Mar 11, 2006
    #2

  3. Me Guest

    wrote:
    > Hello, I was reading some of the FAQ last night and got to this one,
    > http://c-faq.com/ptrs/genericpp.html . Now i'd just like to make sure
    > i'm understanding it correctly, so please correct me where i'm wrong.
    > The code below is just a quick example, and makes assumptions to keep
    > it simple (i.e malloc never fails).
    >
    > #include <stdio.h>
    > #include <stdlib.h>
    >
    > /* This is wrong since *ptr was not converted */
    > void foo( void **p )
    > {
    > int **ptr = (int**)p;
    > *ptr = malloc( sizeof(int)*6 );
    > return;
    > }
    >
    > /* This is ok since *p will be implicitly casted to return
    > pointer-to-int while remaining pointer-to-void */
    > int* bar( void **p )
    > {
    > *p = malloc( sizeof(int)*6 );
    > return *p;
    > }
    >
    > /* This is fine since v is the pointer-to-void from malloc */
    > void foo2( void *v )
    > {
    > int *i = v;
    > printf("%d\n", *i);
    > }
    >
    > int main(void)
    > {
    > int *i=NULL;
    > void *v = i;
    >
    > foo(&v);
    > free(v);
    >
    > i = bar(&v);
    > *i = 8;
    > foo2(v);
    >
    > free(i);
    >
    > return 0;
    > }


    Congratulations, you got them all right. You placed higher than the
    majority of Microsoft programmers:
    http://blogs.msdn.com/oldnewthing/archive/2004/03/26/96777.aspx#100845

    I don't like the explanation in the FAQ, though. It has this lengthy
    explanation about sizes and representations, but the reason it's
    disallowed is aliasing. Here is a simple example (assume int and long
    have the same size/representation on this implementation):

    int intv;

    long longv = 10;
    memcpy(&intv, &longv, sizeof(int));

    would correctly assign 10 to intv on this machine (the character
    types are special-cased to alias all objects, and memcpy basically
    copies an array of characters), but:

    *(long*)&intv = 10;

    would be undefined because of aliasing. The aliasing rules exist
    because of optimization. They allow the compiler to study the types
    of the variables and make certain assumptions based on what the
    standard guarantees.

    In any event, since we lied to the compiler here, the compiler sees
    this as a write to a long variable. On an optimizing compiler that
    takes advantage of "type-based alias analysis"
    (http://www.nullstone.com/htmls/category/aliastyp.htm is the first
    hit on Google I found, if you want to read more), what's likely to
    happen in real life is that the compiler moves anything that uses
    the intv variable around as if this assignment to it never happened.
    You can visualize this as writing the value 10 to some random
    location in memory at some point in time and leaving intv
    uninitialized. I say "at some point in time" because the compiler
    may reorder statements. Given:

    *(long*)&intv = 10;
    intv = 5;

    The compiler can pull the = 5 line above the = 10 line and if this
    random location in memory just happens to be intv, intv becomes 10
    instead of 5. But all of this is just one of the many effects of
    undefined behavior.
     
    Me, Mar 11, 2006
    #3
  4. "Me" posted the following on 2006-03-11:

    > wrote:
    >> Hello, I was reading some of the FAQ last night and got to this one,
    >> http://c-faq.com/ptrs/genericpp.html . Now i'd just like to make sure
    >> i'm understanding it correctly, so please correct me where i'm wrong.
    >> The code below is just a quick example, and makes assumptions to keep
    >> it simple (i.e malloc never fails).
    >>
    >> #include <stdio.h>
    >> #include <stdlib.h>
    >>
    >> /* This is wrong since *ptr was not converted */
    >> void foo( void **p )
    >> {
    >> int **ptr = (int**)p;
    >> *ptr = malloc( sizeof(int)*6 );
    >> return;
    >> }
    >>
    >> /* This is ok since *p will be implicitly casted to return
    >> pointer-to-int while remaining pointer-to-void */
    >> int* bar( void **p )
    >> {
    >> *p = malloc( sizeof(int)*6 );
    >> return *p;
    >> }
    >>
    >> /* This is fine since v is the pointer-to-void from malloc */
    >> void foo2( void *v )
    >> {
    >> int *i = v;
    >> printf("%d\n", *i);
    >> }
    >>
    >> int main(void)
    >> {
    >> int *i=NULL;
    >> void *v = i;
    >>
    >> foo(&v);
    >> free(v);
    >>
    >> i = bar(&v);
    >> *i = 8;
    >> foo2(v);
    >>
    >> free(i);
    >>
    >> return 0;
    >> }

    >
    > Congratulations, you got them all right. You placed higher than the
    > majority of Microsoft programmers:
    > http://blogs.msdn.com/oldnewthing/archive/2004/03/26/96777.aspx#100845
    >
    > I don't like the explaination in the FAQ though. It has this lengthy
    > explaination about sizes and representations but the reason it's
    > disallowed is because of aliasing. Here is a simple example (assume int
    > and long have the same size/representation on this implementation):
    >
    > int intv;
    >
    > long longv = 10;
    > memcpy(&intv, &longv, sizeof(int));
    >
    > would correctly assign 10 to intv on this machine (since the character
    > types are special cased to alias all objects (and memcpy basically does
    > an array of character copy)) but:
    >
    > *(long*)&intv = 10;
    >
    > would be undefined because of aliasing. Aliasing exists because of
    > optimization. It allows the compiler to study the types of the
    > variables and make certain assumptions based on what the standard
    > guarantees.
    >
    > In any event, since we lied to the compiler here, the compiler sees
    > this as writing to a long variable. What's likely to happen in real
    > life on an optimizing compiler that takes advantage of "type based
    > alias analysis"
    >(http://www.nullstone.com/htmls/category/aliastyp.htm


    Very interesting: a bug based on this is one I would not like to
    have to dig out.

    > is the first hit on google I found if you want to read more) is that
    > the compiler is free to move anything that uses the intv variable
    > around as if this assignment to it never happened. you can visualize
    > this as writing the value 10 to some random location in memory at some
    > point in time and leaving intv uninitialized. I say "at some point in
    > time" because if the compiler does rearranging:
    >
    > *(long*)&intv = 10;
    > intv = 5;
    >
    > The compiler can pull the = 5 line above the = 10 line and if this
    > random location in memory just happens to be intv, intv becomes 10
    > instead of 5. But all of this is just one of the many effects of
    > undefined behavior.
    >


    Not knowing too much about compiler technology: does it really not
    still tag both lines as using a memory address involving intv, and
    therefore guarantee execution order?

    Is there a difference with, e.g.,

    foo(*(long*)&intv,10);
    intv =5;

    or may the compiler still make the same assumptions?
    --
    "A desk is a dangerous place from which to view the world" - LeCarre.
     
    Richard G. Riley, Mar 11, 2006
    #4
  5. Me said:

    > Congratulations, you got them all right. You placed higher than the
    > majority of Microsoft programmers:


    That's hardly difficult...

    > http://blogs.msdn.com/oldnewthing/archive/2004/03/26/96777.aspx#100845


    ...but that discussion pertains to C++ code, not C code. Just as the expert
    on crocodile eyebrows tends to refer matters pertaining to alligator
    eyebrows to his esteemed colleague even if he's pretty sure he understands
    the issue, so we leave C++ matters to comp.lang.c++.

    --
    Richard Heathfield
    "Usenet is a strange place" - dmr 29/7/1999
    http://www.cpax.org.uk
    email: rjh at above domain (but drop the www, obviously)
     
    Richard Heathfield, Mar 11, 2006
    #5
  6. Me wrote:
    > wrote:
    >

    <snip>
    >
    > I don't like the explaination in the FAQ though. It has this lengthy
    > explaination about sizes and representations but [...]


    [OT]

    I don't like the comp.lang.c FAQ either. I've been absent from
    comp.lang.c for a couple months due to a demanding contract (involves
    running different OSs on a single 16-core chip simultaneously without
    virtualization), but before vanishing I registered the complangc.net
    domain and toyed with the idea of hosting a collaborative content
    publishing site there. The idea is to have a dictatorship of the
    regulars be able to add and modify FAQs, and also publish papers on C
    that are informative to the community.

    The idea would be to supplement comp.lang.c with a centralized,
    high-quality source of organized topical information. In no way would
    it usurp or displace the function of the comp.lang.c newsgroup. If I
    had to sum it up, the idea is, "If it's hosted at complangc.net, it's
    100% correct."

    Whether or not it will actually come to fruition is yet to be
    determined.


    > Here is a simple example (assume int
    > and long have the same size/representation on this implementation):
    >
    > int intv;
    >
    > long longv = 10;
    > memcpy(&intv, &longv, sizeof(int));
    >
    > would correctly assign 10 to intv on this machine (since the character
    > types are special cased to alias all objects (and memcpy basically does
    > an array of character copy)) but:
    >
    > *(long*)&intv = 10;
    >
    > would be undefined because of aliasing. Aliasing exists because of
    > optimization. It allows the compiler to study the types of the
    > variables and make certain assumptions based on what the standard
    > guarantees.
    >
    > In any event, since we lied to the compiler here, the compiler sees
    > this as writing to a long variable. What's likely to happen in real
    > life on an optimizing compiler that takes advantage of "type based
    > alias analysis" (http://www.nullstone.com/htmls/category/aliastyp.htm
    > is the first hit on google I found if you want to read more) is that
    > the compiler is free to move anything that uses the intv variable
    > around as if this assignment to it never happened. you can visualize
    > this as writing the value 10 to some random location in memory at some
    > point in time and leaving intv uninitialized. I say "at some point in
    > time" because if the compiler does rearranging:
    >
    > *(long*)&intv = 10;
    > intv = 5;
    >
    > The compiler can pull the = 5 line above the = 10 line and if this
    > random location in memory just happens to be intv, intv becomes 10
    > instead of 5. But all of this is just one of the many effects of
    > undefined behavior.


    You're correct. Now you can say you saw it with your own eyes:

    [Mildly OT]

    Here's a case in point on gcc 3.2 or later. Note that the __asm__
    statement is not standard C and exists in this example only to force
    the compiler to reload 'a' from memory before the printf so we get
    the actual value.

    #include <stdio.h>
    #include <stdint.h>
    #include <inttypes.h>

    int main(void)
    {
        uint32_t a = 0xC0FFEE;
        uint16_t *b = (uint16_t *) &a;
        char *p = "not taken.";

        b[1] = 0xDEAD;
        if(a == 0xC0FFEE)
            p = "taken!";

        __asm__ volatile ("" ::: "memory");
        printf("a = %#" PRIx32 "\nbranch was %s\n", a, p);
        return 0;
    }

    [mark@icepick ~]$ gcc -Wall -ansi -pedantic -O2 foo.c -o foo -save-temps
    foo.c: In function 'main':
    foo.c:10: warning: dereferencing type-punned pointer will break strict-aliasing rules
    [mark@icepick ~]$ ./foo
    a = 0xdeadffee
    branch was taken!

    Very unsurprising, but some people continue to be surprised by it.


    Mark F. Haigh
     
    Mark F. Haigh, Mar 11, 2006
    #6
  7. Mike Guest

    On 10 Mar 2006 23:42:14 -0800, "Me" <>
    wrote:

    >
    >Congratulations, you got them all right. You placed higher than the
    >majority of Microsoft programmers:
    >http://blogs.msdn.com/oldnewthing/archive/2004/03/26/96777.aspx#100845
    >


    I've used their products before, no big surprise there ;). Thanks for
    the reply though, I'm going to look up more on aliasing later on.
     
    Mike, Mar 11, 2006
    #7
  8. Mike Guest

    On Sat, 11 Mar 2006 06:55:31 GMT, Micah Cowan <>
    wrote:

    >
    >You mean implicitly converted. "Implicit cast" is an oxymoron: a cast
    >is an inherently explicit conversion operation.
    >


    Hmm, seems I still need work on my terminology :). Thanks for the
    reply.
     
    Mike, Mar 11, 2006
    #8
  9. Me Guest

    Richard G. Riley wrote:
    > "Me"posted the following on 2006-03-11:
    > > disallowed is because of aliasing. Here is a simple example (assume int
    > > and long have the same size/representation on this implementation):

    >
    > Is there a difference with, e.g.,
    >
    > foo(*(long*)&intv,10);
    > intv =5;
    >
    > or may the compiler still make the same assumptions?


    I have no clue what that example means but I think what you're asking
    is for the compiler to rely on static analysis. This has a few
    problems, the major issue being code that crosses 2 translation units.
    In this case, the static analysis is delayed until link time. For
    static linking, this can be done but with dynamic linking this is very
    difficult or impossible depending on the platform.

    There are a few cases. Let's call static analysis S and type-based
    alias analysis A:

    (neither) - This is a very crappy compiler. Most compilers in debug
    mode do this.
    S - Ignore the type system. The types are just used for type checking
    and to set size/alignment/representation. You're basically running an
    optimizing BCPL compiler with friendlier syntax on non-word
    aligned/sized objects at this point. Your average C programmer assumes
    this is correct.
    A - Don't do any object tracking at all. I don't know of any compilers
    that do this. It basically assumes the programmers know what they're
    doing and if they lie, their program is likely to crash (if you're
    lucky). This is what the C standard assumes is correct.
    SA - This is what most compilers do with full optimizations. Depending
    on the compiler, this is what happens:

    if it detects you lying:
    - it can warn you. This is good for tools like lint that detect broken
    code.
    - it can side with the aliasing and assume the programmer knows what
    they're doing (case A). This leads to the fastest code at the expense
    of crashing.
    - it can side with the object tracking that assumes the programmer is
    an idiot (case S). The code is slower than siding with the aliasing
    case but it works.

    if it detects no lie:
    - rely on both type and object information to generate the fastest code
    possible.

    if it can't decide if you lied or not:
    - make a conservative choice to allow for bad programmers (case S)
    - make an aggressive choice to allow for faster code (case A)

    What you're suggesting is for the compiler to do SA but be
    conservative. It's a bad idea to rely on this because you're just
    wallpapering over the problem, and as your code piles up over time
    the bug becomes far more expensive and difficult to detect and fix.
    The sad reality is that the majority of code you find makes this kind
    of mistake, because its authors are unaware of what the standard says
    and compilers in the past weren't aggressive about optimizing based
    on the type system.
     
    Me, Mar 12, 2006
    #9
