Function casting - UB?

Discussion in 'C Programming' started by nroberts, Jun 27, 2012.

  1. nroberts

    nroberts Guest

    In C++ this would be totally undefined. How about in C? It works on
    my machine with my compiler...

    If it's UB, is it one of those rules that pretty much has to work
    anyway? What I mean is things like the fact that it's UB to assign to
    one part of a union and read from another but it generally works and
    is a pretty important construct. Is this like that?

    #include <stdio.h>

    typedef void (*fun)(int, int);

    void f(int i) {
        int q = 33;
        printf("%d %d\n", i, q);
    }

    int main(void) {
        fun funptr = (fun)f;
        funptr(3,2);

        return 0;
    }

    output => 3 33
     
    nroberts, Jun 27, 2012
    #1

  2. Eric Sosman

    Eric Sosman Guest

    On 6/27/2012 11:50 AM, nroberts wrote:
    > In C++ this would be totally undefined. How about in C? It works on
    > my machine with my compiler...
    >
    > If it's UB, is it one of those rules that pretty much has to work
    > anyway? What I mean is things like the fact that it's UB to assign to
    > one part of a union and read from another but it generally works and
    > is a pretty important construct. Is this like that?
    >
    > #include <stdio.h>
    >
    > typedef void (*fun)(int, int);
    >
    > void f(int i) {
    > int q = 33;
    > printf("%d %d\n", i, q);
    > }
    >
    > int main(void) {
    > fun funptr = (fun)f;
    > funptr(3,2);
    >
    > return 0;
    > }
    >
    > output => 3 33


    Undefined behavior, because the type of the pointer expression
    `funptr' used in the call does not match the type of the called
    function `f'. (The fact that the `(fun)' cast was necessary
    should alert you to the mismatch.)

    It's likely to "work" on a good many systems. For example,
    on a system where the first four or so "sufficiently small"
    arguments are passed in a few designated registers, `main' will
    fill two registers and `f' will use only one of them, and there's
    a decent chance that the one `f' uses will be the one `main' put
    the 3 into. But it would be a bad idea to generalize from the
    apparent success of this simple case to the notion "It works!"
    Throw in a floating-point argument, or a union argument, or even
    more than MagicNumber plain int arguments, and things may well go
    blooey. Make the function struct- or union-valued, and things are
    more likely than not to go blooey.

    And then, there's wide variety in function linkage mechanisms.
    Sometimes the caller takes care of both setting up and disposing
    of the argument list, but sometimes the caller sets it up and the
    callee disposes -- and if the callee does cleanup for a one-argument
    list while the caller expected it to clean up two, it's blooey again.
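
    For illustration only, a minimal sketch of the conforming alternative. C allows a function pointer to be converted to another function pointer type and back (C11 6.3.2.3 p8), as long as the call itself goes through a pointer whose type is compatible with the function's definition. The names generic_fn, one_arg_fn, show and call_one_arg are hypothetical and do not appear in the thread.

```c
#include <stdio.h>

/* Hypothetical "generic" function pointer type, used for storage only. */
typedef void (*generic_fn)(void);

/* The pointer type that actually matches the function defined below. */
typedef int (*one_arg_fn)(int);

/* Takes one int, prints it, and returns it so callers can check it. */
int show(int i) {
    printf("%d\n", i);
    return i;
}

/* Converts the stored pointer back to its real type before calling.
 * Calling through generic_fn directly would be undefined behavior. */
int call_one_arg(generic_fn stored, int arg) {
    one_arg_fn real = (one_arg_fn)stored;
    return real(arg);
}
```

    A caller would write call_one_arg((generic_fn)show, 3): the cast is used only to store the pointer, and the call happens after converting back.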

    Does it "pretty much" have to work? That sort of depends on
    your definition of "pretty much," but my take would be "No."

    Besides: Can you come up with a good reason to want to do such
    a perverted thing in the first place? In what way does this merit
    being called "a pretty important construct?"

    --
    Eric Sosman
     
    Eric Sosman, Jun 27, 2012
    #2

  3. Tim Rentsch

    Tim Rentsch Guest

    nroberts <> writes:

    > In C++ this would be totally undefined. How about in C? It works on
    > my machine with my compiler...


    The example you show is clearly and explicitly undefined behavior.

    > If it's UB, is it one of those rules that pretty much has to work
    > anyway? What I mean is things like the fact that it's UB to assign to
    > one part of a union and read from another but it generally works and
    > is a pretty important construct. Is this like that?


    No, it's completely different, because it is both allowed to fail
    and is likely to fail on some platforms. The other example you give,
    assigning to one member of a union and reading from another, is
    actually defined behavior, _not_ undefined behavior.

    > #include <stdio.h>
    >
    > typedef void (*fun)(int, int);
    >
    > void f(int i) {
    > int q = 33;
    > printf("%d %d\n", i, q);
    > }
    >
    > int main(void) {
    > fun funptr = (fun)f;
    > funptr(3,2);
    >
    > return 0;
    > }
    >
    > output => 3 33


    The call to f() through 'funptr' is undefined behavior because
    the type of the function pointer used to call is not compatible
    with the type of the function actually being called. And that
    isn't just a theoretical problem.
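
    A minimal sketch of one way to get a two-argument pointer without the mismatch, assuming a small adapter function is acceptable. Here f is changed from the thread's version to return printf's result so the outcome can be checked, the typedef returns int for the same reason, and f_adapter and call_through_pointer are hypothetical names.

```c
#include <stdio.h>

/* Same shape as the thread's typedef, but returning int for checking. */
typedef int (*fun)(int, int);

/* One-argument function, as in the thread, except that it returns the
 * number of characters printf wrote. */
int f(int i) {
    int q = 33;
    return printf("%d %d\n", i, q);
}

/* Adapter whose type matches fun; the second argument is ignored. */
int f_adapter(int i, int ignored) {
    (void)ignored;
    return f(i);
}

/* Calls through a pointer whose type matches the pointed-to function. */
int call_through_pointer(void) {
    fun funptr = f_adapter;   /* no cast needed, the types agree */
    return funptr(3, 2);
}
```

    Because no cast is required, the compiler can check the call, which is the warning sign Eric Sosman's reply points out was missing in the original.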
     
    Tim Rentsch, Jun 27, 2012
    #3
  4. Heikki Kallasjoki

    On 2012-06-27, Eric Sosman <> wrote:
    > On 6/27/2012 11:50 AM, nroberts wrote:
    >> In C++ this would be totally undefined. How about in C? It works on
    >> my machine with my compiler...

    ....
    >> typedef void (*fun)(int, int);
    >>
    >> void f(int i) {
    >> int q = 33;
    >> printf("%d %d\n", i, q);
    >> }
    >>
    >> int main(void) {
    >> fun funptr = (fun)f;
    >> funptr(3,2);
    >>
    >> return 0;
    >> }

    ....
    > the 3 into. But it would be a bad idea to generalize from the
    > apparent success of this simple case to the notion "It works!"


    I fully agree that empirical experimentation is not terribly useful
    (and/or perhaps interesting), but FWIW, a counter-example:

    $ cat tmp.c
    #include <windows.h>

    int WINAPI f(int a) { return a; }

    int g(void) {
        int WINAPI (*p)(int, int) = (int WINAPI (*)(int, int))f;
        return p(0, 1);
    }

    int main(void) {
        return g();
    }

    $ i586-mingw32msvc-gcc -o tmp.exe tmp.c -O2 -fomit-frame-pointer
    $ wine ./tmp.exe
    wine: Unhandled page fault on read access to 0x00000001 at address 0x1
    (thread 0022), starting debugger...

    (For the "not really comp.lang.c material" details, see below.)

    > And then, there's wide variety in function linkage mechanisms.
    > Sometimes the caller takes care of both setting up and disposing
    > of the argument list, but sometimes the caller sets it up and the
    > callee disposes -- and if the callee does cleanup for a one-argument
    > list while the caller expected it to clean up two, it's blooey again.


    The above case (involving the "WINAPI"/"stdcall" calling convention) is
    an example of the latter: 'f' pops only the one argument it expects, so
    the second argument is left on the stack, and with the frame pointer
    omitted, 'g' ends up using that leftover value (1) as its return address.

    --
    Heikki Kallasjoki
     
    Heikki Kallasjoki, Jun 27, 2012
    #4
  5. Les Cargill

    Les Cargill Guest

    nroberts wrote:
    > In C++ this would be totally undefined. How about in C? It works on
    > my machine with my compiler...
    >
    > If it's UB, is it one of those rules that pretty much has to work
    > anyway? What I mean is things like the fact that it's UB to assign to
    > one part of a union and read from another but it generally works and
    > is a pretty important construct. Is this like that?
    >


    No. The example given is just wrong, period.

    > #include <stdio.h>
    >
    > typedef void (*fun)(int, int);
    >
    > void f(int i) {
    > int q = 33;
    > printf("%d %d\n", i, q);
    > }
    >
    > int main(void) {
    > fun funptr = (fun)f;
    > funptr(3,2);
    >
    > return 0;
    > }
    >
    > output => 3 33
    >



    --
    Les Cargill
     
    Les Cargill, Jun 27, 2012
    #5
  6. Les Cargill

    Les Cargill Guest

    nroberts wrote:
    > In C++ this would be totally undefined. How about in C? It works on
    > my machine with my compiler...
    >
    > If it's UB, is it one of those rules that pretty much has to work
    > anyway? What I mean is things like the fact that it's UB to assign to
    > one part of a union and read from another


    I am not sure that that is undefined behavior. That's sort of what
    unions are *for*.

    > but it generally works and
    > is a pretty important construct. Is this like that?
    >
    > #include <stdio.h>
    >
    > typedef void (*fun)(int, int);
    >
    > void f(int i) {
    > int q = 33;
    > printf("%d %d\n", i, q);
    > }
    >
    > int main(void) {
    > fun funptr = (fun)f;
    > funptr(3,2);
    >
    > return 0;
    > }
    >
    > output => 3 33
    >



    --
    Les Cargill
     
    Les Cargill, Jun 27, 2012
    #6
  7. James Kuyper

    James Kuyper Guest

    On 06/27/2012 01:59 PM, Les Cargill wrote:
    > nroberts wrote:

    ....
    >> anyway? What I mean is things like the fact that it's UB to assign to
    >> one part of a union and read from another

    >
    > I am not sure that that is undefined behavior. That's sort of what
    > unions are *for*.


    A footnote in the current version of the standard says that the result
    of reading from a different member of a union than the one last written
    is that the bit pattern stored in that memory is reinterpreted according
    to the type of the member being read; it would therefore have defined
    behavior, so long as that bit pattern is a valid one for that type.
    Some have claimed that this conclusion can be derived from the normative
    text of the standard, but I find the argument supporting that claim
    weak. There's certainly no normative text that says so directly.
    However, that is how unions were always intended to work, whether or not
    the normative text of the standard has ever actually said so.
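
    As a sketch of what the footnote describes, assuming float and uint32_t have the same size and that float uses the IEEE 754 binary32 format. uint32_t has no padding bits, so every bit pattern is a valid value for it. The name float_bits is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* The reinterpretation only makes sense if both members have the same size. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Writes the float member, then reads the same bytes back through the
 * uint32_t member. */
uint32_t float_bits(float f) {
    union { float f; uint32_t u; } pun;
    pun.f = f;
    return pun.u;
}
```

    On an IEEE 754 platform, float_bits(1.0f) yields 0x3F800000. Going the other way (writing the integer and reading the float) is where the "valid bit pattern for that type" caveat matters.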
     
    James Kuyper, Jun 27, 2012
    #7
  8. Tim Rentsch

    Tim Rentsch Guest

    "christian.bau" <> writes:

    > On Jun 27, 7:45 pm, James Kuyper <> wrote:
    >
    >> A footnote in the current version of the standard says that the result
    >> of reading from a different member of a union than the one last written
    >> is that the bit pattern stored in that memory is reinterpreted according
    >> to the type of the member being read; it would therefore have defined
    >> behavior, so long as that bit pattern is a valid one for that type.
    >> Some have claimed that this conclusion can be derived from the normative
    >> text of the standard, but I find the argument supporting that claim
    >> weak. There's certainly no normative text that says so directly.
    >> However, that is how unions were always intended to work, whether or not
    >> the normative text of the standard has ever actually said so.

    >
    > I had to check that, and you are right (footnote 95 in the N1570
    > draft). I think there is a problem. Say long and float have the same
    > size, I have a union containing a long and a float, I write to the
    > long and read the float, then I am supposed to get a float with
    > exactly those bits that I stored. That's perfectly fine.
    >
    > But what if the compiler doesn't know that both are elements of the
    > same union? If I just have a long*, and a float*, which _might_ point
    > to members of the same union, but the compiler doesn't know. Does the
    > rule apply then as well? That would completely destroy what is said in
    > other places.


    This case is different, because it is addressed by different
    portions of the effective type rules. In particular, using
    the '.' or '->' form of access, the lvalue being accessed
    has a declared type, and so those accesses never violate effective
    type rules. When access is done using pointers, the rule for
    determining effective type is different, so the two accesses
    may very well run afoul of the effective type requirements.

    > I'd prefer if this was said in the standard explicitly, but with the
    > restriction that the value must be written, then read, using the . or
    > -> operators.


    Unfortunately the Standard often expresses itself rather obliquely,
    and this case certainly falls into that category. However, it should
    be easy to see that the two different cases you bring up are covered
    under different areas of the effective type rules. See 6.5 p6.
    Note especially the first sentence, which applies in the case of
    member access (ie, through '.' or '->'), but which does not apply
    in the case of pointer access.
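
    When only a pointer to the object is available, one commonly used way to stay clear of the effective type question is to copy the bytes instead of dereferencing a converted pointer. This is a swapped-in technique that the thread itself does not propose, sketched under the assumption that float and uint32_t have the same size. The name bits_via_memcpy is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Copies the object representation of *p into a uint32_t. Writing
 * *(const uint32_t *)p instead would access a float object through a
 * uint32_t lvalue, which 6.5 p7 does not allow. */
uint32_t bits_via_memcpy(const float *p) {
    uint32_t out;
    memcpy(&out, p, sizeof out);
    return out;
}
```

    The function never forms an lvalue of the "wrong" type, so it does not depend on whether the pointed-to float happens to live inside a union.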
     
    Tim Rentsch, Jun 28, 2012
    #8
  9. Johannes Bauer

    On 27.06.2012 19:37, Tim Rentsch wrote:
    > The other example you give,
    > assigning to one member of a union and reading from another, is
    > actually defined behavior, _not_ undefined behavior.


    I also was under the impression that writing to member x and reading
    member y of a union is UB. Wikipedia says "This is not, however, a safe
    use of unions in general.", which is pretty vague (i.e. it's not clear
    which cases are safe and which are not).

    Could you elaborate on why you think this is well-defined?

    Best regards,
    Johannes

    --
    >> Where EXACTLY did you predict the quake again?

    > At least not publicly!

    Ah, the latest and to this day most brilliant stunt by our great
    cosmologists: the secret prediction.
    - Karl Kaos on Rüdiger Thomas in dsa <hidbv3$om2$>
     
    Johannes Bauer, Jun 28, 2012
    #9
  10. Johannes Bauer

    On 28.06.2012 09:23, Johannes Bauer wrote:
    > On 27.06.2012 19:37, Tim Rentsch wrote:
    >> The other example you give,
    >> assigning to one member of a union and reading from another, is
    >> actually defined behavior, _not_ undefined behavior.

    >
    > I also was under the impression that writing to member x and reading
    > member y of a union is UB. Wikipedia says "This is not, however, a safe
    > use of unions in general.", which is pretty vague (i.e. it's not clear
    > which cases are safe and which are not).
    >
    > Could you elaborate on why you think this is well-defined?


    Ah, I just read James' response further down. Interesting. Really
    thought this was undefined. Is this a recent change?

    Best regards,
    Johannes

     
    Johannes Bauer, Jun 28, 2012
    #10
  11. Jens Gustedt

    Jens Gustedt Guest

    On 28.06.2012 09:27, Johannes Bauer wrote:
    > On 28.06.2012 09:23, Johannes Bauer wrote:
    >> On 27.06.2012 19:37, Tim Rentsch wrote:
    >>> The other example you give,
    >>> assigning to one member of a union and reading from another, is
    >>> actually defined behavior, _not_ undefined behavior.

    >>
    >> I also was under the impression that writing to member x and reading
    >> member y of a union is UB. Wikipedia says "This is not, however, a safe
    >> use of unions in general.", which is pretty vague (i.e. it's not clear
    >> which cases are safe and which are not).
    >>
    >> Could you elaborate on why you think this is well-defined?

    >
    > Ah, I just read James' response further down. Interesting. Really
    > thought this was undefined. Is this a recent change?


    I think this was not considered a change in content but rather a more
    precise statement of the intent. n1256.pdf has modification marks in
    this region, so I suppose that these came with TC3. They state

    > When a value is stored in a member of an object of union type, the
    > bytes of the object representation that do not correspond to that
    > member but do correspond to other members take unspecified values.


    which in terms of the standard means that it is only UB if these
    unspecified values are "forbidden" values for that type, in particular
    trap representations.

    This means that on most modern architectures, manipulating integer
    values (except for _Bool) through unions is completely ok. Floating
    point values, _Bool, and pointer types must be treated with more care.
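
    A small sketch of the integer case, assuming 8-bit bytes so that uint32_t occupies four of them. The bytes are read through an unsigned char array member, which has no trap representations. The name byte_sum is hypothetical; the sum is used because it does not depend on byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Stores a 32-bit value in one member and reads its bytes through the
 * other. Every byte read as unsigned char is a valid value. */
unsigned byte_sum(uint32_t value) {
    union { uint32_t word; unsigned char bytes[sizeof(uint32_t)]; } view;
    unsigned sum = 0;

    view.word = value;
    for (size_t i = 0; i < sizeof view.bytes; i++) {
        sum += view.bytes[i];
    }
    return sum;
}
```

    For example, byte_sum(0x01020304) adds 1 + 2 + 3 + 4 regardless of endianness. Doing the same with a _Bool or pointer member in place of the byte array is where the extra care comes in.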

    Jens
     
    Jens Gustedt, Jun 28, 2012
    #11
  12. Tim Rentsch

    Tim Rentsch Guest

    Johannes Bauer <> writes:

    > On 27.06.2012 19:37, Tim Rentsch wrote:
    >> The other example you give,
    >> assigning to one member of a union and reading from another, is
    >> actually defined behavior, _not_ undefined behavior.

    >
    > I also was under the impression that writing to member x and reading
    > member y of a union is UB. Wikipedia says "This is not, however, a safe
    > use of unions in general.", which is pretty vague (i.e. it's not clear
    > which cases are safe and which are not).
    >
    > Could you elaborate on why you think this is well-defined?


    Besides the particular footnote (which you already mentioned in
    your own followup), there is just the normative text pertaining to
    types and storage access. If you read through the two main
    sections on types (6.2.5 and 6.2.6), and also the description of
    what happens on lvalue-to-value conversion, I think it's pretty
    easy to see that the definition is there (although I freely admit
    it isn't expressed as directly as one might like). Basically, the
    same passages that explain how ordinary (ie, non-union-member)
    access works also explain how access to union members works; the
    only thing that's missing is knowing that the respective memories
    overlap, which is stated in 6.2.5. There is another detail having
    to do with effective type rules, but that doesn't contribute to
    defining the semantics; it just needs to be checked to make sure
    the effective type rules don't _un_define the semantics (and they
    don't, but if you're interested look at 6.5 p6&7).
     
    Tim Rentsch, Jun 29, 2012
    #12
  13. Tim Rentsch

    Tim Rentsch Guest

    Jens Gustedt <> writes:

    > On 28.06.2012 09:27, Johannes Bauer wrote:
    >> On 28.06.2012 09:23, Johannes Bauer wrote:
    >>> On 27.06.2012 19:37, Tim Rentsch wrote:
    >>>> The other example you give,
    >>>> assigning to one member of a union and reading from another, is
    >>>> actually defined behavior, _not_ undefined behavior.
    >>>
    >>> I also was under the impression that writing to member x and reading
    >>> member y of a union is UB. Wikipedia says "This is not, however, a safe
    >>> use of unions in general.", which is pretty vague (i.e. it's not clear
    >>> which cases are safe and which are not).
    >>>
    >>> Could you elaborate on why you think this is well-defined?

    >>
    >> Ah, I just read James' response further down. Interesting. Really
    >> thought this was undefined. Is this a recent change?

    >
    > I think this was not considered as a change in contents but given as
    > more precision on the intent. n1256.pdf has modification marks in this
    > region so I suppose that these came with TC3. [snip]


    Yes, if you read the Defect Report that prompted the change I
    think you'll find that the intention was that the behavior
    required was supposed to be the same all along (ie, since C90 and
    presumably also before that), but changes in wording in other
    places raised a concern that this (unchanged) requirement was not
    evident enough without the footnote.
     
    Tim Rentsch, Jun 29, 2012
    #13
  14. Joshua Maurice

    On Jun 27, 5:44 pm, Tim Rentsch <> wrote:
    > "christian.bau" <> writes:
    > > On Jun 27, 7:45 pm, James Kuyper <> wrote:

    >
    > >> A footnote in the current version of the standard says that the result
    > >> of reading from a different member of a union than the one last written
    > >> is that the bit pattern stored in that memory is reinterpreted according
    > >> to the type of the member being read; it would therefore have defined
    > >> behavior, so long as that bit pattern is a valid one for that type.
    > >> Some have claimed that this conclusion can be derived from the normative
    > >> text of the standard, but I find the argument supporting that claim
    > >> weak. There's certainly no normative text that says so directly.
    > >> However, that is how unions were always intended to work, whether or not
    > >> the normative text of the standard has ever actually said so.

    >
    > > I had to check that, and you are right (footnote 95 in the N1570
    > > draft). I think there is a problem. Say long and float have the same
    > > size, I have a union containing a long and a float, I write to the
    > > long and read the float, then I am supposed to get a float with
    > > exactly those bits that I stored. That's perfectly fine.

    >
    > > But what if the compiler doesn't know that both are elements of the
    > > same union? If I just have a long*, and a float*, which _might_ point
    > > to members of the same union, but the compiler doesn't know. Does the
    > > rule apply then as well? That would completely destroy what is said in
    > > other places.

    >
    > This case is different, because it is addressed by different
    > portions of the effective type rules. In particular, using
    > the '.' or '->' form of access, the lvalue being accessed
    > has a declared type, and so those accesses never violate effective
    > type rules. When access is done using pointers, the rule for
    > determining effective type is different, so the two accesses
    > may very well run afoul of the effective type requirements.
    >
    > > I'd prefer if this was said in the standard explicitly, but with the
    > > restriction that the value must be written, then read, using the . or
    > > -> operators.

    >
    > Unfortunately the Standard often expresses itself rather obliquely,
    > and this case certainly falls into that category. However, it should
    > be easy to see that the two different cases you bring up are covered
    > under different areas of the effective type rules. See 6.5 p6.
    > Note especially the first sentence, which applies in the case of
    > member access (ie, through '.' or '->'), but which does not apply
    > in the case of pointer access.


    Sorry. Some silly questions if I may, please? Consider the following
    programs:

    int main(void)
    {
        union { int x; float y; } u;
        u.y = 2;
        u.x = 1;
        return u.x;
    }
    /* ---- */
    int main(void)
    {
        union { int x; float y; } u;
        float * y = &u.y;
        *y = 2;
        int * x = &u.x;
        *x = 1;
        return u.x;
    }
    /* ---- */
    int main(void)
    {
        union { int x; float y; } u;
        float * y = &u.y;
        int * x = &u.x;
        *y = 2;
        *x = 1;
        return u.x;
    }
    /* ---- */
    void foo(int * x, float * y)
    {
        *y = 2;
        *x = 1;
    }
    int main(void)
    {
        union { int x; float y; } u;
        float * y = &u.y;
        int * x = &u.x;
        foo(x, y);
        return u.x;
    }
    /* ---- */
    The last program above, except with foo in a different translation
    unit.

    Where exactly do you think we cross from defined behavior to undefined
    behavior? I would argue that the first example is clearly not UB, and
    the last example with foo() in a different translation unit is
    probably UB. Specifically, the intent of the effective type rules is
    to allow the compiler to do additional aliasing analysis and reorder
    reads and writes that are sufficiently differently typed. With foo()
    in a different translation unit, we want the compiler to be able to
    reorder the writes to x and y in foo() from type aliasing analysis,
    but if we do that then we'll change the semantics of the last program
    and have it return garbage.

    I don't have a strong opinion on this one. It seems that the intent of
    the type access rules and the existence of unions is an inherent
    contradiction - with several plausible ways out, of course.
     
    Joshua Maurice, Jul 8, 2012
    #14
  15. Barry Schwarz

    On Sat, 7 Jul 2012 16:46:42 -0700 (PDT), Joshua Maurice
    <> wrote:

    >On Jun 27, 5:44 pm, Tim Rentsch <> wrote:
    >> "christian.bau" <> writes:
    >> > On Jun 27, 7:45 pm, James Kuyper <> wrote:

    >>
    >> >> A footnote in the current version of the standard says that the result
    >> >> of reading from a different member of a union than the one last written
    >> >> is that the bit pattern stored in that memory is reinterpreted according
    >> >> to the type of the member being read; it would therefore have defined
    >> >> behavior, so long as that bit pattern is a valid one for that type.


    <snip>

    >Sorry. Some silly questions if I may, please? Consider the following
    >programs:
    >
    > int main(void)
    > {
    > union { int x; float y; } u;
    > u.y = 2;
    > u.x = 1;
    > return u.x;
    > }


    <snip three similar examples>

    >Where exactly do you think we cross from defined behavior to undefined
    >behavior? I would argue that the first example is clearly not UB, and


    None of your examples perform the sequence of operations under
    discussion. In every case, you store a value in one member of the
    union, store a value in a different member of the union, and then
    access the member which was last stored. Accessing the last stored
    member never yields undefined behavior.

    --
    Remove del for email
     
    Barry Schwarz, Jul 8, 2012
    #15
  16. Tim Rentsch

    Tim Rentsch Guest

    Joshua Maurice <> writes:

    > On Jun 27, 5:44 pm, Tim Rentsch <> wrote:
    >> "christian.bau" <> writes:
    >> > On Jun 27, 7:45 pm, James Kuyper <> wrote:

    >>
    >> >> A footnote in the current version of the standard says that the result
    >> >> of reading from a different member of a union than the one last written
    >> >> is that the bit pattern stored in that memory is reinterpreted according
    >> >> to the type of the member being read; it would therefore have defined
    >> >> behavior, so long as that bit pattern is a valid one for that type.
    >> >> Some have claimed that this conclusion can be derived from the normative
    >> >> text of the standard, but I find the argument supporting that claim
    >> >> weak. There's certainly no normative text that says so directly.
    >> >> However, that is how unions were always intended to work, whether or not
    >> >> the normative text of the standard has ever actually said so.

    >>
    >> > I had to check that, and you are right (footnote 95 in the N1570
    >> > draft). I think there is a problem. Say long and float have the same
    >> > size, I have a union containing a long and a float, I write to the
    >> > long and read the float, then I am supposed to get a float with
    >> > exactly those bits that I stored. That's perfectly fine.

    >>
    >> > But what if the compiler doesn't know that both are elements of the
    >> > same union? If I just have a long*, and a float*, which _might_ point
    >> > to members of the same union, but the compiler doesn't know. Does the
    >> > rule apply then as well? That would completely destroy what is said in
    >> > other places.

    >>
    >> This case is different, because it is addressed by different
    >> portions of the effective type rules. In particular, using
    >> the '.' or '->' form of access, the lvalue being accessed
    >> has a declared type, and so those accesses never violate effective
    >> type rules. When access is done using pointers, the rule for
    >> determining effective type is different, so the two accesses
    >> may very well run afoul of the effective type requirements.
    >>
    >> > I'd prefer if this was said in the standard explicitly, but with the
    >> > restriction that the value must be written, then read, using the . or
    >> > -> operators.

    >>
    >> Unfortunately the Standard often expresses itself rather obliquely,
    >> and this case certainly falls into that category. However, it should
    >> be easy to see that the two different cases you bring up are covered
    >> under different areas of the effective type rules. See 6.5 p6.
    >> Note especially the first sentence, which applies in the case of
    >> member access (ie, through '.' or '->'), but which does not apply
    >> in the case of pointer access.

    >
    > Sorry. Some silly questions if I may, please? Consider the following
    > programs:
    >
    > int main(void)
    > {
    > union { int x; float y; } u;
    > u.y = 2;
    > u.x = 1;
    > return u.x;
    > }
    > /* ---- */
    > int main(void)
    > {
    > union { int x; float y; } u;
    > float * y = &u.y;
    > *y = 2;
    > int * x = &u.x;
    > *x = 1;
    > return u.x;
    > }
    > /* ---- */
    > int main(void)
    > {
    > union { int x; float y; } u;
    > float * y = &u.y;
    > int * x = &u.x;
    > *y = 2;
    > *x = 1;
    > return u.x;
    > }
    > /* ---- */
    > void foo(int * x, float * y)
    > {
    > *y = 2;
    > *x = 1;
    > }
    > int main(void)
    > {
    > union { int x; float y; } u;
    > float * y = &u.y;
    > int * x = &u.x;
    > foo(x, y);
    > return u.x;
    > }
    > /* ---- */
    > The last program above, except with foo in a different translation
    > unit.
    >
    > Where exactly do you think we cross from defined behavior to undefined
    > behavior? I would argue that the first example is clearly not UB, and
    > the last example with foo() in a different translation unit is
    > probably UB. Specifically, the intent of the effective type rules is
    > to allow the compiler to do additional aliasing analysis and reorder
    > reads and writes that are sufficiently differently typed. With foo()
    > in a different translation unit, we want the compiler to be able to
    > reorder the writes to x and y in foo() from type aliasing analysis,
    > but if we do that then we'll change the semantics of the last program
    > and have it return garbage.
    >
    > I don't have a strong opinion on this one. It seems that the intent of
    > the type access rules and the existence of unions is an inherent
    > contradiction - with several plausible ways out, of course.


    I'm sorry, I didn't see any silly questions. Is it okay if I
    just answer what you asked? (See, there's an example of a silly
    question. :)

    If we take the effective type rules at face value, I don't think
    any of these are undefined behavior. In each case the stores that
    are done are consistent with the declared type of the member whose
    object is being stored into. Going through the different sequences
    (and I admit I haven't checked them as carefully as I might have)
    and referring to the effective type rules in each case, I don't see
    any violations. That includes the last case where the foo()
    function is defined in a different TU, although AFAIK that doesn't
    change whether effective type rules are violated.

    Of course, this is upsetting, because intuitively we expect that
    when it looks like reordering might muck things up then either the
    reordering isn't allowed (presumably due to effective type rules
    considerations) or the program has crossed over into undefined
    behavior (probably because effective type rules have been
    violated). None of the obvious alternatives seems appealing, eg,
    "no reordering can be done in cases like this" (ick), or "stores
    through the x and y pointers can be reordered, and the later access
    of u.x just gets one or the other -- ie, unspecified behavior, but
    not undefined behavior" (at odds with other parts of the Standard),
    or "even though these cases follow the letter of the law, effective
    type wise, they violate its spirit, and therefore are undefined
    behavior" (lacks evidence to be convincing). Of course, any
    sensible developer would instinctively shy away from writing such
    code, but that doesn't resolve the question.

    I have two principal takeaways to offer.

    First, how the effective type rules are phrased is somewhat broken,
    or at least incomplete. If these examples are defined behavior,
    that has serious negative consequences for code reordering. If
    they are supposed to have undefined behavior, the effective type
    rules don't express that adequately. Neither of those consequences
    is acceptable, I would say, and in either case the Standard needs
    to clarify what is meant.

    Second, as a practical matter, this kind of pattern (taking
    addresses of several members of the same union object, storing
    through the resultant pointers, then using . or -> to get the value
    of one of those members) is likely to be unspecified behavior as
    far as which store occurred last. That behavior is what I think
    most seasoned developers would expect, how most actual compilers
    will generate code, and (I opine) what the Standard would prescribe
    if a suitable way of expressing that presented itself. My feeling
    is that cases like this one _should_ be unspecified behavior, and not
    undefined behavior, but I also know that finding suitable language
    to delimit the boundaries -- clearly, correctly, and exactly --
    is not at all an easy task.
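
    To make the pattern concrete, here is a minimal sketch. The names
    (store_both, store_then_read, float_two_as_int) are made up for
    illustration, and it assumes int and float have the same size.
    Under the "unspecified" reading, store_then_read may return either
    of two values, depending on which store ends up last; nothing here
    settles what the Standard actually requires.

```c
#include <string.h>

/* Assumption of this sketch, not something the thread states. */
_Static_assert(sizeof(int) == sizeof(float),
               "sketch assumes int and float have the same size");

union int_or_float { int x; float y; };

/* Hypothetical helper: stores through two pointers of unrelated types.
   A compiler doing type-based alias analysis may assume *px and *py do
   not overlap, so it may emit the two stores in either order. */
static void store_both(int *px, float *py)
{
    *py = 2.0f;
    *px = 1;
}

/* Takes the addresses of two members of the same union object, stores
   through both pointers, then reads one member directly with '.'. */
int store_then_read(void)
{
    union int_or_float u;
    store_both(&u.x, &u.y);
    return u.x;   /* 1 if the stores happen in source order */
}

/* The int obtained by copying the bytes of 2.0f: the value u.x would
   hold if the store through py ended up last. */
int float_two_as_int(void)
{
    float f = 2.0f;
    int i;
    memcpy(&i, &f, sizeof i);
    return i;
}
```

    In source order the result is 1; the only other plausible result
    under reordering is whatever float_two_as_int() returns.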
     
    Tim Rentsch, Jul 8, 2012
    #16
  17. nroberts

    Tim Rentsch Guest

    Barry Schwarz <> writes:

    > On Sat, 7 Jul 2012 16:46:42 -0700 (PDT), Joshua Maurice
    > <> wrote:
    >
    >>On Jun 27, 5:44 pm, Tim Rentsch <> wrote:
    >>> "christian.bau" <> writes:
    >>> > On Jun 27, 7:45 pm, James Kuyper <> wrote:
    >>>
    >>> >> A footnote in the current version of the standard says that the result
    >>> >> of reading from a different member of a union than the one last written
    >>> >> is that the bit pattern stored in that memory is reinterpreted according
    >>> >> to the type of the member being read; it would therefore have defined
    >>> >> behavior, so long as that bit pattern is a valid one for that type.

    >
    > <snip>
    >
    >>Sorry. Some silly questions if I may, please? Consider the following
    >>programs:
    >>
    >> int main(void)
    >> {
    >> union { int x; float y; } u;
    >> u.y = 2;
    >> u.x = 1;
    >> return u.x;
    >> }

    >
    > <snip three similar examples>
    >
    >>Where exactly do you think we cross from defined behavior to undefined
    >>behavior? I would argue that the first example is clearly not UB, and

    >
    > None of your examples perform the sequence of operations under
    > discussion. In every case, you store a value in one member of the
    > union, store a value in a different member of the union, and then
    > access the member which was last stored. Accessing the last stored
    > member never yields undefined behavior.


    Only the first example (ie, the only one not snipped) stores into
    members. The other examples store into objects that happen to
    coincide with memory areas corresponding to members of u, but
    that's not the same as storing into members. If nothing else,
    which parts of the effective type rules govern the accesses
    are different in the two cases.
     
    Tim Rentsch, Jul 8, 2012
    #17
  18. On Sun, 08 Jul 2012 11:13:53 -0700, Tim Rentsch
    <> wrote:

    >Barry Schwarz <> writes:

    snip

    >> None of your examples perform the sequence of operations under
    >> discussion. In every case, you store a value in one member of the
    >> union, store a value in a different member of the union, and then
    >> access the member which was last stored. Accessing the last stored
    >> member never yields undefined behavior.

    >
    >Only the first example (ie, the only one not snipped) stores into
    >members. The other examples store into objects that happen to
    >coincide with memory areas corresponding to members of u, but
    >that's not the same as storing into members. If nothing else,
    >which parts of the effective type rules govern the accesses
    >are different in the two cases.


    Do I understand correctly that storing into a member and storing into
    the memory occupied by that member are somehow different?

    --
    Remove del for email
     
    Barry Schwarz, Jul 8, 2012
    #18
  19. On Jul 8, 3:24 pm, Barry Schwarz <> wrote:
    > On Sun, 08 Jul 2012 11:13:53 -0700, Tim Rentsch
    > >Only the first example (ie, the only one not snipped) stores into
    > >members.  The other examples store into objects that happen to
    > >coincide with memory areas corresponding to members of u, but
    > >that's not the same as storing into members.  If nothing else,
    > >which parts of the effective type rules govern the accesses
    > >are different in the two cases.

    >
    > Do I understand correctly that storing into a member and storing into
    > the memory occupied by that member are somehow different?


    I would hope not! (But maybe.) I agree that the current rules are
    unclear.

    I think/hope that:
    struct foo { int x; };
    int main(void)
    {
    struct foo f;
    f.x = 1;
    }
    is definitionally equivalent to:
    struct foo { int x; };
    int main(void)
    {
    struct foo f;
    int * y;
    y = &f.x;
    *y = 1;
    }
    Any decision that makes "f.x = 1;" somehow different than "y = &f.x;
    *y = 1;" is my least preferred alternative.

    I'd much rather have rules that require the compiler to limit its type
    aliasing optimizations when unions are in scope. Basically, a rule in
    the standard somewhere which says something like the following. Please
    note that I just whipped this up, and I have no clue if it's actually
    "correct". It could very probably/definitely be fixed, improved, etc.
    I'm just trying to get the ball rolling. There are closely related
    alternative formulations that would also be appealing to me.

    Quickie Definition: The "lifetime" of a pointer value is the
    contiguous interval of time of the program execution, starting when
    the pointer value is "created", and ending when the last "copy" or
    "derivation" of the pointer value ceases to exist in an object.
    Example:
    #include <stdlib.h>
    int main(void)
    {
    {
    int a[2];
    int * x;
    int * y;
    {
    x = a; /* this statement "creates" a pointer value */
    }
    /*the pointer object "x" exists, and it contains the pointer
    value, so it's still "alive" */
    y = x + 1;
    x = 0;
    /* the pointer value is still "alive" because a "derivation" of
    it exists in the pointer object "y" */
    }
    /* the pointer value is now "dead", and the pointer value lifetime
    has ended */
    }

    New Rule: For two accesses to two sufficiently differently typed
    members of a union, if:
    - the accesses are a write and a read, or two writes, to the union
    member objects or sub-objects thereof, and
    - the pointer value lifetimes of the pointer values used to do the
    accesses overlap, and
    - both accesses are done in scopes where the union definition is not
    visible, then
    - the program has undefined behavior.

    This approach, as formulated, disallows all aliasing optimization with
    the types in a union when the union definition is in scope. Perhaps there
    are "nicer" ways to do this without such a substantial penalty.
     
    Joshua Maurice, Jul 9, 2012
    #19
  20. nroberts

    Tim Rentsch Guest

    Barry Schwarz <> writes:

    > On Sun, 08 Jul 2012 11:13:53 -0700, Tim Rentsch
    > <> wrote:
    >
    >>Barry Schwarz <> writes:

    > snip
    >
    >>> None of your examples perform the sequence of operations under
    >>> discussion. In every case, you store a value in one member of the
    >>> union, store a value in a different member of the union, and then
    >>> access the member which was last stored. Accessing the last stored
    >>> member never yields undefined behavior.

    >>
    >>Only the first example (ie, the only one not snipped) stores into
    >>members. The other examples store into objects that happen to
    >>coincide with memory areas corresponding to members of u, but
    >>that's not the same as storing into members. If nothing else,
    >>which parts of the effective type rules govern the accesses
    >>are different in the two cases.

    >
    > Do I understand correctly that storing into a member and storing into
    > the memory occupied by that member are somehow different?


    They are, if for no other reason than because effective type
    rules are different for the two cases. Let's look at the
    pointer case first:

    int
    f( int *pi, float *pf ){
    *pi = 1;
    *pf = 2;
    return *pi;
    }

    If pi and pf point to the same place -- for example, to two
    members of the same union object -- this function violates
    effective type rules, and therefore transgresses into
    undefined behavior. So a call like

    union { int i; float f; } u;
    ...
    f( &u.i, &u.f );

    would provoke undefined behavior. Now consider a similar
    function that accesses the union object 'u' directly, eg,

    int
    g(){
    u.i = 1;
    u.f = 2;
    return u.i;
    }

    The function g does not violate effective type rules. Its
    behavior is defined, subject to the implementation-defined
    representations of the two types involved. That is, it
    should obey all the regular access rules, and there are no
    'shall' stipulations that it violates (at least, I'm not
    aware of any, and I've looked fairly long and hard at
    questions like this), and that is enough to define the
    behavior (again, subject to how the types are represented).
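
    As a sketch of what "defined, subject to the representations" looks
    like in practice: reading the other member directly should give the
    same result as copying the bytes with memcpy. The names (pun,
    bits_via_union, bits_via_memcpy) are invented for this example, and
    it assumes float is 32 bits wide.

```c
#include <stdint.h>
#include <string.h>

/* Assumption of this sketch: float occupies exactly 32 bits. */
_Static_assert(sizeof(float) == sizeof(uint32_t),
               "sketch assumes a 32-bit float");

union pun {
    float value;
    uint32_t bits;
};

/* Store into one member, read another, both with direct '.' access. */
uint32_t bits_via_union(float f)
{
    union pun u;
    u.value = f;
    return u.bits;   /* bytes of f reinterpreted as uint32_t */
}

/* The same reinterpretation done by copying the object representation. */
uint32_t bits_via_memcpy(float f)
{
    uint32_t b;
    memcpy(&b, &f, sizeof b);
    return b;
}
```

    On an implementation that uses IEEE 754 single precision for float,
    both functions return 0x40000000 for 2.0f; elsewhere the value
    differs, but the two functions should still agree with each other.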

    It makes sense that these two cases would be different. If
    they weren't, then everywhere there were pointers to two
    different types, those pointers might potentially point to
    members of the same union object, which would greatly inhibit
    potential code movement. Also, the "special guarantee" of
    6.5.2.3 p6 would not be needed, because the possibility of
    the two struct types belonging to the same union would (under
    the assumption that access through pointers to members and direct
    member access are the same) be enough to guarantee correct
    behavior. If that were so, there would be no reason to have
    the guarantee of 6.5.2.3 p6.
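
    For reference, the kind of code 6.5.2.3 p6 is there to bless looks
    roughly like this. The types and names (circle, rect, shape,
    shape_kind, shape_extent) are hypothetical; the point is that 'kind'
    sits in the common initial sequence of both structs and is inspected
    where the completed union type is visible.

```c
enum { KIND_CIRCLE = 1, KIND_RECT = 2 };

/* Both structs begin with 'int kind': their common initial sequence. */
struct circle { int kind; double radius; };
struct rect   { int kind; double w, h; };

union shape {
    struct circle c;
    struct rect r;
};

/* Reads 'kind' through member c even when r was the member last
   stored. This relies on the common-initial-sequence guarantee, and
   the completed union type is visible at this point. */
int shape_kind(const union shape *s)
{
    return s->c.kind;
}

/* Uses the tag to decide which member to read in full. */
double shape_extent(const union shape *s)
{
    switch (shape_kind(s)) {
    case KIND_CIRCLE: return 2.0 * s->c.radius;                  /* diameter */
    case KIND_RECT:   return s->r.w > s->r.h ? s->r.w : s->r.h;  /* longer side */
    default:          return 0.0;
    }
}
```

    A caller stores a whole struct rect or struct circle into the union
    and passes its address; only the tag is ever read through the
    "wrong" member.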

    In footnote 95 (footnote 82 in N1256), the Standard says in
    plain English what happens when one member is read when
    another has been stored into. But notice the way it says
    that:

    If the member used to read the contents of a union object
    is not the same as the member last used to store a value
    in the object, ...

    Note: 'the member /used/ to read', and 'the member last /used/ to
    store' (my emphasis). The explanation in the footnote applies only
    to member access that is done directly, ie, using '.' or '->', and
    not just dereferencing a pointer that happens to point to the
    member in question. And that distinction is consistent with the
    differences in how effective type rules treat the two situations.

    Does this help explain my earlier statement?
     
    Tim Rentsch, Jul 10, 2012
    #20