all-bits-zero pointer-to-object representation

Discussion in 'C Programming' started by Ersek, Laszlo, Apr 26, 2010.

  1. Hi,

    with reference to [0] and [1], please consider the following:

    1 #include <string.h>
    2 #include <stdlib.h>
    3
    4 struct x {
    5 double *y;
    6 };
    7
    8 int
    9 main(void)
    10 {
    11 struct x *x = malloc(sizeof *x);
    12
    13 /* suppose the allocation succeeds */
    14 (void)memset(x, 0, sizeof *x);
    15 (void)(0 == x->y);
    16 return 0;
    17 }

    In my understanding, the evaluation of x->y on line 15 is undefined
    behavior in C99.

    Consider the following (fictitious) extension:

    "The all-bits-zero object representation is valid for any
    pointer-to-object type. Any pointer-to-object object with the
    all-bits-zero representation is a null pointer of the corresponding
    pointer-to-object type."

    Would this extension make the above program well defined?

    In particular, would the following program still break aliasing rules, as
    per C99 6.5 p 6-7?

    #include <stdlib.h>

    int
    main(void)
    {
        double **d = malloc(sizeof *d);
        size_t pos;

        /* suppose the allocation succeeded */

        for (pos = 0u; pos < sizeof *d; ++pos) {
            ((char unsigned *)d)[pos] = 0u;
        }

        (void)*d;
        return 0;
    }

    (I hope my question corresponds precisely to the austin-group-l topic.)

    Thank you very much,
    lacos


    [0] https://www.opengroup.org/sophocles...tpl&source=L&listname=austin-group-l&id=13687
    [1] https://www.opengroup.org/sophocles...tpl&source=L&listname=austin-group-l&id=13690
    Ersek, Laszlo, Apr 26, 2010
    #1

  2. James Kuyper

    Ersek, Laszlo wrote:
    > Hi,
    >
    > with reference to [0] and [1], please consider the following:
    >
    > 1 #include <string.h>
    > 2 #include <stdlib.h>
    > 3
    > 4 struct x {
    > 5 double *y;
    > 6 };
    > 7
    > 8 int
    > 9 main(void)
    > 10 {
    > 11 struct x *x = malloc(sizeof *x);
    > 12
    > 13 /* suppose the allocation succeeds */
    > 14 (void)memset(x, 0, sizeof *x);
    > 15 (void)(0 == x->y);
    > 16 return 0;
    > 17 }
    >
    > In my understanding, the evaluation of x->y on line 15 is undefined
    > behavior in C99.
    >
    > Consider the following (fictitious) extension:
    >
    > "The all-bits-zero object representation is valid for any
    > pointer-to-object type. Any pointer-to-object object with the
    > all-bits-zero representation is a null pointer of the corresponding
    > pointer-to-object type."


    That depends upon what you mean by valid. The standard distinguishes
    several cases. It talks about pointer objects containing representations
    which can be dereferenced, incremented, decremented, compared for
    order, compared for equality, or simply copied as a pointer value. For
    each of those operations, the set of pointer representations valid for
    that operation is different.

    A null pointer value must compare equal to any other null pointer value,
    and it must compare unequal to any pointer to an object. A pointer is
    not valid for the purpose of dereferencing it, unless it points at an
    object. Therefore, in principle, an implementation cannot choose
    all-bits-0 to be both a null pointer and a pointer which is valid for
    the purpose of dereferencing it. However, offhand I can't come up with
    any code with defined behavior which demonstrates the non-conformance of
    such an implementation, so it might be permitted, under the "as-if" rule.
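
    As a rough sketch of the comparison guarantees just mentioned (the helper name null_pointer_properties is made up for illustration), the following uses only behavior the standard itself defines. Nothing in it depends on how a null pointer is represented.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the comparison guarantees hold. */
int null_pointer_properties(void)
{
    double obj = 0.0;
    double *a = NULL;   /* null pointer from the NULL macro */
    double *b = 0;      /* null pointer from the constant 0 */
    void *c = NULL;     /* null pointer of another pointer type */

    /* Null pointers compare equal to each other, even across types
       (a is converted to void * for the comparison with c). */
    int nulls_equal = (a == b) && (c == a);

    /* A null pointer compares unequal to a pointer to any object. */
    int differs_from_object = (a != &obj);

    return nulls_equal && differs_from_object;
}
```

    A conforming implementation should return 1 here whatever bit pattern it uses for null pointers.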

    > Would this extension make the above program well defined?


    Yes. The behavior would be well-defined, by the implementor. The
    behavior would, of course, still be undefined as far as the C standard
    is concerned, because "undefined behavior" is a specialized piece of
    jargon in the C standard. It doesn't carry the apparently obvious
    meaning of "behavior that has no definition". Instead, it means
    "behavior which is not defined by this standard" (I've paraphrased the
    actual wording, for the sake of improved clarity in this context).

    > In particular, would the following program still break aliasing rules,
    > as per C99 6.5 p 6-7?
    >
    > 1 #include <stdlib.h>
    > 2
    > 3 int
    > 4 main(void)
    > 5 {
    > 6 double **d = malloc(sizeof *d);
    > 7 size_t pos;
    > 8
    > 9 /* suppose the allocation succeeded */


    It's almost as easy to handle the possibility that the allocation
    failed, as it is to write a comment explaining that you've decided to
    ignore that possibility.
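
    A minimal sketch of that point, with the failure check included. The function name alloc_null_slot is an assumption for illustration. It stores a null pointer through a double * lvalue instead of zeroing bytes, so it does not rely on any particular representation.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate one double * slot and set it to a null
   pointer. Returns NULL if the allocation fails. */
double **alloc_null_slot(void)
{
    double **d = malloc(sizeof *d);

    if (d == NULL) {
        return NULL;    /* allocation failed: nothing to initialize */
    }
    *d = NULL;          /* store through a double * lvalue */
    return d;
}
```

    The caller checks the result once and releases the slot with free() when done.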

    > 11 for (pos = 0u; pos < sizeof *d; ++pos) {
    > 12 ((char unsigned *)d)[pos] = 0u;
    > 13 }
    > 14
    > 15 (void)*d;


    > 16 return 0;
    > 17 }
    >
    > (I hope my question corresponds precisely to the austin-group-l topic.)


    I'm not sure that it does. Aliasing is something that is inherently
    impossible for null pointers, for they do not point at an object.
    James Kuyper, Apr 26, 2010
    #2

  3. On Mon, 26 Apr 2010, James Kuyper wrote:

    > Ersek, Laszlo wrote:
    >> Hi,
    >>
    >> with reference to [0] and [1], please consider the following:
    >>
    >> 1 #include <string.h>
    >> 2 #include <stdlib.h>
    >> 3
    >> 4 struct x {
    >> 5 double *y;
    >> 6 };
    >> 7
    >> 8 int
    >> 9 main(void)
    >> 10 {
    >> 11 struct x *x = malloc(sizeof *x);
    >> 12
    >> 13 /* suppose the allocation succeeds */
    >> 14 (void)memset(x, 0, sizeof *x);
    >> 15 (void)(0 == x->y);
    >> 16 return 0;
    >> 17 }
    >>
    >> In my understanding, the evaluation of x->y on line 15 is undefined
    >> behavior in C99.
    >>
    >> Consider the following (fictitious) extension:
    >>
    >> "The all-bits-zero object representation is valid for any
    >> pointer-to-object type. Any pointer-to-object object with the
    >> all-bits-zero representation is a null pointer of the corresponding
    >> pointer-to-object type."

    >
    > That depends upon what you mean by valid.


    My apologies. I tried (and failed) to formulate "all-bits-zero implies a
    null pointer value" in standardese. Thus I meant all those uses that are
    otherwise valid for any given lvalue evaluating to a null pointer value.


    > The standard distinguishes several cases. It talks about pointer objects
    > containing representations which can be dereferenced, incremented,
    > decremented, compared for order, compared for equality, or simply copied
    > as a pointer value. For each of those operations, the set of pointer
    > representations valid for that operation is different.
    >
    > A null pointer value must compare equal to any other null pointer value,
    > and it must compare unequal to any pointer to an object. A pointer is
    > not valid for the purpose of dereferencing it, unless it points at an
    > object. Therefore, in principle, an implementation cannot choose
    > all-bits-0 to be both a null pointer and a pointer which is valid for
    > the purpose of dereferencing it. However, offhand I can't come up with
    > any code with defined behavior which demonstrates the non-conformance of
    > such an implementation, so it might be permitted, under the "as-if"
    > rule.
    >
    >> Would this extension make the above program well defined?

    >
    > Yes. The behavior would be well-defined, by the implementor. The
    > behavior would, of course, still be undefined as far as the C standard
    > is concerned, because "undefined behavior" is a specialized piece of
    > jargon in the C standard. It doesn't carry the apparently obvious
    > meaning of "behavior that has no definition". Instead, it means
    > "behavior which is not defined by this standard" (I've paraphrased the
    > actual wording, for the sake of improved clarity in this context).


    Thank you.

    As I understand it, the question is: after adding this extension to
    POSIX(R), would further extensions be necessary, so that the code above
    becomes defined? Most probably, this could be answered completely only by
    considering all other extensions introduced by POSIX. Assuming, however,
    that POSIX only defined otherwise undefined (or unspecified) behavior, and
    that it didn't redefine (or weaken/reclassify) already defined behavior, I
    think the suggestion ought to be eligible for consideration in isolation
    as well.


    >> In particular, would the following program still break aliasing rules, as
    >> per C99 6.5 p 6-7?
    >>
    >> 1 #include <stdlib.h>
    >> 2
    >> 3 int
    >> 4 main(void)
    >> 5 {
    >> 6 double **d = malloc(sizeof *d);
    >> 7 size_t pos;
    >> 8
    >> 9 /* suppose the allocation succeeded */

    >
    > It's almost as easy to handle the possibility that the allocation failed, as
    > it is to write a comment explaining that you've decided to ignore that
    > possibility.


    I agree absolutely. I had to force myself to omit the error checking and
    write a comment instead. I favor examples with complete error checking. I
    only wanted to sidestep dead-ends like "if malloc() fails, there is no
    undefined behavior, because the first substatement of your *if* statement
    is not executed then".


    >
    >> 11 for (pos = 0u; pos < sizeof *d; ++pos) {
    >> 12 ((char unsigned *)d)[pos] = 0u;
    >> 13 }
    >> 14
    >> 15 (void)*d;

    >
    >> 16 return 0;
    >> 17 }
    >>
    >> (I hope my question corresponds precisely to the austin-group-l topic.)

    >
    > I'm not sure that it does. Aliasing is something that is inherently
    > impossible for null pointers, for they do not point at an object.


    I believe it isn't about an object hypothetically aliased by some other
    (valid) pointer and a pointer with all-bits-zero representation. It is
    about the pointer object with all-bits-zero representation itself, aliased
    by differently typed pointers (pointer rvalues); like *d vs. *(char
    unsigned *)d in the above.

    I think that storing a valid (double*)0 null pointer value representation
    in the space allocated by malloc() through either ((char unsigned
    *)d)[...] or memset() doesn't force any effective type (eg. a character
    type) on the allocated object. That is, the conclusion of the second
    sentence of C99 6.5 p6 does not hold, because its premise is false.

    --o--

    I've downloaded ISO/IEC 9899:1999/Cor.2:2004(E) from
    <http://www.open-std.org/jtc1/sc22/wg14/www/docs/9899-1999_cor_2-2004.pdf>.
    Entry 9 seems to imply that

    {
        int *ip;

        ip = malloc(sizeof *ip);
        if (0 != ip) {
            (void)memset(ip, 0, sizeof *ip);
            (void)*ip;
        }
    }

    can invoke no undefined behavior, even though TC2 doesn't appear to extend
    C99 6.5 with any requirement on memset(). Reformulating the original
    question: extending C99 (with all TC's applied) with a requirement on
    pointers-to-objects, similar to TC2 entry 9, would the code at the top
    *instantly* become defined?
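
    For comparison, the integer guarantee can be sketched as below. The function name is hypothetical, and the sketch assumes the TC2 wording (as I read it, in 6.2.6.2p5) that all-bits-zero represents the value zero for any integer type. It uses declared objects on purpose, so no effective-type question arises.

```c
#include <string.h>

/* Hypothetical helper: zero the bytes of declared integer objects and
   read them back. With the TC2 wording, both reads must yield 0. */
int zeroed_integers_are_zero(void)
{
    int i = 123;
    unsigned long ul = 456ul;

    memset(&i, 0, sizeof i);
    memset(&ul, 0, sizeof ul);

    return i == 0 && ul == 0ul;
}
```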


    Thank you very much for your answer.
    lacos
    Ersek, Laszlo, Apr 26, 2010
    #3
  4. James Kuyper

    Ersek, Laszlo wrote:
    > On Mon, 26 Apr 2010, James Kuyper wrote:
    >
    >> Ersek, Laszlo wrote:

    ....
    >>> 11 for (pos = 0u; pos < sizeof *d; ++pos) {
    >>> 12 ((char unsigned *)d)[pos] = 0u;
    >>> 13 }
    >>> 14
    >>> 15 (void)*d;

    >>
    >>> 16 return 0;
    >>> 17 }
    >>>
    >>> (I hope my question corresponds precisely to the austin-group-l topic.)

    >>
    >> I'm not sure that it does. Aliasing is something that is inherently
    >> impossible for null pointers, for they do not point at an object.

    >
    > I believe it isn't about an object hypothetically aliased by some other
    > (valid) pointer and a pointer with all-bits-zero representation. It is
    > about the pointer object with all-bits-zero representation itself,
    > aliased by differently typed pointers (pointer rvalues); like *d vs.
    > *(char unsigned *)d in the above.


    The anti-aliasing rules in 6.5p7 explicitly allow for the representation
    of any object to be accessed through an lvalue of character type, in
    addition to several other possibilities.
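
    A small sketch of that character-type access, with a made-up helper name. It inspects the object representation of an arbitrary object through an unsigned char pointer, which 6.5p7 permits regardless of the object's effective type.

```c
#include <stddef.h>

/* Hypothetical helper: count the nonzero bytes in the object
   representation of the n-byte object at obj. */
size_t count_nonzero_bytes(const void *obj, size_t n)
{
    const unsigned char *bytes = obj;   /* character-type view of the object */
    size_t count = 0;
    size_t pos;

    for (pos = 0; pos < n; ++pos) {
        if (bytes[pos] != 0) {
            ++count;
        }
    }
    return count;
}
```

    Calling count_nonzero_bytes(d, sizeof *d) after the zeroing loop in the quoted program would be one way to confirm that the slot really is all-bits-zero.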

    > I think that storing a valid (double*)0 null pointer value
    > representation in the space allocated by malloc() through either ((char
    > unsigned *)d)[...] or memset() doesn't force any effective type (eg. a
    > character type) on the allocated object. That is, the conclusion of the
    > second sentence of C99 6.5 p6 does not hold, because its premise is false.


    Sorry, I didn't realize that it was 'd' itself, rather than *d, that you
    were thinking about in terms of aliasing.

    C99 describes how the effective type is determined, and I agree that
    neither the call to memset() nor writing to that memory as an array of
    char gives it an effective type, since that writing did not take the
    form of copying from an existing object. However, accessing it through
    *d gives it an effective type of 'double*'; and 6.5p6 makes no
    distinction between whether the access was for the purpose of writing
    the memory, or reading it. If that memory has all bits zero at the time
    of the access, then a POSIX-specific promise that such a representation
    represents a null pointer seems to me to be sufficient to render the
    behavior of that code defined - by POSIX, not by the C standard.

    This is really an issue that you should raise in a forum devoted to the
    POSIX standard, though I'm not sure what the appropriate one would be.
    comp.std.unix has been quiet for so long that someone is starting the
    formal process for removing it.
    James Kuyper, Apr 27, 2010
    #4
  5. On Mon, 26 Apr 2010, James Kuyper wrote:

    > Ersek, Laszlo wrote:
    >> On Mon, 26 Apr 2010, James Kuyper wrote:
    >>
    >>> Ersek, Laszlo wrote:

    > ...
    >>>> 11 for (pos = 0u; pos < sizeof *d; ++pos) {
    >>>> 12 ((char unsigned *)d)[pos] = 0u;
    >>>> 13 }
    >>>> 14
    >>>> 15 (void)*d;
    >>>
    >>>> 16 return 0;
    >>>> 17 }
    >>>>


    [snip]

    >> I think that storing a valid (double*)0 null pointer value
    >> representation in the space allocated by malloc() through either ((char
    >> unsigned *)d)[...] or memset() doesn't force any effective type (eg. a
    >> character type) on the allocated object. That is, the conclusion of the
    >> second sentence of C99 6.5 p6 does not hold, because its premise is
    >> false.


    [snip]

    > C99 describes how the effective type is determined, and I agree that
    > neither the call to memset() nor writing to that memory as an array of
    > char gives it an effective type, since that writing did not take the
    > form of copying from an existing object. However, accessing it through
    > *d gives it an effective type of 'double*'; and 6.5p6 makes no
    > distinction between whether the access was for the purpose of writing
    > the memory, or reading it. If that memory has all bits zero at the time
    > of the access, then a POSIX-specific promise that such a representation
    > represents a null pointer seems to me to be sufficient to render the
    > behavior of that code defined - by POSIX, not by the C standard.


    Thank you very much for your invaluable input. Much obliged.


    > This is really an issue that you should raise in a forum devoted to the
    > POSIX standard, though I'm not sure what the appropriate one would be.


    As I understand it, austin-group-l is *that* forum. (See [0]/Q0, and [1].)
    The issue was raised there and I thought comp.lang.c and comp.std.c
    subscribers could contribute authoritatively. Thankfully, you proved the
    hunch right.

    If you don't mind, I'll forward your message to the corresponding
    austin-group-l thread.

    Cheers,
    lacos

    [0] http://www.opengroup.org/austin/faq.html
    [1] http://www.opengroup.org/austin/lists.html
    Ersek, Laszlo, Apr 27, 2010
    #5
  6. On Tue, 27 Apr 2010, Ersek, Laszlo wrote:

    > The issue was raised there and I thought comp.lang.c and comp.std.c
    > subscribers could contribute authoritatively.


    Small fix, with apologies: the comp.std.c idea came from Vincent Lefevre.

    Thanks,
    lacos
    Ersek, Laszlo, Apr 27, 2010
    #6
  7. In comp.std.c, article <hr5dd0$8o3$-september.org>,
    James Kuyper <> wrote:

    > C99 describes how the effective type is determined, and I agree that
    > neither the call to memset() nor writing to that memory as an array of
    > char gives it an effective type, since that writing did not take the
    > form of copying from an existing object.


    There's a problem with this sentence ("since ... not" while
    the C standard uses positive causality - see below). Here's
    what I said in the austin-group mailing-list about memset()
    used on a dynamically allocated region (but again, the
    standard is not clear enough, IMHO):

    6.5#6 says:

    The effective type of an object for an access to its stored value is
    the declared type of the object, if any.75)

    Here there is no declared type (I recall the context: the memory
    was allocated dynamically). So, this doesn't apply.

    If a value is stored into an object having no declared type through
    an lvalue having a type that is not a character type, then the type
    of the lvalue becomes the effective type of the object for that
    access and for subsequent accesses that do not modify the stored
    value.

    Here the type of the lvalue is a character type, so that this doesn't
    apply. Another interpretation is that memset is its own way to store
    data (just like memcpy and memmove below); still, the above sentence
    doesn't apply here.

    If a value is copied into an object having no declared type using
    memcpy or memmove, or is copied as an array of character type, then
    the effective type of the modified object for that access and for
    subsequent accesses that do not modify the value is the effective
    type of the object from which the value is copied, if it has one.

    Here this is memset, not memcpy or memmove. I don't know what "is
    copied as an array of character type" intends to mean. Anyway,
    memset doesn't copy an object. So, this doesn't apply.

    For all other accesses to an object having no declared type, the
    effective type of the object is simply the type of the lvalue used
    for the access.

    This is an "else" case. This is how I deduce that the effective type
    is a character type.
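
    To make the uncontroversial cases concrete, here is a sketch under the reading above. The function name is hypothetical; it only exercises the declared-type rule, the store through a non-character lvalue, and the memcpy rule, and deliberately avoids memset.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 49 on success, -1 if an allocation fails. */
int effective_type_demo(void)
{
    int declared = 42;              /* declared type int is the effective type */
    int *p = malloc(sizeof *p);
    int *q = malloc(sizeof *q);
    int result = -1;

    if (p != NULL && q != NULL) {
        /* Store through a non-character lvalue: the effective type of
           the allocated object becomes int. */
        *p = 7;

        /* Copy with memcpy: the destination takes the effective type of
           the source object, which is int. */
        memcpy(q, &declared, sizeof declared);

        /* Both reads use int lvalues, matching the effective types. */
        result = *p + *q;
    }
    free(p);
    free(q);
    return result;
}
```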

    --
    Vincent Lefèvre <> - Web: <http://www.vinc17.net/>
    100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
    Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)
    Vincent Lefevre, Apr 27, 2010
    #7
  8. In comp.std.c, Ersek, Laszlo wrote:

    > On Tue, 27 Apr 2010, Ersek, Laszlo wrote:


    > > The issue was raised there and I thought comp.lang.c and comp.std.c
    > > subscribers could contribute authoritatively.


    > Small fix, with apologies: the comp.std.c idea came from Vincent Lefevre.


    Actually, my remark in the austin-group list was just about the
    effective type due to a memset() on a dynamically allocated region
    (not about the representation of a null pointer). This question is
    covered by the C standard, not by POSIX. That's why I suggested
    comp.std.c.

    Vincent Lefevre, Apr 27, 2010
    #8
  9. On Tue, 27 Apr 2010, Vincent Lefevre wrote:

    > Here's what I said in the austin-group mailing-list about memset() used
    > on a dynamically allocated region (but again, the standard is not clear
    > enough, IMHO):
    >
    > 6.5#6 says:



    (a)

    > The effective type of an object for an access to its stored value is
    > the declared type of the object, if any.75)
    >
    > Here there is no declared type (I recall the context: the memory was
    > allocated dynamically). So, this doesn't apply.



    (b)

    > If a value is stored into an object having no declared type through
    > an lvalue having a type that is not a character type, then the type
    > of the lvalue becomes the effective type of the object for that
    > access and for subsequent accesses that do not modify the stored
    > value.
    >
    > Here the type of the lvalue is a character type, so that this doesn't
    > apply. Another interpretation is that memset is its own way to store
    > data (just like memcpy and memmove below); still, the above sentence
    > doesn't apply here.



    (c)

    > If a value is copied into an object having no declared type using
    > memcpy or memmove, or is copied as an array of character type, then
    > the effective type of the modified object for that access and for
    > subsequent accesses that do not modify the value is the effective
    > type of the object from which the value is copied, if it has one.
    >
    > Here this is memset, not memcpy or memmove. I don't know what "is copied
    > as an array of character type" intends to mean. Anyway, memset doesn't
    > copy an object. So, this doesn't apply.



    (d)

    > For all other accesses to an object having no declared type, the
    > effective type of the object is simply the type of the lvalue used
    > for the access.
    >
    > This is an "else" case. This is how I deduce that the effective type is
    > a character type.


    Ah, okay, now I think I see what you mean. Sorry for being dense.

    We seem to agree that neither (a) nor (b) applies. You say that (c) doesn't
    apply either, and thus (d) -- the "else branch" -- must apply. I didn't
    understand this previously: I said (or rather, I think I said) "since none
    of a-b-c applies, the access establishes no effective type at all". This
    is probably a misinterpretation. (Euphemism for "I was wrong".)

    But what if we assume for a moment that the all-bits-zero representation
    carries a valid null pointer value for all pointer-to-object types? In
    that case, wouldn't zeroing out the individual bytes of a
    pointer-to-object object through a (char unsigned *) make (c) applicable?

    ----v----
    If a value is copied into an object having no declared type [...] as an
    array of character type, then the effective type of the modified object
    for that access and for subsequent accesses that do not modify the value
    is the effective type of the object from which the value is copied, if it
    has one.
    ----^----

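    #include <assert.h>
    #include <stddef.h>
    #include <string.h>

    static double *dp; /* suppose all-bits-zero */
    static char unsigned zeroes[sizeof dp];

    /* copy from an object whose declared type is double * */
    static void
    z1(void **dpp)
    {
        (void)memcpy(dpp, &dp, sizeof dp);
    }

    /* copy the same bit pattern, byte by byte, from a character array */
    static void
    z2(void **dpp)
    {
        size_t pos;

        assert(0 == memcmp(&dp, zeroes, sizeof zeroes));
        for (pos = 0u; pos < sizeof zeroes; ++pos) {
            ((char unsigned *)dpp)[pos] = zeroes[pos];
        }
    }

    /* establish the same bit pattern without any source object */
    static void
    z3(void **dpp)
    {
        (void)memset(dpp, 0, sizeof(double *));
    }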


    (c) applies to z1(). z2() copies the exact same bit pattern (object
    representation) from a character array to "*dpp". z3() establishes the
    exact same bit pattern (object representation) in "*dpp" without a source
    object.

    Insomuch as TC2 entry 9 has rendered z2() and z3() equivalent to z1() wrt.
    integers, without touching 6.5 at all, I think it would only be consistent
    if a similar all-bits-zero requirement on object pointers (added as an
    extension) made both z2() and z3() equivalent to z1(), wrt. object
    pointers, necessitating no change to 6.5 either.

    In my opinion, the reason why the standard doesn't explicitly include
    memset() in (c), and the char-wise storing of a pattern, is only because
    it couldn't do that without restricting the object representations
    themselves. Adding a constraint on object representation sufficed to allow
    z3() for integers. So should it for pointers-to-objects.
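
    Whether a given implementation already stores null pointers as all-bits-zero can be probed at run time. The sketch below uses a hypothetical helper name. It only inspects the one null representation the compiler produces for a double *, so a result of 1 is an observation about this implementation, not a guarantee from the standard.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if a null double * is stored as
   all-bits-zero on this implementation, 0 otherwise. */
int null_is_all_bits_zero(void)
{
    double *null_ptr = NULL;
    unsigned char zeroes[sizeof null_ptr];

    memset(zeroes, 0, sizeof zeroes);

    /* Compare the object representation of the null pointer with an
       all-zero byte sequence of the same size. */
    return memcmp(&null_ptr, zeroes, sizeof zeroes) == 0;
}
```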

    Cheers,
    lacos
    Ersek, Laszlo, Apr 27, 2010
    #9
  10. In comp.std.c, Ersek, Laszlo wrote:

    > But what if we assume for a moment that the all-bits-zero representation
    > carries a valid null pointer value for all pointer-to-object types? In
    > that case, wouldn't zeroing out the individual bytes of a
    > pointer-to-object object through a (char unsigned *) make (c) applicable?


    It depends on how this is done. You could apply (c) under "If a value
    [...] is copied as an array of character type". But note the words
    "value" and "copied". This means that there is a source in memory
    (with an effective type). While memcpy and memmove use such a source,
    memset doesn't. I'm not sure that even a "for" loop falls under this
    condition because I don't see how an implementation could recognize
    every form of such loops (in the most complicated cases).

    > ----v----
    > If a value is copied into an object having no declared type [...] as an
    > array of character type, then the effective type of the modified object
    > for that access and for subsequent accesses that do not modify the value
    > is the effective type of the object from which the value is copied, if it
    > has one.
    > ----^----


    > static double *dp; /* suppose all-bits-zero */
    > static char unsigned zeroes[sizeof dp];


    > static void
    > z1(void **dpp)
    > {
    > (void)memcpy(dpp, &dp, sizeof dp);
    > }


    > static void
    > z2(void **dpp)
    > {
    > size_t pos;


    > assert(0 == memcmp(&dp, zeroes, sizeof zeroes));
    > for (pos = 0u; pos < sizeof zeroes; ++pos) {
    > ((char unsigned *)dpp)[pos] = zeroes[pos];
    > }
    > }


    > static void
    > z3(void **dpp)
    > {
    > (void)memset(dpp, 0, sizeof(double *));
    > }



    > (c) applies to z1().


    and the effective type of the object in &dp (that is, double *)
    is used.

    > z2() copies the exact same bit pattern (object representation) from
    > a character array to "*dpp".


    But the object in zeroes has no effective type (except the individual
    unsigned char), thus no value. You first need to force an effective
    type (and a value), e.g. with

    *((double **) &zeroes) = NULL;

    Then I'm not sure that the for loop counts as a copy of such an
    object.

    > z3() establishes the exact same bit pattern (object representation)
    > in "*dpp" without a source object.


    Since there is no source, there is no effective type and no value.

    > Insomuch as TC2 entry 9 has rendered z2() and z3() equivalent to z1() wrt.
    > integers, without touching 6.5 at all,


    ??? Could you explain? I don't see such a thing on

    http://www.open-std.org/jtc1/sc22/wg14/www/docs/tc2.htm

    Vincent Lefevre, Apr 28, 2010
    #10
  11. James Kuyper

    Vincent Lefevre wrote:
    > In comp.std.c, article <hr5dd0$8o3$-september.org>,
    > James Kuyper <> wrote:
    >
    >> C99 describes how the effective type is determined, and I agree that
    >> neither the call to memset() nor writing to that memory as an array of
    >> char gives it an effective type, since that writing did not take the
    >> form of copying from an existing object.

    >
    > There's a problem with this sentence ("since ... not" while
    > the C standard uses positive causality - see below). Here's
    > what I said in the austin-group mailing-list about memset()
    > used on a dynamically allocated region (but again, the
    > standard is not clear enough, IMHO):
    >
    > 6.5#6 says:
    >
    > The effective type of an object for an access to its stored value is
    > the declared type of the object, if any.75)
    >
    > Here there is no declared type (I recall the context: the memory
    > was allocated dynamically). So, this doesn't apply.
    >
    > If a value is stored into an object having no declared type through
    > an lvalue having a type that is not a character type, then the type
    > of the lvalue becomes the effective type of the object for that
    > access and for subsequent accesses that do not modify the stored
    > value.
    >
    > Here the type of the lvalue is a character type, so that this doesn't
    > apply. Another interpretation is that memset is its own way to store
    > data (just like memcpy and memmove below); still, the above sentence
    > doesn't apply here.
    >
    > If a value is copied into an object having no declared type using
    > memcpy or memmove, or is copied as an array of character type, then
    > the effective type of the modified object for that access and for
    > subsequent accesses that do not modify the value is the effective
    > type of the object from which the value is copied, if it has one.
    >
    > Here this is memset, not memcpy or memmove. I don't know what "is
    > copied as an array of character type" intends to mean.


    It means that the fact that memcpy() causes the memory to acquire that
    effective type is not a magical feature of memcpy() or memmove(), but is
    merely a consequence of the defined behavior of those functions. It
    means that any other code that has the same defined behavior as either
    of those two functions must therefore also have the same effect, of
    establishing the effective type of that piece of memory. In particular,

    double *d = NULL;
    unsigned char *pin;
    double **dp = malloc(sizeof d);
    unsigned char *pout = (unsigned char*)dp;

    if(pout)
    {
        for(pin = (unsigned char*)&d; pin < (unsigned char*)(&d + 1); pin++)
            *pout++ = *pin;
    }

    must cause the memory pointed at by dp to acquire the effective type of
    'double *', and to contain a valid representation of a null pointer to
    double. The relevant consequence of this fact is that, for an
    implementation which provides the guarantee (not provided by the
    standard itself) that *pin has a value of 0 at each point where that
    value is copied to *pout in the above loop, then that loop must be
    replaceable by

    for(size_t i=0; i < sizeof(*dp); i++)
        pout[i] = 0;

    without any change to the resulting behavior. That's because, when such
    an implementation-defined guarantee applies, the behavior defined by the
    standard is identical for those two loops. That second loop has the same
    standard-defined behavior as a call to memset(), which must therefore
    also have that same effect.
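
    A complete, compilable sketch of the byte-wise copy described above, wrapped in a hypothetical function so that it can be tried out. It assumes the reading of 6.5p6 argued here, namely that copying as an array of character type gives the destination the effective type of the source.

```c
#include <stdlib.h>

/* Hypothetical helper: copy a null double * byte by byte into malloc'd
   storage, then read it back as a double *. Returns 1 if the result is
   a null pointer, 0 if not, and -1 if the allocation fails. */
int copy_null_bytewise(void)
{
    double *d = NULL;
    double **dp = malloc(sizeof *dp);
    const unsigned char *pin = (const unsigned char *)&d;
    unsigned char *pout = (unsigned char *)dp;
    size_t i;
    int is_null;

    if (dp == NULL) {
        return -1;
    }
    for (i = 0; i < sizeof d; ++i) {
        pout[i] = pin[i];   /* copied as an array of character type */
    }
    is_null = (*dp == NULL);
    free(dp);
    return is_null;
}
```

    Because the bytes come from a real double * object, the read of *dp does not depend on what the null representation happens to be.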

    > ... Anyway,
    > memset doesn't copy an object. So, this doesn't apply.
    >
    > For all other accesses to an object having no declared type, the
    > effective type of the object is simply the type of the lvalue used
    > for the access.
    >
    > This is an "else" case. This is how I deduce that the effective type
    > is a character type.


    The first access to *d, as a whole, in the code provided, is through a
    pointer to a pointer to double, not through a pointer to unsigned char.
    That access sets the effective type for the object as a whole to be
    'double *', even though the previous access to the individual bytes of
    the object set the effective type for those bytes to unsigned char. All
    objects can also be accessed as arrays of unsigned char, so there is no
    conflict between those two effective types.
    James Kuyper, Apr 28, 2010
    #11
  12. In comp.std.c, article <hr946q$5bh$-september.org>,
    James Kuyper <> wrote:

    > > Here this is memset, not memcpy or memmove. I don't know what "is
    > > copied as an array of character type" intends to mean.


    > It means that the fact that memcpy() causes the memory to acquire that
    > effective type is not a magical feature of memcpy() or memmove(), but is
    > merely a consequence of the defined behavior of those functions. It
    > means that any other code that has the same defined behavior as either
    > of those two functions must therefore also have the same effect, of
    > establishing the effective type of that piece of memory.


    Perhaps, but I don't see how you can deduce all of this:

    > In particular,


    > double *d = NULL;
    > unsigned char *pin;
    > double **dp = malloc(sizeof din);

    ^^^

    I think you mean d or double *.

    > unsigned char *pout = (char*)dp;

    ^^^^

    I think you mean unsigned char.

    > if(pout)
    > {
    > for(pin = &d; pin < (char*)(&d + 1); pin++)

    ^^^^
    Ditto.

    > *pout++ = *pin;
    > }


    > must cause the memory pointed at by dp to acquire the effect type of
    > 'double *',


    Yes, because an object of type double * is copied.

    However, how far does this go? What if the char's are set in
    some arbitrary order, possibly with other statements between
    the stores?

    > and to contain a valid representation of a null pointer to double.
    > The relevant consequence of this fact is that, for an implementation
    > which provides the guarantee (not provided by the standard itself)
    > that *pin has a value of 0 at each point where that value is copied
    > to *pout in the above loop, then that loop must be replaceable by


    > for(size_t i=0; i < sizeof(**dout); i++)
    > pout = 0;


    > without any change to the resulting behavior.


    The behavior doesn't change at *this* point, but while you could say
    that the effective type in the former case was double *, there's no
    source of double * here, so that the effective type of the allocated
    memory cannot be double *, and...

    > That's because, when such a implementation-defined guarantee
    > applies, the behavior defined by the standard is identical for those
    > two loops. That second loop has the same standard-defined behavior
    > as a call to memset(), which must therefore also have that same
    > effect.


    ditto for memset.

    > > ... Anyway,
    > > memset doesn't copy an object. So, this doesn't apply.
    > >
    > > For all other accesses to an object having no declared type, the
    > > effective type of the object is simply the type of the lvalue used
    > > for the access.
    > >
    > > This is an "else" case. This is how I deduce that the effective type
    > > is a character type.


    > The first access to *d, as a whole, in the code provided, is through a
    > pointer to a pointer to double, not through a pointer to unsigned char.


    We are talking about dynamically allocated memory; *d is not such
    memory.

    > That access sets the effective type for the object as a whole to be
    > 'double *', even though the previous access to the individual bytes of
    > the object set the effective type for those bytes to unsigned char. All
    > objects can also be accessed as arrays of unsigned char, so there is no
    > conflict between those two effective types.


    This is off-topic: your code does not use memset.

    --
    Vincent Lefèvre <> - Web: <http://www.vinc17.net/>
    100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
    Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)
    Vincent Lefevre, Apr 28, 2010
    #12
  13. Ersek, Laszlo

    James Kuyper Guest

    Vincent Lefevre wrote:
    > In comp.std.c, article <hr946q$5bh$-september.org>,
    > James Kuyper <> wrote:
    >
    >>> Here this is memset, not memcpy or memmove. I don't know what "is
    >>> copied as an array of character type" intends to mean.

    >
    >> It means that the fact that memcpy() causes the memory to acquire that
    >> effective type is not a magical feature of memcpy() or memmove(), but is
    >> merely a consequence of the defined behavior of those functions. It
    >> means that any other code that has the same defined behavior as either
    >> of those two functions must therefore also have the same effect, of
    >> establishing the effective type of that piece of memory.

    >
    > Perhaps, but I don't see how you can deduce all of this:


    I don't know how to explain that deduction to you; to me, it all seems
    clearly implied by the phrase "copied as an array of character type".

    >> In particular,

    >
    >> double *d = NULL;
    >> unsigned char *pin;
    >> double **dp = malloc(sizeof din);

    > ^^^
    >
    > I think you mean d or double *.
    >
    >> unsigned char *pout = (char*)dp;

    > ^^^^
    >
    > I think you mean unsigned char.
    >
    >> if(pout)
    >> {
    >> for(pin = &d; pin < (char*)(&d + 1); pin++)

    > ^^^^
    > Ditto.


    Yes - that single post contained far too many typos. Perhaps I should go
    back to bed for a while before going to work.

    >> *pout++ = *pin;
    >> }

    >
    >> must cause the memory pointed at by dp to acquire the effect type of
    >> 'double *',

    >
    > Yes, because an object of type double * is copied.
    >
    > However, how far does this go? What if the char's are set in
    > some arbitrary order, possibly with other statements between
    > the stores?


    The order in which the chars are copied does not matter, because the
    standard does not specify that order for memcpy (and, in fact, the
    typical implementation of memmove() will sometimes copy the bytes in
    reverse order). The only way that other statements, between the stores,
    can be relevant, is if they change either the object being copied, or
    the object being copied to. All other statements are irrelevant to the
    equivalence with memcpy().
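
    As an illustration of the ordering point, here is a sketch of a
    byte-wise copy that runs from the last byte to the first. The helper
    names are hypothetical; the only claim is that the destination ends up
    with the same bytes memcpy() would have produced.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy n bytes from src to dst, last byte first. */
void copy_bytes_reversed(void *dst, const void *src, size_t n)
{
    unsigned char *out = dst;
    const unsigned char *in = src;

    while (n > 0) {
        n--;
        out[n] = in[n];
    }
}

/* Returns 1 if a reversed-order copy of a pointer object produces the
 * same bytes as memcpy() and the copied pointer compares equal to the
 * original. */
int reversed_copy_matches_memcpy(void)
{
    double value = 1.5;
    double *src = &value;
    double *via_loop;
    double *via_memcpy;

    copy_bytes_reversed(&via_loop, &src, sizeof src);
    memcpy(&via_memcpy, &src, sizeof src);

    return memcmp(&via_loop, &via_memcpy, sizeof src) == 0
        && via_loop == &value;
}
```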

    >> and to contain a valid representation of a null pointer to double.
    >> The relevant consequence of this fact is that, for an implementation
    >> which provides the guarantee (not provided by the standard itself)
    >> that *pin has a value of 0 at each point where that value is copied
    >> to *pout in the above loop, then that loop must be replaceable by

    >
    >> for(size_t i=0; i < sizeof(**dout); i++)
    >> pout = 0;

    >
    >> without any change to the resulting behavior.

    >
    > The behavior doesn't change at *this* point,


    But the difference between the two versions is complete at this point.
    If the behavior has not yet differed, it has no license to change from
    this point onward.

    ....
    >>> ... Anyway,
    >>> memset doesn't copy an object. So, this doesn't apply.
    >>>
    >>> For all other accesses to an object having no declared type, the
    >>> effective type of the object is simply the type of the lvalue used
    >>> for the access.
    >>>
    >>> This is an "else" case. This is how I deduce that the effective type
    >>> is a character type.

    >
    >> The first access to *d, as a whole, in the code provided, is through a
    >> pointer to a pointer to double, not through a pointer to unsigned char.

    >
    > We are talking about dynamically allocated memory. *d is not in this
    > case.


    Sorry - another typo. That should have been *dp.

    >> That access sets the effective type for the object as a whole to be
    >> 'double *', even though the previous access to the individual bytes of
    >> the object set the effective type for those bytes to unsigned char. All
    >> objects can also be accessed as arrays of unsigned char, so there is no
    >> conflict between those two effective types.

    >
    > This is off-topic: your code does not use memset.


    memset() accesses objects as arrays of unsigned char. My example
    contains code equivalent to memset(), and cites that equivalence as
    justification for the usability of memset() to achieve the desired
    effect - how is that off-topic? You might believe the connection to be
    incorrect, but I don't see how it's off-topic.
    James Kuyper, Apr 28, 2010
    #13
  14. In comp.std.c, article <hr9a6k$a37$-september.org>,
    James Kuyper <> wrote:

    > Vincent Lefevre wrote:
    > > In comp.std.c, article <hr946q$5bh$-september.org>,
    > > James Kuyper <> wrote:
    > >
    > >>> Here this is memset, not memcpy or memmove. I don't know what "is
    > >>> copied as an array of character type" intends to mean.

    > >
    > >> It means that the fact that memcpy() causes the memory to acquire that
    > >> effective type is not a magical feature of memcpy() or memmove(), but is
    > >> merely a consequence of the defined behavior of those functions. It
    > >> means that any other code that has the same defined behavior as either
    > >> of those two functions must therefore also have the same effect, of
    > >> establishing the effective type of that piece of memory.

    > >
    > > Perhaps, but I don't see how you can deduce all of this:


    > I don't know how too explain that deduction to you; to me, it all seems
    > clearly implied by the phrase "copied as an array of character type".


    Clearly not. There's no "double *" in "copied as an array of character
    type". I don't see why a memset would set the effective type to
    "double *". And why "double *" and not something else?

    [...]
    > >> and to contain a valid representation of a null pointer to double.
    > >> The relevant consequence of this fact is that, for an implementation
    > >> which provides the guarantee (not provided by the standard itself)
    > >> that *pin has a value of 0 at each point where that value is copied
    > >> to *pout in the above loop, then that loop must be replaceable by

    > >
    > >> for(size_t i=0; i < sizeof(**dout); i++)
    > >> pout = 0;

    > >
    > >> without any change to the resulting behavior.

    > >
    > > The behavior doesn't change at *this* point,


    > But the difference between the two versions is complete at this point.
    > If the behavior has not yet differed, it has no license to change from
    > this point onward.


    Here's a counter-example. Assume unsigned long and void * have
    the same size 8 (no padding bits), and that the null pointer
    is represented by a sequence of null bytes, and consider the
    following two cases:

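    /* case 1: store and read through the same type */
    void foo (void)
    {
        void *p = malloc(8);   /* suppose the allocation succeeds */
        *(unsigned long *)p = 0;
        printf ("%lu\n", *(unsigned long *)p);
    }

    /* case 2: store as void *, read as unsigned long */
    void foo (void)
    {
        void *p = malloc(8);   /* suppose the allocation succeeds */
        *(void **)p = 0;
        printf ("%lu\n", *(unsigned long *)p);
    }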

    Until the "... = 0;", the behavior has not changed. However, though
    the following line is the same in both cases, the first case has
    well-specified behavior, while the second case has undefined behavior
    (since it breaks the aliasing rules).
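
    If the goal is only to look at the bytes of the stored pointer, one
    route that stays clear of the aliasing question is to inspect them
    through unsigned char, which 6.5p7 allows for any object. A minimal
    sketch, with a hypothetical function name, and with no assumption about
    the size of a pointer or about what the bytes turn out to be:

```c
#include <stdlib.h>

/* Returns 1 if the stored null pointer consists only of zero bytes,
 * 0 if it does not, -1 if allocation fails. */
int stored_null_is_all_zero_bytes(void)
{
    void *p = malloc(sizeof(void *));
    const unsigned char *bytes = p;
    int all_zero = 1;

    if (p == NULL)
        return -1;

    *(void **)p = NULL;   /* store through an lvalue of type void * */

    /* Reading the same memory through unsigned char is always permitted. */
    for (size_t i = 0; i < sizeof(void *); i++) {
        if (bytes[i] != 0)
            all_zero = 0;
    }
    free(p);
    return all_zero;
}
```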

    [...]
    > > This is off-topic: your code does not use memset.


    > memset() accesses objects as arrays of unsigned char. My example
    > contains code equivalent to memset(), and cites that equivalence as
    > justification for the usability of memset() to achieve the desired
    > effect - how is that off-topic? You might believe the connection to
    > be incorrect, but I don't see how it's off-topic.


    Your code doesn't use memset. Please show code that uses memset
    (without typos). And reasoning based on "the behavior has not yet
    differed" is flawed, as shown above.

    --
    Vincent Lefèvre <> - Web: <http://www.vinc17.net/>
    100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
    Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)
    Vincent Lefevre, Apr 28, 2010
    #14
  15. Ersek, Laszlo

    James Kuyper Guest

    Vincent Lefevre wrote:
    > In comp.std.c, article <hr9a6k$a37$-september.org>,
    > James Kuyper <> wrote:
    >
    >> Vincent Lefevre wrote:
    >>> In comp.std.c, article <hr946q$5bh$-september.org>,
    >>> James Kuyper <> wrote:
    >>>
    >>>>> Here this is memset, not memcpy or memmove. I don't know what "is
    >>>>> copied as an array of character type" intends to mean.
    >>>> It means that the fact that memcpy() causes the memory to acquire that
    >>>> effective type is not a magical feature of memcpy() or memmove(), but is
    >>>> merely a consequence of the defined behavior of those functions. It
    >>>> means that any other code that has the same defined behavior as either
    >>>> of those two functions must therefore also have the same effect, of
    >>>> establishing the effective type of that piece of memory.
    >>> Perhaps, but I don't see how you can deduce all of this:

    >
    >> I don't know how too explain that deduction to you; to me, it all seems
    >> clearly implied by the phrase "copied as an array of character type".

    >
    > Clearly not. There's no "double *" in "copied as an array of character
    > type".


    "is copied as an array of character type" is a fragment of a much
    longer, complicated statement. That complete statement talks about the
    case when "a value is copied into an object", and following sentences
    refer to "the object from which the value is copied" - all without
    constraining the type of either object, or of the value copied, in any
    way. An object of type 'double *' clearly qualifies as "an object",
    despite the fact that neither 'double' nor '*' appear anywhere in that
    entire paragraph.

    > ... I don't see why a memset would set the effective type to
    > "double *". And why "double *" and not something else?


    The relevant issue is the fact that it has a known representation.
    If you access memory with no declared type through lvalues of character
    type in order to create something known to be a valid representation of
    an object of a given type, and the first access to that memory is
    through an lvalue of that type, that access gives that memory its
    effective type, and will not, in itself, run afoul of any requirement
    specified by the C standard.

    There are only a few types for which the standard guarantees that any
    particular representation is valid, but any one of those types could have
    been used in this example.

    Representations of most types are unspecified by the standard; there's
    no requirement that they be documented by the implementation. However,
    most implementations do document the representations of many types, and
    code that is not intended to be portable can use this approach for any
    of those types; there's nothing specific to double* about this.

    >>>> and to contain a valid representation of a null pointer to double.
    >>>> The relevant consequence of this fact is that, for an implementation
    >>>> which provides the guarantee (not provided by the standard itself)
    >>>> that *pin has a value of 0 at each point where that value is copied
    >>>> to *pout in the above loop, then that loop must be replaceable by
    >>>> for(size_t i=0; i < sizeof(**dout); i++)
    >>>> pout = 0;
    >>>> without any change to the resulting behavior.
    >>> The behavior doesn't change at *this* point,

    >
    >> But the difference between the two versions is complete at this point.
    >> If the behavior has not yet differed, it has no license to change from
    >> this point onward.

    >
    > Here's a counter-example. Assume unsigned long and void * have
    > the same size 8 (no padding bits), and that the null pointer
    > is represented by a sequence of null bytes, and consider the
    > following two cases:
    >
    > void foo (void)
    > {
    > void *p = malloc(8);
    > *(unsigned long *)p = 0;
    > printf ("%ld\n", *(unsigned long *)p);
    > }
    >
    > void foo (void)
    > {
    > void *p = malloc(8);
    > *(void **)p = 0;
    > printf ("%ld\n", *(unsigned long *)p);
    > }


    Neither unsigned long nor void* is a character type; the license to do
    something like this is restricted to accessing the objects as arrays of
    character type. As soon as you accessed the memory pointed at by p
    through an lvalue of type void*, it acquired that as its effective
    type. Because of the anti-aliasing rules, a conforming compiler is not
    required to consider the possibility that *(void**)p and *(unsigned
    long*)p refer to the same location in memory (even though, in this case,
    that "possibility" can trivially be determined to be a certainty). It
    could therefore have evaluated *(unsigned long*)p prior to execution of
    the assignment expression, and used that saved result as an argument for
    the printf() call. In a different, more complicated example, this could
    actually be a reasonable optimization.

    Such an optimization would not be permitted when a character type is
    involved, because the anti-aliasing rules give special status to those
    types.
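
    The contrast can be sketched with two small functions. The names are
    hypothetical, and the comments describe what a compiler is permitted to
    assume, not what any particular compiler actually does.

```c
#include <stddef.h>

/* The two parameters point to incompatible, non-character types, so a
 * compiler may assume they do not alias and return 1 without reloading
 * *a. Calling this with both pointers aimed at the same memory would be
 * undefined behavior. */
unsigned long store_both(unsigned long *a, void **b)
{
    *a = 1UL;
    *b = NULL;
    return *a;
}

/* Here the later stores go through unsigned char, which may alias any
 * object, so *a has to be read again after the loop. */
unsigned long store_then_clear_bytes(unsigned long *a, unsigned char *b,
                                     size_t n)
{
    *a = 1UL;
    for (size_t i = 0; i < n; i++)
        b[i] = 0;
    return *a;
}
```

    Passing the same object to both parameters of store_then_clear_bytes()
    is fine, and the function then returns 0, because all-bits-zero is a
    representation of zero for every integer type.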

    >>> This is off-topic: your code does not use memset.

    >
    >> memset() accesses objects as arrays of unsigned char. My example
    >> contains code equivalent to memset(), and cites that equivalence as
    >> justification for the usability of memset() to achieve the desired
    >> effect - how is that off-topic? You might believe the connection to
    >> be incorrect, but I don't see how it's off-topic.

    >
    > Your code doesn't use memset. Please show a code that uses memset
    > (without typos).


    I can't guarantee "no typos", but I'll do my best to avoid them. Where I
    wrote that the second loop was equivalent to a call to memset(), please
    consider that statement to have been replaced by an otherwise identical
    statement that specifies the precise syntax of the equivalent call:

    memset(dp, 0, sizeof *dp);

    I didn't think I'd have to spell out details like that, and I don't
    understand why you think the absence of those details is a problem.
    James Kuyper, Apr 28, 2010
    #15
  16. On Wed, 28 Apr 2010, Vincent Lefevre wrote:

    > In comp.std.c, article
    > <>, Ersek,
    > Laszlo <> wrote:


    >> static double *dp; /* suppose all-bits-zero */
    >> static char unsigned zeroes[sizeof dp];

    >
    >> static void
    >> z1(void **dpp)
    >> {
    >> (void)memcpy(dpp, &dp, sizeof dp);
    >> }

    >
    >> static void
    >> z2(void **dpp)
    >> {
    >> size_t pos;

    >
    >> assert(0 == memcmp(&dp, zeroes, sizeof zeroes));
    >> for (pos = 0u; pos < sizeof zeroes; ++pos) {
    >> ((char unsigned *)dpp)[pos] = zeroes[pos];
    >> }
    >> }

    >
    >> static void
    >> z3(void **dpp)
    >> {
    >> (void)memset(dpp, 0, sizeof(double *));
    >> }


    >> Insomuch as TC2 entry 9 has rendered z2() and z3() equivalent to z1() wrt.
    >> integers, without touching 6.5 at all,


    > ??? Could you explain? I don't see such a thing on
    >
    > http://www.open-std.org/jtc1/sc22/wg14/www/docs/tc2.htm


    That's TC2 to C90 (ISO/IEC 9899:1990). I meant TC2 to C99 (ISO/IEC
    9899:1999):

    http://www.open-std.org/jtc1/sc22/wg14/www/docs/9899-1999_cor_2-2004.pdf

    Cheers,
    lacos
    Ersek, Laszlo, Apr 28, 2010
    #16
  17. In comp.std.c, article <>,
    Ersek, Laszlo <> wrote:

    > On Wed, 28 Apr 2010, Vincent Lefevre wrote:


    > > In comp.std.c, article
    > > <>, Ersek,
    > > Laszlo <> wrote:


    > >> static double *dp; /* suppose all-bits-zero */
    > >> static char unsigned zeroes[sizeof dp];

    > >
    > >> static void
    > >> z1(void **dpp)
    > >> {
    > >> (void)memcpy(dpp, &dp, sizeof dp);
    > >> }

    > >
    > >> static void
    > >> z2(void **dpp)
    > >> {
    > >> size_t pos;

    > >
    > >> assert(0 == memcmp(&dp, zeroes, sizeof zeroes));
    > >> for (pos = 0u; pos < sizeof zeroes; ++pos) {
    > >> ((char unsigned *)dpp)[pos] = zeroes[pos];
    > >> }
    > >> }

    > >
    > >> static void
    > >> z3(void **dpp)
    > >> {
    > >> (void)memset(dpp, 0, sizeof(double *));
    > >> }


    > >> Insomuch as TC2 entry 9 has rendered z2() and z3() equivalent to z1() wrt.
    > >> integers, without touching 6.5 at all,


    > > ??? Could you explain? I don't see such a thing on
    > >
    > > http://www.open-std.org/jtc1/sc22/wg14/www/docs/tc2.htm


    > That's TC2 to C90 (ISO/IEC 9899:1990).


    OK, I didn't notice that.

    > I meant TC2 to C99 (ISO/IEC 9899:1999):


    > http://www.open-std.org/jtc1/sc22/wg14/www/docs/9899-1999_cor_2-2004.pdf


    This entry 9 is about the representation of integers, not about
    aliasing rules. See 6.5 p6 and p7.

    --
    Vincent Lefèvre <> - Web: <http://www.vinc17.net/>
    100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
    Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)
    Vincent Lefevre, Apr 29, 2010
    #17
  18. In comp.std.c, article <hr9fb0$c91$-september.org>,
    James Kuyper <> wrote:

    > Vincent Lefevre wrote:
    > > In comp.std.c, article <hr9a6k$a37$-september.org>,
    > > James Kuyper <> wrote:
    > >
    > >> Vincent Lefevre wrote:
    > >>> In comp.std.c, article <hr946q$5bh$-september.org>,
    > >>> James Kuyper <> wrote:
    > >>>
    > >>>>> Here this is memset, not memcpy or memmove. I don't know what "is
    > >>>>> copied as an array of character type" intends to mean.
    > >>>> It means that the fact that memcpy() causes the memory to acquire that
    > >>>> effective type is not a magical feature of memcpy() or memmove(), but is
    > >>>> merely a consequence of the defined behavior of those functions. It
    > >>>> means that any other code that has the same defined behavior as either
    > >>>> of those two functions must therefore also have the same effect, of
    > >>>> establishing the effective type of that piece of memory.
    > >>> Perhaps, but I don't see how you can deduce all of this:

    > >
    > >> I don't know how too explain that deduction to you; to me, it all seems
    > >> clearly implied by the phrase "copied as an array of character type".

    > >
    > > Clearly not. There's no "double *" in "copied as an array of character
    > > type".


    > "is copied as an array of character type" is a fragment of a much
    > longer, complicated statement.


    Yes, but you only said "copied as an array of character type" a few
    lines above. I would be interested in the *whole* reasoning.

    > That complete statement talks about the case when "a value is copied
    > into an object", and following sentences refer to "the object from
    > which the value is copied"


    The point with memset() is that no objects are involved, or possibly
    only an array of unsigned char.

    > - all without constraining the type of either object, or of the
    > value copied, in any ways. An object of type 'double *'. clearly
    > qualifies as "an object", despite the fact that neither 'double' nor
    > '*' appear anywhere in that entire paragraph.


    I recall the code (based on what was posted in the austin-group list):

    double **dp = malloc(sizeof(double *));
    memset (dp, 0, sizeof(double *));

    There's a type double *, but no object of this type here. The question
    was: what is the effective type of the object stored at dp just after
    the memset()?

    > > ... I don't see why a memset would set the effective type to
    > > "double *". And why "double *" and not something else?


    > The relevant issue is the the fact that it has a known representation.
    > If you access memory with no declared type through lvalues of character
    > type in order to create something known to be a valid representation of
    > an object of a given type, and the first access to that memory is
    > through an lvalue of that type, that access gives that memory it's
    > effective type, and will not, in itself, run afoul of any requirement
    > specified by the C standard.


    Where does the C standard say that? Please, quote it!

    > >> But the difference between the two versions is complete at this point.
    > >> If the behavior has not yet differed, it has no license to change from
    > >> this point onward.

    > >
    > > Here's a counter-example. Assume unsigned long and void * have
    > > the same size 8 (no padding bits), and that the null pointer
    > > is represented by a sequence of null bytes, and consider the
    > > following two cases:
    > >
    > > void foo (void)
    > > {
    > > void *p = malloc(8);
    > > *(unsigned long *)p = 0;
    > > printf ("%ld\n", *(unsigned long *)p);
    > > }
    > >
    > > void foo (void)
    > > {
    > > void *p = malloc(8);
    > > *(void **)p = 0;
    > > printf ("%ld\n", *(unsigned long *)p);
    > > }


    > Neither unsigned long nor void** are character types;


    You never said that this was a requirement. Again, I want your full
    reasoning. But I think our disagreement is on your paragraph above,
    for which I am asking for an explanation.

    --
    Vincent Lefèvre <> - Web: <http://www.vinc17.net/>
    100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
    Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)
    Vincent Lefevre, Apr 29, 2010
    #18
  19. On Thu, 29 Apr 2010, Vincent Lefevre wrote:

    > I recall the code (based on what was posted in the austin-group list):
    >
    > double **dp = malloc(sizeof(double *));
    > memset (dp, 0, sizeof(double *));
    >
    > There's a type double *, but no object of this type here. The question
    > was: what is the effective type of the object stored at dp just after
    > the memset()?


    I would say either "none", which would be okay for the purpose in
    question, or perhaps even "double *", if the memset() established a valid
    value representation, which would be again okay. IIRC your opinion is
    "char".

    I believe the *intent* of the standard justifies my opinion (at least the
    "none" option), but I'm sort of forced to agree that the strict *wording*
    of the standard justifies yours.

    I really have no more arguments in this discussion. I was under the vague
    impression that many clc contributors share the opinion that TC2 entry 9
    rendered memset()-to-\0 defined for zeroing out integers.
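
    For the integer case, the sentence that TC2 added to 6.2.6.2 says that,
    for any integer type, the object representation where all the bits are
    zero is a representation of the value zero in that type. A minimal
    sketch of what that buys, using an array with a declared type so that
    the effective-type question does not arise (the function name is
    hypothetical):

```c
#include <string.h>

/* Returns 1 if every element of a memset()-zeroed array of long reads
 * back as 0, and 0 otherwise. */
int zeroed_longs_read_zero(void)
{
    long values[4] = { 1L, -2L, 3L, -4L };

    memset(values, 0, sizeof values);
    for (size_t i = 0; i < sizeof values / sizeof values[0]; i++) {
        if (values[i] != 0L)
            return 0;
    }
    return 1;
}
```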

    Cheers,
    lacos
    Ersek, Laszlo, Apr 29, 2010
    #19
  20. Ersek, Laszlo

    James Kuyper Guest

    Vincent Lefevre wrote:
    > In comp.std.c, article <hr9fb0$c91$-september.org>,
    > James Kuyper <> wrote:
    >
    >> Vincent Lefevre wrote:

    ....
    >> That complete statement talks about the case when "a value is copied
    >> into an object", and following sentences refer to "the object from
    >> which the value is copied"

    ....
    > The point with memset() is that no objects are involved, or possibly
    > only an array of unsigned char.


    Which means that it does not establish an effective type for the memory
    affected by the memset() call. However, that memory doesn't need to have
    an effective type at this point; it will acquire an effective type at
    the point where the value of that object is read. The key points are that:

    a) The anti-aliasing rules allow any object to be accessed and (if
    modifiable) to be modified as if it were an array of unsigned char.
    Ensuring that the resulting object contains a valid representation for
    its type is, in general, tricky - unless that representation is
    obtained by copying from another object of that same type already
    containing a valid value. But we're not addressing the general case;
    we're addressing a specific case where a valid representation is known.

    b) An unsigned char with a value of 0 has all-bits-0. This is relevant,
    because the behavior of memset() is defined in terms of unsigned char.

    c) For the specific implementation under discussion, all-bits-0 is also
    a valid representation of a null pointer.

    That's all that's needed to ensure that, when this block of memory
    acquires an effective type by being accessed through an lvalue of type
    double*, it will contain a valid representation of a null pointer to
    double, and therefore can safely be read.

    ....
    > I recall the code (based on what was posted in the austin-group list):
    >
    > double **dp = malloc(sizeof(double *));
    > memset (dp, 0, sizeof(double *));
    >
    > There's a type double *, but no object of this type here. The question
    > was: what is the effective type of the object stored at dp just after
    > the memset()?


    It has none. It acquires an effective type only when *dp is evaluated.
    However, by that time, on such an implementation, it contains a valid
    representation of a null pointer.
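
    A sketch of that sequence, with the implementation-specific assumption
    turned into a run-time check: *dp is read only if a null 'double *' on
    this implementation really is all-bits-zero. The function name and the
    return convention are hypothetical, and the sketch follows the reading
    argued for here, which is the point under dispute in this thread.

```c
#include <stdlib.h>
#include <string.h>

/* Returns 1 if the zeroed pointer reads back as null, 0 if it does not,
 * and -1 if allocation fails or this implementation's null 'double *'
 * is not all-bits-zero (in which case *dp is never read). */
int zeroed_pointer_reads_null(void)
{
    static double *const null_ptr = NULL;
    static const unsigned char zeroes[sizeof(double *)];  /* all zero */
    double **dp;
    int result;

    /* Assumption check: is all-bits-zero the representation of null here? */
    if (memcmp(&null_ptr, zeroes, sizeof zeroes) != 0)
        return -1;

    dp = malloc(sizeof *dp);
    if (dp == NULL)
        return -1;

    memset(dp, 0, sizeof *dp);
    result = (*dp == NULL);   /* first access through an lvalue of type double * */
    free(dp);
    return result;
}
```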

    >>> ... I don't see why a memset would set the effective type to
    >>> "double *". And why "double *" and not something else?


    Because dp has the type double**. If it had the type unsigned long*, and
    sizeof(double*) were replaced with (or happened to be identical to)
    sizeof *dp (which is a good idea, anyway), then the exact same code
    would cause *dp to have an effective type of unsigned long. In either
    case, it would not acquire that effective type until the *dp was
    actually evaluated.
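
    The unsigned long variant can be sketched as follows, using sizeof *up
    so that the size follows the pointer's type. The function name and the
    error convention are hypothetical. Reading zero relies on all-bits-zero
    being a representation of zero for integer types, and the aliasing
    argument is the same one made above for double *.

```c
#include <stdlib.h>
#include <string.h>

/* Returns the value read from zeroed allocated memory, or 1UL if the
 * allocation fails (an arbitrary error convention for this sketch). */
unsigned long zeroed_ulong(void)
{
    unsigned long *up = malloc(sizeof *up);
    unsigned long value;

    if (up == NULL)
        return 1UL;

    memset(up, 0, sizeof *up);   /* size follows the pointer's type */
    value = *up;                 /* access through an lvalue of type unsigned long */
    free(up);
    return value;
}
```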

    >> The relevant issue is the the fact that it has a known representation.
    >> If you access memory with no declared type through lvalues of character
    >> type in order to create something known to be a valid representation of
    >> an object of a given type, and the first access to that memory is
    >> through an lvalue of that type, that access gives that memory it's
    >> effective type, and will not, in itself, run afoul of any requirement
    >> specified by the C standard.

    >
    > Where does the C standard say that? Please, quote it!


    I can't tell you where it says that this does not "run afoul of any
    requirement specified by the C standard", because the C standard does
    not in fact contain words to that effect. What it does have, is a
    complete absence of requirements violated by such code. I can prove that
    by citation, but only by citation of every single clause of the
    standard, and then pointing out that none of them specifies a
    requirement violated by such code. I'm sorry, but I don't feel inclined
    to honor your request that I quote that.

    If you think that there is a requirement that it violates, you should be
    able to cite that requirement, which seems a more appropriate way to
    address our disagreement on this point.

    >>>> But the difference between the two versions is complete at this point.
    >>>> If the behavior has not yet differed, it has no license to change from
    >>>> this point onward.
    >>> Here's a counter-example. Assume unsigned long and void * have
    >>> the same size 8 (no padding bits), and that the null pointer
    >>> is represented by a sequence of null bytes, and consider the
    >>> following two cases:
    >>>
    >>> void foo (void)
    >>> {
    >>> void *p = malloc(8);
    >>> *(unsigned long *)p = 0;
    >>> printf ("%ld\n", *(unsigned long *)p);
    >>> }
    >>>
    >>> void foo (void)
    >>> {
    >>> void *p = malloc(8);
    >>> *(void **)p = 0;
    >>> printf ("%ld\n", *(unsigned long *)p);
    >>> }

    >
    >> Neither unsigned long nor void** are character types;

    >
    > You never said that this was a requirement.


    I thought that you were already aware of the fact that the anti-aliasing
    rules make special allowances for character types, and that my argument
    was based upon those special allowances. Well, if you didn't know that
    before, you know it now.
    James Kuyper, Apr 30, 2010
    #20