Is the aliasing rule symmetric?

  • Thread starter Johannes Schaub (litb)

ptyxs

So let me ask again, to you and anyone else. Is there any difference
between the two programs:

  #include <stddef.h>
  #include <stdlib.h>
  typedef struct T1 { int x; int y; } T1;
  typedef struct T2 { int x; int y; } T2;
  int main(void)
  { T1 *p = malloc(sizeof *p);
    p->x = 1;
    p->y = 2;
    return p->y;
  }

and

  #include <stddef.h>
  #include <stdlib.h>
  typedef struct T1 { int x; int y; } T1;
  typedef struct T2 { int x; int y; } T2;
  int main()
  {
    void* p = malloc(sizeof(T1));
    * (int*) (((char*)p) + offsetof(T1, x)) = 1;
    * (int*) (((char*)p) + offsetof(T1, y)) = 2;
    return ((T1*)p)->y;
  }

Specifically, I presume that everyone agrees C and C++ need to
support the first program with no UB. The interesting questions I have
concern the second. Does the "return ((T1*)p)->y;" result in UB? Why?
What's the important difference between these two programs, and
specifically, which parts of the standards explain the important
differences?

Also, if the second program has no UB, can we instead return "return
((T2*)p)->y;" for implementations which we've tested that T1 and T2
have equivalent layout? That is, it might not be a portable program,
but for those systems which there is no difference in layout, would
the access through T2 have UB? Why?

Perhaps I missed something, but your first program doesn't compile on
my system:
error: invalid conversion from ‘void*’ to ‘T1*’
 

Joshua Maurice

Perhaps I missed something, but your first program doesn't compile on
my system:
error: invalid conversion from ‘void*’ to ‘T1*’

Sorry. I was thinking in C, and in C, void pointers implicitly convert
to any other object pointer type. Add an explicit cast to make it legal
C and C++ (though very un-idiomatic C), as follows:
{ T1 *p = (T1*) malloc(sizeof *p);
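
With that cast in place, program 1 can be sketched as below so that it should compile as both C and C++. The wrapper name run_program1, the NULL check, and the free call are additions for illustration and are not part of the original post.

```c
#include <stdlib.h>

typedef struct T1 { int x; int y; } T1;
typedef struct T2 { int x; int y; } T2;

/* Hypothetical wrapper around the body of program 1. */
int run_program1(void)
{
    /* The explicit cast is required in C++ and redundant, but legal, in C. */
    T1 *p = (T1 *) malloc(sizeof *p);
    int result;

    if (p == NULL)
        return -1;      /* allocation failed (check added for this sketch) */

    p->x = 1;
    p->y = 2;
    result = p->y;      /* reads back the value just stored through p */
    free(p);
    return result;
}
```

Calling run_program1() from main and returning its result reproduces the behavior of the original program.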
 

Guest

...
Also, if the second program has no UB, can we instead return "return
((T2*)p)->y;" for implementations which we've tested that T1 and T2
have equivalent layout? That is, it might not be a portable program,
but for those systems which there is no difference in layout, would
the access through T2 have UB? Why?

Behavior, upon use of a nonportable program construct, for which the C
Standard imposes no requirements is by definition 3.4.3 so-called
"undefined behavior".
(That doesn't imply that an existing implementation would behave
unpredictably. If you don't care about portability and have found by
inspection of code, data, machine architecture, whatever, that your
program behaves predictably, then don't worry.)
 

Joshua Maurice

Behavior, upon use of a nonportable program construct, for which the C
Standard imposes no requirements is by definition 3.4.3 so-called
"undefined behavior".
(That doesn't imply that an existing implementation would behave
unpredictably. If you don't care about portability and have found by
inspection of code, data, machine architecture, whatever, that your
program behaves predictably, then don't worry.)

I would kindly ask you to read the rest of this thread, and realize
that I am quite well versed on the issues at hand, and I understand
that the intent of the C standards committee was to make the following
program have undefined behavior.

#include <stdlib.h>
typedef struct T1 { int x; int y; } T1;
typedef struct T2 { int x; int y; } T2;
int main(void)
{ T1 *p = malloc(sizeof *p);
p->x = 1;
p->y = 2;
return ((T2*)p)->y;
}

I don't mean to dispute that's how people understand the standard. I
don't plan on writing any code any time soon that violates the well
understood intent of the standard.

However, that's not the rules as written. What I do want to discuss is
if there's any sensible reading of the standard as written which can
give the desired conclusion, while preserving idiomatic usages of C,
such as casting the return of malloc to a struct type pointer, and
assigning to members of that struct.

I note how you did not answer any of my questions, and instead read
the standard line given to those new to the issues. I understand that
this is a generally acceptable method of imparting information, but it
does not apply in this case.

Again: Which of the following programs have UB as written, 1, 2, both,
neither? Why? Please quote exact parts of the standard (C or C++) with
thorough reasoning. Which of the following would have UB if the return
was replaced with "return ((T2*)p)->y;", 1, 2, both, neither? Why?
Please quote exact parts of the standard (C or C++) with thorough
reasoning.

//program 1
#include <stddef.h>
#include <stdlib.h>
typedef struct T1 { int x; int y; } T1;
typedef struct T2 { int x; int y; } T2;
int main(void)
{ T1 *p = (T1*) malloc(sizeof *p);
p->x = 1;
p->y = 2;
return p->y;
}

//program 2
#include <stddef.h>
#include <stdlib.h>
typedef struct T1 { int x; int y; } T1;
typedef struct T2 { int x; int y; } T2;
int main()
{
void* p = malloc(sizeof(T1));
* (int*) (((char*)p) + offsetof(T1, x)) = 1;
* (int*) (((char*)p) + offsetof(T1, y)) = 2;
return ((T1*)p)->y;
}

PS: I hope the intended answer is both do not have UB as written. I
understand the answer that the first would have UB if the return was
changed to "return ((T2*)p)->y;", but even I cannot grasp at the
straws to come to the conclusion that program 2 would have UB if the
return was changed to "return ((T2*)p)->y;".

Note that the above is all assuming a particular implementation where
sizeof(T1) == sizeof(T2), and offsetof(T1, y) == offsetof(T2, y). I
know it's definitely not portable, but I don't see any /rules as
written/ which demand UB on all platforms. (offsetof(T1, x) == 0 and
offsetof(T2, x) == 0 by an already existing guarantee in the C and C++
standards.)
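
A minimal sketch of how the layout assumption above (T1 and T2 having the same size and the same member offsets) could be checked on a given implementation. The helper name layouts_match is hypothetical, and the check says nothing about whether the aliasing itself is permitted.

```c
#include <stddef.h>

typedef struct T1 { int x; int y; } T1;
typedef struct T2 { int x; int y; } T2;

/* Hypothetical helper: returns 1 if T1 and T2 have the same size and the
   same member offsets on this implementation, 0 otherwise. */
int layouts_match(void)
{
    return sizeof(T1) == sizeof(T2)
        && offsetof(T1, x) == offsetof(T2, x)
        && offsetof(T1, y) == offsetof(T2, y);
}
```

On mainstream ABIs this would be expected to report matching layouts, but the standards only guarantee that the first member sits at offset 0.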

offsetof is just a macro which evaluates to an integer. So what if I
passed T1 to it? It shouldn't matter. I should be able to hardcode 4
in place of offsetof(T1, y) and offsetof(T2, y) and expect it to work
on some platforms, like the common x86 win32. I see nothing in program
2 that says we have an object of type or effective type T1 nor T2. I
see only writes through int lvalues. Moreover, I see little to no
difference between program 1 and program 2 in this regard - I see
little reason to talk about an object with type or effective type T1
nor T2 in program 1.

This is especially true in light of the rules for volatile, and the
rules of POSIX, win32 (maybe?) and C++0x threading, which heavily
interact with the definition of "access". What does "access" mean? I
would argue that if that word is to have any meaning, it means exactly
a read, a write, or both. What does it mean to access an object of
struct type with a member expression, ex: "x.y"? It definitely doesn't
read the full struct x, nor write the full struct x. Hell, it doesn't
even imply a read or a write, ex: "int * a = & x.y;". From our well
understood knowledge of threading, that is neither a read nor a write
of "x" nor "x.y". So, what can the strict aliasing rules say about
this? In the above programs, there is no single expression which we
can say "accesses" an object through a T1 nor T2 lvalue, unless we
want to start using two different contradictory definitions of the
word "access". The same conclusion can be reached through a discussion
of the observable behavior requirements of volatile objects.

Let me again emphasize that I don't plan to write production code like
this ever, but these simple examples elucidate the actual scope and
effect of the standards, such as the proper and correct way to write a
pooling memory allocator on top of malloc or the new operator.
Specifically, at least in the C++ case, we need to know when the
lifetimes of objects begin and end, and what it means to access an
object through an incorrectly typed lvalue.

Again, finally, as far as I can see, the only way to make program 1
have UB with the T2 cast return is to invent some rules which
explicitly mention data dependency analysis in the effective type
rules of C, in the object lifetime rules of C++ (or just copy the
effective type rules of C for POD types into C++), and in the allowed
lvalue access rules, aka the strict aliasing rules of C and C++. When
you consider the implications raised with volatile and threading which
heavily interact with the definition of "access", this seems like the
only way out.

Or, if you can prove me wrong, and help me understand an error of
mine, please do so.
 

Guest

(Since my comment only applies to C, I removed comp.lang.c++ from the
newsgroups list.)

...
I would kindly ask you to read the rest of this thread, and realize
that I am quite well versed on the issues at hand, and I understand
that the intent of the C standards committee was to make the following
program have undefined behavior.

  #include <stdlib.h>
  typedef struct T1 { int x; int y; } T1;
  typedef struct T2 { int x; int y; } T2;
  int main(void)
  { T1 *p = malloc(sizeof *p);
    p->x = 1;
    p->y = 2;
    return ((T2*)p)->y;
  }

I don't mean to dispute that's how people understand the standard. I
don't plan on writing any code any time soon that violates the well
understood intent of the standard.

However, that's not the rules as written. What I do want to discuss is
if there's any sensible reading of the standard as written which can
give the desired conclusion, while preserving idiomatic usages of C,
such as casting the return of malloc to a struct type pointer, and
assigning to members of that struct.

For preserving idiomatic usages of C, such as casting the return of
malloc to a struct type pointer, and assigning to members of that
struct, it is not necessary that the above program's behavior were
defined, because the program construct which makes the standard impose
no requirements for the above program's behavior is not one of the
aforementioned.

Unfortunately, at the moment I haven't the time to address your further
concerns - maybe next week.
 

Joshua Maurice

(Since my comment only applies to C, I removed comp.lang.c++ from the
newsgroups list.)

For preserving idiomatic usages of C, such as casting the return of
malloc to a struct type pointer, and assigning to members of that
struct, it is not necessary that the above program's behavior were
defined, because the program construct which makes the standard impose
no requirements for the above program's behavior is not one of the
aforementioned.

I'm sorry, I don't quite follow your English. It is idiomatic usage in
C to implicitly cast the return of malloc to a struct pointer, then,
assign to members of the struct through that pointer, then read or
write those members through the pointer. My program 1 does exactly
that. (Well, it has an explicit cast instead of implicit to make the
code valid C++ as well, but minor point.) Thus the C (and C++)
standard ought to say that program 1 as written has no UB.

Unfortunately, at the moment I haven't the time to address your further
concerns - maybe next week.

Thank you for your time.

In short, I think my "interesting" questions are: For a platform where
T1 and T2 have the same layout (aka this is not a question about
portable semantics):

- Do programs 1 and 2 have any UB as written? I presume that the
answer is no UB in either as written.

- Would program 2 have UB if the return was changed to "return
((T2*)p)->y;" ? Given no UB before, this change cannot introduce UB.

- Would program 1 have UB if the return was changed to "return
((T2*)p)->y;" ? Now, here's my problem.

Under any sane standard, "program 2 as written" and "program 2 with
T2* cast return" must both be defined, or both have UB. Also under any
sane standard, "program 1 as written" must not have UB.

I think our remaining options are:

- "program 1 as written", "program 2 as written", "program 1 with T2*
cast return", and "program 2 with T2* cast return", all do not have
UB. This is a problem because this is in direct contradiction with the
common understanding of the standard and the intent of the standard
that "program 1 with T2* cast return" has UB.

- "2 as written" and "2 with T2* cast return" have UB.

I also do not like this conclusion because I don't see any important
difference between:
p->y = 2;
and
* (int*) (((char*)p) + offsetof(T1, y)) = 2;
Both are simply writes through int lvalues. I could remove the
offsetof macro and hardcode "4", thereby removing any reference to the
type T1 in that expression, ala:
* (int*) (((char*)p) + 4) = 2;
I see nothing in the C (nor C++) standard which gives me any reason to
think that these assignment expressions are anything but entirely
equivalent. Again, I do not mean to claim that the code is /portable/,
but for those implementations for which the layout is like this, it
should not have UB.

- "program 1 as written" has no UB, and the rest have UB. I do not
like this conclusion for the same reasoning as above, namely I do not
see an important difference between:
p->y = 2;
and
* (int*) (((char*)p) + offsetof(T1, y)) = 2;
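
The claim that the two assignment expressions designate the same int can at least be checked at the address level. The following is a minimal sketch; the helper name same_member_address is hypothetical, and the comparison only shows that both expressions yield the same pointer value, not that the standards treat the two writes identically for effective-type purposes.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct T1 { int x; int y; } T1;

/* Hypothetical helper: returns 1 if both expressions designate the same
   address, 0 if they differ, -1 if the allocation fails. */
int same_member_address(void)
{
    T1 *p = (T1 *) malloc(sizeof *p);
    int *via_member;
    int *via_offset;
    int same;

    if (p == NULL)
        return -1;

    via_member = &p->y;                                   /* member expression */
    via_offset = (int *) ((char *) p + offsetof(T1, y));  /* byte arithmetic */
    same = (via_member == via_offset);                    /* no read or write of *p */

    free(p);
    return same;
}
```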
 

Stephen Sprunk

I'm sorry, I don't quite follow your English. It is idiomatic usage in
C to implicitly cast the return of malloc to a struct pointer, then,

There is no such thing as an "implicit cast". A "cast" is an explicit
conversion; an implicit conversion is one done without casting.

S
 

Joshua Maurice

There is no such thing as an "implicit cast".  A "cast" is an explicit
conversion; an implicit conversion is one done without casting.

Mmm. Yes. Thank you.
 

Joel C. Salomon

In short, I think my "interesting" questions are: For a platform where
T1 and T2 have the same layout (aka this is not a question about
portable semantics):

Would it be fair to rephrase the question as, "Given an
implementation-defined guarantee of X (permitted but not required by the
Standard), are Y and Z necessarily defined as well?"?

--Joel
 

Joshua Maurice

Would it be fair to rephrase the question as, "Given an
implementation-defined guarantee of X (permitted but not required by the
Standard), are Y and Z necessarily defined as well?"?

Well, it's more like asking "When am I allowed to read a value through
an int lvalue, ex:
T1* p = /* ... */
return p->y;
?"

The entire point of these questions is to ferret out the actual
requirements of the strict aliasing rules and/or the effective type
rules. Why would the return "return ((T1*)p)->y;" not be UB, but
"return ((T2*)p)->y" be UB?

As I described, in program 1 and program 2, I don't see how that is a
read or write through a T1 lvalue, so I don't see how the strict
aliasing rules or the effective type rules apply.

Moreover, why should program 1 create an object with type or effective
type T1? Why not T2? I see no important difference between programs 1
and 2 with regards to discussing the effective type of the object in
the memory returned by malloc, and I clearly see nothing in program 2
which creates a T1 object, but not a T2 object. So, something's got to
give. Either:

1-
p->y = 2;
and
* (int*) (((char*)p) + offsetof(T1, y)) = 2;
have different semantics, where the first somehow participates in
changing the effective type of the object to T1, or

2- the well understood meaning of the strict aliasing rules and the
effective type rules is not actually specified in the standard.

At least, I think those are our choices. Is there a third option? I
don't see any as being particularly intuitive or "obviously right".
 

Johannes Schaub (litb)

Joshua said:
No. Don't think about it as an aliasing rule. Think about it as a rule
which restricts the types of lvalues with which you can legally access
objects.

You can always access an object through a char or unsigned char
lvalue. (Or maybe it's only for POD types - there's no consensus. I
would only use char and unsigned char to access POD objects.)
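
As a minimal sketch of the character-type access just described, the following copies a POD struct byte by byte through unsigned char lvalues. The helper name copy_bytes is hypothetical.

```c
#include <stddef.h>

typedef struct T1 { int x; int y; } T1;

/* Hypothetical helper: copies the object representation of *src into *dst
   one byte at a time, using only unsigned char lvalues for the accesses. */
void copy_bytes(T1 *dst, const T1 *src)
{
    unsigned char *d = (unsigned char *) dst;
    const unsigned char *s = (const unsigned char *) src;
    size_t i;

    for (i = 0; i < sizeof(T1); ++i)
        d[i] = s[i];
}
```

After copy_bytes(&b, &a), reading b.x and b.y through the T1 lvalue b gives the values stored in a.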

You can always access an object through a base class lvalue, but you
can never do the reverse: you can never take a complete object of type
T and access it through an lvalue of a type derived from T.

You cannot access an object of derived class type and access it as a base
class lvalue either. You always need to point to the proper base class
subobject. If you try to directly access the complete object by a base class
lvalue, you will be lucky if it crashes.

In this sense it's the same for base/derived relationship in both
directions. If the base-class subobject and the complete object have the
same address, you can reinterpret_cast and if you aren't lucky you can
read/write with the resulting lvalue. If you do the proper thing and use an
implicit conversion or an explicit conversion (for the downcast), you have
defined behavior. But that has nothing to do with the aliasing rule. IMO the
respective bullet in 3.10p15 is flawed.
 

Johannes Schaub (litb)

Johannes said:
You cannot access an object of derived class type and access it as a base
class lvalue either. You always need to point to the proper base class
subobject. If you try to directly access the complete object by a base
class lvalue, you will be lucky if it crashes.

In this sense it's the same for base/derived relationship in both
directions. If the base-class subobject and the complete object have the
same address, you can reinterpret_cast and if you aren't lucky you can
read/write with the resulting lvalue. If you do the proper thing and use
an implicit conversion or an explicit conversion (for the downcast), you
have defined behavior. But that has nothing to do with the aliasing rule.
IMO the respective bullet in 3.10p15 is flawed.

Having thought about this again, I think the respective bullet is NOT
flawed. The bullet implies that you already have made a successful
conversion and have a proper lvalue.

We do actually have the reverse (access a base class object by the derived
class type), by means of "the dynamic type of the object" (first bullet). It
is caught by that, and to my surprise, if you turn around the bullet about
the base-class subobject rule according to the symmetry rule, you get nearly
the same wording

- a type that is the (possibly cv-qualified) dynamic class type of
the type of the object

So I think we again see that the following rule seems to be true:

If aliasing of an A object by an lvalue of type B is OK,
is aliasing of a B object by an lvalue of type A OK?

Please correct me if I'm misunderstanding anything.
 

Johannes Schaub (litb)

Johannes said:
You cannot access an object of derived class type and access it as a base
class lvalue either. You always need to point to the proper base class
subobject. If you try to directly access the complete object by a base
class lvalue, you will be lucky if it crashes.

In this sense it's the same for base/derived relationship in both
directions. If the base-class subobject and the complete object have the
same address, you can reinterpret_cast and if you aren't lucky you can
read/write with the resulting lvalue. If you do the proper thing and use
an implicit conversion or an explicit conversion (for the downcast), you
have defined behavior. But that has nothing to do with the aliasing rule.
IMO the respective bullet in 3.10p15 is flawed.

Having thought about this again, I think the respective bullet is NOT
flawed. The bullet implies that you already have made a successful
conversion and have a proper lvalue.

I think we do actually have the reverse (access a base class object by the
derived class type),

- a type that is a (possibly cv-qualified) derived class type of the
dynamic type of the object

Converting from a "Base&" to a "Derived&" is already UB if the type of the
complete object of the object referred to is not of type "Derived" or not of
a type derived from Derived. So this rule too assumes that we have a proper
lvalue, and thus the symmetric equivalent to that bullet as above is true
too.

So I think we again see that the following rule seems to be true:

If aliasing of an A object by an lvalue of type B is OK,
is aliasing of a B object by an lvalue of type A OK?

Please correct me if I'm misunderstanding anything.
 

Johannes Schaub (litb)

Joshua said:
So let me ask again, to you and anyone else. Is there any difference
between the two programs:

#include <stddef.h>
#include <stdlib.h>
typedef struct T1 { int x; int y; } T1;
typedef struct T2 { int x; int y; } T2;
int main(void)
{ T1 *p = malloc(sizeof *p);
p->x = 1;
p->y = 2;
return p->y;
}

and

#include <stddef.h>
#include <stdlib.h>
typedef struct T1 { int x; int y; } T1;
typedef struct T2 { int x; int y; } T2;
int main()
{
void* p = malloc(sizeof(T1));
* (int*) (((char*)p) + offsetof(T1, x)) = 1;
* (int*) (((char*)p) + offsetof(T1, y)) = 2;
return ((T1*)p)->y;
}

Specifically, I presume that everyone agrees C and C++ need to
support the first program with no UB. The interesting questions I have
concern the second. Does the "return ((T1*)p)->y;" result in UB? Why?
What's the important difference between these two programs, and
specifically, which parts of the standards explain the important
differences?

Assuming layout and sizes are equal for structurally equal structs, and
assuming the spec is correct:

P1: In the first, no object prior to the first write has a declared type.
After the write, there are two objects that have an effective type of type
int, and the read afterwards is alright.

P2: In the second, the situation until the return statement is exactly
equal. At the return statement, you access the second object whose effective
type is 'int' by an 'int', so you go fine too (exactly like in P1). So this
is fine too.

Also, if the second program has no UB, can we instead return "return
((T2*)p)->y;" for implementations which we've tested that T1 and T2
have equivalent layout? That is, it might not be a portable program,
but for those systems which there is no difference in layout, would
the access through T2 have UB? Why?

Since we assume layout and size is equal, Casting p to T2 will make no
difference. You still access the second int by an int lvalue. So this is
fine too.

Let's make a different program

T1 *p = malloc(sizeof *p);
*p = (T1){ 0, 1 };

Now we have 3 effectively typed objects. The effective type of the first is
T1, and the one of the second and third is int. Given that, the following is
UB:

T2 p1 = *(T2*)p;

Because you violate the aliasing rule, accessing a T1 effectively typed
object by a T2 lvalue.

I think that is what the spec says. And I don't think it agrees with what
the committee says. The committee wants to say that in P1, there exists a T1
object in addition. According to the committee, if you insert a cast to a
T2 lvalue in P1's return statement, the result is undefined. But the spec
does not say that.
 

Johannes Schaub (litb)

Joshua said:
So let me ask again, to you and anyone else. Is there any difference
between the two programs:

#include <stddef.h>
#include <stdlib.h>
typedef struct T1 { int x; int y; } T1;
typedef struct T2 { int x; int y; } T2;
int main(void)
{ T1 *p = malloc(sizeof *p);
p->x = 1;
p->y = 2;
return p->y;
}

and

#include <stddef.h>
#include <stdlib.h>
typedef struct T1 { int x; int y; } T1;
typedef struct T2 { int x; int y; } T2;
int main()
{
void* p = malloc(sizeof(T1));
* (int*) (((char*)p) + offsetof(T1, x)) = 1;
* (int*) (((char*)p) + offsetof(T1, y)) = 2;
return ((T1*)p)->y;
}

Specifically, I presume that everyone agrees C and C++ need to
support the first program with no UB. The interesting questions I have
concern the second. Does the "return ((T1*)p)->y;" result in UB? Why?
What's the important difference between these two programs, and
specifically, which parts of the standards explain the important
differences?

Assuming layout and sizes are equal for structurally equal structs, and
assuming the spec is correct:

P1: In the first, no object prior to the first write has a declared type.
After the write, there are two objects that have an effective type of type
int, and the read afterwards is alright.

P2: In the second, the situation until the return statement is exactly
equal. At the return statement, you access the second object whose effective
type is 'int' by an 'int', so you go fine too (exactly like in P1). So this
is fine too.

Also, if the second program has no UB, can we instead return "return
((T2*)p)->y;" for implementations which we've tested that T1 and T2
have equivalent layout? That is, it might not be a portable program,
but for those systems which there is no difference in layout, would
the access through T2 have UB? Why?

Since we assume layout and size is equal, Casting p to T2 will make no
difference. You still access the second int by an int lvalue. So this is
fine too.

Let's make a different program

T1 *p = malloc(sizeof *p);
*p = (T1){ 0, 1 };

Now we have 1 effectively typed object, and that object has type T1. Given
that, the following is UB:

T2 p1 = *(T2*)p;

Because you violate the aliasing rule, accessing a T1 effectively typed
object by a T2 lvalue.

I think that is what the spec says. And I don't think it agrees with what
the committee says. The committee wants to say that in P1, there exists a T1
object in addition. According to the committee, if you insert a cast to a
T2 lvalue in P1's return statement, the result is undefined. But the spec
does not say that.
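
For comparison, a way to obtain a T2 value from the T1 object without dereferencing a T2 pointer is to copy the bytes into an object whose declared type is T2. This is not something proposed in the thread; it is a minimal sketch that assumes T1 and T2 have identical layout, and the helper name read_y_as_t2 is hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct T1 { int x; int y; } T1;
typedef struct T2 { int x; int y; } T2;

/* Hypothetical helper: stores a whole T1 into malloc'ed memory, then reads
   member y through a T2 copy. Returns -1 if the allocation fails or if the
   layout assumption does not hold on this implementation. */
int read_y_as_t2(void)
{
    T1 init = { 0, 1 };
    T1 *p = (T1 *) malloc(sizeof *p);
    T2 copy;

    if (p == NULL
        || sizeof(T1) != sizeof(T2)
        || offsetof(T1, y) != offsetof(T2, y)) {
        free(p);            /* free(NULL) is a no-op */
        return -1;
    }

    *p = init;              /* whole-struct store through a T1 lvalue */

    /* Copy the object representation instead of evaluating *(T2*)p. */
    memcpy(&copy, p, sizeof copy);
    free(p);
    return copy.y;
}
```

The destination has declared type T2, so the later read of copy.y goes through an lvalue that matches the object's own type.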
 

Johannes Schaub (litb)

Johannes said:
Assuming layout and sizes are equal for structurally equal structs, and
assuming the spec is correct:

P1: In the first, no object prior to the first write has a declared type.
After the write, there are two objects that have an effective type of type
int, and the read afterwards is alright.

P2: In the second, the situation until the return statement is exactly
equal. At the return statement, you access the second object whose
effective type is 'int' by an 'int', so you go fine too (exactly like in
P1). So this is fine too.


Since we assume layout and size is equal, Casting p to T2 will make no
difference. You still access the second int by an int lvalue. So this is
fine too.

Let's make a different program

T1 *p = malloc(sizeof *p);
*p = (T1){ 0, 1 };

Now we have 1 effectively typed object, and that object has type T1. Given
that, the following is UB:

T2 p1 = *(T2*)p;

Because you violate the aliasing rule, accessing a T1 effectively typed
object by a T2 lvalue.

I think that is what the spec says. And I don't think it agrees with what
the committee says. The committee wants to say that in P1, there exists a
T1 object in addition. According to the committee, if you insert a cast to
a T2 lvalue in P1's return statement, the result is undefined. But the spec
does not say that.

In particular, I think the committee intends the spec to say that a struct
or union access expression involves an access with the struct or union
lvalue.

T1 *p = malloc(sizeof *p);
p->x = 0;

In this case, I think the committee's intent is that the object pointed to
by "p" is accessed by an lvalue of type T1, and so the effective type of the
object containing the int changes to T1. So a later cast and access by an
lvalue of T2 will be undefined behavior.
 

Joshua Maurice

In particular, I think the committee intends the spec to say that a struct
or union access expression involves an access with the struct or union
lvalue.

    T1 *p = malloc(sizeof *p);
    p->x = 0;

In this case, I think the committee's intent is that the object pointed to
by "p" is accessed by an lvalue of type T1, and so the effective type of the
object containing the int changes to T1. So a later cast and access by an
lvalue of T2 will be undefined behavior.

I think this is also the only sensible interpretation of the
committee's intent. That is
p->y = 2;
is not equivalent to
* (int*) (((char*)p) + offsetof(T1, y)) = 2;
That is, the apparently only sensible way out is: the first somehow
participates in unwritten rules to make a T1 object, and the offsetof
way does not.

I wonder where they want to draw the difference. Let this be the
context for the following questions:
#include <stddef.h>
#include <stdlib.h>

typedef struct T1 { int x; int y; } T1;
typedef struct T2 { int x; int y; } T2;

int main()
{
void* p = malloc(sizeof(T1));
/* ... */
}

Consider the subsequent alterations. Let's start with the simple:
T1* a = (T1*) p;
a->y = 2;
return a->y;
Now, changing it to the following shouldn't give it UB.
T1* a = (T1*) p;
T2* b = (T2*) p;
a->y = 2;
return a->y;
Let's add an explicit temporary variable as follows.
T1* a = (T1*) p;
T2* b = (T2*) p;
int* a_y = & a->y;
*a_y = 2;
return a->y;
Let's add another variable.
T1* a = (T1*) p;
T2* b = (T2*) p;
int* a_y = & a->y;
int* b_y = & b->y;
*a_y = 2;
return a->y;
Ok, up to this point, I'm pretty sure everyone would agree that we
have no UB. Now, let's take that one dubious step, and transform the
above to:
T1* a = (T1*) p;
T2* b = (T2*) p;
int* a_y = (int*) (((char*)a) + offsetof(T1, y));
int* b_y = (int*) (((char*)b) + offsetof(T2, y));
*a_y = 2;
return a->y;
Quick change, replacing some of the "a" and "b" with "p":
T1* a = (T1*) p;
T2* b = (T2*) p;
int* a_y = (int*) (((char*)p) + offsetof(T1, y));
int* b_y = (int*) (((char*)p) + offsetof(T2, y));
*a_y = 2;
return a->y;
Now we have a problem, because on any sane implementation,
offsetof(T1, y) == offsetof(T2, y), which means for most
implementations I can transform it to:
T1* a = (T1*) p;
T2* b = (T2*) p;
int* a_y = (int*) (((char*)p) + offsetof(T2, y));
int* b_y = (int*) (((char*)p) + offsetof(T2, y));
*a_y = 2;
return a->y;
and reverse the dubious step to get:
T1* a = (T1*) p;
T2* b = (T2*) p;
int* a_y = & b->y;
int* b_y = & b->y;
*a_y = 2;
return a->y;
simplify a bit:
((T2*)p)->y = 2;
return ((T1*)p)->y;
And we're done.

So, that means we need to conclude that:
int* y = & a->y;
is fundamentally different than:
int* y = (int*) (((char*)a) + offsetof(T1, y));
And I think the only way we can formalize this is to require data
dependency analysis. Let me repeat the first program fragment here:
T1* a = (T1*) p;
a->y = 2;
return a->y;
aka:
T1* a = (T1*) p;
int* y = & a->y;
*y = 2;
return *y;
Our only way out appears to be: we have an object of effective type T1
because of the int write "*y = 2;", and because that int write went
through an int lvalue "*y" / int pointer "y" which was obtained via a
data dependency from a memberof expression on a T1 type lvalue "int* y
= & a->y;".

So, earlier when I was rambling on comp.std.c++ about making memberof
expressions special, I was right in that it is the only way out. However,
this just strikes me as fundamentally wrong. I don't like it.
 

Joshua Maurice

You cannot access an object of derived class type and access it as a base
class lvalue either. You always need to point to the proper base class
subobject. If you try to directly access the complete object by a base class
lvalue, you will be lucky if it crashes.

In this sense it's the same for base/derived relationship in both
directions. If the base-class subobject and the complete object have the
same address, you can reinterpret_cast and if you aren't lucky you can
read/write with the resulting lvalue. If you do the proper thing and use an
implicit conversion or an explicit conversion (for the downcast), you have
defined behavior. But that has nothing to do with the aliasing rule. IMO the
respective bullet in 3.10p15 is flawed.

Indeed and agreed. Pedantic, but still important. This becomes evident
in multiple inheritance and virtual inheritance cases.
 

Joshua Maurice

Having thought about this again, I think the respective bullet is NOT
flawed. The bullet implies that you already have made a successful
conversion and have a proper lvalue.

We do actually have the reverse (access a base class object by the derived
class type), by means of "the dynamic type of the object" (first bullet). It
is caught by that, and to my surprise, if you turn around the bullet about
the base-class subobject rule according to the symmetry rule, you get nearly
the same wording

  - a type that is the (possibly cv-qualified) dynamic class type of
    the type of the object

Implicit in that entire piece of standard is that you obtained that
lvalue through a "proper" explicit or implicit conversion or cast. If
you start throwing around reinterpret_casts, then it's quite easy to
break it. Consider:
struct A { int x; };
struct B : A {};
int main()
{
B b;
b.x = 1;
A* a = & b;
return a->x;
}
Now, what's left is quite pedantic, and I'm not sure of the exact
nomenclature. When I access the base class subobject a la "return
a->x;", is that considered "accessing the stored value of the [derived
class] object" according to the wording of C++03 "3.10 Lvalues and
rvalues / 15"? I presume yes. Those bullets are there just as
allowance that you /can/ access base class subobjects through base
class type lvalues and through lvalues of the member types, and you
can access the object through the dynamic type of the object. It
doesn't mention that the lvalues must have been properly obtained -
the following is an example of improperly obtaining the lvalue:
int main()
{
B b;
b.x = 1;
A* a = reinterpret_cast<A*>(&b);
return a->x;
}
Is the above UB? I don't know. Maybe? Either way you should never do
it. It definitely is UB if we have virtual or multiple inheritance.
Where is this fundamental distinction mentioned in the standard?
Nowhere that I can see.
So I think we again see that the following rule seems to be true:

    If aliasing of an A object by an lvalue of type B is OK,
    is aliasing of a B object by an lvalue of type A OK?

Please correct me if I'm misunderstanding anything.

Well, yes. If you have an A object, and you can access a sub-object of
that, or a containing object of that, through a B lvalue, then you can
definitely take that same B object, and access the corresponding A
object through an A lvalue. Are you trying to say something more?
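
A small sketch of the symmetric case, using the same A and B as in the single inheritance example above (the function name is made up). It assumes the lvalues are obtained through the proper conversions in both directions, not through reinterpret_cast.

```cpp
#include <cassert>

struct A { int x; };
struct B : A {};

// Goes from the derived object to its base subobject and back again.
int round_trip() {
    B b;
    b.x = 1;

    A* a = &b;                     // derived-to-base: implicit conversion
    a->x = 5;                      // write through the base class lvalue

    B* back = static_cast<B*>(a);  // base-to-derived: fine, a points into a B
    assert(back == &b);

    return back->x;                // read through the derived class lvalue
}
```

Both accesses hit the same int, once through an A lvalue and once through a B lvalue, and neither direction needs anything beyond the ordinary conversions.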
 
J

Johannes Schaub (litb)

Joshua said:
I think this is also the only sensible interpretation of the
committee's intent. That is
p->y = 2;
is not equivalent to
* (int*) (((char*)p) + offsetof(T1, y)) = 2;
That is, the apparently only sensible way out is: the first somehow
participates in unwritten rules to make a T1 object, and the offsetof
way does not.
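
For reference, a sketch showing that the two forms at least name the same address on a real T1 object. The function name is made up, and the sketch deliberately uses an automatic T1 object rather than malloc so that the question of which object exists does not arise. It says nothing about whether the two forms are equivalent as far as effective type is concerned, which is the open question here.

```cpp
#include <cassert>
#include <cstddef>

struct T1 { int x; int y; };

// Compares the member-access form with the offsetof form on one object.
int member_and_offset_agree() {
    T1 obj = {1, 2};

    int* via_member = &obj.y;  // the "p->y" form
    int* via_offset = reinterpret_cast<int*>(
        reinterpret_cast<char*>(&obj) + offsetof(T1, y));  // the offsetof form

    // Both expressions are expected to yield the same address.
    assert(via_member == via_offset);

    return *via_member;  // read only through the member-derived pointer
}
```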

I wonder where they want to draw the difference. Let this be the
context for the following questions:
#include <stddef.h>
#include <stdlib.h>

typedef struct T1 { int x; int y; } T1;
typedef struct T2 { int x; int y; } T2;

int main()
{
void* p = malloc(sizeof(T1));
/* ... */
}

Consider the subsequent alterations. Let's start with the simple:
T1* a = (T1*) p;
a->y = 2;
return a->y;
Now, changing it to the following shouldn't give it UB.
T1* a = (T1*) p;
T2* b = (T2*) p;
a->y = 2;
return a->y;
Let's add an explicit temporary variable as follows.
T1* a = (T1*) p;
T2* b = (T2*) p;
int* a_y = & a->y;
*a_y = 2;
return a->y;
Let's add another variable.
T1* a = (T1*) p;
T2* b = (T2*) p;
int* a_y = & a->y;
int* b_y = & b->y;
*a_y = 2;
return a->y;
Ok, up to this point, I'm pretty sure everyone would agree that we
have no UB. Now, let's take that one dubious step, and transform the
above to:
T1* a = (T1*) p;
T2* b = (T2*) p;
int* a_y = (int*) (((char*)a) + offsetof(T1, y));
int* b_y = (int*) (((char*)b) + offsetof(T2, y));
*a_y = 2;
return a->y;
Quick change, replacing some of the "a" and "b" with "p":
T1* a = (T1*) p;
T2* b = (T2*) p;
int* a_y = (int*) (((char*)p) + offsetof(T1, y));
int* b_y = (int*) (((char*)p) + offsetof(T2, y));
*a_y = 2;
return a->y;
Now we have a problem, because on any sane implementation,
offsetof(T1, y) == offsetof(T2, y), which means for most
implementations I can transform it to:
T1* a = (T1*) p;
T2* b = (T2*) p;
int* a_y = (int*) (((char*)p) + offsetof(T2, y));
int* b_y = (int*) (((char*)p) + offsetof(T2, y));
*a_y = 2;
return a->y;
and reverse the dubious step to get:
T1* a = (T1*) p;
T2* b = (T2*) p;
int* a_y = & b->y;
int* b_y = & b->y;
*a_y = 2;
return a->y;
simplify a bit:
((T2*)p)->y = 2;
return ((T1*)p)->y;
And we're done.

I think I'm missing something. This last simplification does not seem to be
valid according to the intent. In the unsimplified code, before executing
the "return a->y" you have, for the write access to "*a_y":

object 1: address X, sizeof(int), effective type: int

for the return access you have

object 1: lvalue T1, address X, sizeof(T1), effective type: T1
object 2: lvalue int, address X, sizeof(int), effective type: int

The effective type in the access to object 1 was taken from the type of the
"lvalue" used for the access. For object 2, the effective type used was the
one set by the write in "*a_y = 2". Now for your simplification, before
executing the "return ((T1*)p)->y" you have for the preceding write:

object 1: address X, sizeof(T2), effective type: T2
object 2: address X, sizeof(int), effective type: int

Now you are doing a member access in the return statement, accessing the
first object using an "lvalue" of type T1, but the object has effective type
T2, violating the aliasing rule.
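
As an aside, one commonly suggested way to get the "write as T2, read as T1" effect without running into this question at all is to copy the bytes. The following is only a sketch under the layout assumption discussed earlier in the thread (T1 and T2 have the same size and member offsets), which it checks at compile time. The function name is made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct T1 { int x; int y; };
struct T2 { int x; int y; };

// Layout assumptions from the thread, checked at compile time.
static_assert(sizeof(T1) == sizeof(T2), "T1 and T2 must have the same size");
static_assert(offsetof(T1, y) == offsetof(T2, y), "y must sit at the same offset");

int write_as_t2_read_as_t1() {
    T2 src;
    src.x = 1;
    src.y = 2;  // the write goes through a T2 lvalue to a T2 object

    T1 dst;
    std::memcpy(&dst, &src, sizeof dst);  // copy bytes; no T1 lvalue touches src

    assert(dst.x == 1);
    return dst.y;  // the read goes through a T1 lvalue to a T1 object
}
```

Here every access uses an lvalue whose type matches the object it designates, so the aliasing rule is not involved. The price is a copy, which compilers often optimize away for small structs.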
 
