OOP: legal cast?

Alberto Giménez · May 4, 2005

Hi, I've seen some object-oriented programming bits out there and I'm
not sure if they're legal. For example:

struct Object {
int field1;
int field2;
};

struct SubObject {
int field1; /* the same as Object */
int field2; /* the same as Object */

int subobject_field1;
int subobject_field2;
};

And a cast is used to reference the "superclass" of SubObject:

struct SubObject *subobject;

struct Object *parent = (Object *) subobject; /* legal cast? */

And then use parent->field1, etc. Is that cast legal? invokes UB or
something worse?

I've also seen some other code, which IMHO is more correct and elegant,
and which is a kind of real framework for OOP in ANSI C:

struct Class {
size_t size;
int (*ctor) (... etc)
};

struct String {
struct Class *class;
char *text;
};

And so on. Of course, it uses clever new() and similar functions.
I've seen it in a PDF book, but I can't tell you the title (the PDF has
no title page).
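
A minimal sketch of how such a descriptor-driven new() might look. The post elides the real constructor signature, so the ctor parameters, the names String_ctor, String_class and new_object, and the const qualifier on the descriptor pointer are all assumptions made for illustration, not the book's actual design.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical type descriptor: object size plus a constructor hook. */
struct Class {
    size_t size;
    int (*ctor)(void *self, const char *arg);   /* assumed: 0 on success */
};

struct String {
    const struct Class *class;   /* must remain the first member */
    char *text;
};

/* Constructor for String: stores a private copy of arg. */
static int String_ctor(void *self, const char *arg)
{
    struct String *s = self;
    size_t n = strlen(arg) + 1;

    s->text = malloc(n);
    if (s->text == NULL)
        return -1;
    memcpy(s->text, arg, n);
    return 0;
}

static const struct Class String_class = {
    sizeof(struct String), String_ctor
};

/* Generic allocator: reserves class->size zeroed bytes, stores the
 * descriptor in the leading pointer slot, then runs the constructor.
 * Returns NULL if allocation or construction fails. */
static void *new_object(const struct Class *class, const char *arg)
{
    void *obj = calloc(1, class->size);

    if (obj == NULL)
        return NULL;
    *(const struct Class **)obj = class;
    if (class->ctor != NULL && class->ctor(obj, arg) != 0) {
        free(obj);
        return NULL;
    }
    return obj;
}
```

A caller would write something like struct String *s = new_object(&String_class, "hello"); and later release s->text and s. The scheme only works if every object type keeps the descriptor pointer as its first member.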

Thanks and greetings.

Eric Sosman · May 4, 2005

Alberto said:
Hi, I've seen some object-oriented programming bits out there and I'm
not sure if they're legal. For example:

struct Object {
int field1;
int field2;
};

struct SubObject {
int field1; /* the same as Object */
int field2; /* the same as Object */

int subobject_field1;
int subobject_field2;
};

And a cast is used to reference the "superclass" of SubObject:

struct SubObject *subobject;

struct Object *parent = (Object *) subobject; /* legal cast? */

And then use parent->field1, etc. Is that cast legal? invokes UB or
something worse?

The pointer conversion is "legal" in the sense that the
result could be re-converted to a `struct SubObject*' and
work correctly. The conversion itself causes no trouble.

However, I think it's "illegal" to use the converted
pointer to access the fields of a `struct Object' "overlaid"
at the beginning of the `struct SubObject'. It will work
just fine on every compiler I personally have encountered,
but as far as I can see the Standard doesn't describe what
the behavior will be -- which makes the behavior "undefined."
(The guarantees of 6.5.2.3/5 apply only to union instances
that contain structs, not to free-standing struct objects.)
The technique is like driving faster than the speed limit:
technically illegal, but Everybody Does It.
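
For contrast, here is a sketch of the one arrangement that 6.5.2.3/5 does cover: both struct types are members of the same union, and the union declaration is visible where the access happens. The names union AnyObject and read_field1 are made up for this illustration.

```c
struct Object {
    int field1;
    int field2;
};

struct SubObject {
    int field1;             /* common initial sequence with Object */
    int field2;
    int subobject_field1;
    int subobject_field2;
};

/* Hypothetical wrapper: because both structs live in one union, the
 * common initial sequence may be inspected through either member. */
union AnyObject {
    struct Object    base;
    struct SubObject sub;
};

/* Reads field1 through the "parent" view, whichever member was stored. */
static int read_field1(const union AnyObject *u)
{
    return u->base.field1;
}
```

The cost is that every "object" has to be declared as the union type, which is rarely how the free-standing cast idiom is used in practice.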

I've also seen some other code, which IMHO is more correct and elegant,
and which is a kind of real framework for OOP in ANSI C:

struct Class {
size_t size;
int (*ctor) (... etc)
};

struct String {
struct Class *class;
char *text;
};

This is fine. However, deep inheritance hierarchies will
produce deeply-nested structures; this can make the notation
rather cumbersome:

/* Strad extends Violin extends String extends Class,
* and Fokker extends Triplane extends Aircraft
* extends Vehicle extends Class
*/
struct Strad *strad = ...;
struct Fokker *fokker = ...;
size_t bufsize = (strad->violin.string.class.size
> fokker->triplane.aircraft.vehicle.class.size)
? strad->violin.string.class.size
: fokker->triplane.aircraft.vehicle.class.size;

If your code starts looking like this, you might want to
reconsider your choice of implementation language.

Mark Piffer · May 4, 2005

Eric said:
The pointer conversion is "legal" in the sense that the
result could be re-converted to a `struct SubObject*' and
work correctly. The conversion itself causes no trouble.

However, I think it's "illegal" to use the converted
pointer to access the fields of a `struct Object' "overlaid"
at the beginning of the `struct SubObject'. It will work
just fine on every compiler I personally have encountered,
but as far as I can see the Standard doesn't describe what
the behavior will be -- which makes the behavior "undefined."
(The guarantees of 6.5.2.3/5 apply only to union instances
that contain structs, not to free-standing struct objects.)
The technique is like driving faster than the speed limit:
technically illegal, but Everybody Does It.

I can't point to a real implementation where this UB actually makes
such a program fail either, but that's probably due to my limited
experience. I can, however, easily imagine an implementation that
overwrites portions of the larger struct when it is accessed through
the smaller struct type. Say there is a fast instruction that zeroes 32
bits at once, the last member of the smaller struct is a 32-bit-aligned
16-bit value followed by 16 bits of padding, and the larger struct has
one more 16-bit member in that place. So I suppose this is rather like
driving the wrong way on a highway: it is highly risky, yet some do it,
and some even survive.

This is fine. However, deep inheritance hierarchies will
produce deeply-nested structures; this can make the notation
rather cumbersome:

/* Strad extends Violin extends String extends Class,
* and Fokker extends Triplane extends Aircraft
* extends Vehicle extends Class
*/
struct Strad *strad = ...;
struct Fokker *fokker = ...;
size_t bufsize = (strad->violin.string.class.size
> fokker->triplane.aircraft.vehicle.class.size)
? strad->violin.string.class.size
: fokker->triplane.aircraft.vehicle.class.size;

If your code starts looking like this, you might want to
reconsider your choice of implementation language.

Mark

E. Robert Tisdale · May 4, 2005

Alberto said:
I've seen some object oriented programming bits out there

Can you tell us exactly where "out there"?

but I'm not sure if they're legal.
For example:

struct Object {
int field1;
int field2;
};

You are confused. An object is *not* a type.
It is an *instance* of a type.

struct SubObject {
int field1; /* the same as Object */
int field2; /* the same as Object */

int subobject_field1;
int subobject_field2;
};

This is *not* a sub object because it isn't an object.
It isn't even a subtype. It's just another
[user defined] type that happens to have data members
with the same name as another [user defined] data type.

And a cast is used to reference the "superclass" of SubObject:

struct SubObject *subobject;

struct Object *parent = (Object *) subobject; /* legal cast? */

And then use parent->field1, etc. Is that cast legal? invokes UB or
something worse?

I've also seen some other code, which, IMHO, is more correct and elegant,
and which is a kind of real framework for OOP in ANSI C:

struct Class {
size_t size;
int (*ctor) (... etc)

What is this supposed to be?

};

struct String {
struct Class *class;
char *text;
};

And so on. Of course, it uses clever new() and similar functions.
I've seen it in a PDF book,
but I can't tell you the title (the PDF has no title page).

> cat main.c

#include <stdio.h>

typedef struct Base { // super type
int field1;
int field2;
} Base;

inline static
Base Base_create(int f1, int f2) {
Base base;
base.field1 = f1;
base.field2 = f2;
return base;
}
inline static
void Base_destroy(const Base* p) {
// do nothing
}
inline static
int Base_field1(const Base* p) {
return p->field1;
}
inline static
int Base_field2(const Base* p) {
return p->field2;
}
inline static
int Base_fprintf(FILE* fp, const Base* p) {
return fprintf(fp, "%d %d", p->field1, p->field2);
}

typedef struct Derived {// sub type
Base base;
int field1;
int field2;
} Derived;

inline static
Derived Derived_create(const Base* p, int f1, int f2) {
Derived derived;
derived.base = *p;
derived.field1 = f1;
derived.field2 = f2;
return derived;
}
inline static
void Derived_destroy(const Derived* p) {
// do nothing
}
inline static const
Base* Derived_base(const Derived* p) {
return &(p->base);
}
inline static
int Derived_field1(const Derived* p) {
return p->field1;
}
inline static
int Derived_field2(const Derived* p) {
return p->field2;
}
inline static
int Derived_fprintf(FILE* fp, const Derived* p) {
int total = Base_fprintf(fp, &(p->base));
if (0 < total) {
int local = fprintf(fp, " %d %d", p->field1, p->field2);
total = (0 < local)? total + local: local;
}
return total;
}

int main(int argc, char* argv[]) {
const
Base base = Base_create(13, 14);
const
Derived derived = Derived_create(&base, 15, 16);
Derived_fprintf(stdout, &derived);
fprintf(stdout, "\n");
Base_fprintf(stdout, Derived_base(&derived));
fprintf(stdout, "\n");
Derived_destroy(&derived);
Base_destroy(&base);
return 0;
}

> gcc -Wall -std=c99 -pedantic -o main main.c
> ./main

13 14 15 16
13 14

1. Use the typedef for your class definitions
so that you can drop the superfluous 'struct' qualifier.
2. Define [pseudo] constructors for each type.
3. Define destructors for each type and call them
even if they don't actually do anything.

The C programming language does *not* support inheritance.
If you wish to "derive" a new type from another,
you should make an object of the super type
the *first* data member of the subtype
so that an object and its sub object have the same address.
This will be important later
when you ask about how to implement virtual functions.
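
The address guarantee mentioned above (C99 6.7.2.1p13) can be sketched as follows. The helper names Derived_as_base and Base_as_derived, and the member name field3, are invented for this example.

```c
#include <stddef.h>

typedef struct Base {
    int field1;
    int field2;
} Base;

typedef struct Derived {
    Base base;      /* must stay the first member */
    int  field3;
} Derived;

/* "Upcast": no cast needed, just take the address of the first member. */
static Base *Derived_as_base(Derived *d)
{
    return &d->base;
}

/* "Downcast": valid only when b really is the base member of a Derived
 * object; the caller is responsible for knowing that. */
static Derived *Base_as_derived(Base *b)
{
    return (Derived *)b;
}
```

Since a pointer to a structure, suitably converted, points to its initial member and vice versa, both helpers return the same address they were given; only the static type changes.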

CBFalconer · May 4, 2005

Eric said:
The pointer conversion is "legal" in the sense that the
result could be re-converted to a `struct SubObject*' and
work correctly. The conversion itself causes no trouble.

No it isn't. There is no type Object * to which to cast. There is
a type "struct Object*" however.

Keith Thompson · May 4, 2005

E. Robert Tisdale said:
Alberto said:

I've seen some object oriented programming bits out there

Can you tell us exactly where "out there"?

but I'm not sure if they're legal.
For example:
struct Object {
int field1;
int field2;
};

You are confused. An object is *not* a type.
It is an *instance* of a type.

struct SubObject {
int field1; /* the same as Object */
int field2; /* the same as Object */
int subobject_field1;
int subobject_field2;
};

This is *not* a sub object because it isn't an object.
It isn't even a subtype. It's just another
[user defined] type that happens to have data members
with the same name as another [user defined] data type.

I see no confusion here, at least not from the OP.

Of course an object is an instance of a type. Calling a type "struct
Object" doesn't imply otherwise. C has a predefined type called
"char", but given "char c;" we know that c is a character, and char is
a character type.

If you really want to use C++-style object-oriented features, why not
just use C++? I don't mean to imply that you don't necessarily have a
perfectly good reason for using C, but switching to C++ seems like the
obvious solution.

Roderick Bloem · May 5, 2005

I think it is legal. Harbison and Steele (5th Ed., Section 5.6.4) says
that "C compilers are constrained to assign components [of a struct]
increasing memory address in a strict order, with the first component
starting at the beginning address of the structure. [...] Holes or
padding may appear between any two consecutive components or after the
last component in the layout of a structure if necessary to allow proper
alignment of components in memory."

I would read that to mean that for a given architecture the layout of a
struct is fixed, and the compiler must create the smaller struct just
the same way as the beginning of the first struct. (It's not quite
watertight: it does not actually say that the padding should be applied
as late as possible. Perhaps someone has the standard?)

I also think that this is how the C++ preprocessor used to work.

Roderick

Eric Sosman · May 5, 2005

Roderick said:
I think it is legal. Harbison and Steele (5th Ed., Section 5.6.4) says
that "C compilers are constrained to assign components [of a struct]
increasing memory address in a strict order, with the first component
starting at the beginning address of the structure. [...] Holes or
padding may appear between any two consecutive components or after the
last component in the layout of a structure if necessary to allow proper
alignment of components in memory."

I would read that to mean that for a given architecture the layout of a
struct is fixed, and the compiler must create the smaller struct just
the same way as the beginning of the first struct. (It's not quite
watertight: it does not actually say that the padding should be applied
as late as possible. Perhaps someone has the standard?)

I've got the Standard, and I can find no requirement that
the arrangement of padding be consistent across different
struct types (6.5.2.3/5 is suggestive, but applies only to
"initially-similar" structs that reside in the same union).

Even if the structs are arranged similarly, Mark Piffer
(see up-thread) offers a perfectly credible reason to believe
that the type-punning need not work as intended.

Roderick Bloem · May 6, 2005

I am going to take the liberty of crossposting this to comp.lang.c++
(originally comp.lang.c), and to summarize the discussion for the sake
of those reading only c++.

The question is: If you are writing C and you have a struct P, can you
create a struct C that is an extension of the first (starts just like P
and adds some data), and then use a C* as if it were a P*?

Example:

typedef struct {
int a;
short b;
} P;

typedef struct {
int a;
short b;
long long c;
} C;

C *c; P *p;
c = (C*) malloc(sizeof(C));
c->a = 1; c->b =2;
p = (P*) c;
printf("%d\n", p->a);

P and C stand for parent and child, and hint at the OO structure that we
are trying to mimic in C. We want to be able to use a child struct as a
parent struct, as you would in C++.

The basic answer in comp.lang.c is "it works on any compiler I have
seen, but there is no guarantee".

The standard appears to limit the freedom of the compiler in laying out
the struct: the order of the elements is fixed, padding can be added
between elements and at the end, but only if necessary for alignment.
This does not quite prescribe where the padding should be. If you have
three byte-aligned bytes, and a 4-byte aligned 4-byte word, you need one
byte of padding, but you can put that wherever you want before the
word: bbbpwwww or bpbbwwww are both allowed (b is a byte, p padding, and
w part of the word). The standard apparently does not require that the
padding be applied the same way in different structs.

Another problem that has been pointed out is this: what if P ends in a
4-byte aligned byte b and 3 bytes of padding? The compiler may decide
that the most efficient way to clear b is to do a four-byte clear
operation. If C adds 3 bytes to the struct, these may go in the
padding, and an attempt to assign p->b = 0 may clear the extra
bytes in C if p points to a C struct.

Now the reason to crosspost to comp.lang.c++: I think the c++ to c
translator used overlapping for inheritance, so the c++ people must be
experts. Am I correct? Does that mean that the translator depended on
features of the compilers that are not prescribed by the standard, or am
I missing something?

It is clear that there are alternatives, e.g., we may define C as
typedef struct {
P p;
long c;
} C;
at the expense of some extra typing when accessing common elements.

[disclaimer: I do not have the C standard. Everything I write about it
is either hearsay or Harbison & Steele.]

Roderick

Keith Thompson · May 6, 2005

Roderick Bloem said:
The standard appears to limit the freedom of the compiler in laying
out the struct: the order of the elements is fixed, padding can be
added between elements and at the end, but only if necessary for
alignment.

C99 6.7.2.1p13:
Within a structure object, the non-bit-field members and the units
in which bit-fields reside have addresses that increase in the
order in which they are declared. A pointer to a structure object,
suitably converted, points to its initial member (or if that
member is a bit-field, then to the unit in which it resides), and
vice versa. There may be unnamed padding within a structure
object, but not at its beginning.

C99 6.7.2.1p15:
There may be unnamed padding at the end of a structure or union.

There is no implication that padding can be added only if necessary
for alignment. The compiler is free to insert padding because it
makes the struct look bigger and scares away predators.
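
Since the standard leaves the placement and amount of padding to the implementation, the only way to know what a given compiler does is to ask it. A small sketch using offsetof and sizeof on the P and C structs from the summary; the helper names same_prefix_layout and p_tail_padding are invented here, and the results are specific to whatever implementation compiles the code.

```c
#include <stddef.h>

typedef struct { int a; short b; } P;
typedef struct { int a; short b; long long c; } C;

/* Returns 1 if this implementation happens to place the shared leading
 * members of P and C at identical offsets, 0 otherwise. */
static int same_prefix_layout(void)
{
    return offsetof(P, a) == offsetof(C, a)
        && offsetof(P, b) == offsetof(C, b);
}

/* Bytes of trailing padding this implementation adds after P's last
 * member; may be zero. */
static size_t p_tail_padding(void)
{
    return sizeof(P) - (offsetof(P, b) + sizeof(short));
}
```

A result of 1 from same_prefix_layout() describes one compiler with one set of options. It does not make access through the converted pointer defined behavior.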

[...]

Now the reason to crosspost to comp.lang.c++: I think the c++ to c
translator used overlapping for inheritance, so the c++ people must be
experts. Am I correct? Does that mean that the translator depended
on features of the compilers that are not prescribed by the standard,
or am I missing something?

Are you referring to cfront?

It probably means that the author(s) of the translator either were
experts on C, or were lucky enough not to run into any problems. It
doesn't imply anything about the C expertise of C++ programmers other
than the ones who worked on the translator.

There's no fundamental reason why either the translator or the code it
generated had to be written in perfectly portable C. As long as it
did the job, that may have been good enough, and the authors were free
to take advantage of assumptions that happen to be valid for all C
implementations of interest, even if they're not guaranteed by the
standard. (Portable standard-conforming code is generally better, all
else being equal, but all else is not always equal.)

Chris Torek · May 6, 2005

Now the reason to crosspost to comp.lang.c++: I think the c++ to c
translator used overlapping for inheritance, so the c++ people must be
experts. Am I correct?

On the first, perhaps; on the second, well...

Does that mean that the translator depended on features of the
compilers that are not prescribed by the standard ...

If you are referring to cfront, it *definitely* *did* depend on
non-portable features. In particular, you had to tell it all about
how the C compiler it used as its "assembler" laid out structures,
including padding, so that it could track the C compiler's work
and subvert it.

Note that cfront was in fact a "real compiler" according to the
definition I prefer:

To decide if Step S is a "preprocessor" or a "compiler",
answer the following question: if an error occurs *after*
Step S, is it a mistake by the programmer, or is it a
mistake in Step S?

Consider the following examples:

foo.c, line 123: invalid operand to unary &
# or same with "foo.cpp" as the file name

/tmp/151522.c, line 123: invalid operand to unary &

/tmp/151523.s, line 5012: invalid register operand to add

When compiling a C or C++ program named "foo.c" or "foo.cpp", the
first message is perfectly natural if you goofed up some "#define",
because the preprocessor part of the language does not understand
the language proper. But getting (just) the second message from
a C++ compiler, when compiling "foo.cpp", indicates a bug in the
C++ compiler, not invalid C++ code that was simply copied through
to the C compiler. So C++ is not a "preprocessor", because it is
a bug in the C++ system, not a bug in your own code, that produced
the message about a file in /tmp.

In all cases, the last message (from the assembler) indicates a
bug in the compiler, because the compiler should not be emitting
invalid CPU register names. The exception to this rule occurs if
the compiler happens to have an "insert arbitrary assembly code"
escape clause (like __asm__), and you used it.

Tim Rentsch · May 8, 2005

Eric Sosman said:
Alberto said:

Hi, I've seen some object-oriented programming bits out there and I'm
not sure if they're legal. For example:

struct Object {
int field1;
int field2;
};

struct SubObject {
int field1; /* the same as Object */
int field2; /* the same as Object */

int subobject_field1;
int subobject_field2;
};

And a cast is used to reference the "superclass" of SubObject:

struct SubObject *subobject;

struct Object *parent = (Object *) subobject; /* legal cast? */

[understood to be '(struct Object *)']

And then use parent->field1, etc. Is that cast legal? invokes UB or
something worse?

The pointer conversion is "legal" in the sense that the
result could be re-converted to a `struct SubObject*' and
work correctly. The conversion itself causes no trouble.

Technically I think that's not right. The two pointer types must have
the same representation and alignment requirements, but the two struct
types are allowed to have different alignment requirements; if they
do, converting a pointer value that is not correctly aligned for the
new type evokes undefined behavior.
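
The alignment concern can be checked on a particular implementation with a sketch like the one below. It assumes a C11 compiler for _Alignof (the thread itself predates C11) and the optional uintptr_t type; the helpers is_aligned_for and aligned_for_P are made up for illustration, and P and C are the structs from Roderick's summary.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { int a; short b; } P;
typedef struct { int a; short b; long long c; } C;

/* Returns 1 if the address in ptr is a multiple of align, else 0.
 * Relies on the implementation-defined pointer-to-integer mapping. */
static int is_aligned_for(const void *ptr, size_t align)
{
    return (uintptr_t)ptr % align == 0;
}

/* Returns 1 if a C object at this address is also suitably aligned
 * for P. Says nothing about whether member access through the
 * converted pointer is defined. */
static int aligned_for_P(const C *c)
{
    return is_aligned_for(c, _Alignof(P));
}
```

On typical implementations the larger struct is at least as strictly aligned as the smaller one, so the check passes, but nothing in the standard requires that.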
