Typecasting portability in C

Guest

What I am trying to do is implement polymorphism in C. Why? This is to
build a library which will be a C library and callable from C.
However, I want to have polymorphic functions which are callable from
outside the library. Specifically, I want to have functions like Show()
which will take a pointer to an object and do something different
based on the type of the object passed to it. Of course this would be
trivial in C++, but I'm restricted to C.

OK, so my current idea for something like the polymorphic Show()
routine: since all the objects are created inside the library, I can
attach a signature to each object. The Show() routine would accept a
void*, then typecast it and get the signature out of the object.
Since each object is a structure (created and defined inside the
library), I would put this signature at the front of each and every
struct. The signature would likely be something like an integer. So
each and every object would look like:

struct SomeObjectType {
    int signature;
    /* ... a bunch of other stuff ... */
};

The Show() routine would typecast the void* to the following
structure:

struct {
int signature;
}

So my Show() and similar polymorphic routines would get the object and
ultimately open it and look at the first field, which would always be
the signature. Based on what it found there (the value of
signature) it would do something different. Makes sense? I hope so.

So I have implemented some prototype code for this and it does work, but
what I was wondering is: how portable is this? I can guarantee that the
structure's first item (the signature) will always be the same, but
what about portability? Really I'm converting from a pointer to
one type, to a void*, then to a pointer to another type (for signature
extraction). Certainly the first object in the final structure will be
the same as the first object in the original structure, and the same
size as well. But does that make it portable?

Is this legal C as defined by the standard? Is this going to work
across platforms? I hope you see my dilemma: the code appears to work
on my system. The code below happily prints the expected output of
112, but I don't know whether it's guaranteed to work everywhere, or how
portable the library functions will be because of it.

Is this defined in any C standard anywhere what will happen and what
about general portability concerns?

In the example below I defined a structure CouldBeAnything and filled
it with a long and a char, but really anything after that initial
signature integer will be different from object to object. The only
thing I can guarantee is that the first item will be that signature
integer on each object. Will this conversion work? Is it portable? It
works on my machine, but does that mean it will work in general?
Thank you. :)

---
// When run on my machine, the code below happily prints out 112.
// The gcc compiler gives no warnings with the -Wall flag.

#include <stdio.h>

// This structure could contain anything
struct CouldBeAnything
{
    long ThisTimeItsALong;
    char AndAChar;
};

struct MinimalStructure
{
    int Signature;
};


struct LargerStructure
{
    int Signature;
    struct CouldBeAnything SomeStructure;
};

int main(void)
{
    struct LargerStructure SomeLargeStructure;

    SomeLargeStructure.Signature = 112;
    SomeLargeStructure.SomeStructure.ThisTimeItsALong = 1;
    SomeLargeStructure.SomeStructure.AndAChar = 'a';

    struct MinimalStructure *Minimal =
        (struct MinimalStructure *)&SomeLargeStructure;

    printf("Signature = %d\n", Minimal->Signature);
    return 0;
}
 
Thad Smith

What I am trying to do is implement polymorphism in C. ....
I would append this signature to the front of each and every
struct. The signature would likely be something like an integer. So
each and every object would look like:

struct SomeObjectType {
int signature;
/* ... a bunch of other stuff ... */
};

The Show() routine would typecast the void* to the following
structure:

struct {
int signature;
} ....
Is this legal C as defined by the standard?

Yes. If the initial sequences of two structures match, the members in
the initial sequence can be accessed through either structure type.

You might consider, though, defining the struct to more accurately match
your actual usage:

struct OverloadedType {
    enum {ST1, ST2} subtype; /* subtype of following data */
    union {
        struct ST1 {
            /* T1 subtype members */
        } st1;
        struct ST2 {
            /* T2 subtype members */
        } st2;
    } u;
} *ot;
....
switch (ot->subtype) {
case ST1:
    do_something_to_ST1(&ot->u.st1);
    break;
case ST2:
    do_something_to_ST2(&ot->u.st2);
    break;
}
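
For anyone who wants to try the tagged-union layout, here is a minimal compilable sketch of it. The payload members (count, ratio) and the Value() function are invented for illustration, and the enum is given a tag name (Subtype) for readability; none of that comes from the post above.

```c
#include <assert.h>
#include <stddef.h>

enum Subtype { ST1, ST2 };

struct OverloadedType {
    enum Subtype subtype;              /* says which union member is live */
    union {
        struct { int count; } st1;     /* hypothetical ST1 payload */
        struct { double ratio; } st2;  /* hypothetical ST2 payload */
    } u;
};

/* Returns a number derived from whichever payload the tag selects,
   or -1.0 if the tag is not one of the known subtypes. */
static double Value(const struct OverloadedType *ot)
{
    assert(ot != NULL);
    switch (ot->subtype) {
    case ST1:
        return (double)ot->u.st1.count;
    case ST2:
        return ot->u.st2.ratio;
    }
    return -1.0;
}
```

Only the union member selected by subtype is read, so no pointer casts between struct types are needed at all.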
 
Eric Sosman

Thad said:
Yes. If the initial sequence of two structures match, the members in
the initial sequence can be accessed through either structure type.

... provided the two structs are members of the same union.
A compiler would need to be awfully smart and awfully perverse
to arrange things so the presence of the union made any difference,
but the formal guarantee has union membership ("Look for ... the
union label") as a precondition.

However, the O.P. doesn't need this guarantee because there's
another that fills the bill without requiring unionization: A
pointer to a struct can be converted to a pointer to its first
element (and back) without loss or damage. So if he begins with
a `struct SomeObjectType*' and converts it to a `void*' for the
function call, he can in turn convert again to an `int*' to access
the `signature' element at the beginning.
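
A minimal sketch of that approach, reading the signature through an int* instead of through a second struct type. The signature values SIG_CIRCLE and SIG_SQUARE, the Circle and Square types, and the SignatureOf() helper are made up for illustration; the original post does not name any of them.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical signature values; the thread does not define any. */
enum { SIG_CIRCLE = 1, SIG_SQUARE = 2 };

struct Circle { int signature; double radius; };
struct Square { int signature; long side; };

/* Returns the signature stored at the start of a library object.
   A pointer to a struct, suitably converted, points to its first member,
   so reading through int * is fine when that member is an int. */
static int SignatureOf(const void *obj)
{
    assert(obj != NULL);
    return *(const int *)obj;
}

/* Dispatches on the signature. Returns 0 on success, -1 if unknown. */
static int Show(const void *obj)
{
    switch (SignatureOf(obj)) {
    case SIG_CIRCLE:
        /* obj really points to a Circle, so converting back is valid */
        printf("circle, radius %g\n", ((const struct Circle *)obj)->radius);
        return 0;
    case SIG_SQUARE:
        printf("square, side %ld\n", ((const struct Square *)obj)->side);
        return 0;
    default:
        return -1;
    }
}
```

The object is only ever accessed as an int (its first member) or as its own real type, so this does not rely on the common-initial-sequence rule.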
You might consider, though, defining the struct to more accurately match
your actual usage:

struct OverloadedType {
enum {ST1, ST2} subtype; /* subtype of following data */
union {
struct ST1 {
/* T1 subtype members */
} st1;
struct ST2 {
/* T2 subtype members */
} st2;
} u;
} *ot;
...
switch (ot-> subtype) {
case ST1:
do_something_to_ST1 (&ot->u.st1);
break;
case ST2:
do_something_to_ST2 (&ot->u.st2);
break;
}

This is cleaner, but it requires that all "overloading"
types be known at the point where `struct OverloadedType' is
declared. Also, it may be wasteful if `struct ST1' and
`struct ST2' are of very different sizes. Starting each type
with a signature -- int, enum, even a small struct -- may be
more convenient and more extensible.

An important kind of "signature" is a pointer to a struct
or other data structure containing function pointers. Instead
of the select-and-implement logic shown above, one can then do

obj->methods->someMethod(obj, 42);
or maybe
obj->typeData.methods[SOME_METHOD](obj, 42);

One warning: Although this pattern promotes extensibility, it
makes static analysis of the code much more difficult and/or
much less effective.
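
A rough sketch of the first form, obj->methods->someMethod(obj, 42), assuming one method table per type shared by all instances. The names Object, Methods, Square, Rect and the area method are invented for this example and are not from the post above.

```c
#include <assert.h>
#include <stddef.h>

struct Object;  /* forward declaration so Methods can mention it */

/* One table per type, shared by every instance of that type. */
struct Methods {
    int (*area)(const struct Object *self, int scale);
};

/* Common header: each object starts with a pointer to its type's table. */
struct Object {
    const struct Methods *methods;
};

struct Square { struct Object base; int side; };
struct Rect   { struct Object base; int w, h; };

static int SquareArea(const struct Object *self, int scale)
{
    /* base is the first member of Square, so this conversion is valid */
    const struct Square *sq = (const struct Square *)self;
    return sq->side * sq->side * scale;
}

static int RectArea(const struct Object *self, int scale)
{
    const struct Rect *r = (const struct Rect *)self;
    return r->w * r->h * scale;
}

static const struct Methods square_methods = { SquareArea };
static const struct Methods rect_methods   = { RectArea };

/* Generic entry point: the caller never needs to know the concrete type. */
static int Area(const struct Object *obj, int scale)
{
    assert(obj != NULL && obj->methods != NULL);
    return obj->methods->area(obj, scale);
}
```

Adding a new type then means adding one struct, one table and its functions, with no change to Area().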
 
Stephen Sprunk

Eric Sosman said:
An important kind of "signature" is a pointer to a struct
or other data structure containing function pointers. Instead
of the select-and-implement logic shown above, one can then do

obj->methods->someMethod(obj, 42);
or maybe
obj->typeData.methods[SOME_METHOD](obj, 42);

If the methods are the same (and in the same order) for all of the
polymorphic types, then why not do:

obj->someMethod(obj, 42);

This is about as close as one can get to virtual functions in C.
Multiple inheritance is, of course, impossible with this strategy, but
that's probably a good thing.

Though I'll admit to having used such hackery myself, if I had a real
need for this sort of thing in more than a couple places I'd use C++
instead. Just because something is possible doesn't mean it's a good
idea.

S
 
Guest

Eric said:
... provided the two structs are members of the same union.
A compiler would need to be awfully smart and awfully perverse
to arrange things so the presence of the union made any difference,
but the formal guarantee has union membership ("Look for ... the
union label") as a precondition.

It would not need to be awfully smart and perverse.

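#include <stdlib.h>
struct A { int m; };
struct B { int m; int n; };
void f(struct A *a, struct B *b)
{
    a->m = 0;
    b->m = 1;
    if (a->m != 0)
        abort();
}

The final check can be removed by a conforming and sane compiler:
struct A and struct B are distinct, incompatible types, so the compiler
may assume that a and b do not point to the same object.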
 
Eric Sosman

Stephen said:
Eric Sosman said:
An important kind of "signature" is a pointer to a struct
or other data structure containing function pointers. Instead
of the select-and-implement logic shown above, one can then do

obj->methods->someMethod(obj, 42);
or maybe
obj->typeData.methods[SOME_METHOD](obj, 42);

If the methods are the same (and in the same order) for all of the
polymorphic types, then why not do:

obj->someMethod(obj, 42);

That works, but it means every instance needs to carry a
copy of the entire raft of function pointers. It's often more
economical for all the instances of a type to share one copy
of the function pointer table: If there are N methods you save
the space used for N function pointers at the cost of one data
pointer and an extra indirection level. For small N the savings
may not be worth the cost, but for large N (and with a large
population of object instances) it probably is.
 
Laurent Deniau

Eric said:
Thad said:
Yes. If the initial sequence of two structures match, the members in
the initial sequence can be accessed through either structure type.

... provided the two structs are members of the same union.
A compiler would need to be awfully smart and awfully perverse
to arrange things so the presence of the union made any difference,
but the formal guarantee has union membership ("Look for ... the
union label") as a precondition.

However, the O.P. doesn't need this guarantee because there's
another that fills the bill without requiring unionization: A
pointer to a struct can be converted to a pointer to its first
element (and back) without loss or damage. So if he begins with
a `struct SomeObjectType*' and converts it to a `void*' for the
function call, he can in turn convert again to an `int*' to access
the `signature' element at the beginning.
You might consider, though, defining the struct to more accurately
match your actual usage:

struct OverloadedType {
enum {ST1, ST2} subtype; /* subtype of following data */
union {
struct ST1 {
/* T1 subtype members */
} st1;
struct ST2 {
/* T2 subtype members */
} st2;
} u;
} *ot;
...
switch (ot-> subtype) {
case ST1:
do_something_to_ST1 (&ot->u.st1);
break;
case ST2:
do_something_to_ST2 (&ot->u.st2);
break;
}

This is cleaner, but it requires that all "overloading"
types be known at the point where `struct OverloadedType' is
declared. Also, it may be wasteful if `struct ST1' and
`struct ST2' are of very different sizes. Starting each type
with a signature -- int, enum, even a small struct -- may be
more convenient and more extensible.

An important kind of "signature" is a pointer to a struct
or other data structure containing function pointers. Instead
of the select-and-implement logic shown above, one can then do

obj->methods->someMethod(obj, 42);
or maybe
obj->typeData.methods[SOME_METHOD](obj, 42);

One warning: Although this pattern promotes extensibility, it
makes static analysis of the code much more difficult and/or
much less effective.

You may want to have a look at OOC-2.0, which keeps track of static
typing like Java does (http://cern.ch/laurent.deniau/html/oopc.html#OOC2).
For a more dynamic approach, have a look at
http://cern.ch/laurent.deniau/html/oopc.html#COS, which is also on
sourceforge.net (but still incomplete and in alpha release).

a+, ld.
 
Laurent Deniau

Stephen said:
Eric Sosman said:
An important kind of "signature" is a pointer to a struct
or other data structure containing function pointers. Instead
of the select-and-implement logic shown above, one can then do

obj->methods->someMethod(obj, 42);
or maybe
obj->typeData.methods[SOME_METHOD](obj, 42);

If the methods are the same (and in the same order) for all of the
polymorphic types, then why not do:

obj->someMethod(obj, 42);

This is about as close as one can get to virtual functions in C.

Because most objects do not need to carry their interface around with them.
Shared interfaces are better (like in C++ or Java).
Multiple inheritance is, of course, impossible with this strategy, but
that's probably a good thing.

It is possible, see http://cern.ch/laurent.deniau/html/oopc.html#OOPC

a+, ld.
 
Laurent Deniau

Harald said:
It would not need to be awfully smart and perverse.

#include <stdlib.h>
struct A { int m; };
struct B { int m; int n; };
void f(struct A *a, struct B *b)
{
a->m = 0;
b->m = 1;
if(a->m != 0)
abort();
}

The final check can be removed by a conforming and sane compiler.

It does not need to be perverse. The compiler may write past the last
field of the struct simply because it does not know that, in the object
actually pointed to, that field is not the last one.

struct A { int i; char c; };
struct B { int i; char c1, c2; };
struct C { struct A a; char c2; };

void f(void)
{
    struct B b = { 1, 2, 3 };
    struct C c = { { 1, 2 }, 3 };

    struct A *a_b = (struct A*)&b;
    struct A *a_c = (struct A*)&c;

    a_b->c = 10; // may also change c2 in b
    a_c->c = 10; // safe
}

But there are some tricks to make the layout of struct B compatible with
the layout of struct A.

a+, ld.
 
Thad Smith

Eric said:
... provided the two structs are members of the same union.
A compiler would need to be awfully smart and awfully perverse
to arrange things so the presence of the union made any difference,
but the formal guarantee has union membership ("Look for ... the
union label") as a precondition.

Thanks, Eric for that important correction.

As far as perverse goes, here's a possibility:

An implementation has an alignment requirement that an int be on a
4-byte boundary, and single-byte objects are accessed faster on even
addresses than on odd addresses.

Given
struct S1 {
    char a,b;
} s1;

struct S2 {
    char a,b;
    int c;
} s2;

The implementation may choose to implement S1 with no padding bytes, and
S2 with a single padding byte between a and b, and another between b and
c. That arrangement gives faster access to s2.b, while keeping the
size of struct S1 to a minimum.
 
Yevgen Muntyan

What I am trying to do is implement polymorphism in C. Why? This is to
build a library which will be a C library and callable from C.
However, I want to have polymorphic functions which are callable from
outside the library. Specifically I want to have functions like Show()
which will take a pointer to an object and do something different
based on the type of the object passed to it. Of course this would be
trivial in C++, but I'm restricted to C.

OK, so my current idea for something like the polymorphic Show()
routine: Since all the objects are created inside the library I can
attach to each object a signature. The Show() routine would accept a
void* but then typecast it and get the signature out of the object.
Since each object is a structure (and created and defined inside the
library) I would append this signature to the front of each and every
struct. The signature would likely be something like an integer. So
each and every object would look like:

struct SomeObjectType {
int signature;
/* ... a bunch of other stuff ... */
};

The Show() routine would typecast the void* to the following
structure:

struct {
int signature;
}

So my Show() and similar polymorphic routines would get the object and
ultimately open it and look at the first field which would always be
the signature object. Based on what it found there (the value of
signature) it would do something different. Makes sense? I hope so.

As you were told elsewhere, it is not guaranteed to work. But you can do
the same thing with

struct Base {
    int signature;
};

struct Child {
    struct Base base;
    long something;
};

It will require a cast if you have a Child* and want the Base members, but
that's not really a high price.
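
A small sketch of how the conversions could look with this layout. The signature values and the helper names ChildAsBase() and BaseAsChild() are assumptions made up for the example. Note that going from Child* to Base* does not strictly need a cast, since taking the address of the embedded member does the same job.

```c
#include <assert.h>
#include <stddef.h>

enum { SIG_CHILD = 1, SIG_OTHER = 2 };  /* hypothetical signature values */

struct Base  { int signature; };
struct Child { struct Base base; long something; };

/* Upcast: take the address of the embedded first member. */
static struct Base *ChildAsBase(struct Child *c)
{
    assert(c != NULL);
    return &c->base;
}

/* Checked downcast: converts back only when the signature says the
   Base is really the first member of a Child; otherwise returns NULL. */
static struct Child *BaseAsChild(struct Base *b)
{
    if (b == NULL || b->signature != SIG_CHILD)
        return NULL;
    return (struct Child *)b;
}
```

Because base is the first member of Child, a pointer to it can be converted back to a pointer to the enclosing Child without relying on two unrelated structs sharing a layout.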

Yevgen
 
Laurent Deniau

Thad said:
Thanks, Eric for that important correction.

As far as perverse goes, here's a possibility:

An implementation has an alignment requirement that an int be on a
4-byte boundary, and single-byte objects are accessed faster on even
addresses than odd addresses.

Given
struct S1 {
char a,b;
} s1;

struct S2 {
char a,b;
int c;
} s2;

The implementation may choose to implement S1 with no padding bytes, and
S2 with a single padding byte between a and b, and another between b and
c. That arrangement gives faster access to s2.b, while keeping the
size of struct S1 to a minimum.

This is the same point as my answer to Harald van Dijk. So the trick is
to put an anonymous zero-length bit-field after the matching part:

struct S1 {
    char a,b;
};

struct S2 {
    char a,b;
    int :0;
    int c;
};

struct S3 {
    struct S1 s;
    int c;
};

Then S1, S2 and S3 should have a compatible layout for the common part,
whatever fields that part contains. This is what OOC-2.0 does, and I
haven't seen any problems with it. I don't know whether the anonymous
zero-length bit-field was created for this purpose, but it works fine.
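
One way to see what a particular compiler does with these three structs is to compare member offsets with offsetof. This is only a sketch: the PrefixOffsetsMatch() helper is made up here, and as far as I know the standard describes the zero-width bit-field only in terms of bit-field packing, so a matching result on one implementation is an observation, not a portability guarantee.

```c
#include <stddef.h>

struct S1 { char a, b; };
struct S2 { char a, b; int :0; int c; };
struct S3 { struct S1 s; int c; };

/* Returns 1 if a and b sit at the same offsets in S2 as in S1 on the
   implementation compiling this code, 0 otherwise. */
static int PrefixOffsetsMatch(void)
{
    return offsetof(struct S2, a) == offsetof(struct S1, a)
        && offsetof(struct S2, b) == offsetof(struct S1, b);
}
```

For S3 no such check is needed for the start of the struct: s is the first member, so it is always at offset 0.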

a+, ld.
 
Jamie Boy

OK, so my current idea for something like the polymorphic Show()
routine: Since all the objects are created inside the library I can
attach to each object a signature. The Show() routine would accept a
void* but then typecast it and get the signature out of the object.
Since each object is a structure (and created and defined inside the
library) I would append this signature to the front of each and every
struct. The signature would likely be something like an integer. So
each and every object would look like:

struct SomeObjectType {
int signature;
/* ... a bunch of other stuff ... */
};


V-Table is what you want. Something like:

typedef struct VTableForVehicle {
    void (*StartEngine)(void);
    void (*Accelerate)(unsigned);
    void (*Brake)(unsigned);
} VTableForVehicle;

Provide definitions for each of the functions for each kind of vehicle:

void Car_StartEngine(void) { /* Code */ }
void Car_Accelerate(unsigned amount) { /* Code */ }
void Car_Brake(unsigned amount) { /* Code */ }
void Bike_StartEngine(void) { /* Code */ }
void Bike_Accelerate(unsigned amount) { /* Code */ }
void Bike_Brake(unsigned amount) { /* Code */ }

Then populate global const v-table objects for each kind of vehicle:

VTableForVehicle const
    vtab_car = {Car_StartEngine, Car_Accelerate, Car_Brake},
    vtab_bike = {Bike_StartEngine, Bike_Accelerate, Bike_Brake};

Now put a V-Table pointer at the beginning of each vehicle object, then put
a vehicle object inside every car, every bike:

typedef struct Vehicle { VTableForVehicle const *pvtab; } Vehicle;

typedef struct Car { Vehicle vehicle; } Car;
typedef struct Bike { Vehicle vehicle; } Bike;

And "initialise" it as follows:

Car obj = { {&vtab_car} };

Then polymorphism can be exploited as follows:

void FuncWhichTakesAVehicle(Vehicle *const p)
{
    /* Call through the v-table; the amount 10 is an arbitrary example value */
    p->pvtab->Accelerate(10u);
}

Bit of a snappy explanation but you get the idea.
 
