Pointer to "base" type - what does the Standard say about this?

Stephan Beal · Nov 5, 2008

Hi, all!

Before i ask my question, i want to clarify that my question is not
about the code i will show, but about what the C Standard says should
happen.

A week or so ago it occurred to me that one can implement a very basic
form of subclassing in C (the gurus certainly already know this, but
it was news to me). What i've done (shown below) seems to work all
fine and well, and does exactly what i'd expect, but i'm asking about
it because when i switched to a higher optimization level on gcc i
started getting warnings about type-punned pointers violating "strict
mode." That got me wondering, does it mean "strict C mode" or "strict
GCC mode"? i don't care much about the latter, as long as i comply
with the former. To be clear (again), my question is not GCC-specific.
My question is whether or not the approach i've taken here is legal
according to The Standard. i only ask because GCC suggests (at some
optimization levels, anyway) that i might be violating some C rule
without knowing i'm doing so.

The code (sorry for the length - it's about as short as i can make
this example in C while still keeping it readable):

// ------------------- begin code
#include <stdio.h>
#include <stdlib.h>

struct base_type; // unfortunate fwd decl
// Public API for base_type objects:
struct base_public_api
{
void (*func1)( struct base_type const * self );
long (*func2)( struct base_type const * self, int );
};
typedef struct base_public_api base_public_api;

// Base-most type of the abstract interface
struct base_type
{
base_public_api api;
};
typedef struct base_type base_type;

// Implementation of base_type abstract interface
struct sub_type
{
base_public_api api;
int member1;
};
typedef struct sub_type sub_type;

#define MARKER if(1) printf("MARKER: %s:%d:%s():\n",__FILE__,__LINE__,__func__); if(1) printf

#define SUBP ((sub_type const *)self)
void impl_f1( base_type const * self )
{
MARKER("SUBP->member1=%d\n",SUBP->member1);
}
long impl_f2( base_type const * self, int x )
{
return SUBP->member1 * x;
}

// Now here's the part which is dubious: note the concrete types here:
static const sub_type sub_type_inst = { {impl_f1,impl_f2}, 42 };
static base_type const * sub_inst = (base_type const*) &sub_type_inst;
// ^^^^ "warning: dereferencing type-punned pointer will break strict-aliasing rules"

int main( int argc, char const ** argv )
{

sub_inst->api.func1(sub_inst);
MARKER("func2()==%ld\n", sub_inst->api.func2(sub_inst, 2) );
return 0;
}
// ------------------- end code

On my box that looks like:
stephan@jareth:~/tmp$ ls -la inher.c
-rw-r--r-- 1 stephan stephan 1184 2008-11-05 14:43 inher.c
stephan@jareth:~/tmp$ make inher
cc inher.c -o inher
stephan@jareth:~/tmp$ ./inher
MARKER: inher.c:34:impl_f1():
SUBP->member1=42
MARKER: inher.c:48:main():
func2()==84

Am i headed down a Dark Path with this approach? Or is there a better/
more acceptable approach to simulating single inheritance in C? (i'm
not averse to changing the model, but i really do need some form of
separate interface/implementation for what i'm doing.)

Many thanks in advance for your insights.

PS (not relevant to the question, really): what's the point of all
that? i'm working on a library where i really need abstract base
interfaces (with only one level of inheritance necessary), and this
approach seems to be fairly clear (though a tad bit verbose at times).
i've used it to implement subclasses of an abstract stream interface,
for example, so my library can treat FILE handles and in-memory
buffers (or client-supplied stream types, with an appropriate wrapper)
with the same read/write API.

PS2: my apologies for the dupe post on comp.lang.c.moderated - i
inadvertently posted to that group.

Antoninus Twink · Nov 5, 2008

Am i headed down a Dark Path with this approach? Or is there a better/
more acceptable approach to simulating single inheritance in C? (i'm
not averse to changing the model, but i really do need some form of
separate interface/implementation for what i'm doing.)

Having subtype as "base class struct + some extra fields" and casting
back to the base when necessary is completely ubiquitous in networking
code based on Berkeley sockets. So if your code is meant to run on
Windows or a *nix that does networking, then this approach will
certainly work.

If you're interested in head-on-a-pin discussions about whether it will
work on embedded C for a coffee machine with half the standard library
missing, the "regulars" will no doubt be along soon with their usual
grandstanding answers.

Stephan Beal · Nov 5, 2008

Having subtype as "base class struct + some extra fields" and casting
back to the base when necessary is completely ubiquitous in networking
code based on Berkeley sockets.

That's good to hear, thanks.

So if your code is meant to run on
Windows or a *nix that does networking, then this approach will
certainly work.

The code is intended to be platform neutral (but "C" below...), and it
now sounds like the approach i chose is also platform neutral. i was
worried there for a while.

As a baseline i'm trying to make sure it also compiles (and behaves
properly) with tcc (TinyC Compiler).

If you're interested in head-on-a-pin discussions about whether it will
work on embedded C for a coffee machine with half the standard library
missing, the "regulars" will no doubt be along soon with their usual
grandstanding answers.

i'm only interested in conforming to The Standard. My programming
won't allow me to sleep at night if i knowingly make use of a compiler-
specific extension (with the exception of a couple very common
extensions, like free placement of var decls in functions, instead of
all at the front).

Thanks for the answer - it sounds like i'm not treading any
particularly dangerous ground here (assuming the API is used properly,
of course).

Stephan Beal · Nov 5, 2008

The code is intended to be platform neutral (but "C" below...), and it
now sounds like the approach i chose is also platform neutral. i was
worried there for a while.

Doh, i spoke too soon:

http://www.cellperformance.com/mike_acton/2006/06/understanding_strict_aliasing.html

Says:

"In C99, it is illegal to create an alias of a different type than the
original. This is often refered to as the strict aliasing rule."

Now i'm at an impasse - a few parts of my code rely on C99 features,
all but 1 of them (vsscanf()) i could probably easily do without. But
if i require C99 mode then i'm knowingly using undefined behaviour.
That's a tough call, considering that i don't think i can reasonably
reimplement around this limitation. Damn.

Jean-Marc Bourguet · Nov 5, 2008

Antoninus Twink said:
Having subtype as "base class struct + some extra fields" and casting
back to the base when necessary is completely ubiquitous in networking
code based on Berkeley sockets. So if your code is meant to run on
Windows or a *nix that does networking, then this approach will
certainly work.

If you're interested in head-on-a-pin discussions about whether it will
work on embedded C for a coffee machine with half the standard library
missing, the "regulars" will no doubt be along soon with their usual
grandstanding answers.

The problem isn't with a coffee machine -- the compiler for which will
probably have a simple optimizer -- but with a complex optimizer which uses
the aliasing rules to drive the optimisation. And so the code will
work... until the optimizer sees an opportunity to use the fact that two
things shouldn't alias.

In the OP case, there is hope. Replacing

static base_type const * sub_inst = (base_type const*) &sub_type_inst;

by

static base_type const * sub_inst = &sub_type_inst.api;

will remove complaints about this line. Now the only problematic part is

((sub_type const *)self)

but this isn't problematic AFAIK (6.7.2.1/14 in C99, and I seem to remember
that C90 has the same language: A pointer to a structure object, suitably
converted, points to its initial member (or if that member is a bit-field,
then to the unit in which it resides), and vice versa.) BTW, that reference
shows that there is no need to complain about the first line.

Yours,

jameskuyper · Nov 5, 2008

Stephan Beal wrote:
....

// ------------------- begin code
#include <stdio.h>
#include <stdlib.h>

struct base_type; // unfortunate fwd decl
// Public API for base_type objects:
struct base_public_api
{
void (*func1)( struct base_type const * self );
long (*func2)( struct base_type const * self, int );
};
typedef struct base_public_api base_public_api;

You can combine those two statements:

typedef struct base_public_api
{
// details
} base_public_api;

// Base-most type of the abstract interface
struct base_type
{
base_public_api api;
};
typedef struct base_type base_type;

// Implementation of base_type abstract interface
struct sub_type
{
base_public_api api;
int member1;
};
typedef struct sub_type sub_type;
....
// Now here's the part which is dubious: note the concrete types here:
static const sub_type sub_type_inst = { {impl_f1,impl_f2}, 42 };
static base_type const * sub_inst = (base_type const*) &sub_type_inst;
// ^^^^ "warning: dereferencing type-punned pointer will break strict-aliasing rules"

There's no problem there. Section 6.7.2.1p13 says "A pointer to a
structure object, suitably converted, points to its initial member (or
if that member is a bit-field, then to the unit in which it resides),
and vice versa."

Strictly speaking, you have to apply that rule twice, and that implies
that the correct conversion has to be:

static base_type const* sub_inst = (base_type const*)
(base_public_api const*)&sub_type_inst;

If we are given the fact that, for a specific set of types A, B, and
C, where p is an object of type C*, the conversion (A*)(B*)p is legal,
the standard says nothing that allows us to conclude that (A*)(B*)p ==
(A*)p. It doesn't even guarantee that (A*)p is also legal. However,
in practice, I would expect this to work, and I believe that the
intent was that it should work.

Stephan Beal · Nov 5, 2008

The problem isn't with a coffee machine -- the compiler for which will
probably have a simple optimizer -- but with a complex optimizer which uses
the aliasing rules to drive the optimisation. And so the code will
work... until the optimizer sees an opportunity to use the fact that two
things shouldn't alias.

Optimization is certainly a potential problem. i can conceive that for
some reason a compiler might optimize or pad these differently:

struct sub1
{
base_api api;
int m1;
double m2;
};

struct sub2
{
base_api api;
double m1;
char const * m2;
};

but my knowledge of explicit optimizations done by any given compiler
is pretty minimal. i'm much better versed in C++ than C.

In the OP case, there is hope. Replacing

static base_type const * sub_inst = (base_type const*) &sub_type_inst;

by

static base_type const * sub_inst = &sub_type_inst.api;

That was my original thought, but the point of passing (base_type
[const]*) as the "self" argument of base_type_api was to give me a
level of indirection which i want (for storage of subtype-specific
data without requiring subclasses to literally redeclare the whole
public API), and i lose that (and features based off of it) if i pass
a (base_type_api*) and cast it to a (sub_type*) (which in my eyes is
just plain wrong, even if it might work in this case).

will remove complaints about this line. Now the only problematic part is

((sub_type const *)self)

but this isn't problematic AFAIK (6.7.2.1/14 in C99 and I seem to remember
that C90 has the same language: A pointer to a structure object, suitably
converted, points to its initial member (or if that member is a bit-field,
then to the unit in which it resides), and vice versa.)

Coincidentally, i spent the last half hour reading up on that topic.
The reading here was enlightening:

http://www.cellperformance.com/mike_acton/2006/06/understanding_strict_aliasing.html

After reading that, i'm very confused about whether using the
"restrict" keyword might in some way help me here.

My current conclusion is:

a) it's technically illegal.
b) it's largely assumed to be safe on "most" platforms.
c) my subclasses are all private implementations (file-level scope)
and are never referenced using their subtype outside of a single file,
so the compiler might have a chance of knowing what i'm hoping for.

BTW, that reference
shows that there is no need to complain about the first line.

gcc doesn't complain until i turn on -O2 or higher or turn on the
-fstrict-aliasing flag. tcc doesn't complain at all.

Thanks for your input.

Stephan Beal · Nov 5, 2008

You can combine those two statements:

typedef struct base_public_api
{
// details

} base_public_api;

Thanks for that. i'm far more used to C++ than C (been away from C
since 1995 or so), so i'm not all caught up yet.

There's no problem there. Section 6.7.2.1p13 says "A pointer to a
structure object, suitably converted, points to its initial member (or
if that member is a bit-field, then to the unit in which it resides),
and vice versa."

AND vice versa? That changes things. So that means that i could cast
the (base_type_api*) itself back to the original subclass which
contains it (provided i use it as we've shown here)?

So does that imply that the following is actually legal? (warning -
uncompiled)

typedef struct API
{
int (*func1)( struct API * );
// ... more API ...
} API;

typedef struct Foo
{
API api;
int val;
} Foo;

int func1_Foo( API *x )
{
return ((Foo*)x)->val; // cast back to the enclosing Foo, then read a member
}

....

Foo foo;
foo.api.func1 = func1_Foo;
printf("%d\n",foo.api.func1( &foo.api ));

If that's legal then i'm happy (though a bit perplexed as to why it
would be allowed).

If we are given the fact that, for a specific set of types A, B, and
C, where p is an object of type C*, the conversion (A*)(B*)p is legal,
the standard says nothing that allows us to conclude that (A*)(B*)p ==
(A*)p. It doesn't even guarantee that (A*)p is also legal.

Those last two parts: fair enough, and i think i understand why that
must be so, but it is philosophically unsettling nonetheless.

However,
in practice, I would expect this to work, and I believe that the
intent was that it should work.

MY intent is for it to work, certainly. i just hope i'm not
violating anyone's bits by doing this.

Thanks for your insightful reply.

jameskuyper · Nov 5, 2008

jameskuyper said:
Stephan Beal wrote:
...

There's no problem there.

[Nonsense]

Sorry about that. I got mixed up because I misread your code. I was
under the impression that sub_type_inst had a member of type
base_public_api (which was correct), and that base_public_api had a
member of type base_type (which is exactly backwards). You can only
perform such a conversion by defining a union of both types, and only
when the original pointer points to one of the members of such a union.

Stephan Beal · Nov 5, 2008

Sorry about that. I got mixed up because I misread your code. I was
under the impression that sub_type_inst had a member of type
base_public_api (which was correct), and that base_public_api had a
member of type base_type (which is exactly backwards).

It was a good try, though.

You can only perform such a
conversion by defining a union of both types, and only when the
original pointer points to one of the members of such a union.

i'm considering the union option. However, that would require that i
use all the concrete subtypes in the public union. Right now my
subtypes are all static/file-scope implementations, so that wouldn't
be possible without revealing those types, some of which require
optional add-on functionality like sqlite3 (which the subclasses hide
in the current API).

In the off chance that you'd like to see the actual code, visit here:

http://fossil.wanderinghorse.net/repos/c11n/index.cgi/dir?name=include/s11n.net/c11n/io

enter the user name "anonymous" with password "anonymous" (that's an
anti-bot feature of the hosting SCM software), click Login, and you'll
see the files. The base types are declared in c11n_io.h (search for
"_api api;") and a couple concrete impls of the c11n_stream base are
in c11n_stream_*.c. (c11n_io_handler_*.c are also relevant but
significantly more complex because they are grammar parser
implementations.)

Again, thanks for the insights!

Hallvard B Furuseth · Nov 5, 2008

Stephan said:
Doh, i spoke too soon:

http://www.cellperformance.com/mike_acton/2006/06/understanding_strict_aliasing.html

Says:

"In C99, it is illegal to create an alias of a different type than the
original. This is often refered to as the strict aliasing rule."

Not quite. What is illegal, with some exceptions, is to access an
object (such as the object the pointer points at) through a different
type than it was created with. In Standardese, its "effective type".
But you can cast pointers back and forth, as long as you don't break
alignment requirements.

One such exception is to access equivalent initial members of structs
that are union members. So that's one formally valid way _if_ you know
all the "subtypes" of your base type: Stuff them all into a union, then
pass that union around. Don't make the base type a struct member,
instead do
#define BASE_STRUCT_BODY(prefix) \
int prefix##_i, prefix##_j
struct base_struct { BASE_STRUCT_BODY(bs); };
struct other_struct { BASE_STRUCT_BODY(os); other members; };

The prefix is only necessary if you want different member names in
different structs.

It's not really clear to me what else the standard allows though. A
compiler's users would be kind of annoyed if sockets didn't work, as
Antoninus suggests. OTOH if you are worrying about formal rather than
practical examples that's not enough. Besides, the socket interface
could use implementation-specific tricks to disable optimizations that
would break sockets.

One (or the?) point of the aliasing rules is to enable optimizations.
If you access an object of type T and then call a function which the
compiler knows does not access type T, nor call a function which does,
nor use one of the exceptions, the compiler knows that it can move the
access to the T object past the function call. One trick to protect
your code from that is to keep the accesses through different types in
different source files. However that can get defeated by "link-time
optimizations".

Anyway, here are the relevant parts from C99 - with examples.

6.5.2.3 Structure and union members

5 One special guarantee is made in order to simplify the use of unions:
if a union contains several structures that share a common initial
sequence (see below), and if the union object currently contains one
of these structures, it is permitted to inspect the common initial
part of any of them anywhere that a declaration of the complete type
of the union is visible. Two structures share a common initial
sequence if corresponding members have compatible types (and, for
bit-fields, the same widths) for a sequence of one or more initial
members.

6 EXAMPLE 1 If f is a function returning a structure or union, and x
is a member of that structure or union, f().x is a valid postfix
expression but is not an lvalue.

7 EXAMPLE 2 In:
struct s { int i; const int ci; };
struct s s;
const struct s cs;
volatile struct s vs;
the various members have the types:
s.i int
s.ci const int
cs.i const int
cs.ci const int
vs.i volatile int
vs.ci volatile const int

8 EXAMPLE 3 The following is a valid fragment:

union {
struct {
int alltypes;
} n;
struct {
int type;
int intnode;
} ni;
struct {
int type;
double doublenode;
} nf;
} u;
u.nf.type = 1;
u.nf.doublenode = 3.14;
/* ... */
if (u.n.alltypes == 1)
if (sin(u.nf.doublenode) == 0.0)
/* ... */

The following is not a valid fragment (because the union type is not
visible within function f):

struct t1 { int m; };
struct t2 { int m; };
int f(struct t1 * p1, struct t2 * p2)
{
if (p1->m < 0)
p2->m = -p2->m;
return p1->m;
}
int g()
{
union {
struct t1 s1;
struct t2 s2;
} u;
/* ... */
return f(&u.s1, &u.s2);
}

And the basic rules:

6.5 Expressions
6 The effective type of an object for an access to its stored value is
the declared type of the object, if any.[72] If a value is stored
into an object having no declared type through an lvalue having a
type that is not a character type, then the type of the lvalue
becomes the effective type of the object for that access and for
subsequent accesses that do not modify the stored value.
If a value is copied into an object having no declared type using
memcpy or memmove, or is copied as an array of character type, then
the effective type of the modified object for that access and for
subsequent accesses that do not modify the value is the effective
type of the object from which the value is copied, if it has one.
For all other accesses to an object having no declared type, the
effective type of the object is simply the type of the lvalue used
for the access.

7 An object shall have its stored value accessed only by an lvalue
expression that has one of the following types:[73]
- a type compatible with the effective type of the object,
- a qualified version of a type compatible with the effective type of
the object,
- a type that is the signed or unsigned type corresponding to the
effective type of the object,
- a type that is the signed or unsigned type corresponding to a
qualified version of the effective type of the object,
- an aggregate or union type that includes one of the aforementioned
types among its members (including, recursively, a member of a
subaggregate or contained union), or
- a character type.

Footnotes:
72) Allocated objects have no declared type.
73) The intent of this list is to specify those circumstances in
which an object may or may not be aliased.

Andrey Tarasevich · Nov 5, 2008

Stephan said:
Many thanks in advance for your insights.

The technique you describe has been used in C since forever. The other
posters already gave you a quote from the language specification, which
validates this useful technique, and which was actually included into
the language specification specifically for that purpose.

(The valid struct<->first member conversion actually made it into C++
specification as well).

Stephan Beal · Nov 5, 2008

Not quite. What is illegal, with some exceptions, is to access an

<HUGE snip>

Wow, thanks for that! Now i've got some reading to do.

The base struct macro you show is basically how i'm initializing my
subclasses, with the exception that i have one extra degree of
indirection - the subtypes hold an object which itself holds the
common API (that approach seems to simplify maintenance of the
subclass implementations in the event of a change in the base API).

Stephen Sprunk · Nov 5, 2008

Stephan said:
Optimization is certainly a potential problem. i can conceive that for
some reason a compiler might optimize or pad these differently:

struct sub1
{
base_api api;
int m1;
double m2;
};

struct sub2
{
base_api api;
double m1;
char const * m2;
};

but my knowledge of explicit optimizations done by any given compiler
is pretty minimal. i'm much better versed in C++ than C.

Those structs will likely be padded differently, yes, but you _should_
be able to cast from one to the other as long as you only access the
initial elements that they have in common (in this case, only "api").
Compilers are required to pad consistently enough that, as far into the
struct as the element types remain the same, they will be at the same
offsets; this is deliberate to allow casts to access them.

Think about this example:

struct point {
int x;
int y;
};
struct circle {
int x;
int y;
int radius;
};
struct rect {
int x;
int y;
int width;
int height;
};

Any time you need a struct point, you can safely cast a struct circle
and access x or y. This is very, very bare-bones inheritance and
polymorphism.

In the OP case, there is hope. Replacing

static base_type const * sub_inst = (base_type const*) &sub_type_inst;

by

static base_type const * sub_inst = &sub_type_inst.api;

That was my original thought, but the point of passing (base_type
[const]*) as the "self" argument of base_type_api was to give me a
level of indirection which i want (for storage of subtype-specific
data without requiring subclasses to literally redeclare the whole
public API), and i lose that (and features based off of it) if i pass
a (base_type_api*) and cast it to a (sub_type*) (which in my eyes is
just plain wrong, even if it might work in this case).

You could also do the above as:

struct point {
int x;
int y;
};
struct circle {
struct point center;
int radius;
};

The syntax to access x and y isn't quite as pretty, but the layout in
memory will be the same (a pointer to a struct is guaranteed to be
equivalent to a pointer to its first element) and the compiler should be
quiet if you cast a struct circle to a struct point.

I haven't read all your code due to the length, so I'm not entirely sure
this helps, but I've used the same tricks in OO code of my own.

gcc doesn't complain until i turn on -O2 or higher or turn on the
-fstrict-aliasing flag. tcc doesn't complain at all.

If TCC doesn't complain, it probably doesn't have enough optimizing
intelligence to care about aliasing problems. GCC is pretty aggressive
in that area, but there's a huge cost in complexity to detect aliasing
(or lack thereof), which TCC probably can't afford given its name.

S

jameskuyper · Nov 5, 2008

Stephen Sprunk wrote:
....

Those structs will likely be padded differently, yes, but you _should_
be able to cast from one to the other as long as you only access the
initial elements that they have in common (in this case, only "api").
Compilers are required to pad consistently enough that, as far into the
struct as the element types remain the same, they will be at the same
offsets; this is deliberate to allow casts to access them.

The relevant section of the standard makes that guarantee only if the
two structs are members of the same union. In practice, it generally
works for a much wider range of cases than the ones guaranteed by the
standard.

CBFalconer · Nov 5, 2008

Stephan said:
.... snip ...

i'm only interested in conforming to The Standard. My programming
won't allow me to sleep at night if i knowingly make use of a
compiler-specific extension (with the exception of a couple very
common extensions, like free placement of var decls in functions,
instead of all at the front).

Then you should entirely ignore Twink. He is a troll, and only
interested in disturbing the newsgroup.

CBFalconer · Nov 5, 2008

Stephan said:
Optimization is certainly a potential problem. i can conceive that
for some reason a compiler might optimize or pad these differently:

....

The following references may be helpful. The C99 ones are the
standard, while the n869_txt.bz2 is a bzipped version of n869.txt,
which in turn is the last version available as a text file.

Some useful references about C:
<http://www.ungerhu.com/jxh/clc.welcome.txt>
<http://c-faq.com/> (C-faq)
<http://benpfaff.org/writings/clc/off-topic.html>
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf> (C99)
<http://cbfalconer.home.att.net/download/n869_txt.bz2> (pre-C99)
<http://www.dinkumware.com/c99.aspx> (C-library)
<http://gcc.gnu.org/onlinedocs/> (GNU docs)
<http://clc-wiki.net/wiki/C_community:comp.lang.c:Introduction>

Stephan Beal · Nov 6, 2008

See James Kuyper's response, but the struct layout is not the
only issue. Even if the layouts are as you want them, the compiler
is allowed to "know" that a `struct sub1*' and a `struct sub2*' do
not point to the same thing (see Hallvard Furuseth's response). An
aggressive optimizer might assume that storing to the `api' element
of something pointed at by a `struct sub1*' does not affect the `api'
element of something pointed at by a `struct sub2*', and if in fact
they point at the same memory odd things could happen.

That thought kept me up much of the night.

One clean way to handle this is to pack all the allied types
into a union, but this requires that you know all those types up
front, which can be irksome. Another clean way to handle it is to
use a `base_api*' and point it at the `api' element of whichever
structs you're dealing with. Since the `api' is the first element
in each struct, it is always safe to convert the struct pointer to
a `base_api*' and back.

Coincidentally, that's the approach which is on my list to try out
tonight. It requires the fewest changes and "seems" to be safest
(aside from the Union) so far.

Stephan Beal · Nov 6, 2008

A follow up on how i got to a safe solution...

i've ruled out the union idea because concrete impls can be provided
by client code, and i obviously can't link those in to my lib.

Here's what i've ended up doing, which offers both the approach with
the safety guarantee and the extension-which-might-work-but-is-
technically-unsafe approach:

typedef struct base_api {
void (*member1)( struct foo const * self );
int (*member2)( struct foo const * self, int arg1 );
...
void const * implData;
} base_api;

Now my Base type looks like:

typedef struct base {
base_api const * api;
} base;

(This extra level of indirection isn't really necessary any longer,
and i may get rid of it.)

For my particular cases, all of my implementations can (and in fact
should) be initialized with constant, immutable data (it may be
instance-specific but should be immutable). With this approach i no
longer need concrete "subclasses" - i only need concrete
implementations of base, which allows me to completely avoid the
((base*)mySubT) cast. The impl functions can require that the
api->implData object be set to some implementation-specific value, which
the impls can then cast to their heart's content.

What's all this for?

As part of c11n (http://s11n.net/c11n/) i need abstract interfaces for
3 particular object types. The interfaces are used by the rest of the
API and only care that impls follow the rules defined in the API docs
for the base class API. For example, i have an interface called
c11n_marshaller, which is a marshaller type for de/serializing objects
of a specific type (we need one implementation/instance per
serializable type). Some common cases (e.g. well-known PODs) can be
combined into a single implementation of the base_api functions,
differing only in the metadata they need for the marshalling
conversion. To do this we point the api->implData to some instance-
specific static struct containing that metadata which differs from POD
type to POD type (e.g. a printf/scanf specifier). For the c11n_stream
interface, the (void * implData) (non-const) member will hold info for
the underlying native stream object (e.g. FILE handle or in-memory
buffer).

Anyway...

Thanks a thousand times to all of you for your feedback - it's helped
me move away from a potentially horrible design mistake!
