> ... you do need to link fat-pointer-compiled object files with
> fat-pointer-compiled libraries; much as you need to link 64-bit object
> files with 64-bit libraries, little-endian object files with little-
> endian libraries, and on MS-DOS used to link large memory model object
> files with large memory model libraries.

All true. Note, however, that "fat-to-thin shims" are easy to
construct: if fat_f() is a function compiled with "fat" pointers,
and it calls thin_g() in a library compiled with "thin" pointers
while passing a pointer, the call need only pass through a "skim
off the fat" layer (fat_g_to_thin_g() perhaps). A compiler could
even generate such a shim "on the fly" at link-time.
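
As a rough sketch of such a shim, assuming the <current, base, limit> representation described further on: the type fat_char_ptr and the body of thin_g() are made up for illustration, and only the names fat_f(), thin_g() and fat_g_to_thin_g() come from the text above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical fat-pointer layout: the <current, base, limit> triple. */
typedef struct {
    char *current;
    char *base;
    char *limit;   /* one past the end of the object */
} fat_char_ptr;

/* Stand-in for a library routine compiled with ordinary "thin" pointers. */
size_t thin_g(const char *s) {
    return strlen(s);
}

/* The shim: skim off the fat and forward the call unchanged. */
size_t fat_g_to_thin_g(fat_char_ptr s) {
    return thin_g(s.current);
}

/* A caller compiled with fat pointers. */
size_t fat_f(void) {
    static char text[] = "hello";
    fat_char_ptr p = { text, text, text + sizeof text };

    return fat_g_to_thin_g(p);
}
```

The shim does no checking of its own; once the bounds are dropped, thin_g() is exactly as unprotected as it always was.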
Going the other direction -- from thin to fat, including thin_g()'s
return value if it returns a pointer -- is considerably more
difficult. There are several ways to deal with this, with different
tradeoffs.
Contrary to Chuck F's followup, it *is* possible to implement fat
pointers in C, despite cast-conversions and malloc() and different
array sizes and pointer usage and so on. Again, there are multiple
ways to deal with various issues, with different tradeoffs.
In general, the simplest method is to have all "fat pointers"
represented as a triple: <current, base, limit>. A pointer
is "valid" if its current value is at least as large as its base and
no larger than its limit:

    if (p.current >= p.base && p.current <= p.limit)
        /* pointer value is valid */;
    else
        /* pointer value is invalid */;

A pointer is valid-for-dereference if it is valid (as above) *and*
its current value is strictly less than its limit. This is so that
we can compute &array[N] (where N is the size of the array) but not
access the nonexistent element array[N].
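
The two checks might be sketched as follows. This is only an illustration for pointers to int: the names fat_int_ptr, fat_from_array, fat_add and fat_load are invented here, and a real compiler would emit the comparisons inline rather than call functions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fat pointer to int: the <current, base, limit> triple. */
typedef struct {
    int *current;
    int *base;
    int *limit;   /* one past the last element */
} fat_int_ptr;

/* Valid: base <= current <= limit. */
bool fat_is_valid(fat_int_ptr p) {
    return p.current >= p.base && p.current <= p.limit;
}

/* Valid for dereference: valid, and strictly below the limit. */
bool fat_is_dereferenceable(fat_int_ptr p) {
    return fat_is_valid(p) && p.current < p.limit;
}

/* Wrap an n-element array in a fat pointer to its first element. */
fat_int_ptr fat_from_array(int *array, size_t n) {
    fat_int_ptr p = { array, array, array + n };
    return p;
}

/*
 * Pointer arithmetic moves only the current value.  In portable C the
 * result must stay within [base, limit]; a real implementation would do
 * this arithmetic on machine addresses and check afterwards.
 */
fat_int_ptr fat_add(fat_int_ptr p, ptrdiff_t n) {
    p.current += n;
    return p;
}

/* Checked load: refuses to touch memory unless p is dereferenceable. */
bool fat_load(fat_int_ptr p, int *out) {
    if (!fat_is_dereferenceable(p))
        return false;
    *out = *p.current;
    return true;
}
```

With a four-element array a, fat_add(fat_from_array(a, 4), 4) is still valid, since it corresponds to &a[4], but fat_load() on it fails.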
Given this type of "fat pointer", the simplest thin-to-fat conversion
is done like this:

    fat_pointer make_fat_pointer(machine_pointer_type value) {
        fat_pointer result;

        result.current = value;
        result.base = (machine_pointer_type) MINIMUM_MACHINE_ADDRESS;
        result.limit = (machine_pointer_type) MAXIMUM_MACHINE_ADDRESS;
        return result;
    }

Clearly this is somewhat undesirable, as it means all "thin-derived"
pointers lose all protection.
When dealing with pointers to objects embedded within larger objects
(such as members of "struct"s), the simplest method is again to
widen the base-and-limit to encompass the larger object. Consider,
e.g.:

    % cat derived.c
    #include "base.h"

    struct derived {
        struct base common;
        int additional;
    };

    void basefunc(struct base *, void (*)(struct base *)); /* in base.c */
    static void subfunc(struct base *);

    void func(void) {
        struct derived var;
        ...
        basefunc(&var.common, subfunc);
        ...
    }

    static void subfunc(struct base *p0) {
        struct derived *p = (struct derived *)p0; /* line X */
        ... use p->common and p->additional here ...
    }

There is clearly no problem in the call to basefunc(), even if we
"narrow" the "fat pointer" to point only to the sub-structure
&var.common. However, when basefunc() "calls back" into subfunc(),
as presumably it will, with a "fat pointer" to the "common" part
of a "struct derived", we will have to "re-widen" the pointer. We
can do that at the point of the cast (line X), or simply avoid
"narrowing" the fat pointer at the call to basefunc(), so that when
basefunc() calls subfunc(), it passes a pointer to the entire
structure "var", rather than just var.common.
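
To make the two choices concrete, here is a small sketch. Everything in it is an assumption for illustration: the one-member struct base, the byte-granular fat_ptr type, and the helper names are all made up. It shows that the narrowed pointer cannot satisfy the cast at line X, while the un-narrowed one can.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the declarations in base.h and derived.c. */
struct base { int id; };
struct derived {
    struct base common;
    int additional;
};

/* Hypothetical byte-granular fat pointer. */
typedef struct {
    char *current;
    char *base;
    char *limit;
} fat_ptr;

/* Narrowed: the bounds cover only the member var->common. */
fat_ptr fat_narrow_to_common(struct derived *var) {
    char *start = (char *)&var->common;
    fat_ptr p = { start, start, start + sizeof var->common };
    return p;
}

/* Not narrowed: same current value, bounds cover all of *var. */
fat_ptr fat_whole_object(struct derived *var) {
    char *start = (char *)var;
    fat_ptr p = { (char *)&var->common, start, start + sizeof *var };
    return p;
}

/*
 * True if size bytes starting at p.current lie within the bounds.
 * A checked cast to (struct derived *) at line X would need
 * fat_covers(p, sizeof(struct derived)).
 */
bool fat_covers(fat_ptr p, size_t size) {
    return p.current >= p.base && p.current <= p.limit
        && (size_t)(p.limit - p.current) >= size;
}
```

The narrowed pointer covers sizeof(struct base) bytes but not sizeof(struct derived), so the cast must either re-widen it or be rejected; the whole-object pointer passes unchanged.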
(It is probably the case, though I have not thought about it
much, that we can pass the "fully widened to entire object"
pointer only when we are taking the address of the *first* element of
the structure. That is, C code of the form:

    struct multi_inherit {
        struct base1 b1;
        struct base2 b2;
        int additional;
    };

    static void callback(struct base2 *);

    void func(void) {
        struct multi_inherit m;
        ...
        b2func(&m.b2, callback);
        ...
    }

    static void callback(struct base2 *p0) {
        struct multi_inherit *p;

        p = (struct multi_inherit *)
            ((char *)p0 - offsetof(struct multi_inherit, b2)); /* DANGER */
        ... use p->b1, p->b2, and p->additional ...
    }

is "iffy" at the line marked "DANGER", although it works in practice
on real C compilers. If the call to b2func() passes a pointer
whose base is &m.b2 and whose limit is (&m.b2 + 1), the cast in
callback() must somehow both reduce the base and increase the limit.
The C Standard says that a pointer to the first element of a
structure can be converted back to a pointer to the entire structure,
but says nothing about this kind of tricky subtraction to go from
"middle of structure" to "first element" and thence to "entire
structure". Still, several ways to handle this -- either by
forbidding it, or by recognizing such subtractions embedded within
cast expressions -- are obvious.)
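
A sketch of why the subtraction is troublesome under narrowed bounds, again with invented names (fat_ptr, fat_member_b2, fat_to_container) and one-member stand-ins for struct base1 and struct base2:

```c
#include <stdbool.h>
#include <stddef.h>

struct base1 { int x; };
struct base2 { int y; };
struct multi_inherit {
    struct base1 b1;
    struct base2 b2;
    int additional;
};

/* Hypothetical byte-granular fat pointer. */
typedef struct {
    char *current;
    char *base;
    char *limit;
} fat_ptr;

bool fat_is_valid(fat_ptr p) {
    return p.current >= p.base && p.current <= p.limit;
}

/* Fat pointer to m->b2, with bounds either narrowed to b2 or left wide. */
fat_ptr fat_member_b2(struct multi_inherit *m, bool narrow) {
    char *member = (char *)&m->b2;
    fat_ptr p;

    p.current = member;
    if (narrow) {
        p.base = member;
        p.limit = member + sizeof m->b2;
    } else {
        p.base = (char *)m;
        p.limit = (char *)m + sizeof *m;
    }
    return p;
}

/* The subtraction from the DANGER line: only the current value moves. */
fat_ptr fat_to_container(fat_ptr p) {
    p.current -= offsetof(struct multi_inherit, b2);
    return p;
}
```

With narrowed bounds the result falls below its own base and fails the validity test; with whole-object bounds it is a valid pointer to the start of the structure. A compiler that narrows would have to special-case the cast, as described above.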
A more complicated way to implement "fat pointers", which also
provides a more useful way to go from "thin" to "fat", is to record
thin-to-fat conversions in one or more runtime tables, and do
lookups as needed:

    fat_pointer make_fat_pointer(machine_pointer_type value) {
        fat_pointer *p;

        p = look_up_in_table(value);
        if (p == NULL)
            __runtime_exception("invalid pointer");
        return *p;
    }

The compiler can cache these computed fat pointers, use them
internally, pass them around, or "thin" them (by simply taking
p.current) at any time as needed for compatibility with code compiled
without the fat pointers. But there is a significant runtime cost
whenever going from "thin" to "fat", typically greater than that
added by verifying p.current as needed. (Note also that malloc()
must update the (or a) fat-pointer table if pointers that come
out of malloc() are ever to be looked up. Thus, you *can* link
against various thin functions, but never against a "thin malloc".)
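
One very simple form the table could take is a fixed-size array searched linearly. This is a sketch only: object_table, register_object, fat_aware_malloc and lookup_fat_pointer are names invented here, the lookup reports failure through its return value instead of raising a runtime exception, and the address comparisons assume a flat address space where casting to uintptr_t preserves ordering.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    char *current;
    char *base;
    char *limit;   /* one past the end of the object */
} fat_pointer;

#define TABLE_MAX 64
fat_pointer object_table[TABLE_MAX];
size_t object_count;

/* Record an object's bounds; returns 0 if the table is full. */
int register_object(void *start, size_t size) {
    if (object_count == TABLE_MAX)
        return 0;
    object_table[object_count].current = start;
    object_table[object_count].base = start;
    object_table[object_count].limit = (char *)start + size;
    object_count++;
    return 1;
}

/* A malloc() that keeps the table up to date. */
void *fat_aware_malloc(size_t size) {
    void *mem = malloc(size);

    if (mem != NULL && !register_object(mem, size)) {
        free(mem);
        return NULL;
    }
    return mem;
}

/*
 * Thin-to-fat: find a registered object whose bounds contain value.
 * Returns 1 and fills *out on success, 0 if no object matches.
 * A one-past-the-end address that is also the start of an adjacent
 * object resolves to whichever entry comes first in the table.
 */
int lookup_fat_pointer(void *value, fat_pointer *out) {
    uintptr_t v = (uintptr_t)value;
    size_t i;

    for (i = 0; i < object_count; i++) {
        if (v >= (uintptr_t)object_table[i].base &&
            v <= (uintptr_t)object_table[i].limit) {
            *out = object_table[i];
            out->current = value;
            return 1;
        }
    }
    return 0;
}
```

A matching free() wrapper that removes the entry is omitted, as is any registration of static or automatic objects; the linear search also makes the runtime cost of each thin-to-fat conversion plain.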