why the usage of gets() is dangerous.

Keith Thompson · Nov 20, 2007

Tor said:
I don't think so.

Methinks, fat pointers break pointer arithmetic and thus require at
least a new language dialect.

Hmm. You might be right. I don't care enough about a hypothetical
"safe" gets() to work out the details.

Also, the buffer passed to gets() may not be malloc'ed, but can be an
array, or even a sub-array.

Certainly. My assumption is that *all* pointers would be "fat". For
example, if you take the address of a single declared object, the
resulting pointer value includes the information that it points to a
single object (and dereferencing ptr+1 is therefore not allowed). When
an array name decays to a pointer to its first element, that pointer
value contains bounds information about the array. Pointer arithmetic
preserves and updates the bounds information.

A "fat" pointer might consist of three elements: the address of the
zeroth element of the array that contains the object being pointed to,
the index of the specific element, and the length of the array. All
operations that create pointers must correctly initialize this
information; all operations on pointers must preserve and update it.
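
A minimal sketch of the three-field representation described above, for illustration only. The names fat_int_ptr, fat_from_array, fat_add and fat_load are hypothetical, and the sketch covers only pointers to int; a real implementation would generate equivalent checks inside the compiler rather than expose them as functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fat pointer to int, following the three fields above. */
typedef struct {
    int   *base;    /* address of element 0 of the containing array */
    size_t index;   /* element currently pointed to; may equal length */
    size_t length;  /* number of elements in the array */
} fat_int_ptr;

/* Array-to-pointer decay: point at element 0 and remember the bounds. */
fat_int_ptr fat_from_array(int *array, size_t length)
{
    fat_int_ptr p = { array, 0, length };
    return p;
}

/* Pointer arithmetic only touches the index. Returns false and leaves *p
   unchanged if the result would fall outside [0, length]. */
bool fat_add(fat_int_ptr *p, ptrdiff_t n)
{
    assert(p->index <= p->length);             /* invariant of every fat pointer */
    if (n < 0) {
        size_t back = (size_t)0 - (size_t)n;   /* magnitude of n */
        if (back > p->index)
            return false;
        p->index -= back;
    } else {
        if ((size_t)n > p->length - p->index)
            return false;
        p->index += (size_t)n;
    }
    return true;
}

/* Checked read: one past the end is a valid pointer but may not be read. */
bool fat_load(fat_int_ptr p, int *out)
{
    if (p.index >= p.length)
        return false;
    *out = p.base[p.index];
    return true;
}
```

Under this sketch, the address of a single declared object would be fat_from_array(&x, 1), so stepping to index 1 is allowed but reading through it is refused.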

I *think* that such an implementation could conform to the C standard,
and could detect many pointer bugs (at a perhaps unacceptable cost in
performance).

I believe Malcolm's claim as stated is correct. It's not particularly
useful, but he didn't claim that it was; I believe it was merely an
intellectual exercise, not a serious proposal.

I can't see how Malcolm's claim can be correct; the only way... is IF
the implementation restricts gets() buffer writes to some hard upper
limit, let's say one less than MAX_GETS_WRITE, then the

char buf[MAX_GETS_WRITE];

gets(buf);

would be safe.

Such an implementation of gets() would be non-conforming, since it
wouldn't allow you to read into a buffer bigger than MAX_GETS_WRITE bytes.

Flash Gordon · Nov 20, 2007

Keith Thompson wrote, On 20/11/07 22:36:

Hmm. You might be right. I don't care enough about a hypothetical
"safe" gets() to work out the details.

Certainly. My assumption is that *all* pointers would be "fat". For
example, if you take the address of a single declared object, the
resulting pointer value includes the information that it points to a
single object (and dereferencing ptr+1 is therefore not allowed). When
an array name decays to a pointer to its first element, that pointer
value contains bounds information about the array. Pointer arithmetic
preserves and updates the bounds information.

A "fat" pointer might consist of three elements: the address of the
zeroth element of the array that contains the object being pointed to,
the index of the specific element, and the length of the array. All
operations that create pointers must correctly initialize this
information; all operations on pointers must preserve and update it.

I *think* that such an implementation could conform to the C standard,
and could detect many pointer bugs (at a perhaps unacceptable cost in
performance).

Here is a problem for it. Assume that all allocations succeed and each
function is in a separate TU with appropriate headers etc...

char *p1;
char *p2;
char *p3;

char *alloc()
{
return p2 = p3 = malloc(5);
}

char *ralloc(char *orig)
{
return realloc(orig,10);
}

char *foo(void)
{
p1 = ralloc(alloc());
if (p1 == p2)
strcpy(p3,"Hello World");
else
strcpy(p3,"Bye");
}

If realloc has not moved the pointer then how will the size information
in the fat pointer p3 get updated? This is a contrived example, but
there might be more likely situations that would also be a problem, so
there would be a lot of work to nail down everything for a fat pointer
system.

I believe Malcolm's claim as stated is correct. It's not particularly
useful, but he didn't claim that it was; I believe it was merely an
intellectual exercise, not a serious proposal.

I can't see how Malcolm's claim can be correct; the only way... is IF
the implementation restricts gets() buffer writes to some hard upper
limit, let's say one less than MAX_GETS_WRITE, then the

char buf[MAX_GETS_WRITE];

gets(buf);

would be safe.

Such an implementation of gets() would be non-conforming, since it
wouldn't allow you to read into a buffer bigger than MAX_GETS_WRITE bytes.

Even if it returned NULL to indicate an error?

Ben Bacarisse · Nov 21, 2007

Flash Gordon said:
Keith Thompson wrote, On 20/11/07 22:36:

Here is a problem for it. Assume that all allocations succeed and each
function is in a separate TU with appropriate headers etc...

char *p1;
char *p2;
char *p3;

char *alloc()
{
return p2 = p3 = malloc(5);
}

char *ralloc(char *orig)
{
return realloc(orig,10);
}

char *foo(void)
{
p1 = ralloc(alloc());
if (p1 == p2)
strcpy(p3,"Hello World");
else
strcpy(p3,"Bye");
}

If realloc has not moved the pointer then how will the size
information in the fat pointer p3 get updated?

I don't think that example is valid. An implementation is allowed to
have realloc always move the object, and that is what I'd do if I were
implementing fat pointers.
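
A minimal sketch of a realloc that always moves the object, so that no stale copy of the old pointer can still compare equal to the new one. The name moving_realloc and the extra old_size parameter are assumptions made for illustration: a real allocator knows the old size from its own bookkeeping.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical realloc variant that never resizes in place.
   old_size is supplied by the caller only because this sketch cannot
   see the allocator's internal records. new_size must be nonzero. */
void *moving_realloc(void *old, size_t old_size, size_t new_size)
{
    void *fresh;

    assert(new_size > 0);
    fresh = malloc(new_size);  /* allocated while old is still live,
                                  so the two addresses must differ */
    if (fresh == NULL)
        return NULL;           /* like realloc, leave old untouched on failure */
    if (old != NULL) {
        memcpy(fresh, old, old_size < new_size ? old_size : new_size);
        free(old);
    }
    return fresh;
}
```

With this policy every pre-realloc copy of the pointer (p2 and p3 in the example) is simply invalid afterwards, and the question of updating its bounds never arises.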

RoS · Nov 21, 2007

On Tue, 20 Nov 2007 23:18:50 +0000, Flash Gordon wrote:

Here is a problem for it. Assume that all allocations succeed and each
function is in a separate TU with appropriate headers etc...

char *p1;
char *p2;
char *p3;

char *alloc()
{
return p2 = p3 = malloc(5);
}

char *ralloc(char *orig)
{
return realloc(orig,10);
}

char *foo(void)
{
p1 = ralloc(alloc());
if (p1 == p2)
strcpy(p3,"Hello World");
else
strcpy(p3,"Bye");
}

If realloc has not moved the pointer then how will the size information
in the fat pointer p3 get updated?

It is updated by realloc.

If realloc does not move the object, p3 points to a valid address
and its size is changed by realloc.

If realloc moves the object, p3 points to an invalid address.

Richard Bos · Nov 21, 2007

Tor Rustad said:
I don't think so.

In theory, he's correct. In practice, it depends on whether you think
either a predictable crash or predictable loss of data counts as "safe".
It is at least generally safer than having gets() write all over the end
of its target.

Methinks, fat pointers break pointer arithmetic and thus require at
least a new language dialect.

No, they don't. Pointer arithmetic beyond the bounds of an object has
undefined behaviour anyway, and within an object it works fine with fat
pointers. Adding an integer to a pointer is now a matter of adding it to
a single field of the pointer structure, rather than to a flat index,
but something similar is needed with, e.g., segmented architectures.

Also, the buffer passed to gets() may not be malloc'ed, but can be an
array, or even a sub-array.

So? A sub-array simply has it recorded, in its fat pointer data, that it
is a sub-array, and what of.

Richard

Tor Rustad · Nov 21, 2007

Richard said:
[...]

Methinks, fat pointers break pointer arithmetic and thus require at
least a new language dialect.

No, they don't.

Hmm... what about volatile, extern and variable-length arrays? Also, I
expect GCs to do low-level pointer magic.

Pointer arithmetic beyond the bounds of an object has
undefined behaviour anyway,

Off-by-one is allowed, see e.g. response to DR #76 and DR #221.

N1256 6.5.6p8:

"Moreover, if the expression P points to the last element of an array
object, the expression (P)+1 points one past the last element of the
array object, and if the expression Q points one past the last element
of an array object, the expression (Q)-1 points to the last element of
the array object. If both the pointer operand and the result point to
elements of the same array object, or one past the last element of the
array object, the evaluation shall not produce an overflow;
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
otherwise, the behavior is undefined. If the result points one past the
last element of the array object, it shall not be used as the operand of
a unary * operator that is evaluated."

For example:

int a[10], *p;

p = (a + 11) - 1; /* UB */
p = a + 10; /* No UB, but off-by-one */
(&*p); /* No UB in C99, but off-by-one */

and within an object it works fine with fat
pointers. Adding an integer to a pointer is now a matter of adding it to
a single field of the pointer structure, rather than to a flat index,
but something similar is needed with, e.g., segmented architectures.

Well, it appears to me that segmented architectures capable of C99
might not be able to have 65535 bytes for an object when using fat
pointers.

So? A sub-array simply has it recorded, in its fat pointer data, that it
is a sub-array, and what of.

So... the compiler would generate instrumented code then, lots of
run-time checks! This is getting far too theoretical... fat pointers
make sense to me, *if* C had such a new pointer type, else everyone needs
to use "the same" compiler, or how do you suggest we link in libraries
in practice?

Tor Rustad · Nov 21, 2007

Keith said:
Tor Rustad wrote:
[...]

the implementation restricts gets() buffer writes to some hard upper
limit, let's say one less than MAX_GETS_WRITE, then the

char buf[MAX_GETS_WRITE];

gets(buf);

would be safe.

Such an implementation of gets() would be non-conforming, since it
wouldn't allow you to read into a buffer bigger than MAX_GETS_WRITE bytes.

MAX_GETS_WRITE could give some system-dependent limit of the max length
of an input line. IIRC, Dan Pop had such a DOS box in his office once,
which made using gets() perfectly safe on that system.

Malcolm McLean · Nov 21, 2007

RoS said:
On Tue, 20 Nov 2007 23:18:50 +0000, Flash Gordon wrote:

It is updated by realloc.

If realloc does not move the object, p3 points to a valid address
and its size is changed by realloc.

If realloc moves the object, p3 points to an invalid address.

p3 is invalidated by the realloc() call. If it happens to still point to an
area of memory that is physically under the control of the program that is
pure chance.

Malcolm McLean · Nov 21, 2007

Kenny McCormack said:
I do see this as being a bit difficult. It boils down to: Is
it possible to keep enough information in the system so that we can
know, for any possible pointer and/or pointer value, how much valid
memory there is after that pointer?

I can't think of any counter-examples off-hand, but that doesn't mean
there aren't any.

The difficult situation was given by Ben Bacarisse:

struct any
{
char array[10];
int x;
};

struct any list[2];

char *ptr = list[1].array;
struct any *ptr2 = (struct any *) ptr;
struct any *ptr3 = ptr2-1;

ptr3 is defined. So we need an additional "mother" field in our fat pointer,
purely to give the correct bounds to ptr2.

In practice of course the standard would have to be tweaked, if fat pointers
ever gained currency.

CBFalconer · Nov 22, 2007

Faulty routine - fails to return a value.

p3 is invalidated by the realloc() call. If it happens to still
point to an area of memory that is physically under the control
of the program that is pure chance.

This illustrates the dangers of keeping copies of allocated
pointers. p3 is not necessarily invalidated. However, if (p1 !=
p2) then p3 has been invalidated. You have no control over this.

Keith Thompson · Nov 22, 2007

Tor said:
Keith said:

Tor Rustad wrote:
[...]

the implementation restricts gets() buffer writes to some hard upper
limit, let's say one less than MAX_GETS_WRITE, then the

char buf[MAX_GETS_WRITE];

gets(buf);

would be safe.

Such an implementation of gets() would be non-conforming, since it
wouldn't allow you to read into a buffer bigger than MAX_GETS_WRITE
bytes.

MAX_GETS_WRITE could give some system-dependent limit of the max length
of an input line. IIRC, Dan Pop had such a DOS box in his office once,
which made using gets() perfectly safe on that system.

Ok, good point. A length-limited gets() could be conforming on a system that
already imposes a maximum length on input lines.

But on systems that don't impose such a limit, gets() must be able to read
arbitrarily long lines, as long as the provided buffer is big enough.
(Normally, of course, gets can't tell how big the buffer is, which is why it's
so dangerous in most cases.)
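
For contrast, a minimal sketch of a line reader that is told the buffer size, built on the standard fgets(). The name read_line and its truncated out-parameter are hypothetical; the choice to discard the rest of an over-long line is one policy among several, not something this thread prescribes.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Reads one line from stream into buf (capacity size) and strips the
   newline. Returns buf, or NULL on end-of-file or error with nothing
   read. If the line does not fit, the excess is discarded and
   *truncated is set to 1. */
char *read_line(char *buf, size_t size, FILE *stream, int *truncated)
{
    size_t len;
    int c;

    assert(buf != NULL && stream != NULL && truncated != NULL);
    *truncated = 0;
    if (size > INT_MAX)
        size = INT_MAX;                 /* fgets() takes an int count */
    if (size == 0 || fgets(buf, (int)size, stream) == NULL)
        return NULL;
    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';            /* complete line, newline removed */
        return buf;
    }
    /* No newline was stored: the line was too long or input ended.
       Drain the remainder so the next call starts on a fresh line. */
    while ((c = getc(stream)) != EOF && c != '\n')
        *truncated = 1;
    return buf;
}
```

Unlike gets(), the caller states the capacity, so the function never writes more than size bytes regardless of what arrives on the input.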

RoS · Nov 22, 2007

On Wed, 21 Nov 2007 18:35:01 -0800, Keith Thompson wrote:

Tor said:

Keith said:

Tor Rustad wrote:
[...]

the implementation restricts gets() buffer writes to some hard upper
limit, let's say one less than MAX_GETS_WRITE, then the

char buf[MAX_GETS_WRITE];

gets(buf);

would be safe.

Such an implementation of gets() would be non-conforming, since it
wouldn't allow you to read into a buffer bigger than MAX_GETS_WRITE
bytes.

MAX_GETS_WRITE could give some system-dependent limit of the max length
of an input line. IIRC, Dan Pop had such a DOS box in his office once,
which made using gets() perfectly safe on that system.

Ok, good point. A length-limited gets() could be conforming on a system that
already imposes a maximum length on input lines.

But on systems that don't impose such a limit, gets() must be able to read
arbitrarily long lines, as long as the provided buffer is big enough.
(Normally, of course, gets can't tell how big the buffer is, which is why it's
so dangerous in most cases.)

If the compiler keeps an array of elements

typedef struct { char *where; size_t size; } arrayelement;

that describe each memory object returned from malloc or allocated on
the stack, it is possible to write a routine that says whether an
address may be read and written. This means it is possible to write a
safe gets(), in the sense that gets() does not overflow the input array.

#include <stdio.h>

/* Table of known objects, maintained by the implementation;
   an entry with where == 0 marks the end. */
extern arrayelement whereitis[];

/* Returns 1 if a points into a known object, 0 otherwise. */
int isok(char *a)
{
    arrayelement *p;
    for (p = whereitis; p->where != 0; ++p)
        if (a >= p->where && a < p->where + p->size)
            return 1;
    return 0;
}

/* Returns the number of bytes from a to the end of its object, or 0. */
size_t sizefromhere(char *a)
{
    arrayelement *p;
    for (p = whereitis; p->where != 0; ++p)
        if (a >= p->where && a < p->where + p->size)
            return p->size - (size_t)(a - p->where);
    return 0;
}

so gets could be something like

char *gets(char *buf)
{
    size_t limit = sizefromhere(buf);
    size_t h = 0;
    int c;

    if (limit == 0)
        return 0;                    /* unknown buffer */
    for (;;) {
        c = getchar();
        if (c == EOF) {
            buf[h] = 0;
            return (ferror(stdin) || h == 0) ? 0 : buf;
        }
        if (c == '\n') {             /* gets() discards the newline */
            buf[h] = 0;
            return buf;
        }
        if (h == limit - 1) {        /* no room left: stop before overflowing */
            buf[h] = 0;
            return 0;
        }
        buf[h++] = (char)c;
    }
}

Richard Bos · Nov 22, 2007

Tor Rustad said:
Hmm... what about volatile, extern and variable-length arrays?

Why should they be different?

Also, I expect GCs to do low-level pointer magic.

But GC in itself breaks ISO C compatibility. GC introduces a new
dialect; fat pointers do not.

Off-by-one is allowed, see e.g. response to DR #76 and DR #221.

Of course, but that still doesn't break fat pointers.

Well, it appears to me that segmented architectures capable of C99
might not be able to have 65535 bytes for an object when using fat
pointers.

Why on earth not? For starters, who says that segments _must_ be 1980s
Intel 8088-compatible segments?

So... the compiler would generate instrumented code then, lots of
run-time checks!

Yes. That's most of the point of fat pointers. They are generally used in
debugging implementations. I don't know of any normal C implementation
which uses them.

This is getting far too theoretical... fat pointers make sense to me, *if*
C had such a new pointer type, else everyone needs to use "the same"
compiler, or how do you suggest we link in libraries in practice?

You misunderstand. Fat pointers are not a new pointer type within ISO C.
Rather, they are a way to implement normal C pointers behind the scene.
And yes, you do need to link fat-pointer-compiled object files with
fat-pointer-compiled libraries; much as you need to link 64-bit object
files with 64-bit libraries, little-endian object files with little-
endian libraries, and on MS-DOS used to link large memory model object
files with large memory model libraries.

Richard

Flash Gordon · Nov 22, 2007

Malcolm McLean wrote, On 21/11/07 22:35:
I should really have used memcmp here to avoid the problem of evaluating
a possibly invalid pointer. Yes, memcmp could find a difference when the
realloc has not moved the allocated space, but...

p3 is invalidated by the realloc() call.

Not necessarily.

If it happens to still point to
an area of memory that is physically under the control of the program
that is pure chance.

I was attempting to be careful to only use p3 if it was still valid.
Although as I point out above I made a mistake.

Flash Gordon · Nov 22, 2007

CBFalconer wrote, On 22/11/07 01:43:

Faulty routine - fails to return a value.

Only if the return value is used...

OK, I was going to do some more stuff but ended up not bothering. It is
also irrelevant to the main points.

This illustrates the dangers of keeping copies of allocated
pointers. p3 is not necessarily invalidated. However, if (p1 !=
p2) then p3 has been invalidated. You have no control over this.

The function checked for this, although the real error that no one
pointed out was not using memcmp for the comparison.

Flash Gordon · Nov 22, 2007

Ben Bacarisse wrote, On 21/11/07 00:59:

I don't think that example is valid. An implementation is allowed to
have realloc always move the object, and that is what I'd do if I were
implementing fat pointers.

OK, the implementer can make that choice and thus prevent my idea from
breaking things.

CBFalconer · Nov 23, 2007

Richard said:
.... snip ...

Yes. That's most of the point of fat pointers. They are generally
used in debugging implementations. I don't know of any normal C
implementation which uses them.

You misunderstand. Fat pointers are not a new pointer type within
ISO C. Rather, they are a way to implement normal C pointers
behind the scene. And yes, you do need to link fat-pointer-compiled
object files with fat-pointer-compiled libraries; much as you need
to link 64-bit object files with 64-bit libraries, little-endian
object files with little-endian libraries, and on MS-DOS used to
link large memory model object files with large memory model
libraries.

I maintain that 'fat' pointers are incompatible with C in the first
place. Remember that pointers can point to structures, arrays,
single objects, etc. Other pointers can be created by performing
arithmetic on pointers (effectively indexing). The purpose of
'fat' pointers is to allow complete range checking at any time
(correct me if this is wrong). It only takes one example of
failure to prevent it all.

Let us assume we have "int foo[10];" declared. Now this is an
array, and entirely usable within its scope. However, to pass it
elsewhere, we have to convert it to "int *foobar" and lose the 10
dimension. What happens if we try to retain that dimension? Well,
maybe we need to pass out a single entity in the foo array, by
"foobar[2]". There is no proscription against the destined routine
regaining the original pointer via "foobarptr - 2". Yet the passed
foobar necessarily specified a valid array length of 1, allowing
only indexing by 0. -2 seems not to fit!

Things get worse when we try other things, such as passing a char
pointer, and making valid (or invalid) assumptions as to what was
legally accessible.

Don't forget that these pointers must arise from allocated memory,
static memory, and auto memory. Some operations (such as free)
become invalid on incremented/decremented pointers.

Unless such an improvement can handle EVERY type of occurrence, it
is better to simply not provide the 'improvement'. Now the poor
programmer may even have to think.

santosh · Nov 23, 2007

CBFalconer said:
Richard said:

... snip ...

Yes. That's most of the point of fat pointers. They are generally
used in debugging implementations. I don't know of any normal C
implementation which uses them.

You misunderstand. Fat pointers are not a new pointer type within
ISO C. Rather, they are a way to implement normal C pointers
behind the scene. And yes, you do need to link fat-pointer-compiled
object files with fat-pointer-compiled libraries; much as you need
to link 64-bit object files with 64-bit libraries, little-endian
object files with little-endian libraries, and on MS-DOS used to
link large memory model object files with large memory model
libraries.

I maintain that 'fat' pointers are incompatible with C in the first
place. Remember that pointers can point to structures, arrays,
single objects, etc. Other pointers can be created by performing
arithmetic on pointers (effectively indexing). The purpose of
'fat' pointers is to allow complete range checking at any time
(correct me if this is wrong). It only takes one example of
failure to prevent it all.

Let us assume we have "int foo[10];" declared. Now this is an
array, and entirely usable within its scope. However, to pass it
elsewhere, we have to convert it to "int *foobar" and lose the 10
dimension. What happens if we try to retain that dimension? Well,
maybe we need to pass out a single entity in the foo array, by
"foobar[2]". There is no proscription against the destined routine
regaining the original pointer via "foobarptr - 2". Yet the passed
foobar necessarily specified a valid array length of 1, allowing
only indexing by 0. -2 seems not to fit!

I think that regardless of whether we pass a pointer to the start of the
array or somewhere into it, the implementation will pass a fat pointer
containing the bounds of the parent object (in this case 10). Therefore
an index of -2 would be legal since it is within the bounds of the
object.

Only an explicit cast should be allowed to change the details within the
fat pointer, effectively converting it into a fat pointer for
a "different" object.

Things get worse when we try other things, such as passing a char
pointer,

Why?

and making valid (or invalid) assumptions as to what was
legally accessible.

Well invalid assumptions break existing pointers too, so fat pointers
are not uniquely culpable in this regard. C's fundamental assumption
that the programmer knows what he's doing would continue to be in
effect, fat pointers or not.

Don't forget that these pointers must arise from allocated memory,
static memory, and auto memory. Some operations (such as free)
become invalid on incremented/decremented pointers.

Unless such an improvement can handle EVERY type of occurrence, it
is better to simply not provide the 'improvement'. Now the poor
programmer may even have to think.

I am not knowledgeable enough about C to say whether fat pointers break
its rules severely enough to rule out their inclusion, but from
what I know, I can't see how they would be impermissible.

Obviously it would require a lot of behind-the-scenes compiler magic,
and is likely to severely degrade performance, but it ought to be, from
what I know, possible. Of course I'm likely to be proved wrong in a few
minutes by an expert here.

Chris Torek · Nov 23, 2007

... you do need to link fat-pointer-compiled object files with
fat-pointer-compiled libraries; much as you need to link 64-bit object
files with 64-bit libraries, little-endian object files with little-
endian libraries, and on MS-DOS used to link large memory model object
files with large memory model libraries.

All true. Note, however, that "fat-to-thin shims" are easy to
construct: if fat_f() is a function compiled with "fat" pointers,
and it calls thin_g() in a library compiled with "thin" pointers
while passing a pointer, the call need only pass through a "skim
off the fat" layer (fat_g_to_thin_g() perhaps). A compiler could
even generate such a shim "on the fly" at link-time.

Going the other direction -- from thin to fat, including thin_g()'s
return value if it returns a pointer -- is considerably more
difficult. There are several ways to deal with this, with different
tradeoffs.

Contrary to Chuck F's followup, it *is* possible to implement fat
pointers in C, despite cast-conversions and malloc() and different
array sizes and pointer usage and so on. Again, there are multiple
ways to deal with various issues, with different tradeoffs.

In general, the simplest method is to have all "fat pointers"
represented as a triple: <currentvalue, base, limit>. A pointer
is "valid" if its current-value is at least as big as its base and
no bigger than its limit:

if (p.current >= p.base && p.current <= p.limit)
/* pointer value is valid */;
else
/* pointer value is invalid */;

A pointer is valid-for-dereference if it is valid (as above) *and*
strictly less than its limit. This is so that we can compute
&array[N] (where N is the size of the array) but not access the
nonexistent element array[N].
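
The two predicates can be sketched as follows, assuming char * as the machine pointer type and hypothetical names fat_from_object, fat_is_valid and fat_is_dereferenceable. In a real implementation these comparisons would be on machine addresses inside generated code; strictly conforming C cannot even form an out-of-range pointer to test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The <currentvalue, base, limit> triple, with char * standing in for
   the machine pointer type (an assumption of this sketch). */
typedef struct {
    char *current;  /* the address an ordinary pointer would hold */
    char *base;     /* first byte of the object */
    char *limit;    /* one past the last byte of the object */
} fat_pointer;

/* Build a fat pointer to the start of an object of the given size. */
fat_pointer fat_from_object(void *object, size_t size)
{
    fat_pointer p;

    assert(object != NULL);
    p.current = p.base = (char *)object;
    p.limit = p.base + size;
    return p;
}

/* Valid: anywhere from base up to and including limit. */
bool fat_is_valid(fat_pointer p)
{
    return p.current >= p.base && p.current <= p.limit;
}

/* Valid for dereference: valid and strictly below limit. */
bool fat_is_dereferenceable(fat_pointer p)
{
    return fat_is_valid(p) && p.current < p.limit;
}
```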

Given this type of "fat pointer", the simplest thin-to-fat conversion
is done like this:

fat_pointer make_fat_pointer(machine_pointer_type value) {
fat_pointer result;

result.current = value;
result.base = (machine_pointer_type) MINIMUM_MACHINE_ADDRESS;
result.limit = (machine_pointer_type) MAXIMUM_MACHINE_ADDRESS;
return result;
}

Clearly this is somewhat undesirable, as it means all "thin-derived"
pointers lose all protection.

When dealing with pointers to objects embedded within larger objects
(such as elements of "struct"s), the simplest method is again to
widen the base-and-limit to encompass the large object. Consider,
e.g.:

% cat derived.c
#include "base.h"
struct derived {
struct base common;
int additional;
};

void basefunc(struct base *, void (*)(struct base *)); /* in base.c */
static void subfunc(struct base *);

void func(void) {
struct derived var;
...
basefunc(&var.common, subfunc);
...
}

static void subfunc(struct base *p0) {
struct derived *p = (struct derived *)p0; /* line X */
... use p->common and p->additional here ...
}

There is clearly no problem in the call to basefunc(), even if we
"narrow" the "fat pointer" to point only to the sub-structure
&var.common. However, when basefunc() "calls back" into subfunc(),
as presumably it will, with a "fat pointer" to the "common" part
of a "struct derived", we will have to "re-widen" the pointer. We
can do that at the point of the cast (line X), or simply avoid
"narrowing" the fat pointer at the call to basefunc(), so that when
basefunc() calls subfunc(), it passes a pointer to the entire
structure "var", rather than just var.common.

(It is probably the case, not that I have thought about it that
much, that we can pass only the "fully widened to entire object"
pointer if we are taking the address of the *first* element of
the structure. That is, C code of the form:

struct multi_inherit {
struct base1 b1;
struct base2 b2;
int additional;
};

static void callback(struct base2 *);

void func(void) {
struct multi_inherit m;
...
b2func(&m.b2, callback);
...
}

static void callback(struct base2 *p0) {
struct multi_inherit *p;

p = (struct multi_inherit *)
((char *)p0 - offsetof(struct multi_inherit, b2)); /* DANGER */
... use p->b1, p->b2, and p->additional ...
}

is "iffy" at the line marked "danger", although it works in practice
on real C compilers. If the call to b2func() passes a pointer
whose base is &m.b2 and whose limit is (&m.b2 + 1), the cast in
callback() must somehow both reduce the base and increase the limit.
The C Standard says that a pointer to the first element of a
structure can be converted back to a pointer to the entire structure,
but says nothing about this kind of tricky subtraction to go from
"middle of structure" to "first element" and thence to "entire
structure". Still, several ways to handle this -- either by
forbidding it, or by recognizing such subtractions embedded within
cast expressions -- are obvious.)

A more complicated way to implement "fat pointers", which also
provides a more useful way to go from "thin" to "fat", is to record
thin-to-fat conversions in one or more runtime tables, and do
lookups as needed:

fat_pointer make_fat_pointer(machine_pointer_type value) {
fat_pointer *p;

p = look_up_in_table(value);
if (p == NULL)
__runtime_exception("invalid pointer");
return *p;
}

The compiler can cache these computed fat pointers, use them
internally, pass them around, or "thin" them (by simply taking
p.current) at any time as needed for compatibility with code compiled
without the fat pointers. But there is a significant runtime cost
whenever going from "thin" to "fat", typically greater than that
added by verifying p.current as needed. (Note also that malloc()
must manipulate the, or a, fat-pointer table, if pointers that come
out of malloc() are ever to be looked-up. Thus, you *can* link
against various thin functions, but never against "thin malloc".)
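
A minimal sketch of the table-based approach, under several assumptions: a fixed-size table searched linearly, a hypothetical register_region() that a fat-pointer-aware malloc() would call, and a make_fat_pointer() that reports failure through its return value instead of raising a runtime exception. A real implementation would need a faster lookup structure and would also have to remove entries on free().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { char *current, *base, *limit; } fat_pointer;

#define MAX_REGIONS 64                     /* arbitrary capacity for this sketch */

static fat_pointer regions[MAX_REGIONS];   /* one entry per known object */
static size_t region_count;

/* Record an object in the table. Returns 1 on success, 0 if the table is full. */
int register_region(void *base, size_t size)
{
    assert(base != NULL);
    if (region_count == MAX_REGIONS)
        return 0;
    regions[region_count].base = (char *)base;
    regions[region_count].limit = (char *)base + size;
    regions[region_count].current = (char *)base;
    region_count++;
    return 1;
}

/* Thin-to-fat conversion by lookup. Returns 1 and fills *out if value
   points into a registered object, 0 otherwise. Addresses are compared
   as integers because relational comparison of pointers into different
   objects is undefined in ISO C. A one-past-the-end thin pointer is not
   recovered here, since it could equally be the base of a neighbour. */
int make_fat_pointer(void *value, fat_pointer *out)
{
    uintptr_t v = (uintptr_t)value;
    size_t i;

    for (i = 0; i < region_count; i++) {
        if (v >= (uintptr_t)regions[i].base && v < (uintptr_t)regions[i].limit) {
            *out = regions[i];
            out->current = (char *)value;
            return 1;
        }
    }
    return 0;
}
```

The linear search makes the runtime cost of every thin-to-fat conversion visible, which is the tradeoff noted above.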

Flash Gordon · Nov 23, 2007

CBFalconer wrote, On 23/11/07 03:15:

I maintain that 'fat' pointers are incompatible with C in the first
place.

I disagree. They are not easy to implement, but there are ways around
all of the problems if you work hard enough at it.

Remember that pointers can point to structures, arrays,
single objects, etc. Other pointers can be created by performing
arithmetic on pointers (effectively indexing). The purpose of
'fat' pointers is to allow complete range checking at any time
(correct me if this is wrong).

It is to allow range checking.

It only takes one example of
failure to prevent it all.

Only if the example cannot be detected and the range checking disabled
or worked around in that example.

Let us assume we have "int foo[10];" declared. Now this is an
array, and entirely usable within its scope. However, to pass it
elsewhere, we have to convert it to "int *foobar" and lose the 10
dimension. What happens if we try to retain that dimension? Well,
maybe we need to pass out a single entity in the foo array, by
"foobar[2]". There is no proscription against the destined routine
regaining the original pointer via "foobarptr - 2". Yet the passed
foobar necessarily specified a valid array length of 1, allowing
only indexing by 0. -2 seems not to fit!

That one is easy to deal with. The fat pointer includes the information
that it points to the second element of a 10 element array, then it
knows that you can go back 2 elements.

Things get worse when we try other things, such as passing a char
pointer, and making valid (or invalid) assumptions as to what was
legally accessible.

A sweeping statement like that does nothing for your argument.

Don't forget that these pointers must arise from allocated memory,
static memory, and auto memory. Some operations (such as free)
become invalid on incremented/decremented pointers.

The implementation does not have to worry about what happens if you (the
programmer) do something the C standard states is undefined, since as
you know under those situations anything the implementation does is
"correct".

Unless such an improvement can handle EVERY type of occurrence, it
is better to simply not provide the 'improvement'.

No, if it catches some instances that would otherwise be security holes
then it is useful even if it does not catch everything. Otherwise we
might as well disable all memory protection and go back to the days of
one application being able to overwrite memory belonging to another.

Now the poor
programmer may even have to think.

It is not to protect the programmer from having to think; it is to
reduce the chance of the user having an unknown security hole
and/or of a program getting away with silently producing incorrect results
instead of crashing. Just like other memory protection facilities which
have been popular on all decent systems for more years than I have been
programming.
