Why the usage of gets() is dangerous.


Hallvard B Furuseth

Harald said:
We'll declare that pointers consist of three values - the address, the
start of the object, and the end of the object.
(..)
struct S {
char c[10];
int i;
} s;
(...)
A pointer to s.c would have to store the end as &s.c[10].

Okay, so then you can't get back the original &s?

If I remember correctly, (struct S *)s.c is indeed undefined. Or it
is defined but ((struct S *)s.c)->i and (struct S *)s.c + 1 are not.
Probably for similar reasons that the 'struct hack' is undefined.
 


Harald van Dijk

Harald said:
We'll declare that pointers consist of three values - the address, the
start of the object, and the end of the object.
(..)
struct S {
char c[10];
int i;
} s;
(...)
A pointer to s.c would have to store the end as &s.c[10].

Okay, so then you can't get back the original &s?

If I remember correctly, (struct S *)s.c is indeed undefined. Or it is
defined but ((struct S *)s.c)->i and (struct S *)s.c + 1 are not.
Probably for similar reasons that the 'struct hack' is undefined.

Quoting from 6.7.2.1p13:
"A pointer to a structure object, suitably converted, points to its
initial member (or if that member is a bit-field, then to the unit in
which it resides), and vice versa."

The "and vice versa" means that a pointer to the initial member of a
structure can be converted back to a pointer to that structure object,
does it not? Technically, the initial member of the structure is an array
of char, not a char, but I believe I addressed that in my previous
message.
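
A minimal sketch of that round trip, assuming the pointer in question points to the whole array member (type char (*)[10]) rather than to its first element. The helper name s_from_initial_member is made up for illustration; it is not from the standard or from any earlier message.

```c
#include <assert.h>
#include <stddef.h>

struct S {
    char c[10];
    int i;
};

/* Hypothetical helper: convert a pointer to the initial member (the
   array c as a whole) back to a pointer to the enclosing structure,
   relying on the "and vice versa" wording quoted above. */
static struct S *s_from_initial_member(char (*first)[10])
{
    assert(first != NULL);
    return (struct S *)first;
}
```

Called as s_from_initial_member(&s.c), this should give back &s, so the i member is reachable again. Whether the same holds for a plain char * obtained from s.c is the part under discussion.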
 

Malcolm McLean

Flash Gordon said:
The drug dispenser reads a file on a regular basis to check what it should
be dispensing. At 3AM it comes across an over-length line and the program
aborts. The patient then does not get the drugs keeping him/her alive and
dies.

So by using a "safe" gets you have just made it impossible to safely
handle out-of-range input whereas it is easy to do with fgets.

The computer blows a fuse. The patient dies. No different to aborting with
an error message. All safety critical systems have to consider the
possibility of components failing. As long as they fail, it should be OK.
The danger comes when they work wrongly.
 

Flash Gordon

Malcolm McLean wrote, On 17/11/07 23:11:
The computer blows a fuse. The patient dies. No different to aborting
with an error message.

The big difference is that one can easily be avoided in SW by the simple
expedient of ignoring your advice and using fgets instead of gets.

All safety critical systems have to consider the
possibility of components failing.

They also have to minimise the probability of any given component
failing. Part of doing this is doing your best to avoid the SW failing,
and part of that is not using gets, another part is testing
out-of-bounds input which will catch incorrect usage of fgets.

As long as they fail, it should be
OK. The danger comes when they work wrongly.

Which is an excellent reason for not using gets.

I assume that as you did not object to my points about fgets being safe
to use in a safety critical application you agree that it is safe? If
not why did you snip my points without addressing them?
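
For concreteness, here is a rough sketch of the kind of fgets usage meant above: read a line, notice when it did not fit, discard the remainder, and let the caller decide what to do instead of aborting. The names read_line and enum line_status are invented for this example, and it assumes size fits in an int and that the input contains no embedded null characters.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum line_status { LINE_OK, LINE_TOO_LONG, LINE_EOF };

/* Hypothetical helper: read one line into buf (size bytes) with fgets.
   The trailing newline is stripped. An over-length line is reported
   and the rest of it is discarded, so the next call starts cleanly. */
static enum line_status read_line(FILE *in, char *buf, size_t size)
{
    size_t len;
    int ch;

    assert(in != NULL && buf != NULL && size >= 2);

    if (fgets(buf, (int)size, in) == NULL)
        return LINE_EOF;

    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return LINE_OK;
    }

    /* No newline stored: the line either ended exactly at the buffer
       limit, ended at end-of-file, or was too long. */
    ch = fgetc(in);
    if (ch == EOF || ch == '\n')
        return LINE_OK;

    while (ch != '\n' && ch != EOF)
        ch = fgetc(in);
    return LINE_TOO_LONG;
}
```

A caller that gets LINE_TOO_LONG can log the problem and carry on with the next line, which is exactly the out-of-range handling that gets cannot offer.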
 

CBFalconer

Harald van Dijk said:
gets is a standard library function. Standard library functions
need not be written in standard C, and may make use of highly
implementation-specific features.

If you disagree, please give even just a single example of an
implementation of fopen or longjmp written purely in standard C.

Need not, not must not. gets() (and any replacement) can be
written in standard C and fully satisfy all the specifications. As
evidenced by my ggets() and others.
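
To illustrate the claim (this is not the ggets() source, which is not shown in this thread), a line reader can be built on nothing but fgetc(), malloc() and realloc(). The name read_whole_line and the starting capacity of 16 are arbitrary choices for this sketch, and it does not guard against the capacity doubling overflowing size_t.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sketch: read one line of any length from in.
   Returns a malloc'd string without the newline, which the caller
   must free, or NULL at end-of-file with no data or on allocation
   failure. */
static char *read_whole_line(FILE *in)
{
    size_t cap = 16, len = 0;
    char *buf, *tmp;
    int ch;

    assert(in != NULL);

    buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    while ((ch = fgetc(in)) != EOF && ch != '\n') {
        if (len + 1 == cap) {           /* keep room for the '\0' */
            tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)ch;
    }

    if (ch == EOF && len == 0) {
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```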

I wish the library had been divided into things implementable
within the language, and things requiring extensions. Although
actually doing it might have been a horrendous task, and it is now
too late to try.
 

Keith Thompson

CBFalconer said:
Need not, not must not. gets() (and any replacement) can be
written in standard C and fully satisfy all the specifications. As
evidenced by my ggets() and others.

Yes, a conforming gets() can be written in standard C. A "safe" gets()
cannot, but in theory a "safe" gets *could* be written in non-standard
C, or in non-C (for example, as Malcolm suggests, using fat pointers to
avoid writing past the end of the buffer).

This is not a defense of gets(), just an observation that a safe gets()
might be theoretically possible in some circumstances.

I wish the library had been divided into things implementable
within the language, and things requiring extensions. Although
actually doing it might have been a horrendous task, and it is now
too late to try.

What would be the benefit of this division? Programmers don't really
need to know how standard library functions are implemented, or at least
shouldn't depend on such knowledge. Implementers can figure out easily
enough which functions can be implemented in standard C, but are allowed
to use non-standard C for greater efficiency, or for any other reason.
 

Malcolm McLean

CBFalconer said:
Need not, not must not. gets() (and any replacement) can be
written in standard C and fully satisfy all the specifications. As
evidenced by my ggets() and others.

On top of fgetc().

I wish the library had been divided into things implementable
within the language, and things requiring extensions. Although
actually doing it might have been a horrendous task, and it is now
too late to try.

The standard library fits into a small book. I don't think it would take
more than a few minutes to go through ticking each one.

assert (portable)
the isxxxx character macros (portable*)
tolower (portable*)
toupper (portable*)
math.h functions (portable but)
setjmp (compiler-specific)
longjmp (compiler-specific)
stdarg.h (compiler-specific)
stdio.h - all functions that take a FILE parameter need system calls, as do
those which take an implicit stdin / stdout.
sprintf, vsprintf - (portable with one niggle)
remove, rename, tmpnam - (system call)
tmpfile - interesting one. Needs either a malloc(0 call or access to the
filesystem.
atof, atoi, atol, strtod, strtol, strtoul (portable)
rand, srand (portable)
malloc family (in reality system, but theoretically portable except for a
niggle)
abort, exit (system / compiler-specific)
atexit - (portable)
system - (system)
getenv - (system)
bsearch, qsort - (portable)
abs, labs - (portable)
div, ldiv - (anyone heard of these? compiler-specific)
string functions - (portable, in reality compiler-specific)
clock - (system)
time - (system)
difftime, mktime, ctime, gmtime, localtime, strftime - (portable as long as
you know the internal time structure)

There, more or less done it.

The problem is the division doesn't really work. sqrt() was originally a
portable function. Now most larger machines have dedicated root-finding
hardware, which in practise you must use.
tolower and toupper, and the isxxxx macros can be written in C, but to
implement with any efficiency you need to know the execution character
encoding.
Some functions, like longjmp(), do not need to make system calls, but cannot
be implemented without an intimate knowledge of the compiler. sprintf() can
be written completely portably, except for the %p field. The string
functions can be portable, in reality you'd want to take advantage of the
alignment. malloc() realistically needs a system call on all but the
smallest machines that run only one program, but you can write using a
global arena, except that there is no cast iron way of ensuring correct
alignment in portable C.
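
A rough sketch of the global-arena idea, to show where the alignment niggle bites. The names arena_alloc, union arena_align and ARENA_BYTES are invented here. The union is only a best guess at the strictest alignment; nothing guarantees it covers every type on every implementation, which is the problem described above. This version never frees anything.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_BYTES 4096

/* Best-effort alignment unit: a union of types that are usually the
   most strictly aligned. This is an assumption, not a guarantee. */
union arena_align {
    long l;
    double d;
    long double ld;
    void *p;
    void (*fp)(void);
};

static union arena_align arena[ARENA_BYTES / sizeof(union arena_align)];
static size_t arena_used;   /* counted in units of union arena_align */

/* Hypothetical bump allocator over the static arena. Returns NULL
   when size is zero or the arena is exhausted. */
static void *arena_alloc(size_t size)
{
    size_t unit = sizeof(union arena_align);
    size_t total = sizeof arena / unit;
    size_t units;
    void *p;

    assert(arena_used <= total);
    if (size == 0)
        return NULL;

    units = size / unit + (size % unit != 0);   /* round up */
    if (units > total - arena_used)
        return NULL;

    p = &arena[arena_used];
    arena_used += units;
    return p;
}
```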
 

Richard Heathfield

Keith Thompson said:

This is not a defense of gets(), just an observation that a safe gets()
might be theoretically possible in some circumstances.

Not in portable code, since portable code cannot assume that it is only
ported to implementations where a buffer-protecting gets implementation
exists.

Programmers don't really
need to know how standard library functions are implemented, or at least
shouldn't depend on such knowledge.

Precisely my point. And yet they would have to depend on such knowledge if
they wished to use the proposed "safe gets".
 

Malcolm McLean

Richard Heathfield said:
Keith Thompson said:


Precisely my point. And yet they would have to depend on such knowledge
if they wished to use the proposed "safe gets".

You mean to say that you don't trust your compiler vendor to provide a safe
function, if such can be implemented?
 

CBFalconer

Keith said:
CBFalconer wrote:
.... snip ...


What would be the benefit of this division? Programmers don't
really need to know how standard library functions are implemented,
or at least shouldn't depend on such knowledge. Implementers can
figure out easily enough which functions can be implemented in
standard C, but are allowed to use non-standard C for greater
efficiency, or for any other reason.

Oh? Then why the frantic efforts right here on c.l.c to implement
sizeof? To cast pointers to integers? etc. Some of these
'possibilities' have fairly subtle impediments.
 

Richard Heathfield

Malcolm McLean said:
You mean to say that you don't trust your compiler vendor to provide a
safe function, if such can be implemented?

I trust my standard library supplier (which may or may not be the same
entity as that which supplies the compiler or interpreter) to recognise
that nobody with a brain uses gets anyway, so there's no point in his or
her wasting time implementing a safe version, and his or her time would be
better spent on something more productive.
 

Richard Tobin

CBFalconer said:
Oh? Then why the frantic efforts right here on c.l.c to implement
sizeof?

As far as I can tell, most recent questions about this seem to be the
result of one particular C programming course, in India.

-- Richard
 

Keith Thompson

Malcolm said:
You mean to say that you don't trust your compiler vendor to provide a
safe function, if such can be implemented?

Implementing a safe gets() would require "fat pointers", which would
have substantial performance effects on the entire implementation
(unless there's hardware support). A "safe" gets() is theoretically
possible, but not realistic. I trust my compiler vendor to be aware of
that -- and I trust myself not to use gets() in the first place.
 

Keith Thompson

CBFalconer said:
Oh? Then why the frantic efforts right here on c.l.c to implement
sizeof? To cast pointers to integers? etc. Some of these
'possibilities' have fairly subtle impediments.

Um, what does trying to re-implement the sizeof operator (which is a
fairly useless thing to do) have to do with implementing standard
library functions?

Sure, there are subtleties in the implementation of standard library
functions. I just don't see how dividing them as you suggest is helpful
or relevant.

What problem would such a division solve?
 

Flash Gordon

Malcolm McLean wrote, On 18/11/07 08:39:
On top of fgetc().

Or fgets. However, I accept your point.

The standard library fits into a small book. I don't think it would take
more than a few minutes to go through ticking each one.

assert (portable)
the isxxxx character macros (portable*)

No, they require system specific knowledge.

tolower (portable*)
toupper (portable*)
math.h functions (portable but)
setjmp (compiler-specific)
longjmp (compiler-specific)
stdarg.h (compiler-specific)
stdio.h - all functions that take a FILE parameter need system calls, as
do those which take an implicit stdin / stdout.

At least two of the IO functions need to use a system specific method,
but the rest can be built on top of the two selected. The system
specific method is not necessarily a system call (on DOS it could be
accessing the keyboard and video HW directly, this might or might not be
a *good* choice but it is possible).

sprintf, vsprintf - (portable with one niggle)
remove, rename, tmpnam - (system call)

Or other system specific method.

tmpfile - interesting one. Needs either a malloc(0 call or access to the
filesystem.

I don't see how malloc(0) helps. You need access to the file system
because you are opening a file that does not already exist so you need
to either get the OS to do it or make sure that the file you are
opening does not already exist.

atof, atoi, atol, strtod, strtol, strtoul (portable)
rand, srand (portable)
malloc family (in reality system, but theoretically portable except for
a niggle)
abort, exit (system / compiler-specific)
atexit - (portable)

Not really, it depends on how the startup code works.

system - (system)
getenv - (system)
bsearch, qsort - (portable)
abs, labs - (portable)
div, ldiv - (anyone heard of these? compiler-specific)

You could write them in standard C. It might not be the most efficient
method, but...
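
Something along these lines, as a sketch only. It assumes C99 or later, where / truncates toward zero; under C90 the result for negative operands was implementation-defined and would need adjusting. The name my_div is made up, and like div itself it leaves INT_MIN / -1 undefined.

```c
#include <assert.h>
#include <stdlib.h>   /* div_t */

/* Hypothetical stand-in for div(), written with the ordinary
   operators. Relies on C99 truncation toward zero. */
static div_t my_div(int numer, int denom)
{
    div_t r;

    assert(denom != 0);
    r.quot = numer / denom;
    r.rem = numer % denom;
    return r;
}
```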
string functions - (portable, in reality compiler-specific)
clock - (system)
time - (system)
difftime, mktime, ctime, gmtime, localtime, strftime - (portable as long
as you know the internal time structure)

If you need to know the internal structure then it is not portable.

There, more or less done it.

Apart from the bits others disagree with, which shows it is not easy.

The problem is the division doesn't really work. sqrt() was originally a
portable function. Now most larger machines have dedicated root-finding
hardware, which in practise you must use.

That has applied to a number of the maths functions on some
implementations for a lot of years.

tolower and toupper, and the isxxxx macros can be written in C, but to
implement with any efficiency you need to know the execution character
encoding.

Others in the group need knowledge of the execution character set to
implement. For example, how do you implement isprint without knowing
which characters are printable?

Some functions, like longjmp(), do not need to make system calls, but
cannot be implemented without an intimate knowledge of the compiler.
sprintf() can be written completely portably, except for the %p field.

You can implement %p portably, you just write the pointer a byte at a
time for sizeof(void*) bytes.
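
Roughly like this, as a sketch. The function name format_pointer is invented, the bytes come out in memory order (so the text looks reversed on a little-endian machine), and the buffer-size check assumes CHAR_BIT is 8 so that each byte prints as exactly two hex digits.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical %p substitute: write the object representation of the
   pointer p into out as hex digits, one byte at a time. Returns the
   number of characters written, or -1 if out is too small. */
static int format_pointer(char *out, size_t outsize, const void *p)
{
    unsigned char bytes[sizeof p];
    size_t i;
    int n = 0;

    assert(out != NULL);
    if (outsize < 2 * sizeof p + 1)     /* assumes CHAR_BIT == 8 */
        return -1;

    memcpy(bytes, &p, sizeof p);
    for (i = 0; i < sizeof p; i++)
        n += sprintf(out + n, "%02x", (unsigned)bytes[i]);
    return n;
}
```

The standard leaves the output format of %p implementation-defined, so a dump of the representation like this is a permitted choice, if not a pretty one.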
The string functions can be portable, in reality you'd want to take
advantage of the alignment. malloc() realistically needs a system call
on all but the smallest machines that run only one program, but you can
write using a global arena, except that there is no cast iron way of
ensuring correct alignment in portable C.

I agree with what I believe your main point is, i.e. that trying to
split the functions in to ones that can be implemented portably and ones
that can't is pointless. My disputes of some of your categorisations
just show that it is not as easy to do as some people might think.
 

Malcolm McLean

Richard Tobin said:
As far as I can tell, most recent questions about this seem to be the
result of one particular C programming course, in India.
I'd expect an Indian to ask why sizeof(void) does not equal zero.
 

Richard Bos

CJ said:
This sort of absolute prohibition on gets() is completely wrong-headed.

No, it is the only sane attitude.

It's completely fine to use gets(), as long as you use it properly. To
use it properly, *you* need to be in control of the data that gets()
reads.

In other words, you need to have
- a lock on the door
- no network connection
- bondage gear for all your cow-orkers
and you yourself need to be a _perfect_ typist. A single typo could undo
you.

Richard
 

Tor Rustad

Keith said:
No, he doesn't. You're asking for more than Malcolm claimed.

Malcolm didn't claim that it could be made safe within the guarantees
provided by the C standard. His claim is a much more modest one,
that it's possible for a (hypothetical) C implementation to provide a
"safe" gets() function, and I believe he's correct.

I don't think so.

His solution requires the use of "fat pointers", which are not

Methinks, fat pointers break pointer arithmetic and thus require at
least a new language dialect.

Also, the buffer passed to gets() may not be malloc'ed, but can be an
array, or even a sub-array.

I believe Malcolm's claim as stated is correct. It's not particularly
useful, but he didn't claim that it was; I believe it was merely an
intellectual exercise, not a serious proposal.

I can't see how Malcolm's claim can be correct; the only way is IF
the implementation restricts gets() buffer writes to some hard upper
limit, let's say one less than MAX_GETS_WRITE. Then

char buf[MAX_GETS_WRITE];

gets(buf);

would be safe.
 

jameskuyper

Tor said:
Keith Thompson wrote: ....

Methinks, fat pointers break pointer arithmetic and thus require at
least a new language dialect.

I don't believe that's the case; could you explain what breakage you
would expect?
 

Kenny McCormack

I don't believe that's the case; could you explain what breakage you
would expect?

While I agree with the sentiment behind Malcolm's idea (and I'm of the
opinion that most of the naysaying in this thread is of the usual
"usual negative comments" variety as noted by Jacob - i.e., the usual
nonsense), I do see this as being a bit difficult. It boils down to: Is
it possible to keep enough information in the system so that we can
know, for any possible pointer and/or pointer value, how much valid
memory there is after that pointer?

I can't think of any counter-examples off-hand, but that doesn't mean
there aren't any.
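
To make the question concrete, here is a toy model of the three-value pointer discussed earlier in the thread (address, start of object, end of object), written as an ordinary struct. Everything here is hypothetical: fat_ptr, fat_from_array, fat_add and fat_gets are invented names, and a real implementation would carry this information inside the compiler's own pointer representation rather than in user code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Toy fat pointer: current address plus the bounds of the object. */
struct fat_ptr {
    char *addr;
    char *start;
    char *end;    /* one past the last valid byte */
};

static struct fat_ptr fat_from_array(char *array, size_t size)
{
    struct fat_ptr p;

    p.addr = array;
    p.start = array;
    p.end = array + size;
    return p;
}

/* Pointer arithmetic keeps the bounds and only moves addr. */
static struct fat_ptr fat_add(struct fat_ptr p, size_t n)
{
    assert(n <= (size_t)(p.end - p.addr));
    p.addr += n;
    return p;
}

/* A gets-like reader that never writes at or beyond p.end. Excess
   characters on the line are read and dropped. */
static char *fat_gets(struct fat_ptr p, FILE *in)
{
    size_t room = (size_t)(p.end - p.addr);
    size_t n = 0;
    int ch, got_any = 0;

    if (room == 0)
        return NULL;
    while ((ch = fgetc(in)) != EOF && ch != '\n') {
        got_any = 1;
        if (n + 1 < room)
            p.addr[n++] = (char)ch;
    }
    if (ch == EOF && !got_any)
        return NULL;
    p.addr[n] = '\0';
    return p.addr;
}
```

The sub-array case is where it gets interesting: the bounds are whatever fat_from_array was told, so the hard part is deciding what "the object" is at each conversion, not the bookkeeping itself.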
 
