Why has "gets" not been deprecated yet?

Marcus

We all know that the "gets" function from the Standard C Library (which
is part of the Standard C++ Library) is dangerous. It provides no
bounds check, so it's easy to overwrite memory when using it, and
impossible to guarantee that it won't happen.

Therefore, I think it's surprising that this function has not been
deprecated.
The C++98 Standard keeps it from the C89 standard.
The C99 Standard has kept it :-o.

Now, the C standard committee is working on safe functions (the ones
that end with "_s") for the C Standard Library. I don't know if they
are going to deprecate the dreaded "gets". Even if not, I think it
would be a good idea to deprecate it in the next C++ standard, since
C++ has better ways to accomplish the same task (getline). It's too
early to expect the safe functions (*_s) in the C++ Standard, but
getting rid of "gets" is not that hard, is it? Programs that use it
are broken anyway. Also, C++ has deprecated other features from C just
because C++ has better alternatives (static meaning "internal linkage"
and headers ending in ".h").
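
A minimal sketch of the getline route mentioned above, assuming the input
arrives on a std::istream such as std::cin. The names read_lines and
read_lines_from are made up for illustration and are not part of any library.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collects every line of `in`. std::getline grows the
// std::string as needed, so there is no caller-sized buffer to overrun.
std::vector<std::string> read_lines(std::istream& in) {
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        lines.push_back(line);
    }
    return lines;
}

// Convenience wrapper for trying the helper on an in-memory string
// instead of std::cin.
std::vector<std::string> read_lines_from(const std::string& text) {
    std::istringstream in(text);
    return read_lines(in);
}
```

Calling read_lines(std::cin) would read standard input the same way. The
line length is limited only by available memory, not by a fixed array.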

Opinions? Should this message be posted on comp.std.c++?
 
Josh Mcfarlane

Marcus said:
Opinions? Should this message be posted on comp.std.c++?

You probably want to post it over there, as people on here generally
focus more on application, and less on changing / debating the
standard.

Be careful with the assumption that all things using gets are
inherently flawed, however. =P
 
Ron Natalie

Many functions in the C library have undefined behavior when given
arguments outside their range. They are by and large a piece of
inherited crap that should have never received standard status (the
STDIO part of the library is the most misdesigned malodorous thing
ever foisted on the community, it was derived from a misnamed
piece of crap from an ancient UNIX project called the "portable
IO library").
 
Josh Mcfarlane

Ron said:
Many functions in the C library have undefined behavior when given
arguments outside their range. They are by and large a piece of
inherited crap that should have never received standard status (the
STDIO part of the library is the most misdesigned malodorous thing
ever foisted on the community, it was derived from a misnamed
piece of crap from an ancient UNIX project called the "portable
IO library").

Well, ya, my point was, if you can confine yourself to arguments within
their range, they do function (at least to my knowledge). Good? No, but
still functional.

Anywho, let's go throw this at the std people and see if it can get any
support.
 
Gaijinco

Ron said:
Many functions in the C library have undefined behavior when given
arguments outside their range. They are by and large a piece of
inherited crap that should have never received standard status (the
STDIO part of the library is the most misdesigned malodorous thing
ever foisted on the community, it was derived from a misnamed
piece of crap from an ancient UNIX project called the "portable
IO library").

Wow! I had never heard about that. Can you explain a little more about
what the problems of <stdio.h> are?
 
Ron Natalie

Gaijinco said:
Wow! I had never heard about that. Can you explain a little more about
what the problems of <stdio.h> are?

Functions like gets have no provisions for safety.
The functions take their arguments in different orders: some
of them have the file stream arg first, some last.
fwrite/fread have a number-of-records argument and a record-size argument
that nobody knows what to do with other than multiply together.
It just goes on from there; the library is crap.
 
tony_in_da_uk

This propensity for undefined behaviour is an example of Design by
Contract (DbC): you meet the preconditions, and you get the contracted
behaviour. The philosophy says: if you stuff up, and fail to pick it
up in your testing, it's your fault and you're a pathetic excuse for a
programmer, (and probably a human being). Anyway, the point is that
DbC can work, but you have to guarantee the preconditions. For gets,
they're extreme: if you know that standard input necessarily sends
lines below a certain length, then you can use it. This is probably
only the case when standard input is coming from some other source that
you control. For example, you might write a filter that works on some
fixed-length records, and is designed to be used in a pipeline ala
(UNIX) "cat file | filter" or (DOS) "type file | filter". Who's to say
that you don't know what you're doing well enough to guarantee the line
length precondition? It's your own call whether you use it.

FWIW, I dislike DbC and agree that gets should hardly ever be used,
would happily consider that it should never be used in new code, but
wouldn't go to the extent of saying that it must never be used and it's
worth breaking existing code using it. More generally, the stdio
library has proven itself a well-designed bit of work, in that while
its error-proneness has been the cause of innumerable errors, its
concision, usability and flexibility have supported innumerable systems
that do useful work. If you think you can write better in C, go ahead
and see if anyone wants to use your creations.... One of the
compromises of C++ is that it should overwhelmingly be a superset of C,
with benefits in porting, skills transfer, etc.

Tony
 
Rolf Magnus

Josh said:
Well, ya, my point was, if you can confine yourself to arguments within
their range, they do function (at least to my knowledge). Good? No, but
still functional.

The problem with gets is that there is no way for the program to provide
arguments that are really 100% safe. gets will produce a buffer overflow if
the buffer you provided isn't large enough for the incoming data. There is
no (portable) way to make the buffer big enough in every case, since the
program can't control the amount of data that is read. This lack of control
leads me to the conclusion that gets() can be seen as generally invoking
undefined behavior.
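
To make the contrast concrete, here is a hedged sketch of a bounded read,
where the buffer size travels with the buffer (fgets plays the same role in
C). The helpers read_fixed and first_line_fits_in_8 are hypothetical names
chosen for illustration, and the 8-char buffer is an arbitrary size.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads one line into a fixed buffer of N chars.
// istream::getline is told the buffer size, stores at most N - 1 characters
// plus a terminating '\0', and sets failbit when the line does not fit.
template <std::size_t N>
bool read_fixed(std::istream& in, char (&buf)[N]) {
    in.getline(buf, static_cast<std::streamsize>(N));
    return static_cast<bool>(in);
}

// Tries the helper on an in-memory string instead of std::cin.
// Returns true only if the first line of `text` fits in an 8-char buffer.
bool first_line_fits_in_8(const std::string& text) {
    std::istringstream in(text);
    char buf[8];
    return read_fixed(in, buf);
}
```

An over-long line becomes a failure the program can test for, rather than a
write past the end of the array.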
 
Pete Becker

Rolf said:
The problem with gets is that there is no way for the program to provide
arguments that are really 100% safe. gets will produce a buffer overflow if
the buffer you provided isn't large enough for the incoming data. There is
no (portable) way to make the buffer big enough in every case, since the
program can't control the amount of data that is read. This lack of control
leads me to the conclusion that gets() can be seen as generally invoking
undefined behavior.

That's too broad. The behavior of gets is undefined if the input in fact
is too large for the buffer. If it isn't, the behavior is well defined.

That's not a comment on its utility, but on how to apply technical terms.
 
Neil Cerutti

tony_in_da_uk said:
This propensity for undefined behaviour is an example of Design
by Contract (DbC): you meet the preconditions, and you get the
contracted behaviour. The philosophy says: if you stuff up,
and fail to pick it up in your testing, it's your fault and
you're a pathetic excuse for a programmer, (and probably a
human being). Anyway, the point is that DbC can work, but you
have to guarantee the preconditions. For gets, they're
extreme: if you know that standard input necessarily sends
lines below a certain length, then you can use it. This is
probably only the case when standard input is coming from some
other source that you control. For example, you might write a
filter that works on some fixed-length records, and is designed
to be used in a pipeline ala (UNIX) "cat file | filter" or
(DOS) "type file | filter". Who's to say that you don't know
what you're doing well enough to guarantee the line length
precondition?

Crackers.
 
Ian Malone

Ron said:
fwrite/fread have a number of records and record size number
that nobody knows what to do with other than multiply together.
It just goes on from there; the library is crap.

fwrite and fread return the number of objects written or read,
not the number of chars. But in general you may as well
use <iostream> and friends.
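
A small sketch of what that return value means in practice, assuming a
made-up fixed-size Record type and a helper named read_records (neither
comes from the thread).

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical fixed-size record, used only for illustration.
struct Record {
    int id;
    double value;
};

// Reads up to `max` records from `fp`. The value returned by fread counts
// complete objects of sizeof(Record) bytes, not bytes, so a trailing
// partial record is not included in the count.
std::size_t read_records(std::FILE* fp, Record* out, std::size_t max) {
    return std::fread(out, sizeof(Record), max, fp);
}
```

If the file holds two whole records followed by a few stray bytes, the call
reports two, whereas a call with size 1 and a byte count would report the
number of bytes read.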
 
tony_in_da_uk

Consider: someone writes two programs that share a header file
containing a buffer-size constant. In one program, lines are generated
and checked against this maximum length. The other program defines a
buffer based on this length, but uses gets(). The two programs may be
reasonably well synchronised, in that a change to the header triggers
rebuilds of both. Just hope they're distributed together too! This is
arguably in line with a workable (but deeply unappealing to me) DbC
philosophy. I can't say it's crackers, even though I'd like to be able
to! - Tony
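
A rough sketch of that arrangement, with everything in one file for brevity.
The constant kMaxLine, its value of 80, and the names fits_contract and
LineBuffer are all hypothetical; in the scenario described they would live
in the shared header and the two separate programs.

```cpp
#include <cstddef>
#include <string>

// Stands in for the shared header. kMaxLine is a hypothetical name and 80 an
// arbitrary value; the count includes the terminating '\0'.
constexpr std::size_t kMaxLine = 80;

// Producer side: only emit lines the consumer's buffer can hold.
bool fits_contract(const std::string& line) {
    return line.size() + 1 <= kMaxLine;  // +1 for the terminator
}

// Consumer side: the buffer is sized from the same constant, so an unchecked
// read into it is exactly as safe as the producer's check and no safer.
struct LineBuffer {
    char data[kMaxLine];
};
```

The precondition holds only while both sides are built from the same value
of the constant and nothing else writes to the consumer's input.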
 
Ron Natalie

Ian said:
fwrite and fread return the number of objects written or read,
not the number of chars. But in general you may as well
use <iostream> and friends.

Yeah, so? But there is no concept of reading anything other
than chars from the stream. All the function does is multiply
those two args together and divide by the size on return.
It's a stupid design.
 
Pete Becker

Rolf said:
Ok, let's apply technical terms then:
According to the C++ standard, UB is "behavior, such as might arise upon use
of an erroneous program construct or erroneous data, for which this
International Standard imposes no requirements". Applying that to your
sentence above, that means that my program has "an erroneous program
construct or erroneous data" if the input is too large, and is correct if
the input fits in the provided space. But my program can't control whether
the input fits or not. It can control the size of the buffer, but not the
amount of data coming in, so it doesn't have any way of ensuring the
well-defined behavior that you are writing about.

That's correct.

Rolf said:
It's as if you say "the behavior is well-defined only on a full moon".

No, it's not. Not being able to control input is not the same as input
always being ill-formed. For a quick and dirty one-off command line
utility I'd have no qualms about using gets.
 
Rolf Magnus

Pete said:
That's too broad. The behavior of gets is undefined if the input in fact
is too large for the buffer. If it isn't, the behavior is well defined.

However, the C++ standard does not specify how large the input is or may be,
and there is no way for the program to know it, so the "if it isn't, the
behavior is well defined" part is of no relevance for my program. I must
assume that the input may be too large, no matter what my program does.

Pete said:
That's not a comment on its utility, but on how to apply technical terms.

Ok, let's apply technical terms then:
According to the C++ standard, UB is "behavior, such as might arise upon use
of an erroneous program construct or erroneous data, for which this
International Standard imposes no requirements". Applying that to your
sentence above, that means that my program has "an erroneous program
construct or erroneous data" if the input is too large, and is correct if
the input fits in the provided space. But my program can't control whether
the input fits or not. It can control the size of the buffer, but not the
amount of data coming in, so it doesn't have any way of ensuring the
well-defined behavior that you are writing about.
It's as if you say "the behavior is well-defined only on a full moon".
 
Mike Wahler

tony_in_da_uk said:
This propensity for undefined behaviour is an example of Design by
Contract (DbC): you meet the preconditions, and you get the contracted
behaviour. The philosophy says: if you stuff up, and fail to pick it
up in your testing, it's your fault and you're a pathetic excuse for a
programmer, (and probably a human being). Anyway, the point is that
DbC can work, but you have to guarantee the preconditions. For gets,
they're extreme: if you know that standard input necessarily sends
lines below a certain length, then you can use it. This is probably
only the case when standard input is coming from some other source that
you control. For example, you might write a filter that works on some
fixed-length records, and is designed to be used in a pipeline ala
(UNIX) "cat file | filter" or (DOS) "type file | filter". Who's to say
that you don't know what you're doing well enough to guarantee the line
length precondition? It's your own call whether you use it.

Who's to say that an input stream with a 'guaranteed' limit of
'record size', did not get corrupted by some outside influence,
rendering the 'guarantee' spurious? I've actually had to deal
with this issue in the real world (receiving data over an RS232
line, subject to occasional 'noise'). My program was not able
to make *any* assumptions about the expected data stream.

'Knowing what I was doing', I knew that such 'guarantee' was
impossible to implement. 'Knowing what I was doing' meant that
it was my program's responsibility to deal with 'dirty' data
in a safe manner (e.g. discarding it, or perhaps re-acquiring it).
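
One way the "discard it" option might look, as a sketch only: the record
limit kMaxRecord and the helper names are invented here, and a real serial
link would need its own framing and error checks on top of this.

```cpp
#include <cstddef>
#include <istream>
#include <limits>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record limit, chosen only for illustration.
constexpr std::size_t kMaxRecord = 16;

// Collects the records that fit in the buffer. An over-long (possibly
// corrupted) line is discarded up to its newline, so reading resumes at
// the start of the next record instead of overrunning the buffer.
std::vector<std::string> read_clean_records(std::istream& in) {
    std::vector<std::string> records;
    char buf[kMaxRecord];
    for (;;) {
        in.getline(buf, static_cast<std::streamsize>(sizeof buf));
        if (in) {
            records.push_back(buf);
        } else if (in.eof() || in.bad()) {
            break;  // no more input, or an unrecoverable stream error
        } else {
            in.clear();  // line too long: drop the rest of it
            in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
        }
    }
    return records;
}

// Convenience wrapper for trying the helper on an in-memory string.
std::vector<std::string> clean_records_from(const std::string& text) {
    std::istringstream in(text);
    return read_clean_records(in);
}
```

Re-acquiring the data instead of dropping it would replace the ignore call
with whatever retry mechanism the data source offers.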


-Mike
 
Josh Mcfarlane

Rolf said:
It's always potentially being ill-formed.

That's like saying that every value that you set is potentially
invalid. If I have a set output from another digital source on the
machine, it COULD be ill-formed if the machine doesn't work as it's
supposed to, just as a pointer set to a certain object could randomly
change if the OS doesn't operate as it is supposed to and overwrites
that segment of memory.
 
Rolf Magnus

Pete said:
That's correct.

So, you think the correctness of a C++ program can depend on what the user
enters at run-time?

Pete said:
No, it's not. Not being able to control input is not the same as input
always being ill-formed.

It's always potentially being ill-formed.

Pete said:
For a quick and dirty one-off command line utility I'd have no qualms
about using gets.

That way of thinking is the reason for quite a lot of security holes.
 
