scanf (yes/no) - doesn't work + deprecation errors scanf, fopen etc.

  • Thread starter Martin Jørgensen

Jordan Abel

Could someone explain to me why gets() is worse than any other function
that can't itself limit its input/working space to the size of the
buffer provided? e.g. scanf, strtok. Or are all these functions now
moved and "non standard"?

scanf can be limited, %40s and such.

strtok doesn't take a buffer.

The length needed by sprintf [which you omitted] can in principle be
calculated in advance for a given format string, or, in case of %s,
given the format string and arguments.
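
The point about sprintf can be sketched in C99 terms: calling snprintf with a null buffer and a size of 0 writes nothing and returns the length the output would need, so the buffer can be sized before anything is written. The helper name format_greeting and its format string are made up for illustration; they are not from the thread.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: formats "name=<name> id=<id>" into a heap buffer
 * sized exactly for the result. The caller frees the returned string. */
char *format_greeting(const char *name, int id)
{
    /* C99: with a null buffer and size 0, snprintf writes nothing and
     * returns the number of characters the output would need. */
    int needed = snprintf(NULL, 0, "name=%s id=%d", name, id);
    if (needed < 0)
        return NULL;                          /* encoding error */

    char *buf = malloc((size_t)needed + 1);   /* +1 for the terminating null */
    if (buf == NULL)
        return NULL;

    snprintf(buf, (size_t)needed + 1, "name=%s id=%d", name, id);
    return buf;
}
```

A caller would write something like `char *s = format_greeting("abel", 42);` and free `s` when done. This relies on C99 snprintf behaviour; some pre-C99 libraries returned a different value when the output did not fit.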
 

Jordan Abel

I was referring more to the fact it's buggy in a reentrant system
since its internal data buffer is static. In addition it can continue
to read over "illegal" memory if the token isn't found before the end
of the passed-in string buffer

Not if it's passed a string rather than an arbitrary char * [it will
stop at the null terminator, the same as _any_ str... function]
in the same way almost virtually any C func does. Don't know why it
cropped into my head.

But anyway, regardless, why the pressure on the other function which
is undoubtedly in hundreds of thousands of lines of legacy code?

gets is in a class by itself in that it _cannot_ be made safe with any
amount of effort. [strtok is safe if you use a lock for it or only use
it with single-threaded programs, and of course don't call it from any
libraries that might be called while you're in the middle of using it on
something]
 

Richard G. Riley

I was referring more to the fact it's buggy in a reentrant system
since its internal data buffer is static. In addition it can continue
to read over "illegal" memory if the token isn't found before the end
of the passed-in string buffer

Not if it's passed a string rather than an arbitrary char * [it will
stop at the null terminator, the same as _any_ str... function]

Clearly. I was referring to dodgy data since we are talking dodgy
data. It could include a poorly written input data parser.
in the same way almost virtually any C func does. Don't know why it
cropped into my head.

But anyway, regardless, why the pressure on the other function which
is undoubtedly in hundreds of thousands of lines of legacy code?

gets is in a class by itself in that it _cannot_ be made safe with any
amount of effort. [strtok is safe if you use a lock for it or only use
it with single-threaded programs, and of course don't call it from any
libraries that might be called while you're in the middle of using it on
something]

The main objection being that the user can type over the buffer?
Ok. Simple answer.
 

Michael Wojcik

[regarding placing constants on the LHS of ==]

It is also far more natural to read.

Since C source code is not found in nature, and reading is learned
behavior, this seems extremely dubious.

If you believe this then there really is no point discussing
further.

I'm not sure there's any point in further discussion on this topic -
without actual evidence being presented - regardless of what I
believe. However...
We read from left to right here and in C code and it has been
the convention from day one in 99% of C code in industry.

I read "if (5 == a)" from left to right. How do you read it?

Nothing in C makes the order of the operands of the == operator
significant. I don't see anything in English, or any other natural
language, that does so either (and I have a considerable academic
background in the study of English); nor do I see why it would
matter if it did.
Are you trolling?

No, and I don't believe I've written anything here to suggest that
I am. What could your point possibly be here?
The thread had moved to that.

I'm sorry; I must have missed the solicitation of your personal
opinion on matters of style in a previous post. Care to cite it?
What is your contribution here other
than to be purposely obnoxious and trying to put forward some Mickey
Mouse intellectual hyperbole about how "natural reading of C" doesn't exist?

Next time, try reading for comprehension.

My contribution, as ought to be obvious, was to note that this
discussion has been enjoined many times, and rarely are any facts
presented.

My comments regarding natural readings contain no instances of the
rhetorical trope hyperbole; they are precise and correct. Perhaps
you do not know what the word "hyperbole" means, or perhaps you are
simply flailing about because you do not have a real argument.

If you dislike intellectual points, I'd suggest that comp.lang.c is
a poor venue for your tastes; no doubt lower-brow newsgroups are
available.
How flamebait. I was supporting a point.

Offering your personal opinion does not constitute an argument, and
so does not support any claim.
As you appear to have proven.

Thank you.
There were no "religious" arguments :

This *is* a religious topic, as numerous threads on comp.lang.c have
demonstrated. You can't wish that away.
only preferred methods with reasons to support them.

"an abomination IMO" is not a supporting argument.
 

Richard G. Riley

Thank you.

I strongly disagree with your assertions. However, I can now see it is
a religious thing, so, in hindsight shouldn't have argued the
point. Thanks for remaining civil
 

kuyper

Richard said:
Could someone explain to me why gets() is worse than any other function
that can't itself limit its input/working space to the size of the
buffer provided? e.g. scanf, strtok. Or are all these functions now
moved and "non standard"?

Of course I see the issues with it : I just wonder why it is being
picked on.

With most of the other library functions, you can control what the
inputs are from within the program before you pass them to that
function. Most of the possible problems you're referring to can be
avoided by null-terminating the string, or by choosing the version of a
function that takes a maximum size argument. scanf() is an exception to
this, in the same way that gets() is, because they read a number of
bytes from the standard input which is not always under the program's
control. The tricky directives are %s and %[, because those write to a
piece of memory whose size is determined by the inputs. However, they
can both be controlled by including a maximum field width. You have no
such control over gets().
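
A minimal sketch of the maximum field width described above, using sscanf on a string so it can be tried without typing input; the same conversion specifications apply to scanf. The function names and the 8-byte buffers are assumptions chosen for the example.

```c
#include <stdio.h>

/* Reads one whitespace-delimited word of at most 7 characters from src
 * into out, which must have room for 8 bytes (7 plus the terminating null).
 * Returns 1 if a word was stored, 0 otherwise. */
int read_word7(const char *src, char out[8])
{
    /* The width counts characters stored, not the terminating null,
     * so it must be one less than the buffer size. */
    return sscanf(src, "%7s", out) == 1;
}

/* Same idea for %[ : stores at most 7 characters, stopping at a comma. */
int read_field7(const char *src, char out[8])
{
    return sscanf(src, "%7[^,]", out) == 1;
}
```

Note that the width is a literal inside the format string, so it has to be kept in step with the buffer size by hand; that is the awkwardness people complain about.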
 

Keith Thompson

Richard Bos wrote:
...

And even after gets() is removed from the standard, some new code will
continue to use gets(). That will be possible, because few
implementations can afford the consequences of failing to continue
supporting it as an extension to the standard, if only because of
legacy code. I'm not disagreeing with the goal; I'm just pointing out
that relief from the problems caused by gets() will not occur until
several decades after release of the first version of the standard from
which it is removed.

Yes, which implies that the sooner we start to get rid of it, the
sooner it will be gone.
 

Flash Gordon

Richard said:
What do you mean about strtok?
If you call strtok with two string pointers as arguments,
you have defined behavior. Teh Enb.

I was referring more to the fact it's buggy in a reentrant system
since its internal data buffer is static. In addition it can continue
to read over "illegal" memory if the token isn't found before the end
of the passed-in string buffer
Not if it's passed a string rather than an arbitrary char * [it will
stop at the null terminator, the same as _any_ str... function]

Clearly. I was referring to dodgy data since we are talking dodgy
data. It could include a poorly written input data parser.

The point is you can write a non-dodgy input parser to protect all the
functions other than gets. With gets, however good the rest of your code
is, even if you achieve the impossible of the rest being perfect, gets
can *still* overrun the buffer.
in the same way almost virtually any C func does. Don't know why it
cropped into my head.

But anyway, regardless, why the pressure on the other function which
is undoubtedly in hundreds of thousands of lines of legacy code?
gets is in a class by itself in that it _cannot_ be made safe with any
amount of effort. [strtok is safe if you use a lock for it or only use
it with single-threaded programs, and of course don't call it from any
libraries that might be called while you're in the middle of using it on
something]

The main objection being that the user can type over the buffer?
Ok. Simple answer.

No, the objection is the programmer can do absolutely *nothing* to
prevent the buffer being overflowed. With any other function, the
programmer *can* protect against bad data.
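
To make the contrast concrete, here is a sketch of a bounded line reader built on fgets, which takes the buffer size that gets has no way to receive. The name read_line and its return convention are assumptions made for this example, not anything from the standard library.

```c
#include <stdio.h>
#include <string.h>

/* Reads one line from fp into buf (capacity size, size > 0).
 * Returns 1 if a whole line was stored (newline stripped),
 * 0 if the line was too long and was truncated (the rest is discarded),
 * -1 on end of file or read error with nothing read. */
int read_line(FILE *fp, char *buf, size_t size)
{
    if (fgets(buf, (int)size, fp) == NULL)
        return -1;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return 1;
    }

    /* No newline stored: either the line was longer than the buffer or the
     * input ended without a newline. Discard up to the next newline. */
    int c, extra = 0;
    while ((c = getc(fp)) != '\n' && c != EOF)
        extra = 1;
    return extra ? 0 : 1;
}
```

A call such as `read_line(stdin, buf, sizeof buf)` can never write past buf, however much the user types; the worst case is a truncated line that the caller can detect and reject.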
--
Flash Gordon
Living in interesting times.
Web site - http://home.flash-gordon.me.uk/
comp.lang.c posting guidlines and intro -
http://clc-wiki.net/wiki/Intro_to_clc
 

Micah Cowan

Richard G. Riley said:
Could someone explain to me why gets() is worse than any other function
that can't itself limit its input/working space to the size of the
buffer provided? e.g. scanf, strtok. Or are all these functions now
moved and "non standard"?

Of course I see the issues with it : I just wonder why it is being
picked on.

scanf() can be made to limit its input space, it's just that the
mechanism for doing it is hardly practical (consider scanf("%10s", buf),
which will store up to a maximum of 10 characters plus the terminating
null). strtok() does not have any buffer-overflow problems; it just has
the poor taste to modify the string given it: but occasionally that's okay.

gets() is the only standard function that is /impossible/ to use
correctly. Even atoi() can be used safely in some few situations
(where one has control or certain knowledge about the input string),
and at least there are no existing implementations which do anything
dangerous when overflows occur in atoi(). But there is no
safe implementation of gets().
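
For the atoi() case, where one lacks that certain knowledge about the input string, a checked conversion can be sketched with strtol. Note that strtol is swapped in here and is not mentioned in the post; the helper name parse_int is hypothetical.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: converts text to int with full checking.
 * Returns 1 and stores the value in *out on success, 0 otherwise. */
int parse_int(const char *text, int *out)
{
    char *end;
    errno = 0;
    long value = strtol(text, &end, 10);

    if (end == text || *end != '\0')
        return 0;               /* no digits, or trailing junk */
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return 0;               /* out of range for int */

    *out = (int)value;
    return 1;
}
```

Unlike atoi(), whose behaviour is undefined when the value cannot be represented, this reports the failure to the caller.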

-Micah
 

Douglas A. Gwyn

Richard G. Riley said:
Of course I see the issues with it : I just wonder why it is being
picked on.

Because it's trendy.

Buffer overruns aren't caused by gets; indeed most of the ones
I've seen reported did not involve gets at all. People tend to
look for simplistic solutions to problems rather than for
correct solutions, especially when correct solutions require
educating people.
 

Douglas A. Gwyn

Flash said:
The point is you can write a non-dodgy input parser to protect all the
functions other than gets. With gets, however good the rest of your code
is, even if you achieve the impossible of the rest being perfect, gets
can *still* overrun the buffer.
...
No, the objection is the programmer can do absolutely *nothing* to
prevent the buffer being overflowed. With any other function, the
programmer *can* protect against bad data.

Safe input parser components of systems almost certainly
will use only a subset of the other standard functions.
That doesn't mean that the other functions should be
deprecated either. Basically, if {X} does something
harmful, where {X} could be any of zillions of things,
then: don't do {X}. Or, more specifically, if doing {Y}
is harmful in context {Z} then don't do {Y} in context
{Z}. E.g., I don't use gets() in any context where I
don't have sufficient control over its input, nor should
anyone else. That doesn't mean that it should *never*
be used. And whether or not it is ever used in newly
written code, as a legacy interface it needs an available
specification.

The tendency to try to force somebody's idea of desired
behavior upon everybody by legislating it, is a bankrupt
idea the infeasibility of which should be apparent. If
you want programs to operate correctly, you won't get
that, nor even a significant step toward that, by any
amount of tweaking language/library specifications.
Those are just general-purpose tools with no inherent
notion of what the application needs to be doing.
 

Keith Thompson

Douglas A. Gwyn said:
Because it's trendy.

It's been "trendy" for as long as I can remember.
Buffer overruns aren't caused by gets; indeed most of the ones
I've seen reported did not involve gets at all. People tend to
look for simplistic solutions to problems rather than for
correct solutions, especially when correct solutions require
educating people.

Nobody is claiming that deprecating gets() would avoid all buffer
overruns, or even a majority of them. The claim is that gets() is
inherently dangerous in a way that no other function in the C standard
library is.

Both gets() and strdup() were widely implemented and used. gets() was
standardized, but strdup() wasn't. Why is that?
 

Flash Gordon

Douglas said:
Because it's trendy.

Buffer overruns aren't caused by gets; indeed most of the ones
^^^^
I've seen reported did not involve gets at all.

If *any* buffer overruns have involved a user overrunning a buffer by
entering more data than fits in the buffer passed to gets then I would
say that is proof that the use of gets *does* cause buffer overruns.
People tend to
look for simplistic solutions to problems rather than for
correct solutions, especially when correct solutions require
educating people.

By this argument no simple solution should *ever* be used, so do you
think the laws requiring seat belts be worn in cars should be abolished
since those are simplistic solutions. Admittedly they are simplistic
solutions that save lives, but they certainly have not solved the
problem of people being killed in car crashes so they are not the
correct solution.

Most real life problems don't have a single correct solution.

So why do you consider it so wrong to remove gets from the standard and
so push towards at least one of the holes being plugged? It won't
instantly break all existing code that uses gets, just as removing
implicit int (which I would strongly argue is far *less* harmful) didn't
suddenly stop people from using all the vast amounts of C code that uses
implicit int.

So if you can remove something that can easily be used harmlessly, why
can you not remove something that is almost impossible (if not actually
impossible) to be used safely?
 

Richard Bos

Richard G. Riley said:
Could someone explain to me why gets() is worse than any other function
that can't itself limit its input/working space to the size of the
buffer provided? e.g. scanf, strtok. Or are all these functions now
moved and "non standard"?

Because, unlike all those other functions, you _cannot_ take precautions
within the realm of C to make sure that it is used safely in your case.

For example, yes, you can use scanf() with an unlimited "%s" format (and
IMO that misfeature should go as badly as gets()); but you can also use
it quite safely (if awkwardly, but hey, if you like pain, it's your
party) with "%<some_number>s".
As for strtok(), if you don't use it reentrantly or on a non-writable
string - both precautions _you_, the programmer, can arrange for - it's
quite safe. A pain in the arse - but safe.
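
Those two strtok() precautions can be sketched as follows: the input is a writable array, and the whole tokenizing loop runs without any other strtok() call in between. The helper name split_fields and the delimiter set are assumptions for illustration.

```c
#include <string.h>

/* Splits line in place on spaces and commas, storing up to max pointers
 * to the tokens in out. line must be a writable array (not a string
 * literal), because strtok overwrites each delimiter it finds with '\0'.
 * Returns the number of tokens stored. */
size_t split_fields(char *line, char *out[], size_t max)
{
    size_t n = 0;
    /* One uninterrupted loop: no other strtok call may run between these
     * calls, since strtok keeps its position in hidden static state. */
    for (char *tok = strtok(line, " ,"); tok != NULL && n < max;
         tok = strtok(NULL, " ,"))
        out[n++] = tok;
    return n;
}
```

Typical use is `char line[] = "alpha, beta gamma";` followed by `split_fields(line, fields, 4)`; afterwards line itself has been cut up into the three tokens.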

gets(), OTOH, is _never_ safe. You cannot - fundamentally can not ever -
tell it to limit itself. Give it a buffer of a million chars, and some
joker will feed it a core file of a million and ten. And you will not
have prevented the buffer overrun that results, because with gets(), and
only with gets(), you just can not.

Richard
 

Richard Bos

Douglas A. Gwyn said:
Because it's trendy.

I find it scary that someone this condescending can be on the Standard
committee.
Buffer overruns aren't caused by gets; indeed most of the ones
I've seen reported did not involve gets at all.

I find it even scarier that someone who doesn't remember the Morris worm
can be on the Standard committee.

Richard
 

Richard Bos

Richard Bos wrote:
...

And even after gets() is removed from the standard, some new code will
continue to use gets(). That will be possible, because few
implementations can afford the consequences of failing to continue
supporting it as an extension to the standard, if only because of
legacy code. I'm not disagreeing with the goal; I'm just pointing out
that relief from the problems caused by gets() will not occur until
several decades after release of the first version of the standard from
which it is removed.

True. But it's a start. As long as gets() has the official OK of the
Standard, the kind of newbie programmer who doesn't look beyond what he
was taught will see that It Is Official, Therefore It Is Good.

Richard
 

Richard Bos

Douglas A. Gwyn said:
Safe input parser components of systems almost certainly
will use only a subset of the other standard functions.
That doesn't mean that the other functions should be
deprecated either. Basically, if {X} does something
harmful, where {X} could be any of zillions of things,
then: don't do {X}. Or, more specifically, if doing {Y}
is harmful in context {Z} then don't do {Y} in context
{Z}. E.g., I don't use gets() in any context where I
don't have sufficient control over its input, nor should
anyone else.

If you have sufficient control over the input to gets(), you are
probably in violation of sexual laws in several states of the USA.

Richard
 

Niklas Matthies

gets(), OTOH, is _never_ safe.

Well, that's not necessarily true.

Consider an implementation which provides saturation semantics for
pointer arithmetics such that incrementing pointers into (non-sub-)
objects beyond one-past-the-end yields one-past-the-end again, and
which also ignores writes through such one-past-the-end pointers and
yields zero for corresponding reads.

Using gets() on such an implementation would be safe. It could
actually be quite idiomatic on such an implementation to use functions
like gets(), since the buffer size is implicit and hence cannot be
gotten wrong by having to also explicitly pass it to the respective
function.

-- Niklas Matthies
 

Richard Bos

Niklas Matthies said:
Well, that's not necessarily true.

Consider an implementation which provides saturation semantics for
pointer arithmetics

Considering a debugging implementation, or one which has a fixed length
hardware input buffer of 80 chars, one can also imagine that in those
cases gets() would be safe.

Until someone decides that your program is just what they need on their
MS-Windows computer, and recompiles it there.

One big problem with external safeguards for gets() is that there is
nothing in the Standard that requires them, suggests them, or even lets
you find out whether or not they're in place. Nor can there be. This
means that any code which uses gets() must rely, blindly, on what it
gets from the outside - and that information may be unreliable.

Another big problem is that code will be ported, programs will be
recompiled, possibly on another system, possibly on the same system with
different compiler options - IOW, a single compile of some code which
uses gets() _may_ be safe, but that code itself never is.

Richard
 

Richard G. Riley

Considering a debugging implementation, or one which has a fixed length
hardware input buffer of 80 chars, one can also imagine that in those
cases gets() would be safe.

Until someone decides that your program is just what they need on their
MS-Windows computer, and recompiles it there.

One big problem with external safeguards for gets() is that there is
nothing in the Standard that requires them, suggests them, or even lets
you find out whether or not they're in place. Nor can there be. This
means that any code which uses gets() must rely, blindly, on what it
gets from the outside - and that information may be unreliable.

Another big problem is that code will be ported, programs will be
recompiled, possibly on another system, possibly on the same system with
different compiler options - IOW, a single compile of some code which
uses gets() _may_ be safe, but that code itself never is.

It is simple enough to see the issue with it : the main issue is that
it is there and needs to remain there to support legacy code. I don't
think the standards committee will pay for millions of lines to have
system specific handwritten gets() to replace the old one or, god
forbid, to change all the calls themselves to do boundary checks and
use something more robust.

What is the process for C to deprecate something like this? Is it a
new flag to switch it on?

e.g.

#ifdef OLD_BUFFER_CODE
char *gets(char *s); /* declare gets */
#endif


Like a lot of buggy old stuff, the people who used it were often not so
stupid and made big enough buffers to cope with the declared range of
inputs. Perfect? No. But worked for them at the time. Would I suggest
using it afresh? Of course not.

Best bet would be to have every compiler warn that the
function is simply not safe.

Best non-practical solution is to remove it.
 
