Why is it dangerous?

CBFalconer

James said:
.... snip about gets ...

Whenever I build the Index to the Fabulous Pedigree
http://fabpedigree.com/altix.htm
I do several hundred thousand gets()'s, but none of them are
"dangerous". I live with a few "dangerous" messages during the
build (although I'm sure the pedants would prefer that each of
the several hundred thousand gets()'s produced its own such
message. :)

They're all dangerous. For example, a one bit error in reading a
'\n' from your prewritten data can blow everything.

I assume you use it because of the simplicity of the call. You can
get that simplicity, together with safety, by using ggets. All you
have to do is free the returned string when you are done with it.
It is available, written in standard C, and put in the public
domain, at:

<http://cbfalconer.home.att.net/download/ggets.zip>

EX:
    char *line;
    ...
    while (0 == ggets(&line)) {
        /* use the line */
        free(line);
    }

or you can keep the returned line as long as you need it.
 
Antoninus Twink

Willem said:

They might also suffer the even more traumatic experience of having you
smash through their windscreen, injuring or even killing them.

Oh come on, has this ever actually happened? It's no surprise that you
take the authoritarian view on this, but don't invent facts to support
your "case".
And possibly their own.

Pure hyperbolic melodrama!
 
Ben Bacarisse

CBFalconer said:
They're all dangerous. For example, a one bit error in reading a
'\n' from your prewritten data can blow everything.

I assume you use it because of the simplicity of the call. You can
get that simplicity, together with safety, by using ggets.

How does ggets overcome the problem you've just identified with the
OP's use of gets?
 
fjblurt

The way to get rid of the warning on gets(), if you really want to
do it, is (on FreeBSD, anyway, but I suspect it applies to other
systems that use gcc as a default compiler):

(1) Remove the call to __warn_references() in gets.c .
(2) Rebuild the C library.

Rebuilding the compiler isn't necessary.

You can also

objcopy -R .gnu.warning.gets /usr/lib/libc.a

Repeat for libc.so, etc.

It's a little surprising that 'ld' doesn't appear to have an option to
suppress these.
 
Andrew Poelstra

How does ggets overcome the problem you've just identified with the
OP's use of gets?

ggets() internally allocates a dynamically-resized buffer. In the case
that it runs out of memory, it returns an appropriate error condition.
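
For illustration, here is a minimal sketch of the general technique described above: a reader that grows its buffer as the line grows and reports allocation failure through its return value. This is not the ggets source. The name read_line_dyn, the starting capacity of 16 and the return codes are all assumptions made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, NOT the real ggets implementation.
 * Reads one line from fp into a malloc'd buffer that doubles as needed.
 * Returns 0 on success (*out owns the line, newline stripped),
 * EOF if nothing could be read, and 1 if memory ran out. */
int read_line_dyn(FILE *fp, char **out)
{
    size_t cap = 16, len = 0;       /* 16 is an arbitrary starting size */
    char *buf, *tmp;
    int ch;

    assert(fp != NULL && out != NULL);
    *out = NULL;
    buf = malloc(cap);
    if (buf == NULL)
        return 1;

    while ((ch = getc(fp)) != EOF && ch != '\n') {
        if (len + 1 >= cap) {       /* keep one byte free for the '\0' */
            tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return 1;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)ch;
    }
    if (ch == EOF && len == 0) {    /* end of input, nothing read */
        free(buf);
        return EOF;
    }
    buf[len] = '\0';
    *out = buf;
    return 0;
}
```

The caller frees each returned line when done with it. Note that nothing here bounds how far the buffer may grow.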

On the other hand, if you really do have "complete control" over your
input, it would be cheaper to use fgets(), since you can use a fixed-
size buffer and the sizeof operator.
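
A rough sketch of that fixed-buffer approach, for comparison. The helper name read_fixed and its return codes are assumptions for this example. The only library calls it relies on are fgets(), strchr() and getc().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper. Reads one line into buf, which holds size bytes.
 * Returns 1 for a complete line (newline stripped), 0 for a line that
 * did not fit together with its newline (the excess is discarded), and
 * EOF at end of input or on a read error. */
int read_fixed(FILE *fp, char *buf, size_t size)
{
    char *nl;
    int ch;

    assert(fp != NULL && buf != NULL && size > 1);
    if (fgets(buf, (int)size, fp) == NULL)
        return EOF;

    nl = strchr(buf, '\n');
    if (nl != NULL) {
        *nl = '\0';                   /* complete line: strip the newline */
        return 1;
    }
    /* No newline stored: the last line lacks one, or the line was too long. */
    ch = getc(fp);
    if (ch == EOF)
        return 1;
    while (ch != EOF && ch != '\n')   /* skip the rest of the long line */
        ch = getc(fp);
    return 0;
}
```

A caller expecting lines of at most 80 characters might declare char buf[82]; and pass sizeof buf as the size, so the limit lives in one place. However long the input line turns out to be, fgets() never writes past the end of buf.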

The choice is one between convienence and raw efficiency. Neither one
equates to using gets().
 
CBFalconer

Richard said:
.... snip ...

Again, I stress that there is no shame in an input routine not
detecting bit errors of this kind. But ggets was offered, by its
author, as a solution to the problem of a one bit error in
reading a '\n' - and it is no such thing.

No I didn't. It is a solution to having such an error blow up your
machine. Your long mischaracterization is not helping anybody.
 
vippstar

It doesn't - insofar as it makes no special provision against one-bit
errors in reading a '\n' from your prewritten data. It does, however,
return an error if it runs out of memory. But it fails to take the obvious
precaution of allowing the caller to specify an upper limit to the number
of bytes taken from the stream.

So let's say you have this situation - you know your lines are no longer
than 6 bytes (the same argument applies to more typical line lengths, eg
72 or 80, but the much lower value is chosen simply because it is easy to
write and easy to read in a Usenet article).

Here are your data:

3.141
1.618

A one bit error in reading a '\n' from the 3.141 line results in it being
interpreted as a J instead (01001010 instead of 00001010). So if ggets
really did overcome this obstacle, it would detect the one-bit error and
correct for it or at least report it. In practice, what ggets will do is
read the next line too, so that you'll get:

<snip>
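
The bit flip described in the quoted text can be reproduced with a short sketch. It assumes an ASCII execution character set, where '\n' is 0x0A and 'J' is 0x4A. The helper names flip_bit and corrupt_first_newline are made up for this example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: flips one bit (numbered 0..7) of a byte. */
unsigned char flip_bit(unsigned char byte, unsigned bit)
{
    assert(bit < 8);
    return (unsigned char)(byte ^ (1u << bit));
}

/* Simulates the error from the post: flips bit 6 of the first '\n'
 * in text. Under ASCII this turns 00001010 into 01001010, i.e. 'J'.
 * Returns 1 if a newline was found and changed, 0 otherwise. */
int corrupt_first_newline(char *text)
{
    char *nl = strchr(text, '\n');

    if (nl == NULL)
        return 0;
    *nl = (char)flip_bit((unsigned char)*nl, 6);
    return 1;
}
```

Applied to the two data lines above, the first line no longer ends where it used to, so any line-oriented reader sees a single 11-character line, "3.141J1.618".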

What is this one-bit error you are talking about? This is the first
I've heard of it.
Assuming that error happens, wouldn't the error flag for the stream
be set?
 
Ben Bacarisse

Andrew Poelstra said:
ggets() internally allocates a dynamically-resized buffer. In the case
that it runs out of memory, it returns an appropriate error
condition.

You've snipped the "problem" so my objection has been lost. What I
was saying (and it is a bit of a wild card) is how can any routine be
safe in the presence of bit errors in the input? I don't think it is
reasonable to assume that stdio works under these conditions. We have
no reasonable method to reason about what a program does when the
input system does not deliver the input.

<snip>
 
Ben Bacarisse

CBFalconer said:
I suggest you download it and see for yourself. It is only about
an 11k download.

I was not explicit enough. In my book, once stdio has "gone wrong"
i.e. not done what it should, I think you are into a special kind of UB
and all bets are off. You recently posted (in another group) that
evaluating ++a + a++ might erase all files on all discs but you are
prepared to say that one program is safer than another when stdio is
not working correctly?

From an engineering point of view you are 100% right, and if you normally
posted from the "what a computer will usually do" point of view, I
would probably have said nothing, but you seem to be an all-or-nothing
poster about UB.
 
vippstar

They're all dangerous. For example, a one bit error in reading a
'\n' from your prewritten data can blow everything.

What is this one bit error you are talking about?
 
Doug Miller

Usually these "complaints" are just the observation that
self-determination is a pretty fundamental liberty that we should have
in a free society. The state has *no damn business* telling me what I
should or shouldn't do in the privacy of my own car, when it doesn't
affect anyone's safety but my own.

Safety is not the only aspect to consider: since insurance is based on the
concept of average, or shared, risk across a large pool of insured, *everyone*
pays higher insurance costs because of the knuckleheads who refuse to wear
seat belts.
 
CBFalconer

Richard said:
CBFalconer said:

Then I am at a loss to explain your previous response.

Maybe you should take a course in reading.
Perhaps you could explain how such an error could blow up one's
machine in the first place.

Do you want a probability graph? Based on random data, which is
not what will be inserted on such an error.
 
CBFalconer

What is this one bit error you are talking about?

Who cares? With the error the '\n' is no longer a '\n', and thus
the line is not ended, and gets can happily overwrite
who-knows-what with whatever.
 
Andrew Poelstra

You've snipped the "problem" so my objection has been lost. What I
was saying (and it is a bit of a wild card) is how can any routine be
safe in the presence of bit errors in the input? I don't think it is
reasonable to assume that stdio works under these conditions. We have
no reasonable method to reason about what a program does when the
input system does not deliver the input.

How would such bit errors be detected? Your input would have to have
encoded checksums or perhaps length. Giving ggets() a maximum-memory
parameter would be helpful in preventing excess memory being used in
this case, but in the general case there is no way to tell if there
are bit-level read errors in I/O.
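
To make the checksum idea concrete, here is a sketch under a purely hypothetical line format: each line is "payload*XX", where XX is the XOR of all payload bytes written as two hex digits. Nothing in this thread uses such a format. It is an assumption for illustration, as is the function name line_checksum_ok.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* XOR of the n bytes starting at s. */
static unsigned xor_sum(const char *s, size_t n)
{
    unsigned sum = 0;
    size_t i;

    for (i = 0; i < n; i++)
        sum ^= (unsigned char)s[i];
    return sum;
}

/* Hypothetical check for the assumed "payload*XX" format.
 * Returns 1 if the two hex digits after the last '*' equal the XOR
 * of the bytes before it, 0 if they differ or the format is wrong. */
int line_checksum_ok(const char *line)
{
    const char *star;
    unsigned long stated;

    assert(line != NULL);
    star = strrchr(line, '*');
    if (star == NULL || strlen(star + 1) != 2)
        return 0;
    if (!isxdigit((unsigned char)star[1]) || !isxdigit((unsigned char)star[2]))
        return 0;
    stated = strtoul(star + 1, NULL, 16);
    return stated == xor_sum(line, (size_t)(star - line));
}
```

With this scheme "3.141*29" passes, while "3.151*29" (one flipped bit in the payload) fails. A plain XOR catches any single flipped bit in the payload but misses some multi-bit errors, so it is only a detection aid, and it only works because the data was written with the check in mind.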
 
Richard Bos

I must say, that plan beats trolling a technical newsgroup. At least it
gets you laid, which trolling prevents.
I'd use fgets even for a "throwaway" program because it's really as easy
to use and I won't have to worry about carefully deleting the sources
later.

I'd use fgets() instead of gets() all the time, simply because I have
never met a programmer incapable of making a typo, or of dropping a
book on his keyboard.

Richard
 
James Kuyper

Ben Bacarisse wrote:
....
You've snipped the "problem" so my objection has been lost. What I
was saying (and it is a bit of a wild card) is how can any routine be
safe in the presence of bit errors in the input?

By writing the routine so that, no matter what the input is, it never
has undefined behavior. This means validating all inputs before using
them in any context where an invalid value could result in undefined
behavior. The possibility of erroneous input is an unavoidable fact of
life; code which cannot deal with that fact safely is poorly designed.
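
As a small illustration of validating before use, the sketch below accepts a line only if strtod() consumes all of it. The function name parse_number is an assumption for this example, and what counts as "valid" is a design decision, as the next paragraph says.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical validator. Returns 1 and stores the value in *out if
 * line consists entirely of one value accepted by strtod(); returns 0
 * and leaves *out untouched otherwise. */
int parse_number(const char *line, double *out)
{
    char *end;
    double v;

    assert(line != NULL && out != NULL);
    errno = 0;
    v = strtod(line, &end);
    if (end == line)          /* nothing was converted */
        return 0;
    if (*end != '\0')         /* trailing junk, e.g. "3.141J1.618" */
        return 0;
    if (errno == ERANGE)      /* result out of range for double */
        return 0;
    *out = v;
    return 1;
}
```

A merged line such as "3.141J1.618" is rejected rather than silently read as 3.141. The corruption is not repaired, but the program's behaviour stays defined and the caller gets to decide what happens next.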

What is "safely"? That is something for the designer to decide. However,
for code which is meant to be portable, "safely" is incompatible with
undefined behavior, because the designer inherently cannot have decided
what that behavior will be. For code which is, as a matter of deliberate
intent, portable only to a limited range of implementations, code with
behavior which is undefined by the standard can still count as "safe",
so long as the behavior of that code is defined by all of the
implementations it is supposed to be portable to.
... I don't think it is
reasonable to assume that stdio works under these conditions. We have
no reasonable method to reason about what a program does when the
input system does not deliver the input.

I think you're going overboard by assuming that a single corrupted bit
in an input necessarily renders the entire system untrustworthy.
Hardware can fail, and the failure is not always detectable. A single
undetected failure in the hardware associated with a disk drive or in
the network connection to an NFS mounted drive does not, in general,
justify assuming that any other component of the system has failed. It
doesn't even justify automatically assuming that there will be a
catastrophic failure of the component that was malfunctioning (though
that is certainly a possibility to consider). In particular, it doesn't
justify assuming that the routines declared in <stdio.h> are defective.
 
Keith Thompson

Richard Heathfield said:
CBFalconer said:


Then I am at a loss to explain your previous response.


Perhaps you could explain how such an error could blow up one's machine in
the first place.
[...]

Suppose a program reads input using gets(). Suppose the programmer is
confident that no input line can be longer than 80 characters, but a
one-bit error reading a '\n' character causes two lines to be merged,
overflowing the input buffer.

Nothing in C can prevent or correct such an error, but just about
anything other than gets() (including ggets(), fgets(), or any of a
number of other routines) can avoid having it cause a buffer overflow.

The hypothetical one-bit error is just one example of a longer than
expected input line. I suppose it was brought up because it's
something that you can't prevent even if you think you have complete
control over what appears on stdin.

Chuck's reply upthread could be read as a claim that using ggets() can
correct such an error, but I don't believe that's what he meant.
 
Ben Bacarisse

James Kuyper said:
Ben Bacarisse wrote:
...

By writing the routine so that, no matter what the input is, it never
has undefined behavior. This means validating all inputs before using
them in any context where an invalid value could result in undefined
behavior. The possibility of erroneous input is an unavoidable fact of
life; code which cannot deal with that fact safely is poorly designed.

What is "safely"? That is something for the designer to
decide. However, for code which is meant to be portable, "safely" is
incompatible with undefined behavior, because the designer inherently
cannot have decided what that behavior will be. For code which is, as
a matter of deliberate intent, portable only to a limited range of
implementations, code with behavior which is undefined by the standard
can still count as "safe", so long as the behavior of that code is
defined by all of the implementations it is supposed to be portable
to.

I agree when I have my engineering hat on. You can engineer for all
sorts of probable errors, but...
I think you're going overboard by assuming that a single corrupted bit
in an input necessarily renders the entire system
untrustworthy.

I am only pointing out an apparent conflict. CBFalconer seems to be a
"that is UB -- all bets are off" sort of a poster. I am deliberately
going overboard because I can't see how such a person can claim any
behaviour if the stdio library is not behaving as per the spec.
Hardware can fail, and the failure is not always
detectable. A single undetected failure in the hardware associated
with a disk drive or in the network connection to an NFS mounted drive
does not, in general, justify assuming that any other component of the
system has failed. It doesn't even justify automatically assuming that
there will be a catastrophic failure of the component that was
malfunctioning (though that is certainly a possibility to
consider). In particular, it doesn't justify assuming that the
routines declared in <stdio.h> are defective.

Very reasonable, and I agree. I would not have said a word if ggets
was designed with system safety in mind in the face of mysterious IO
failings, but it is not. It is better at handling some failures but
for others it will exhaust system resources. I've never seen a \n get
misread (I am sure it can happen, I am just saying that it has not
yet happened to me, to my knowledge) but I have seen drivers that just
endlessly return blocks of zeros when they fail to detect a file's
end point. ggets will just keep slurping them so, while its behaviour
may remain defined, it is not particularly well engineered for failing
IO systems.

It was not my intent to put down ggets, only to point out what seemed
to me to be a very selective idea of when a program's behaviour is
formally defined and the sort of failures that ggets can cope with.
 
