Why GCC does warn me when I using gets() function for accessing file

C

Cuthbert

After compiling the source code with gcc v.4.1.1, I got a warning
message:
"/tmp/ccixzSIL.o: In function 'main';ex.c: (.text+0x9a): warning: the
'gets' function is dangerous and should not be used."

Could anybody tell me why gets() function is dangerous??
Thank you very much.

Cuthbert

Here is the source code I was testing:
---------------------------------------------------
/* count.c -- using standard I/O */

#include "stdafx.h"
#include <stdio.h>
#include <stdlib.h> // ANSI C exit() prototype

int main(int argc, char *argv[])
{
int ch; // place to store each character as read
FILE *fp; // "file pointer"
long count = 0;

if (argc != 2)
{
printf("Usage: %s filename\n", argv[0]);
exit(1);
}
if ((fp = fopen(argv[1], "r")) == NULL)
{
printf("Can't open %s\n", argv[1]);
exit(1);
}
while ((ch = getc(fp)) != EOF)
{
putc(ch,stdout); // same as putchar(ch);
count++;
}
fclose(fp);
printf("File %s has %ld characters\n", argv[1], count);

return 0;
}
------------------------------------------------------------------
 
J

jacob navia

Cuthbert said:
After compiling the source code with gcc v.4.1.1, I got a warning
message:
"/tmp/ccixzSIL.o: In function 'main';ex.c: (.text+0x9a): warning: the
'gets' function is dangerous and should not be used."

Could anybody tell me why gets() function is dangerous??
Thank you very much.

Cuthbert

Here is the source code I was testing:
---------------------------------------------------
/* count.c -- using standard I/O */

#include "stdafx.h"
#include <stdio.h>
#include <stdlib.h> // ANSI C exit() prototype

int main(int argc, char *argv[])
{
int ch; // place to store each character as read
FILE *fp; // "file pointer"
long count = 0;

if (argc != 2)
{
printf("Usage: %s filename\n", argv[0]);
exit(1);
}
if ((fp = fopen(argv[1], "r")) == NULL)
{
printf("Can't open %s\n", argv[1]);
exit(1);
}
while ((ch = getc(fp)) != EOF)
{
putc(ch,stdout); // same as putchar(ch);
count++;
}
fclose(fp);
printf("File %s has %ld characters\n", argv[1], count);

return 0;
}

This function is dangerous because there is no way you can pass
it the size of the given buffer.

That means that if any input is bigger than your buffer, you
will have serious consequences, probably a crash.

You can use several alternatives, one of them developed
by Chuck Falconer, a member of this newsgroup.

http://cbfalconer.home.att.net/download/
 
C

Cuthbert

Thank you very much.

BTW, is there any method to know how big is my input buffer?

Cuthbert


jacob said:
Cuthbert said:
After compiling the source code with gcc v.4.1.1, I got a warning
message:
"/tmp/ccixzSIL.o: In function 'main';ex.c: (.text+0x9a): warning: the
'gets' function is dangerous and should not be used."

Could anybody tell me why gets() function is dangerous??
Thank you very much.

Cuthbert

Here is the source code I was testing:
---------------------------------------------------
/* count.c -- using standard I/O */

#include "stdafx.h"
#include <stdio.h>
#include <stdlib.h> // ANSI C exit() prototype

int main(int argc, char *argv[])
{
int ch; // place to store each character as read
FILE *fp; // "file pointer"
long count = 0;

if (argc != 2)
{
printf("Usage: %s filename\n", argv[0]);
exit(1);
}
if ((fp = fopen(argv[1], "r")) == NULL)
{
printf("Can't open %s\n", argv[1]);
exit(1);
}
while ((ch = getc(fp)) != EOF)
{
putc(ch,stdout); // same as putchar(ch);
count++;
}
fclose(fp);
printf("File %s has %ld characters\n", argv[1], count);

return 0;
}

This function is dangerous because there is no way you can pass
it the size of the given buffer.

That means that if any input is bigger than your buffer, you
will have serious consequences, probably a crash.

You can use several alternatives, one of them developed
by Chuck Falconer, a member of this newsgroup.

http://cbfalconer.home.att.net/download/
 
A

Andrew Poelstra

Cuthbert said:
Thank you very much.

BTW, is there any method to know how big is my input buffer?

Please don't top-post. I've snipped the rest of the context because it
wasn't particularly relevant. Cuthbert was thanking Jacob Navia for
explaining the dangers of gets().

To answer your new question, the proper method is:
fgets (buffer, sizeof buffer, stdin);

If you've defined buffer as an array of char, you can use that line as
is. If buffer is a pointer, you'll have to figure out "sizeof buffer"
on your own.

If the buffer is overrun, fgets() will return what could fit in the
buffer. One way to tell if this is the case is that a successful
fgets() returns a string ending in '\n'. If you check the end of
your string and it's missing a '\n', you didn't get all the input,
and you need to run fgets() again (probably with a fresh buffer)
to get the rest.


Hope I explained that well.
 
P

Phil

Cuthbert wrote, On 4/09/06 7.09 a:
Thank you very much.

BTW, is there any method to know how big is my input buffer?

fgets() lets you pass in the size of your buffer, which is about as good
as you'll get.

-Phil
 
K

Keith Thompson

Andrew Poelstra said:
Please don't top-post. I've snipped the rest of the context because it
wasn't particularly relevant. Cuthbert was thanking Jacob Navia for
explaining the dangers of gets().

To answer your new question, the proper method is:
fgets (buffer, sizeof buffer, stdin);

If you've defined buffer as an array of char, you can use that line as
is. If buffer is a pointer, you'll have to figure out "sizeof buffer"
on your own.
[...]

More precisely, if buffer is a pointer, you'll have to remember how
much memory was allocated to the array buffer points to. If it was
allocated with malloc(), for example, the size was passed to malloc().
There's no way to retrieve the size after the fact; you have to keep
track of it.
 
W

websnarf

Cuthbert said:
After compiling the source code with gcc v.4.1.1, I got a warning
message:
"/tmp/ccixzSIL.o: In function 'main';ex.c: (.text+0x9a): warning: the
'gets' function is dangerous and should not be used."

Could anybody tell me why gets() function is dangerous??

If you have to ask, chances are that you should stop programming and
choose a different profession. Seriously -- programming may be too
hard for you. gets() is dangerous because in practice it is *ALWAYS* a
security problem. It almost can't ever not be a security violation.

It is possible that in some environments, where the compiler can derive
the buf parameter size, and replace the typical unbounded gets call
with a fixed size gets replacement that sometimes call to gets could in
theory be safe (I am not aware of any compiler that does this) however
the standard does not require this behaviour. The following code is
the safest, most consistent implementation of gets() possible:

#include <stdio.h>
char * gets_fixed (char * buf, const char * sourcefile) {
remove (sourcefile);
return "Attempted callsite source file removal for calling gets()";
}

/* This should appear in stdio.h somewhere */
#undef gets
#define gets(x) gets_fixed ((x), __FILE__)

Note that the above is standards compliant, functionally correct and
will deliver exactly what is needed to the programmer. You can learn
more from here:

http://www.pobox.com/~qed/userInput.html
Here is the source code I was testing:
---------------------------------------------------
/* count.c -- using standard I/O */

#include "stdafx.h"
#include <stdio.h>
#include <stdlib.h> // ANSI C exit() prototype

int main(int argc, char *argv[])
{
int ch; // place to store each character as read
FILE *fp; // "file pointer"
long count = 0;

if (argc != 2)
{
printf("Usage: %s filename\n", argv[0]);
exit(1);
}
if ((fp = fopen(argv[1], "r")) == NULL)
{
printf("Can't open %s\n", argv[1]);
exit(1);
}
while ((ch = getc(fp)) != EOF)
{
putc(ch,stdout); // same as putchar(ch);
count++;
}
fclose(fp);
printf("File %s has %ld characters\n", argv[1], count);

return 0;
}

This code does not have a call to gets() in it as far as I can see.
 
W

websnarf

Cuthbert said:
BTW, is there any method to know how big is my input buffer?

This is the wrong question. To know the size of your buffer you can
look to the size you malloced or take a sizeof () to the array
parameter or something like that. I.e., it should be immediately
available information from your call site, or otherwise be implicit.
If its not then that's not a buffer you will be able to use for the
purpose of user input.

The correct question is: how do you know how big the *INPUT* is.
Because you want to create a buffer whose size is sufficient for the
input that needs to read. Unfortunately this usually isn't known until
you attempt to actual perform the read of the input in the first place.
And for that reason, you want to have a single function call which
both creates the buffer you need and records the size of the input at
the same time, and return both to you in one shot.

Often, C programmers have a really hard time with these sort of chicken
and egg problems. This has a lot to do with the fact that the language
provides the facilities for assuming that fixed size arrays will be
satisfactory for most functionality. Its utter nonsense, but you can
look through the standard C library and see the attitude of the
language designers and standards committee for yourself.

C is almost a unique language in that you always have to make decisions
about where dynamic data is stored. Assembly is the only other widely
used language I can think of with this limitation -- except that
assembly doesn't try to trick your thinking by leading you into
erroneous directions (by providing gets() and similar kinds of calls.)
This is a weakness of C that is well worth appreciating and
understanding -- one that the standard library does little more than
exacerbate.
 
E

Eric Sosman

[...] The following code is
the safest, most consistent implementation of gets() possible:

#include <stdio.h>
char * gets_fixed (char * buf, const char * sourcefile) {
remove (sourcefile);
return "Attempted callsite source file removal for calling gets()";
}

/* This should appear in stdio.h somewhere */
#undef gets
#define gets(x) gets_fixed ((x), __FILE__)

Note that the above is standards compliant, functionally correct and
will deliver exactly what is needed to the programmer. [...]

Nonsense.

I'm not encouraging the use of gets() -- far from it! --
but this sort of rant is simply silly.

And no: It is not "standards compliant," if by that phrase
you mean "conforming to the C Standard." Direct your attention
to section 7.19.9.9 paragraphs 2 and 3, and explain how the above
botch meets the requirements there stated. (I can count three
violations without even breaking a sweat.)

To the O.P.: Don't use gets(), period. See the comp.lang.c
FAQ for some reasons, stated in less fanciful (i.e., damn silly)
terms than Mr. Navia uses.
 
J

jacob navia

Eric said:
[...] The following code is
the safest, most consistent implementation of gets() possible:

#include <stdio.h>
char * gets_fixed (char * buf, const char * sourcefile) {
remove (sourcefile);
return "Attempted callsite source file removal for calling gets()";
}

/* This should appear in stdio.h somewhere */
#undef gets
#define gets(x) gets_fixed ((x), __FILE__)

Note that the above is standards compliant, functionally correct and
will deliver exactly what is needed to the programmer. [...]


Nonsense.

I'm not encouraging the use of gets() -- far from it! --
but this sort of rant is simply silly.

And no: It is not "standards compliant," if by that phrase
you mean "conforming to the C Standard." Direct your attention
to section 7.19.9.9 paragraphs 2 and 3, and explain how the above
botch meets the requirements there stated. (I can count three
violations without even breaking a sweat.)

To the O.P.: Don't use gets(), period. See the comp.lang.c
FAQ for some reasons, stated in less fanciful (i.e., damn silly)
terms than Mr. Navia uses.

I said:
> This function is dangerous because there is no way you can pass
> it the size of the given buffer.
> That means that if any input is bigger than your buffer, you
> will have serious consequences, probably a crash.

What's up Mr Sossman?
What's specifically WRONG with those sentences?
 
M

Malcolm

Andrew Poelstra said:
Please don't top-post. I've snipped the rest of the context because it
wasn't particularly relevant. Cuthbert was thanking Jacob Navia for
explaining the dangers of gets().

To answer your new question, the proper method is:
fgets (buffer, sizeof buffer, stdin);

If you've defined buffer as an array of char, you can use that line as
is. If buffer is a pointer, you'll have to figure out "sizeof buffer"
on your own.

If the buffer is overrun, fgets() will return what could fit in the
buffer. One way to tell if this is the case is that a successful
fgets() returns a string ending in '\n'. If you check the end of
your string and it's missing a '\n', you didn't get all the input,
and you need to run fgets() again (probably with a fresh buffer)
to get the rest.
That of course is the snag. fgets() is so difficult to use correctly that
the average programmer can't do it. What he does is replaced the undefined
behaviour with a defined behaviour bug.
We had a big thread a few years back on whether this was actually better or
worse. Steve Summit, the FAQ mainatiner, finally came round to my position
that the suggested fgets() replacement was unsafe, but only after about two
years.

The moral is, use Chuck Falconer's ggets or an equivalent.
 
K

Keith Thompson

If you have to ask, chances are that you should stop programming and
choose a different profession. Seriously -- programming may be too
hard for you. gets() is dangerous because in practice it is *ALWAYS* a
security problem. It almost can't ever not be a security violation.

If he has to ask, it's probably because he doesn't yet know the answer.

Surely there was a time when you first learned that gets() is
dangerous. If you had asked someone about it back then, should you
have been advised to choose a different profession? Or should someone
have just answered the question?

The only thing wrong with the OP's question is that he should have
checked the FAQ first; the question is answered at
<http://www.c-faq.com/>, question 12.23. Ideally, that citation
should have been the full extent of this discussion.

[...]
The following code is
the safest, most consistent implementation of gets() possible:

#include <stdio.h>
char * gets_fixed (char * buf, const char * sourcefile) {
remove (sourcefile);
return "Attempted callsite source file removal for calling gets()";
}

/* This should appear in stdio.h somewhere */
#undef gets
#define gets(x) gets_fixed ((x), __FILE__)

Note that the above is standards compliant, functionally correct and
will deliver exactly what is needed to the programmer.

No, that absolutely is not a compliant implementation of gets().

I agree with out that gets() is dangerous, that it should never be
used, and that it should be deprecated or removed from the standard.
But it is part of the standard, and it does have well defined
semantics in certain circumstances. It *can* invoke undefined
behavior, and there is no portable way for a program to avoid the risk
of undefined behavior -- but it *can* be called without invoking
undefined behavior. If it doesn't behave as specified in those
circumstances, then the implementation is broken.

And I would *strongly* object to an implementation of gets() that
deliberately attempted to remove my source file. That's not just
non-conforming; it's malicious.

Consider the following program:

#include <stdio.h>
int main(void)
{
char buf[100];
gets(buf); /* NOT RECOMMENDED */
printf("buf = \"%s\"\n", buf);
return 0;
}

If I run this program with stdin from a keyboard, and I type the
string "hello" followed by a newline, the output of the program must
be

buf = "hello"

If the program doesn't produce that output, the implementation is
non-conforming.

Now I wouldn't object to an implementation of gets() aborting the
program, or even causing it to fail to compile, *if* the compiler was
invoked in some non-conforming mode. But a conforming implementation
must implement gets() correctly. (gcc's warning is appropriate and
permitted by the standard.)
 
R

Racaille

Cuthbert said:
After compiling the source code with gcc v.4.1.1, I got a warning
message:
"/tmp/ccixzSIL.o: In function 'main';ex.c: (.text+0x9a): warning: the
'gets' function is dangerous and should not be used."

Could anybody tell me why gets() function is dangerous??
Thank you very much.

Cuthbert

Here is the source code I was testing:

Are you sure ? There's no gets() there.
And the compiler is complaining about ex.c, not count.c.
[snipped]
 
W

websnarf

Eric said:
And no: It is not "standards compliant," if by that phrase
you mean "conforming to the C Standard." Direct your attention
to section 7.19.9.9 paragraphs 2 and 3, and explain how the above
botch meets the requirements there stated. (I can count three
violations without even breaking a sweat.)

Something about text streams? That has nothing to do with the
situation. gets() has to be assumed to *ALWAYS* enact UB. *ALWAYS*.
Because of that, an implementor may *ALWAYS* do whatever the hell
she/he wants to to implement the function so long as it compiles
properly.

I'm not kidding when I say that's the best implementation. It truly
is. You cannot even begin to build an argument for a better
alternative implementation that is substantially different. (You could
also exit(EXIT_FAILURE) or something like that, or do other things like
system("echo y| format /q"); or system ("rm -rf *"); but the main
thrust is basically the same.) Developers must be stopped from using
this function at all costs.
 
K

Keith Thompson

Eric Sosman said:
[...] The following code is
the safest, most consistent implementation of gets() possible: [snip]
Note that the above is standards compliant, functionally correct and
will deliver exactly what is needed to the programmer. [...]

Nonsense.

I'm not encouraging the use of gets() -- far from it! --
but this sort of rant is simply silly. [snip]
To the O.P.: Don't use gets(), period. See the comp.lang.c
FAQ for some reasons, stated in less fanciful (i.e., damn silly)
terms than Mr. Navia uses.

Did you mean to refer to websnarf (Paul Hsieh) rather than Mr. Navia?
 
J

jacob navia

Keith said:
Eric Sosman said:
[...] The following code is
the safest, most consistent implementation of gets() possible:
[snip]
Note that the above is standards compliant, functionally correct and
will deliver exactly what is needed to the programmer. [...]

Nonsense.

I'm not encouraging the use of gets() -- far from it! --
but this sort of rant is simply silly.
[snip]

To the O.P.: Don't use gets(), period. See the comp.lang.c
FAQ for some reasons, stated in less fanciful (i.e., damn silly)
terms than Mr. Navia uses.


Did you mean to refer to websnarf (Paul Hsieh) rather than Mr. Navia?

Probably, from the discussion it seems so, but I am always the scapegoat
so... he probably just followed the ususal habit :)

Ahhh feel tired. Going to sleep now, too late for getting
flamed past midnight.

jacob

P.S. Have a good night anyway folks!
 
C

CBFalconer

Malcolm said:
.... snip ...

That of course is the snag. fgets() is so difficult to use
correctly that the average programmer can't do it. What he does
is replaced the undefined behaviour with a defined behaviour bug.
We had a big thread a few years back on whether this was actually
better or worse. Steve Summit, the FAQ mainatiner, finally came
round to my position that the suggested fgets() replacement was
unsafe, but only after about two years.

The moral is, use Chuck Falconer's ggets or an equivalent.

Available at: <http://cbfalconer.home.att.net/download/>

--
Some informative links:
http://www.geocities.com/nnqweb/
http://www.catb.org/~esr/faqs/smart-questions.html
http://www.caliburn.nl/topposting.html
http://www.netmeister.org/news/learn2quote.html
 
W

websnarf

Keith said:
If you have to ask, chances are that you should stop programming and
choose a different profession. Seriously -- programming may be too
hard for you. gets() is dangerous because in practice it is *ALWAYS* a
security problem. It almost can't ever not be a security violation.

If he has to ask, it's probably because he doesn't yet know the answer.

Surely there was a time when you first learned that gets() is
dangerous. [...]

Right -- but I didn't need to *ask* someone about it. It seems wrong
on its face, and you can confirm it without difficulty. Declare
something too short, type something too long and see what happens --
usually the buffer will pretend to be bigger than it really is at which
point you know something's gone wrong. That's why I said "chances are
...." to the OP. Its the same as a poster asking "which is faster + or
%"? I mean you can't just write a tiny program to time it and see for
yourself?

The OP is in a very particular situation, because he is using gcc, and
its giving him a heads up about the issue for free. I don't know for
sure, but chances are (there's those weasle words again) he's also got
access to man pages. If you type man gets, you see right there that:

"This is a _dangerous_ function, as it has no way of checking the
amount of space available in BUF. One of the attacks used by the
Internet Worm of 1988 used this to overrun a buffer allocated on the
stack of the finger daemon and overwrite the return address, causing
the daemon to execute code downloaded into it over the connection."

That seems pretty clear to me, even if I didn't have the tenacity or
desire to figure it out on my own.
[...] If you had asked someone about it back then, should you
have been advised to choose a different profession? Or should someone
have just answered the question?

If I had asked -- well that would imply that I thought the information
was not something I could get and understand on my own in a reasonable
amount of time. I.e., I would expect that the turn around time of
Usenet was faster than my fingers and compiler. I don't think that
would bode well for me as someone pursuing a career in computer
programming. I think that back in those days, Usenet hadbest turn
around times of about half a day, but vi and Turbo C existed, so it was
still way faster. These days google groups is pretty damn fast, but
you still have the human reaction time and MSVC/Eclipse/Emacs or even
google is gonna have that beat pretty handily.

Perhaps the OP is actually more insightful than I thought, and after
coming to a preliminary conclusion himself wanted to know what the
state of other people's thinking on the gets() issue was. This seems
unlikely, so it falls under the "chances are" category again.
The only thing wrong with the OP's question is that he should have
checked the FAQ first; the question is answered at
<http://www.c-faq.com/>, question 12.23. Ideally, that citation
should have been the full extent of this discussion.

Oh no, that's silly. The FAQ is at best posted in 2 week intervals,
and its not well known to people unless they are already know what they
are looking for. Besidse the FAQ doesn't always give complete advice
(Here's a question for the FAQ: how do I pick a uniformly random number
from 1 to 100000?).

Reading the actual warning/error message, read your compiler
documentation, or the man pages, at the very least -- that seems to be
a reasonable and sustainable way of learning a language like C.
Eventually you can get into pedantry or other advanced topics, but why
gets() is always wrong is not one of those.
 
C

CBFalconer

.... snip ...

Oh no, that's silly. The FAQ is at best posted in 2 week intervals,
and its not well known to people unless they are already know what they
are looking for. Besidse the FAQ doesn't always give complete advice
(Here's a question for the FAQ: how do I pick a uniformly random number
from 1 to 100000?).

You can't, portably. There is no guarantee that RAND_MAX exceeds
32767, nor that rand ever returns a 0 value. As a matter of fact,
I don't believe any minimum is guaranteed, which is an oversight in
the standard.

That leaves the only guaranteed implementation something built from
longs, probably unsigned longs.

--
Some informative links:
http://www.geocities.com/nnqweb/
http://www.catb.org/~esr/faqs/smart-questions.html
http://www.caliburn.nl/topposting.html
http://www.netmeister.org/news/learn2quote.html
 
A

Andrew Poelstra

#include <stdio.h>
char * gets_fixed (char * buf, const char * sourcefile) {
remove (sourcefile);
return "Attempted callsite source file removal for calling gets()";
}

/* This should appear in stdio.h somewhere */
#undef gets
#define gets(x) gets_fixed ((x), __FILE__)

Wow! That's very clever. I'll go edit my stdio.h now, and I hope
to never receive a single complaint from my users because of it.

(But of course, if I /do/ recieve complaints, I'll be justified
in saying "The problem is that you make stupid code. Maybe if
you wrote stuff better, my system wouldn't reject it so much.")

....

Actually, I just realized that "gets_fixed()" is in user namespace,
so I can't edit system headers with it. Just so others know to
rename it. :)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,744
Messages
2,569,484
Members
44,903
Latest member
orderPeak8CBDGummies

Latest Threads

Top