gets() is dead

M

Michael Brennan

And fgets() is in practise too difficult for any but the best programmers to
use safely.

That certainly means I am unable to use fgets() safely,
and since I just wrote this function which reads from stdin
using fgets() I thought I would to like to hear any opinions
on my code from the people here.
Although it may not be the best way to handle memory, I'm
mostly interested to know if this code is correct, and safe.
Thanks!

/Michael Brennan


#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BUFSZ 128
char *readstr(void)
{
char buffer[BUFSZ];
char *input;
size_t size;
char *tmp;

input = NULL;
size = 0;
do {
tmp = realloc(input, size + BUFSZ);
if (tmp == NULL) {
free(input);
return NULL;
}
input = tmp;

if (fgets(buffer, sizeof buffer, stdin) == NULL) {
free(input);
return NULL;
}

if (size == 0)
strcpy(input, buffer);
else
strcat(input, buffer);

size = strlen(input);
} while (strrchr(buffer, '\n') == NULL);

input[strcspn(input, "\n")] = '\0';

return input;
}
 
R

Richard Heathfield

Michael Brennan said:

Although it may not be the best way to handle memory, I'm
mostly interested to know if this code is correct, and safe.

I can't see anything "incorrect" about it, in terms of C rules. It does,
however, expose your program to the possibility of a memory exhaustion
attack, since you place no upper limit on the size of the block to be
allocated.

Also, there are a couple of places where you could speed it up.
Specifically, you could use an exponential allocation strategy rather
than linear, and you could avoid repeated calls to strlen and strcat by
keeping track of length information.

Anyway, why are you using fgets at all? Wouldn't the code be a lot
simpler and perhaps a touch quicker if you just used getc?
 
R

Richard Heathfield

Harald van D?k said:
You can if your code isn't strictly conforming ANSI C, you've forked a
process, and you're using pipes for communication between the parent
and the child.

You seem to believe that the program is only run in ways of which you
approve. Programs, despite being written by programmers, have a nasty
habit of ending up in the hands of users, and users don't as a rule
worry about such things as the flakiness of the programs they are
using. (Were this not the case, it would be impossible to explain the
popularity of Internet Explorer.)
Of course, if the only possible safe uses of gets() rely on extensions
to standard C, it wouldn't be bad to make gets() itself an extension
as well, so I'm not saying deprecating it was a bad idea.

I think any implementation offering gets() as an extension should itself
be deprecated - for low QoI.
 
R

Richard Bos

Flash Gordon said:
Dave Vandervies wrote, On 26/04/07 20:08:

I don't think it would be a big thing, just adding one sentence (or
paragraph at the most). It is not as if it would be that big a thing for
most implementations to implement, after all most can warn against the
things which have been deprecated anyway.

After all, we already do have _required_-fatal diagnostics, in the case
of #error. Adding the opposite should not be a problem.

Richard
 
R

Richard Bos

Gregor H. said:
Hear, hear. Well, sure, a gun is also completely safe as long you do
not aim at people with it.

In which context it is significant that people _do_ get killed with
their own guns, both by others and by themselves.

Richard
 
J

jacob navia

Richard said:
Michael Brennan said:




I can't see anything "incorrect" about it, in terms of C rules. It does,
however, expose your program to the possibility of a memory exhaustion
attack, since you place no upper limit on the size of the block to be
allocated.

If the hacker types 10 chars/second, it will take him/her
27.77 hours to get to 1MB, 11 days of continuous
typing to just allocate 10MB, and more than half a year
to type 100MB...

Today allocating 100MB is no problem at all, most machines
dispose of 2GB Virual memory. This means that you have
years to catch the hacker trying to exhaust your
memory by typing.

If stdin is redirected, that's another problem.

Also, there are a couple of places where you could speed it up.
Specifically, you could use an exponential allocation strategy rather
than linear, and you could avoid repeated calls to strlen and strcat by
keeping track of length information.

Typing speed is the limit quantity here
 
F

Flash Gordon

jacob navia wrote, On 27/04/07 07:25:
If the hacker types 10 chars/second, it will take him/her
27.77 hours to get to 1MB, 11 days of continuous
typing to just allocate 10MB, and more than half a year
to type 100MB...

Today allocating 100MB is no problem at all, most machines
dispose of 2GB Virual memory. This means that you have
years to catch the hacker trying to exhaust your
memory by typing.

If stdin is redirected, that's another problem.

The program does not ensure that stdin is not redirected. It also fails
to ensure that I have not reprogrammed the keyboard controller to
generate such input.

If a hacker has access to the keyboard s/he can probably redirect stdin,
if not it is a remote attack and there is no reason to assume it is a
person typing.
Typing speed is the limit quantity here

Only assuming input is someone typing, which frequently is not the case.

I'm not convinced using getc would be faster. C makes no such guarantee ;-)
 
¬

¬a\\/b

Francine wrote, On 26/04/07 17:53:

I and many others here disagree.


Most people here (even those who have argued for keeping gets) think
that it is only useful for quick throw-away programs (which are not a
problem as they are going to be thrown away anyway) and very specialised
situations (which being very specialised do not occur often and so do
not make up a lot of code). Any other code needs to be fixed anyway, so
the sooner it will not build the better since it will force it to be
fixed or discarded.


In such a situation you know the size of the buffer you have allocated
so is it really such a big problem to do:
fgets(buffer,sizeof buffer,stdin);
or
fgets(buffer,BUFFER_SIZE,stdin);

fgets should to return the number of char successfull written in
buffer; so the use is easier and should not be use strlen one more
time again

something like:

unsigned h;
char buf[2048];
if((h=fgets(buf, sizeof buf, stdin))==0 || buf[h-1]!='\n')
error();
"" error
"\n" ok
"fifjfj\n" ok
"jfofif^D" error
"jfofif^Z" error
2070 string error
etc

or something like:
char buf[2048], *h;
if((h=fgets(buf, sizeof buf, stdin))==0 || *h!='\n')
error();
 
¬

¬a\\/b

something like:

unsigned h;
char buf[2048];
if((h=fgets(buf, sizeof buf, stdin))==0 || buf[h-1]!='\n')
error();
"" error
"\n" ok
"fifjfj\n" ok
"jfofif^D" error
"jfofif^Z" error
2070 string error
etc

or something like:
char buf[2048], *h;
if((h=fgets(buf, sizeof buf, stdin))==0 || *h!='\n')
error();

i already have above functions
 
R

Richard Heathfield

Flash Gordon said:
I'm not convinced using getc would be faster. C makes no such
guarantee ;-)

Indeed it doesn't, which is why I said "perhaps". Nevertheless, the
possibility is a real one. fgets must copy the data into the supplied
buffer (as if by repeated calls to getc, IIRC), and then the OP copies
it out again into another buffer. Using getc directly, whilst not
guaranteed to produce any saving at all (let alone a significant
saving), is nevertheless highly likely so to do.
 
C

Chris Dollin

Malcolm said:
You've provided your own evidence.
Where?

You can't articulate why fgets() is dangerous,

You have no evidence for this claim.
or why it could be considered dangerous if you persist in disagreeing with me.

What? I think my parser is broken, or something.
A threat you can't recognise is much more deadly than one that you can.

/Some/ threats you can't recognise are more deadly than ones that
you can. This is not news.

Do you have an actual /reason/ for the claim you made in the nested
quote above?
 
C

Chris Dollin

Malcolm said:
No one else can articulate why fgets() is dangerous.
False.

Therefore there are two
possibilities.

False dichotomy.
The first is that it isn't dangerous at all, the second is
that it is especially dangerous and even experienced programmers often can't
see the problem.

The third is that you have to be awake when you use it, just like you
have to be awake when you use `+` or `if` or `*`.

`fgets` is "dangerous" in the same way that most code is.
 
M

Michael Brennan

Michael Brennan said:



I can't see anything "incorrect" about it, in terms of C rules. It does,
however, expose your program to the possibility of a memory exhaustion
attack, since you place no upper limit on the size of the block to be
allocated.

Also, there are a couple of places where you could speed it up.
Specifically, you could use an exponential allocation strategy rather
than linear, and you could avoid repeated calls to strlen and strcat by
keeping track of length information.

Points taken. Thank you!
Anyway, why are you using fgets at all? Wouldn't the code be a lot
simpler and perhaps a touch quicker if you just used getc?

Good point. I did not realize that when I wrote it, since my goal
was just to handle the case where the line exceeded the size I
passed to fgets(). But as it looks now I might just as well
use getc().

But that makes me wonder, how should I use fgets() if I want to
make sure that a too long input line won't break my program by
sending the trailing data to functions that expect something else?

I mean, if I just write this convenient line

fgets(buffer, sizeof buffer, stdin);

I'd have trailing data waiting if the user inputs a too long line.

I suppose I could just have a loop with getc() at the end that
discards the rest of the input. Is that a good way to do it
if you use fgets()?

/Michael Brennan
 
C

Christopher Benson-Manica

Richard Bos said:
After all, we already do have _required_-fatal diagnostics, in the case
of #error. Adding the opposite should not be a problem.

#ifndef __STDC__
#diagnostic Danger, Will Robinson!
#endif
 
R

Richard Heathfield

Michael Brennan said:
Good point. I did not realize that when I wrote it, since my goal
was just to handle the case where the line exceeded the size I
passed to fgets(). But as it looks now I might just as well
use getc().

But that makes me wonder, how should I use fgets() if I want to
make sure that a too long input line won't break my program by
sending the trailing data to functions that expect something else?

Um... don't? :) Just use your super-duper new function instead.
I mean, if I just write this convenient line

fgets(buffer, sizeof buffer, stdin);

I'd have trailing data waiting if the user inputs a too long line.

I suppose I could just have a loop with getc() at the end that
discards the rest of the input. Is that a good way to do it
if you use fgets()?

As ever, it depends. If the data matters, read it and process it. If
not, then throwing it away is reasonable. If you do decide to use fgets
*and* discover that you have not captured an entire line (because there
is no \n in your buffer) *and* you don't want the rest of the line,
then you can do something like:

int clreol(FILE *fp)
{
int ch;
while((ch = getc(fp)) != EOF && ch != '\n')
{
continue;
}
return ch == '\n';
}
 
G

Guest

Richard said:
Harald van D?k said:


You seem to believe that the program is only run in ways of which you
approve.

I don't. Are you perhaps assuming that the separate processes are
separate programs? If so, yes, running the intended child separately
would be unsafe, but there is no need to make separate programs
available. The same program can act both as parent and as child.
 
D

Dave Vandervies

You've provided your own evidence.
You can't articulate why fgets() is dangerous, or why it could be considered
dangerous if you persist in disagreeing with me. A threat you can't
recognise is much more deadly than one that you can.

Can you articulate why breathing air is dangerous?

No? Then it must be EXTREMELY dangerous. I think you should stop.


dave
 
D

Default User

Charlton said:
DU> As he humbly discards the possibility that he's one of the
DU> best programmers.

In my case, I have to. I know how inexpert I am, and the notion that
I might be one of the best is truly horrifying.

I'd be happy around here to be considered competent. On the other hand,
for my "skills assessment" at work, I blithely checked "expert" level
for C. I know my competition.





Brian
 
D

Default User

Malcolm McLean wrote:

No one else can articulate why fgets() is dangerous. Therefore there
are two possibilities. The first is that it isn't dangerous at all,
the second is that it is especially dangerous and even experienced
programmers often can't see the problem.


It's a fairly typical C function, especially of the string-handling
type. If you give the wrong size as the second argument, or pass in a
null or unintialized pointer, UB awaits with Nasty, Big, Pointy Teeth.

The only thing the user can do is give you something longer than can be
read a single gulp. So what? That's pretty minor on the list of things
C programmers have to watch out for.





Brian
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,780
Messages
2,569,611
Members
45,277
Latest member
VytoKetoReview

Latest Threads

Top