safe scanf( ) or gets

Eric Boutin

Hi ! I was wondering how to quickly and safely use a safe scanf( ) or gets
function... I mean.. if I do :

char a[256];
scanf("%s", a);
and the user inputs a 257-char string..
that creates a problem.. same for gets..

even if you create a char array that's 99999999999999 chars long.. if the
user inputs something longer it will still be a bug.. and I don't want
this..

<OT>
C++ has std::string, which dynamically reallocates itself when it grows
too big, but what about us ?
</OT>

I thought about using a character input function on stdin, appending each
character to the end of a string, and if the string gets too small,
realloc( ) a bigger one.. however it is quite annoying to do this each time
I want to read input.. yes, I could create a function for this.. and that's
what I'm going to do.. however I was wondering what you C experts do to
avoid a segfault or a bug in such a situation

thanks !
 
nrk

Eric said:
Hi ! I was wondering how to quickly and safely use a safe scanf( ) or
gets
function... I mean.. if I do :

I don't know of any safe way to use gets. Use fgets instead where you can
specify the maximum number of characters to read into your buffer. With
scanf, the same can be achieved by specifying a maximum field width in the
conversion specifier (see below).
char a[256];
scanf("%s", a);
and the user input a 257 char string..
that creates a problem.. same for gets..

even if you create a char array that's 99999999999999 char long.. if the
user input something longer it will still be a bug.. and I don't want
this..

You can avoid this problem by specifying the maximum field width in the
conversion specifier:
scanf("%254s", a);
which will read a maximum of 254 characters into "a". Read the
documentation of scanf for more details.
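
For illustration, a minimal sketch of the field-width approach. The function name read_word and the macro WORD_MAX are made up for this example, and it uses fscanf on a FILE * so it can be tried on any stream (fscanf(stdin, ...) behaves like scanf). It uses a width of 255, the largest that fits a 256-byte array; the 254 above is equally safe. The width in the format string has to be kept in step with the array size by hand.

```c
#include <stdio.h>

#define WORD_MAX 255    /* hypothetical name; must match the width below */

/* Reads one whitespace-delimited word of at most WORD_MAX characters.
   Returns 1 on success, 0 on EOF or read error. */
int read_word(FILE *fp, char buf[WORD_MAX + 1])
{
    /* %255s stores at most 255 characters plus the terminating '\0',
       so a 256-byte buffer cannot overflow. */
    return fscanf(fp, "%255s", buf) == 1;
}
```

Note that anything beyond the width stays in the stream and is picked up by the next read, so an over-long word arrives in pieces rather than being rejected.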
<OT>
C++ has std::string, which dynamically reallocates itself when it grows
too big, but what about us ?
</OT>

You'll have to roll your own unfortunately... but... (see below)
I thought about using a character input function on stdin, appending each
character to the end of a string, and if the string gets too small,
realloc( ) a bigger one.. however it is quite annoying to do this each time
I want to read input.. yes, I could create a function for this.. and that's
what I'm going to do.. however I was wondering what you C experts do to
avoid a segfault or a bug in such a situation

Several regulars in this group (CBFalconer, Richard Heathfield, Morris
Dovey) have developed functions that do something along the lines of what
you want. Even if you want to roll your own, search the archives for a
thread with subject "Reading a line from a file" to get the URLs for the
same, to get a feel for how to go about it.
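
To give a rough idea of the shape such a function takes, here is a minimal sketch. It is not any of the functions written by the people named above; the name read_line, the starting capacity of 64 and the doubling policy are all assumptions made for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads one line of any length from fp into a malloc'd string.
   The trailing '\n' is not stored. Returns NULL at EOF when nothing
   was read, or on allocation failure; the caller frees the result. */
char *read_line(FILE *fp)
{
    size_t cap = 64, len = 0;   /* arbitrary starting capacity */
    char *buf = malloc(cap);
    int c;

    if (buf == NULL)
        return NULL;

    while ((c = getc(fp)) != EOF && c != '\n') {
        if (len + 1 == cap) {   /* keep one byte free for '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }

    if (c == EOF && len == 0) { /* nothing left to read */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

Calling read_line(stdin) in a loop until it returns NULL, and freeing each result, covers the original question. A real version would probably also want an upper limit on the line length so that hostile input cannot exhaust memory.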

-nrk.

ps: This question seems to crop up so often around here that perhaps it
should be added to the FAQ?
 
Malcolm

Eric Boutin said:
Hi ! I was wondering how to quickly and safely use a safe scanf( ) or
gets function... I mean.. if I do :

The real answer is that stdin is seldom used in real programs. If the
program takes a few parameters from the user these are passed on the command
line, if it needs a large number of inputs these are given in an ASCII file,
and if it really needs interactivity then it uses a GUI.

There are plenty of functions knocking around that read an arbitrary-length
string from stdin. You only have to write these once.

The advice to replace gets() with a call to fgets() and throw away the
trailing '\n' is bad, since you replace undefined behaviour with wrong
behaviour on overflow. To use fgets() properly you have to take action on
overflow, which makes the program complex.
 
Martin Dickopp

Malcolm said:
The real answer is that stdin is seldom used in real programs.

I strongly disagree. In fact, to get any /real/ work done with a computer,
programs which read from standard input and write to standard output (AKA
filter programs) are absolutely mandatory, IMO.

Without filter programs, computers would be useless to me. (I would also
be unemployed, because my work would be impossible.)
if it needs a large number of inputs these are given in an ASCII file,

I don't understand. A text file can contain arbitrarily long lines, just
like standard input. How does reading from a file instead of standard
input change the situation?

(In fact, on many operating systems, standard input can be redirected
from a file, and a file name is provided for the terminal, so IMHO it
doesn't make much sense to distinguish between standard input and named
files.)
and if it really needs interactivity then it uses a GUI.

Again, if this is supposed to be general advice (it sounds as if it is,
sorry if I misunderstood you), I strongly disagree. Many people (including
myself) prefer non-GUI programs to GUI programs.

Martin
 
Malcolm

Martin Dickopp said:
I strongly disagree. In fact, to get any /real/ work done with a
computer, programs which read from standard input and write to
standard output (AKA filter programs) are absolutely mandatory,
IMO.

Sounds like someone knows about a world which I know nothing about.
Again, if this is supposed to be general advice (it sounds as if it is,
sorry if I misunderstood you), I strongly disagree. Many people
(including myself) prefer non-GUI programs to GUI programs.

Then we realise that we are more probably dealing with an eccentric. GUIs
have swept the board for interactive programs.
I don't know about filter programs - maybe in mainframe environments with
non-user generated stdin. As a games programmer I would never use nor write
a program written in such a fashion.
 
Martin Ambuhl

Malcolm said:
gets function... I mean.. if I do :

The real answer is that stdin is seldom used in real programs. If the
program takes a few parameters from the user these are passed on the command
line, if it needs a large number of inputs these are given in an ASCII file,
and if it really needs interactivity then it uses a GUI.

This is grossly untrue. *Many* real programs are filters, taking stdin as
the default source.
There are plenty of functions knocking around that read an arbitrary-length
string from stdin. You only have to write these once.

The advice to replace gets() with a call to fgets() and throw away the
trailing '\n' is bad, since you replace undefined behaviour with wrong
behaviour on overflow. To use fgets() properly you have to take action on
overflow, which makes the program complex.

This is ridiculous. Only someone who doesn't know how to discard the '\n'
properly could have written such drivel. Of course it is not "bad" to call
fgets() and discard the trailing '\n'. What is necessary is to decide what
to do when there is no trailing '\n', and the level of complexity involved
need not be large at all.
 
Irrwahn Grausewitz

Malcolm said:
Sounds like someone knows about a world which I know nothing about.

You don't know about operating systems providing command line
interfaces? Can't believe that.
Then we realise that we are more probably dealing with an eccentric.

Nobody preferring a console interface over a GUI is an eccentric,
but someone who knows about the power of command lines.
GUIs
have swept the board for interactive programs.

stdin and interactive input are not equivalent. In a typical
environment input and output of a program are often redirected
to/from other sources. Remember: stdin/stdout/stderr are streams
which may be connected to a console, or a physical file. From C's
POV there's no difference, hence the rule: never use gets (for
suitable values of 'never').
I don't know about filter programs - maybe in mainframe environments with
non-user generated stdin.

A lot of standard command line utilities on a vast number of OSs are
filter programs. For example, you virtually can't do anything useful
on a typical *nix system without using stream filters, e.g. grep,
head, tail, sed, awk, gzip, more, cut, sort, ...
As a games programmer I would never use nor write
a program written in such a fashion.

Not all the world is a Wintel box. ;-)

Regards
 
Malcolm

Irrwahn Grausewitz said:
Nobody preferring a console interface over a GUI is an eccentric,
but someone who knows about the power of command lines.

No, it's eccentric. Users generally won't accept command line programs unless
forced to use them. A GUI is generally far easier to use - I'm typing this
into a GUI newsreader.
A lot of standard command line utilities on a vast number of OSs are
filter programs. For example, you virtually can't do anything useful
on a typical *nix system without using stream filters, e.g. grep,
head, tail, sed, awk, gzip, more, cut, sort, ...

Well grep you would usually invoke with the name of the file to search.
"more" does use redirection, and it is a quirky thing to use - basically a
patch on the other utilities not being GUI. I have never had any reason to
use the other utilities mentioned.
Not all the world is a Wintel box. ;-)

Just the vast majority of general-purpose computers in use today. Even jobs
that used to require a mainframe can now often be done on PCs. Things like
supermarket checkouts and airport information screens are often PCs
underneath.
The vast majority of medium-sized systems that aren't PCs are probably games
consoles. They don't use command lines either. Nor do mobile phones.
 
Malcolm

Martin Ambuhl said:
This is ridiculous. Only someone who doesn't know how to discard
the '\n' properly could have written such drivel.

Well, the FAQ said to "replace gets() with a call to fgets()", and
discarded the trailing '\n', which means that undefined behaviour on
overflow is very likely to be replaced by incorrect behaviour on overflow.

If an experienced programmer like Steve Summit can't get it right, then I
think we can say that fgets() is difficult to use.
Of course it is not "bad" to call fgets() and discard the trailing '\n'.
What is necessary is to decide what to do when there is no trailing '\n',

So how do you determine if there is no trailing '\n', if it's been
discarded?
and the level of complexity involved need not be large at all.

It depends what you mean. If you are comparing to some sort of analysis of
equations used in particle physics then, no, it's not complicated. If you
mean that it adds substantial extra hassle to what should be a simple
process of getting a line from the user, then, yes, using fgets() properly
is complicated. You need to check the '\n', then discard the remainder of
the line, report an error message to the user (probably), and then loop to
get another line.
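
The steps listed above (check for the '\n', discard the rest of an over-long line, report it, loop) can be sketched roughly as follows. The name get_line and its return convention are hypothetical, chosen only for this example; the point is that the newline is tested for before it is removed, so the information is not lost.

```c
#include <stdio.h>
#include <string.h>

/* Reads one line into buf (size bytes) and strips the trailing '\n'.
   Returns 1 if a complete line was stored, 0 if the line was too long
   (the rest of it is discarded), and EOF at end of input. */
int get_line(FILE *fp, char *buf, size_t size)
{
    char *nl;
    int c;

    if (fgets(buf, (int)size, fp) == NULL)
        return EOF;

    nl = strchr(buf, '\n');
    if (nl != NULL) {           /* newline present: the whole line fit */
        *nl = '\0';
        return 1;
    }

    /* No newline: the line either filled buf or was the last line of
       the input. Look at the next character to tell the cases apart. */
    c = getc(fp);
    if (c == EOF || c == '\n')
        return 1;               /* exact fit, or unterminated last line */
    while (c != '\n' && c != EOF)
        c = getc(fp);           /* throw away the rest of the long line */
    return 0;
}
```

A caller would loop on get_line(stdin, buf, sizeof buf), printing an error and asking again whenever it returns 0, and stopping when it returns EOF.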
 
Arthur J. O'Dwyer

Careful about those generalizations, Irrwahn -- I bet I could
provide a few counter-examples if provoked. :)
No, it's eccentric. Users generally won't accept command line programs unless
forced to use them. A GUI is generally far easier to use - I'm typing this
into a GUI newsreader.

Yes, and that's precisely one of the examples I was going to
bring up prevthread, as one of your apps requiring "more complicated
input" than your average non-GUI app can deliver. Also text editors
and programs computation- or I/O-heavy enough to really *require* a
progress indicator (e.g., my system's default invocation of 'wget').
Line-based text editors do exist, but IMHO only eccentrics really
*do* use those. ;-)
However, I frequently use a command-line compiler (gcc), which
is IMHO orders of magnitude more powerful and user-friendly than
the typical Visual offering. And of course if you've ever used
an MS-DOS Command Prompt on your Wintel box, you've seen how programs
like 'copy' and 'dir' can be useful from time to time. :)

Well grep you would usually invoke with the name of the file to search.

Yes, but you may not have known that you can *also* invoke 'grep'
like this:

c:\> grep 'hello' < myfile.txt
or
c:\> dir /s | grep "myprog.exe" | sort

which last is a complex operation which to my knowledge cannot be
performed by any of the out-of-the-box GUI tools on Windows XP
(although third-party tools exist, of course). See, command-line
tools can be very useful for the everyday tasks of people who know
and use computers every day -- even if gamers don't need them.

Just the vast majority of general-purpose computers in use today. Even jobs
that used to require a mainframe can now often be done on PCs. Things like
supermarket checkouts and airport information screens are often PCs
underneath.
The vast majority of medium-sized systems that aren't PCs are probably games
consoles. They don't use command lines either. Nor do mobile phones.

BZZT. How do you think those phones are programmed? I'm willing
to bet, even though I don't know, that the guys who work for Nokia
or whatever have on their desks little gray boxes with cords that
plug into ports on the phones and interface with the phone's
file system at a rudimentary command line level. Because line-
driven shells are easy to write, and GUIs are hard (generally
speaking).
And a note in passing to look up "Linux" sometime -- it looks
like it might be becoming more popular in both the commercial and
home markets. :)

-Arthur
 
Arthur J. O'Dwyer

This is grossly untrue. *Many* real programs are filters, taking stdin as
the default source.

However, do many of those programs use gets() or scanf()? My
experience is that the simpler ones often use state machines based on
getc(), and the more complicated ones often allow the user to
specify files either directly or implicitly, which means using a
FILE * variable, which rules out scanf() and gets() right off the
bat (since they *only* read from stdin -- not that that doesn't
rule out fscanf(stdin,...), of course).
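
As an illustration of the getc()-based style, here is a small sketch of a two-state machine that counts words on any FILE *. The function name count_words and the choice of word counting as the task are assumptions for the example, not taken from any particular program.

```c
#include <ctype.h>
#include <stdio.h>

/* Counts whitespace-separated words on fp with a two-state machine.
   No line buffer is involved, so line length is irrelevant. */
unsigned long count_words(FILE *fp)
{
    enum { OUTSIDE, INSIDE } state = OUTSIDE;
    unsigned long words = 0;
    int c;

    while ((c = getc(fp)) != EOF) {
        if (isspace(c)) {
            state = OUTSIDE;
        } else if (state == OUTSIDE) {
            state = INSIDE;     /* first character of a new word */
            words++;
        }
    }
    return words;
}
```

Because the function takes a FILE *, the same code serves count_words(stdin) in a pipeline and a file opened with fopen().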
This is ridiculous. Only someone who doesn't know how to discard the '\n'
properly could have written such drivel. Of course it is not "bad" to call
fgets() and discard the trailing '\n'. What is necessary is to decide what
to do when there is no trailing '\n', and the level of complexity involved
need not be large at all.

Right. Remember, often the goal is *not* to read a whole line of
input no matter what. A lot of the time, the goal is to process some
reasonable set of data that a benevolent user is inputting. So the
pseudocode goes like this:

    Try to get some data.
    Are the data reasonable?
        Process them.
    else
        Reject them.

This is *easily* implemented in C as something like

    p = fgets(buffer, sizeof buffer, fp);
    /* strchr (from <string.h>) avoids indexing buffer[-1] on an empty string */
    if (p && strchr(buffer, '\n'))
        process(buffer);
    else
        do_error("Line too long, or EOF reached\n");

No problem! The fact that we raise an error on long lines,
instead of trying to read the line anyway, doesn't matter,
because a too-long line is in most cases an indication of
badly formed input. (A program source file, or an ASCII text
file, with 1000-character lines, is almost certainly malformed
or flat-out malicious. In applications where this is not
the case, of course, one must take appropriate measures.)

The gets()/fgets() issue is that replacing gets() with
fgets() replaces UNdefined behavior with DEFINED behavior,
and that's a *gigantic* step in the right direction. Dealing
with malicious input in a domain-specific way may be important,
but the fact that now an attacker *cannot* crash your program
is more important than the fact that he'll get a snide error
message when he tries.

-Arthur
 
Martin Ambuhl

Malcolm said:
So how do you determine if there is no trailing '\n', if it's been
discarded?

This question demonstrates that you have no clue. Please learn something
about programming before posting your opinions, which have all been
ill-considered so far.
 
Malcolm

Arthur J. O'Dwyer said:
However, I frequently use a command-line compiler (gcc), which
is IMHO orders of magnitude more powerful and user-friendly than
the typical Visual offering.

It is easy to write a bad GUI. Visual C is fine most of the time, but
occasionally won't let you into the system. For instance, I once made the
mistake of allowing it to create a "shell" project for me. It created a
header called "stdafx.h", then when I added a pre-existing source file,
started complaining about precompiled headers not being found and all sorts
of nuisance. After about an hour messing about with it I finally gave up,
deleted it, and started over by hand. Functions designed to save time, such
as pre-compiled headers and wizards, actually became time wasters. This is
quite common.
Using a non-GUI compiler, for a simple program it is of course easier to
type "cc foo.c" and everything will work. For a complex program you would
use make and a makefile (not stdin). Makefiles aren't the easiest things to
write and maintain, though at least you are in control.
Yes, but you may not have known that you can *also* invoke 'grep'
like this:

c:\> grep 'hello' < myfile.txt
or
c:\> dir /s | grep "myprog.exe" | sort

And that's the sort of reason why users won't accept command lines.
Using Windows "find" we can look for files or folders named "myprog.exe",
and the GUI leads us through it. We can then double-click on the icons it
brings up to see what's in them. However you do lose some flexibility - you
can't pipe to your own version of "sort" which handles numerical values as
numbers (so x2 is before x11), for instance.
BZZT. How do you think those phones are programmed? I'm willing
to bet, even though I don't know, that the guys who work for Nokia
or whatever have on their desks little gray boxes with cords that
plug into ports on the phones and interface with the phone's
file system at a rudimentary command line level.

Never programmed a phone, but it probably works on the same principle as a
console.
You have a card which fits into the PC, and downloads the program into the
phone. A phone won't have a file system in the sense that you mean. Often
there'll be lots of turning the phone on and off to clear its memory.
Because line-driven shells are easy to write, and GUIs are hard
(generally speaking).

There's some truth in this. If you have a market of only a few hundred for
your compiler then it's hard to justify writing an expensive GUI for it.
However, often it will be written so that it plugs into the VC++ GUI.
Similarly the program to download the executable into the phone might be
command line, though generally the command line programs are being replaced
by GUIs.
And a note in passing to look up "Linux" sometime -- it looks
like it might be becoming more popular in both the commercial and
home markets. :)

It's only likely to succeed if it can protect the user from needing to use
the command shell. I can't use it since I need to run programs that only
work on Windows.
 
Martin Ambuhl

Arthur said:
However, do many of those programs use gets() or scanf()?

Irrelevant. The grossly untrue statement is "stdin is seldom used in real
programs." There is nothing in that statement referring to gets() or
scanf(). His "real answer" was either a lie or a sign that he has no idea
what he is talking about.

The "real answer" is to learn to use stdin correctly; claiming that it is
seldom used is either a lie or stupid.
 
Malcolm

Arthur J. O'Dwyer said:
The gets()/fgets() issue is that replacing gets() with
fgets() replaces UNdefined behavior with DEFINED behavior,
and that's a *gigantic* step in the right direction.

Defined wrong behaviour can be worse than undefined behaviour. The goal is
not just to prevent the computer from crashing, but to prevent any sort of
undesired behaviour. A crash is actually the least dangerous malfunction -
no different to the computer blowing a fuse. What is really dangerous is
wrong but reasonable-looking results.
 
Malcolm

Martin Ambuhl said:
The "real answer" is to learn to use stdin correctly; claiming that it is
seldom used is either a lie or stupid.

stdin is commonly used in beginner programs. It is old-fashioned and is
seldom used in most areas of computing. Interaction with computers is via a
GUI, and lengthy input is by files, and generally it does not make sense to
pipe these to standard input.
You may be working with a legacy system where modern practises haven't
arrived, but don't pretend that this is common.
 
Mark McIntyre

No, it's eccentric. Users generally won't accept command line programs unless
forced to use them.

Donkeys gonads. Users accept any blasted program that they're paid to
operate. /Your/ users may be a collection of gollums hunched over
their gamepads, but mine live out in the real world, where they're
hired to, say trade bonds, and just fscking well get on with it using
a megaphone and some counting beads if we say so. Of course, generally
we let them have an HP desk calculator and a telephone if they start
making money. :)
A GUI is generally far easier to use -

Pardon me, but b*llsh*t. I have an office full of users who require
fulltime assistance just to log their computers on in the morning. And
all of these guys have degrees.
GUIs are notoriously difficult to master. Few people get beyond the
basics, and still do most of their actual work by typing stuff in.
Just the vast majority of general-purpose computers in use today. Even jobs
that used to require a mainframe can now often be done on PCs. Things like
supermarket checkouts and airport information screens are often PCs
underneath.

True, but irrelevant.
The vast majority of medium-sized systems that aren't PCs are probably games
consoles.

Ha! You'd better come down off that ivory tower soon ....
They don't use command lines either. Nor do mobile phones.

Pray tell me, how /do/ you send a text without typing?
 
Mark McIntyre

It's only likely to succeed if it can protect the user from needing to use
the command shell.

You may want to pop over to the Mac groups sometime before making
absurd statements like this. Mac users are actually glad to have a CLI at
last.
I can't use it since I need to run programs that only
work on Windows.

SoftWindows.
 
Martin Dickopp

Malcolm said:
lengthy input is by files, and generally it does not make sense to
pipe these to standard input.

On the contrary, the larger the amount of data to be processed, the more
sense it makes to pipe it to standard input. Imagine a pipe of several
filter programs; each one has its stdin connected to the stdout of the
previous program in the pipe and its stdout connected to the stdin of the
next one. If it were necessary to create a file at each intermediate step,
that would be a great waste of both performance and disk space.

Martin
 
Eric

Mark McIntyre said:
You may want to pop over to the Mac groups sometime before making
absurd statements like this. Mac users are actually glad to have a CLI at
last.

Not true.

I know several Mac users who hate it and everything about it and the
fact that sometimes they have to use it.

However, others, like me, are happy with the UNIX core and the extra
access to some great software it offers.

Malcolm is, of course, correct. A CLI system will never be popular
among the general population as the vast majority of people will never
want to deal with the learning curve because they have no need for the
extra abilities it provides.
 
