How to use scanf() safely?

iwinux

Hi.
Before I use scanf(), I must malloc the memory for it, like this:

//Start
char * buffer;

buffer = malloc(20);
scanf("%s", &buffer);
//End

As we know, if I type 30 characters in, something bad will happen.
So, how can I solve this problem?
(I mean, no matter how many characters you type in, it should work well.)
 
pemo

Don't use scanf, use something like fgets instead.
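
A minimal sketch of that suggestion, assuming a fixed 20-byte buffer like the one in the question. The helper name read_short_line is made up for illustration; it is not a standard function. fgets() never writes more than the size it is given, so typing 30 characters cannot overflow the buffer; the excess simply stays in the stream for the next read.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read at most size-1 characters of one line from fp
 * into buf and strip the trailing newline, if one was read.
 * Returns 1 on success, 0 on end of file or read error. */
int read_short_line(char *buf, int size, FILE *fp)
{
    if (fgets(buf, size, fp) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';   /* drop the newline fgets keeps */
    return 1;
}
```

Typical use would be something like `char buffer[20]; if (read_short_line(buffer, sizeof buffer, stdin)) ...`.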
 
Michael Mair

scanf() cannot easily be used in a safe manner.
See past discussions and the FAQ for this.
Usually, one just uses fgets() (or getchar() in a loop).
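
For the character-at-a-time variant, here is a rough sketch. It uses getc(fp) rather than getchar() so that it works on any stream (getchar() is equivalent to getc(stdin)), and the name get_line_truncated is only illustrative. It keeps what fits and throws the rest of the line away.

```c
#include <stdio.h>

/* Hypothetical helper: store at most size-1 characters of the next line in
 * buf, silently discarding the rest of the line. Requires size >= 1.
 * Returns the number of characters stored, or -1 if end of file was hit
 * before anything could be stored. */
int get_line_truncated(char *buf, size_t size, FILE *fp)
{
    size_t n = 0;
    int c;

    while ((c = getc(fp)) != EOF && c != '\n') {
        if (n + 1 < size)          /* keep one byte for the terminator */
            buf[n++] = (char)c;
    }
    buf[n] = '\0';
    if (c == EOF && n == 0)
        return -1;
    return (int)n;
}
```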

Back to scanf():
If the buffer size is known at compile time, you can use

#define stringize(s) #s
#define XSTR(s) stringize(s)
#define BUFSIZE 20

char *buffer = malloc(BUFSIZE+1);
if (buffer) {
    /* the format expands to "%20s"; note that buffer is passed without & */
    if (1 == scanf("%" XSTR(BUFSIZE) "s", buffer)) {
        do_something(buffer);
    }
}

Otherwise, if the limit is only known at run time, you can build the
format string first (bufSize is assumed to be an unsigned long):

int len;
char *format;
char *buffer;

len = 1 + snprintf(0, 0, "%%%lus", bufSize);
if (len > 0) {
    format = malloc(len);
    buffer = malloc(bufSize+1);
    if (format && buffer) {
        snprintf(format, len, "%%%lus", bufSize);
        if (1 == scanf(format, buffer)) {
            do_something(buffer);
        }
    }
}

Cheers
Michael
 
Eric Sosman

iwinux said:
Hi.
Before I use scanf(), I must malloc the memory for it, like this:

//Start
char * buffer;

buffer = malloc(20);

if (buffer == NULL) ...
scanf("%s", &buffer);

scanf ("%s", buffer); /* no & */
//End

As we know, if I type 30 characters in, something bad will happen.
So, how can I solve this problem?
(I mean, no matter how many characters you type in, it should work well.)

There's a whole suite of different things you can do.
One is to tell scanf() how much space is available:

scanf ("%19s", buffer); /* 19 + 1 == 20 */

This will prevent scanf() from trying to store characters
beyond the end of the allocated memory, but it still isn't
wonderful: If you type "supercalifragilisticexpialidocious"
the buffer will receive "supercalifragilisti" and a zero
byte, and then the next input operation will start with
"cexpial...". If you type "It is an Ancient Mariner" the
buffer will receive "It" and a zero byte, and the next
input operation will start with " is an...".
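
The two cases above can be reproduced without typing anything by running the same conversion over a string with sscanf(). This is only a sketch; scan_word19 is a made-up name, and %n is used to find out where the conversion stopped.

```c
#include <stdio.h>

/* Hypothetical helper: scan one whitespace-delimited word of at most 19
 * characters from src into word (which must hold at least 20 bytes).
 * Returns a pointer to the first unconsumed character of src, or NULL if
 * no word was found. */
const char *scan_word19(const char *src, char *word)
{
    int used = 0;

    /* %n stores the number of characters consumed so far */
    if (sscanf(src, "%19s%n", word, &used) != 1)
        return NULL;
    return src + used;
}
```

Called on "supercalifragilisticexpialidocious", word receives the first 19 characters and the returned pointer addresses "cexpialidocious"; called on "It is an Ancient Mariner", word receives "It" and the rest begins with " is an".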

Experience suggests that scanf() is *not* a good
function for interactive input. It is often better to
read a line at a time with fgets() (not with gets(),
mind you!) and then extract data from the complete
line, possibly with sscanf(). fgets() has its own set
of problems, but they are usually easier to deal with
than those of the much more complex scanf().
 
Richard Heathfield

iwinux said:
Hi.
Before I use scanf(), I must malloc the memory for it, like this:

//Start
char * buffer;

buffer = malloc(20);

What if malloc returns NULL?
scanf("%s", &buffer);

The & is incorrect.
//End

As we know, if I type 30 characters in, something bad will happen.

Right. Well, it might. Or it might try to lull you into a false sense of
security.
So, how can I solve this problem?
(I mean, no matter how many characters you type in, it should work well.)

There is always a limit, of course. But if you are prepared to abandon
scanf, you can make the limit sufficiently large for any practical purpose,
without having stupidly large static arrays around the place.

http://www.cpax.org.uk/prg/writings/fgetdata.php contains an article I wrote
which deals with precisely this problem, and which comes up with some
practical solutions.
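
To give a flavour of the general idea (this is not the code from the article, just a small sketch with a made-up function name), a line reader can start with a modest buffer and double it with realloc() whenever the line turns out to be longer:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one line of any length from fp into a buffer
 * that grows as needed. The newline is not stored.
 * Returns a malloc'ed string the caller must free, or NULL on end of file
 * (with nothing read) or on allocation failure. */
char *read_whole_line(FILE *fp)
{
    size_t cap = 32, len = 0;
    char *buf = malloc(cap);
    int c;

    if (buf == NULL)
        return NULL;
    while ((c = getc(fp)) != EOF && c != '\n') {
        if (len + 1 == cap) {              /* keep room for the '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }
    if (c == EOF && len == 0) {
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

The limit is then whatever malloc() is willing to give you, which is the "sufficiently large for any practical purpose" mentioned above.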
 
iwinux

There is always a limit, of course. But if you are prepared to abandon
scanf, you can make the limit sufficiently large for any practical purpose,
without having stupidly large static arrays around the place.

So it's not easy to deal with a very long string?
Such as a text editor.
 
Richard Heathfield

iwinux said:
So it's not easy to deal with a very long string?

Define "easy". It's easy for me. I don't know whether it's easy for you.
Such as a text editor.

<shrug> If you're writing a text editor, the ability to handle arbitrarily
long strings is the least of your worries.
 
lovecreatesbeauty

Eric said:
There's a whole suite of different things you can do.
One is to tell scanf() how much space is available:

scanf ("%19s", buffer); /* 19 + 1 == 20 */

This will prevent scanf() from trying to store characters
beyond the end of the allocated memory, but it still isn't
wonderful: If you type "supercalifragilisticexpialidocious"
the buffer will receive "supercalifragilisti" and a zero
byte, and then the next input operation will start with
"cexpial...". If you type "It is an Ancient Mariner" the
buffer will receive "It" and a zero byte, and the next
input operation will start with " is an...".

Before the next input operation, the program can discard the remaining
characters on the line, so that the next read starts from a clean beginning.
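
A sketch of what that could look like, written against a FILE * so it is not tied to stdin (the function name is invented for the example). One caveat it does not solve: %s skips leading whitespace, newlines included, so an empty line is not reported but silently skipped.

```c
#include <stdio.h>

/* Hypothetical helper: read one word of at most 19 characters from fp into
 * word (which must hold at least 20 bytes), then discard the rest of that
 * line so the next read starts on a fresh line.
 * Returns 1 if a word was read, 0 otherwise. */
int read_word_and_flush(FILE *fp, char *word)
{
    int ok = (fscanf(fp, "%19s", word) == 1);
    int c;

    while ((c = getc(fp)) != EOF && c != '\n')
        ;   /* throw away whatever is left on the line */
    return ok;
}
```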
Experience suggests that scanf() is *not* a good
function for interactive input. It is often better to
read a line at a time with fgets() (not with gets(),
mind you!) and then extract data from the complete
line, possibly with sscanf(). fgets() has its own set
of problems, but they are usually easier to deal with
than those of the much more complex scanf().

Do you think the fgets and sscanf combination is also a good candidate
for non-user-interactive input, e.g. file input? Which functions should
be used for file input? Thank you.
 
Eric Sosman

lovecreatesbeauty said:
Do you think the fgets and sscanf combination is also a good candidate
for non-user-interactive input, e.g. file input? Which functions should
be used for file input? Thank you.

It depends on the "provenance" of the file. It's perfectly
all right to use fscanf() directly if you're sure that the file
adheres to the expected format (or if you're willing to accept
the consequences of a deviation). If a program writes a file,
rewinds it, and reads it back again, fscanf() seems fine. If
Program A writes the file and a "related" Program B reads it,
fscanf() with bare-bones error-checking may be good enough (one
still needs some error-checking in case A 1.1 writes something
that B 1.0 can't digest).

If the file comes from an "unrelated" program, one must be
more cautious when reading it. If you write a program intending
that it be used as "vmstat 10 | myprogram" you must be on guard
against "vmstat -p 10 | myprogram" or "iostat -xn 5 | myprogram"
or even "myprogram < /etc/passwd". It is usually sufficient to
terminate with regrets when unexpected input is detected, but the
detection itself is also usually important ...

For "untrusted" line-oriented files, fgets() is a good place
to start because it captures the notion of "line." (Imperfectly,
in the case of lines too long for the provided buffer, but you
can write a little extra code to deal with that or to detect it
and say "This line of >1023 characters didn't come from vmstat.")
Once you've got the line sitting in a character array, C has a
good assortment of surgical tools for dissecting it: there's
sscanf(), strtok() -- I use it unashamedly, with care -- strchr(),
the <ctype.h> arsenal, strtod(), and all the rest.
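
As one small illustration of such dissection, here is a sketch that uses strtol() (a sibling of the strtod() mentioned above) together with isspace() from <ctype.h> to accept a line only if it consists of a single integer. The function name parse_int_line is hypothetical.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse a line that must contain exactly one decimal
 * integer, optionally surrounded by whitespace.
 * Returns 1 and stores the value in *out on success, 0 otherwise. */
int parse_int_line(const char *line, int *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(line, &end, 10);
    if (end == line)                  /* no digits at all */
        return 0;
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return 0;                     /* out of range for int */
    while (isspace((unsigned char)*end))
        end++;
    if (*end != '\0')                 /* trailing junk such as "42 BAD" */
        return 0;
    *out = (int)v;
    return 1;
}
```

Unlike a bare "%d" conversion, this reports both trailing garbage and out-of-range values, which is what makes "terminate with regrets" possible.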

In extreme cases, you might even write a full-fledged parser
that recognizes the input as matching (or failing to match) a
formal grammar, and possibly verifies other constraints as well --
the XML fad is founded on the desire to be able to do this sort
of thing in a fairly mechanical fashion. Such a parser might or
might not need the notion of "line;" it depends on the format.
 
lovecreatesbeauty

Eric said:
It is often better to
read a line at a time with fgets() (not with gets(),
mind you!) and then extract data from the complete
line, possibly with sscanf().

I once thought fgets and sscanf might be better than scanf alone. At
the moment, I do not have that feeling at all. sscanf and scanf come
from the same family, so the defects in scanf remain in sscanf. When a
user enters, e.g. "WHAT_VALUE_ABC", both fail:
scanf("%d", &i);
or
sscanf(buf, "%d", &i);

Either way, the program has to validate the data the user provided and
prompt the user to re-enter it when it is invalid. Isn't this
the right way?
 
Eric Sosman

Try the experiment yourself. For each of these
programs:

/* Program S */
#include <stdio.h>
int main(void) {
    int x;
    for (;;) {
        puts ("Enter a value:");
        if (scanf("%d", &x) == 1)
            break;
        puts ("Try again, please.");
    }
    printf ("The number is %d\n", x);
    return 0;
}

/* Program SS */
#include <stdio.h>
int main(void) {
    int x;
    for (;;) {
        char buff[100];
        puts ("Enter a value:");
        if (fgets(buff, sizeof buff, stdin) == buff
            && sscanf(buff, "%d", &x) == 1)
            break;
        puts ("Try again, please.");
    }
    printf ("The number is %d\n", x);
    return 0;
}

... enter WHAT_VALUE_ABC at the first prompt and 42 at
the second. Are there any differences in behavior? If
so, which behavior do you think is more useful in an
interactive setting? Why?
 
pete

lovecreatesbeauty said:
Before the coming input operation, the program can clear the remainder
characters and has a correct beginning.

scanf can be used more powerfully than that:

/* BEGIN new.c */
/*
** If rc equals 0, then an empty line was entered
** and the array contains garbage.
** If rc equals EOF, then the end of file was reached.
** If rc equals 1, then there is a string in array.
** Up to LENGTH number of characters are read
** from a line of a text file or stream.
** If the line is longer than LENGTH,
** then the extra characters are discarded.
*/
#include <stdio.h>

#define LENGTH 80
#define str(x) # x
#define xstr(x) str(x)

int main(void)
{
    int rc;
    char array[LENGTH + 1];

    puts("The LENGTH macro is " xstr(LENGTH));
    fputs("Enter a string with spaces:", stdout);
    fflush(stdout);
    rc = scanf("%" xstr(LENGTH) "[^\n]%*[^\n]", array);
    if (!feof(stdin)) {
        getchar();
    }
    while (rc == 1) {
        printf("Your string is:%s\n\n"
               "Hit the Enter key to end,\nor enter "
               "another string to continue:", array);
        fflush(stdout);
        rc = scanf("%" xstr(LENGTH) "[^\n]%*[^\n]", array);
        if (!feof(stdin)) {
            getchar();
        }
        if (rc == 0) {
            *array = '\0';
        }
    }
    return 0;
}

/* END new.c */
 
lovecreatesbeauty

/* scanf and sscanf are very similar. I can think of two differences
between them: one is that sscanf needs one more argument, the other is the
difference demonstrated by the example code. But that can be fixed, see
line 9. Please correct me if I am wrong. */

/* Program S.2 */
#include <stdio.h>
int main(void) {
    int x;
    for (;;){
        puts("Enter a value:");
        if (scanf("%d", &x) == 1)
            break;
        while ((x = getchar()) != '\n' && x != EOF) ; /*line 9*/
        puts ("Try again, please.");
    }
    printf ("The number is %d\n", x);
    return 0;
}
 
Eric Sosman

Good: You've spotted the difference -- but you haven't
thought about it enough yet. Exercise: Modify the program
to read an integer from one line and a double from another,
prompting with "Enter an integer" and "Enter a double".
Test it by entering "42" on the first line and "42.0" on
the second. Then run it again, but this time enter "4 2"
on the first line. Run it a third time, entering "42 BAD"
on the first line and "BAD 42.0" on the second. Run it a
fourth time, entering " " at each prompt. Try to emit error
messages that describe as accurately as possible just how the
input differs from what the program expects.

The fundamental reason that fscanf() is not very good for
interactive input is that much interactive input is line-oriented,
but fscanf() is very nearly oblivious to line boundaries. fgets()
can provide the line awareness and then sscanf() can perform the
parsing, with the knowledge that it's operating on a line and not
on a stream of input that crosses an arbitrary number of line
boundaries, possibly more or fewer than you were expecting.
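
One possible way to put this into practice is sketched below. The helper names are invented, and the 100-byte line buffer is an arbitrary choice; longer lines are not handled here. Each helper reads exactly one line with fgets() and then lets sscanf() parse it, using " %n" to check that nothing but whitespace follows the number.

```c
#include <stdio.h>

/* Hypothetical helpers: each reads exactly one line from fp and accepts it
 * only if the whole line is a single number. Both return 1 on success,
 * 0 for a malformed line, and EOF when no line could be read. */
int read_int_line(FILE *fp, int *out)
{
    char line[100];
    int used = 0;

    if (fgets(line, sizeof line, fp) == NULL)
        return EOF;
    /* " %n" skips trailing whitespace, then records how much was consumed */
    if (sscanf(line, "%d %n", out, &used) != 1 || line[used] != '\0')
        return 0;
    return 1;
}

int read_double_line(FILE *fp, double *out)
{
    char line[100];
    int used = 0;

    if (fgets(line, sizeof line, fp) == NULL)
        return EOF;
    if (sscanf(line, "%lf %n", out, &used) != 1 || line[used] != '\0')
        return 0;
    return 1;
}
```

Because every call consumes one whole line, a bad first line such as "4 2" or "42 BAD" is rejected on its own and cannot leak into the parsing of the second line.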

It is *possible* to do interactive input with fscanf(),
just as it is *possible* to write full-fledged C programs without
for, do, while, and if. Nobody will forbid you to indulge in
self-imposed hardships if that's your pleasure, but many will
wonder why you insist on doing things the hard way.
 
websnarf

iwinux said:
Before I use scanf(), I must malloc the memory for it, like this:

//Start
char * buffer;

buffer = malloc(20);
scanf("%s", &buffer);
//End

As we know, if I type 30 characters in, something bad will happen.
So, how can I solve this problem?

As with any problem, to solve it you must first understand the nature
of the problem. scanf() forces all destination variables to be
predeclared before the input starts. So using scanf() itself is the
source of the problem. In general it's preferable to obtain the input
by some other method (an iterated fgets is possible, but hardly
ideal) then use *sscanf()* AFTER deciding on how much memory to malloc
for your destinations.

(Another problem is that, more than likely, you don't want scanf()'s
parsing semantics. Strings are terminated by white space with scanf()
for some inexplicable reason.)
(I mean, no matter how many characters you type in, it should work well.)

Anyhow, first let's start with getting a full line of input safely (C
doesn't have any built-in provisions for doing this):

http://www.pobox.com/~qed/userInput.html

The key point being that using fgetstralloc(), you know the length of
the input and have the entire contents of the input in one shot (most
other programming languages have a built-in mechanism for doing this,
BTW). From there you can estimate the destination sizes, or use
strcspn() to help you parse before you figure out exactly how much
memory you need for your destination parameters, then use sscanf() or
whatever to extract the exact results.
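
To illustrate only the strcspn() step (this sketch does not use or reproduce fgetstralloc(), and dup_first_field is a name made up here): once the whole line is in memory, the length of a field can be measured first and the destination allocated to fit.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy the first field of line, up to but not
 * including the first character found in delims, into freshly allocated
 * memory of exactly the right size.
 * Returns NULL on allocation failure; the caller frees the result. */
char *dup_first_field(const char *line, const char *delims)
{
    size_t n = strcspn(line, delims);   /* length of the field */
    char *field = malloc(n + 1);

    if (field != NULL) {
        memcpy(field, line, n);
        field[n] = '\0';
    }
    return field;
}
```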
 
