Newbie-question: scanf alternatives?

merman

Hi,

I'm a C-Newbie. What is the best (and securest) method to read a string
from commandline? A code-sample would be cool.

Thanks for help.

Thomas
 
Richard Bos

merman said:
I'm a C-Newbie. What is the best (and securest) method to read a string
from commandline? A code-sample would be cool.

From the command line, you use the parameters from main(). From standard
input, you use fgets(). How you use it depends on what you want from it.

Richard
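
A minimal sketch of that distinction, assuming the program wants a single name: use argv[1] if one was given, otherwise read a line from a stream with fgets(). The helper name get_name() and its parameters are made up for illustration and are not from the thread.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy a name into buf (size n, n > 0).
 * Uses argv[1] when present, otherwise reads one line from in.
 * Returns buf on success, NULL on end-of-file or read error. */
char *get_name(int argc, char **argv, FILE *in, char *buf, size_t n)
{
    if (argc > 1) {
        /* command line: the string is already in memory */
        strncpy(buf, argv[1], n - 1);
        buf[n - 1] = '\0';
        return buf;
    }
    /* standard input: fgets() never writes more than n bytes */
    if (fgets(buf, (int)n, in) == NULL)
        return NULL;
    buf[strcspn(buf, "\n")] = '\0';   /* drop the newline, if any */
    return buf;
}
```

From main(int argc, char **argv) this would be called as get_name(argc, argv, stdin, name, sizeof name), with name declared as a char array.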
 
merman

There is a question:

char *name;

printf("\nEnter your name > ");

scanf("%s", name);

or

gets(name);

What is most common way to read this?

Thomas
 
pete

merman said:
Hi,

I'm a C-Newbie.
What is the best (and securest) method to read a string
from commandline? A code-sample would be cool.

Use Pop's Device. It goes like this:

#define LENGTH 3 /* LENGTH doesn't have to be 3 */
#define str(x) # x
#define xstr(x) str(x)

int rc;
char array[LENGTH + 1];
rc = scanf("%" xstr(LENGTH) "[^\n]%*[^\n]", array);
if (!feof(stdin)) {
    getchar();
}

scanf reads the characters up to the newline or up to LENGTH,
whichever is less. If LENGTH is shorter than the input line,
then second format specifier in the scanf call,
will eat and ignore the rest of the characters up to the newline.
And then, getchar eats the newline character.

If there is an end of file condition, rc will be EOF.
If the enter key is hit with no other input, rc will be zero.
If rc equals one, then you have a string in array.
If rc doesn't equal one then array isn't guaranteed to have a string.

/* BEGIN grade.c */

#include <stdio.h>
#include <stdlib.h>

#define LENGTH 3
#define str(x) # x
#define xstr(x) str(x)

int main(void)
{
    int rc, number;
    char array[LENGTH + 1];
    const char letter[4] = "DCBA";

    fputs("Enter the Numeric grade: ", stdout);
    fflush(stdout);
    rc = scanf("%" xstr(LENGTH) "[^\n]%*[^\n]", array);
    if (!feof(stdin)) {
        getchar();
    }
    while (rc == 1) {
        number = (int)strtol(array, NULL, 10);
        if (number > 60) {
            if (number > 99) {
                number = 99;
            }
            array[0] = letter[(number - 60) / 10];
            switch (number % 10) {
            case 0:
            case 1:
            case 2:
                array[1] = '-';
                array[2] = '\0';
                break;
            case 7:
            case 8:
            case 9:
                array[1] = '+';
                array[2] = '\0';
                break;
            default:
                array[1] = '\0';
                break;
            }
        } else {
            array[0] = 'F';
            array[1] = '\0';
        }
        printf("The Letter grade is: %s\n", array);
        fputs("Enter the Numeric grade: ", stdout);
        fflush(stdout);
        rc = scanf("%" xstr(LENGTH) "[^\n]%*[^\n]", array);
        if (!feof(stdin)) {
            getchar();
        }
    }
    return 0;
}
 
Michael Mair

Hello,

There is a question:

char *name;

You want to point this somewhere; either allocate memory or point
it to a char array.
printf("\nEnter your name > ");

scanf("%s", name);

or

gets(name);

What is most common way to read this?

One is as bad as the other. Never, ever use gets.
Never.
scanf() can be "healed" (see below).


#include <string.h> /* for strlen() */

#define NAMELEN 30
.....

    char name[NAMELEN+1];
    size_t len;

    printf("\nEnter your name > ");
    if (fgets(name, NAMELEN+1, stdin) == NULL) {
        /* Deal with the error */
    }

    len = strlen(name);
    if (len == 0 || name[len-1] != '\n') {
        /* name is not long enough to contain what the user typed
        ** or we read in '\0' before '\n' */
    }
    else
        name[len-1] = '\0';

    printf("I read %s\n", name);

.....

Read the corresponding stuff in the FAQ why it is a bad idea
to mix scanf and fgets. If you'd rather use scanf in the same
way, you can use the stringize preprocessing operator to
set the field width for the scanf to the appropriate value:

#define STRINGIZE(s) # s
#define MAKESTRING(s) STRINGIZE(s)

#define NAMELEN 30
#define NAMEFORMAT "%"MAKESTRING(NAMELEN)"s"
.....

char name[NAMELEN+1];

printf("\nEnter your name > ");
if (scanf(NAMEFORMAT,name) != 1) {
    /* Deal with the error */
}

printf("I read %s\n", name);

.....


Cheers,
Michael
 
CBFalconer

merman said:
There is a question:

char *name;

printf("\nEnter your name > ");

scanf("%s", name);

or

gets(name);

What is most common way to read this?

Nobody has the vaguest idea what you are responding to, due to the
absence of any quotations. However ...

The code you show is seriously broken. First and foremost, NEVER
EVER use gets(). It is impossible to use correctly. Secondly,
your usage stores gets (or scanf) output via a pointer that has
never been initialized, and thus is guaranteed to cause undefined
behavior regardless of the user's input.
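
To make the second point concrete, here is a rough sketch of giving name real storage before anything is read into it, with fgets() standing in for gets(). The function name read_name_alloc() and the size 64 are illustrative assumptions, not something from the thread.

```c
#include <stdio.h>
#include <stdlib.h>

#define NAME_MAX_LEN 64   /* illustrative size, not from the thread */

/* Read one line from in into freshly allocated storage.
 * Returns the buffer (caller frees it) or NULL on failure.
 * An alternative with no allocation at all:
 *     char storage[NAME_MAX_LEN + 1];
 *     char *name = storage;
 */
char *read_name_alloc(FILE *in)
{
    char *name = malloc(NAME_MAX_LEN + 1);   /* name now points somewhere */

    if (name == NULL)
        return NULL;
    if (fgets(name, NAME_MAX_LEN + 1, in) == NULL) {
        free(name);
        return NULL;
    }
    return name;   /* may still end in '\n' */
}
```

Either way, the pointer refers to memory the program owns before fgets() writes through it, and the size passed to fgets() matches that memory.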
 
Malcolm

merman said:
There is a question:

char *name;

printf("\nEnter your name > ");

scanf("%s", name);

or

gets(name);

What is most common way to read this?

char name[1024];

printf("Enter your name >");
fflush(stdout);

gets(name);

will read in the string.

(Note the call to fflush(): if the prompt has no trailing newline, it is
required to ensure the output is visible.)

The problem comes when the user enters more than 1023 characters. For some
applications, this is more theoretical than real, but for code that you
release to a third party it is essential to think about it, since some
malicious person could deliberately crash your program, even on some systems
hack into the system (because the overflow overwrites the function return
address, allowing arbitrary code to be run, if you know what you are doing).

fgets() will fix this problem, but adds a new one. What if over 1023
characters are entered, and the partly-read input is processed as whole? The
results are quite likely to be much worse than the undefined behaviour that
results from using gets(), since undefined behaviour is usually correct
behaviour (terminate the offending program with an error message), whilst no
operating system can guard against coded incorrect behaviour, such as
chopping off one of the hundred names of the Indian god brumin-brah and
getting you torn to pieces by his devotees for blasphemy.

Fortunately, fgets() leaves a trailing newline in the buffer, to indicate
that it has read the line correctly.

So what we need to do is

if(!strrchr(name, '\n'))
{
    fprintf(stderr, "Input too long\n");
    exit(EXIT_FAILURE);
}
 
Chris Torek

fgets() will fix this problem, but adds a new one. What if over 1023
characters are entered, and the partly-read input is processed as whole? The
results are quite likely to be much worse than the undefined behaviour that
results from using gets() ...

s/worse/better/ :)

Seriously, which is more annoying: that your program produces bad
output based on bad input, or that the latest malware takes over
your machine, pops up 10,000,000 porn windows, etc? I would much
rather have garbage output from garbage input, than the latest
security breach.

Fortunately, fgets() leaves a trailing newline in the buffer, to indicate
that it has read the line correctly.

So what we need to do is

if(!strrchr(name, '\n'))
{
    fprintf(stderr, "Input too long\n");
    exit(EXIT_FAILURE);
}

Or, instead of exiting, just consume up to and including the next
newline:

void discard_a_line(FILE *fp) {
    int c;

    while ((c = getc(fp)) != '\n' && c != EOF)
        continue;
}

and do something reasonable -- whatever that may be -- with the
partial input line in "name".
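
Putting the newline check and the discard step together might look like the sketch below. The wrapper name read_line() and its return convention are assumptions made for illustration; discard_a_line() is repeated only so the fragment is self-contained.

```c
#include <stdio.h>
#include <string.h>

/* Consume the rest of the current line, up to and including '\n'. */
static void discard_a_line(FILE *fp)
{
    int c;

    while ((c = getc(fp)) != '\n' && c != EOF)
        continue;
}

/* Hypothetical wrapper: read one line into buf (size n, n > 1).
 * Returns 1 for a complete line, 0 if the line was cut short
 * (the excess is discarded), EOF if nothing could be read.
 * The newline is never stored. */
int read_line(char *buf, size_t n, FILE *fp)
{
    char *nl;
    int c;

    if (fgets(buf, (int)n, fp) == NULL)
        return EOF;
    nl = strchr(buf, '\n');
    if (nl != NULL) {
        *nl = '\0';
        return 1;
    }
    /* No newline stored: either the line fit exactly, the input
     * ended without a newline, or there is more of this line. */
    c = getc(fp);
    if (c == '\n' || c == EOF)
        return 1;
    discard_a_line(fp);
    return 0;
}
```

A return value of 0 tells the caller that buf holds only the start of the line, so it can reject the input, ask again, or carry on with the partial text, whichever is reasonable for the program.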
 
Malcolm

Chris Torek said:
Seriously, which is more annoying: that your program produces bad
output based on bad input, or that the latest malware takes over
your machine, pops up 10,000,000 porn windows, etc? I would much
rather have garbage output from garbage input, than the latest
security breach.

The danger with fgets() is that it simply truncates over-long input. For
some applications, this can result in reasonable-seeming but wrong values,
which is the worst possible case (if you send a gas bill for six billion
dollars to an old granny then it is merely embarrassing; if you send a bill
for two hundred and thirty dollars when the real amount is one hundred and
ninety, you could easily end up in court facing awkward questions).

If you make the buffer so large that no reasonable input would overflow it,
like using a thousand characters for a name, then you are probably OK. If
the input isn't so easily bounded, such as an English-language sentence,
then you can easily not be OK.
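
One way to stay out of the reasonable-but-wrong case is to treat a cut-off line as an error rather than as data. A rough sketch for a numeric field follows; the function name read_long() and the buffer size of 32 are assumptions for illustration only.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parser: read one line from fp and convert it to long.
 * Returns 1 and stores the value on success; returns 0 if the line
 * did not fit the buffer, was empty, was not a number, or was out of
 * range; returns EOF at end of input. It errs on the side of
 * rejecting: a line that fills the buffer without a newline is
 * refused even if it happens to be the last thing in the input. */
int read_long(FILE *fp, long *out)
{
    char buf[32];
    char *end;
    long v;

    if (fgets(buf, sizeof buf, fp) == NULL)
        return EOF;
    if (strchr(buf, '\n') == NULL && !feof(fp)) {
        int c;

        while ((c = getc(fp)) != '\n' && c != EOF)
            continue;
        return 0;   /* truncated: refuse to guess */
    }
    errno = 0;
    v = strtol(buf, &end, 10);
    if (end == buf || errno == ERANGE)
        return 0;   /* no digits, or value out of range */
    if (*end != '\n' && *end != '\0')
        return 0;   /* trailing junk after the number */
    *out = v;
    return 1;
}
```

With this shape, an over-long line never reaches strtol(), so a truncated 190 cannot quietly turn into some other plausible amount.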
 
Edmund Bacon

Malcolm said:
The danger with fgets() is that it simply truncates over-long input. For
some applications, this can result in reasonable-seeming but wrong values,
which is the worst possible case (if you send a gas bill for six billion
dollars to an old granny then it is merely embarrassing, if you send a bill
for two hundred and thirty dollars when the real amount is one hundred and
ninety, you could easily end up in court facing awkward questions).

Perhaps you just need to know how to use fgets() appropriately:

consider:


$ cat sample.c

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *get_input(void)
{
    char *ptr = 0;
    char buff[10] = {0};
    size_t len = 0;

    fputs("--> ", stdout);
    fflush(stdout);

    while( buff[strlen(buff) -1 ] != '\n')
    {
        fgets(buff, sizeof buff, stdin);
        len += strlen(buff);
        ptr = realloc(ptr, len+1); /* error checking omitted */
        strcat(ptr, buff);
    }

    if(ptr)
        ptr[len-1] = '\0'; /* strip trailing newline */

    return ptr;
}

int main()
{

    while( !feof(stdin) )
    {
        char *ptr = get_input();

        if(ptr)
        {
            printf( "input was: \"%s\" : %d characters\n",
                    ptr, strlen(ptr));
            free(ptr);
        }
    }

    return 0;
}

$ gcc sample.c -o sample -Wall -W -pedantic -ansi

$ sample
--> a
input was: "a" : 1 characters
--> abcdefgji
input was: "abcdefgji" : 9 characters
--> abcdefghijklmnopqrstuvwxyz
input was: "abcdefghijklmnopqrstuvwxyz" : 26 characters
-->

It doesn't appear that "fgets() is truncating over long input."
At least not on my system.

If this were going into production, I'd want to handle feof() more
gracefully, I'd definitely want to do something if realloc() failed,
and I would set my input buffer to a reasonably large number (perhaps
1024), so that I'm only calling realloc once for most input. But I
don't have to worry about buffer over-runs, or (within the limits of
alloc()) truncating user input.
 
Ravi Uday

<snip>

Dude i have tried with the following inputs.
Why is it not working..

bash-2.02$ gcc -ansi -Wall fget_s.c
bash-2.02$ ./a.exe
--> hello
input was: "hello" : 5 characters
--> damn
input was: "hell" : 4 characters
--> damn
input was: "damn" : 4 characters
--> damn
input was: "damn" : 4 characters
--> dam
input was: "dam" : 3 characters
--> hello
input was: "damhe" : 5 characters
--> hello
input was: "damhe" : 5 characters
--> ello
input was: "damh" : 4 characters
--> h
input was: "d" : 1 characters
--> hello
input was: "dhell" : 5 characters
-->

- Ravi
 
CBFalconer

Malcolm said:
The danger with fgets() is that it simply truncates over-long input.
For some applications, this can result in reasonable-seeming but
wrong values, which is the worst possible case (if you send a gas
bill for six billion dollars to an old granny then it is merely
embarrassing, if you send a bill for two hundred and thirty dollars
when the real amount is one hundred and ninety, you could easily end
up in court facing awkward questions).

If you make the buffer so large that no reasonable input would
overflow it, like using a thousand characters for a name, then you
are probably OK. If the input isn't so easily bounded, such as an
English-language sentence, then you can easily not be OK.

The simple answer is to use a routine that ensures you get a
complete line. I have made available ggets (and fggets) which
safely ensure that only complete lines are input, and combines the
simplicity of gets with the safety of fgets. Written in
completely standard C, and available at:

<http://cbfalconer.home.att.net/download/>
 
Dan Pop

In said:
NEVER, NEVER, NEVER use gets. Look up fgets instead.

Don't bother: fgets() is broken by design, too. There is NO alternative
to scanf() in the standard C library. If you can't be bothered to learn
how to use scanf() properly, write your own readline() using getc(). Just
make sure it's better designed than fgets() :)

Dan
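
For readers wondering what such a function could look like, here is one possible sketch built on getc(). The name read_line() and its return convention are assumptions made for illustration; this is not Dan's code and not any standard or library function.

```c
#include <stdio.h>

/* Hypothetical read_line(): store at most n-1 characters of the next
 * line in buf, always consume the whole line, never store the newline.
 * Returns the number of characters in the full line (so a result
 * larger than n-1 means the stored text was truncated), or EOF if
 * the input was already exhausted. */
long read_line(char *buf, size_t n, FILE *fp)
{
    long total = 0;
    size_t used = 0;
    int c;

    while ((c = getc(fp)) != EOF && c != '\n') {
        if (used + 1 < n)
            buf[used++] = (char)c;   /* keep what fits */
        total++;                     /* count everything */
    }
    if (n > 0)
        buf[used] = '\0';
    if (c == EOF && total == 0)
        return EOF;
    return total;
}
```

Because the loop always runs to the end of the line, nothing is left behind in the stream, and the caller can tell from the return value whether the stored text is the whole line.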
 
Dan Pop

The danger with fgets() is that it simply truncates over-long input.

fgets() would be a nice function if it did that. Unfortunately, the
non-consumed input from the same line is left in the stream, to be read
as a brand new line by the next fgets() or whatever call.

fgets() is a great function for applications that only need to read one
line of input from each stream and don't care if the line was too long
to fit in the buffer.

Dan
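
The behaviour described here is easy to observe. The sketch below counts how many fgets() calls are needed to get through a single line when the buffer is small; the function name calls_for_one_line() and the buffer size of 8 are made up for the illustration.

```c
#include <stdio.h>
#include <string.h>

/* Count how many fgets() calls it takes to get through the next
 * line of fp when the buffer holds at most 7 characters plus '\0'.
 * Each call after the first is handed the leftover of the same line. */
int calls_for_one_line(FILE *fp)
{
    char buf[8];
    int calls = 0;

    while (fgets(buf, sizeof buf, fp) != NULL) {
        calls++;
        if (strchr(buf, '\n') != NULL)
            break;   /* this piece ended the line */
    }
    return calls;
}
```

A line of 26 letters plus a newline takes four calls with this buffer (7 + 7 + 7 characters, then the last 5 and the newline), so a caller that assumes one call equals one line would see four "lines".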
 
Flash Gordon

<snip>

Perhaps you just need to know how to use fgets() appropriately:

    char *ptr = calloc(1,1);
    char buff[10] = {0};
    size_t len = 0;

    fputs("--> ", stdout);
    fflush(stdout);

    while( buff[strlen(buff) -1 ] != '\n')
    {
        fgets(buff, sizeof buff, stdin);
        len += strlen(buff);
        ptr = realloc(ptr, len+1); /* error checking omitted */

as you say, failure would need to be dealt with.

        strcat(ptr, buff);
    }

    if(ptr)
        ptr[len-1] = '\0'; /* strip trailing newline */

    return ptr;
}

int main()
{

    while( !feof(stdin) )
    {
        char *ptr = get_input();

        if(ptr)
        {
            printf( "input was: \"%s\" : %d characters\n",
                    ptr, strlen(ptr));
            free(ptr);
        }
    }

    return 0;
}

$ gcc sample.c -o sample -Wall -W -pedantic -ansi

Dude i have tried with the following inputs.
Why is it not working..

bash-2.02$ gcc -ansi -Wall fget_s.c
bash-2.02$ ./a.exe
--> hello
input was: "hello" : 5 characters
--> damn
input was: "hell" : 4 characters

<snip>

It failed because the first time through the input loop realloc provides
an uninitialised block of memory which the code then proceeds to use
strcat to write to. Probably because of security realloc is returning a
pointer to memory which starts with a null character (and I would guess
might be all nulls) on the first call, but it reuses the memory that was
freed and contains data from previous times the input function was
called.

The fix I've done of callocing a 1 byte buffer (which will therefore
contain a 0 length string) avoids that problem. However, the code will
still be horribly inefficient if passed a long line and is nothing like
what I would write.
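
For comparison, here is a sketch of the same idea that never passes uninitialised memory to a string function, checks realloc(), and copies each chunk to a known offset instead of rescanning with strcat(). The name get_line() and the chunk size of 128 are assumptions for illustration; this is not the code Flash Gordon had in mind.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read one whole line of any length from fp.
 * Returns a malloc'd string without the trailing newline (caller
 * frees it), or NULL on end-of-file with no data or on allocation
 * failure. */
char *get_line(FILE *fp)
{
    char chunk[128];
    char *line = NULL;
    size_t len = 0;

    while (fgets(chunk, sizeof chunk, fp) != NULL) {
        size_t n = strlen(chunk);
        char *tmp = realloc(line, len + n + 1);

        if (tmp == NULL) {
            free(line);
            return NULL;
        }
        line = tmp;
        memcpy(line + len, chunk, n + 1);   /* copies the '\0' too */
        len += n;
        if (len > 0 && line[len - 1] == '\n') {
            line[len - 1] = '\0';           /* strip the newline */
            break;
        }
    }
    return line;
}
```

When printing the length, note that strlen() returns size_t, so "%d" is the wrong conversion; under -ansi, casting to unsigned long and using "%lu" is one portable choice.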
 
Douglas G

Dan said:
In <[email protected]> Randy Howard


Don't bother: fgets() is broken by design, too. There is NO alternative
to scanf() in the standard C library. If you can't be bothered to learn
how to use scanf() properly, write your own readline() using getc(). Just
make sure it's better designed than fgets() :)

Dan

As another newbie, would it be advisable to use getchar() and set it up to
use a loop that stops at the buffer length or terminating character that you
set up?
 
Richard Bos

As another newbie, would it be advisable to use getchar() and set it up to
use a loop that stops at the buffer length or terminating character that you
set up?

Don't bother. Use fgets() instead. Dan Pop is just about the only poster
in this group who thinks fgets() is broken; most of the rest of us think
scanf() is unnecessarily complicated unless your requirements involve
single line lengths only, and fgets() does exactly what it should do.

Richard
 
Michael Wojcik

If you make the buffer so large that no reasonable input would overflow it,
like using a thousand characters for a name, then you are probably OK.

This is precisely the attitude which has made so many C programs
security risks. No one who believes it should be writing C code.

--
Michael Wojcik (e-mail address removed)

Thanatos, thanatos! The labourer, dropping his lever,
Hides a black letter close to his heart and goes,
Thanatos, thanatos, home for the day and for ever. -- George Barker
 
