reading from a text file

googler

I'm trying to read from an input text file and print it out. I can do
this by reading each character, but I want to implement it in a more
efficient way. So I thought my program should read one line at a time
and print it out. How can I do this? I wrote the code below but it's
not correct since the fscanf reads one word (terminating in whitespace
or newline) at a time, instead of reading the whole line.

#include <stdio.h>
void main(void)
{
    char str[120];

    FILE *fp = fopen("test.txt", "r");

    while (!feof(fp))
    {
        fscanf(fp, "%s\n", str);
        printf("%s", str);
    }
}

So I get an output like "Thisisatest.Itisnotworking.", when the input
file contains:
This is a test.
It is not working.

Any suggestion is appreciated. Thanks.
 
Skarmander

googler said:
I'm trying to read from an input text file and print it out. I can do
this by reading each character, but I want to implement it in a more
efficient way. So I thought my program should read one line at a time
and print it out. How can I do this? I wrote the code below but it's
not correct since the fscanf reads one word (terminating in whitespace
or newline) at a time, instead of reading the whole line.
Any suggestion is appreciated. Thanks.

fgets.

S.
 
googler

Skarmander said:
fgets.

S.

Sorry, I should have used fgets.. my bad.

One more question. My code looks like:
while (!feof(fp))
{
    fgets(str, 120, fp);
    printf("%s", str);
}

This prints the last line twice. I don't understand why. When it prints
the last line for the first time, it should have known that the end of
the file has been reached, so the next condition check for the while
loop should have failed. Why is it still entering the while loop and
printing the last line again?

Thanks.
 
granvillemw

Don't use feof. You can always use fgets to check if it gets to the end
of the file.

for example,
while( fgets(str, 120,fp) )
{
    printf("%s", str);
}

cheers
 
AllaTurca

Reading a line at a time and reading a character at a time do not
differ much in efficiency, because the stdio functions already buffer
the file: they read a block (much more than you asked for) at once and
keep it in memory. Use whichever one fits your need. If you just want
to print the file to the screen, the two will not differ much, though
fgets should be a little better. And over a lot of processing, many
"a little"s add up.
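
As a minimal sketch of the character-at-a-time approach, assuming the stream uses stdio's default buffering (the function name copy_by_char is made up for illustration):

```c
#include <stdio.h>

/* Copy in to out one character at a time. With default buffering,
   most getc() calls are served from a block already held in memory
   rather than going back to the file each time.
   Returns the number of characters copied. */
long copy_by_char(FILE *in, FILE *out)
{
    int c;      /* int, not char, so EOF can be told apart from data */
    long n = 0;

    while ((c = getc(in)) != EOF) {
        putc(c, out);
        n++;
    }
    return n;
}
```

Calling copy_by_char(fp, stdout) prints the whole file, spaces and newlines included.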
 
granvillemw

Actually, I should have explained that when your last line was printed
the second time, fgets had returned NULL while the contents of "str"
stayed the same.

A workaround for your problem that keeps the feof would be as
follows:
while ( !feof(fp) )
{
    if( fgets(str, 120, fp) )
        printf("%s", str);
}
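
To see the difference concretely, here is a minimal sketch that counts how many times each style of loop would print. The names count_feof_loop and count_fgets_loop are made up for illustration, and the extra pass assumes the file ends with a newline, as text files usually do.

```c
#include <stdio.h>

/* Pattern from the question: test feof() before reading.
   Returns how many times the loop body would print. */
int count_feof_loop(FILE *fp)
{
    char str[120] = "";
    int prints = 0;

    while (!feof(fp)) {
        if (fgets(str, sizeof str, fp) == NULL) {
            /* Failure ignored on purpose, as in the original loop:
               str still holds the previous line at this point. */
        }
        prints++;
    }
    return prints;
}

/* Suggested pattern: let the return value of fgets() end the loop. */
int count_fgets_loop(FILE *fp)
{
    char str[120];
    int prints = 0;

    while (fgets(str, sizeof str, fp) != NULL)
        prints++;
    return prints;
}
```

For the two-line test file from the question, the first function should report one print more than the second, because the end-of-file indicator is only set by the read that fails.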
 
Richard Heathfield

googler said:
I'm trying to read from an input text file and print it out. I can do
this by reading each character, but I want to implement it in a more
efficient way. So I thought my program should read one line at a time
and print it out. How can I do this? I wrote the code below but it's
not correct since the fscanf reads one word (terminating in whitespace
or newline) at a time, instead of reading the whole line.

Others have already answered your question, but nobody appears to have
pointed out yet that...
#include <stdio.h>
void main(void)

...in C, main returns int, not void. This is a common error, and those who
commit it often find it hard to believe that it's wrong. Nevertheless, no C
compiler is required to accept a main function that returns void unless it
specifically documents that acceptance - which few, if any, do.
 
Chris Torek

One more question. [The following code fragment, slightly edited for space]
while (!feof(fp)) {
    fgets(str, 120, fp);
    printf("%s", str);
}
... prints the last line twice. I don't understand why. When it prints
the last line for the first time, it should have known that the end of
the file has been reached,

What makes you believe that? Remember that fgets() is (loosely)
defined as:

int c, i = 0;
while ((c = getc(fp)) != EOF) {
    line[i++] = c;
    if (c == '\n')
        break;
}
line[i] = '\0';

(of course fgets() avoids overrunning your buffer, so it is a little
more complicated than that, but assume for the moment that no input
lines will be overlong).
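
For the curious, the "little more complicated" version might look roughly like the sketch below. It is not how any particular library implements fgets(); the name my_fgets is made up to avoid clashing with the real function, and error handling is simplified.

```c
#include <stdio.h>

/* Sketch of fgets() semantics: store at most n - 1 characters,
   stop after a newline, always terminate the string.
   Returns line, or NULL if nothing could be read. */
char *my_fgets(char *line, int n, FILE *fp)
{
    int c;
    int i = 0;

    if (n <= 0)
        return NULL;
    while (i < n - 1 && (c = getc(fp)) != EOF) {
        line[i++] = (char)c;
        if (c == '\n')
            break;              /* the newline is kept in the buffer */
    }
    if (i == 0 && n > 1)
        return NULL;            /* EOF or error before any character */
    line[i] = '\0';
    return line;
}
```

Characters that do not fit are simply not read yet; they stay in the stream for the next call.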

Suppose the stdio "FILE *fp" is connected to a human being (at the
keyboard, or a terminal, or behind a network connection via telnet
or ssh or whatever), who has some way of signalling "end of file"
while still remaining connected (on many OSes this is done by
entering ^Z or ^D or @EOF or some similar character or string as
the only input on a line). The human types:

abc[enter]

so the first fgets() reads "abc\n". The fgets() call returns.
What should feof(fp) be?

The human *might* be *about* to press ^D or ^Z or type @EOF or
whatever it is that will signal EOF. Should feof(fp) wait until
he does so? What should it do if, instead, he types "def" and
presses ENTER?

You are effectively expecting the feof() predicate to predict the
future. There is no way for it to do that. It *could*, of course,
try to read input from the file -- in effect, waiting for the human
to signal EOF or enter "def\n". But it does not do that. Predicting
the future is too difficult. C is a simple language. It is much
easier to "predict" the past ... so that is what feof() does!
Instead of telling you "a future attempt to read is not going to
work because EOF is coming up", it tells you "a previous attempt
to read that failed, failed because EOF came up."

Suppose, now, that instead of a human, the stdio FILE *fp is
connected to a file on a floppy disk (or CD-ROM or DVD or whatever).
Suppose further that the floppy has been corrupted (someone used
a magnet to hold it up on the fridge, or scratched the CD-ROM, or
whatever). Your program/OS knows that the file should be 271483
bytes long, but partway in, the media turns out to be unreadable.
The fgetc() function -- or its getc() equivalent -- will return
EOF, indicating that it is unable to continue reading.

What should feof(fp) be? The file size is known (271483 bytes)
but you have at this point successfully read only 65536 bytes.
Should feof(fp) return nonzero (true)? You have not reached the
end of the file!

As before, feof() does not try to predict the future; instead, it
"predicts" the past. It tells you whether the getc() that returned
EOF did so because of end-of-file. In this case, it is *not* the
end of the file -- so feof(fp) is 0 (i.e., false). The other
predicate, ferror(fp), will be nonzero (i.e., true). It is
"predicting" the past, and telling you that the getc() failed due
to error. (Of course, the ability to distinguish between "normal
end of file" and "error reading data" is O/S and sometimes filesystem
or device specific, but it is fairly common.)

Because feof() only tells you about *previous* failures, and --
worse -- only tells you about EOF and not about errors, any loop
of the form:

while (!feof(fp))

is virtually *guaranteed* to be wrong. If you ever see this in
C code, be very suspicious.
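
The usual alternative is to read first and ask questions afterwards. A minimal sketch, with a made-up function name (copy_lines) and an arbitrary buffer size:

```c
#include <stdio.h>

/* Copy lines from in to out until a read fails, then find out why.
   Returns 0 if input ended normally, -1 if a read error stopped it. */
int copy_lines(FILE *in, FILE *out)
{
    char buf[120];

    while (fgets(buf, sizeof buf, in) != NULL)
        fputs(buf, out);

    /* Only now, after a read has failed, do feof() and ferror()
       have something to report. */
    if (ferror(in))
        return -1;
    return 0;       /* the failure was end-of-file */
}
```

Here feof() and ferror() are consulted only once, after the loop, which is the job they were designed for.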

As for why the last line prints twice, well, that one is a FAQ. :)
 
pete

googler said:
I'm trying to read from an input text file and print it out. I can do
this by reading each character, but I want to implement it in a more
efficient way. So I thought my program should read one line at a time
and print it out. How can I do this? I wrote the code below but it's
not correct since the fscanf reads one word (terminating in whitespace
or newline) at a time, instead of reading the whole line.

#include <stdio.h>
void main(void)
{
char str[120];

FILE *fp = fopen("test.txt", "r");

while (!feof(fp))
{
fscanf(fp, "%s\n", str);
printf("%s", str);
}
}

So I get an output like "Thisisatest.Itisnotworking.", when the input
file contains:
This is a test.
It is not working.

Any suggestion is appreciated. Thanks.

/* BEGIN new.c */

#include <stdio.h>
#include <stdlib.h>

#define SOURCE "test.txt"
#define LINE_LEN 119
#define str(s) # s
#define xstr(s) str(s)

int main(void)
{
    int rc;
    FILE *fd;
    char line[LINE_LEN + 1];

    fd = fopen(SOURCE, "r");
    if (fd == NULL) {
        fprintf(stderr,
            "\nfopen() problem with \"%s\"\n", SOURCE);
        exit(EXIT_FAILURE);
    }
    do {
        rc = fscanf(fd, "%" xstr(LINE_LEN) "[^\n]%*[^\n]", line);
        if (!feof(fd)) {
            getc(fd);
        }
        if (rc == 0) {
            *line = '\0';
            ++rc;
        }
        printf("%s\n", line);
    } while (rc == 1);
    fclose(fd);
    return 0;
}

/* END new.c */
 
pete

pete said:
I'm trying to read from an input text file and print it out.
I can do
this by reading each character, but I want to implement it in a more
efficient way. So I thought my program should read one line at a
time
and print it out. How can I do this? I wrote the code below but it's
not correct since the fscanf reads one word
(terminating in whitespace
or newline) at a time, instead of reading the whole line.

#include <stdio.h>
void main(void)
{
char str[120];

FILE *fp = fopen("test.txt", "r");

while (!feof(fp))
{
fscanf(fp, "%s\n", str);
printf("%s", str);
}
}

So I get an output like "Thisisatest.Itisnotworking.",
when the input
file contains:
This is a test.
It is not working.

Any suggestion is appreciated. Thanks.

/* BEGIN new.c */

new.c outputs the last line double.
I'm working on it.
 
pete

pete said:
do {
rc = fscanf(fd, "%" xstr(LINE_LEN) "[^\n]%*[^\n]", line);
if (!feof(fd)) {
getc(fd);
}
if (rc == 0) {
*line = '\0';
++rc;
}
printf("%s\n", line);
} while (rc == 1);

/*
** The following shows all of the different values that rc can have
** and also fixes the double output of the last line.
*/

do {
    rc = fscanf(fd, "%" xstr(LINE_LEN) "[^\n]%*[^\n]", line);
    if (!feof(fd)) {
        getc(fd);
    }
    if (rc == 0) {
        *line = '\0';
    }
    if (rc != EOF) {
        printf("%s\n", line);
    }
} while (rc == 1 || rc == 0);
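
The three return values are perhaps easier to see with the read step pulled out into a helper. This is only a sketch of the same idea; the function name scan_line is made up, while LINE_LEN, str and xstr are the macros from new.c.

```c
#include <stdio.h>

#define LINE_LEN 119
#define str(s) # s
#define xstr(s) str(s)

/* Read one line of at most LINE_LEN characters into line, which must
   have room for LINE_LEN + 1 chars. Excess characters are discarded
   and the newline is consumed.
   Returns 1 for a non-empty line, 0 for an empty line, EOF when no
   more input is available. */
int scan_line(FILE *fd, char *line)
{
    int rc = fscanf(fd, "%" xstr(LINE_LEN) "[^\n]%*[^\n]", line);

    if (!feof(fd))
        getc(fd);           /* swallow the newline left in the stream */
    if (rc == 0)
        *line = '\0';       /* empty line: fscanf stored nothing */
    return rc;
}
```

A caller would then loop with something like: while ((rc = scan_line(fd, line)) != EOF) printf("%s\n", line);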
 
bildad

googler said:
#include <stdio.h>
void main(void)
{
char str[120];

FILE *fp = fopen("test.txt", "r");

while (!feof(fp))
{
fscanf(fp, "%s\n", str);
printf("%s", str);
}
}

So I get an output like "Thisisatest.Itisnotworking.", when the input
file contains:
This is a test.
It is not working.

Any suggestion is appreciated. Thanks.

/* BEGIN new.c */

#include <stdio.h>
#include <stdlib.h>

#define SOURCE "test.txt"
#define LINE_LEN 119
#define str(s) # s
#define xstr(s) str(s)

int main(void)
{
int rc;
FILE *fd;
char line[LINE_LEN + 1];

fd = fopen(SOURCE, "r");
if (fd == NULL) {
fprintf(stderr,
"\nfopen() problem with \"%s\"\n", SOURCE);
exit(EXIT_FAILURE);
}
do {
rc = fscanf(fd, "%" xstr(LINE_LEN) "[^\n]%*[^\n]", line);
if (!feof(fd)) {
getc(fd);
}
if (rc == 0) {
*line = '\0';
++rc;
}
printf("%s\n", line);
} while (rc == 1);
fclose(fd);
return 0;
}

/* END new.c */

This is my solution, after research. Criticism welcome.

#include <stdio.h>

#define MAX_LEN 120

void ReadFile(FILE *fp);
int ErrorMsg(char *str);

int main(void)
{
    FILE *fp;
    char filename[]= "test.txt";

    if ((fp = fopen(filename, "r")) == NULL){
        ErrorMsg(filename);
    } else {
        ReadFile(fp);
        fclose(fp);
    }
    return 0;
}

void ReadFile(FILE *fp)
{
    char buff[MAX_LEN];

    while (fgets(buff, MAX_LEN, fp)) {
        printf("%s", buff);
    }
}

int ErrorMsg(char *str)
{
    printf("Cannot open %s.\n", str);
    return;
}
 
pete

bildad said:
googler said:
#include <stdio.h>
void main(void)
{
char str[120];

FILE *fp = fopen("test.txt", "r");

while (!feof(fp))
{
fscanf(fp, "%s\n", str);
printf("%s", str);
}
}

So I get an output like "Thisisatest.Itisnotworking.",
when the input file contains:
This is a test.
It is not working.

Any suggestion is appreciated. Thanks.

/* BEGIN new.c */

#include <stdio.h>
#include <stdlib.h>

#define SOURCE "test.txt"
#define LINE_LEN 119
#define str(s) # s
#define xstr(s) str(s)

int main(void)
{
int rc;
FILE *fd;
char line[LINE_LEN + 1];

fd = fopen(SOURCE, "r");
if (fd == NULL) {
fprintf(stderr,
"\nfopen() problem with \"%s\"\n", SOURCE);
exit(EXIT_FAILURE);
}
do {
rc = fscanf(fd, "%" xstr(LINE_LEN) "[^\n]%*[^\n]", line);
if (!feof(fd)) {
getc(fd);
}
if (rc == 0) {
*line = '\0';
++rc;
}
printf("%s\n", line);
} while (rc == 1);
fclose(fd);
return 0;
}

/* END new.c */

This is my solution, after research. Criticism welcome.

#include <stdio.h>

#define MAX_LEN 120

If the lines are longer than LINE_LEN,
then the characters after LINE_LEN and before the newline,
are discarded.
What happens if the lines are longer than MAX_LEN?
void ReadFile(FILE *fp);
int ErrorMsg(char *str);

int main(void)
{
FILE *fp;
char filename[]= "test.txt";

if ((fp = fopen(filename, "r")) == NULL){
ErrorMsg(filename);
} else {
ReadFile(fp);
fclose(fp);
}
return 0;
}

void ReadFile(FILE *fp)
{
char buff[MAX_LEN];

while (fgets(buff, MAX_LEN, fp)) {
printf("%s", buff);
}
}

int ErrorMsg(char *str)
{
printf("Cannot open %s.\n", str);
return;

return 0; /* maybe? */
 
Keith Thompson

bildad said:
This is my solution, after research. Criticism welcome.

Not bad, but I do have a few comments.
#include <stdio.h>

#define MAX_LEN 120

Obviously this is arbitrary (as it must be). If you haven't already,
you should think about what happens if the input file contains lines
longer than MAX_LEN characters. Since you're using fgets(), the
answer is that it works anyway, but you should understand why. Read
the documentation for fgets() and work through what happens if an
input line is very long.
void ReadFile(FILE *fp);
int ErrorMsg(char *str);

int main(void)
{
FILE *fp;
char filename[]= "test.txt";

In a real program, you'd want to be able to specify the name of the
file, probably on the command line.
if ((fp = fopen(filename, "r")) == NULL){
ErrorMsg(filename);

The ErrorMsg() function claims to return an int (but see below), but
you discard the result.
} else {
ReadFile(fp);
fclose(fp);
}
return 0;

This would be a good opportunity to indicate to the environment
whether you were able to open the file, using "return EXIT_SUCCESS" if
you were successful, "return EXIT_FAILURE" if you weren't (or,
equivalently, "exit(EXIT_SUCCESS)" or "exit(EXIT_FAILURE)"). Note
that the exit() function and the EXIT_SUCCESS and EXIT_FAILURE macros
are declared in <stdlib.h>.
}

void ReadFile(FILE *fp)
{
char buff[MAX_LEN];

while (fgets(buff, MAX_LEN, fp)) {

Some people (including me) would prefer an explicit comparison against
NULL. What you've written is fine, though, and any C programmer
should be able to read both forms easily.
printf("%s", buff);

Since you're not doing any formatting of the output, "fputs(buff, stdout);"
would be simpler; it's also more closely parallel with fgets(). This
is a matter of style, though. It's often easier to use printf()
consistently than to remember the details of puts() vs. fputs() (as
well as fputc(), putc(), and putchar()).

As a matter of style, it might make more sense to pass the file name
as an argument to ReadFile(), and make fp local to it rather than to
main(). The fopen() call would then be inside ReadFile(). Also,
CopyFile() (or copy_file()) might be a better name, since it doesn't
just read the file.
int ErrorMsg(char *str)
{
printf("Cannot open %s.\n", str);
return;
}

Unless you're going to add more to this, I'm not sure it needs to be a
function. You might as well just replace the call to ErrorMsg() with
the printf() call. Also, it's traditional to write error messages to
stderr:
fprintf(stderr, "Cannot open %s\n", str);
Note that I've dropped the '.' in the error message; it could look
like it's part of the file name.

The ErrorMsg() function is declared to return an int, but you don't
return a value. In fact, this is illegal (at least in C99). You
should declare the function to return void, not int.

The return statement isn't even necessary. A return with no value is
equivalent to falling off the end of the function, which you're about
to do anyway. (Were you expecting the return to terminate the main
program? It doesn't; it just returns control to the point after the
call.)
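
Putting those suggestions together, the core of the program might end up looking something like this sketch. It is one possible arrangement rather than the only reasonable one; the name copy_file and the extra out parameter are choices made here for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

#define MAX_LEN 120

/* Open the named file and copy it to out.
   Returns EXIT_SUCCESS or EXIT_FAILURE so that main() can pass the
   result straight back to the environment. */
int copy_file(const char *filename, FILE *out)
{
    char buff[MAX_LEN];
    FILE *fp = fopen(filename, "r");

    if (fp == NULL) {
        fprintf(stderr, "Cannot open %s\n", filename);
        return EXIT_FAILURE;
    }
    while (fgets(buff, sizeof buff, fp) != NULL)
        fputs(buff, out);
    fclose(fp);
    return EXIT_SUCCESS;
}
```

With this in place, main() could shrink to a single statement such as: return copy_file(argc > 1 ? argv[1] : "test.txt", stdout);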
 
Keith Thompson

pete said:
bildad wrote: [...]
#define MAX_LEN 120

If the lines are longer than LINE_LEN,
then the characters after LINE_LEN and before the newline,
are discarded.
What happens if the lines are longer than MAX_LEN?

You mean MAX_LEN, not LINE_LEN, right?

In this case, no, they're not discarded, they're just left on the
input stream. See my other response in this thread.
 
bildad

Not bad, but I do have a few comments.


Obviously this is arbitrary (as it must be). If you haven't already,
you should think about what happens if the input file contains lines
longer than MAX_LEN characters. Since you're using fgets(), the
answer is that it works anyway, but you should understand why. Read
the documentation for fgets() and work through what happens if an
input line is very long.
K&R2, p.164, 7.7, par.1:

char *fgets(char *line, int maxline, FILE *fp)

"at most maxline-1 characters will be read."

I changed MAX_LEN to test this but it still seemed to work fine. The only
documentation I have is K&R2 and King's C Programming. Am I looking in the
wrong place? I googled "fgets()" and "c programming fgets()" but didn't
find anything relevant (at least to me).
 
Keith Thompson

bildad said:
K&R2, p.164, 7.7, par.1:

char *fgets(char *line, int maxline, FILE *fp)

"at most maxline-1 characters will be read."

I changed MAX_LEN to test this but it still seemed to work fine. The only
documentation I have is K&R2 and King's C Programming. Am I looking in the
wrong place. I googled "fgets()" and "c programming fgets()" but didn't
find anything relevant (at least to me).

Right. Suppose an input line is 300 characters long. Your call to
fgets() will read 119 characters; the resulting buffer will contain a
valid string terminated by a '\0' character, but it won't contain a
newline. Your call to fputs() or printf() will print this partial
line.

Think about what happens when you call fgets() again. You still have
the rest of the line waiting to be read, and the next fgets() gets the
next 119 characters of the line, which you then print.

On the *next* call to fgets(), you read the remainder of the long
input line, including the newline, and you then print it. You've read
and printed the entire line, but you've done it in 3 chunks.

If all you're doing with each result from fgets() is printing it, it
doesn't matter that it might take several calls to fgets() to read the
whole line. If you're doing more processing than that (as you
typically would in a real-world program), it could become a problem.
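
If you do need each line in one piece, one approach is to keep calling fgets() and glue the chunks together until a newline turns up. A rough sketch, assuming heap allocation is acceptable; the name read_whole_line and the chunk size are arbitrary choices for illustration:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read one complete line of any length by appending fgets() chunks
   to a growing heap buffer. Returns a malloc'd string that the caller
   must free, or NULL at end of file, on error, or if memory runs out. */
char *read_whole_line(FILE *fp)
{
    char chunk[120];
    char *line = NULL;
    size_t len = 0;

    while (fgets(chunk, sizeof chunk, fp) != NULL) {
        size_t n = strlen(chunk);
        char *tmp = realloc(line, len + n + 1);

        if (tmp == NULL) {
            free(line);
            return NULL;
        }
        line = tmp;
        memcpy(line + len, chunk, n + 1);   /* copies the '\0' too */
        len += n;
        if (len > 0 && line[len - 1] == '\n')
            break;                          /* newline seen: line complete */
    }
    return line;
}
```

A 300-character line would then come back as a single string, built from three fgets() calls behind the scenes.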
 
Chris Torek

K&R2, p.164, 7.7, par.1:

char *fgets(char *line, int maxline, FILE *fp)

"at most maxline-1 characters will be read."

The obvious question, then, is: "what happens to characters that
are not read?"

(What do you think *should* happen to them?)
I changed MAX_LEN to test this but it still seemed to work fine.

Indeed; as Keith Thompson noted, "it works anyway".

Suppose MAX_LEN were (say) 3, so that fgets() could read at most
two characters at a time. Suppose further that the file being read
consisted entirely of the single line of:

The quick brown fox jumps over the lazy dog.

The first fgets() call would read at most two characters (3 - 1),
stopping if it encounters EOF, and also stopping if it encounters
a newline. The first two characters are 'T' and 'h', so fgets()
will set your array to contain the sequence {'T', 'h', '\0'} (which
is a valid C string) and return a non-NULL value.

What happens to the characters that are not yet read?

The next fgets() call will read at most two more characters. What
will they be?

What happens when fgets() has read 'g' and '.', so that only one
character, '\n', remains in the input file? What will fgets()
read and what will it put in your array? What will happen on the
next fgets() call?
 
bildad

The obvious question, then, is: "what happens to characters that
are not read?"

They're read in succeeding calls to fgets()?
(What do you think *should* happen to them?)


Indeed; as Keith Thompson noted, "it works anyway".

Suppose MAX_LEN were (say) 3, so that fgets() could read at most
two characters at a time. Suppose further that file being read
consisted entirely of the single line of:

The quick brown fox jumps over the lazy dog.

The first fgets() call would read at most two characters (3 - 1),
stopping if it encounters EOF, and also stopping if it encounters
a newline. The first two characters are 'T' and 'h', so fgets()
will set your array to contain the sequence {'T', 'h', '\0'} (which
is a valid C string) and return a non-NULL value.

What happens to the characters that are not yet read?

The next fgets() call will read at most two more characters. What
will they be?

What happens when fgets() has read 'g' and '.', so that only one
character, '\n', remains in the input file? What will fgets()
read and what will it put in your array?
newline

What will happen on the
next fgets() call?

EOF ?
 
Chris Torek

On 30 Oct 2005 02:07:06 GMT, Chris Torek wrote:
[given a buffer of size 3, so that fgets() reads at most two
characters at a time, and an input line that contains an even
number of characters followed by a newline followed by EOF...]

bildad said:
They're read in succeeding calls to fgets()?

Indeed!

bildad said:
newline

Correct -- which is of course just one character; the array will
be modified to hold {'\n', '\0'} in elements 0 and 1 respectively,
with element 2 unchanged. (I am not actually sure the standard
*requires* element 2 to be unchanged, but in practice, it is.)

bildad said:
EOF ?

Indeed.

Thus, the loop in:

/* where fp is some valid input file, of course, with
   text as described above */
char buf[3];
while (fgets(buf, sizeof buf, fp) != NULL)
    printf("%s", buf); /* or: fputs(buf, stdout); */

will print out two characters at a time until it prints the
final newline (one character) and then terminates (because
fgets() will return NULL, having encountered EOF).
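
If you would rather count than take this on faith, a small sketch such as the following tallies the successful calls (count_chunks is a made-up name):

```c
#include <stdio.h>

/* Count how many fgets() calls succeed when the buffer has room for
   only two characters plus the terminating '\0'. */
int count_chunks(FILE *fp)
{
    char buf[3];
    int calls = 0;

    while (fgets(buf, sizeof buf, fp) != NULL)
        calls++;
    return calls;
}
```

For the 44-character fox sentence followed by a newline, this should come to 23: 22 two-character chunks, then one call that returns just the newline.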

In this case, this is just what you want. If you were actually
trying to interpret whole input lines -- as is often the case when
reading input from a human being who is typing commands -- it is
probably not what you want, as the loop might look more like:

while (fgets(buf, sizeof buf, fp) != NULL) {
    ... code to interpret a command ...
}

and you probably do not want to interpret "he", then "ll", then
"o\n" as three separate commands. In this case you would (a) need
a bigger buffer, and (b) need to double-check to see whether the
human managed to type in an overly long input line despite the
bigger buffer.
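
One way to do that double-check is sketched below, under the assumption that an overlong command should be rejected as a whole rather than processed in pieces. The function name read_command and its return convention are invented for this example.

```c
#include <stdio.h>
#include <string.h>

/* Read one command line into buf (size bytes) and strip the newline.
   Returns 1 if the whole line fit, 0 if it was too long (the rest of
   the line is discarded), EOF if there is no more input. */
int read_command(char *buf, int size, FILE *fp)
{
    char *nl;
    int c;

    if (fgets(buf, size, fp) == NULL)
        return EOF;
    nl = strchr(buf, '\n');
    if (nl != NULL) {
        *nl = '\0';
        return 1;
    }
    /* No newline in buf: look at what comes next in the stream. */
    c = getc(fp);
    if (c == EOF || c == '\n')
        return 1;               /* the text fit exactly */
    while ((c = getc(fp)) != EOF && c != '\n')
        ;                       /* throw away the rest of the line */
    return 0;
}
```

The caller can then complain about a return value of 0 instead of interpreting fragments of a long line as separate commands.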
 
