fgets, EOF in middle of line, does not cause error


TTroy

Hello,
I have found some peculiar behaviour in the fgets runtime library
function for my compiler/OS/platform (Dev C++/XP/P4) - making a C
console program (which runs in a CMD.exe shell).

The standard says about fgets:

synopsis
#include <stdio.h>
char *fgets(char *s, int n, FILE *stream);

description

"The fgets function returns s if successful. If end-of-file is
encountered and no characters have been read into the array, the
contents of the array remain unchanged and a null pointer is returned.
If a read error occurs during the operation, the array contents are
indeterminate and a null pointer is returned."



The problem I'm having is, if EOF is provided after another character,
fgets simply doesn't return NULL to flag an error, so my error catching
code is not finding anything. Also, for me, all the EOF in the line
does is cause fgets to ignore the EOF -AND- also ignore everything
after the EOF up to and including the first newline seen.

So fgets hangs if I provide it with an EOF, because it basically
discards the EOF condition and all characters after it including the
next newline(which normally tells fgets to stop, but in this case it
just discards and ignores it). So I find myself having to hit newline
again to make fgets -unblock- or whatever you want to call it.

For example, I ran this test code:

#include <stdio.h>
#include <stdlib.h>

#define MAXINPUT 30

int
main(void)
{
    char sample[MAXINPUT];

    if(!fgets(sample, MAXINPUT, stdin))
    {
        fprintf(stderr, "\nfgets() error, program quitting\n");
        exit(1);
    }

    printf("sample = %s\n", sample);

    return 0;
}


My input was:
-------------
hello[EOF]idiot[\n][space]world[\n]


My output was (not including echoes):
-------------------------------------
sample = hello world[\n][\n]

- notice that there are two newlines in the output because fgets stores
a newline and my printf format string also has one



As you can see, all fgets did was ignore everything from the EOF up to
and including the next newline. It definitely ignored this newline,
because fgets stayed -blocked- and the program was still waiting for my
input (as if it required another newline) - so I typed a space and
'world' then hit enter (this enter/newline registered properly and
fgets unblocked) and fgets assumed it got the input {hello world\n} and
just ignored/discarded the middle part: {[EOF]idiot\n} - ignore the
braces, they are used as delimiters only.

This wouldn't bother me if fgets returned a null pointer like the
standard said it would, because I would really like to trap this type
of stupid user input and deal with it, but I can't even do that.

Can someone test this program on their system or tell me what I'm
missing? Isn't fgets supposed to return a null pointer?

thanking everyone very much,

Tinesan Troy, B.Eng (mech)
(e-mail address removed)
 

Richard Kettlewell

TTroy said:
I have found some peculiar behaviour in the fgets runtime library
function for my compiler/OS/platform (Dev C++/XP/P4) - making a C
console program (which runs in a CMD.exe shell).

The standard says about fgets:

synopsis
#include <stdio.h>
char *fgets(char *s, int n, FILE *stream);

description

"The fgets function returns s if successful. If end-of-file is
encountered and no characters have been read into the array, the
contents of the array remain unchanged and a null pointer is returned.
If a read error occurs during the operation, the array contents are
indeterminate and a null pointer is returned."

You've omitted a paragraph. The description actually says it reads up
to eof or newline. If at least one character is read but eof is
found before newline, then s is returned.
The problem I'm having is, if EOF is provided after another character,
fgets simply doesn't return NULL to flag an error, so my error catching
code is not finding anything.

A null pointer is only returned if eof is found before any other
characters, or if an error occurs. If you want to detect end of file,
use feof().
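
A minimal sketch of that check, assuming a hypothetical helper named
read_line() and made-up status codes (neither is part of the standard
library). It only classifies what fgets() and feof() report; it cannot
recover input that the console layer has already thrown away.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical result codes for this sketch; not part of any library. */
enum line_status { LINE_OK, LINE_PARTIAL_EOF, LINE_EOF, LINE_ERROR };

/* Read one line with fgets() and report how the read ended. */
enum line_status read_line(char *buf, int size, FILE *fp)
{
    if (fgets(buf, size, fp) == NULL)
        return ferror(fp) ? LINE_ERROR : LINE_EOF;

    /* fgets() succeeded. If the end-of-file indicator is set and no
       newline was stored, the stream ended in the middle of a line. */
    if (feof(fp) && strchr(buf, '\n') == NULL)
        return LINE_PARTIAL_EOF;

    return LINE_OK;
}
```

A caller would treat LINE_PARTIAL_EOF as "some characters, then
end-of-file", which is the case the original test program could not see
from the return value of fgets() alone.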
Also, for me, all the EOF in the line does is cause fgets to
ignore the EOF -AND- also ignore everything after the EOF up to and
including the first newline seen.

You'd expect to see this behaviour if the eof character is interpreted
after reading a whole line (and truncates that line) rather than
immediately causing a short read by the next layer down from fgets()
(the latter being the UNIX behaviour).

IOW, if the input is "foo<eof>bar<newline>" then in UNIX the first
read() call returns "foo" and the next "bar<newline>".

But if the equivalent of read() returns "foo<eof>bar<newline>",
because it doesn't know that whatever character you typed for <eof> is
special, then they must return "foo" the first time and either store
"bar<newline>" for the following call, or discard it. In this case it
sounds like it is discarding it.
 

Mark McIntyre

The problem I'm having is, if EOF is provided after another character,
fgets simply doesn't return NULL to flag an error, so my error catching
code is not finding anything.

If you read the standard correctly, you'll see that if fgets encounters eof in a
string, it returns the string read so far. It only returns a null pointer if it
encounters end-of-file before reading any characters, or on a read error.
 

Mac

I deleted comp.std.c from the newsgroups list.

Hello,
I have found some peculiar behaviour in the fgets runtime library
function for my compiler/OS/platform (Dev C++/XP/P4) - making a C
console program (which runs in a CMD.exe shell).

The standard says about fgets:

synopsis
#include <stdio.h>
char *fgets(char *s, int n, FILE *stream);

description

"The fgets function returns s if successful. If end-of-file is
encountered and no characters have been read into the array, the
contents of the array remain unchanged and a null pointer is returned.
If a read error occurs during the operation, the array contents are
indeterminate and a null pointer is returned."



The problem I'm having is, if EOF is provided after another character,
fgets simply doesn't return NULL to flag an error, so my error catching
code is not finding anything. Also, for me, all the EOF in the line
does is cause fgets to ignore the EOF -AND- also ignore everything
after the EOF up to and including the first newline seen.

First of all, the standard quote you provided above does not cover the
case where EOF is encountered after some characters are read, which is the
case you are interested in.

Second, how can there possibly be anything after EOF? If you are reading a
binary file after opening it in text mode, then I would have to say,
"Don't do that." If you think it is a text file, and open it up in text
mode, but it has an "EOF character" followed by something else, then it
isn't really a text file. And if you have a binary file opened in binary
mode, I believe it is truly impossible to have anything after an EOF. But
you shouldn't really use fgets on a binary file.

If you are reading from the console (as appears to be the case), you
should be aware that console input is often line buffered by the OS before
your program even sees it. I think there is something about this in the
FAQ list (which you might want to read). You should also be aware that the
exact way the OS communicates EOF from the console line varies from OS to
OS, and is outside the scope of the C programming language in any event.

It may be enlightening to write a simple hex dump program using getc() or
getchar(), and find out exactly what your program sees when you type
various things into the console.
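
One way such a dump could look, as a rough sketch: hex_dump() is a
hypothetical name, and the output layout (16 bytes per line, then a
closing status line) is just a choice made for this example.

```c
#include <stdio.h>

/* Write every byte read from `in` to `out` as two hex digits, 16 per
   line, then say whether getc() stopped on end-of-file or on an error.
   Returns the number of bytes seen. */
long hex_dump(FILE *in, FILE *out)
{
    long count = 0;
    int c;

    while ((c = getc(in)) != EOF) {
        fprintf(out, "%02X%c", (unsigned)c, (count % 16 == 15) ? '\n' : ' ');
        count++;
    }
    if (count % 16 != 0)
        fputc('\n', out);   /* finish a partly filled line */

    fprintf(out, "%s after %ld bytes\n",
            ferror(in) ? "read error" : "end-of-file", count);
    return count;
}
```

Calling hex_dump(stdin, stdout) from main() and typing at the console
should show which characters, if any, the program receives around the
end-of-file key on a given platform.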
So fgets hangs if I provide it with an EOF, because it basically
discards the EOF condition and all characters after it including the
next newline(which normally tells fgets to stop, but in this case it
just discards and ignores it). So I find myself having to hit newline
again to make fgets -unblock- or whatever you want to call it.
[snip]

This wouldn't bother me if fgets returned a null pointer like the
standard said it would, because I would really like to trap this type of
stupid user input and deal with it, but I can't even do that.

The quote from the standard you posted above does not say that fgets
returns NULL when EOF is encountered after reading at least one character.
If you think the standard says that somewhere, please post it.
Can someone test this program on their system or tell me what I'm
missing? Isn't fgets supposed to return a null pointer?

No. fgets is not supposed to return a null pointer if it reads some
characters then encounters an EOF.
thanking everyone very much,

Tinesan Troy, B.Eng (mech)
(e-mail address removed)

--Mac
 

Brian Inglis

On 11 Mar 2005 21:59:05 -0800 in comp.std.c, "TTroy" wrote:
I have found some peculiar behaviour in the fgets runtime library
function for my compiler/OS/platform (Dev C++/XP/P4) - making a C
console program (which runs in a CMD.exe shell).

The standard says about fgets:

synopsis
#include <stdio.h>
char *fgets(char *s, int n, FILE *stream);

description

"The fgets function returns s if successful. If end-of-file is
encountered and no characters have been read into the array, the
contents of the array remain unchanged and a null pointer is returned.
If a read error occurs during the operation, the array contents are
indeterminate and a null pointer is returned."

The problem I'm having is, if EOF is provided after another character,
fgets simply doesn't return NULL to flag an error, so my error catching
code is not finding anything. Also, for me, all the EOF in the line
does is cause fgets to ignore the EOF -AND- also ignore everything
after the EOF up to and including the first newline seen.

So fgets hangs if I provide it with an EOF, because it basically
discards the EOF condition and all characters after it including the
next newline(which normally tells fgets to stop, but in this case it
just discards and ignores it). So I find myself having to hit newline
again to make fgets -unblock- or whatever you want to call it.

For example, I ran this test code:
As you can see, all fgets did was ignore everything from the EOF up to
and including the next newline. It definitely ignored this newline,
because fgets stayed -blocked- and the program was still waiting for my
input (as if it required another newline) - so I typed a space and
'world' then hit enter (this enter/newline registered properly and
fgets unblocked) and fgets assumed it got the input {hello world\n} and
just ignored/discarded the middle part: {[EOF]idiot\n} - ignore the
braces, they are used as delimiters only.

This wouldn't bother me if fgets returned a null pointer like the
standard said it would, because I would really like to trap this type
of stupid user input and deal with it, but I can't even do that.

Can someone test this program on their system or tell me what I'm
missing? Isn't fgets supposed to return a null pointer?

You have a platform problem: to have an EOF recognized under Windows,
you have to type EOF on a line by itself, followed by a newline.
 

TTroy

Richard Kettlewell wrote:
But if the equivalent of read() returns "foo<eof>bar<newline>",
because it doesn't know that whatever character you typed for <eof> is
special, then they must return "foo" the first time and either store
"bar<newline>" for the following call, or discard it. In this case it
sounds like it is discarding it.

So there really is no way for me to "trap" this anomaly (where the user
can supply an EOF-related character in the middle of input)?

Incidentally, I've tested the program on linux and it has much better
behaviour, which makes it even worse for me to try to make portable
programs.

Does anyone know how I can detect "any type of weird EOF related
activity caused by the user" ?

Thank you for all your help. I hope I haven't frustrated anyone by my
ignorance.

Tinesan Troy, B.Eng (mech)
(e-mail address removed)
 

Barry Margolin

TTroy said:
Richard Kettlewell wrote:


So there really is no way for me to "trap" this anomaly (where the user
can supply an EOF-related character in the middle of input)?

If fgets() returns a string that isn't terminated by a newline and
doesn't fill the buffer, the read must have been terminated by EOF.
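
As a sketch of that test (ended_by_eof() is a hypothetical helper, and
it assumes the input contains no embedded NUL characters, since strlen()
would stop at the first one):

```c
#include <stdio.h>
#include <string.h>

/* Given the string fgets() stored in buf (an array of `size` chars),
   return 1 if it has no trailing newline and did not fill the buffer,
   meaning the read was cut short by end-of-file. Return 0 otherwise. */
int ended_by_eof(const char *buf, size_t size)
{
    size_t len = strlen(buf);

    if (len > 0 && buf[len - 1] == '\n')
        return 0;              /* a complete line was read */
    return len + 1 < size;     /* room was left, so the input stopped early */
}
```

This only helps when the platform actually hands the short,
unterminated string back to fgets(); it cannot detect input that was
discarded before the C library saw it.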
 

Brian Inglis

The problem I'm having is, if EOF is provided after another character,
fgets simply doesn't return NULL to flag an error, so my error catching
code is not finding anything. Also, for me, all the EOF in the line
does is cause fgets to ignore the EOF -AND- also ignore everything
after the EOF up to and including the first newline seen.

So fgets hangs if I provide it with an EOF, because it basically
discards the EOF condition and all characters after it including the
next newline(which normally tells fgets to stop, but in this case it
just discards and ignores it). So I find myself having to hit newline
again to make fgets -unblock- or whatever you want to call it.

There is another platform problem with console EOF input on Windows:
the EOF condition in Windows is not cleared until there is output to
the console, which may be generated by a prompt. If a program attempts
further console input without an intervening output, it will again
receive an EOF indication.
This could be used to confirm that you received an EOF: if you have
already received a short, unterminated input, request input again.
 

Richard Bos

TTroy said:
So there really is no way for me to "trap" this anomaly (where the user
can supply an EOF-related character in the middle of input)?

I wouldn't even bother to. It is "normal behaviour" for MS platforms
(whatever "normal" means for them, anyway...), and their users expect
it, if they know anything about it at all. Getting round it, even if
possible, would confuse those MS users who know something about their
command line, and wouldn't help those who aren't that expert anyway.

Richard
 

Richard Kettlewell

TTroy said:
Richard Kettlewell wrote:

So there really is no way for me to "trap" this anomaly (where the
user can supply an EOF-related character in the middle of input)?

Apparently not portably.
Incidentally, I've tested the program on linux and it has much
better behaviour, which makes it even worse for me to try to make
portable programs.

Does anyone know how I can detect "any type of weird EOF related
activity caused by the user" ?

I think it's fair to ignore the issue in most cases; if the user
engages in weird EOF-related activity then they get weird results, and
really that's exactly what they asked for.
 

Douglas A. Gwyn

TTroy said:
So there really is no way for me to "trap" this anomaly (where the user
can supply an EOF-related character in the middle of input)?

I'll assume a Unix-compatible environment. The so-called
"EOF character" is *not* an EOF character, but a *delimiter*
character. The Unix convention is that a *0-length* read
is interpreted as EOF. You get a 0-length read from a
terminal device only when the delimiter character is the
*first* thing typed after a previous newline or delimiter.
Another way of thinking about it is that the so-called "EOF
character" really means "send what I have typed immediately,
without the usual newline", and only when you have typed
*nothing* should that be taken as "end of all input". This
is especially tricky since the C standard now requires that
the stdio EOF condition be "sticky", whereas for raw reads
(read(2) system call) the default behavior is for that
condition to be transitory; this leads to some strange
behavior if programs aren't carefully written.

You can of course detect the non-newline input line by
noticing the absence of a newline in the input buffer.
 

TTroy

Douglas said:
I'll assume a Unix-compatible environment. The so-called
"EOF character" is *not* an EOF character, but a *delimiter*

That's what I meant by EOF-related character in my then latest post.
character. The Unix convention is that a *0-length* read
is interpreted as EOF. You get a 0-length read from a
terminal device only when the delimiter character is the
*first* thing typed after a previous newline or delimiter.
Another way of thinking about it is that the so-called "EOF
character" really means "send what I have typed immediately,
without the usual newline", and only when you have typed
*nothing* should that be taken as "end of all input". This
is especially tricky since the C standard now requires that
the stdio EOF condition be "sticky", whereas for raw reads
(read(2) system call) the default behavior is for that
condition to be transitory; this leads to some strange
behavior if programs aren't carefully written.

You can of course detect the non-newline input line by
noticing the absence of a newline in the input buffer.

Yes, the unix-related behaviour you speak of is something I've
experienced before. It is quite "normal" because if the user does give
a Ctrl D in the middle of input, all it does is force a read and the
user can then press Ctrl D again to force a true EOF indication. My
programs were originally tested in a unix environment. In unix I can put
in extra code to detect these sorts of things.

In DOS however, I can't detect the anomaly I speak of in my original
post. I think this is a quality of implementation fault and nothing
more.

The "now required EOF condition to be sticky" really caught my
attention, and I was wondering if you can point me to some examples on
the internet(tutorials) of this new behaviour.

I've never encountered a situation where I had to "unstick" an EOF
error or a normal error via clearerr( ), so your statement is scary
(being a novice). Can you expand a little on what you mean (I've spent
some time trying to find enlightenment via google and deja/ggroup
archives, but couldn't find anything)?

Thank you

Tinesan Troy, B.Eng. (mech)
(e-mail address removed)
 

TTroy

Barry said:
If fgets() returns a string that isn't terminated by a newline and
doesn't fill the buffer, the read must have been terminated by EOF.

But it doesn't do that, it returns what seems like a valid string. I
don't know whether the standard allows fgets to return a non-terminated
string, but it sure doesn't on my platform.
 

Barry Margolin

TTroy said:
But it doesn't do that, it returns what seems like a valid string. I
don't know whether the standard allows fgets to return a non-terminated
string, but it sure doesn't on my platform.

You mean it includes a newline even if the file you're reading from
didn't actually have one?

Could it be a record-oriented file rather than a stream? These don't
have explicit newline characters, so every record is automatically
treated as if it has a trailing newline for stdio purposes.
 

Richard Tobin

If fgets() returns a string that isn't terminated by a newline and
doesn't fill the buffer, the read must have been terminated by EOF.

But it doesn't do that, it returns what seems like a valid string.

It's a valid string, but one that doesn't have a newline at the
end. (It does have a nul of course.)

-- Richard
 

Antoine Leca

In (e-mail address removed), TTroy wrote:
I have found some peculiar behaviour in the fgets runtime library
function for my compiler/OS/platform (Dev C++/XP/P4)

So probably using Microsoft's (MSVCRT) library.


if(!fgets(sample, MAXINPUT, stdin))
My input was:

Sorry to ask something probably obvious: what is this [EOF] really? a break
on a serial line?

According to the supplied documentation
(http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vccore98/html/_crt_console_and_port_i.2f.o.asp,
sorry for the long link that will probably be cut), the end-of-file ought
to have concluded the input.
But I cannot decide if your [EOF] matches this "end-of-file".

Also, I was unable to find there a description of what constitutes "the end
of the... conventional input."


Antoine
 

Douglas A. Gwyn

TTroy said:
I've never encountered a situation where I had to "unstick" an EOF
error or a normal error via clearerr( ), so your statement is scary
(being a novice). Can you expand a little on what you mean (I've spent
some time trying to find enlightenment via google and deja/ggroup
archives, but couldn't find anything)?

As of C99, once EOF is reported for a stdio input stream,
it continues to be reported for further reads from that
stream, until the condition is cleared by one of several
possible actions, such as repositioning (seeking) the
stream. This is due to the spec for fgetc, in terms of
which all stdio input is described. (Detecting EOF sets
the end-of-file indicator for the stream, and later
fgetc first checks the end-of-file indicator and if it
is set reports EOF without further reading of the file
associated with the stream.)

The change, originally stated in a DR response as I
recall, was intended to support the common practice of
reading EOF more than once, which always worked for
regular disk files but, at least on Unix, was not wise
for magtape, pipes, terminal files, etc. While some of
us objected to imposing this requirement, the argument
for programming convenience prevailed.

Thus, if you want to use stdio and really don't want the
EOF condition to persist, you need to clear it;
fseek(ifp,0L,SEEK_CUR) ought to do it, but since there
are probably some implementations for which that would
fail on e.g. a terminal device, clearerr(ifp) should be
used instead. Note that if there is a "hard" EOF, as
with a regular disk file, the next input attempt will
report EOF and again set the end-of-file condition for
the stream.
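
A small sketch of the clearerr() approach, assuming a hypothetical
wrapper named fgets_after_eof(); whether the following read then finds
more data depends entirely on the device behind the stream.

```c
#include <stdio.h>

/* If the stream's end-of-file indicator is set, clear it so that the
   next read really goes back to the underlying file or device, then
   try to read a line as usual. */
char *fgets_after_eof(char *buf, int size, FILE *fp)
{
    if (feof(fp))
        clearerr(fp);   /* clears both the EOF and the error indicator */
    return fgets(buf, size, fp);
}
```

On a regular disk file at its true end this simply returns a null
pointer again and the indicator is set once more; on a terminal it gives
the user another chance to type input.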
 

lawrence.jones

In comp.std.c Douglas A. Gwyn said:
As of C99, once EOF is reported for a stdio input stream,
it continues to be reported for further reads from that
stream, until the condition is cleared by one of several
possible actions, such as repositioning (seeking) the
stream.

On the contrary, the committee's response to DR 141 (for C89) indicates
that that was always the intended behavior; the wording changes in C99
were just a clarification, not a change.

-Larry Jones

I'm a genius. -- Calvin
 
