feof usage

Barry Schwarz

That's not quite true. There are certainly explicit requirements for the
arguments passed to fgets(), lest UB occur.

I agree that bad arguments can cause UB but that's true for any
function. fgets appears in section J.2 only once (regarding an
attempt to use the array if fgets reports an I/O error) and that
section is not specific to fgets but includes other I/O functions.
I prefer to think the behaviour on reading the last line of text is
implementation defined.

Implementation-defined implies there is more than one possible legal
behavior. If you think it is implementation defined, does your
implementation document it and, if so, as what? What other possible
behaviors are there that satisfy the standard? Can fgets return
something other than the array address? If the '\n' is not present in
the stream, can it stop reading before the end of file? Can it fail
to terminate the string with a '\0'? Can it add a '\n' of its own?

Now that I think about it some more, I think this should be required
to be detected as an I/O error rather than undefined behavior.
I don't disagree, but sanity would dictate that there's no need to lump
fgets() into the same basket as gets().

I don't follow this. How does discussing what fgets does when it
processes an improperly formatted stream "lump it with gets"?
[Aside: Dealing with null bytes via fgets() is much trickier than trailing
newlines. Throw in non-sticky EOF and ...]

The standard does not indicate that fgets treats a '\0' in the input
stream different than any other non-'\n' character. In fact, there
are only three conditions (other than I/O error) that will terminate
the operation:
'\n' encountered in the stream
end of file encountered in the stream
n-1 characters read from the stream
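
As a rough sketch of how a caller might tell these three stopping conditions apart after the fact: the helper `read_chunk` and the `stop_reason` enum below are hypothetical names, and the code assumes the input contains no embedded '\0' bytes, so that strchr() sees everything fgets() stored.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical classification of why one fgets() call stopped. */
enum stop_reason { STOP_NEWLINE, STOP_EOF, STOP_FULL, STOP_NOTHING };

/* Assumes the stream holds no embedded '\0' bytes. */
enum stop_reason read_chunk(char *buf, int n, FILE *fp)
{
    assert(buf != NULL && n > 1 && fp != NULL);

    if (fgets(buf, n, fp) == NULL)
        return STOP_NOTHING;      /* EOF before any data, or a read error */
    if (strchr(buf, '\n') != NULL)
        return STOP_NEWLINE;      /* a '\n' was read and retained */
    if (feof(fp))
        return STOP_EOF;          /* last line had no terminating '\n' */
    return STOP_FULL;             /* n-1 characters were stored */
}
```

On STOP_NOTHING the caller would still need feof() or ferror() to tell plain end of file from an I/O error.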

What is a non-sticky EOF?

What does this have to do with how fgets processes the last line of a
text file?
ITYM: that is not specific to fgets...

Yes.


<<Remove the del for email>>
 
Mac

On Mon, 22 Sep 2003 22:34:38 +1000, "Peter Nilsson"

[snip]
[Aside: Dealing with null bytes via fgets() is much trickier than trailing
newlines. Throw in non-sticky EOF and ...]

The standard does not indicate that fgets treats a '\0' in the input
stream different than any other non-'\n' character. In fact, there
are only three conditions (other than I/O error) that will terminate
the operation:
'\n' encountered in the stream
end of file encountered in the stream
n-1 characters read from the stream

I think the problem is, how do you know what is after the nul? fgets()
doesn't return anything that lets you know how many characters you've
read, and strlen() will be fooled by the nul. So, unless you specifically
write your code to deal with that, whatever is after the nul will be lost.
I've never thought of that before reading this thread, but I'm sure that
is what Peter Nilsson must be referring to. (right?)
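
One way to write code that "deals with that" is to skip fgets() and count the characters yourself. The following is only a sketch under that assumption; `read_line_counted` is a hypothetical helper, not a standard function. It stops on the same three conditions as fgets() but returns the number of bytes stored, so data after an embedded nul is not lost.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical fgets() look-alike that reports how many bytes it stored.
   Returns 0 when nothing could be read (end of file or error). */
size_t read_line_counted(char *buf, size_t cap, FILE *fp)
{
    size_t len = 0;
    int c;

    assert(buf != NULL && cap > 0 && fp != NULL);

    /* The capacity test comes first so no character is consumed
       when the buffer is already full. */
    while (len + 1 < cap && (c = getc(fp)) != EOF) {
        buf[len++] = (char)c;
        if (c == '\n')
            break;
    }
    buf[len] = '\0';
    return len;
}
```

For a line holding the six bytes 'a', 'b', '\0', 'c', 'd', '\n', the return value is 6 while strlen() on the same buffer reports 2.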

Of course, people who embed nul's in text files should be shot anyway...

Mac
--
 
Martin Ambuhl

Clifton Liles wrote:

while( ! feof(infile)) {
fscanf( infile," %d", &in );
}
If the format "%d" fails, fscanf will NOT move forward and this code
will hang. I just ran into this problem with working code and a
corrupted file. fscanf returns the number of items converted, so watch
that for a NULL or less than you expected.

This last bit is wrong. NULL is a pointer. Compare the return value from
fscanf to EOF or for fewer than the number of items you wanted converted,
if you want, but check the last code snippet below.

Only if end-of-file is encountered before any conversion is EOF returned.
So a matching error which is not properly diagnosed can lead to one of two
types of error:

Type (1) Endless loop:
    while (EOF != fscanf(infile, "%d", &in)) { /* whatever */ }

Type (2) Loop ends, but with EOF not detected in the loop:
    while (1 == fscanf(infile, "%d", &in)) { /* whatever */ }
    /* but feof(infile) and ferror(infile) can be used here, after
       the loop */

Consider a loop like this, to be fleshed out later:
{
    int count;
    while (1)
    {
        count = fscanf(infile, "%d", &in);
        if (feof(infile)) {
            /* eof code, perhaps using the value count */
            break;
        }
        else if (ferror(infile)) {
            /* error code, perhaps using the value count */
            break;
        }
        else if (count != 1) {
            /* matching failure: fscanf did not advance, so stop
               (or skip the offending input) instead of looping forever */
            break;
        }
        /* successful read code */
    }
}
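
For comparison, here is one possible fleshed-out version that drives the loop from the fscanf return value and only afterwards asks why it stopped. It is a sketch, not the only reasonable shape; `sum_ints` and the `read_status` enum are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical result codes for the sketch below. */
enum read_status { READ_OK_EOF, READ_BAD_INPUT, READ_IO_ERROR };

/* Sums whitespace-separated integers until input runs out or goes bad. */
enum read_status sum_ints(FILE *infile, long *sum)
{
    int in, count;

    assert(infile != NULL && sum != NULL);
    *sum = 0;

    /* Continue only while exactly one item was converted. */
    while ((count = fscanf(infile, "%d", &in)) == 1)
        *sum += in;

    if (ferror(infile))
        return READ_IO_ERROR;   /* read error reported by the stream */
    if (count == EOF)
        return READ_OK_EOF;     /* input ended before any conversion */
    return READ_BAD_INPUT;      /* count == 0: matching failure */
}
```

With the input "4 5 x 6" this stops at the 'x' with a sum of 9 and reports bad input rather than hanging.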
 
Irrwahn Grausewitz

Mac said:
I think the problem is, how do you know what is after the nul? fgets()
doesn't return anything that lets you know how many characters you've
read, and strlen() will be fooled by the nul. So, unless you specifically
write your code to deal with that, whatever is after the nul will be lost.
I've never thought of that before reading this thread, but I'm sure that
is what Peter Nilsson must be referring to. (right?)

Of course, people who embed nul's in text files should be shot anyway...

Files with embedded nul characters cannot be considered text files.
And people reading binary files with fgets() should be shot... ;-)

Regards

Irrwahn
 
LibraryUser

Barry said:
.... snip ...

What is a non-sticky EOF?

What does this have to do with how fgets processes the last
line of a text file?

An eof that "self-heals", such as provided by many
implementations of stdin when connected to the console. This
allows a user to signal EOF by such a key combination as CTL-Z or
CTL-D, and have that EOF condition disappear once reported.

I have used systems with sticky EOF, where signalling EOF on the
input console required contacting the system operator and
aborting the session before that terminal was of any further use
(for input). On today's systems the equivalent might be to
require a reboot.

There are arguments for and against such systems.
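
On the C side, the stream's own end-of-file indicator stays set until something such as clearerr() or rewind() clears it. Whether a later read then delivers more data depends on the device and the implementation. A minimal sketch of a program that wants to keep reading after EOF was reported, using the hypothetical helper name `getc_after_eof`:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: try to read one more character even if
   end of file was already reported on this stream. */
int getc_after_eof(FILE *fp)
{
    assert(fp != NULL);

    if (feof(fp))
        clearerr(fp);   /* clears both the end-of-file and error indicators */

    /* On a regular file still at its end this simply returns EOF again;
       on an interactive device it may return fresh input. */
    return getc(fp);
}
```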
 
Barry Schwarz

On Mon, 22 Sep 2003 22:34:38 +1000, "Peter Nilsson"

[snip]
[Aside: Dealing with null bytes via fgets() is much trickier than trailing
newlines. Throw in non-sticky EOF and ...]

The standard does not indicate that fgets treats a '\0' in the input
stream different than any other non-'\n' character. In fact, there
are only three conditions (other than I/O error) that will terminate
the operation:
'\n' encountered in the stream
end of file encountered in the stream
n-1 characters read from the stream

I think the problem is, how do you know what is after the nul? fgets()
doesn't return anything that lets you know how many characters you've
read, and strlen() will be fooled by the nul. So, unless you specifically
write your code to deal with that, whatever is after the nul will be lost.
I've never thought of that before reading this thread, but I'm sure that
is what Peter Nilsson must be referring to. (right?)

While you can't use strchr or strstr, you can search the input buffer
for a '\n'. If you find it, you know that is where fgets stopped (or
at the '\0' in the following byte). If you don't find a '\n' and
feof() is false, then you know that exactly n-1 bytes were read in. I
don't know how to deal with the case where end of file is detected but
there is no '\n'.
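
A sketch of that search, with one possible answer for the end-of-file case: fill the buffer with a nonzero, non-newline byte before calling fgets(), then treat the last '\0' in the buffer as the terminator. This relies on the assumption that fgets() leaves bytes beyond its terminator untouched, which is usual in practice but which I would not claim the standard promises. `fgets_len` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: calls fgets() and returns how many bytes it stored
   (not counting the terminator), or -1 if nothing was read. */
long fgets_len(char *buf, int n, FILE *fp)
{
    char *nl;
    int i;

    assert(buf != NULL && n > 0 && fp != NULL);

    memset(buf, 1, (size_t)n);          /* filler: neither '\0' nor '\n' */
    if (fgets(buf, n, fp) == NULL)
        return -1;

    nl = memchr(buf, '\n', (size_t)n);  /* memchr is not fooled by a nul */
    if (nl != NULL)
        return (long)(nl - buf) + 1;    /* stopped at the newline */
    if (!feof(fp))
        return n - 1;                   /* buffer filled completely */

    /* End of file without '\n': the last '\0' is the terminator,
       assuming the filler after it was left alone. */
    for (i = n - 1; i >= 0; i--)
        if (buf[i] == '\0')
            return i;
    return -1;
}
```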
Of course, people who embed nul's in text files should be shot anyway...



<<Remove the del for email>>
 
Mac

Files with embedded nul characters cannot be considered text files.
And people reading binary files with fgets() should be shot... ;-)

OK, OK. How 'bout this: If it's supposed to be a text file, and someone
puts a nul in it, let's shoot him or her. But if it's supposed to be a
binary file, and the programmer is using fgets(), then we'll shoot the
programmer. This seems pretty fair to me. ;-)
Regards

Irrwahn

Mac
--
 
Irrwahn Grausewitz

Mac said:
OK, OK. How 'bout this: If it's supposed to be a text file, and someone
puts a nul in it, let's shoot him or her. But if it's supposed to be a
binary file, and the programmer is using fgets(), then we'll shoot the
programmer. This seems pretty fair to me. ;-)

Definitely, yes. :D

Regards

Irrwahn
 
Dave Thompson

[ 7.19.2p2 snipped ]
That makes the requirement for a final '\n' in the stream
implementation specific. It in no way implies that fgets has any
flexibility on how it processes the final line. fgets is required to

read at most n-1 bytes
stop after reading and retaining an '/n' or after reaching eof
return the array address in all cases except
eof encountered without transferring any data
a read error occurs
Agree so far, except for the typo.
Nowhere in the standard is any implementation-defined or undefined
behavior specifically described for fgets. This includes the case
where the stream does not terminate with a '\n', whether required to
or not.

One could argue that if the implementation requires the final '\n' in
the stream and it is not present then attempting to read this line
invokes undefined behavior because the stream is not well formed. But
that has nothing to do with fgets and should apply equally to any I/O
function attempting to read the stream.
I would conclude that it does, although the requirement is not written
with 'shall' (and is not in a constraint), so it might be arguable.

But this option exists primarily for the benefit of implementations on
systems where the files used for text, unlike the plain-ASCII streams
of Unix and the MSDOS tribe, *cannot* store an unterminated last line.
On those systems, the UB for read never occurs; what matters is that
*writing* an unterminated last line to an output file does invoke it.
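
For portable output, the usual defensive habit is to make sure the last line written always ends in '\n'. A minimal sketch, assuming the caller hands over text as a C string; `put_final_line` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes s and appends '\n' if s does not already
   end with one. An empty string therefore becomes an empty line.
   Returns 0 on success, EOF on a write error. */
int put_final_line(const char *s, FILE *fp)
{
    size_t len;

    assert(s != NULL && fp != NULL);

    len = strlen(s);
    if (fputs(s, fp) == EOF)
        return EOF;
    if (len == 0 || s[len - 1] != '\n') {
        if (putc('\n', fp) == EOF)
            return EOF;
    }
    return 0;
}
```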

- David.Thompson1 at worldnet.att.net
 
