David Mathog said:
Keith Thompson wrote: [...]
C99 7.19.2p2:
Whether space characters that are written out immediately
before a new-line character appear when read in is
implementation-defined.
I think this is intended to allow for a text file format consisting
of fixed-width records padded with blanks (think about punch
cards).
I'm trying to parse that section and it still doesn't make sense to
me. It says "whether space characters that ARE written out immediately
before a new-line character", which to me clearly says that the spaces
must land in the output file. Then it continues "appear when READ IN
is implementation-defined". So according to this C99 doesn't
guarantee that it can read all the characters in a line like
foo(space)(space)\n
and is free to return any of these:
foo(space)(space)\n
foo(space)\n (presumably, the spec doesn't exclude this)
foo\n
Right.
This seems even more daft than not writing the spaces out in the
first place! For instance, it means that getc() may return EITHER
(space) or \n following the foo and still be standard compliant???
Yes. It means that trailing spaces may be ignored on input.
On output, it's not possible to tell whether a printed space is a
trailing space or not until the line is terminated with a new-line.
If you've printed the characters 'f', 'o', 'o', ' ', ' ', the system
doesn't know whether the next character you print will be another
space, a new-line, or some other character. If trailing spaces are
going to be ignored on input, the system has to keep track of spaces
as they're printed. But as soon as you print the new-line, the system
can drop your trailing spaces.
Or, more likely, the output line is always padded with trailing spaces
anyway. On input, the system can't tell whether those trailing spaces
were actually part of your output, or were just system-imposed
padding. The system is allowed to assume that they were
system-imposed padding and ignore them. (If you printed "foo\n", you
wouldn't want to see "foo \n" when you read the file.)
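To find out what a particular implementation does, a small round-trip probe is enough. The following is only a sketch: the function name trailing_spaces_survived and the file path passed to it are made up for illustration, and the result says nothing about any other implementation. It writes "foo" followed by two spaces and a new-line in text mode, reads the line back, and counts the spaces that appear before the new-line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "foo  \n" to `path` in text mode, reads the
   line back, and returns how many spaces precede the new-line on input.
   Returns -1 on any I/O failure. */
int trailing_spaces_survived(const char *path)
{
    char buf[128];
    FILE *fp;
    size_t len;
    int count = 0;

    assert(path != NULL);

    fp = fopen(path, "w");              /* "w", not "wb": text mode */
    if (fp == NULL)
        return -1;
    if (fputs("foo  \n", fp) == EOF) {
        fclose(fp);
        return -1;
    }
    if (fclose(fp) == EOF)
        return -1;

    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    if (fgets(buf, sizeof buf, fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);

    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n')
        len--;                          /* ignore the new-line itself */
    while (len > 0 && buf[len - 1] == ' ') {
        count++;                        /* count spaces from the end */
        len--;
    }
    return count;
}
```

A result of 2 means the spaces came back as written. A result of 0 means they were dropped, and a larger number would suggest system-imposed padding. All of these are permitted by 7.19.2p2.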
And what does this do to fsetpos() or especially fseek()? If an
implementation that drops trailing spaces is fseek'd to the first or
second space, does it return a \n in both cases?
On a system that ignores trailing spaces on input, you *can't* fseek()
to the first space, at least not using portable code. The input file
(which could have been an output file created by the same program)
will consistently appear to be a text file with no trailing spaces.
Remember that the result of fseek() for a text file doesn't
necessarily have a consistent physical meaning; it needn't be a count
of characters or bytes.
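In practice, the portable way to revisit a place in a text stream is to save the value ftell() returned when you were there and hand it back to fseek() with SEEK_SET, rather than computing an offset yourself. Here is a minimal sketch of that pattern. The helper name reread_line is hypothetical, and it assumes every line fits in the caller's buffer.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads the whole stream once, remembering the
   ftell() value at the start of line `target` (0-based), then seeks back
   to that saved position and reads the line into `out`.
   Assumes each line fits in `outsize` bytes, including the new-line.
   Returns 0 on success, -1 on failure or if the line does not exist. */
int reread_line(FILE *fp, int target, char *out, int outsize)
{
    long mark = -1L;
    int line;

    assert(fp != NULL && out != NULL && outsize > 1);

    rewind(fp);
    for (line = 0; ; line++) {
        long pos = ftell(fp);           /* opaque token for a text stream */
        if (pos == -1L)
            return -1;
        if (line == target)
            mark = pos;
        if (fgets(out, outsize, fp) == NULL)
            break;                      /* end of file or read error */
    }
    if (mark == -1L)
        return -1;                      /* never reached the target line */

    /* The only offsets used are 0 (via rewind) and a value that came
       from ftell() on this same stream. */
    if (fseek(fp, mark, SEEK_SET) != 0)
        return -1;
    return fgets(out, outsize, fp) != NULL ? 0 : -1;
}
```

Nothing here depends on what the saved value means physically, so it works the same whether or not the implementation drops trailing spaces.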
I'm still very confused. What exactly was the reason for this
clause in the standard? I use fixed width text fields all the time
and having the language nip off the trailing spaces at will
would wreck a heck of a lot of code. For programs which expect a
series of strings of width N, it is common to check that each read
has in fact returned a string of that length, and if not to blow up
as that's a read error, not an acceptable variation induced by the
compiler.
You're not likely to run into a system that actually ignores trailing
spaces. If you do, your code that uses fixed-width text fields might
break, but only if you actually have spaces at the very end of a line.
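If you want fixed-width code to survive such a system, one option is to treat a short line as a record whose trailing spaces were dropped and pad it back out to the expected width. This is only a sketch of that idea, not a recommendation: read_fixed_record is a made-up name, and the trade-off is that a genuinely truncated record can no longer be told apart from one that merely lost its padding.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from `fp` into `rec`, which must
   have room for width + 1 chars. The new-line is not stored; a line
   shorter than `width` is padded on the right with spaces.
   Returns 1 on success, 0 at end of file, and -1 on a read error or if
   the line is longer than `width` (the rest of that line is skipped). */
int read_fixed_record(FILE *fp, char *rec, size_t width)
{
    size_t len = 0;
    int c;

    assert(fp != NULL && rec != NULL);

    while ((c = getc(fp)) != EOF && c != '\n') {
        if (len == width) {
            /* Too long: discard the remainder of the line. */
            while ((c = getc(fp)) != EOF && c != '\n')
                ;
            return -1;
        }
        rec[len++] = (char)c;
    }
    if (c == EOF && len == 0)
        return ferror(fp) ? -1 : 0;     /* nothing left to read */
    if (ferror(fp))
        return -1;

    memset(rec + len, ' ', width - len);    /* restore dropped padding */
    rec[width] = '\0';
    return 1;
}
```

With a width of 4, a line that comes back as "ab" is returned as "ab" plus two spaces, while a line of six characters is rejected with -1.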