Christopher said:
> Ignoring Mr. Sosman and Mr. Bos would be colossally stupid.
Ignoring Mr. Sosman is sometimes a good idea, but not in
this case. (Or so it seems to me ...)
In the case at hand, my experience is at variance with
that reported by Mr. Navia, who argues that opening a
binary stream, seeking to the end, and reporting the ftell()
value as the file size is reliable. "I have never seen a
system where setting the file pointer at the end would fail,"
he writes. Well, neither have I ...
... but I *have* seen systems where the result of this
query is (or can be) useless. It tells you how many bytes
you could read from the file with a binary stream, but that
number is only one of several notions of "file size," and
possibly not the notion that the programmer cares about.
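For concreteness, the computation under discussion might be
sketched as follows. The function name binary_size is my own
invention for illustration, not anything from Mr. Navia's
post, and the value it returns is only the count of bytes
readable through a binary stream, assuming the implementation
supports the seek at all.

```c
#include <stdio.h>

/* Hypothetical helper (the name is an assumption): returns the
   number of bytes a binary stream would deliver from the named
   file, or -1L if the file cannot be opened or the seek/tell
   fails. */
long binary_size(const char *path)
{
    FILE *fp = fopen(path, "rb");
    long size = -1L;

    if (fp == NULL)
        return -1L;
    if (fseek(fp, 0L, SEEK_END) == 0)
        size = ftell(fp);       /* ftell itself yields -1L on failure */
    fclose(fp);
    return size;
}
```

Note that a long may be too narrow for very large files on
some systems, which is a separate limitation of this approach.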
The usual reason for wanting to know "the" size of a file
in a C program is to decide how much memory to allocate to
hold the file's content. If the file is in fact a big bag of
binary bytes, Navia's computation will work on every system
I've seen (even though the Standard says a binary stream need
not meaningfully support fseek() with SEEK_END, which suggests
that I haven't seen all systems). If the file
is textual, though, and will eventually be read with a text
stream, Navia's result may be well off the mark. Two cases
from real life:
- Navia's method can overestimate the character count by
failing to account for the translation of multi-byte
line-delimiting sequences to single newline characters.
The commonest example may be Windows' conversion
between \r\n and \n, but others exist -- I've used
one system where the line delimiter can be as long
as *five* bytes.
- Navia's method can *under*estimate the character count
by being blind to special formatting codes in the file.
This can happen on OpenVMS, where one of the file
formats puts a control prefix on each line indicating
the desired vertical spacing: leave so-and-so many
blank lines before/after this one. When translated
by a text stream, such a prefix can synthesize a
potentially large number of \n characters not actually
present on the disk.
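One way to sidestep the question, assuming the goal is simply
to get a text file into memory, is to skip the size query and
let the buffer grow while reading through the text stream
itself. A minimal sketch; the name slurp_text and the
1024-byte starting capacity are arbitrary choices of mine for
illustration:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (the name is an assumption): reads the
   rest of a text stream into a malloc'ed, NUL-terminated buffer,
   doubling the buffer as needed. Stores the character count in
   *len. Returns NULL on read or allocation failure; the caller
   frees the result. */
char *slurp_text(FILE *fp, size_t *len)
{
    size_t cap = 1024, used = 0;
    char *buf = malloc(cap);
    int ch;

    if (buf == NULL)
        return NULL;
    while ((ch = getc(fp)) != EOF) {
        if (used + 1 >= cap) {          /* keep room for the '\0' */
            char *tmp;
            if (cap > (size_t)-1 / 2) { /* doubling would overflow */
                free(buf);
                return NULL;
            }
            tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[used++] = (char)ch;
    }
    if (ferror(fp)) {                   /* EOF came from an error */
        free(buf);
        return NULL;
    }
    buf[used] = '\0';
    *len = used;
    return buf;
}
```

The count stored in *len is whatever the text stream actually
delivered, after any line-delimiter translation or synthesis
the implementation performs, so neither of the two cases above
can throw it off.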
Mr. Navia is a knowledgeable user of C, but in this matter
(as in others) he demonstrates that portability is not among
his principal concerns.
(Pompous enough for ya, Jacob?)