In my "Happy Christmas" message, I proposed a function to read
a file into a RAM buffer and return that buffer or NULL if
the file doesn't exist or some other error is found.
It is interesting to see that the answers to that message prove that
programming exclusively in standard C is completely impossible even
for a small and ridiculously simple program like the one I proposed.
1) I read the file contents in binary mode, which should allow me
to use ftell/fseek to determine the file size.
No objections to this were raised, except of course the obvious
one: what if the "file" is associated with a terminal, for
instance /dev/tty01 or similar on some Unix machine...
I did not test for this since it is impossible in standard C:
isatty() is not in the standard.
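The binary-mode ftell/fseek approach from point 1 can be sketched roughly as follows. This is not the function from the original message; read_whole_file and its parameters are names made up for illustration. It relies only on checking the return values of fseek and ftell, so a non-seekable stream is rejected only where the implementation actually reports the failure. Note also that the standard says a binary stream need not meaningfully support SEEK_END, so even this much is not guaranteed everywhere.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read a whole file into a malloc'd buffer.
   Returns NULL on any failure. On success *out_size holds the byte
   count and the buffer carries one extra terminating '\0'. */
char *read_whole_file(const char *path, size_t *out_size)
{
    FILE *fp;
    char *buf = NULL;
    long size;

    assert(path != NULL && out_size != NULL);

    fp = fopen(path, "rb");             /* binary mode: no translation */
    if (fp == NULL)
        return NULL;

    /* fseek is expected to fail on streams that cannot seek. */
    if (fseek(fp, 0L, SEEK_END) != 0)
        goto done;
    size = ftell(fp);                   /* limited to the range of long */
    if (size < 0 || fseek(fp, 0L, SEEK_SET) != 0)
        goto done;

    buf = malloc((size_t)size + 1);
    if (buf == NULL)
        goto done;

    if (fread(buf, 1, (size_t)size, fp) != (size_t)size) {
        free(buf);                      /* short read: treat as an error */
        buf = NULL;
        goto done;
    }
    buf[size] = '\0';
    *out_size = (size_t)size;

done:
    fclose(fp);
    return buf;
}
```

The caller owns the returned buffer and releases it with free(). The sketch makes no attempt to handle files larger than LONG_MAX.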
I would expect fseek() to return with an error if the file is not
"seekable", such as stdin.
2) There is NO portable way to determine which characters should be
ignored when transforming a binary file into a text file. One
reader (CB Falconer) proposed to open the file in binary mode
and then in text mode and compare the two buffers to see which
characters were missing... Well, that would be too expensive.
Oh, well, if you want to do this stuff in text mode, you need to
iterate character by character. This distinction of "text mode"
versus "binary mode" is nothing but performance problems no matter
what. Personally, log files are the only thing I ever open in text
mode any more.
3) I used different values for errno defined by POSIX, but not by
the C standard, which defines only a few. Again, error handling
is not something important to be standardized, according to
the committee. errno is there but its usage is absolutely
not portable at all and goes immediately beyond what standard C
offers.
Yeah, it's just one of the many ways the C language standard encourages
vendors to be non-portable. In any event, it's not friendly to
reentrancy anyways, as the error value needs to be copied out to make
way for other errors to be reported. If you are going to hang onto
error values anyways, you might as well just return them, and make up
your own. That's what I do. (OTOH, I don't consider mutable
static context, such as errno, to be a valid interface for anything.)
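For illustration, here is a minimal sketch of the "just return your own error values" style. The enum read_status, its members, and the functions get_file_size and read_status_text are all hypothetical names invented for this example; nothing here comes from the standard or from POSIX beyond the stdio calls.

```c
#include <assert.h>
#include <stdio.h>

/* Made-up status codes: the function returns them directly
   instead of leaving a value behind in errno. */
enum read_status {
    READ_OK = 0,
    READ_ERR_OPEN,      /* fopen failed */
    READ_ERR_SEEK       /* fseek or ftell failed */
};

/* Stores the size of the named file in *out_size.
   *out_size is left untouched unless READ_OK is returned. */
enum read_status get_file_size(const char *path, long *out_size)
{
    FILE *fp;
    enum read_status status = READ_OK;
    long size = -1;

    assert(path != NULL && out_size != NULL);

    fp = fopen(path, "rb");
    if (fp == NULL)
        return READ_ERR_OPEN;

    if (fseek(fp, 0L, SEEK_END) != 0 || (size = ftell(fp)) < 0)
        status = READ_ERR_SEEK;

    fclose(fp);
    if (status == READ_OK)
        *out_size = size;
    return status;
}

/* Human-readable text for a status code. */
const char *read_status_text(enum read_status s)
{
    switch (s) {
    case READ_OK:       return "ok";
    case READ_ERR_OPEN: return "cannot open file";
    case READ_ERR_SEEK: return "cannot determine file size";
    }
    return "unknown status";
}
```

Since the status travels in the return value, nothing has to be copied out of shared state before the next call, and the set of codes is the same on every platform.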
We hear again and again that this group is about standard C *"ONLY"*.
Could someone here then, tell me how this simple program could be
written in standard C?
Personally, I think trying to do something like this according to the
standard is a complete waste of time. I have files on disk today that
are in excess of ULONG_MAX in size. intmax_t (which is int64_t on my
system) is plenty large enough to hold the size, so it's not like it
couldn't or shouldn't be representable. And of course, my system also
has some nice 64-bit extensions to fseek and ftell, so we know it's
possible.
If we ignore file sizes and, say, allow INT_MAX-1 to be good enough
for the size of files, then I would recommend downloading "The Better
String Library" and just calling bread() (or breada()). It
iteratively allocates more and more memory to fit the data that is
read -- you can then call ballocmin() on the result if you want to
keep the memory usage tight. It's just an extra O(n) anyways. Meh.
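The grow-as-you-read idea can also be sketched in plain stdio, without asking for the file size at all. To be clear, this is not the Better String Library's code or API; read_stream is a hypothetical name, and the starting capacity of 4096 and the doubling policy are arbitrary choices for the example. The final realloc plays the role of trimming the buffer to fit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read everything remaining on fp into a
   malloc'd, '\0'-terminated buffer, growing it as needed.
   Returns NULL on allocation failure or read error. */
char *read_stream(FILE *fp, size_t *out_len)
{
    size_t cap = 4096;                  /* arbitrary starting capacity */
    size_t len = 0;
    char *buf, *tmp;

    assert(fp != NULL && out_len != NULL);

    buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        size_t want = cap - len - 1;    /* keep one byte for '\0' */
        size_t got = fread(buf + len, 1, want, fp);
        len += got;
        if (got < want)
            break;                      /* end of file or read error */

        if (cap > SIZE_MAX / 2) {       /* doubling would overflow */
            free(buf);
            return NULL;
        }
        tmp = realloc(buf, cap * 2);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
        cap *= 2;
    }

    if (ferror(fp)) {
        free(buf);
        return NULL;
    }
    buf[len] = '\0';

    /* Trim the allocation to fit; keep the larger block if this fails. */
    tmp = realloc(buf, len + 1);
    if (tmp != NULL)
        buf = tmp;

    *out_len = len;
    return buf;
}
```

Because it never seeks, this also works on streams such as stdin, at the cost of the extra copying done by realloc.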
This confirms my arguments about the need to improve the quality
of the standard library!
Well, for fseek and ftell, the case is very clear and obvious.
"long" is not an appropriate type for file offsets. This was
"corrected" with fgetpos() and fsetpos(), except that they took all
arithmetic away from you for some reason, and there is no way to
simply seek to the end of the file with them (without simply reading
it all). Clearly we need to add something like:
intmax_t fgetfilesize (FILE * fp);
int faddpos (fpos_t * pos, intmax_t offset);
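To show the intended interface, here is a rough sketch of the first proposed function layered over what stdio offers today. fgetfilesize does not exist in any standard; this stand-in goes through fseek and ftell, so it inherits the range of long, which is exactly the limitation the proposal is meant to remove. A real implementation would need platform support.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Sketch of the proposed (nonexistent) fgetfilesize.
   Returns the size of the file behind fp, or -1 on failure.
   The current position is saved and restored around the seek. */
intmax_t fgetfilesize(FILE *fp)
{
    fpos_t saved;
    long end;

    assert(fp != NULL);

    if (fgetpos(fp, &saved) != 0)
        return -1;
    if (fseek(fp, 0L, SEEK_END) != 0)
        return -1;
    end = ftell(fp);                    /* still limited to long here */
    if (fsetpos(fp, &saved) != 0)
        return -1;

    return end < 0 ? -1 : (intmax_t)end;
}
```

The return type is the only part that scales: callers written against intmax_t would not have to change if the body were replaced by a 64-bit platform call.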
You can't do *anything* in just standard C.
Well that might be an exaggeration. But what can definitively be
said, is that as technology and ideas improve, what the C standard
specifies becomes applicable to a smaller and smaller percentage of
that space. And certainly portable C is diminishing to zero.
What the standard specifies does not scale, and does not take
improving technology into account, except in that it allows platforms
to extend away and do their own thing with respect to implementation
specific behavior. (Which makes the idea of a "standard"
meaningless.)
A fairly straightforward application like Bittorrent cannot be
written in anything resembling portable C code, not even just the file
system back end. It is just an embarrassment that a tool that should be
seen as a system level tool is best written in Python or Java.