Paavo Helde said:
Why "a few chars less"? Because you are not sure about the documentation?
Or about yourself?
had to go check the documentation...
actually, I had thought the N was the max number of chars to read, excluding
the '\n' and the terminating 0.
apparently fgets accounts for the terminator itself: it reads at most N-1
chars and always stores the '\0'...
oh well...
doesn't matter too much if there is an occasional fgets around with an N of
254...
Accepted by who? I'm serving 10MB HTTP packets through std::string so I'm
sorry I have never heard of this convention. (There was a 256-char string
limitation in Turbo Pascal 3.3, but fortunately that is about 15 years in
the past ;-)
"accepted" by traditional practice.
typically, constants like PATH_MAX, ... are 256 (at least on some systems).
it doesn't take long (for example, if one digs around in system headers)
before a string-length limit of 256 becomes a recurring pattern (even if
there are variations; for example, UNIX_PATH_MAX is 108, but I think this is
because of a general rule that (sizeof(sockaddr_storage)==128) or so, ...).
there are many other examples of this particular limit being in use.
it is an accepted limit, much like how i, j, and k are accepted names for
integer variables, ...
granted, sometimes one wants a bigger limit though, and sometimes a bigger
limit is used (as there is no real technical reason for this particular
limit apart from convention), ...
I once wrote an HTTP server though, and requests with longer strings kept
coming from nowhere (mostly a string of repeating characters with some
garbage at the end), so in that case I made the limit 1024 and also put in a
limit check. (it can be noted that I think a lot of them were like 256 A's
followed by the garbage...).
luckily though, any buffer-overflow exploit intended for one server is
likely to do little more than crash another...
It seems you are confusing the human interface with the program
interface.
either way, this limit is established as a sort of rule of convention for
most well-formed text files.
it is much like how, by convention, a programmer should not write code with
lines longer than this limit.
Using fixed-size arrays does not mean you may skip the check if the data
fits in there. Actually, if the input is not verified and comes from
outside of the program, then it is ridiculous not to check its size. The
cost of doing that is zero, as compared to the time of getting the data
from outside into the program.
granted, external disk IO is usually measurable at around 20 MB/s IME.
however, there is a lot which often happens "within" the program, say, when
one's app is divided up into lots of DLLs which do much of their internal
communication via data serialized as strings, ...
one component will produce streams of text as its output, and another
component will parse them and follow embedded commands. many tasks may
involve many stages of processing of this sort (in addition to the use of
binary APIs, ...).
never mind that, in many of these cases, ANY unsafe input would be a security
risk, even if it does fit nicely into the buffers. the reason here being
that many of these facilities actually have access to features which are
either Turing-complete in their own right (yeah, this property tends to pop
up a lot...), or have access to code-generation machinery.
consider for example one has a text-stream "eval" mechanism. outside access
to eval is dangerous even if the text itself is well-formed, since eval will
generally allow whatever code hits it to muck around with the app (unless of
course the eval is sandboxed, but I am assuming here it is not...).
the same goes if several components are connected via a stream in a
PostScript-like format, and, say, some input goes over which fouls up the
command-interpreter, creates an infinite loop, or worse.
trivial example: "/foo {foo} def foo"
granted, this trivial case could be handled by detecting a stack overflow,
but in the general case, it would be difficult to secure even with input
validation...
I guess many viruses are in debt to guys like you when the "internally
safe" code somehow gets re-used and exploited in the wild.
or it could be just like expecting to check that pointers always point to
valid addressable memory (say, if one is using a garbage collector with the
ability to validate that a pointer is a heap pointer).
often, it would be too expensive, and too much of a hassle, to check these
things as a general matter of practice.
so, a tradeoff is made:
we assume that the caller is passing valid data, and typically check either
in code which is not likely to be a bottleneck, or where the "safety" of the
other end is not ensured.
typically, validity checking will be done when: performing file IO, dealing
with a network connection, or implementing or dealing with a public API.
if none of these is being done (for example, all this is stuff going on
purely internal to the app, which could happen easily enough) then there may
not be a need to validate.