Chris said:
This is, at best, an overstatement.
A buffer overflow happens when a fixed-size memory area is defined
but a program writes PAST the end of that fixed-size buffer.
Now, the standard specifies a buffer length of 26 bytes for the buffer
used by asctime.
In the official C standard of 1999 (C99) we find the specification of
the “asctime” function, page 341:
char *asctime(const struct tm *timeptr)
{
    static const char wday_name[7][3] = {
        "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"
    };
    static const char mon_name[12][3] = {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    };
    static char result[26]; // <<<<<<<------------------------!!
    sprintf(result, "%.3s %.3s%3d %.2d:%.2d:%.2d %d\n",
            wday_name[timeptr->tm_wday],
            mon_name[timeptr->tm_mon],
            timeptr->tm_mday, timeptr->tm_hour,
            timeptr->tm_min, timeptr->tm_sec,
            1900 + timeptr->tm_year);
    return result;
}
Nowhere is it specified that the year value should be less than 8100
(a tm_year of 8100 or more makes 1900 + tm_year at least five digits,
which no longer fits in the 26 bytes).
If you "put some wrong values" in, you have little hope of expecting
*anything*.
Of course. This is exactly the kind of sloppy-specification
attitude where anything goes, and no error analysis is ever done!
Shouldn't a seriously designed function have some way of
indicating an error when one of its inputs is wrong, instead of
just producing a buffer overflow?
Shouldn't a standard specify either:
o a bigger buffer to accommodate ANY year up to INT_MAX?
o a maximum year where the standard says (at least) that years must be
smaller than 8100 and sets upper and lower bounds for the input
data???
THAT would be a correctly specified function. UB would be clearly
signaled. In the text as it stands in the standard there is NO MENTION
of any limit!!!
I am not the first one to discover this.
Mr Clive Feather submitted a defect report saying in substance the
same thing I am saying here. The committee's answer was:
<quote>
Thus, asctime() may exhibit undefined behavior if any of the members of
timeptr produce undefined behavior in the sample algorithm (for example,
if the timeptr->tm_wday is outside the range 0 to 6 the function may
index beyond the end of an array).
As always, the range of undefined behavior permitted includes:
Corrupting memory
Aborting the program
Range checking the argument and returning a failure indicator (e.g., a
null pointer)
Returning truncated results within the traditional 26 byte buffer.
There is no consensus to make the suggested change or any change along
this line.
<end quote>
You read correctly. Corrupting memory (i.e. a buffer overflow) is
within the range of acceptable undefined behavior!!!!
I have the right, then, to call this by its name: a buffer overflow
in the C standard, with the whole committee behind it.
-- what happens in lcc-win32, for instance, if I write:
struct big { int a[1000]; };
struct big main(double oops) {
short x = strlen((char *)0x98766542);
... /* more "wrong values" as inputs as needed */
return *(struct big *)42;
}
? If you want to protect against bad inputs, you need to think
hard about which kinds of "bad inputs" to guard against, and do
some serious cost/benefit analysis.
Yes. Let's do that, OK?
The number of bytes needed is very easy to calculate. I explained
how to do this in my tutorial about the C language page 122:
<quote>
1.26.1.1 Getting rid of buffer overflows
How much buffer space we would need to protect asctime from buffer
overflows in the worst case?
This is very easy to calculate. We know that in all cases %d cannot
output more characters than the longest printed representation of an
int: the digits of INT_MAX plus a possible negative sign. Hence:
Number of digits N = 1 + ceil(log10((double)INT_MAX));
For a 32-bit int this is 11; for a 64-bit int it is 20.
In the asctime format string there are 5 %d-style format
specifications, so a generous buffer size is:
26+5*N bytes
In a 32 bit system this is 26+55=81.
This is a worst-case, oversized buffer, since we have already counted
some of those digits in the original calculation, where we have allowed
for 3+2+2+2+4 = 13 characters for the digits. A tighter calculation can
be done like this:
Literal characters besides the conversions (spaces, colons, newline): 6.
Characters produced by the two %.3s conversions: 6.
Number of %d specs: 5.
Total = 6+6+5*11 = 67 + terminating zero = 68.
The correct buffer size for a 32 bit system is 68.
<end quote>
COST and BENEFIT ANALYSIS:
--------------------------
The difference between 68 and 26 is 42. For sparing 42 bytes we have a
buffer overflow. Now do your cost/benefit analysis. Since I paid
120 euros for 2GB (2*1024*1024*1024) bytes, each byte costs
0.0000000558793544769287109375 euros. Times 42 gives:
0.000002346932888031005859375 euros.
Is this too expensive for you?
Moreover, if your objection is that values of .tm_year of 8100 or
more (or less than or equal to some negative number) cause
problems, you can always test for that in your own implementation:
A buffer of 68 bytes will handle ANY POSSIBLE INPUT in a 32 bit
implementation; there isn't even any need for testing!!!
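Both remedies can be combined in a user-level replacement (a
hypothetical sketch under a made-up name, asctime_checked, not any
standard or lcc-win32 function: it range-checks the fields, as the
committee's answer concedes an implementation may, returns NULL on bad
input, and sizes the buffer for the 32-bit worst case so that even an
unchecked year cannot overflow it):

```c
#include <stddef.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical checked variant of asctime: returns NULL on
   out-of-range input instead of overflowing a fixed buffer. */
char *asctime_checked(const struct tm *t)
{
    static const char wday_name[7][4] =
        { "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat" };
    static const char mon_name[12][4] =
        { "Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec" };
    static char result[68]; /* worst case: 6 + 6 + 5*11 + 1 bytes */

    if (t == NULL
        || t->tm_wday < 0 || t->tm_wday > 6
        || t->tm_mon  < 0 || t->tm_mon  > 11
        || t->tm_mday < 1 || t->tm_mday > 31
        || t->tm_hour < 0 || t->tm_hour > 23
        || t->tm_min  < 0 || t->tm_min  > 59
        || t->tm_sec  < 0 || t->tm_sec  > 60) /* 60: leap second */
        return NULL;

    /* No year check is needed for the buffer: any int prints in at
       most 11 characters. (The addition 1900 + tm_year could still
       overflow arithmetically at the extremes of the int range.) */
    sprintf(result, "%.3s %.3s%3d %.2d:%.2d:%.2d %d\n",
            wday_name[t->tm_wday], mon_name[t->tm_mon],
            t->tm_mday, t->tm_hour, t->tm_min, t->tm_sec,
            1900 + t->tm_year);
    return result;
}
```

A caller can then distinguish bad input from a formatted result:
if (asctime_checked(&t) == NULL) report the error.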
Are we developing software in 2007?
Or are we still living in the PDP-11 era?
Why this myopic attitude towards error analysis, which has led
people to abandon C as a reasonable language forever?
C == buffer overflow...
Many people think like this already. Do we need to furnish them a
proof with a buffer overflow in the text of the C standard?
Yours truly.
jacob