J
jacob navia
Contrary to the opinion of many people, I am convinced that the problem in C is not the language
itself, that I consider an excellent one, but the wrong use it is done of C by people that do not
take care about the details and do not respect the rules of software engineering.
I will use an example of badly written code to show several of the problems that arise when people
forget to take care of the details:
char *asctime(const struct tm *timeptr)
{
static const char wday_name[7][3] = {
"Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat" };
static const char mon_name[12][3] = {
"Jan", "Feb", "Mar", "Apr", "May", "Jun",
"Jul", "Aug", "Sep", "Oct", "Nov", "Dec" };
static char result[26];
sprintf(result, "%.3s %.3s%3d %.2d:%.2d:%.2d %d\n",
wday_name[timeptr->tm_wday],
mon_name[timeptr->tm_mon],
timeptr->tm_mday, timeptr->tm_hour,
timeptr->tm_min, timeptr->tm_sec,
1900 + timeptr->tm_year);
return result;
}
We can see here many errors here that are typical of badly written C programs:
(1) No error checking at all. The above code will crash or corrupt memory at the slightest error in
its input. This is a common error that leads to brittle software: software that crashes too often
for anybody's taste. It can be argued that this function supposes that the specifications forbid
(2) No provision for error return. We see that this program will always return the buffer, even in
the case of an error. Obviously for some inputs the program will crash before it reaches the return
statement, but for many inputs, it will just access undefined memory locations leading to wrong results.
(3) The use of snprintf, instead of sprintf is recommended. Sprintf doesn't allow to specify the
length of the buffer, so it is very error prone. It is interesting that this code appears in a book
about C99. C99 introduced snprintf, but the authors apparently forgot its existence.
(4) The sprintf specification is "%.3s %.3s%3d %.2d:%.2d:%.2d %d\n" but the buffer "result" is
dimensioned to 26 positions. This is wrong.
How much buffer space we would need to protect asctime from buffer overflows in the worst
case?
This is very easy to calculate. We know that in all cases, %d can't output more characters than
the maximum numbers of characters an integer can hold. This is INT_MAX, and taking into account the
possible negative sign we know that:
Number of digits N = 1 + ceil(log10((double)INT_MAX));
For a 32 bit system this is 11, for a 64 bit system this is 21.
In the asctime format specification there are 5 %d format specifications, meaning that we have as
the size for the buffer the expression:
26+5*N bytes
In a 32 bit system this is 26+55=81.
This is a worst case oversized buffer, since we have already counted some of those digits in the
original calculation, where we have allowed for 3+2+2+2+4 = 13 characters for the digits. A
tighter calculation can be done like this:
Number of characters besides specifications (%d or %s) in the string: 6.
Number of %d specs 5
Total = 6+5*11 = 61 + terminating zero 62.
The correct buffer size for a 32 bit system is 62.
The problems of buffer overflows (that can happen in all programming languages by the way, they are
not specific to C) in this case are the result of a bad calculation of the maximum buffer required.
(5) This function returns a pointer to a static buffer. This is best avoided in multithreaded
applications since all those functions can never be made reentrant
In the next installment of this series I will propose another function that fixes most problems and
retains the interface of asctime().
itself, that I consider an excellent one, but the wrong use it is done of C by people that do not
take care about the details and do not respect the rules of software engineering.
I will use an example of badly written code to show several of the problems that arise when people
forget to take care of the details:
char *asctime(const struct tm *timeptr)
{
static const char wday_name[7][3] = {
"Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat" };
static const char mon_name[12][3] = {
"Jan", "Feb", "Mar", "Apr", "May", "Jun",
"Jul", "Aug", "Sep", "Oct", "Nov", "Dec" };
static char result[26];
sprintf(result, "%.3s %.3s%3d %.2d:%.2d:%.2d %d\n",
wday_name[timeptr->tm_wday],
mon_name[timeptr->tm_mon],
timeptr->tm_mday, timeptr->tm_hour,
timeptr->tm_min, timeptr->tm_sec,
1900 + timeptr->tm_year);
return result;
}
We can see here many errors here that are typical of badly written C programs:
(1) No error checking at all. The above code will crash or corrupt memory at the slightest error in
its input. This is a common error that leads to brittle software: software that crashes too often
for anybody's taste. It can be argued that this function supposes that the specifications forbid
(2) No provision for error return. We see that this program will always return the buffer, even in
the case of an error. Obviously for some inputs the program will crash before it reaches the return
statement, but for many inputs, it will just access undefined memory locations leading to wrong results.
(3) The use of snprintf, instead of sprintf is recommended. Sprintf doesn't allow to specify the
length of the buffer, so it is very error prone. It is interesting that this code appears in a book
about C99. C99 introduced snprintf, but the authors apparently forgot its existence.
(4) The sprintf specification is "%.3s %.3s%3d %.2d:%.2d:%.2d %d\n" but the buffer "result" is
dimensioned to 26 positions. This is wrong.
How much buffer space we would need to protect asctime from buffer overflows in the worst
case?
This is very easy to calculate. We know that in all cases, %d can't output more characters than
the maximum numbers of characters an integer can hold. This is INT_MAX, and taking into account the
possible negative sign we know that:
Number of digits N = 1 + ceil(log10((double)INT_MAX));
For a 32 bit system this is 11, for a 64 bit system this is 21.
In the asctime format specification there are 5 %d format specifications, meaning that we have as
the size for the buffer the expression:
26+5*N bytes
In a 32 bit system this is 26+55=81.
This is a worst case oversized buffer, since we have already counted some of those digits in the
original calculation, where we have allowed for 3+2+2+2+4 = 13 characters for the digits. A
tighter calculation can be done like this:
Number of characters besides specifications (%d or %s) in the string: 6.
Number of %d specs 5
Total = 6+5*11 = 61 + terminating zero 62.
The correct buffer size for a 32 bit system is 62.
The problems of buffer overflows (that can happen in all programming languages by the way, they are
not specific to C) in this case are the result of a bad calculation of the maximum buffer required.
(5) This function returns a pointer to a static buffer. This is best avoided in multithreaded
applications since all those functions can never be made reentrant
In the next installment of this series I will propose another function that fixes most problems and
retains the interface of asctime().