Buffer overflows and C A simple example

J

jacob navia

Contrary to the opinion of many people, I am convinced that the problem in C is not the language
itself, that I consider an excellent one, but the wrong use it is done of C by people that do not
take care about the details and do not respect the rules of software engineering.

I will use an example of badly written code to show several of the problems that arise when people
forget to take care of the details:

char *asctime(const struct tm *timeptr)
{
static const char wday_name[7][3] = {
"Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat" };
static const char mon_name[12][3] = {
"Jan", "Feb", "Mar", "Apr", "May", "Jun",
"Jul", "Aug", "Sep", "Oct", "Nov", "Dec" };
static char result[26];

sprintf(result, "%.3s %.3s%3d %.2d:%.2d:%.2d %d\n",
wday_name[timeptr->tm_wday],
mon_name[timeptr->tm_mon],
timeptr->tm_mday, timeptr->tm_hour,
timeptr->tm_min, timeptr->tm_sec,
1900 + timeptr->tm_year);
return result;
}


We can see here many errors here that are typical of badly written C programs:

(1) No error checking at all. The above code will crash or corrupt memory at the slightest error in
its input. This is a common error that leads to brittle software: software that crashes too often
for anybody's taste. It can be argued that this function supposes that the specifications forbid

(2) No provision for error return. We see that this program will always return the buffer, even in
the case of an error. Obviously for some inputs the program will crash before it reaches the return
statement, but for many inputs, it will just access undefined memory locations leading to wrong results.

(3) The use of snprintf, instead of sprintf is recommended. Sprintf doesn't allow to specify the
length of the buffer, so it is very error prone. It is interesting that this code appears in a book
about C99. C99 introduced snprintf, but the authors apparently forgot its existence.

(4) The sprintf specification is "%.3s %.3s%3d %.2d:%.2d:%.2d %d\n" but the buffer "result" is
dimensioned to 26 positions. This is wrong.

How much buffer space we would need to protect asctime from buffer overflows in the worst
case?

This is very easy to calculate. We know that in all cases, %d can't output more characters than
the maximum numbers of characters an integer can hold. This is INT_MAX, and taking into account the
possible negative sign we know that:

Number of digits N = 1 + ceil(log10((double)INT_MAX));

For a 32 bit system this is 11, for a 64 bit system this is 21.

In the asctime format specification there are 5 %d format specifications, meaning that we have as
the size for the buffer the expression:

26+5*N bytes

In a 32 bit system this is 26+55=81.

This is a worst case oversized buffer, since we have already counted some of those digits in the
original calculation, where we have allowed for 3+2+2+2+4 = 13 characters for the digits. A
tighter calculation can be done like this:

Number of characters besides specifications (%d or %s) in the string: 6.
Number of %d specs 5
Total = 6+5*11 = 61 + terminating zero 62.
The correct buffer size for a 32 bit system is 62.

The problems of buffer overflows (that can happen in all programming languages by the way, they are
not specific to C) in this case are the result of a bad calculation of the maximum buffer required.

(5) This function returns a pointer to a static buffer. This is best avoided in multithreaded
applications since all those functions can never be made reentrant


In the next installment of this series I will propose another function that fixes most problems and
retains the interface of asctime().
 
J

jacob navia

jacob navia a écrit :
(1) No error checking at all. The above code will crash or corrupt
memory at the slightest error in its input. This is a common error that
leads to brittle software: software that crashes too often for anybody's
taste. It can be argued that this function supposes that the
specifications forbid

.... wrong values.

Sorry I missed those words.
 
K

Keith Thompson

jacob navia said:
Contrary to the opinion of many people, I am convinced that the
problem in C is not the language itself, that I consider an excellent
one, but the wrong use it is done of C by people that do not take care
about the details and do not respect the rules of software engineering.

I will use an example of badly written code to show several of the
problems that arise when people forget to take care of the details:

char *asctime(const struct tm *timeptr)
{ [snip]
}

[big snip; yes, the code has some problems]
In the next installment of this series I will propose another function
that fixes most problems and retains the interface of asctime().

Why bother? There are some problems with the standard's presented
implementation of asctime(), and it could be made a bit more robust
without breaking existing portable code. But IMHO the defined
interface to asctime() is at least as bad as the implementation.
It forces the use of a date format that's US-centric and *should*
be considered obsolete, it appends a newline to the result, and it
uses a static buffer that's clobbered on each call. Like old-style
function definitions, it exists in the standard only to support
legacy code.

Why not just use the much more flexible strftime()?

I advocate deprecating both asctime() and ctime(), and until they're
removed from the standard, leaving their definitions alone apart from
adding a footnote outlining the circumstances in which their behavior
is undefined. (The footnote needn't go into all the details, such
as the fact that an out-of-range tm_sec value doesn't cause UB if
tm_year is sufficiently small.)

Wouldn't that solve the problem you're concerned with?

As I said in another thread, why sharpen the corners on a square
wheel?

(Or are you going to ignore this because I wrote it? And are you
going to reply to my recent e-mail?)
 
J

jacob navia

Gareth Owen a écrit :
jacob navia said:
I will use an example of badly written code to show several of the
problems that arise when people forget to take care of the details:

char *asctime(const struct tm *timeptr)
{
static const char wday_name[7][3] = {
"Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat" };
static const char mon_name[12][3] = {
"Jan", "Feb", "Mar", "Apr", "May", "Jun",
"Jul", "Aug", "Sep", "Oct", "Nov", "Dec" };
static char result[26];

sprintf(result, "%.3s %.3s%3d %.2d:%.2d:%.2d %d\n",
wday_name[timeptr->tm_wday],
mon_name[timeptr->tm_mon],
timeptr->tm_mday, timeptr->tm_hour,
timeptr->tm_min, timeptr->tm_sec,
1900 + timeptr->tm_year);
return result;
}

asctime() is poorly defined. Wow. I'm shocked. I wish someone had
mentioned this before.

Do not fear Gareth!!!

After you have overcome the schock and surprise you can now relax.
The committee itself has decided to redefine that, and added language
to make it at least well behaved.

Best to burn the standard and start again from scratch.

Ohhh BAD BOY!!!!

What a pity. After you burn the standard you will have to buy it again...
Well, its just 20 bucks, who cares?

Or add a container library nobody else wants. One or the other.

#include <stdio.h>
#include <stdlib.h>
int main (void)
{
for (int i=0; i<INT_MAX;i++) {
printf("I HATE jacob\nNobody wants his stupid library\n");
}
return (EXIT_FAILURE);
}
Gareth "Not anonymous, really, that's my real name, FWIW, I was named
after Gareth Edwards" Owen

GREAT!

Nice to meet you Mr Owen.
 
K

Keith Thompson

jacob navia said:
Gareth Owen a écrit : [...]
asctime() is poorly defined. Wow. I'm shocked. I wish someone had
mentioned this before.
[...]
After you have overcome the schock and surprise you can now relax.
The committee itself has decided to redefine that, and added language
to make it at least well behaved.
[...]

The committee has merely stated (and slightly expanded) the
circumstances in which its behavior is undefined. It's an
improvement, but it's not a fundamental change.

I can think of only one case where someone called asctime() in a
manner that caused its behavior to be undefined (other than in code
specifically intended to demonstrate the issue). That was something
you had once written. I don't remember the exact details, but that
call's behavior is still undefined under the revised C201X draft.

But again, if you're satisfied with the change, I certainly won't
argue that you shouldn't be.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,580
Members
45,055
Latest member
SlimSparkKetoACVReview

Latest Threads

Top