Eradicating a Buffer Overflow


Arctic Fidelity

Hello everyone, here is a sample program that I think has a possible
buffer overflow vulnerability:

#include <stdio.h>

int main(int argc, char *argv[])
{
    char junk[10]; /* Possibly dangerous */
    char wday[4];
    char mon[4];
    char time[9];
    int day;
    int year;

    if (argc == 2) {
        sscanf(argv[1],
               "%3s, %d %3s %4d %8s %s",
               wday,
               &day,
               mon,
               &year,
               time,
               junk);
    }

    return 0;
}

Now, from what I can tell, wday, mon, and time will all be safe, because
there is a very strict limit to how much will be scanned in. The problem
seems to be the junk buffer.

As you can guess, this is designed to take a specifically formatted date
string and read it into variables. However, in the date format I am
processing (mbox/overview file type dates), there is an extra bit after
the time that could be an arbitrary length. Generally, it's not bigger
than 10, which is why I initially used that value, but it did not click in
my head before that this would cause a problem. Then, while I was thinking
about it today, I realized that you could put in more than 10 characters
after the time section of the string, and overflow the program. My
question is, what is the proper way of handling this? How can I remedy it?
I could change %s to %9s or something of that nature, but that would be
ugly, because I would end up with a bunch of whitespace and padding at the
beginning or the end. Ideally I would not want that. How could I make this
a safer process?

- Arctic Fidelity
 

Richard Bos

Arctic Fidelity said:
    char junk[10]; /* Possibly dangerous */
    char wday[4];
    char mon[4];
    char time[9];
    int day;
    int year;

    if (argc == 2) {
        sscanf(argv[1],
               "%3s, %d %3s %4d %8s %s",
               wday,
               &day,
               mon,
               &year,
               time,
               junk);
    }

Then, while I was thinking
about it today, I realized that you could put in more than 10 characters
after the time section of the string, and overflow the program.

Yes, that's correct.

My question is, what is the proper way of handling this? How can I remedy it?
I could change %s to %9s or something of that nature, but that would be
ugly, because I would end up with a bunch of whitespace and padding at the
beginning or the end.

If it really is junk, I would not bother to read it at all. If you want
to inspect it, and it can be of any length, there are several things you
can do. The simplest solution is probably not to copy it into another
char array, but to set a char pointer inside the argument string.

Richard
 

Arctic Fidelity

If it really is junk, I would not bother to read it at all. If you want
to inspect it, and it can be of any length, there are several things you
can do. The simplest solution is probably not to copy it into another
char array, but to set a char pointer inside the argument string.

Hmm, alright, not sure I understand that. It *is* junk, but since it is
junk, do I just leave out the "%s" part in sscanf and sscanf will not read
it? How would I just dump that extra part?

- Arctic Fidelity
 

Richard Bos

Arctic Fidelity said:
Hmm, alright, not sure I understand that. It *is* junk, but since it is
junk, do I just leave out the "%s" part in sscanf and sscanf will not read
it?

Quite.

How would I just dump that extra part?

What do you mean, "dump"? You have a single string. You're not reading
from a file. You don't need to dump anything.

BTW, I note that you're doing this to a command line argument. You are
aware that any single command line argument - that is, any single member
of argv[] - is highly unlikely to contain your _entire_ command line,
and will probably not even contain any spaces unless what- or whoever
called your program has taken special precautions to see that it does?

Richard
 

Jirka Klaue

Arctic Fidelity:
....
Hmm, alright, not sure I understand that. It *is* junk, but since it is
junk, do I just leave out the "%s" part in sscanf and sscanf will not
read it? How would I just dump that extra part?

You could use %*s, but since it is the last argument of sscanf, you could
just drop it as well.

Jirka
 

Arctic Fidelity

What do you mean, "dump"? You have a single string. You're not reading
from a file. You don't need to dump anything.

Hehe, I was not aware that you could specify a format string in sscanf
that did not encompass the entire string. :) My bad. I suppose the
correct phrasing is, ignore.

BTW, I note that you're doing this to a command line argument. You are
aware that any single command line argument - that is, any single member
of argv[] - is highly unlikely to contain your _entire_ command line,
and will probably not even contain any spaces unless what- or whoever
called your program has taken special precautions to see that it does?

Yes, I am aware of that. This was just the smallest reasonable program
that I could make to demonstrate my question. My actual use of this
function has nothing to do with command line arguments or any such thing.
I figured you all wouldn't want to see the rest of the code, when all I
was asking about was a simple buffer overflow.

- Arctic Fidelity
 

Arctic Fidelity

If it really is junk, I would not bother to read it at all. If you want
to inspect it, and it can be of any length, there are several things you
can do.

BTW, thank you for the quick response. I learn something new every day I
read this group.

- Arctic Fidelity
 

Arctic Fidelity

You could use %*s, but since it is the last argument of sscanf, you could
just drop it as well.

Thank you very much. I had no idea that %*s worked like that with sscanf.
I re-read the information in my manual about that, and lo, and behold,
look what I find! :) Thanks a bunch.

- Arctic Fidelity
 

SM Ryan

# sscanf(argv[1],

# after the time section of the string, and overflow the program. My
# question is, what is the proper way of handling this? How can I remedy it?

Do you realize you aren't required to use *scanf? If the tools are
too difficult to use, get better tools.
 

Arctic Fidelity

Do you realize you aren't required to use *scanf? If the tools are
too difficult to use, get better tools.

I suppose I should say that I am unsure of what other tools in the
Standard C Library allow me to extract, in one function call, all the date
information from a string that I need, in such a straightforward fashion.
If there is, I'd love to hear it. :) I personally came across a sample
usage of sscanf in documentation, and found that it was much faster
compared to my original idea of single character stepping through the date
string.

- Arctic Fidelity
 

Walter Roberson

I personally came across a sample
usage of sscanf in documentation, and found that it was much faster
compared to my original idea of single character stepping through the date
string.

Faster? In what sense? Faster to write the code, or faster execution
time, or faster to debug the security problems?
 

Richard Bos

SM Ryan said:
# sscanf(argv[1],

# after the time section of the string, and overflow the program. My
# question is, what is the proper way of handling this? How can I remedy it?

Do you realize you aren't required to use *scanf? If the tools are
too difficult to use, get better tools.

And what better tool would you use in this particular situation?

Richard
 

Arctic Fidelity

Faster? In what sense? Faster to write the code, or faster execution
time, or faster to debug the security problems?

Faster to debug the code, the security issues, faster to write the code,
though I am not sure about execution speed, I haven't tested that.

- Arctic Fidelity
 

SM Ryan

(e-mail address removed) (Richard Bos) wrote:
#
# >
# > # sscanf(argv[1],
# >
# > # after the time section of the string, and overflow the program. My
# > # question is, what is the proper way of handling this? How can I remedy it?
# >
# > Do you realize you aren't required to use *scanf? If the tools are
# > too difficult to use, get better tools.
#
# And what better tool would you use in this particular situation?

I usually write my own parser with things like state machines, strchr,
isxxx, strtol, etc. I prefer writing longer code if necessary to ensure
I have it under control.

Then again I'm not the one who felt the need to ask others if a scanf
format was safe.
 

WhoCares?

BTW, I note that you're doing this to a command line argument. You are
aware that any single command line argument - that is, any single member
of argv[] - is highly unlikely to contain your _entire_ command line,
and will probably not even contain any spaces unless what- or whoever
called your program has taken special precautions to see that it does?

Would you please tell me how to do that?
In my _limited_ knowledge, if the command line contains whitespaces
then the strings separated by these whitespaces are passed as different
members of argv[]. How to get all that in a single member?

Thanks in advance.
 

Walter Roberson

BTW, I note that you're doing this to a command line argument. You are
aware that any single command line argument - that is, any single member
of argv[] - is highly unlikely to contain your _entire_ command line,
and will probably not even contain any spaces unless what- or whoever
called your program has taken special precautions to see that it does?

Would you please tell me how to do that?
In my _limited_ knowledge, if the command line contains whitespaces
then the strings separated by these whitespaces are passed as different
members of argv[]. How to get all that in a single member?

Mechanisms for passing spaces in arguments are OS or shell specific,
and should be asked about in an appropriate newsgroup.

[OT]

Commonly, passing spaces in involves quoting of arguments. But the
exact quote characters and escape rules are OS or shell specific.

Unix ksh:

A3="arg 3"
./myprog "arg 1" 'arg 2' $A3

The behaviour in the last of those cases especially is not the same
on other shells or OS's.
 

Richard Bos

[ Please do not remove attributions that are still relevant. Thanks. ]

BTW, I note that you're doing this to a command line argument. You are
aware that any single command line argument - that is, any single member
of argv[] - is highly unlikely to contain your _entire_ command line,
and will probably not even contain any spaces unless what- or whoever
called your program has taken special precautions to see that it does?

Would you please tell me how to do that?

Depends on the OS, and under some OSes, on the shell.

In my _limited_ knowledge, if the command line contains whitespaces
then the strings separated by these whitespaces are passed as different
members of argv[].

That's the most usual case, yes. But consider an OS which does not have
a command line, but allows you to fill in program parameters in the
symlink properties dialog. Or consider a shell which allows you to
escape whitespace.

Richard
 

Walter Roberson

On Mon, 24 Oct 2005 20:52:37 -0400, Walter Roberson
Faster to debug the code, the security issues, faster to write the code,
though I am not sure about execution speed, I haven't tested that.

You snipped the context that you had encountered scanf() in some
documentation and had found using it to be faster.

With regard to the debugging the security issues, you should be
taking into account that in order to debug those issues, you ended
up having to post to Usenet and to track through several days of
discussions in order to get the security issues clear. If you had
written your own small code section that did not use scanf(),
then you could have had it done in perhaps half an hour. And since
debugging the code includes debugging the security issues, you
weren't done debugging the code for at least several days.

Similarly, part of writing the code is debugging it, and documenting
it. Again you had the several days of delay while you found out
what scanf() does. So writing the code was in fact slower than if you
had taken a more direct approach without scanf().

The only "faster" left is execution time, which you indicate that
you did not measure.

I think you should be reconsidering whether it was any "faster"
to use scanf() or not. It looks to me that using scanf() was slower
in every measure you were taking into account.
 

Arctic Fidelity

You snipped the context that you had encountered scanf() in some
documentation and had found using it to be faster.

My sincere apologies. I shall try to take better note of that.

With regard to the debugging the security issues, you should be
taking into account that in order to debug those issues, you ended
up having to post to Usenet and to track through several days of
discussions in order to get the security issues clear. If you had
written your own small code section that did not use scanf(),
then you could have had it done in perhaps half an hour. And since
debugging the code includes debugging the security issues, you
weren't done debugging the code for at least several days.

Actually, I received a response that well enough answered my question to
the point where I knew where to look and how to look in my documentation
that I was able to have the entire question settled from post to end in
far less than half a day. The time that it would have taken for me to
properly fix and make an even equivalently working program of my own,
would have been at least that long, since my speed at writing C code is
not nearly fast enough for that yet.

Similarly, part of writing the code is debugging it, and documenting
it. Again you had the several days of delay while you found out
what scanf() does. So writing the code was in fact slower than if you
had taken a more direct approach without scanf().

Whether the approach is more direct or not is debatable. I would say that
in comparison with the estimated amount of time it would have taken to
properly fix and verify the code I would have written, sscanf() + Usenet
discussion time (and by this I mean the time before I fixed the problem)
was faster.

The only "faster" left is execution time, which you indicate that
you did not measure.

I think you should be reconsidering whether it was any "faster"
to use scanf() or not. It looks to me that using scanf() was slower
in every measure you were taking into account.

Having reconsidered it, I have come to the conclusion that sscanf
seems to be at least equivalent in "speed" (with regards to those issues
stated above) to writing my own code by hand, taking into account my
relative speed at writing such code at this moment in time.

- Arctic Fidelity
 

Arctic Fidelity

Then again I'm not the one who felt the need to ask others if a scanf
format was safe.

In some regards, I feel almost as though having asked this question has
earned me at least a slight bit of disdain from some readers of this
group. Am I missing something? Forgive me if I am reading too much into
such things. I am under the impression that scanf and related functions
have a group of people who object fairly strongly to their use. If so, is
there some history, method, or something else about these scanf tools with
which I am not familiar that has earned them such apparent dislike? If
not, well, then do please ignore the far too naive jabberings of a
simpleton of the C world, a relative newcomer.

- Arctic Fidelity
 
