Buffer Overruns and Other C Gotchas -- "Coders at Work"

  • Thread starter Casey Hawthorne

Keith Thompson

Tom St Denis said:
My point, more so, was that as an instantiation of an environment in
which C may be used, it's not always so fubared or dramatic. Ideally,
if the spec says the output can only be 26 chars, the Sun platform
should return an appropriate error condition instead of crashing. And
that's THEIR fault for so blindly copying it.

No, it's your fault for calling asctime with an argument for which its
behavior is undefined. It's a more understandable error than
    char s[5];
    strcpy(s, "hello, world");
because the valid inputs for asctime() are harder to figure out than
the valid input for strcpy(). But it's still the case that your
program's behavior is undefined, and can blow up on a conforming
implementation.
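For contrast, sizing the destination from the string literal itself
makes the same copy well defined (a minimal sketch; sizeof on a string
literal counts the terminating '\0'):

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        /* sizeof "hello, world" is 13: twelve characters plus the
           '\0', so the destination is exactly large enough. */
        char s[sizeof "hello, world"];
        strcpy(s, "hello, world");
        puts(s);
        return 0;
    }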

As it turns out, Sun *didn't* blindly copy the implementation in the
standard. Here's a modified version of your program:

#include <time.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    struct tm t;
    char *p;

    memset(&t, 0, sizeof t);
    t.tm_year = 10000;
    p = asctime(&t);
    if (p == NULL) {
        puts("asctime returned NULL");
    }
    else {
        printf("asctime returned %p --> \"%s\"\n", (void*)p, p);
    }
    return 0;
}

and its output on Solaris 9:

asctime returned NULL

The standard doesn't specify that asctime returns a null pointer on
error, but it's a reasonable convention -- and it would have helped
you catch your error more quickly than on a glibc-based
implementation.

[...]
I just read the C99 spec for asctime. Nowhere does it say the buffer
can only be 26 bytes. It describes what the output format must look
like, but never mentions the length. The C code happens to mention
the length in passing but I really consider the C code an example [a
poor one] that produces the desired output format.

No, the C code is not just an example. It is the definition of the
algorithm.
I haven't read anywhere that says you can't have a year of 11900.

Read the algorithm in the standard. Trace through it, and see what
happens if you pass it a year of 11900. It's not stated explicitly
(which is unfortunate), but it's still right there in black and white.
I also don't consider the C code in the spec to define how the
algorithm that produces the output must be written.

No, but whatever algorithm is used must be equivalent to the one
presented.
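For reference, the sample definition in C99 7.23.3.1 is essentially the
following (reproduced from the standard's description; the fixed
26-byte static buffer and the unbounded %d year field are exactly where
a five-digit year overruns):

    /* As presented in the standard; needs <time.h> and <stdio.h>
       to compile standalone. */
    char *asctime(const struct tm *timeptr)
    {
        static const char wday_name[7][3] = {
            "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"
        };
        static const char mon_name[12][3] = {
            "Jan", "Feb", "Mar", "Apr", "May", "Jun",
            "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
        };
        static char result[26];    /* sized for a 4-digit year */

        sprintf(result, "%.3s %.3s%3d %.2d:%.2d:%.2d %d\n",
                wday_name[timeptr->tm_wday],
                mon_name[timeptr->tm_mon],
                timeptr->tm_mday, timeptr->tm_hour,
                timeptr->tm_min, timeptr->tm_sec,
                1900 + timeptr->tm_year);
        return result;
    }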
To me the definition of asctime() is

---
The asctime function converts the broken-down time in the structure
pointed to by timeptr into a string in the form
Sun Sep 16 01:03:52 1973\n\0
---

Only if you stop reading there. The rest of that section isn't
decorative; it means something.

[...]
Well I think that's the bigger point here, these C functions have
explicit/implicit assumptions of the inputs.
Yes.

I haven't read anywhere that explicitly states you can't pass memcpy()
[section 7.21.2.1] NULL as one of the pointers. We just "know" that
because dereferencing a NULL pointer leads to undefined behaviour.

C99 7.1.4, Use of library functions:

Each of the following statements applies unless explicitly
stated otherwise in the detailed descriptions that follow:
If an argument to a function has an invalid value (such as
a value outside the domain of the function, or a pointer
outside the address space of the program, or a null pointer,
or a pointer to non-modifiable storage when the corresponding
parameter is not const-qualified) or a type (after promotion)
not expected by a function with variable number of arguments,
the behavior is undefined.
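In practice that means the null check is the caller's job; a minimal
sketch of a defensive wrapper (memcpy_checked is a hypothetical name,
not a standard function):

    #include <stddef.h>
    #include <string.h>

    /* C99 7.1.4 makes memcpy undefined for null pointers even when
       n == 0, so refuse them up front instead of invoking UB. */
    void *memcpy_checked(void *dst, const void *src, size_t n)
    {
        if (dst == NULL || src == NULL)
            return NULL;
        return memcpy(dst, src, n);
    }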
Similarly, by reading the code, assuming that is the way your function
is implemented, it's obvious that you can't have a 5+ digit year.
That's an implicit assumption based on the behaviour of the function
as described.

And really that's the point. We have people who are not strong
software developers bitching about the fact that they have to sanitize
and properly test their inputs. They'd rather hack together whatever
they can as fast and as incoherently as possible and are pissed that
software development is ACTUALLY REALLY HARD WORK.

jacob has strongly criticized the standard's definition of asctime().
Some of his criticisms are overstated; some I actually agree with.
I don't think we can reasonably assume he dislikes asctime()
because using it is "REALLY HARD WORK". Perhaps you were talking
about developers other than jacob, but I don't recall anyone else
complaining about asctime() (well, I have).
 

crisgoogle

Tom St Denis wrote:
[...]
So you think it is a good thing to have code that overflows its buffer in the
C standard itself???

As an "EXAMPLE" ???

I am discussing the flaw in the C standard as an example of C code that provokes
buffer overflows. And as an example of the refusal of many people in the C
community to acknowledge that buffer overflows are a serious thing.

You are proving that this attitude towards buffer overflows is widespread.

Thanks for your help.

Good grief, here we go again. I just don't get your obsession with
this.

I'm sure something similar to the following has already been pointed
out to you, but here we go, just for giggles:

If there was sample strcpy code in the standard, either along with the
current specification, or instead of it, that said something like:

char *strcpy(char *s1, const char *s2)
{
    size_t i;

    while(*s1 = *s2)
        i++;

    return s1;
}

.... would you be throwing the same fit? If so, why? This code doesn't
do _anything_ that a conforming implementation can't do, and does
everything that a conforming implementation must do. I imagine, in
fact, that it does just about exactly what most implementations
actually _do_ do.

If you wouldn't object, why not? By the same criteria that you apply
to asctime, the standard would have a bug!!

But as you can see, the standard defines exactly the same language
with or without that code snippet. Conforming compilers behave exactly
the same way as long as they're fed code that doesn't exhibit
undefined behaviour. Exactly the same bits of code that would be
undefined if the standard _did_ have that code snippet are in fact
undefined under the current state of affairs.

In short, there are many (many many!) explicit and implicit
opportunities for undefined behaviour in C. asctime has undefined
behaviour if fed certain inputs, as do lots of other functions and
constructs, but the inputs that are allowed and give well-defined
outputs are clearly (if not explicitly) defined by the standard.
 

Ben Bacarisse

crisgoogle said:
char *strcpy(char *s1, const char *s2)
{
    size_t i;

    while(*s1 = *s2)
        i++;

    return s1;
}


I think you meant:

  char *strcpy(char * restrict s1, const char * restrict s2)
  {
      size_t i = 0;
      while (s1[i] = s2[i])
          i++;
      return s1;
  }

These are all irrelevant to your point, but there may be people
reading this who will be baffled if the code is uncorrected.

<snip>
 

crisgoogle

char *strcpy(char *s1, const char *s2)
{
  size_t i;
  while(*s1 = *s2)
     i++;

  return s1;
}

I think you meant:

  char *strcpy(char * restrict s1, const char * restrict s2)
  {
      size_t i = 0;
      while (s1[i] = s2[i])
          i++;
      return s1;
  }

These are all irrelevant to your point, but there may be people
reading this who will be baffled if the code is uncorrected.

<snip>


<sigh>

Note to self:
Must remember to engage brain. And to check after changing examples.

Ta.
 

Seebs

Careful. That's still illegal in some states.

Remember, the right is to a trial by a jury *of your peers*.

As long as you get all programmers, and you can prove that you were actually
forced to use QBASIC, you should be set.

-s
 

Joachim Schmitz

Seebs said:
Remember, the right is to a trial by a jury *of your peers*.

As long as you get all programmers, and you can prove that you were
actually forced to use QBASIC, you should be set.

There are some states where it is illegal and where there is no jury,
only a single judge, who is quite unlikely to be a programmer...

Bye, Jojo
 

Nick Keighley

Casey Hawthorne wrote:

why do you keep posting this?

The deeper problem is that the C users community doesn't even want to acknowledge this problem.

if you want a language that doesn't allow buffer overflows then don't
use C

<snip>
 

James Dow Allen

Is there a question here?

Wow!! Any doubt that bizarre-thinking pedants are roaming in this
group is dispelled.

Or, perhaps I need to read the group's charter. Does it have
a clause based on the Jeopardy game show?
"Comments must be phrased in the form of a question."

James
 

Tim Streater

Tom St Denis said:
...and I was writing software before you were born. So?

Well then ....
and I grew up on writing DOS applications that had direct control
over the VGA, Sound card [or PC speaker whichever] and other
devices.

Sure. It's a debate. Feel free to debate the point! But you may need
better arguments than "because I want to" if you are to persuade the
OP. :)

Because I had to? There weren't video/graphics/sound/modem/etc
drivers back in the day. If I wanted my DOS application to make beeps
and boops I had to poke hardware. Something the OP is probably
unaware of due to age and/or lack of experience.

So the answer to why C "crept" into the userspace is that back in the
day most OSes were blind to hardware [or in the case of a lot of 8-bit
systems there was no OS at all]. So people wrote C applications
around their C code that controlled the hardware.

But to get even more resounding, even today, I'd rather do bit
twiddling like you find in crypto, DSP related codecs, error
correction, etc, in C than something like C# or VB.

20 years ago I needed to write cross-platform code for VMS and VM/CMS.
It had to support recursion and re-entrancy. So I used C. And the fact
that the cross-platform micro-kernel we had for thread support was also
in C helped too.

I can't remember whether recursion and re-entrancy are an inherent part
of what C can do, but it certainly worked for the compilers we had on
the VAX and IBM 3090.
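For what it's worth, recursion is inherent in standard C: any function
may call itself, directly or indirectly. A minimal illustration (not
from the thread):

    /* Recursive calls have been guaranteed since C89/C90. */
    unsigned long fact(unsigned long n)
    {
        return n < 2 ? 1 : n * fact(n - 1);
    }

Re-entrancy, by contrast, is a property of how a particular function is
written, e.g. whether it relies on static data the way asctime does.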
 

Tom St Denis

...and I was writing software before you were born. So?

Well then ....
and I grew up on writing DOS applications that had direct control
over the VGA, Sound card [or PC speaker whichever] and other
devices.

Sure. It's a debate. Feel free to debate the point! But you may need
better arguments than "because I want to" if you are to persuade the
OP. :)

Because I had to? There weren't video/graphics/sound/modem/etc
drivers back in the day. If I wanted my DOS application to make beeps
and boops I had to poke hardware. Something the OP is probably
unaware of due to age and/or lack of experience.

So the answer to why C "crept" into the userspace is that back in the
day most OSes were blind to hardware [or in the case of a lot of 8-bit
systems there was no OS at all]. So people wrote C applications
around their C code that controlled the hardware.

But to get even more resounding, even today, I'd rather do bit
twiddling like you find in crypto, DSP related codecs, error
correction, etc, in C than something like C# or VB.
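As a concrete taste of that kind of bit twiddling, here is the sort of
rotate primitive crypto code leans on (a minimal sketch; the name
rotl32 is illustrative):

    #include <stdint.h>

    /* Portable 32-bit rotate-left; masking both shift counts keeps
       them in range, so r == 0 doesn't invoke undefined behaviour. */
    static uint32_t rotl32(uint32_t x, unsigned r)
    {
        r &= 31;
        return (x << r) | (x >> ((32 - r) & 31));
    }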

Tom
 

jacob navia

Gareth Owen wrote:
Everybody knows asctime() is standardised and broken (for some inputs).
Everybody knows gets() is very unsafe and should not be used except in
very limited circumstances.

And yet, every month you bring this up as if it were a great revelation,
and as if this were evidence of some grand cabal to keep the language
from progressing. It's not. It's just evidence that standardisation
bodies move very slowly.

Dear anonymous coward:

I will go on bringing this up every month. Until it is fixed. It is not
a great revelation; it is just that I think that when you want to
change things you have to be stubborn.
*We all know this*.

I know that.
Contrary to your assertion, everyone acknowledges it.
Everyone is aware of it.

This is not true, since the committee has neither acknowledged it nor
fixed it.

We just don't care very much.

The royal "we". You are so scared that you do not even use your real
name. And... you know what?

I do not care a lot about your opinion either.
 

Seebs

Gareth Owen wrote:
Dear anonymous coward:

I see a name there, I don't see how that's anonymous.
I will go on bringing this up every month. Until it is fixed. It is not
a great revelation; it is just that I think that when you want to
change things you have to be stubborn.

You realize that even assuming the committee were fully persuaded that this
needed to change, it would take well more than a year to change it, yes?
This is not true, since the committee has neither acknowledged it nor
fixed it.

I have every confidence that this issue will be fixed within two or three
thousand years of when times with five-digit years will be a significant
concern for most people. Possibly sooner.

However, I think it is just mildly possible to imagine that people are
not *especially* concerned about this in the short term.

-s
 

Keith Thompson

James Dow Allen said:
Wow!! Any doubt that bizarre-thinking pedants are roaming in this
group is dispelled.

Or, perhaps I need to read the group's charter. Does it have
a clause based on the Jeopardy game show?
"Comments must be phrased in the form of a question."

Did you notice that the original article begins with "I thought of
this question", and then doesn't ask a question?
 

Keith Thompson

jacob navia said:
Gareth Owen wrote: [snip]

Dear anonymous coward:
[snip]

The royal "we". You are so scared that you do not even use your real
name. And... you know what?

I do not care a lot about your opinion either.

jacob, do you have some reason to believe that Gareth Owen isn't his
real name?

Also, I posted a rather lengthy followup in this thread. If you
intend to ignore what I have to say on the topic of asctime(),
please let me know so I can stop wasting my time.
 

lawrence.jones

jacob navia said:
A buffer overrun is *specified* in the code of the C standard itself.

For the gazillionth time, a *potential* buffer overrun is specified.
Any code that would trigger that buffer overrun is *incorrect*.
The many discussions in this group or in the similar group
comp.lang.c have led to nothing.

Strangely enough, all the shouting I've done at my TV set hasn't
improved the quality of the shows, either.
This code will provoke a buffer overflow if the year is, for instance,
bigger than 8099.

Nowhere in the standard are the ranges for the year specified.

The range is specified by the code that you keep insisting is broken.
However, the latest draft (N1401) now spells it out explicitly (and more
restrictively):

If any of the fields of the broken-down time contain values that
are outside their normal ranges, the behavior of the asctime
function is undefined. Likewise, if the calculated year exceeds
four digits or is less than the year 1000, the behavior is
undefined.
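A caller can enforce that range itself before trusting asctime; a
minimal sketch (asctime_checked is a hypothetical helper, not part of
any standard):

    #include <stddef.h>
    #include <time.h>

    /* Reject broken-down times whose calculated year falls outside
       the four-digit range N1401 requires; the other fields would
       need similar range checks for full safety. */
    char *asctime_checked(const struct tm *t)
    {
        if (t == NULL)
            return NULL;
        int year = 1900 + t->tm_year;   /* tm_year counts from 1900 */
        if (year < 1000 || year > 9999)
            return NULL;
        return asctime(t);
    }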

You'll be happy to know that the committee just voted (unanimously, as
it turns out) to remove gets() from the draft as well.

So what are you going to complain about now?
 

Seebs

For the gazillionth time, a *potential* buffer overrun is specified.
Any code that would trigger that buffer overrun is *incorrect*.

Perhaps true now. It's not totally obvious to me that this was true of,
say, C99 -- there, I don't see anything wrong with a year 10K.
You'll be happy to know that the committee just voted (unanimously, as
it turns out) to remove gets() from the draft as well.

Yayyyy!

-s
 
