Bug/Gross InEfficiency in HeathField's fgetline program

  • Thread starter Antoninus Twink
  • Start date
R

Richard Heathfield

Tor Rustad said:
So, you wouldn't have replied then, given the same conditions.

if ( tor_eval(richard) == cluless )
goto no_reply;



This didn't make any sense, you did reply, didn't you?

Look up "if".
Either "I think /you/ are clueless" is *false*, or you are posting
nonsense. You cannot have it both ways:

if ( richard_eval(tor) == cluless )
goto post_reply;

See? You do understand about "if".
For an introduction to basic security principles, see e.g. [1]:

"Principle 32. Identify and prevent common errors and vulnerabilities

Discussion: Many errors reoccur with disturbing regularity - errors
such as buffer overflows, race conditions, format string errors,
failing to check input for validity, and programs being given excessive
privileges. Learning from the past will improve future results."

In my experience, the following bug is far more common than strncpy:

char *t;

strcpy(t, s);

The bug here is in failing to allocate *any storage at all* for t.

Such an error, is detectable by statically checking tools, if compiler
doesn't catch it, lint tools like e.g. splint does.

Try it on this:

/* foo.h */
#ifndef H_FOO_H
#define H_FOO_H 1
void build_foo(char *foo, int bar, char *baz);
#endif

/* foo.c */
#include <string.h>
#include "foo.h"

void build_foo(char *foo, int bar, char *baz)
{
sprintf(foo, "<h%d>%s</h%d>", bar, baz);
}

What warnings do you get? Compilation need not be done all at one time.

To be honest, in safety-critical or security-critical software, I can't
ever remember being hit by buffer overflow or unallocated storage, in
production. Before someone is allowed to do the real thing, they need to
master the basics.

Fine, so you agree that the best solution is to hire good people and make
sure they know what they're doing?
The main risk where I work, is rather the "clever" insiders.

What do you mean by "clever"? Where I come from, it means "smart, bright,
intelligent" and is considered a compliment. If you mean someone who tries
to write difficult code to show off how well he can write difficult code
(instead of writing easy code that anyone could maintain), I wouldn't call
that "clever".
 
¬

¬a\\/b

In data Thu, 18 Oct 2007 08:07:58 +0200, ¬a\/b scrisse:
In data Wed, 17 Oct 2007 17:00:13 +0200, ¬a\/b scrisse:

// overlaps m bytes starting at t

^^^^^^^^^^^^^

return 1


^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

if( ((int)t)>=0 &&((int)(t+m-1))<= 0) return 1;

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

if( ((int)t)<=0 &&((int)(t+m-1))>= 0) return 1;

there is anhother error
the 'singular' points are two 0 and INT_MAX
--0 == -1 ++INT_MAX!=0
 
¬

¬a\\/b

In data Fri, 19 Oct 2007 08:49:43 +0200, ¬a\/b scrisse:

/* test: if n bytes starting at s
// overlaps m bytes starting at t
*/

#define uns unsigned

/* assume a pointer has the same size of one unsigned */
/* return 1 if error or overlap 0 otherwise
if you like uns == size_t
*/
int mem_overlap123(char* s, int n, char* t, int m)
{if(s==0|| t==0) return 1;

/* s and t can not go trhu 0 address */
/* s----- t------- */
if((uns)(s+n-1) < (uns) s) return 1;
if((uns)(t+m-1) < (uns) t) return 1;


/* overlap */
/* s----- t------ || t----- s------*/
if( (uns)(s+n-1) < (uns) t) return 0;
if( (uns)(t+m-1) < (uns) s) return 0;

return 1;
}
 
¬

¬a\\/b

In data Fri, 19 Oct 2007 09:19:27 +0200, ¬a\/b scrisse:
In data Fri, 19 Oct 2007 08:49:43 +0200, ¬a\/b scrisse:

/* test: if n bytes starting at s
// overlaps m bytes starting at t
*/

#define uns unsigned

/* assume a pointer has the same size of one unsigned */
/* return 1 if error or overlap 0 otherwise
if you like uns == size_t
*/
int mem_overlap123(char* s, int n, char* t, int m)

int mem_overlap123(char* s, uns n, char* t, uns m)
{if(s==0|| t==0) return 1;

/* s and t can not go trhu 0 address */
/* s----- t------- */
if((uns)(s+n-1) < (uns) s) return 1;
if((uns)(t+m-1) < (uns) t) return 1;


/* overlap */
/* s----- t------ || t----- s------*/
if( (uns)(s+n-1) < (uns) t) return 0;
if( (uns)(t+m-1) < (uns) s) return 0;

return 1;
}

so nobody has some interest in doing a check overlap function
in less than O(n+m) ?
 
J

James Kuyper Jr.

Richard said:
Tor Rustad said: ....

What do you mean by "clever"? Where I come from, it means "smart, bright,
intelligent" and is considered a compliment.

He's not using the word with a different meaning; he's using it
sarcastically, to describe someone who's just clever enough to create
some very complicated problems, without being clever enough to avoid or
escape those problems.
 
J

James Kuyper Jr.

¬a\/b wrote:
....
int mem_overlap123(char* s, uns n, char* t, uns m)


so nobody has some interest in doing a check overlap function
in less than O(n+m) ?

Not unless it's portable, and I'm not aware of any portable algorithm
for checking overlap that meets that specification. I've already pointed
out why your algorithm is not portable.
 
R

Richard

James Kuyper Jr. said:
He's not using the word with a different meaning; he's using it
sarcastically, to describe someone who's just clever enough to create
some very complicated problems, without being clever enough to avoid
or escape those problems.

It's a euphemism for "cocky smartass".
 
T

Tor Rustad

Richard said:
Tor Rustad said:

[pissing match snipped, even if I really need to blow off more steam
after a hell of a month!]
Try it on this:

/* foo.h */
#ifndef H_FOO_H
#define H_FOO_H 1
void build_foo(char *foo, int bar, char *baz);
#endif

/* foo.c */
#include <string.h>

???

#include said:
#include "foo.h"

void build_foo(char *foo, int bar, char *baz)

???

broken API design, remember gets()?
{
sprintf(foo, "<h%d>%s</h%d>", bar, baz);

???

sprintf(foo said:
}

What warnings do you get? Compilation need not be done all at one time.

I don't see the point with analyzing the broken code above.

Not all kinds of buffer overruns can be detected by static analysis, but
I have used splint to check private API's for overruns before. The
idea in splint, is to use annontations, the *requires* and *ensures*
clauses, put in pre- and postconditions, respectively.

Constraints on lvalues, can be set via maxSet() and minSet(), while
constraints on rvalues can be set via maxRead() and minRead(). Those
constraints, set bounds of legal memory access.

Example:

char *my_strcpy(char *s1, const char *s2)
/*@requires maxSet(s1) >= maxRead(s2)@*/
/*@ensures maxRead(s1) == maxRead(s2)
/\ result == s1 @*/;

Fine, so you agree that the best solution is to hire good people and make
sure they know what they're doing?

There isn't a single *best solution* in security engineering, just like
in good software engineering:

bad functional specs, if you screw up design, if you screw up version
control, if you screw up testing ...

the resulting software isn't trustworthy.
What do you mean by "clever"? Where I come from, it means "smart, bright,
intelligent" and is considered a compliment. If you mean someone who tries
to write difficult code to show off how well he can write difficult code
(instead of writing easy code that anyone could maintain), I wouldn't call
that "clever".

A common mistake among the "clever", is that they beleave they are
better than their own constraints. Not knowing your own limitation, is a
major security risk IMO. I do by far, prefer humble smartness.

There is a second dimension to this, the vast majority of fraud, is done
by insiders, IIRC ca. 60%-70%.
 
R

Richard Heathfield

Tor Rustad said:
I don't see the point with analyzing the broken code above.

Neither do I. In an argument about correctness, I should have taken the
time to compile the example code rather than trust my fingers to get it
right on auto-pilot. But, *had I done so*, it would have been a good
example!
Not all kinds of buffer overruns can be detected by static analysis,

Indeed. This is kind of my point, really. If you could detect all errors
automatically, you wouldn't need bright programmers. But since you can't,
you do.

There isn't a single *best solution* in security engineering,

Agreed. Nevertheless, the best solution is to hire bright people. Bright
people should be able to work out how not to abuse strncpy, right?
A common mistake among the "clever", is that they beleave they are
better than their own constraints. Not knowing your own limitation, is a
major security risk IMO. I do by far, prefer humble smartness.

That's "clever" as in "dumb", right? Just checking.

The risk of not knowing your own limitations and weaknesses is precisely
the reason that clever people use coding conventions/standards/style
guides, and ask for code reviews by their peers. Indeed, it's why we
bother to do testing. I fail to see how this says anything about strncpy,
though.
 
M

Malcolm McLean

Richard Heathfield said:
Agreed. Nevertheless, the best solution is to hire bright people. Bright
people should be able to work out how not to abuse strncpy, right?
That's one of the few immutable rules of engineering. The better the people
you can hire, the less chance of mistakes.
The problem is that salaries are serious money, and even if you rack them
up, good people are not always easy to obtain. So the non-ideal reality is
that you might get a Java man, two years out from college and who did a
short course on C, fixing up your C program. So he sees a strcpy() and knows
that Microsoft has deprecated the call. Ha ha, he thinks, security loophole
here. Let's fix it up. Oh, the MS sstrcpy() doesn't seem to be available on
this installation. However they've got an equivalent called strncpy(). Let's
just drop that in instead.
 
M

Malcolm McLean

James Kuyper Jr. said:
Imposing the requirements you suggest would make efficient implementation
of C on many platforms more difficult. I've used machines where 16 bits
was the most reasonable size for 'int', which had a LOT more than 65536
bytes of memory installed. I'm certain that there will be machines in the
future (I wouldn't be surprised if they already exist) where the natural
size for an integer is 32 bits, which have far more than 4GB of memory
installed.
Here I don't agree.

Basically you are saying the paradigm

int i;

for(i=0;i<N;i++)
array = x;

ought to be allowed to break down if 16 or 32 bit operations are faster on
machines with 32 or 64 bit address spaces.
I'd say that the burden should be on the micro-optimiser to say

short i; /* inner loop, use 16 bit type */

That does lead to the problem of what type to use for 32 bit register, 64
bit address space machines.

There was also the problem of the old x86 segmented machines, where arrays
of more than 64K were highly inefficient but sometimes needed. Generally
however, if you've got 64 bits in the address bus, you've got to ask what
the designer is doing by not providing an efficient 64 bit integer.
 
F

Flash Gordon

Malcolm McLean wrote, On 20/10/07 06:56:
James Kuyper Jr. said:
Imposing the requirements you suggest would make efficient
implementation of C on many platforms more difficult. I've used
machines where 16 bits was the most reasonable size for 'int', which
had a LOT more than 65536 bytes of memory installed. I'm certain that
there will be machines in the future (I wouldn't be surprised if they
already exist) where the natural size for an integer is 32 bits, which
have far more than 4GB of memory installed.
Here I don't agree.

Basically you are saying the paradigm

int i;

for(i=0;i<N;i++)
array = x;

ought to be allowed to break down if 16 or 32 bit operations are faster
on machines with 32 or 64 bit address spaces.


Only if N is large. Or do you really want to increase the cost of most
of the appliances in your house, the cost of your car, the cost of
planes (and thus flying) etc.
I'd say that the burden should be on the micro-optimiser to say

short i; /* inner loop, use 16 bit type */

Your suggestion would mean that a lot of embedded code would have to use
short almost exclusively instead of int. Are you going to pay for all
the rewriting?
That does lead to the problem of what type to use for 32 bit register,
64 bit address space machines.

There was also the problem of the old x86 segmented machines, where
arrays of more than 64K were highly inefficient but sometimes needed.
Generally however, if you've got 64 bits in the address bus, you've got
to ask what the designer is doing by not providing an efficient 64 bit
integer.

Perhaps the designer is saving $25 on a $100 product. High speed memory
(e.g. cache) is expensive, so however fast operations are on 64 bit
integers you can massively increase your costs, or slow things down
massively, by doubling the size of your basic integer type.

You seem to have forgotten that all this and more has already been
pointed out to you. If you think the decision is wrong start by taking
it up with Intel, AMD, the Posix standard group and MS. Most people will
have some respect for the abilities of at least one of these groups,
although which group will depend on the person.

BTW, I happen to know that there are still a number of processors with
the Z80 instruction set flying around and a number of processors early
in the 80x86 range as well. When I say flying around I mean that they
are part of avionics systems on current aircraft.
 
C

Charlie Gordon

Tor Rustad said:
Recently I was involved in hiring a security officer with C programming
skills. Those who passed the initial screening, had at least a master
degree, but their experience varied a lot.

I ended up recommending the only one, with 0 work experience, who admitted
he didn't knew C well. The seniors, failed big time implementing strncpy()
on the blackboard. Very embarrassing.

You should see the horrors I get for atoi, from seniors and juniors!
 
C

Charlie Gordon

Richard Heathfield said:
Tor Rustad said:




Well, I'm game. Is this a blackboard? Why, yes, it is (although it's
actually white, but never mind).

Okay, it's an interview, so I'm not allowed to look stuff up. So, off the
top of my head, strncpy copies no more than n characters from s to t,
stopping at a null terminator if present, and zero-padding t. It then
returns t. I can't actually remember whether n is size_t or int. (It ought
to be size_t, of course, but then so ought the n in fgets.) So I'll risk
embarrassment by plumping for size_t.

#include <stddef.h>

char *strncpy(char *t, const char *s, size_t n)
{
char *u = t;
while(n > 0 && *s != '\0')
{
*t++ = *s++;
--n;
}
while(n-- > 0)
{
*t++ = '\0';
}
return u;
}

How did I do? Should I start blushing yet?

Why do you include <stddef.h> instead of <string.h> ?
 
C

Charlie Gordon

Richard Heathfield said:
[ I already wrote a long reply to this, but the newsreader crashed on send
(no, I did *not* write the newsreader, and no, I *will* not write a
newsreader). I don't have time to reconstruct the whole reply. This is a
rather polite summary of my previous, very short-tempered reply. ]

Tor Rustad said:

Amazing, I was *not* hiring someone to protect *apples*, *crown* or
looking for a clueless in security, unable to identify *common* errors.

(a) you were *not* hiring *anyone*. You were taking part in a Usenet
discussion.
(b) if you think I'm clueless, why bother to continue this discussion?
(c) what if I think /you/ are clueless, unable to recognise *common*
sense?
For an introduction to basic security principles, see e.g. [1]:

"Principle 32. Identify and prevent common errors and vulnerabilities

Discussion: Many errors reoccur with disturbing regularity - errors such
as buffer overflows, race conditions, format string errors, failing to
check input for validity, and programs being given excessive privileges.
Learning from the past will improve future results."

In my experience, the following bug is far more common than strncpy:

char *t;

strcpy(t, s);

The bug here is in failing to allocate *any storage at all* for t. I have
seen this happen in production code far more than I have seen strncpy
misused (or indeed used at all). In fact, I have seen what we might call
"the char * bug" (if only there weren't so many other bugs that could
easily have the same name) so often that it definitely counts as
disturbing regularity.

Such bugs hardly survive rudimentary testing.
If you ban strncpy, logically you have to ban char * too, because it's a
far greater risk. To do otherwise is to be guilty of rearranging the
deckchairs on the Titanic.

Not so. strncpy related bugs are more subtle, and occur in border cases.
As such they will go unnoticed until Murphy helps making them strike at the
worst possible moment.
So ban pointers. They cause far more trouble than strncpy.

strncpy can be banned with no downside. Pointers are a fundamental feature
of the language. The class of problems that can be solved without them has
a topological measure of zero.
FCOL, Tor. Wake up and smell the real risk - clueless programmers, hired
by
witless buffoons because they have good hair and a good CV.

90% of the IT work force. More than 90% of all code produced.
 
C

Charlie Gordon

Tor Rustad said:
Richard said:
Tor Rustad said:
(b) if you think I'm clueless, why bother to continue this discussion?

So, you wouldn't have replied then, given the same conditions.

if ( tor_eval(richard) == cluless )
goto no_reply;

(c) what if I think /you/ are clueless, unable to recognise *common*
sense?

This didn't make any sense, you did reply, didn't you?

Either "I think /you/ are clueless" is *false*, or you are posting
nonsense. You cannot have it both ways:

if ( richard_eval(tor) == cluless )
goto post_reply;

:)
For an introduction to basic security principles, see e.g. [1]:

"Principle 32. Identify and prevent common errors and vulnerabilities

Discussion: Many errors reoccur with disturbing regularity - errors such
as buffer overflows, race conditions, format string errors, failing to
check input for validity, and programs being given excessive privileges.
Learning from the past will improve future results."

In my experience, the following bug is far more common than strncpy:

char *t;

strcpy(t, s);

The bug here is in failing to allocate *any storage at all* for t.

Such an error, is detectable by statically checking tools, if compiler
doesn't catch it, lint tools like e.g. splint does.

$ cat -n main.c
1 #include <stdio.h>
2 #include <string.h>
3
4 int main(void)
5 {
6
7 const char *s = "Hello";
8 char *t;
9 char d[5];
10
11 /* BUG 1 - storage not allocated */
12 (void)strcpy(t, s);
13 (void)printf("'%s'\n", t);
14
15 /* BUG 2 - destination buffer not null terminated */
16 strncpy(d, s, sizeof d);
17 (void)printf("'%.*s'\n", (int)sizeof d, d);
18
19 /* BUG 3 - destination buffer truncated */
20 strncpy(d, s, sizeof d - 1);
21 (void)printf("'%.*s'\n", (int)sizeof d, d);
22
23 return 0;
24 }
$ gcc -ansi -pedantic -W -Wall main.c
$ splint main.c
Splint 3.1.1 --- 20 Jun 2006

main.c: (in function main)
main.c:12:15: Unallocated storage t passed as out parameter to strcpy: t
An rvalue is used that may not be initialized to a value on some
execution
path. (Use -usedef to inhibit warning)

Finished checking --- 1 code warning
$

Note that the bugs of class 2 and 3, gives no warning above.

Whether 2 and 3 are bugs depends on how ``d'' will be used and on
programmers intent regarding truncation. But I agree with you, I would
advocate warnings on all strncpy instances.

Why do (void) strcpy and not strncpy ?
That is an ugly convention anyway, and splint would be silly to complain
about the return value of strcpy or strncpy not being used.
 
C

Charlie Gordon

Richard Heathfield said:
Mark McIntyre said:


Well, it's not actually arrogance. The point is that my mind is *my* mind,
not *your* mind. If you want to hire it, fine. But *changing* it is my
prerogative, just as changing your mind is your prerogative.


The trick is to review the coding standard during your first day, and come
up with a long list of proposed changes (which, naturally, you will be
prepared to justify). Once you've won that battle, you can start in on
reviewing the in-house libraries. :)

I would be very interested to see your coding standard.
 
J

James Kuyper Jr.

Malcolm said:
James Kuyper Jr. said:
Imposing the requirements you suggest would make efficient
implementation of C on many platforms more difficult. I've used
machines where 16 bits was the most reasonable size for 'int', which
had a LOT more than 65536 bytes of memory installed. I'm certain that
there will be machines in the future (I wouldn't be surprised if they
already exist) where the natural size for an integer is 32 bits, which
have far more than 4GB of memory installed.
Here I don't agree.

Basically you are saying the paradigm

int i;

for(i=0;i<N;i++)
array = x;

ought to be allowed to break down if 16 or 32 bit operations are faster
on machines with 32 or 64 bit address spaces.


I'm confused by your example, and it's supposed connection to what I
said. Without definitions for N, array, and x, I'm left to assume that
they all have reasonable definitions. As long as N <= INT_MAX, and
assuming that array is defined as having at least N elements, and 'x'
has a value that can safely be converted to the type of array, I see
no way to interpret what I said as endorsing failure of that loop on
such machines.

I didn't say so, but my argument does imply an endorsement of the fact
that the standard does not define the value of INT_MAX, but only sets a
lower limit on that value. Is that what you're talking about? Are you
complaining about the need to compare N with INT_MAX, instead of being
able to relying on INT_MAX being the same on all implementations?

If the language were re-defined from scratch, I'd endorse making the
size-named types which were added in C99 fundamental types, preferably
with a nicer naming convention. For such a loop counter, I'd use the
equivalent of "int_fastN_t", choosing a value for N that is as small as
possible while still being big enough to prevent the loop counter from
overflowing. In C99, that's inconveniently complicated to write, and I
would normally use 'signed char', 'int' and 'long' as rough synonyms for
'int_fast8_t', 'int_fast16_t', and 'int_fast32_t', respectively.
 
F

Flash Gordon

Charlie Gordon wrote, On 20/10/07 13:54:
Richard Heathfield said:
[ I already wrote a long reply to this, but the newsreader crashed on send
(no, I did *not* write the newsreader, and no, I *will* not write a
newsreader). I don't have time to reconstruct the whole reply. This is a
rather polite summary of my previous, very short-tempered reply. ]

Tor Rustad said:

Amazing, I was *not* hiring someone to protect *apples*, *crown* or
looking for a clueless in security, unable to identify *common* errors.
(a) you were *not* hiring *anyone*. You were taking part in a Usenet
discussion.
(b) if you think I'm clueless, why bother to continue this discussion?
(c) what if I think /you/ are clueless, unable to recognise *common*
sense?
For an introduction to basic security principles, see e.g. [1]:

"Principle 32. Identify and prevent common errors and vulnerabilities

Discussion: Many errors reoccur with disturbing regularity - errors such
as buffer overflows, race conditions, format string errors, failing to
check input for validity, and programs being given excessive privileges.
Learning from the past will improve future results."
In my experience, the following bug is far more common than strncpy:

char *t;

strcpy(t, s);

The bug here is in failing to allocate *any storage at all* for t. I have
seen this happen in production code far more than I have seen strncpy
misused (or indeed used at all). In fact, I have seen what we might call
"the char * bug" (if only there weren't so many other bugs that could
easily have the same name) so often that it definitely counts as
disturbing regularity.

Such bugs hardly survive rudimentary testing.

Depends on your luck. It *can* sometimes "work".
Not so. strncpy related bugs are more subtle, and occur in border cases.
As such they will go unnoticed until Murphy helps making them strike at the
worst possible moment.

If your testing does not include the border cases then you need to
improve your testing. After all, border cases are where bugs often are
exhibited, especially the bugs that slip past the average developer!
strncpy can be banned with no downside.

Except where it is the right tool for the job. If an application
developer had made a slightly different choice, namely 0 padding instead
of space padding, on the application I spend a lot of time working on
then it would make *far* more calls to strncpy than calls strcpy. If I
had the time I would actually make that change to the SW.

Instead, my response in terms of strncpy and its safety aspects would
have been to ask the question, "Do you have fixed width 0 padded and
potentially not 0 terminated fields in the your SW?" If the answer was
"no" *then* I would say that almost any use of it in that SW would be
wrong and a security risk because it does not null terminate the result
in all conditions and in other conditions can take an unexpectedly long
time. If the answer was "yes" then I would say the risk was in
erroneously using it when the intent is to produce a C string rather
than populate such a fields and that naming conventions should be used
to assist in spotting correct vs incorrect usage. Further mitigation
would include ensuring that such limits are tested and the usual stuff
about reviews and use of grep.
Pointers are a fundamental feature
of the language. The class of problems that can be solved without them has
a topological measure of zero.

Not quite, there are *some* programs that can be written without
pointers, just not many.
90% of the IT work force. More than 90% of all code produced.

I've been fortunate and only worked with a few clueless programmers.
 
J

James Kuyper Jr.

Richard said:
Tor Rustad said: ....

That's "clever" as in "dumb", right? Just checking.

No, there's a key difference between the sarcastic use of "clever" and
ordinary misuse of "dumb" as an insensitive synonym for stupid.

A "clever" person is sufficiently intelligent and bright to get himself
into trouble in ways that are far too complicated for a stupid person to
duplicate. Someone who is clever, rather than "clever" is sufficiently
intelligent to avoid getting into such trouble in the first place.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Similar Threads

Fibonacci 0
Adding adressing of IPv6 to program 1
C language. work with text 3
code review 26
Can't solve problems! please Help 0
compressing charatcers 35
Strange bug 65
K&R exercise 5-5 10

Members online

No members online now.

Forum statistics

Threads
474,262
Messages
2,571,058
Members
48,769
Latest member
Clifft

Latest Threads

Top