Listing the most dangerous parts of C


Juuso Hukkanen

I am looking for a wish list of things that should be removed from
C (C99) because of a bad security track record <OT>or
multithreading unsafety. I need this list for a project that aims to
build another (easiest & most powerful) programming language, whose
two-page definition document states: "... includes the C
programming language (C99), except its famous
"avoid-using-this" functions". </OT>

If you would not want to remove a whole function but only the use of
it with certain arguments / parameters, what would those combinations
be like? (Like scanf with %s or %[ arguments )

There are probably official lists of functions not to use
(a million times better than this one):
http://tele3d.com/wiki/index.php/Parts_of_C99_which_are_NOT_included_in_t3d

Please do not circumvent the question by saying all functions except
gets() are safe if used properly. That would be like teaching that
"the ideology of the Soviet Union was right; it was the Soviet people's
fault that the system didn't work."

Juuso Hukkanen
(to reply by e-mail set addresses month and year to correct)
www.tele3d.com
 

jacob navia

Juuso Hukkanen wrote:
I am looking for a wish list of things which should be removed from
the C (C99) - due to feature's bad security track record <OT>or
Multithreading unsafety. I need this list for a project intending to
build another (easiest & most powerful) programming language, which
has a two page definition document stating: "... includes C
programming language (C99), except its famous
"avoid-using-this-functions". </OT>

If you would not want to remove a whole function but only the use of
it with certain arguments / parameters, what would those combinations
be like? (Like scanf with %s or %[ arguments )

Probably there are official not to use recommendation lists.
( million times better than this)
http://tele3d.com/wiki/index.php/Parts_of_C99_which_are_NOT_included_in_t3d

Please, do not circumvent the question by saying all functions except
gets() are safe if used properly. That would be like teaching that
"the ideology of Soviet Union was right, it was the Soviet peoples
fault that the system didn't work.

Juuso Hukkanen
(to reply by e-mail set addresses month and year to correct)
www.tele3d.com

First, what is "t3d"???

From that wiki page it is completely impossible to get any idea what
the hell that is.

jacob
 

Arthur J. O'Dwyer

Juuso Hukkanen wrote:
I am looking for a wish list of things which should be removed from
the C (C99) - due to feature's bad security track record <OT>or
Multithreading unsafety. I need this list for a project intending to
build another (easiest & most powerful) programming language, which
has a two page definition document stating: "... includes C
programming language (C99), except its famous
"avoid-using-this-functions". </OT>

If you would not want to remove a whole function but only the use of
it with certain arguments / parameters, what would those combinations
be like? (Like scanf with %s or %[ arguments )
Probably there are official not to use recommendation lists.
( million times better than this)
http://tele3d.com/wiki/index.php/Parts_of_C99_which_are_NOT_included_in_t3d

First, what is "t3d"???

From that wiki page it is completely impossible to get any idea what
the hell that is.

It's vaporware. This guy's been pushing its "natural language, giant
built-in library of functions" model for at least a year or so, now.
(The problems are that the "natural" language isn't, and the "built-in"
functions aren't.)

FWIW, off the top of my head I'd say gets (obviously), strtok (not
thread-safe), atoi (no error-checking possible), and much of scanf
(again with the error-checking).
scanf("%*s") is fine, but scanf("%s") is evil, scanf("%99s") is
unmaintainable, and scanf("%d") chokes in unpredictable ways on input
like "3287482475".
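
For the atoi and scanf("%d") cases, a minimal sketch of a checked conversion built on strtol might look like the following. The helper name parse_int and its return convention are made up for illustration; they are not part of the standard library.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: convert s to int and report failure instead of
   silently producing garbage. Returns 0 on success, -1 on error. */
int parse_int(const char *s, int *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);
    if (end == s)
        return -1;              /* no digits at all */
    if (*end != '\0')
        return -1;              /* trailing junk after the number */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return -1;              /* does not fit in an int */
    *out = (int)v;
    return 0;
}
```

With this, input such as "3287482475" is reported as an error (assuming a 32-bit int) rather than handled in some unpredictable way.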

my $.02,
-Arthur
 

P.J. Plauger

I am looking for a wish list of things which should be removed from
the C (C99) - due to feature's bad security track record <OT>or
Multithreading unsafety. I need this list for a project intending to
build another (easiest & most powerful) programming language, which
has a two page definition document stating: "... includes C
programming language (C99), except its famous
"avoid-using-this-functions". </OT>

If you would not want to remove a whole function but only the use of
it with certain arguments / parameters, what would those combinations
be like? (Like scanf with %s or %[ arguments )

Probably there are official not to use recommendation lists.
( million times better than this)
http://tele3d.com/wiki/index.php/Parts_of_C99_which_are_NOT_included_in_t3d

Please, do not circumvent the question by saying all functions except
gets() are safe if used properly. That would be like teaching that
"the ideology of Soviet Union was right, it was the Soviet peoples
fault that the system didn't work.

One very popular wish list is MISRA C. (Actually two, since there's a
revision out too.) It endeavors to tame C by outlawing all sorts of
usages that some people think *might* be misused.

Another is Microsoft's secure/safer/bounded C, a version of which is
now shipping with VC++ V8. It supplies alternatives to many functions
that can be better bounds checked to avoid storage overwrites. This
work is based on Microsoft's massive bug hunt stimulated by all the
viral attacks on Microsoft software largely written in C.

Neither is anywhere near perfect, nor universally accepted. Both are
places to start.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
 

Richard Heathfield

Juuso Hukkanen said:
I am looking for a wish list of things which should be removed from
the C (C99)

Absolutely. Just remove C99. Nobody will notice anyway.

- due to feature's bad security track record <OT>or
Multithreading unsafety. I need this list for a project intending to
build another (easiest & most powerful) programming language, which
has a two page definition document stating: "... includes C
programming language (C99), except its famous
"avoid-using-this-functions". </OT>

Okay, drop gets(), scanf(), and strncpy(). Ironically, strncpy() is unsafe
chiefly because people think it's safe and so they feel free to use it in a
rather cavalier way!

strtok() isn't threadsafe, as someone already said, so I guess you would
want to drop that (I wouldn't, but you're not me).
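
To illustrate the thread-safety complaint: strtok keeps its position in hidden static state, so two interleaved tokenizing loops (or two threads) trample each other. A sketch of a tokenizer that keeps that state in a caller-owned pointer instead is below. The name next_token is hypothetical and not a standard function; it is built only from strspn and strcspn.

```c
#include <string.h>

/* Hypothetical reentrant tokenizer: all state lives in *cursor, which the
   caller owns, so nested or concurrent tokenizing cannot interfere.
   Like strtok, it writes '\0' into the buffer being scanned. */
char *next_token(char **cursor, const char *delims)
{
    char *start, *end;

    if (*cursor == NULL)
        return NULL;
    start = *cursor + strspn(*cursor, delims);   /* skip leading delimiters */
    if (*start == '\0') {
        *cursor = NULL;
        return NULL;                             /* nothing left */
    }
    end = start + strcspn(start, delims);        /* find end of token */
    if (*end == '\0') {
        *cursor = NULL;                          /* this was the last token */
    } else {
        *end = '\0';
        *cursor = end + 1;
    }
    return start;
}
```

The caller starts with a pointer to the beginning of a writable buffer and calls next_token until it returns NULL.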

That's about it, I think. Everything else is fine, if you're careful. (Mind
you, scanf, strncpy, and strtok are fine if you're careful, too!)
Please, do not circumvent the question by saying all functions except
gets() are safe if used properly. That would be like teaching that
"the ideology of Soviet Union was right, it was the Soviet peoples
fault that the system didn't work.

No, it would be more like saying that if you give power tools to
kindergarten kids, you should expect tears before bedtime.
 

Rod Pemberton

Juuso Hukkanen said:
I am looking for a wish list of things which should be removed from
the C (C99) - due to feature's bad security track record <OT>or
Multithreading unsafety. I need this list for a project intending to
build another (easiest & most powerful) programming language, which
has a two page definition document stating: "... includes C
programming language (C99), except its famous
"avoid-using-this-functions". </OT>

If you would not want to remove a whole function but only the use of
it with certain arguments / parameters, what would those combinations
be like? (Like scanf with %s or %[ arguments )

Probably there are official not to use recommendation lists.
( million times better than this)
http://tele3d.com/wiki/index.php/Parts_of_C99_which_are_NOT_included_in_t3d

Please, do not circumvent the question by saying all functions except
gets() are safe if used properly. That would be like teaching that
"the ideology of Soviet Union was right, it was the Soviet peoples
fault that the system didn't work.

Juuso Hukkanen
(to reply by e-mail set addresses month and year to correct)
www.tele3d.com

I read a security oriented pdf (sorry, don't know where anymore) which said:

1) 15 C functions suffer buffer overflow problems:
gets() cuserid() scanf() fscanf() sscanf() vscanf() vsscanf() vfscanf()
sprintf() strcat() strcpy() streadd() strecpy() vsprintf() strtrns()

2) 8 C functions suffer from format string vulnerabilities
printf() fprintf() sprintf() snprintf() vprintf() vfprintf() vsprintf()
vsnprintf()

Summary of pdf: Because many C implementations use the same stack for
string data and flow control information (like return addresses), the above
functions can overwrite the flow control information on the stack, thereby
allowing unauthorized code to execute.
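
The format string problems in list 2 arise when text from outside the program is passed as the format argument itself. A minimal sketch of the usual precaution, with a hypothetical helper name format_message, keeps the format a fixed literal and passes the outside text as data:

```c
#include <stdio.h>

/* Hypothetical helper. The caller's text is passed as an argument to a
   fixed "%s" format, never as the format itself, so any '%' sequences
   in it are copied literally rather than interpreted.
   Returns what snprintf returns: the length the full message needs. */
int format_message(char *dst, size_t size, const char *user_text)
{
    /* The risky spelling would be: snprintf(dst, size, user_text); */
    return snprintf(dst, size, "user said: %s", user_text);
}
```

A return value greater than or equal to size means the output was truncated, which the caller can check.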

If you really want to get crazy with C, do some of these:
1) eliminate pointers in main
2) make pointers be associated with a variable before use, not with a data
type
3) eliminate malloc, add dynamic allocation and garbage collection
4) change C to pass by reference
5) require separation of string (and other) data and flow control
information
6) give up now, and try Walter Bright's D language...


Rod Pemberton
 

jacob navia

Rod Pemberton wrote:
If you really want to get crazy with C, do some of these:
1) eliminate pointers in main
????

2) make pointers be associated with a variable before use, not with a data
type

lcc-win32: done.
References are pointers associated with an object permanently.
3) eliminate malloc, add dynamic allocation and garbage collection

lcc-win32: done.
The gc is standard in the normal distribution.
4) change C to pass by reference
?????
Why?


5) require separation of string (and other) data and flow control
information

Stack allocation is ok if used correctly. Making all objects heap based
would slow down everything without much gain in security.
6) give up now, and try Walter Bright's D language...

????

With the above improvements, C can be much easier and safer to program.

jacob
 

Keith Thompson

Rod Pemberton said:
I read a security oriented pdf (sorry, don't know where anymore) which said:

1) 15 C functions suffer buffer overflow problems:
gets() cuserid() scanf() fscanf() sscanf() vscanf() vsscanf() vfscanf()
sprintf() strcat() strcpy() streadd() strecpy() vsprintf() strtrns()
[snip]

Obviously this document wasn't concerned just with standard C. A
number of those functions are non-standard. (I haven't even heard of
all of them.)
 

Juuso Hukkanen

Okay, drop gets(), scanf(), and strncpy() - ironically, this is unsafe
chiefly because people think it's safe and so they feel free to use it in a
rather cavalier way!

Ok, strncpy is gone. I too was unsure about it after reading
http://en.wikipedia.org/wiki/Strlcpy
but I thought I would keep strncpy out of respect for the
standard library :)
I see strlcpy could be obtained from the kind public-domain
donation by Chuck Falconer, but I am not sure it is a good idea to
insert non-standard C functions. Possibly strlcpy will soon be part
of the standard.
strtok() isn't threadsafe, as someone already said, so I guess you would
want to drop that (I wouldn't, but you're not me).

Well, the strtok() (and even gets()) problems also seem to have better,
but still non-standard, solutions donated to the public domain by Chuck
(hmmm, do I see a pattern?).

toksplit()
http://groups.google.com/group/comp.lang.c/msg/b62e03ab0f27f874
ggets()
http://cbfalconer.home.att.net/download/ggets.zip

Thanks for the suggestions
Juuso Hukkanen
(to reply by e-mail set addresses month and year to correct)
www.tele3d.com
 

Kenneth Brody

Juuso said:
I am looking for a wish list of things which should be removed from
the C (C99) - due to feature's bad security track record <OT>or
[...]

I've always thought that "the most dangerous part of C" was the programmer.

--
+-------------------------+--------------------+-----------------------------+
| Kenneth J. Brody | www.hvcomputer.com | |
| kenbrody/at\spamcop.net | www.fptech.com | #include <std_disclaimer.h> |
+-------------------------+--------------------+-----------------------------+
Don't e-mail me at: <mailto:[email protected]>
 

Juuso Hukkanen

1) 15 C functions suffer buffer overflow problems:
gets() cuserid() scanf() fscanf() sscanf() vscanf() vsscanf() vfscanf()
sprintf() strcat() strcpy() streadd() strecpy() vsprintf() strtrns()

2) 8 C functions suffer from format string vulnerabilities
printf() fprintf() sprintf() snprintf() vprintf() vfprintf() vsprintf()
vsnprintf()

It's probably this document
http://www.ida.liu.se/~johwi/research_publications/licentiate_thesis.pdf


Chapter 7.3.3 Functions which are for attracting buffer overflows
Chapter 7.3.5 Format string vulnerabilities

Very good reading, and it references another, even better document, which
says...

<snip>
Functions to avoid in most cases (or ensure protection) include the
functions, strcpy(3), strcat(3), sprintf(3) (with cousin vsprintf(3)),
and gets(3). These should be replaced with functions
such as strncpy(3), strncat(3), snprintf(3), and fgets(3)
respectively, but see the discussion below. The
function strlen(3) should be avoided unless you can ensure that there
will be a terminating NIL character to
find. The scanf() family (scanf(3), fscanf(3), sscanf(3), vscanf(3),
vsscanf(3), and vfscanf(3)) is often
dangerous to use; do not use it to send data to a string without
controlling the maximum length (the format %s
is a particularly common problem).
....
Unfortunately, snprintf()'s variants have additional problems.
Officially, snprintf() is not a standard C function
in the ISO 1990 (ANSI 1989) standard, though sprintf() is, so not all
systems include snprintf(). Even worse,
some systems' snprintf() do not actually protect against buffer
overflows; they just call sprintf directly.
If you really want to get crazy with C, do some of these:
1) eliminate pointers in main
2) make pointers be associated with a variable before use, not with a data
type
3) eliminate malloc, add dynamic allocation and garbage collection
Eliminated by including the Boehm garbage collector
4) change C to pass by reference
Done in such a way that inputs go into functions by value and results
come out of functions by reference (or its C
equivalent)
5) require separation of string (and other) data and flow control
information
The first 100 bytes of each "safe string" are header information about
the array and its lifetime, and debug mode collects info at each
function entry and exit. All functions return negative long long values
that serve as intelligent error codes.
6) give up now, and try Walter Bright's D language...

D is gaining ground,
http://www.tiobe.com/tiobe_index/index.htm

Because it has everything
http://www.digitalmars.com/d/comparison.html

Well I bet their language definition is bigger than two pages and it
can not be learned in 30 minutes :)
www.tele3d.com/t3d/language.pdf



Thank You
Juuso Hukkanen
(to reply by e-mail set addresses month and year to correct)
www.tele3d.com
 

Juuso Hukkanen

... This guy's been pushing its "natural language, giant
built-in library of functions" model for at least a year or so, now.
(The problems are that the "natural" language isn't, and the "built-in"
functions aren't.)

Firstly, thanks for the very good scanf examples. But otherwise your
characterization was not too fair. The language was published
in October 2005. Since then I have just been busy and away from
this hobby.
Presumably you have already understood and learned the programming
language, because you could describe it as a "natural language,
giant built-in library of functions". Now compare how long it takes to
learn other languages. I don't believe you have anything to complain
about in the language itself; rather, you want to insult me and the
newness of this thing.

So is it such a bad thing if all the good functions get written
properly (only) once, and then nobody needs to write those functions
from scratch again?

I am 100% sure that even Arthur J. O'Dwyer would prefer to call a
function,
t3d_convert_file_Rfile_WAV2OGG_BITRATEXXX
, than to try to write a function that does the same thing - right?
(The problems are that the "natural" language isn't, and the "built-in"
functions aren't.)

Presumably you admit that you have learned this "natural"
language. Well,
PHP was created by Rasmus Lerdorf in 1994. How popular was it 8 months
after its release? Not very. In June 1995 he posted info about those
PHP modules to Usenet for others who might be interested in them.
http://groups.google.com/group/comp.infosystems.www.authoring.cgi/msg/cc7d43454d64d133

It appears to be a similar kind of project to this language project. And
at that stage Rasmus had probably not yet written the thousands
of functions currently in the PHP source library.

What did we learn? I don't know … and I cannot say whether the thing
will succeed, partly because defining success is not that simple (see
my reply to Jacob).

Want to see sources - help yourself:
Initial codes to pre-preprocessor & library routines:
http://www.tele3d.com/t3d/subversion.htm
http://www.tele3d.com/t3d/DEVEL/


Greetings
Juuso Hukkanen
(to reply by e-mail set addresses month and year to correct)
www.tele3d.com
 

Juuso Hukkanen

What is "t3d" first ???

Short question - long semi OT answer

It is a super-simple (two pages of documentation) C-based programming
language, with safe strings, (Boehm) garbage collection, networking,
exact datatypes, GUI, multi-threading, environment, etc.

Ok, the key thing is that all the function calls are formulated as
logical sentences.

You have 15 verbs: add, remove, convert, open,write…
You also have 20 objects (datatypes): byte, double, file, filepath,
url…

Then you just combine a:
verb --- > to which object the verb action is to be done ---> to which
object the results are to be written ---> and at the end you tell how
the verb action is to be done

t3d_calculate_iarray_Rdouble_STD_DEVIATION
t3d_convert_file_Rfile_GSM2WAV
t3d_measure_barray_LENGTH
t3d_convert_Rfile_READ_ONLY

The flexibility and logic of the t3d function prototypes
allow you to easily create or use all kinds of routines, even
routines you didn't know existed. A bonus is that the function
prototype works in any written language (try French, for
example).

t3d_calcule_fichier_Rdécimal_DEVIATION_STANDARD

Ok, the idea is to use C99 as much as possible, but when that's not
possible, to use platform-specific C extensions.
Ok, the Internet is full of weird programming languages and the
competition is tough. However, sometimes new programming languages do
succeed, like PHP did. I believe t3d has lots of unique qualities due
to its easiness, raw power and multi-language support.
I have a dream. I don't know if it is realizable, but I am trying to donate
the t3d language to one of the major charity organizations, under a
license owned by a group of charity organizations. Therefore all
the programs made, e.g., by using those "ultimate t3d
libraries" or any other "under the license" stuff would require a
license from one of the listed charity organizations. The charity
license would be like a dual GPL, allowing individuals to use the
software freely, while companies in rich countries would require a
license.

This way coders (anywhere) could decide to donate their open-source
works to charity organizations instead of, as currently, donating them to
the Free Software Foundation using the GPL2. The reason why I keep up the
hobby of promoting the language and the (even more important) license
idea is that I sense that if this thing succeeded it could make a
big impact. Currently Amnesty International, the Red Cross and
Greenpeace International are evaluating the license; who knows, they
might like it or decide to write their own version.

Naturally it is impossible for me to write a whole language, and I am
not that good a coder either, but who knows what would happen if, for example,
Slashdot one day announced that
"Amnesty has a programming language & new OS license"

I bet there would be a few programmers who would consider supporting the
initiative of finishing the t3d programming language.

Juuso Hukkanen
(to reply by e-mail set addresses month and year to correct)
"t3d programming language" and the structure of t3d function prototype
are trademarks of Juuso Hukkanen. (As said currently discussing the
transfer of those to a major charity organization).
 

jacob navia

Kenneth said:
Juuso said:
I am looking for a wish list of things which should be removed from
the C (C99) - due to feature's bad security track record <OT>or

[...]

I've always thought that "the most dangerous part of C" was the programmer.

Yeah you are right!

ELIMINATE THEM!!!!!
 

pete

Juuso Hukkanen wrote:

http://www.ida.liu.se/~johwi/research_publications/licentiate_thesis.pdf
Chapter 7.3.3 Functions which are for attracting buffer overflows
Chapter 7.3.5 Format string vulnerabilities

very good reading , and it gives references to another even better
says...

<snip>
Functions to avoid in most cases (or ensure protection) include the
functions, strcpy(3), strcat(3), sprintf(3) (with cousin vsprintf(3)),
and gets(3). These should be replaced with functions
such as strncpy(3), strncat(3), snprintf(3), and fgets(3)
respectively, but see the discussion below. The
function strlen(3) should be avoided unless you can ensure that there
will be a terminating NIL character to find.

That looks like good advice for programmers
who don't know that string functions are for using with strings,
but who want to use string functions anyway.
 

Andrew Poelstra

If you really want to get crazy with C, do some of these:
1) eliminate pointers in main
Seems like any problems associated with that would be poor programming practice.
2) make pointers be associated with a variable before use, not with a data
type
Ditto.

3) eliminate malloc, add dynamic allocation and garbage collection
Now you've got a Java-like beast, only to solve programmers who can't keep
track of memory.
4) change C to pass by reference
Make that C++... for the same reason.
5) require separation of string (and other) data and flow control
information
I believe that is implementation-defined right now, and in some situations, such
as embedded systems, it could be problematic.
6) give up now, and try Walter Bright's D language...
I think I'll check that out...

Basically, as has been said, C's most dangerous aspect is the programmer, and
any functions that might seem worth ditching /do/ have a purpose. Except for
gets(), which has no safe usage.
 

qed

Juuso said:
I am looking for a wish list of things which should be removed from
the C (C99) - due to feature's bad security track record <OT>or
Multithreading unsafety. I need this list for a project intending to
build another (easiest & most powerful) programming language, which
has a two page definition document stating: "... includes C
programming language (C99), except its famous
"avoid-using-this-functions". </OT>

If you would not want to remove a whole function but only the use of
it with certain arguments / parameters, what would those combinations
be like? (Like scanf with %s or %[ arguments )

Probably there are official not to use recommendation lists.
( million times better than this)
http://tele3d.com/wiki/index.php/Parts_of_C99_which_are_NOT_included_in_t3d

You are recommending strncpy and strncat. These are slow functions that
occasionally leave off the terminating '\0'. I would argue that they
are therefore *WORSE* than the functions they replace. Others would
recommend strlcpy/strlcat as superior alternatives. (But limited --
obviously I would recommend removing all of str* and adding Bstrlib as
an alternative, but I've said this before, and this is likely beyond
what you are looking at/for.)
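
To make the strncpy complaint concrete: strncpy(dst, "abcdef", 4) writes four characters and no terminator, and when the source is shorter than the count it pads the rest of the destination with zeros, which is where the slowness comes from. (strncat does always terminate, but its count is the space remaining, which is easy to get wrong.) A sketch of a bounded copy that always terminates and reports truncation follows; copy_bounded is a hypothetical name and is not strlcpy.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical bounded copy: always terminates dst (when size > 0) and
   returns 1 if src fit completely, 0 if it was truncated. */
int copy_bounded(char *dst, size_t size, const char *src)
{
    size_t len = strlen(src);

    if (size == 0)
        return 0;                    /* no room even for the terminator */
    if (len >= size) {
        memcpy(dst, src, size - 1);  /* keep as much as fits */
        dst[size - 1] = '\0';
        return 0;
    }
    memcpy(dst, src, len + 1);       /* includes the terminator */
    return 1;
}
```

Returning a truncation flag forces the caller to decide what a cut-off string means, rather than discovering it later.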

fgets is not an ideal substitute for gets as explained here:
http://www.pobox.com/~qed/userInput.html (though obviously gets must be
removed.) So I would also recommend removing fgets if you have a
replacement for it (such as getInputFrag, or perhaps just fgetstr)
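
One commonly cited weakness of fgets is that a too-long line is quietly split across successive calls. A rough sketch of a wrapper that at least detects this is shown here. read_line is a hypothetical name (it is not the getInputFrag or fgetstr mentioned above), and it ignores the separate problem of embedded '\0' bytes.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper around fgets. Returns 1 for a complete line
   (newline stripped), 0 if the line did not fit (the rest of it is
   discarded), and -1 on end of file or error with nothing read. */
int read_line(FILE *fp, char *buf, int size)
{
    size_t len;
    int c;
    int truncated = 0;

    if (size <= 0 || fgets(buf, size, fp) == NULL)
        return -1;
    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return 1;
    }
    /* No newline stored: either input ended or the line was too long.
       Drain up to the newline so the next call starts on a fresh line. */
    while ((c = getc(fp)) != EOF && c != '\n')
        truncated = 1;
    return truncated ? 0 : 1;
}
```

A growing-buffer design avoids truncation altogether; this sketch only shows the minimum needed to notice it.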

I am not sure why you want to get rid of srand() or rand(). It's true
they suck as PRNGs, and race conditions mess them up in ways that can be
worse than you think (and RAND_MAX is generally pathetically small), but
I don't think people generally abuse them to that degree of detriment in
the real world. Again, if you had a *substitute*, that would be fine.
The problem is that I am not aware of any good portable PRNGs -- are you
(hence supporting the idea that C is not a portable language)? As for
non-portable ones, there are plenty (such as Mersenne Twister, or any of
the Marsaglia generators.) So as long as we are stuck with *something*
-- they still can serve a role as a quick and dirty PRNG. (The right
answer here is to demand that the standard change how it works --
however a quick perusal of their guiding principles, indicates there is
no mechanism by which you could reasonably do this.)

Ok, as for other things that should obviously be removed: ftell() and
fseek(). Use fgetpos() and fsetpos() as the alternatives. (ftell and
fseek are simply not defined to have any useful functionality beyond
fgetpos/fsetpos and are incredibly deceptive in how they *appear* to work.)
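
As a sketch of the fgetpos()/fsetpos() style, here is a hypothetical helper, peek_char, that saves the position, reads one character, and restores the position. fpos_t is an opaque type, so the saved value is only good for handing back to fsetpos on the same stream.

```c
#include <stdio.h>

/* Hypothetical helper: read one character without consuming it, by
   saving the position with fgetpos and restoring it with fsetpos.
   Returns the character, or EOF on end of file or error. */
int peek_char(FILE *fp)
{
    fpos_t pos;
    int c;

    if (fgetpos(fp, &pos) != 0)
        return EOF;
    c = getc(fp);
    if (fsetpos(fp, &pos) != 0)   /* also clears the end-of-file indicator */
        return EOF;
    return c;
}
```

This only works on streams that support positioning; on a pipe or terminal the fgetpos call is expected to fail.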

I would get rid of ungetc just on principle (can't unread at the
beginning of a file, may screw up fgetpos(), only does a single
character -- it's just super lame, and throws a monkey wrench into too
many other functions.)

The complex number type from C99 is in clear namespace conflict with
C++. This isn't a minor issue, since accommodating it would
drastically reduce C++'s functionality/usefulness (C++ implements
complex numbers as a template, so that you can implement things like
Gaussian integers, for example). As such, I would recommend removing
that whole set of operations.

Ok, as to core language things -- there is also the strange issue of
function pointers. If you say:

void (* qs) (void*,size_t,size_t,int (*)(const void*,const void*));
qs = qsort;
qs = &qsort;

there is no syntax error or conflict between the last two lines (in fact
they do and mean the same thing). This means if you want to express a
pointer to a function pointer, matters are not obvious. So one or the
other should be removed.
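
A small sketch of the point, using a hypothetical typedef sort_fn for readability: both spellings give the same pointer, and a pointer to a function pointer has to point at a pointer object such as a variable.

```c
#include <stdlib.h>

/* Hypothetical typedef for a pointer to a function shaped like qsort. */
typedef void (*sort_fn)(void *, size_t, size_t,
                        int (*)(const void *, const void *));

/* Returns 1 if both spellings yield the same pointer value. */
int same_designator(void)
{
    sort_fn a = qsort;     /* function name converts to a pointer */
    sort_fn b = &qsort;    /* explicit address-of: same type, same value */
    sort_fn *pp = &a;      /* pointer to a function pointer: address of a */
    return a == b && *pp == b;
}
```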

Of course I think "register" and "inline" are functional placebos in
modern C compilers. They are also deceptively named (both should be
replaced by a single adjective "nonaddressable" or something like that.)

C also accepts things like 3[a] as equivalent to a[3], when there
doesn't seem to be a really good reason to do this. This appears to be
strictly for the obfuscated C code competition.
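
For the curious, this falls out of a[i] being defined as *(a + i), and addition commutes. A tiny sketch with a hypothetical function name:

```c
/* Hypothetical demo: 3[a] is *(3 + a), which is the same as *(a + 3). */
int element_three(const int *a)
{
    return 3[a];
}
```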
 

Skarmander

Kenneth said:
Juuso said:
I am looking for a wish list of things which should be removed from
the C (C99) - due to feature's bad security track record <OT>or
[...]

I've always thought that "the most dangerous part of C" was the programmer.
Actually, those sharp hooks at the end of the letter are what will hurt you
every time.

Of course, C++ is even more dangerous in that regard, since it adds barbed
wire. D should be safe, though.

S.
 

Ben Pfaff

qed said:
I would get rid of ungetc just on principle (can't unread at the
beginning of a file, may screw up fgetpos(), only does a single
character -- its just super lame, and throws a monkey wrench into too
many other functions.)

I am unaware of a limitation on calling ungetc() at the beginning
of a file. I scanned the definition of ungetc() in C99 and
didn't see such a limitation--did I miss something?
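
A quick way to probe this is the sketch below, with a hypothetical function name. C99 guarantees one character of pushback. The one related caveat in the standard's text, as far as I can tell, is that for a binary stream whose position was zero, the file position indicator is indeterminate until the pushed-back character has been read; the read itself should still work.

```c
#include <stdio.h>

/* Hypothetical probe: push ch back onto a freshly rewound stream, then
   read twice. Returns 1 if the pushed-back character comes out first
   and the file's real first character (expected_first) comes out second. */
int unget_at_start(FILE *fp, int ch, int expected_first)
{
    rewind(fp);
    if (ungetc(ch, fp) == EOF)
        return 0;                 /* pushback was refused */
    return getc(fp) == ch && getc(fp) == expected_first;
}
```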
 

Skarmander

Andrew said:
Seems like any problems associated with that would be poor programming practice.

Now you've got a Java-like beast, only to solve programmers who can't keep
track of memory.
Exactly. Garbage collection is for people who are stupid or lazy or both.
Everyone knows that keeping track of memory yourself is better and cleaner.
Keeps the mind in shape and your programs fast.

Or something like that, at least.

S.
 
