Why C Is Not My Favourite Programming Language

  • Thread starter evolnet.regular

Walter Roberson

:True, but my point is:

:(1) C introduces entirely new classes of bugs
:and
:(2) C makes bugs more likely, regardless of the programmer's skill
:and
:(3) C makes it much harder to root out and fix bugs

:and I still haven't seen any reasoning against this.

Introduced relative to -what-? Do I get to compare it against
the other languages that were available at the time of its
development in 1969-1973, such as:

- numerous assemblers
- Fortran 66, Fortran G, Fortran H
- The original Pascal (1970)
- PL/I (1969)
- Algol 68
- APL (1969)
- LISP, LISP 1.5, MacLISP
 

Malcolm

Chris Torek said:
It already does. It returns the length in (C) bytes (units of
"char"). The fact that it takes up to 4 C bytes to encode a single
display position is another matter entirely; and if you want columns
to line up, you need something more than "number of display positions"
anyway, because Unicode fonts are (at least mostly) proportional.
You can do it by declaring char to be 32 bits. But now we have this problem.
I want a program that opens an image file, does something fancy with the
data, and writes an output. Filenames are composed of characters and
obviously I want my Chinese customers to be able to load and save files in
their own language. So four bit chars are reasonable.

Now the bulk of the program consists of

void imageroutine(unsigned char *rgb, int width, int height)
{
}
Bytes are 8 bits, whilst pixels are naturally 24 bits, with 8 bits for each
channel. So I want to take advantage of this to access my pixels one component
at a time. But I can't.

The best way round is probably to pass in pixels as 32 bits and waste the
top 8. Often this will also increase performance.

But now we have further difficulties. Let's say I typedef unsigned long as
"colour" and macros RED, GREEN and BLUE to access the channels. Now we've
polluted the symbol space. It becomes that much more difficult to hook
my imageroutine into different programs, because there is no natural layout
of the pixels in memory.
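
For concreteness, one plausible version of that scheme (a sketch; the
byte order shown is exactly the arbitrary choice at issue):

typedef unsigned long colour;   /* one pixel; the top 8 bits go to waste */

#define RED(c)   (((c) >> 16) & 0xffUL)
#define GREEN(c) (((c) >>  8) & 0xffUL)
#define BLUE(c)  ( (c)        & 0xffUL)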

Now my friend has produced a JPEG loader. He faced the same problem. So he
typedefed unsigned long as "color" and macros RED, GREEN and BLUE, but using
Microsoft format (blue in the low byte).

Then of course most machines come with 8 or, at a pinch, 9 bit bytes. However
it is entirely plausible that people will start using "long" as a longer
type than int, i.e. 64 bits. Wasting half the space for an image-processing
routine is probably not acceptable. So we could easily end up doing a
rewrite.

See the sort of problems you very quickly run into?
(I suspect, however, that infobahn was referring to EBCDIC on
IBM mainframes. C works fine there, although plain char must be
unsigned, because, e.g., '0' == 0xf0, and CHAR_BIT is still 8.
EBCDIC does break a lot of programs that assume 'I'+1 == 'J',
though.)
The other programs that break are those that read and write file formats
that contain ASCII strings. For instance an IFF file contains 4 byte ASCII
mnemonic chunk ids. It is extremely tempting to write

if(!memcmp(chunk, "DATA", 4))

rather than coding the bytes numerically, but it will break on EBCDIC.
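
The portable way out is to spell the ASCII codes numerically, along
these lines (a sketch; is_data_chunk is a made-up helper):

#include <string.h>

/* "DATA" written out byte by byte in ASCII, so the comparison still
   works on an EBCDIC machine reading an ASCII-based IFF file */
static const unsigned char data_id[4] = { 0x44, 0x41, 0x54, 0x41 };

int is_data_chunk(const unsigned char *chunk)
{
    return memcmp(chunk, data_id, 4) == 0;
}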
 

JuicyLucy

Yes we understand, ... It is quite difficult to get your hands on
good programming books when you are visually impaired.

However there are enough options for you. A good starting point would be
The National Library Service for the Blind:
http://lcweb.loc.gov/nls

If you decide after all that C programming is just not your thing, you
could also try singing. There are a number of blind people who have been
known to make a very good living out of it. Ray Charles, Stevie
Wonder... etc.

http://www.stevie-wonder.com/

Peace bro & Good Luck...

Lucy..

p.s.
don't forget to send us some of your C code when you've located your
computer again... That would be great..!!

I've been utilising C for lots of small and a few medium-sized personal
projects over the course of the past decade, and I've realised lately
just how little progress it's made in that time. I've increasingly been
using scripting languages (especially Python and Bourne shell) which
offer the same speed and yet are far simpler and safer to use. I can
no longer understand why anyone would willingly use C to program
anything but the lowest of the low-level stuff. Even system utilities,
text editors and the like could be trivially written with no loss of
functionality or efficiency in Python. Anyway, here are my reasons. I'd
be interested to hear some intelligent advantages (not
rationalisations) for using C.

No string type
--------------

C has no string type. Huh? Most sane programming languages have a
string type which allows one to just say "this is a string" and let the
compiler take care of the rest. Not so with C. It's so stubborn and
dumb that it only has three types of variable; everything is either a
number, a bigger number, a pointer or a combination of those three.
Thus, we don't have proper strings but "arrays of unsigned integers".
"char" is basically just a really small number. And now we have to
start using unsigned ints to represent multibyte characters.

What. A. Crock. An ugly hack.

Functions for insignificant operations
--------------------------------------

Copying one string to another requires including <string.h> in your
source code, and there are two functions for copying a string. One
could even conceivably copy strings using other functions (if one
wanted to, though I can't imagine why). Why does any normal language
need two functions just for copying a string? Why can't we use the
assignment operator ('=') like for the other types? Oh, I forgot.
There's no such thing as strings in C; just a big continuous stick of
memory. Great! Better still, there's no syntax for:

* string concatenation
* string comparison
* substrings

Ditto for converting numbers to strings, or vice versa. You have to use
something like atol(), or strtod(), or a variant on printf(). Three
families of functions for variable type conversion. Hello? Flexible
casting? Hello?
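
To make the cost concrete, here is what those 'missing' operations look
like when spelled out as function calls (a compilable sketch):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    char buf[32] = "foo";
    char sub[4];
    long n;

    strcat(buf, "bar");                /* concatenation: no '+'      */
    if (strcmp(buf, "foobar") == 0)    /* comparison: no '=='        */
        puts("equal");
    memcpy(sub, buf + 3, 3);           /* substring: copy it by hand */
    sub[3] = '\0';
    n = strtol("42", NULL, 10);        /* conversion: no cast syntax */
    printf("%s %ld\n", sub, n);
    return 0;
}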

And don't even get me started on the lack of an exponentiation
operator.

No string type: the redux
-------------------------

Because there's no real string type, we have two options: arrays or
pointers. Array sizes can only be constants. This means we run the risk
of buffer overflow since we have to try (in vain) to guess in advance
how many characters we need. Pathetic. The only alternative is to use
malloc(), which is just filled with pitfalls. The whole concept of
pointers is an accident waiting to happen. You can't free the same
pointer twice. You have to always check the return value of malloc()
and you mustn't cast it. There's no builtin way of telling if a spot of
memory is in use, or if a pointer's been freed, and so on and so forth.
Having to resort to low-level memory operations just to be able to
store a line of text is asking for...

The encouragement of buffer overflows
-------------------------------------

Buffer overflows abound in virtually any substantial piece of C code.
This is caused by programmers accidentally putting too much data in one
space or leaving a pointer pointing somewhere because a returning
function ballsed up somewhere along the line. C includes no way of
telling when the end of an array or allocated block of memory is
overrun. The only way of telling is to run, test, and wait for a
segfault. Or a spectacular crash. Or a slow, steady leakage of memory
from a program, agonisingly 'bleeding' it to death.
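
The canonical culprit is gets(); a sketch of the overflow and its
bounded replacement:

#include <stdio.h>

int main(void)
{
    char name[16];
    /* gets(name) reads an unbounded line: the 17th byte of input
       lands past the end of name. fgets() is the bounded version. */
    if (fgets(name, sizeof name, stdin) != NULL)
        fputs(name, stdout);
    return 0;
}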

Functions which encourage buffer overflows
------------------------------------------

* gets()
* strcat()
* strcpy()
* sprintf()
* vsprintf()
* bcopy()
* scanf()
* fscanf()
* sscanf()
* getwd()
* getopt()
* realpath()
* getpass()

The list goes on and on and on. Need I say more? Well, yes I do.

You see, even if you're not writing to any memory you can still read
memory you're not supposed to. C can't be bothered to keep track of the
ends of strings; the end of a string is indicated by a null '\0'
character. All fine, right? Well, some functions in your C library,
such as strlen(), perhaps, will just run off the end of a 'string' if
it doesn't have a null in it. What if you're using a binary string?
Careless programming this may be, but we all make mistakes and so the
language authors have to take some responsibility for being so
intolerant.
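
To see it happen, a sketch (the behaviour is undefined, which is the
point):

#include <stdio.h>
#include <string.h>

int main(void)
{
    char buf[4] = { 'D', 'A', 'T', 'A' };  /* a "binary string": no '\0' */
    /* strlen() scans for a terminator that isn't there and runs
       straight off the end of buf */
    printf("%lu\n", (unsigned long) strlen(buf));
    return 0;
}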

No builtin boolean type
-----------------------

If you don't believe me, just watch:

$ cat > test.c
int main(void)
{
bool b;
return 0;
}

$ gcc -ansi -pedantic -Wall -W test.c
test.c: In function 'main':
test.c:3: 'bool' undeclared (first use in this function)

Not until the 1999 ISO C standard were we finally able to use 'bool' as
a data type. But guess what? It's implemented as a macro and one
actually has to include a header file to be able to use it!
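
For completeness, the C99 incantation (a sketch; note the header):

#include <stdbool.h>  /* without this, 'bool' still doesn't exist */
#include <stdio.h>

int main(void)
{
    bool b = true;    /* bool, true and false are all macros */
    printf("%d\n", b);
    return 0;
}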

High-level or low-level?
------------------------

On the one hand, we have the fact that there is no string type and
little automatic memory management, implying a low-level language. On
the other hand, we have a mass of library functions, a preprocessor and
a plethora of other things which imply a high-level language. C tries
to be both, and as a result spreads itself too thinly.

The great thing about this is that when C is lacking a genuinely useful
feature, such as reasonably strong data typing, the excuse "C's a
low-level language" can always be used, functioning as a perfect
'reason' for C to remain unhelpfully and fatally sparse.

The original intention for C was for it to be a portable assembly
language for writing UNIX. Unfortunately, from its very inception C has
had extra things packed into it which make it fail as an assembly
language. Its kludgy strings are a good example. If it were at least
portable these failings might be forgivable, but C is not portable.

Integer overflow without warning
--------------------------------

Self-explanatory. One minute you have a fifteen digit number, then try
to double or triple it and - boom - its value is suddenly
-234891234890892 or something similar. Stupid, stupid, stupid. How hard
would it have been to give a warning or overflow error or even just
reset the variable to zero?

This is widely known as bad practice. Most competent developers
acknowledge that silently ignoring an error is a bad attitude to have;
this is especially true for such a commonly used language as C.
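
A sketch of the silence (the multiplication here exceeds LLONG_MAX; the
standard calls that undefined, and typical hardware silently wraps):

#include <stdio.h>

int main(void)
{
    long long n = 999999999999999LL;  /* fifteen digits */
    n = n * 10000;                    /* exceeds LLONG_MAX: no warning,
                                         no error, typically just a
                                         garbage negative value */
    printf("%lld\n", n);
    return 0;
}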

Portability?!
-------------

Please. There are at least four official specifications of C I could
name off the top of my head, and no compiler has properly implemented
all of them. They conflict, and they grow and grow. The problem isn't
subsiding; it's increasing each day. New compilers and libraries are
released, and proprietary extensions keep being added. GNU C isn't
the same as ANSI C isn't the same as K&R C isn't the same as Microsoft
C isn't the same as POSIX C. C isn't portable; all kinds of machine
architectures are totally different, and C can't properly adapt because
it's so muttonheaded. It's trapped in The Unix Paradigm.

If it weren't for the C preprocessor, then it would be virtually
impossible to get C to run on multiple families of processor hardware,
or even just slightly differing operating systems. A programming
language should not require a preprocessor just so that it can run on
FreeBSD, Linux and Windows without failing to compile.

C is unable to adapt to new conditions for the sake of "backward
compatibility", throwing away the opportunity to get rid of stupid,
utterly useless and downright dangerous functions for a nonexistent
goal. And yet C is growing new tentacles and unnecessary features
because of idiots who think adding seven new functions to their C
library will make life easier. It does not.

Even the C89 and C99 standards conflict with each other in ridiculous
ways. Can you use the long long type or can't you? Is a certain
constant defined by a preprocessor macro hidden deep, deep inside my C
library? Is using a function in this particular way going to be
undefined, or acceptable? What do you mean, getch() isn't a proper
function but getc() and getchar() are?

The implications of this false 'portability'
--------------------------------------------

Because C pretends to be portable, even professional C programmers can
be caught out by hardware and an unforgiving programming language;
almost anything like comparisons, character assignments, arithmetic, or
string output can blow up spectacularly for no apparent reason because
of endianness or because your particular processor treats all chars as
unsigned or silly, subtle, deadly traps like that.

Archaic, unexplained conventions
--------------------------------

In addition to the aforementioned problems, C also has various
idiosyncrasies (invariably undocumented) which not even some teachers of
C are aware of:

* "Don't use fflush(stdin)."
* "gets() is evil."
* "main() must return an integer."
* "main() can only take one of three sets of arguments."
* "main() can only return either EXIT_SUCCESS or EXIT_FAILURE."
* "You musn't cast the return value of malloc()."
* "fileno() isn't an ANSI compliant function."
* "A preprocessor macro oughtn't use any of its arguments more than
once."

...all these unnecessary and unmentioned quirks mean buggy code. Death
by a thousand cuts. Ironic when you consider that Kernighan thinks of
Pascal in the same way when C has just as many little gotchas that
bleed you to death gradually and painfully.
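
The macro rule, for instance, exists because of gotchas like this
sketch:

#include <stdio.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))  /* uses each argument twice */

int main(void)
{
    int i = 0;
    int m = MAX(i++, -1);  /* i++ is evaluated twice: once in the
                              comparison, once as the result */
    printf("m=%d i=%d\n", m, i);  /* m=1 i=2, not the m=0 i=1 you meant */
    return 0;
}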

Blaming The Programmer
---------------------

Due to the fact that C is pretty difficult to learn and even harder to
actually use without breaking something in a subtle yet horrific way,
it's assumed that anything which goes wrong is the programmer's fault.
If your program segfaults, it's your fault. If it crashes, mysteriously
returning 184 with no error message, it's your fault. When one single
condition you just happened to have forgotten about whilst coding
screws up, it's your fault.

Obviously the programmer has to shoulder most of the responsibility for
a broken program. But as we've already seen, C positively tries to make
the programmer fail. This increases the failure rate and yet for some
reason we don't blame the language when yet another buffer overflow is
discovered. C programmers try to cover up C's inconsistencies and
inadequacies by creating a culture of 'tua culpa'; if something's
wrong, it's your fault, not that of the compiler, linker, assembler,
specification, documentation, or hardware.

Compilers have to take some of the blame. Two reasons. The first is
that most compilers have proprietary extensions built into them. Let me
remind you that half of the point of using C is that it should be
portable and compile anywhere. Adding extensions violates the original
spirit of C and removes one of its advantages (albeit an already
diminished advantage).

The other (and perhaps more pressing) reason is the lack of anything
beyond minimal error checking which C compilers do. For every ten types
of errors your compiler catches, another fifty will slip through.
Beyond variable type and syntax checking the compiler does not look for
anything else. All it can do is give warnings on unusual behaviour,
though these warnings are often spurious. On the other hand, a single
error can cause a ridiculous cascade, or make the compiler fall over
and die because of a misplaced semicolon, or, more accurately and
incriminatingly, a badly constructed parser and grammar. And yet,
despite this, it's your fault.

To quote The Unix Haters' Handbook:

"If you make even a small omission, like a single semicolon, a C
compiler tends to get so confused and annoyed that it bursts into tears
and complains that it just can't compile the rest of the file since one
missing semicolon has thrown it off so much."

So C compilers may well give literally hundreds of errors stating that
half of your code is wrong if you miss out a single semicolon. Can it
get worse? Of course it can! This is C!

You see, a compiler will often not deluge you with error information
when compiling. Sometimes it will give you no warning whatsoever even
if you write totally foolish code like this:

#include <stdio.h>

int main()
{
char *p;
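/* p is never initialised: reading it in puts(p) is undefined */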
puts(p);
return 0;
}

When we compile this with our 'trusty' compiler gcc, we get no errors
or warnings at all. Even when using the '-W' and '-Wall' flags to make
it watch out for dangerous code it says nothing.

$ gcc -W -Wall stupid.c
$

In fact, no warning is ever given unless you try to optimise the
program with a '-O' flag. But what if you never optimise your program?
Well, you now have a dangerous program. And unless you check the code
again you may well never notice that error.

What this section (and entire document) is really about is the sheer
unfriendliness of C and how it is as if it takes great pains to be as
difficult to use as possible. It is flexible in the wrong way; it can
do many, many different things, but this makes it impossible to do any
single thing with it.

Trapped in the 1970s
--------------------

C is over thirty years old, and it shows. It lacks features that modern
languages have such as exception handling, many useful data types,
function overloading, optional function arguments and garbage
collection. This is hardly surprising considering that it was
constructed from an assembler language with just one data type on a
computer from 1970.

C was designed for the computer and programmer of the 1970s,
sacrificing stability and programmer time for the sake of memory.
Despite the fact that the most recent standard is just half a decade
old, C has not been updated to take advantage of increased memory and
processor power to implement such things as automatic memory
management. What for? The illusion of backward compatibility and
portability.

Yet more missing data types
---------------------------

Hash tables. Why was this so difficult to implement? C is intended for
the programming of things like kernels and system utilities, which
frequently use hash tables. And yet it didn't occur to C's creators
that maybe including hash tables as a type of array might be a good
idea when writing UNIX? Perl has them. PHP has them. With C you have to
fake hash tables, and even then it doesn't really work at all.

Multidimensional arrays. Before you tell me that you can do stuff like
int multiarray[50][50][50] I think that I should point out that that's
an array of arrays of arrays. Different thing. Especially when you
consider that you can also use it as a bunch of pointers. C programmers
call this "flexibility". Others call it "redundancy", or, more
accurately, "mess".
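
The difference, sketched (same a[i][j] syntax, different memory;
malloc error checks omitted for brevity):

#include <stdlib.h>

int main(void)
{
    int block[2][3];   /* array of arrays: one contiguous block  */
    int *rows[2];      /* "bunch of pointers": rows live anywhere */
    int i;

    for (i = 0; i < 2; i++)
        rows[i] = malloc(3 * sizeof *rows[i]);

    block[1][2] = 7;   /* same syntax...                          */
    rows[1][2] = 7;    /* ...completely different memory layout   */

    for (i = 0; i < 2; i++)
        free(rows[i]);
    return 0;
}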

Complex numbers. They may be in C99, but how many compilers support
that? It's not exactly difficult to get your head round the concept of
complex numbers, so why weren't they included in the first place? Were
complex numbers not discovered back in 1989?

Binary strings. It wouldn't have been that hard just to make a
compulsory struct with a mere two members: a char * for the string of
bytes and a size_t for the length of the string. Binary strings have
always been around on Unix, so why wasn't C more accommodating?
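
That wished-for struct, sketched:

#include <stddef.h>

/* the "compulsory struct with a mere two members" described above */
struct bstring {
    char  *data;   /* the bytes: may legitimately contain '\0' */
    size_t len;    /* explicit length instead of a terminator  */
};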

Library size
------------

The actual core of C is admirably small, even if some of the syntax
isn't the most efficient or readable (case in point: the combined '? :'
operator). One thing that is bloated is the C library. The number of
functions in a full C library which complies with all significant
standards runs into four digit figures. There's a great deal of
redundancy, and code which really shouldn't be there.

This has knock-on effects, such as the large number of configuration
constants which are defined by the preprocessor (which shouldn't be
necessary), the size of libraries (the GNU C library almost fills a
floppy disk, and its documentation three of them) and inconsistently named
groups of functions in addition to duplication.

For example, a function for converting a string to a long integer is
atol(). One can also use strtol() for exactly the same thing. Boom -
instant redundancy. Worse still, both functions are included in the
C99, POSIX and SUSv3 standards!

Can it get worse? Of course it can! This is C!

As a result it's only logical that there should be an equivalent pair of
atod() and strtod() functions for converting a string to a double. As you've
probably guessed, this isn't true. They are called atof() and strtod().
This is very foolish. There are yet more examples scattered through the
standard C library like a dog's smelly surprises in a park.
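
Side by side, a compilable sketch of the redundancy and the misnaming:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    long   a = atol("42");              /* no error reporting at all   */
    long   b = strtol("42", NULL, 10);  /* same job, with error checks */
    double c = atof("3.14");            /* "atod" doesn't exist...     */
    double d = strtod("3.14", NULL);    /* ...but strtod does          */
    printf("%ld %ld %g %g\n", a, b, c, d);
    return 0;
}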

The Single Unix Specification version three specifies 1,123 functions
which must be available to the C programmer of the compliant system. We
already know about the redundancies and unnecessary functions, but
across how many header files are these 1,123 functions spread out? 62.
That's right, on average a C library header will define approximately
eighteen functions. Even if you only need to use maybe one function
from each of, say, five libraries (a common occurrence) you may well
wind up including 90, 100 or even 150 function declarations you will
never need. Bloat, bloat, bloat. Python has the right idea; its import
statement allows you to define exactly the functions (and global
variables!) you need from each library if you prefer. But C? Oh, no.

Specifying structure members
----------------------------

Why does this need two operators? Why do I have to pick between '.' and
'->' for a ridiculous, arbitrary reason? Oh, I forgot; it's just yet
another of C's gotchas.
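
For the record, the arbitrary rule is '.' for a struct and '->' for a
pointer to one, the latter being mere shorthand for
dereference-then-dot (a sketch):

struct point { int x, y; };

void example(struct point p, struct point *pp)
{
    p.x = 1;     /* direct member access    */
    pp->y = 2;   /* identical to (*pp).y    */
}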

Limited syntax
--------------

A couple of examples should illustrate what I mean quite nicely. If
you've ever programmed in PHP for a substantial period of time, you're
probably aware of the 'break' keyword. You can use it to break out from
nested loops of arbitrary depth by using an integer, like so:

for ($i = 0; $i < 10; $i++) {
    for ($j = 0; $j < 10; $j++) {
        for ($k = 0; $k < 10; $k++) {
            break 2;
        }
    }
    /* breaks out to here */
}

There is no way of doing this in C. If you want to break out from a
series of nested for or while loops then you have to use a goto. This
is what is known as a crude hack.
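
Here is that crude hack, sketched to mirror the PHP above (next_i is a
made-up label):

#include <stdio.h>

int main(void)
{
    int i, j, k;
    for (i = 0; i < 10; i++) {
        for (j = 0; j < 10; j++) {
            for (k = 0; k < 10; k++) {
                goto next_i;   /* the closest C gets to "break 2" */
            }
        }
    next_i:                    /* breaks out to here */
        ;
    }
    return 0;
}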

In addition to this, there is no way to compare any non-numerical data
type using a switch statement. Not even strings. In the programming
language D, one can do:

char s[];

switch (s) {
case "hello":
    /* something */
    break;
case "goodbye":
    /* something else */
    break;
case "maybe":
    /* another action */
    break;
default:
    /* something */
    break;
}

C does not allow you to use switch and case statements for strings. One
must use several variables to iterate through an array of case strings
and compare them to the given string with strcmp(). This reduces
performance and is just yet another hack.
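
The workaround looks something like this sketch:

#include <stdio.h>
#include <string.h>

int main(void)
{
    static const char *cases[] = { "hello", "goodbye", "maybe" };
    const char *s = "goodbye";
    size_t i;

    for (i = 0; i < sizeof cases / sizeof cases[0]; i++) {
        if (strcmp(s, cases[i]) == 0) {
            printf("matched \"%s\"\n", cases[i]);
            break;
        }
    }
    return 0;
}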

In fact, this is an example of gratuitous library functions running
wild once again. Even comparing one string to another requires use of
the strcmp() function:

char string[] = "Blah, blah, blah\n";

if (strcmp(string, "something") == 0) {
    /* do something */
}

Flushing standard I/O
---------------------

A simple microcosm of the "you can do this, but not that" philosophy of
C; one has to do two different things to flush standard input and
standard output.

To flush the standard output stream, the fflush() function is used
(defined by <stdio.h>). One doesn't usually need to do this after every
bit of text is printed, but it's nice to know it's there, right?

Unfortunately, fflush() can't be used to flush the contents of standard
input. Some C standards explicitly define it as having undefined
behaviour, but this is so illogical that even textbook authors
sometimes mistakenly use fflush(stdin) in examples and some compilers
won't bother to warn you about it. One shouldn't even have to flush
standard input; you ask for a character with getchar(), and the program
should just read in the first character given and disregard the rest.
But I digress...

There is no 'real' way to flush standard input up to, say, the end of a
line. Instead one has to use a kludge like so:

#include <stdio.h>
#include <errno.h>
#include <string.h>

int c;

do {
    errno = 0;
    c = getchar();

    if (errno) {
        fprintf(stderr,
                "Error flushing standard input buffer: %s\n",
                strerror(errno));
    }
} while ((c != '\n') && (!feof(stdin)));

That's right; you need to use a variable, a looping construct, two
library functions and several lines of exception handling code to flush
the standard input buffer.

Inconsistent error handling
---------------------------

A seasoned C programmer will be able to tell what I'm talking about
just by reading the title of this section. There are many incompatible
ways in which a C library function indicates that an error has
occurred:

* Returning zero.
* Returning nonzero.
* Returning EOF.
* Returning a NULL pointer.
* Setting errno.
* Requiring a call to another function.
* Outputting a diagnostic message to the user.
* Triggering an assertion failure.
* Crashing.

Some functions may actually use up to three of these methods. (For
instance, fread().) But the thing is that none of these are compatible
with each other and error handling does not occur automatically; every
time a C programmer uses a library function they must check manually
for an error. This bloats code which would otherwise be perfectly
readable without if-blocks for error handling and variables to keep
track of errors. In a large software project one must write a section
of code for error handling hundreds of times. If you forget, something
can go horribly wrong. For example, if you don't check the return value
of malloc() you may accidentally try to use a null pointer. Oops...
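
fread() makes a good worked example: it signals trouble with a short
count, and you then need two further calls to learn which kind of
trouble (a sketch; data.bin is a made-up filename):

#include <stdio.h>

int main(void)
{
    char buf[64];
    size_t n;
    FILE *fp = fopen("data.bin", "rb");

    if (fp == NULL) {            /* method 1: NULL return          */
        perror("fopen");         /* method 2: errno                */
        return 1;
    }
    n = fread(buf, 1, sizeof buf, fp);
    if (n < sizeof buf) {        /* method 3: short count...       */
        if (ferror(fp))          /* ...then ask a second function  */
            fputs("read error\n", stderr);
    }
    fclose(fp);
    return 0;
}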

Commutative array subscripting
------------------------------

"Hey, Thompson, how can I make C's syntax even more obfuscated and
difficult to understand?"

"How about you allow 5[var] to mean the same as var[5]?"

"Wow; unnecessary and confusing syntactic idiocy! Thanks!"

"You're welcome, Dennis."

Yes, I understand that array subscripting is just a form of addition
and so it should be commutative, but doesn't it seem just a bit foolish
to say that 5[var] is the same as var[5]? How on earth do you take the
var'th value of 5?
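
You don't; both spellings collapse to the same addition, as this
runnable sketch shows:

#include <stdio.h>

int main(void)
{
    int var[6] = { 10, 20, 30, 40, 50, 60 };
    /* var[5] is defined as *(var + 5); addition commutes,
       so *(5 + var), spelled 5[var], is the same object */
    printf("%d %d\n", var[5], 5[var]);   /* prints 60 60 */
    return 0;
}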

Variadic anonymous macros
-------------------------

In case you don't understand what variadic anonymous macros are,
they're macros (i.e. pseudofunctions defined by the preprocessor) which
can take a variable number of arguments. Sounds like a simple thing to
implement. I mean, it's all done by the preprocessor, right? And
besides, you can define proper functions with variable numbers of
arguments even in the original K&R C, right?

In that case, why can't I do:

#define error(...) fprintf(stderr, __VA_ARGS__)

without getting a warning from GCC?

warning: anonymous variadic macros were introduced in C99

That's right, folks. Not until late 1999, 30 years after development on
the C programming language began, were we allowed to do such a
simple task with the preprocessor.

The C standards don't make sense
--------------------------------

Only one simple quote from the ANSI C standard - nay, a single footnote
- is needed to demonstrate the immense idiocy of the whole thing.
Ladies, gentlemen, and everyone else, I present to you...footnote 82:

All whitespace is equivalent except in certain situations.

I'd make a cutting remark about this, but it'd be too easy.

Too much preprocessor power
---------------------------

Rather foolishly, half of the actual C language is reimplemented in the
preprocessor. (This should be a concern from the start; redundancy
usually indicates an underlying problem.) We can #define fake
variables, fake conditions with #ifdef and #ifndef, and look, there's
even #if, #endif and the rest of the crew! How useful!

Erm, sorry, no.

Preprocessors are a good idea for a language like C. As has already
been said, C is not portable. Preprocessors are vital to bridging the
gap between different computer architectures and libraries and allowing
a program to compile on multiple machines without having to rely on
external programs. The #define directive, in this case, can be used
perfectly validly to set 'flags' that can be used by a program to
determine all sorts of things: which C standard is being used, which
library, who wrote it, and so on and so forth.

Now, the situation isn't as bad as for C++. In C++, the preprocessor is
so packed with unnecessary rubbish that one can actually use it to
calculate an arbitrary series of Fibonacci numbers at compile-time.
However, C comes dangerously close; it allows the programmer to define
fake global variables with wacky values which would not otherwise be
proper code, and then compare values of these variables. Why? It's not
needed; the C language of the Plan 9 operating system doesn't let you
play around with preprocessor definitions like this. It's all just
bloat.

"But what about when we want to use a constant throughout a program? We
don't want to have to go through the program changing the value each
time we want to change the constant!" some may complain. Well, there's
these things called global variables. And there's this keyword, const.
It makes a constant variable. Do you see where I'm going with this?
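
That is, instead of a preprocessor constant you can write something
like this sketch:

#define BUFFER_SIZE 1024              /* the preprocessor way          */

static const int buffer_size = 1024;  /* the in-language alternative:
                                         typed, scoped and visible to
                                         the debugger                  */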

You can do search and replace without the preprocessor, too. In fact,
they were able to do it back in the seventies on the very first
versions of Unix. They called it sed. Need something more like cpp? Use
m4 and stop complaining. It's the Unix way.
 

Randy Howard

In that case, why is it that there are so many buffer overflows in so
many C programs written by presumably experienced coders?

There lies the flaw -- "presumably experienced".
 

evolnet.regular

Walter said:
:True, but my point is:

:(1) C introduces entirely new classes of bugs
:and
:(2) C makes bugs more likely, regardless of the programmer's skill
:and
:(3) C makes it much harder to root out and fix bugs

:and I still haven't seen any reasoning against this.

Introduced relative to -what-? Do I get to compare it against
the other languages that were available at the time of its
development in 1969-1973, such as:

- numerous assemblers
- Fortran 66, Fortran G, Fortran H
- The original Pascal (1970)
- PL/I (1969)
- Algol 68
- APL (1969)
- LISP, LISP 1.5, MacLISP

Yes. Don't forget to compare it to all of the other, superior languages
that have been created subsequently, too!
 

Chris Torek

I should have pointed out that I was thinking of UTF-8 encoding
(which is the only practical encoding scheme, so there :) ).
(Seriously, UTF-16 and UTF-32 are OK for internal use only, but
should never, ever be stored externally. The Unicode standards
even said so. Naturally, Microsoft stores filenames in UTF-16
format, which introduces endianness problems on 8-bit systems.)

You can do it by declaring char to be 32 bits. But now we have this problem.
I want a program that opens an image file, does something fancy with the
data, and writes an output. Filenames are composed of characters and
obviously I want my Chinese customers to be able to load and save files in
their own language. So four bit chars are reasonable.

I assume you mean "four byte" chars (CHAR_BIT set to 32). But
"just don't do that", if at all possible. Use UTF-8. It just
works. I mean this: *it just works*.
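
(To tie this back to the strlen() point quoted above: with UTF-8,
length in char units and length in display characters simply differ,
and that's fine. A sketch:)

#include <stdio.h>
#include <string.h>

int main(void)
{
    const char *s = "na\xc3\xafve";  /* "naive" with a diaeresis:
                                        5 display characters, 6 bytes */
    printf("%lu\n", (unsigned long) strlen(s));  /* prints 6 */
    return 0;
}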

[much snippage of how one gets oneself into trouble by using UTF32]
The other programs that break are those that read and write file formats
that contain ASCII strings. For instance an IFF file contains 4 byte ASCII
mnemonic chunk ids. It is extremely tempting to write

if(!memcmp(chunk, "DATA", 4))

rather than coding the bytes numerically, but it will break on EBCDIC.

Yes, it will. As before: "don't do that" :)

It is quite tempting, of course, to assume that the machine's native
character set is the file-format's native character set. People
who use IBM mainframes quickly learn otherwise, though, because
there is no single EBCDIC.
 

Michael Mair

Steven said:
I got "hello world!" down to exactly 25 bytes. (A86 assembly)

Now that you have posted it three times, we are already at 75 B ;-)
I was talking about C, compiled without any dirty tricks (maybe a
pragma) and linked against standard startup and cleanup code.
Without that, you essentially have only to locate printf or puts
or whatever in your implementation's standard library and pass
the string to it.

This does not count IMO.

Cheers
Michael
 

Steven

I *think* I deleted the other posts, my browser was acting up.
Anyway, this is exactly why I like C and Assembly, you can make it so
efficient. Got it down to 21 bytes so far (Only 8 bytes of
instructions!)
 

Allin Cottrell

[M]y point is:

(1) C introduces entirely new classes of bugs
and
(2) C makes bugs more likely, regardless of the programmer's skill
and
(3) C makes it much harder to root out and fix bugs

This sort of shtick reminds me of the TV ads that are now
(I believe) banned on TV in the UK:

"Foo washes whiter."
"Bar tastes better."

And so on, the point being that in each case a "straw man"
comparison is implied. Washes whiter than tar; tastes
better than bleach. Makes bugs more likely than the
predicate calculus. Introduces new classes of bugs relative
to Truth Tables.

Allin Cottrell.
 

CBFalconer

Michael said:
Now that you have posted it three times, we are already at 75 B ;-)
I was talking about C, compiled without any dirty tricks (maybe a
pragma) and linked against standard startup and cleanup code.
Without that, you essentially have only to locate printf or puts
or whatever in your implementation's standard library and pass
the string to it.

This does not count IMO.

No? If your C system makes an executable file somewhere in the
25,000 to 250,000 byte range this means a bloat factor of between
1000 and 10,000 is involved. This shows why a Z80 with 64k of
memory could keep up for so long. Today the hardware is something
in that 1000 to 10,000 range faster than my 2 MHz Z80 was, and
memory is about 1000 times larger, with virtual memory about 10,000
times larger. However the bloat and tail-chasing have kept
performance in the same general ball park.

Another comparison - disk storage has gone from 400k per floppy to
40 GB per hard disk. That's a factor of 100,000, and is the reason
we don't have storage space problems. The disks are staying ahead
of the bloat.

</rant>
 

CBFalconer

Steven said:
I *think* I deleted the other posts, my browser was acting up.
Anyway, this is exactly why I like C and Assembly, you can make
it so efficient. Got it down to 21 bytes so far (Only 8 bytes
of instructions!)

mov dx,msg
mov al,??
int ??
mov al,??
int ??
msg db 'Hello World$'

(I have forgotten a lot - but CP/M is almost exactly the same)
 

James Dow Allen

C [has made] little progress ...

Mr. Evolnet is operating under a fundamental but very
common misconception. Let me try to elucidate.

There are two opposite poles in programming languages:

Low-level: ... few simple well-defined operations, well-determined,
unambiguous, easy to specify, easy to learn. Example languages
include Forth, assembly languages, C. Metaphors: standard transmission,
do what I say, human is boss.

High-level: ... rich semantics, operators overloaded, language details
vary case to case, difficult specifications, often error-prone implementations.
Example languages include Ada, C++. Metaphors: automatic transmission,
do what I mean, machine is smarter than man.

Neither pole is "right" or "wrong". Machines sometimes seem smarter
than (some) men. There is evidence that high-level languages lead to
much higher programmer productivity. Yet many programmers love Forth,
and I'm sure it still has important niches. No one language, or
type of language, satisfies all users and applications.

I, for one, am much more productive and (perhaps more important)
much happier programming in Low-level languages. Frankly I used
to program mostly in assembly language, even when there was no "need",
but was happy to adopt C when I discovered it. Very low-level, but
giving most of the benefit of high-level language.

The syntactic similarity between C and C++ has confused the issue.
They are from totally opposite ends of the language spectrum, yet many have
the misconception that C++ is sort of a revision of C with improvements.
C has no string type. Huh?...

There are other languages if you don't like C. It sounds like you
prefer High-level; why are you here in comp.lang.c?
What. A. Crock. An ugly hack.

Hunh? Simplicity is the opposite of Hackery.
And don't even get me started on the lack of an exponentiation
operator.

Try "cc whatever.c -lm" next time. Is "-lm" so hard to remember?
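
Exponentiation lives in the maths library rather than the language; a
sketch:

#include <stdio.h>
#include <math.h>   /* pow(); compile with: cc whatever.c -lm */

int main(void)
{
    printf("%.0f\n", pow(2.0, 10.0));   /* prints 1024 */
    return 0;
}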

Hope this helps,
James
 

Mohd Hanafiah Abdullah

I would consider your arguments more seriously if you bothered to make
them seem objective. As it is, you just sound like you're whining.
Also, after reading the entire post, it's obvious that your list of
complaints is very short:

1) You want a string type
2) You don't like low level programming
3) You don't think C is portable

Come to think of it, those all sound like opinions rather than hard
facts. If you don't like C, don't use it. Nobody is forcing you. If you
don't want anyone else to use C, tough. It's impossible to use your own
opinions to change the opinions of others.

After all the discussion of the advantages/disadvantages of C, it has remained
one of the favourite languages in the real world for decades. It is simply
pragmatic to go with C in many projects.

See: http://www.tiobe.com/tpci.htm

for the latest info on programming languages index.

Napi
 

Jonathan Burd

JuicyLucy said:
Yes we understand, ... It is quite difficult to get your hands on


Please snip out irrelevant textual matter from your quotes
when posting and please do not top-post in clc.

Regards,
Jonathan.
 

Walter Roberson

|Walter Roberson wrote:

|> In article <[email protected]>,

|> :True, but my point is:

|> :(1) C introduces entirely new classes of bugs

|> Introduced relative to -what-? Do I get to compare it against
|> the other languages that were available at the time of its
|> development in 1969-1973, such as:

|Yes. Don't forget to compare it to all of the other, superior languages
|that have been created subsequently, too!

I don't see any meaningful sense in which it can be said that C
"introduces" classes of problems when compared against programming
languages that were developed afterwards.

| - Fortran 66, Fortran G, Fortran H

No character constants, no character strings. Hollerith literals were
new in FORTRAN 66, as was equivalencing, with all the accompanying typing
horrors and ill-defined padding rules. Those were the days when
you really could change the value of a constant in FORTRAN by passing
it as an argument and then assigning a value into the dummy parameter.

One of the first things I did while working with the organization
I work with now, was to sit and go through an official IBM FORTRAN G
and H reference manual... correcting it by memory. There were
a fair number of mistakes.


| - The original Pascal (1970)

I worked for a couple of years for a company that designed one of the
first all-digital telephone switches. The line card programming was
Motorola 68020 assembler (I got pretty good at reading core
and patching code in binary), but all the higher level programming
was done in Pascal. Except that it *couldn't* be done in Pascal
as defined by The Pascal User Manual and Report, so there was a team
of people that did nothing but rewrite the compiler itself, extending
it and optimizing it for real use. Pascal doesn't have "include"
files or anything similar, so in order to use the real Pascal of the
time, one would have to include all of the interface and constant
definitions in every program, and change all several hundred copies
of them when [for example] the maximum line card density was increased.
And actually talk to hardware? Not a chance without extending it
to be able to violate all the strong typing rules that were the
heart of Pascal.

| - PL/I (1969)

According to wikipedia.org,

It has a very large vocabulary of built-in functions. In fact,
there is probably no one compiler that has the full standard of
keywords available. PL/I compilers are normally subsets of the
language that specialize in various fields.

And you complain about the number of library functions defined
for C!


| - Algol 68

An interesting language, but I will need to refer to my dusty manual
before making further comments.

| - APL (1969)

APL... the famous "write-only language", with too many operators to
fit into even minor extensions of ASCII -- one needed a full APL
keyboard. Too many ways to do things in C? APL had more ... most of
them unreadable the week after you'd written the code. Oh, and
the goto operator was pretty much mandatory.

Don't suppose here that I am slandering APL: I am one of the few who
bothered to buy a real APL terminal out of my own pocket [while still
a student!]. APL is great for some kinds of work...
but to do any systems programming, you pretty much had to go behind
its back and hack in a new (non-portable) i-beam operator.
And the only way to pass operators (functions) around was textually
and then eval the string.

| - LISP, LISP 1.5, MacLISP

This was before the LISP machines, before Interlisp, before
Common Lisp. Not exactly a strongly typed language [though there
was certainly nothing stopping you from locally extending the
standard operators by defining a new function with the same
name that tested for particular conditions and passed off control
to the CDR of the previous definition of the function...]
 

Erik de Castro Lopo

James said:
The syntactic similarity between C and C++ has confused the issue.

The main confusion about C++ is that its practitioners think
it is simultaneously a low and high level language when in
reality it's good at neither.
Totally from opposite ends of the language spectrum, yet many have
the misconception that C++ is sort of a revision of C with improvements.


There are other languages if you don't like C. It sounds like you
prefer High-level; why are you here in comp.lang.c ?

I suggested OCaml as a genuine high level language. We don't
yet know if he's had a look at that.

Erik
--
+-----------------------------------------------------------+
Erik de Castro Lopo (e-mail address removed) (Yes it's valid)
+-----------------------------------------------------------+
"C is a programming language. C++ is a cult."
-- David Parsons in comp.os.linux.development.apps
 
