Why are C/C++ errors so obscure/devious?

Keith Thompson

Chris Hills said:
Because a C compiler is a translator, not an error checker. It tries to
make the most complex statement it can from the tokens.

For error checking use lint. That is what it is designed for.
[...]

A compiler is both a translator and an error checker. A conforming C
compiler is required by the standard to issue diagnostics for certain
classes of errors; the diagnostics aren't required to be useful, but
in any decent compiler they will be.

The decision to make lint a separate tool from the compiler is an old
historical one. I don't know much about what went into that decision,
but I suspect it had to do with resource limits on the machines of the
time, limits that are no longer relevant.

There are still various lint-like tools floating around, and it's
still a good idea to use them, but there's no inherent reason why the
compiler itself shouldn't be doing what lint does, at least
optionally. Most C compilers will issue warnings for perfectly legal
code; this is, in my opinion, a good thing. gcc, for example, will
warn about unused or potentially uninitialized variables, incorrect
printf format strings, and a number of other things.
 
Keith Thompson

Massimo Soricetti said:
[wants a more helpful compiler]

There have been compilers like this, though I haven't seen one for C.

The most famous example was PLICC, the IBM PL/I Checkout Compiler.
It would insert semicolons, guess at missing declarations, and so on.
It could take a FORTRAN program and "fix" it until it was a legal PL/I
program (not all FORTRAN programs, of course, but sufficiently simple
ones).

I've worked on a compiler (not a C compiler) that did aggressive syntax
fix-up on syntactically incorrect input. If you fed it a paragraph of
English text, it would find English words that happened to be keywords
and eventually fit them into a transformed version of the input that
happened to be a legal program. (Sometimes this would be an empty
compilation unit.)

This was useful because it meant that a syntax error wouldn't prevent
the compiler from continuing to detect further errors. A missing
semicolon on line 50 would be re-inserted, and the compiler would
still detect the misspelled identifier on line 200.

*But* all this transformation was purely internal. Once a syntax
error was detected, the compilation was going to fail; the fix-up was
done only to allow further error detection. It would *never* attempt
to modify the input source file, or even generate a corrected one;
you'd have to insert the semicolon yourself.

With today's faster systems, it's not clear that this is worth the
effort. If a compiler detects a syntax error and immediately bails
out, I can fix the error and recompile to detect any subsequent
errors.

Another useful trick I've seen (in a different compiler) is using
indentation as a clue for error recovery. For example, if you have a
C program like this:

int main(void)
{                        /* line 2 */
    if (something) {     /* line 3 */
        do_something();
                         /* line 5, missing right brace */
}                        /* line 6 */

most compilers will match the '{' on line 3 to the '}' on line 6, and
complain about a missing '}' following that. A compiler that pays
attention to indentation might correctly report a missing '}' on
line 5. In a larger program, that might save the user a lot of time
tracking down the error.

All this stuff is a lot of work for whoever writes the parser, to
avoid what's usually not a lot of work for the programmer to find and
fix the error. Again, it's not obvious whether it's worth the effort.
 
Chris Hills

Keith Thompson <kst- said:
Chris Hills said:
Because a C compiler is a translator, not an error checker. It tries to
make the most complex statement it can from the tokens.

For error checking use lint. That is what it is designed for.
[...]

A compiler is both a translator and an error checker. A conforming C
compiler is required by the standard to issue diagnostics for certain
classes of errors; the diagnostics aren't required to be useful, but
in any decent compiler they will be.

The decision to make lint a separate tool from the compiler is an old
historical one. I don't know much about what went into that decision,
but I suspect it had to do with resource limits on the machines of the
time, limits that are no longer relevant.

There are still various lint-like tools floating around, and it's
still a good idea to use them, but there's no inherent reason why the
compiler itself shouldn't be doing what lint does, at least
optionally. Most C compilers will issue warnings for perfectly legal
code; this is, in my opinion, a good thing. gcc, for example, will
warn about unused or potentially uninitialized variables, incorrect
printf format strings, and a number of other things.

I think lint was intended to be part of the compiler chain (ask the
other Keith :). In those days the compiler was a three-pass affair, so it
would have been:

lint, pre-processor, cc1, cc2, cc3, assemble, link.

These days it is compile (to object) then link.

So I think they may have intended lint to be part of the system. The
first lint was done by Johnson a year or so after the first C
compilers (AFAIK). It is still, AFAIK, part of most Unix and Linux C
compiler suites. It just seems to have dropped off the PC ones.

I think the reason I like a separate lint is that it was built by a
different mind-set from the compiler's, so between them they should
catch most things. It is like using two different virus scanners with
95% coverage each to get 99% coverage.

The idea of lint was to warn about legal but dubious uses of C.

This comes back to the original comment that C is as safe as any other
language when used properly. IEC 61508 says C is OK when used with a
subset and static analysis as part of the process.

As we have seen, Ada can be just as unsafe if not fully tested. The first
thing I have seen done on Modula-2 developments is the SYSTEM module and
pointers getting pulled in. Others seemed to think that if the compiler
compiled it with no errors it was Good Code.

A few minutes with lint on code that compiles clean should convince
anyone that just because code is legal does not mean it is safe. In any
language.
 
Keith Thompson

Chris Hills said:
I think lint was intended to be part of the compiler chain (ask the
other Keith :). In those days the compiler was a three-pass affair, so it
would have been:

lint, pre-processor, cc1, cc2, cc3, assemble, link.

These days it is compile (to object) then link.

Um, what other Keith are you referring to? Do you mean Ken Thompson?

As far as I know, lint has always been optional. The chain above
implies that the output of lint would be fed to the preprocessor;
that's not how it works.

[...]
As we have seen, Ada can be just as unsafe if not fully tested. The first
thing I have seen done on Modula-2 developments is the SYSTEM module and
pointers getting pulled in. Others seemed to think that if the compiler
compiled it with no errors it was Good Code.

A few minutes with lint on code that compiles clean should convince
anyone that just because code is legal does not mean it is safe. In any
language.

Since lint applies only to C (or perhaps, in more recent variants, to
C++), this should at most convince someone that legal C code isn't
necessarily safe.

There is no language in which it is impossible, or even particularly
difficult, to write bad code. But there are languages in which a
clean compilation can imply you're much further along the path to a
correct program than is the case in other languages. I won't go into
more specifics than that because I don't want to start a language war
-- but I will mention that improving C in this respect would be a
worthy goal. The introduction of prototypes, for example, was a great
improvement. I'd be surprised if there weren't more opportunities for
such improvements.
 
jussij

It's not uncommon to forget a } when writing code

If by chance you happen to write your code on the Windows platform then
you could use something like the Zeus IDE. It has several features that
would help in situations like this, for example:

1: Automatic Brace Insertion
This means when you enter this code:
{<- Enter key here

Zeus will automatically do this:
{
|<- Cursor here
}

2: Code folding
Since code folding is done on braces, if a brace is missing the code
folding quickly highlights that something is wrong.

3: Find Matching Braces
Searching for a matching brace also helps find missing braces.

And in general, I noted that many, if not all, error messages
from the compiler are VERY short and cryptic,

With time things do become clearer. I would suggest taking careful note
of the exact wording of the compile error. Then when you fix the error
go back and revisit the compiler error message.

On most occasions I think you will find the compiler error message was
in fact valid and precise.

The compiler output is a lot like a legal document. Consider a legal
document that is impossible to read. But to a lawyer the same document
is accurate and precise.

To better understand the compiler you just need to practice your
compiler legal speak :)

Why can't a compiler give more accurate information about
errors?

In general they do.

Shouldn't this save time, stress and money?

After a while you will find the stress is not created by the compiler or
the error messages it produces, but rather by the tight deadlines and
that strange program crash that is proving impossible to track down.

Jussi Jumppanen
Author: Zeus for Windows
Note: Zeus is shareware (45 day trial).
 
