Ed Prochak said:
If the programmer happens to indent lines
of code to the wrong level, how does the INDENT/DEDENT syntax stop the
compiler from generating, as Skybuck likes to say, lots of
bull<expletive> lines of errors?
I don't think the INDENT/DEDENT syntax itself will stop the compiler
from generating lots of lines of errors. The concept of INDENT/DEDENT is
orthogonal to the concept of intelligent error message suppression, which is
a moderately well understood problem.
To give you an example, consider the following line:
int i = a + (b + (c + (d + e)));
Imagine that a, b, c and d are all integers, while e is something else,
such that you cannot add an integer and whatever e is together. So the
expression "d + e" results in a type error, and will yield an error message.
But that means the type of the expression "(d + e)" (which is distinct from
the expression "d + e") is also erroneous. And the type of "(c + (d + e))"
is erroneous as well, and so on.
What is usually done then is that the compiler has a dummy "error_type"
type that it uses in the type checking phase. When it sees "d + e", it'll
report the error message, but then claim that the type of the expression is
"error_type" (as opposed to, say, "int"). Then, by definition, any operation
with an error_type operand is "legal" in the sense that it generates no
errors, and its result is itself of type error_type. So adding an int to an
error_type doesn't generate an error message.
That's how the compiler is able to generate just one error message for
the case above, instead of half a dozen errors.
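To make that concrete, here is a rough sketch of the idea in Python. It is
not taken from any real compiler; the names (Var, Add, check, ERROR) are made
up for illustration, and I'm assuming e is a string and that the only
operator is integer addition.

```python
from dataclasses import dataclass

# Hypothetical type names for this sketch only.
INT, STRING, ERROR = "int", "string", "error_type"

@dataclass
class Var:
    name: str

@dataclass
class Add:
    left: object
    right: object

def check(node, env, errors):
    """Return the type of node, reporting each root cause only once."""
    if isinstance(node, Var):
        if node.name not in env:
            errors.append(f"undeclared variable '{node.name}'")
            return ERROR
        return env[node.name]
    left = check(node.left, env, errors)
    right = check(node.right, env, errors)
    # An operand of error_type has already been reported; stay quiet
    # and just pass the error_type up the tree.
    if ERROR in (left, right):
        return ERROR
    if left == INT and right == INT:
        return INT
    errors.append(f"cannot add {left} and {right}")
    return ERROR

# a + (b + (c + (d + e))), with e assumed to be a string
env = {"a": INT, "b": INT, "c": INT, "d": INT, "e": STRING}
expr = Add(Var("a"), Add(Var("b"), Add(Var("c"), Add(Var("d"), Var("e")))))
errors = []
result = check(expr, env, errors)
print(result, errors)  # error_type ['cannot add int and string']
```

Without the "ERROR in (left, right)" test, each of the three enclosing
additions would complain about its own operand types as well.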
Handling missing block structure tokens (e.g. '{' and '}' in C-like
languages) is a bit trickier, but still doable, depending on the complexity
of the language, from a parsing point of view. I've never tried to write a C
compiler, so I don't know how hard it'd be in C, but I know that the grammar
that describes the Java language is relatively simple. For example, in Java,
when you encounter a method declaration, you'll know it's a method
declaration for sure; it cannot be confused with a cast operation or a
variable declaration or anything like that.
When the parser encounters something that it doesn't expect, it knows
something's gone wrong (it may be a missing closing bracket, or something
else). In these situations, it can just report the error of seeing something
it doesn't expect, and then stumble along until it encounters something it
recognizes, for example, a method declaration. Since method declarations
can't be nested in Java, if it sees a new method declaration, the parser
knows to close the old method declaration, and just continue parsing from
there. The idea is to not complain or generate error messages about all
those strange tokens it saw during the "stumbling along" phase, since the
structure of the parse tree is too damaged to produce any informed error
messages anyway. Once it grounds itself again and can establish where it is
in the parse tree, it can turn error reporting on again, and continue on its
merry way.
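Here is a small Python sketch of that "stumble along until you recognize
something" approach. The toy language is entirely made up for the example
(it is not Java): a program is a list of "method NAME { ... }" blocks, and
each statement in a body is just "NAME ;". The function names are
hypothetical too.

```python
KEYWORD = "method"

def parse_methods(tokens):
    """Parse 'method NAME { NAME ; ... }' blocks; return (methods, errors)."""
    methods, errors = [], []
    i, n = 0, len(tokens)

    def is_name(tok):
        return tok.isidentifier() and tok != KEYWORD

    def skip_to_method(j):
        # The "stumbling along" phase: discard tokens silently until the
        # next 'method' keyword (or the end of the input).
        while j < n and tokens[j] != KEYWORD:
            j += 1
        return j

    while i < n:
        # Expect the header: method NAME {
        header_ok = (tokens[i] == KEYWORD and i + 2 < n
                     and is_name(tokens[i + 1]) and tokens[i + 2] == "{")
        if not header_ok:
            errors.append(f"token {i}: expected a method declaration, "
                          f"found '{tokens[i]}'")
            i = skip_to_method(i + 1)
            continue
        name, body = tokens[i + 1], []
        i += 3
        while i < n and tokens[i] != "}":
            if is_name(tokens[i]) and i + 1 < n and tokens[i + 1] == ";":
                body.append(tokens[i])
                i += 2
            else:
                # Report once, then resynchronize. Methods cannot nest, so
                # the next 'method' keyword implicitly closes this one.
                errors.append(f"token {i}: unexpected '{tokens[i]}' "
                              f"in method '{name}'")
                i = skip_to_method(i)
                break
        else:
            if i < n:
                i += 1  # consume the closing '}'
            else:
                errors.append(f"end of input: missing '}}' for method '{name}'")
        methods.append((name, body))
    return methods, errors

# Method f is damaged and never closed; g is fine.
source = "method f { a ; ) ) x + method g { c ; }"
methods, errors = parse_methods(source.split())
print(methods)  # [('f', ['a']), ('g', ['c'])]
print(errors)   # ["token 5: unexpected ')' in method 'f'"]
```

Note that the second ")", the "x" and the "+" produce no messages of their
own, and g is still parsed normally once the parser has found its footing.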
- Oliver