Russ P. said:
I know some researchers in software engineering who believe that the
ultimate solution to software reliability is automatic code
generation. The don't really care much which language is used, because
it would only be an intermediate form that humans don't interact with
directly. In that scenario, humans would essentially use a "higher
level" language such as UML or some such thing.
I personally have a hard time seeing how that could work, but that may
just be due to my own lack of understanding or vision.
The usual idea is that you would write a specification, and a
constructive mathematical proof that a certain value meets that
specification. The compiler then verifies the proof and turns it into
code. Coq (http://coq.inria.fr) is an example of a language that
works like that. There is a family of jokes that go:
Q. How many $LANGUAGE programmers does it take to change a lightbulb?
A. [funny response that illustrates some point about $LANGUAGE].
The instantiation for Coq goes:
Q. How many Coq programmers does it take to change a lightbulb?
A. Are you kidding? It took two postdocs six months just to prove
that the bulb and socket are threaded in the same direction.
Despite this, a compiler for a fairly substantial C subset has been
written mostly in Coq (http://compcert.inria.fr/doc/index.html). But
this stuff is far, far away from Python.
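To make the specification-plus-proof idea slightly more concrete,
here is a minimal sketch in Lean (a proof assistant in the same
family as Coq; the statement and names are invented for
illustration). The type is the specification, and the term below it
is the constructive proof that the checker verifies:

    -- Specification: doubling any natural number yields an even
    -- number, i.e. something of the form 2 * m for some witness m.
    theorem double_is_even (k : Nat) : ∃ m, 2 * k = 2 * m :=
      ⟨k, rfl⟩  -- the witness is k itself; rfl proves 2 * k = 2 * k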
I have a situation which I face almost every day, where I have some
gigabytes of data that I want to slice and dice somehow and get some
numbers out of. I spend 15 minutes writing a one-off Python program
and then several hours waiting for it to run. If I used C instead,
I'd spend several hours writing the one-off program and then 15
minutes waiting for it to run, which is not exactly better. (Or, I
could spend several hours writing a parallel version of the Python
program and running it on multiple machines, which is also not an
improvement.)
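For concreteness, here is a hypothetical example of the kind of
one-off I mean (the file name, record layout, and statistic are all
made up):

    # Scan a big comma-separated log and average one column,
    # filtered on another.
    total = 0.0
    count = 0
    for line in open("events.csv"):
        name, timestamp, value = line.rstrip("\n").split(",")
        if name == "request":
            total += float(value)
            count += 1
    print(total / count)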
Often, the Python program crashes halfway through, even though I
tested it on a few megabytes of data before starting the full
multi-gigabyte run, because it hit some unexpected condition in the
data. That crash could have been prevented by more compile time
checking: something that made sure the structures understood by the
one-off script matched the ones in the program that generated the
input data.
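Continuing the hypothetical script above: if the producing program is
later changed to emit a fourth field, the consumer only finds out at
the first new-format line, possibly hours into the run:

    # The producer (another made-up program) now writes four fields,
    # so unpacking into three names raises ValueError -- but only once
    # the first new-format line is reached, deep into the data.
    line = "request,1700000000,0.25,eu-west\n"
    name, timestamp, value = line.rstrip("\n").split(",")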
I would be ecstatic with a version of Python where I might have to
spend 20 minutes instead of 15 minutes writing the program, but then
it runs in half an hour instead of several hours and doesn't crash. I
think the Python community should be aiming towards this.
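Even in today's Python, a few of those extra five minutes can buy a
crude approximation: check the input's shape once, up front, and fail
fast instead of hours in. A sketch (the shared field list and the
header-line convention are my own invention, not something the data
necessarily carries):

    # One shared module imported by both producer and consumer, so
    # the record layout has a single source of truth.
    RECORD_FIELDS = ("name", "timestamp", "value")

    def check_header(path):
        # Assumes the producer writes the field names as line one.
        with open(path) as f:
            header = tuple(f.readline().rstrip("\n").split(","))
        if header != RECORD_FIELDS:
            raise SystemExit("schema mismatch: %r != %r"
                             % (header, RECORD_FIELDS))

    check_header("events.csv")  # fail before the multi-gigabyte pass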