PEP 3107 and stronger typing (note: probably a newbie question)

Roy Smith · Jul 3, 2007

Eckel's and Martin's well-known essays on why good testing can replace
strict static type checking:
<http://www.mindview.net/WebLog/log-0025>
<http://www.artima.com/weblogs/viewpost.jsp?thread=4639>

I've read the first before. I just re-read it. There seem to be three
different concepts all being talked about at the same time.

1) Static vs. dynamic checking.
2) Type (is-a) checking vs. behavior (has-a) checking.
3) Automatic (i.e. compiler generated) vs. manually written tests.

They all allow you to write manual tests. No sane programmer will rely
exclusively on the automatic checks, no matter what flavor they are. The
interesting thing is that most people seem to conflate items 1 and 2 above
into two composite camps: static type checking vs. dynamic behavior
checking. There's really no reason you can't have dynamic type checking
(things that raise TypeError in Python, for example, or C++'s
dynamic_cast). There's also no reason you can't have static behavior
checking (Java's interfaces).
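
To make the split between items 1 and 2 concrete, here is a minimal Python sketch of two checks that both happen at run time: one on type (is-a), one on behavior. The function names are hypothetical, invented only for this illustration.

```python
def total_length_by_type(items):
    """Dynamic *type* check: reject anything that is not a list (is-a)."""
    if not isinstance(items, list):
        raise TypeError("expected a list, got %s" % type(items).__name__)
    return sum(len(item) for item in items)


def total_length_by_behavior(items):
    """Dynamic *behavior* check: accept any iterable whose elements support len()."""
    # No explicit test here: an unsuitable argument fails when iteration
    # or len() is attempted, not because of what class it belongs to.
    return sum(len(item) for item in items)
```

Passing a tuple of strings is rejected by the first function and accepted by the second, even though neither check involves a compiler.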

greg · Jul 4, 2007

Paul said:
E.g. your program might pass its test and run properly for years
before some weird piece of input data causes some regexp to not quite
work.

Then you get a bug report, you fix it, and you add a test
for it so that particular bug can't happen again.

Once I got the
function to work, I deployed it without writing permanent tests for
it.

That suggests you had a temporary test at some point.
So, keep it and make it a permanent test. Even if it's
just a small manually-created directory, it's still
better than nothing, and you can add to it over time
to cover any bugs that turn up.
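
As a rough sketch of what keeping the temporary test could look like, assuming the function walks a directory tree: the function `list_data_files` and the file names are hypothetical stand-ins, since the thread does not show the real code.

```python
import os
import tempfile
import unittest


def list_data_files(root):
    """Hypothetical function under test: sorted names of *.dat files under root."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".dat"):
                found.append(name)
    return sorted(found)


class ListDataFilesTest(unittest.TestCase):
    def setUp(self):
        # Build the small, manually specified directory for each test.
        self.tmp = tempfile.TemporaryDirectory()
        root = self.tmp.name
        os.mkdir(os.path.join(root, "sub"))
        for rel in ("a.dat", "notes.txt", os.path.join("sub", "b.dat")):
            with open(os.path.join(root, rel), "w") as f:
                f.write("x")

    def tearDown(self):
        self.tmp.cleanup()

    def test_finds_nested_data_files(self):
        self.assertEqual(list_data_files(self.tmp.name), ["a.dat", "b.dat"])
```

Saved in a module, this runs with `python -m unittest`. When a bug report comes in, the offending file layout can be added to `setUp` along with a new test method.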

Paul Rubin · Jul 4, 2007

greg said:
Then you get a bug report, you fix it, and you add a test
for it so that particular bug can't happen again.

Why on earth would anyone prefer taking a failure in the field over
having a static type check make that particular failure impossible?

That suggests you had a temporary test at some point.

I ran the function on real, dynamic data and made sure the results
were right.

So, keep it and make it a permanent test. Even if it's just a small
manually-created directory, it's still better than nothing, and you
can add to it over time to cover any bugs that turn up.

That doesn't obviously fit into the doctest framework; such a test
would be considerably more complex than the function itself, and the
types of situations that I could imagine causing failures wouldn't be
included in a test like that. Basically I had to get on to the next
thing. Given the somewhat throwaway nature of this particular code,
nothing else really made sense. But I think I'm going to look into
using dejagnu for more heavyweight testing of some other parts of the
program, so I guess some good came out of discussing this issue.
Thanks.

Bruno Desthuilliers · Jul 4, 2007

Paul Rubin wrote:

Why on earth would anyone prefer taking a failure in the field over
having a static type check make that particular failure impossible?

Because static type checks impose a lot of arbitrary restrictions,
boilerplate code etc, which tends to make code more complicated than it
needs to be, which is a good way of introducing bugs that wouldn't have
existed without static type checks. Depending on the application domain
and some technical and non-technical constraints and requirements, it
(often) happens that it's better to have the application deployed now
with an occasional error message than to have it next year...

And FWIW, when it comes to "weird piece of input data", statically typed
languages are not specially better than dynamic ones...

Roy Smith · Jul 4, 2007

greg said:
Then you get a bug report, you fix it, and you add a test
for it so that particular bug can't happen again.

The TDD zealots would tell you you've got the order wrong. Instead of
"fix, then write a test", it should be "write a failing test, then fix it".

Alex Martelli · Jul 4, 2007

Roy Smith said:
The TDD zealots would tell you you've got the order wrong. Instead of
"fix, then write a test", it should be "write a failing test, then fix it".

Incidentally, I was pretty surprised recently (re-reading Weinberg's
"Psychology of Computer Programming" classic from _1971_) to find out
Weinberg advocating "test-first coding" (not the same thing as
"test-driven design", but sharing the key insight that tests should be
written before the code they test) for psychological reasons. He's
discussing common practices of the '60s, with the same professional
writing both the code and the tests, and pointing out how often the
knowledge of the already-written code subconsciously influences the
programmer to write tests that don't really "challenge" the code enough
-- writing the tests "in advance" would avoid this problem.

Nihil sub sole novi...

Alex

Paul Rubin · Jul 4, 2007

Bruno Desthuilliers said:
Because static type checks impose a lot of arbitrary restrictions,
boilerplate code etc, which tends to make code more complicated than
it needs to be, which is a good way of introducing bugs that wouldn't
have existed without static type checks.

Why do you say that? By metrics and anecdotal evidence, Haskell code
appears to be at least as compact as Python code.

Depending on the application domain and some technical and
non-technical constraints and requirements, it (often) happens that
it's better to have the application deployed now with an occasional
error message than to have it next year...

I suppose that includes the thing I'm currently working on. For
some other stuff I've done, such errors would have caused huge hassles,
lost customer money, etc.

And FWIW, when it comes to "weird piece of input data", statically
typed languages are not specially better than dynamic ones...

I know that ML gives compiler warning messages if you have a pattern
match (sort of a variant of a case statement, not a regexp match)
which is non-exhaustive. And Haskell's Maybe monad is part of an
idiom that handles failing computations (like regexp matches) much
more gracefully than Python can. Both of those would help this
situation.
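
For comparison, this is roughly how a failing regexp match behaves in Python. It is a sketch only: the pattern and function names are hypothetical, and the explicit `None` check is just the closest everyday Python idiom, not an equivalent of Maybe. Nothing forces the caller to handle the `None`.

```python
import re

# Hypothetical pattern, chosen only for illustration.
VERSION_RE = re.compile(r"version (\d+)\.(\d+)")


def parse_version_unchecked(line):
    # search() returns None when nothing matches, so .groups() then
    # raises AttributeError far from the real cause.
    return tuple(int(part) for part in VERSION_RE.search(line).groups())


def parse_version_checked(line):
    # The failure case is handled explicitly and passed on as None.
    match = VERSION_RE.search(line)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())
```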

Paul Boddie · Jul 4, 2007

Paul said:
Why do you say that? By metrics and anecdotal evidence, Haskell code
appears to be at least as compact as Python code.

I think Bruno is referring to another class of languages here.
However, it's interesting to consider the work that sometimes needs to
go in to specify data structures in some languages - thinking of ML
and friends, as opposed to Java and friends. The campaign for optional
static typing in Python rapidly became bogged down in this matter,
fearing that any resulting specification for type information might
not be the right combination of flexible and powerful to fit in with
the rest of the language, and that's how we really ended up with PEP
3107: make the semantics vague and pretend it has nothing to do with
types, thus avoiding the issue completely.

Paul

John Nagle · Jul 4, 2007

Paul Boddie said:

The campaign for optional
static typing in Python rapidly became bogged down in this matter,
fearing that any resulting specification for type information might
not be the right combination of flexible and powerful to fit in with
the rest of the language, and that's how we really ended up with PEP
3107: make the semantics vague and pretend it has nothing to do with
types, thus avoiding the issue completely.

Unfortunately, that may lead to the worst of both worlds.

If you think enforced static typing is painful, try maintaining code with
non-enforced static typing. You can't rely on the type information, and
inevitably some of it will be wrong. You can't tell by looking which
is wrong, of course.
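
A small sketch of the problem using PEP 3107 style annotations, which the interpreter stores but does not check. The function `double` is a hypothetical example.

```python
def double(x: int) -> int:
    # The annotations claim int in, int out, but nothing enforces that.
    return x * 2


# A str goes in and a str comes out; no error is raised anywhere.
result = double("ab")
```

Here `result` is the string `"abab"`, while `double.__annotations__` still says both the argument and the return value are `int`.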

This has been tried. Original K&R C had non-enforced static typing.
All "struct" pointers were equivalent. It wasn't pretty.

It takes strict programmer discipline to make non-enforced static
typing work. I've seen it work in an aerospace company, but the Python
crowd probably doesn't want that level of engineering discipline.
Non-enforced static typing requires a quality assurance group that
reads code and checks coding standards.

John Nagle

Paul Rubin · Jul 4, 2007

John Nagle said:
This has been tried. Original K&R C had non-enforced static typing.
All "struct" pointers were equivalent. It wasn't pretty.

It takes strict programmer discipline to make non-enforced static
typing work. I've seen it work in an aerospace company, but the Python
crowd probably doesn't want that level of engineering discipline.

I think even enforced static types wouldn't cure what I see as the
looseness in Python. There is not enough composability of small
snippets of code. For example, the "for" statement clobbers its index
variable and then leaks it to the outside of the loop. That may be
more of a culprit than dynamic types. Perl and C++ both fix this with
syntax like

for my $i (...) ... (Perl) or
for (int i = 0; i < n; i++) ... (C++, Java)

making a temporary scope for the index variable. Python even leaks
the index variable of list comprehensions (I've mostly stopped using
them because of this), though that's a recognized wart and is due to
be fixed.
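
A minimal sketch of the behavior in question, written for Python 3, where the list comprehension wart has been fixed but the `for` statement still rebinds and leaks its variable. The function names are hypothetical.

```python
def loop_variable_leaks():
    i = "outer value"
    for i in range(3):
        pass
    # The for statement rebinds i in the enclosing scope.
    return i


def comprehension_variable_is_local():
    j = "outer value"
    squares = [j * j for j in range(3)]
    # In Python 3 the comprehension has its own scope, so j is untouched.
    return j, squares
```

The first function returns `2`; the second returns `("outer value", [0, 1, 4])`.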

Python would be helped a lot by some way to introduce temporary
scopes. We had some discussion of this recently, concluding that
there should be a compiler warning message if a variable in a
temporary scope shadows one from a surrounding scope.

Paul Rubin · Jul 4, 2007

Bruno Desthuilliers said:
Haskell - as other languages using type-inference like OCaml - are in
a different category. Yes, I know, don't say it, they are statically
typed - but it's mostly structural typing, not declarative
typing. Which makes them much more usable IMHO.

Some users in fact recommend writing an explicit type signature for
every Haskell function, which functions sort of like a unit test.
That doesn't bloat the code up noticeably. The conciseness of those
languages comes more from polymorphism and convenient ways of writing
and using higher-order functions, than from type inference.

Still, static typechecking is not a guarantee against runtime
errors. Nor against logical errors.

Right, however the reality is it does seem to prevent a lot of
surprises. There's even an intermediate language (a language
generated by a compiler as an intermediate step towards emitting
machine code) called Henk, in which EVERY value is type-annotated (and
in a very fancy type system too). The author reports that the
annotations have been very helpful for noticing compiler bugs.

I'd have to see a concrete use case. And I'd need much more real-world
experience with some ML variant, but this is not something I can
expect to happen in a near future - it's difficult enough to convince
PHBs that Python is fine.

Monad Reader #7 has an article about some Wall street company using ML:

http://www.haskell.org/sitewiki/images/0/03/TMR-Issue7.pdf

see the article by Yaron Minsky.

BJörn Lindqvist · Jul 4, 2007

Remember that pure CPython has no separate "compile time" and
runtime. But Psyco and ShedSkin could use the annotations the way
they want. ......

    def compile(source: "something compilable",
                filename: "where the compilable thing comes from",
                mode: "is this a single statement or a suite?"):
        ...

I think the above would make an annotation-enhanced Psyco or ShedSkin
very confused.

Eduardo "EdCrypt" O. Padoan · Jul 4, 2007

I think the above would make an annotation-enhanced Psyco or ShedSkin
very confused.

This example was to show that annotations are for documentation too,
not only type checking or optimization. It is from the PEP.
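
A short sketch of the documentation use, assuming Python 3 function annotations. `compile_source` is a hypothetical stand-in for the PEP's `compile` example (renamed to avoid shadowing the builtin), and `describe_parameters` is likewise made up for illustration.

```python
def compile_source(source: "something compilable",
                   filename: "where the compilable thing comes from",
                   mode: "is this a single statement or a suite?"):
    """Hypothetical stand-in for the PEP's compile() example."""
    return (source, filename, mode)


def describe_parameters(func):
    # Annotations are plain objects stored in func.__annotations__;
    # here they are strings, so they can be printed as documentation.
    return ["%s: %s" % (name, note)
            for name, note in func.__annotations__.items()]
```

Calling `describe_parameters(compile_source)` yields one line of text per parameter. A tool expecting types in the same slots would find only strings, which is the confusion raised above.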

greg · Jul 4, 2007

The TDD zealots would tell you you've got the order wrong. Instead of
"fix, then write a test", it should be "write a failing test, then fix it".

Yes, I'm aware of that. I said "and", not "then".

Roy Smith · Jul 4, 2007

John Nagle said:
Non-enforced static typing requires a quality assurance group that
reads code and checks coding standards.

In other words, it's enforced, but it's enforced by QA people instead of
the compiler.

Michael Hoffman · Jul 4, 2007

Eduardo said:
Sorry, I surely know that Python has a compile time, I wanted to say
something like "compile time checks apart from syntax".

Well, if you try to reassign __debug__ or None you get a SyntaxError,
but I don't think it is truly checking syntax.
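
This can be observed from Python itself with the builtin `compile`. The helper name `compiles_cleanly` is hypothetical, and the sketch assumes Python 3.

```python
def compiles_cleanly(source):
    """Return True if source compiles, False if the compiler raises SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


ordinary = compiles_cleanly("x = 1")
assign_none = compiles_cleanly("None = 1")
assign_debug = compiles_cleanly("__debug__ = 1")
```

The ordinary assignment compiles, while both special assignments are rejected at compile time, before any code runs.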

Bruno Desthuilliers · Jul 5, 2007

Paul Rubin wrote:

Why do you say that? By metrics and anecdotal evidence, Haskell code
appears to be at least as compact as Python code.

Haskell - as other languages using type-inference like OCaml - are in a
different category. Yes, I know, don't say it, they are statically typed
- but it's mostly structural typing, not declarative typing. Which makes
them much more usable IMHO. It's too bad they are not more widely adopted.

I suppose that includes the thing I'm currently working on. For
some other stuff I've done, such errors would have caused huge hassles,
lost customer money, etc.

Still, static typechecking is not a guarantee against runtime errors. Nor
against logical errors.

I know that ML gives compiler warning messages if you have a pattern
match (sort of a variant of a case statement, not a regexp match)

I know what pattern matching is, I did play a bit with OCaml and Haskell.

which is non-exhaustive. And Haskell's Maybe monad is part of an
idiom that handles failing computations (like regexp matches) much
more gracefully than Python can. Both of those would help this
situation.

I'd have to see a concrete use case. And I'd need much more real-world
experience with some ML variant, but this is not something I can expect
to happen in a near future - it's difficult enough to convince PHBs that
Python is fine.

Bruno Desthuilliers · Jul 5, 2007

Paul Rubin wrote:

Some users in fact recommend writing an explicit type signature for
every Haskell function, which functions sort of like a unit test.

Stop here. explicit type signature == declarative static typing != unit
test.

That doesn't bloat the code up noticeably. The conciseness of those
languages comes more from polymorphism and convenient ways of writing
and using higher-order functions, than from type inference.

Type inference is certainly helpful for genericity.

Right, however the reality is it does seem to prevent a lot of
surprises.

I have few "surprises" with typing in Python. Very few. Compared to the
flexibility and simplicity gained from a dynamism that couldn't work
with static typing - even using type inference -, I don't see it as such
a wonderful gain. At least in my day to day work.

Monad Reader #7 has an article about some Wall street company using ML:

http://www.haskell.org/sitewiki/images/0/03/TMR-Issue7.pdf

see the article by Yaron Minsky.

Sorry, I don't live near Wall Street !-)

Nis Jørgensen · Jul 5, 2007

Bruno Desthuilliers wrote:

Paul Rubin wrote:

Stop here. explicit type signature == declarative static typing != unit
test.

Well, it quacks like a duck ...

Nis

Steve Holden · Jul 5, 2007

Paul said:
I think even enforced static types wouldn't cure what I see as the
looseness in Python. There is not enough composability of small
snippets of code. For example, the "for" statement clobbers its index
variable and then leaks it to the outside of the loop. That may be
more of a culprit than dynamic types. Perl and C++ both fix this with
syntax like

for my $i (...) ... (Perl) or
for (int i = 0; i < n; i++) ... (C++, Java)

making a temporary scope for the index variable. Python even leaks
the index variable of list comprehensions (I've mostly stopped using
them because of this), though that's a recognized wart and is due to
be fixed.

Wow, you really take non-pollution of the namespace seriously. I agree
it's a wart, but it's one I have happily worked around since day one.

Python would be helped a lot by some way to introduce temporary
scopes. We had some discussion of this recently, concluding that
there should be a compiler warning message if a variable in a
temporary scope shadows one from a surrounding scope.

Yeah, well the approach to scoping could really be improved, but if you
keep your namespaces small it's not difficult to keep things straight. I
have always been rather guarded in my approach to accessing non-local
scopes because the coupling is rather less than obvious, and is subject
to variation due to non-local changes.

regards
Steve
