Optional Static Typing

Donn Cave

Quoth (e-mail address removed) (Alex Martelli):
....
| > He didn't dwell much on it, but there was some mention of type
| > inference, kind of as though that could be taken for granted.
| > I guess this would necessarily be much more limited in scope
| > than what Haskell et al. do.
|
| Assuming that by "he" you mean GvR, I think I saw that too, yes. And
| yes, a language and particularly a typesystem never designed to
| facilitate inferencing are hard-to-impossible to retrofit with it in as
| thorough a way as one that's designed around the idea. (Conversely,
| making a really modular system work with static typing and inferencing
| is probably impossible; in practice, the type inferencer must examine
| all code, or a rather copious summary of it... it can't really work
| module by module in a nice, fully encapsulated way...).

Well, I would assume that modules in a static system would present
a typed external interface, and inference would apply only within the
module being compiled.

For example, in Objective CAML revised syntax:

$ cat mod.ml
module T =
  struct
    type op = [On | Off];
    value print t a = match t with
      [ On -> print_string a | Off -> () ];
    value decide t a b = match t with
      [ On -> a | Off -> b ];
  end;

$ ocamlc -i -pp camlp4r mod.ml
module T :
  sig
    type op = [ On | Off ];
    value print : op -> string -> unit;
    value decide : op -> 'a -> 'a -> 'a;
  end;

This is fairly obvious, so I'm probably missing the point,
but the compiler here infers types and produces an interface
definition. The interface definition must be available to
any other modules that rely on this one, so they are relieved
of any need to examine code within this module.

There might be tricky spots, but I imagine the Objective CAML
folks would object to an assertion like "making a really modular
system work with static typing and inferencing is probably
impossible"!

Donn Cave, (e-mail address removed)
 
Luis M. Gonzalez

I don't understand why this discussion on optional static typing came
up right at this moment.
As far as I know, it has been discussed many times in the past, and
there even was a SIG that simply died... but it seems that it never was
something of much interest to python developers (that's my impression,
I might be wrong).

Now that the PyPy project is steadily advancing (and it is aimed at
making Python faster while keeping it dynamic and simple), why has this
topic been raised?

Also there's Starkiller, which deals with aggressive type inference and
compilation to native code. If these projects are serious and are well
headed, then I don't know why we are now talking about static typing.

Let's say that PyPy and/or Starkiller end up as successful projects: what
would be the advantage of having static typing in Python?
 
Alex Martelli

Donn Cave said:
| making a really modular system work with static typing and inferencing
| is probably impossible; in practice, the type inferencer must examine
| all code, or a rather copious summary of it... it can't really work
| module by module in a nice, fully encapsulated way...).

Well, I would assume that modules in a static system would present
a typed external interface, and inference would apply only within the
module being compiled. ...
There might be tricky spots, but I imagine the Objective CAML
folks would object to an assertion like "making a really modular
system work with static typing and inferencing is probably
impossible"!

It seems to me that you just restated in the first part I quoted what
you say in the second part OCAML folks would object to. If you give up
on type inferencing across modules, and only have type inferencing
inside each module, then you're getting little mileage out of the
inferencing if your modules are many and small.

But let me quote Van Roy and Haridi, rather than paraphrasing them, lest
I fail to do them justice -- I think quoting 8 lines out of a book of
over 800 pages is "fair use" (anecdote: my Xmas gift is said book, my
wife's a real-bargain Powerbook 15" Titanium laptop -- yesterday we
weighed them against each other and determined the book's heavier;-)...

"""
Dynamic typing makes it a trivial matter to do separate compilation,
i.e. modules can be compiled without knowing anything about each other.
This allows truly open programming, in which independently written
modules can come together at runtime and interact with each other. It
also makes program development scalable, i.e., extremely large programs
can be divided into modules that can be recompiled individually without
recompiling other modules. This is harder to do with static typing
because the type discipline must be enforced across module boundaries.
"""

I see that by paraphrasing and summarizing by heart I was changing their
argument a bit, from 'enforcing the type discipline' (which is what
they're discussing, and is obviously mandatory if you want to talk about
static typing) to 'type inferencing' (which they don't discuss in this
specific paragraph). Nor do they claim 'probably impossible', since
they're only discussing enforcement, not inferencing -- just 'harder'.

Essentially, to compile a module under static typing you need full type
information for other modules -- the "without knowing anything about
each other" condition of fully separate compilation can't hold, and thus
neither can the "truly open programming" and "scalable development"
consequences. Mind you, I personally _like_ the concept of describing
an interface separately, even in a different language (Corba's IDL, say)
that's specialized for the task. But it doesn't seem to be all that
popular... without such separation, modularity plus static checking
appears to imply bottom->up coding: you need to compile modules in some
topologically sorted order compatible with the "X uses Y" relation.
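
To make the "topologically sorted order" point concrete, here is a minimal Python sketch of picking a build order that respects an "X uses Y" relation. The module names and the `uses` mapping are invented for the example, and `graphlib` is assumed to be available (it is in the standard library from Python 3.9).

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical "X uses Y" relation: each module maps to the modules it uses.
uses = {
    "app": {"parser", "storage"},
    "parser": {"tokens"},
    "storage": {"tokens"},
    "tokens": set(),
}

def build_order(uses):
    """Return the modules so that each one comes after everything it uses."""
    return list(TopologicalSorter(uses).static_order())

order = build_order(uses)
print(order)

# A dependency cycle leaves no valid bottom-up order at all.
try:
    build_order({"a": {"b"}, "b": {"a"}})
except CycleError as exc:
    print("cycle:", exc.args[1])
```

Any order the sorter returns puts `tokens` first and `app` last; the relative position of `parser` and `storage` is not fixed, since neither uses the other.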


Alex
 
gabriele renzi

Mike Meyer wrote:
LISP has type declarations. Everybody I know doing production work in
LISP uses them. It's the only way to get reasonable performance out of
LISP compiled code.

I also think some Smalltalks allow you to tag stuff with type hints for
performance.

Which raises what, to me, is the central question. If we have optional
static typing, can I get a performance enhancement out of it? If not,
why bother?

For documentation and 'crash early' purposes, I'd say.
Btw, why don't we rip off the approach of CL and some Schemes that offer
optional typing? (Not that I understand how those work, anyway.)
 
moma

Adding Optional Static Typing to Python looks like a quite complex
thing, but useful too:
http://www.artima.com/weblogs/viewpost.jsp?thread=85551

I have just a couple of notes:

Boo (http://boo.codehaus.org/) is a different language, but I like its
"as" instead of ":" and "->", to have:
def min(a as iterable(T)) as T:
Instead of:
def min(a: iterable(T)) -> T:

I want to introduce a shorter syntax form:

Declare variables
a'int
a'int = 13
s'string = "Santana"
d'float

def min(a'int, b'int)'int:
    c'int # Declare a local variable c of type int
    c = a
    ...
*************************************

The (template) notation is very useful.
def min(a'T, b'T)'T:
    c'T
    c = a
    ....

f'float = min(1.2, 2.2)
i'int = min(9, f) ## of course: compiler adds int(f) type conversion
*************************************

But these 2 should be syntactically wrong. The type of T is not obvious.

def max(a'int, b'int)'T:
    ....
def max(a, b)'T:
    ....
*************************************

The big question is how to handle compound data types (container
objects): lists, tuples, maps...

Can a list contain various data types?

# Declare h as list of ints
h'int[] = [1, 8, 991]

# These declarations produce syntax errors
h'int = [1, 8, 991]
error: h is a scalar not container

h'int[] = ['abc', 13, (9,8)]
^^
error: expecting int value
*************************************

Tuples

A general sequence
t = 1, 3, 4,

A tuple of ints
t'int() = 1, 3, 4,

What about this?
u'int() = t, 6, 7,

Yes, it's OK, because the basic_scalar_values are ALL ints.
((1,3,4), 6,7)


Maps
......

*************************************

I think the compiler should allow typeless containers even if you compile
with the --strict option. Apply --strict (strictly) to scalar types only.

*************************************


class A:
    pass


def func1(h'A):
    # Expects (instance) A or any subclass of A
    ....
*************************************


// moma
http://www.futuredesktop.org/OpenOffice.html
 
Rahul

I am assuming that optional type checking is being added for easier
debugging only. So if 'expects' are turned on, Python raises
warnings (which do not halt the system), but not when they are turned
off. This will make debugging easier for new people while not
affecting masters. Also, perhaps, it will be easier to accommodate until
the type checking mechanism is perfected (if it is implemented at all,
that is), so that Python does not stop you when it is in fact Python which
might be making some mistake. (This last statement is a guess only...)

It is similar to assert and __debug__=1 in a way.

So the crux is:
1. Expects is only a bridge between type checking and dynamic typing.
2. Type checking is done only as a tool which you are free to override
if you want to.
3. The objective of type checking here is only to make debugging easier,
not speed/optimization.
4. The point is not that 'expects' be an additional keyword. You can go
like this also:
def (int a,int b): or whatever you like. Only that expects makes it a
bit clearer IMHO.
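
The warn-but-don't-halt behaviour described in points 1 to 3 can be approximated in today's Python with a decorator. This is only a sketch of the idea, not the proposed syntax: the `expects` decorator, the `CHECK_EXPECTS` switch and the `add` function are all hypothetical names made up for illustration.

```python
import functools
import inspect
import warnings

CHECK_EXPECTS = True  # hypothetical global switch, in the spirit of __debug__

def expects(**types):
    """Warn (never raise) when an argument is not an instance of the expected type."""
    def decorate(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if CHECK_EXPECTS:
                bound = sig.bind(*args, **kwargs)
                for name, expected in types.items():
                    if name not in bound.arguments:
                        continue
                    value = bound.arguments[name]
                    if not isinstance(value, expected):
                        warnings.warn(
                            f"{func.__name__}: {name} expected {expected.__name__}, "
                            f"got {type(value).__name__}",
                            stacklevel=2,
                        )
            # The call always goes ahead: the check is advisory only.
            return func(*args, **kwargs)
        return wrapper
    return decorate

@expects(a=int, b=int)
def add(a, b):
    return a + b

print(add(1, 2))
```

Calling `add(1.5, 2)` still returns a result, but a warning about `a` is issued first; setting `CHECK_EXPECTS = False` silences the check entirely.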

sincerely.,
rahul
 
Mike Meyer

Mind you, I personally _like_ the concept of describing
an interface separately, even in a different language (Corba's IDL, say)
that's specialized for the task. But it doesn't seem to be all that
popular... without such separation, modularity plus static checking
appears to imply bottom->up coding: you need to compile modules in some
topologically sorted order compatible with the "X uses Y" relation.

Personally, I hate declaring the interface separately, whether in the
same language or another language. On the other hand, generating the
interface information from the code for type checking (and
documentation) purposes makes an incredible amount of sense. A type
inferencing engine to generate that information from Python code -
with possibly a bit of human assistance - would make it possible to
use pychecker to catch duck typing errors without having to import an
entire module.
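
As a rough illustration of extracting interface information without importing anything, the sketch below reads source text with the standard `ast` module and lists function and method names with their positional parameters. It does no type inference at all, which is the hard part being discussed; the `SOURCE` string and the `interface` helper are invented for the example.

```python
import ast

# Hypothetical module source; it is parsed, never imported or executed.
SOURCE = '''
def decide(flag, a, b):
    return a if flag else b

class Shape:
    def area(self, scale):
        return 0
'''

def interface(source):
    """List 'name(params)' for each function and method found in the source."""
    found = []

    def visit(body, prefix):
        for node in body:
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                params = ", ".join(arg.arg for arg in node.args.args)
                found.append(f"{prefix}{node.name}({params})")
            elif isinstance(node, ast.ClassDef):
                # Recurse into the class body to pick up its methods.
                visit(node.body, f"{prefix}{node.name}.")

    visit(ast.parse(source).body, "")
    return found

for line in interface(SOURCE):
    print(line)
```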

<mike
 
Robert Kern

Luis said:
I don't understand why this discussion on optional static typing came
up right at this moment.
As far as I know, it has been discussed many times in the past, and
there even was a SIG that simply died... but it seems that it never was
something of much interest to python developers (that's my impression,
I might be wrong).

Now that the PyPy project is steadily advancing (and it is aimed at
making Python faster while keeping it dynamic and simple), why has this
topic been raised?

Also there's Starkiller, which deals with aggressive type inference and
compilation to native code. If these projects are serious and are well
headed, then I don't know why we are now talking about static typing.

Let's say that PyPy and/or Starkiller end up as successful projects: what
would be the advantage of having static typing in Python?

Starkiller would *love* type declarations. In Michael Salib's words (to
my recollection), "every piece of type information helps." I'm sure that
PyPy's code generator(s) could use this information to good effect, too.

Automatic type inferencing is great, but sometimes the inference is
"object". Being able to supply more information about types helps
Starkiller keep the inferences tight and specific.

--
Robert Kern
(e-mail address removed)

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
 
Alex Martelli

Mike Meyer said:
Personally, I hate declaring the interface separately, whether in the
same language or another language.

Yeah, so do many people -- as I mentioned, it's not all that popular.
Personally, I don't see _why_, but there's no doubt that the market is
speaking quite loudly in this respect; the concept of using IDL is seen
as a handicap of (e.g.) Corba and pure COM (while I agree with Don Box's
lavish _praise_ for that concept in a COM context!), and the concept of
not having an interface separate from the implementation is touted as a
plus of (e.g.) Java vs C++.

But then, the above criticism applies: if interface and implementation
of a module are tightly coupled, you can't really do fully modular
programming AND static typing (forget type inferencing...).

On the other hand, generating the
interface information from the code for type checking (and
documentation) purposes makes an incredible amount of sense. A type
inferencing engine to generate that information from Python code -
with possibly a bit of human assistance - would make it possible to
use pychecker to catch duck typing errors without having to import an
entire module.

And why should it be a problem to "import an entire module", pray? I
just fail to see any BIG advantage here. Sure, you could extend pydoc
to generate a bit more docs than it already does -- big deal. How much
would that augment your overall productivity? 5%? 10%? Some tiny
incremental amount, anyway. If it prompts you to do less unit testing,
it might even have a negative lifecycle-productivity impact;-).


Alex
 
Luis M. Gonzalez

Robert said:
Automatic type inferencing is great, but sometimes the inference is
"object". Being able to supply more information about types helps
Starkiller keep the inferences tight and specific.

Hmm... I'm not an expert in this subject at all, but I think that when
the inference is "object", as you said, it is because the type couldn't
be inferred, so it defaults to object, which is the most general type of
all.
For example, this is what happens in Boo, which is the only language I
know (a little bit) that uses type inference.
 
Mike Meyer

Yeah, so do many people -- as I mentioned, it's not all that popular.
Personally, I don't see _why_

Because it violates the principle of "Once and only once". If the
interface is described in two places, that means it's possible to
update one and forget to update the other. That's not possible if the
interface is defined once and only once. I've forgotten to update my
IDL more than once.

But then, the above criticism applies: if interface and implementation
of a module are tightly coupled, you can't really do fully modular
programming AND static typing (forget type inferencing...).

I beg to differ. Eiffel manages to do this quite well. Then again,
every Eiffel environment comes with tools to extract the interface
information from the code. With SmartEiffel, it's a command called
"short". Doing "short CLASSNAME" is like doing "pydoc modulename",
except that it pulls routine headers and DbC expressions from the code,
and not just from comments.

And why should it be a problem to "import an entire module", pray?

It isn't. I was just thinking out loud. If you had a type inferencing
engine that created the type signatures of everything in a module,
that would let you do type checking without importing the module. It's
actually more interesting for compiled languages than for interpreted
ones.

<mike
 
Robert Kern

Luis said:
Hmm... I'm not an expert in this subject at all, but I think that when
the inference is "object", as you said, it is because the type couldn't
be inferred, so it defaults to object, which is the most general type of
all.
For example, this is what happens in Boo, which is the only language I
know (a little bit) that uses type inference.

Starkiller, at least, can deal with cases where a variable might be one
of a set of types, and it generates code for each type in the set. Explicit
type declarations can help keep these sets small and reduce the number of
times that Starkiller needs to fall back to PyObject_* calls.

--
Robert Kern
(e-mail address removed)

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
 
Donn Cave

Quoth Mike Meyer (e-mail address removed):
| (e-mail address removed) (Alex Martelli) writes:
....
|> But then, the above criticism applies: if interface and implementation
|> of a module are tightly coupled, you can't really do fully modular
|> programming AND static typing (forget type inferencing...).
|
| I beg to differ. Eiffel manages to do this quite well. Then again,
| every Eiffel environment comes with tools to extract the interface
| information from the code. With SmartEiffel, it's a command called
| "short". Doing "short CLASSNAME" is like doing "pydoc modulename",
| except that it pulls routine headers and DbC expression from the code,
| and not just from comments.

And you probably think Eiffel supports fully modular programming, as
I thought Objective CAML did. But Alex seems not to agree.

The way I understand it, his criteria go beyond language level semantics
to implementation details, like whether a change to a module may require
dependent modules to be recompiled when they don't need to be rewritten.
I don't know whether it's a reasonable standard, but at any rate hopefully
he will explain it better than I did and you can decide for yourself whether
it's an important one.

Donn Cave, (e-mail address removed)
 
Alex Martelli

Donn Cave said:
And you probably think Eiffel supports fully modular programming, as
I thought Objective CAML did. But Alex seems not to agree.

Rather, I would say it's Dr Van Roy and Dr Haridi who do not agree;
their definition of "truly open programming" being quite strict, based
on modules "not having to know anything about each other". BTW, while I
have not looked into Alice yet, some people posting on this thread
apparently have -- I know the Alice whitepaper claims Alice adds to ML
just what is needed to support "truly open programming" -- features that
OCAML doesn't have, so, if the Alice researchers are right, your
assessment of OCAML is wrong; OTOH, since Alice _is_ statically typed
like any other ML dialect, they'd appear to rebut Van Roy and Haridi's
contention, too. VR & H do mention Alice at the end of their pages
about static vs dynamic typing but they don't appear to acknowledge the
claim. Maybe it all boils down to both Oz and Alice being still open
research efforts, making it difficult to assess fully what results they
can eventually achieve.

The way I understand it, his criteria go beyond language level semantics
to implementation details, like whether a change to a module may require
dependent modules to be recompiled when they don't need to be rewritten.

Ah yes, definitely: Lakos' "Large Scale C++ Software Design" was the
first place where I met a thorough exposition of the evil effects that
come from modules "just needing to be recompiled, not rewritten" as soon
as the number of modules becomes interestingly large, at least with the
kind of dependency graphs that naturally emerge when designers are
focused on all other important issues of software design rather than
dependency control.

Basically, what emerges from Lakos' analysis, and gets formalized in
Robert Martin's precious "dependency inversion principle", is the
equivalent of the interface/implementation separation that you always
get e.g. in Corba, by writing the interface in IDL and the
implementation in whatever language you wish: both implementation and
client modules depend on an abstract-interface module, so the dependency
arrows go the "right" way (concrete depends on abstract) and the cycle
is more tractable. But if you proceed by extracting the abstract
interface from a concrete implementation, that's a dependency too
(albeit for a tool that's not the compiler), and it's the wrong way
around... abstract depends on concrete. (Main hope being that the
abstract part changes VERY rarely -- when it DOES change, the rebuild
cost is still potentially out of control).

You may use a separate language to express the interface (IDL,
whatever), you may use a subset of the implementation language with
strict constraints, but one way or another, if you want to control your
dependency graph and avoid the ills Lakos points out so clearly,
dependency inversion is an indispensable (perhaps not sufficient) tool.
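
A small Python sketch of that arrangement, with the abstract interface written as a subset of the implementation language rather than in IDL. The three "modules" are collapsed into one file here for brevity, and `Storage`, `MemoryStorage` and `remember_greeting` are hypothetical names; the point is only which way the dependency arrows go.

```python
from abc import ABC, abstractmethod

# --- abstract interface "module": changes rarely, depends on nothing ---
class Storage(ABC):
    @abstractmethod
    def save(self, key: str, value: str) -> None: ...

    @abstractmethod
    def load(self, key: str) -> str: ...

# --- concrete implementation "module": depends only on the interface ---
class MemoryStorage(Storage):
    def __init__(self):
        self._data = {}

    def save(self, key, value):
        self._data[key] = value

    def load(self, key):
        return self._data[key]

# --- client "module": also depends only on the interface ---
def remember_greeting(store: Storage, name: str) -> str:
    store.save(name, f"hello, {name}")
    return store.load(name)

print(remember_greeting(MemoryStorage(), "Alex"))
```

Swapping `MemoryStorage` for another implementation touches neither `Storage` nor the client; only a change to `Storage` itself ripples outward.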

Mike points out this breaks "once and only once", but then that same
principle is broken in any language where you express the type of a
variable twice -- in declaring AND in using it -- as is typical of most
statically-typed languages (even where the language does not mandate the
redundancy, as in *ML or Haskell, typical style in such languages is to
have the redundancy anyway; and AFAIK Eiffel _does_ mandate the
redundancy, just like C++ or Java do).

Dynamic languages have dependencies too, maybe not "reified" but
conceptually, _operationally_ there. For example, if you consider the
contract (as in DbC) to be part of a type, that's how Eiffel works: it
diagnoses the type violation (breach of contract) dynamically, at
runtime. And that's how Oz, Python, Ruby, Erlang, etc etc, work too:
no type (signature, contract, ...) violation along any dependency arrow
need pass silently, they're diagnosed dynamically just like DbC
violations in Eiffel or more generally violations of other constraints
which a given language chooses not to consider "type-related" (e.g., if
the positive-reals are a separate type from general reals, sqrt(x) with
x<0 is a type violation -- if there's only a general reals type, it's
not -- hopefully, it's diagnosed at runtime, though).

Robert Martin's "Dynamic Visitor" design pattern is, I believe, an
instructive case. The classic "Visitor", per se, has intractable
dependency problems and cannot possibly respect fully the open/closed
principle; Dynamic Visitor uses the escapes to dynamic typing allowed by
such tools as C++'s dynamic_cast (and Java's own casts, which work quite
similarly) to ensure a sane dependency structure. If you don't have or
don't allow any escape from static typing, Visitor is somewhat of a
nightmare DP as soon as the kinds of visitors and visitees start
multiplying -- at the very least it's a build-time nightmare, even if
your language has tricks to save the need to change the code. Martin
does a much more thorough job of exploring these issues and I'll point
to his essays rather than trying to expand this summary further.
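
For readers who want the shape of the idea, here is a rough Python sketch, assuming "Dynamic Visitor" means the variant Martin describes elsewhere as Acyclic Visitor. An `isinstance` check stands in for C++'s dynamic_cast, and all class names are invented. In Python the build-time argument does not apply, so this only shows the dependency structure.

```python
class Visitor:
    """Degenerate base class: it names no element types at all."""

class CircleVisitor(Visitor):
    def visit_circle(self, circle):
        raise NotImplementedError

class SquareVisitor(Visitor):
    def visit_square(self, square):
        raise NotImplementedError

class Circle:
    def __init__(self, radius):
        self.radius = radius

    def accept(self, visitor):
        # Runtime check, playing the role of dynamic_cast.
        if isinstance(visitor, CircleVisitor):
            return visitor.visit_circle(self)
        return None  # this visitor does not handle circles

class Square:
    def __init__(self, side):
        self.side = side

    def accept(self, visitor):
        if isinstance(visitor, SquareVisitor):
            return visitor.visit_square(self)
        return None  # this visitor does not handle squares

class AreaOfSquares(SquareVisitor):
    """Knows about squares only; adding new shapes never touches it."""
    def visit_square(self, square):
        return square.side ** 2

shapes = [Circle(1.0), Square(3)]
areas = [shape.accept(AreaOfSquares()) for shape in shapes]
print(areas)
```

Each element depends on one small visitor interface, and no class has to list every kind of visitee, which is what breaks the cycle in the classic Visitor.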

I don't know whether it's a reasonable standard, but at any rate hopefully
he will explain it better than I did and you can decide for yourself whether
it's an important one.

If you're building reasonably small systems, so that "recompiling the
world" is no big deal, dependency control may be considered a marginal
consideration -- some things such as dependency cycles you probably want
to eradicate anyway (so Visitor can still be thought of as nasty;-), but
there are no doubt many more important software development issues to
deal with than dependency control. But build times suffer from typical
combinatorial explosions with the growth of number of modules and
complications in the dependency graphs, so, if you're building large
systems, this isn't the kind of efficiency problem that goes away in two
or three years thanks to Moore's Law.


Alex
 
Luis M. Gonzalez

Robert said:
Starkiller, at least, can deal with cases where a variable might be one
of a set of types, and it generates code for each type in the set. Explicit
type declarations can help keep these sets small and reduce the number of
times that Starkiller needs to fall back to PyObject_* calls.

Will we be able to see it anytime soon?
I'm eagerly waiting for its release.
 
Mike Meyer

Donn Cave said:
Quoth Mike Meyer (e-mail address removed):
| (e-mail address removed) (Alex Martelli) writes:
...
|> But then, the above criticism applies: if interface and implementation
|> of a module are tightly coupled, you can't really do fully modular
|> programming AND static typing (forget type inferencing...).
|
| I beg to differ. Eiffel manages to do this quite well. Then again,
| every Eiffel environment comes with tools to extract the interface
| information from the code. With SmartEiffel, it's a command called
| "short". Doing "short CLASSNAME" is like doing "pydoc modulename",
| except that it pulls routine headers and DbC expression from the code,
| and not just from comments.

And you probably think Eiffel supports fully modular programming, as
I thought Objective CAML did. But Alex seems not to agree.

The way I understand it, his criteria go beyond language level semantics
to implementation details, like whether a change to a module may require
dependent modules to be recompiled when they don't need to be rewritten.
I don't know whether it's a reasonable standard, but at any rate hopefully
he will explain it better than I did and you can decide for yourself whether
it's an important one.

I read through his explanation. And the answer for Eiffel is, of
course, "it depends".

There's an optimization that embeds a class's data directly in the
client class - the expanded keyword. If you have an expanded variable
or type in your client class, then changing the implementation of the
provider may require recompilation of the client. On the other hand,
that's pretty much an optimization, and so you shouldn't run into it
during development.

SmartEiffel, on the other hand, always does full-source analysis. It
drops class features that aren't used by any client, so changing the
client can cause the provider to be recompiled.

<mike
 
Michael Hobbs

Rahul said:
I am assuming that optional type checking is being added for easier
debugging only. So if 'expects' are turned on, Python raises
warnings (which do not halt the system), but not when they are turned
off. This will make debugging easier for new people while not
affecting masters. Also, perhaps, it will be easier to accommodate until
the type checking mechanism is perfected (if it is implemented at all,
that is), so that Python does not stop you when it is in fact Python which
might be making some mistake. (This last statement is a guess only...)

It is similar to assert and __debug__=1 in a way.

So the crux is:
1. Expects is only a bridge between type checking and dynamic typing.
2. Type checking is done only as a tool which you are free to override
if you want to.
3. The objective of type checking here is only to make debugging easier,
not speed/optimization.
4. The point is not that 'expects' be an additional keyword. You can go
like this also:
def (int a,int b): or whatever you like. Only that expects makes it a
bit clearer IMHO.

sincerely.,
rahul

Your proposition reminds me very much of Design by Contract, which is
a prominent feature of the Eiffel programming language. Considering
that Python is an interpreted language where type checking would
naturally occur at runtime, I think Design by Contract would be more
appropriate than static typing.

In a function's contract, not only could it state that its parameter
must be an integer, but also that it must be > 50 and be divisible by
7. If a value is passed to the function that violates the contract,
it raises an exception.

In Eiffel, contract checking can be turned on or off based on a
compiler flag or a runtime switch.
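
A minimal Python sketch of such a precondition, using the "integer, > 50, divisible by 7" example from above. This is not how Eiffel implements contracts; the `require` decorator, the `CONTRACTS_ENABLED` switch, `ContractError` and `weeks` are all hypothetical names chosen for illustration.

```python
import functools

CONTRACTS_ENABLED = True  # hypothetical runtime switch

class ContractError(Exception):
    """Raised when a caller violates a function's precondition."""

def require(predicate, message):
    """Attach a precondition that is checked on every call while enabled."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if CONTRACTS_ENABLED and not predicate(*args, **kwargs):
                raise ContractError(f"{func.__name__}: {message}")
            return func(*args, **kwargs)
        return wrapper
    return decorate

@require(lambda n: isinstance(n, int) and n > 50 and n % 7 == 0,
         "n must be an int greater than 50 and divisible by 7")
def weeks(n):
    return n // 7

print(weeks(56))
```

With the switch on, `weeks(49)` and `weeks(57)` raise `ContractError` before the body runs; with it off, the function is called unchecked.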

- Mike
 
