What is Expressiveness in a Computer Language

David Hopwood · Jun 21, 2006

Pascal said:
Yes, but the question remains how a static type system can deal with
this kind of update.

It's not difficult in principle:

- for each class [*], define a function which converts an 'old' value of
that class to a 'new' value (the ability to do this is necessary anyway
to support some kinds of upgrade). A default conversion function may be
autogenerated if the class definition has changed only in minor ways.

- typecheck the new program and the conversion functions, using the old
type definitions for the argument of each conversion function, and the
new type definitions for its result.

- have the debugger apply the conversions to all values, and then resume
the program.

[*] or nearest equivalent in a non-OO language.
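
The three steps above can be sketched roughly as follows. This is only an illustration in Python, not something from the thread: the class names PointV1 and PointV2, the function upgrade_point and the CONVERSIONS table are all hypothetical, and the list named heap merely stands in for "all values" that a debugger would visit. Python does not check the annotations itself; an external checker would be needed for the typechecking step.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class PointV1:
    """'Old' class definition (hypothetical)."""
    x: float
    y: float


@dataclass
class PointV2:
    """'New' class definition: gains a z field (hypothetical)."""
    x: float
    y: float
    z: float


def upgrade_point(old: PointV1) -> PointV2:
    """Conversion function: the argument is typed with the old definition,
    the result with the new one."""
    return PointV2(old.x, old.y, 0.0)


# One conversion function per changed class (assumed registry).
CONVERSIONS: Dict[type, Callable] = {PointV1: upgrade_point}


def upgrade_all(live_values: List[object]) -> List[object]:
    """Stand-in for the debugger step: convert every live value whose class
    has a registered conversion and leave the rest untouched."""
    return [CONVERSIONS.get(type(v), lambda same: same)(v) for v in live_values]


heap = [PointV1(1.0, 2.0), "unrelated value", PointV1(3.0, 4.0)]
heap = upgrade_all(heap)
print(heap)
```

The default z = 0.0 plays the role of an autogenerated conversion for a minor change; a larger change would need a hand-written function.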

David Hopwood · Jun 21, 2006

genea said:
[...] NOW that being said, I think
that the reason I like Haskell, a very strongly typed language, is that
because of its type system, the language is able to do things like
lazy evaluation, [...]

Lazy evaluation does not depend on, nor is it particularly helped by
static typing (assuming that's what you mean by "strongly typed" here).

An example of a non-statically-typed language that supports lazy evaluation
is Oz. (Lazy functions are explicitly declared in Oz, as opposed to Haskell's
implicit lazy evaluation, but that's not because of the difference in type
systems.)
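
As a rough illustration of explicitly declared laziness without static typing, here is a sketch in Python rather than Oz. The Lazy class and its force method are hypothetical names chosen for this example; they are not Oz syntax or a standard library API.

```python
class Lazy:
    """Explicitly declared lazy value: the computation runs at most once,
    and only when the value is first demanded."""

    _UNSET = object()

    def __init__(self, compute):
        self._compute = compute
        self._value = Lazy._UNSET

    def force(self):
        if self._value is Lazy._UNSET:
            self._value = self._compute()
            self._compute = None  # drop the closure once evaluated
        return self._value


calls = []


def expensive():
    calls.append("ran")
    return 6 * 7


answer = Lazy(expensive)   # nothing has been evaluated at this point
first = answer.force()     # evaluation happens here
second = answer.force()    # cached result, no second evaluation
print(first, second, calls)
```

Nothing in the sketch relies on static types: the deferral is carried entirely by the closure and the cache.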

Anton van Straaten · Jun 21, 2006

Marshall said:
Joe said:

They *do* have a related meaning. Consider this code fragment:
(car "a string")
[...]
Both `static typing' and `dynamic typing' (in the colloquial sense) are
strategies to detect this sort of error.

The thing is though, that putting it that way makes it seem as
if the two approaches are doing the same exact thing, but
just at different times: runtime vs. compile time. But they're
not the same thing. Passing the static check at compile
time is universally quantifying the absence of the class
of error; passing the dynamic check at runtime is existentially
quantifying the absence of the error. A further difference is
the fact that in the dynamically typed language, the error is
found during the evaluation of the expression; in a statically
typed language, errors are found without attempting to evaluate
the expression.

I find everything about the differences between static and
dynamic to be frustratingly complex and subtle.

Let me add another complex subtlety, then: the above description misses
an important point, which is that *automated* type checking is not the
whole story. I.e. that compile time/runtime distinction is a kind of
red herring.

In fact, automated type checking is orthogonal to the question of the
existence of types. It's perfectly possible to write fully typed
programs in a (good) dynamically-checked language.

In a statically-checked language, people tend to confuse automated
static checking with the existence of types, because they're thinking in
a strictly formal sense: they're restricting their world view to what
they see "within" the language.

Then they look at programs in a dynamically-checked language, and see
checks happening at runtime, and they assume that this means that the
program is "untyped".

It's certainly close enough to say that the *language* is untyped. One
could also say that a program, as seen by the language, is untyped.

But a program as seen by the programmer has types: the programmer
performs (static) type inference when reasoning about the program, and
debugs those inferences when debugging the program, finally ending up
with a program which has a perfectly good type scheme. It may be
messy compared to say an HM type scheme, and it's usually not proved to
be perfect, but that again is an orthogonal issue.

Mathematicians operated for thousands of years without automated
checking of proofs, so you can't argue that because a
dynamically-checked program hasn't had its type scheme proved correct,
that it somehow doesn't have types. That would be a bit like arguing
that we didn't have Math until automated theorem provers came along.

These observations affect the battle over terminology in various ways.
I'll enumerate a few.

1. "Untyped" is really quite a misleading term, unless you're talking
about something like the untyped lambda calculus. That, I will agree,
can reasonably be called untyped.

2. "Type-free" as suggested by Chris Smith is equally misleading. It's
only correct in a relative sense, in a narrow formal domain which
ignores the process of reasoning about types which is inevitably
performed by human programmers, in any language.

3. A really natural term to refer to types which programmers reason
about, even if they are not statically checked, is "latent types". It
captures the situation very well intuitively, and it has plenty of
precedent -- e.g. it's mentioned in the Scheme reports, R5RS and its
predecessors, going back at least a decade or so (haven't dug to check
when it first appeared).

4. Type theorists like to say that "universal" types can be used in a
statically-typed language to subsume "dynamic types". Those theorists
are right: the term "dynamic type", with its inextricable association
with runtime checks, definitely gets in the way here. It might be
enlightening to rephrase this: what's really happening is that universal
types allow you to embed a latently-typed program in a
statically-checked language. The latent types don't go anywhere,
they're still latent in the program with universal types. The program's
statically-checked type scheme doesn't capture the latent types.
Describing it in these terms clarifies what's actually happening.

5. Dynamic checks are only part of the mechanism used to verify latent
types. They shouldn't be focused on as being the primary equivalent to
static checks. The closest equivalent to the static checks is a
combination of human reasoning and testing, in which dynamic checks play
an important but ultimately not a fundamental part. You could debug a
program and get the type scheme correct without dynamic checks; it would
just be more difficult.

So, will y'all just switch from using "dynamically typed" to "latently
typed", and stop talking about any real programs in real programming
languages as being "untyped" or "type-free", unless you really are
talking about situations in which human reasoning doesn't come into
play? I think you'll find it'll help to reason more clearly about this
whole issue.

Thanks for your cooperation!!

Anton

Pascal Costanza · Jun 21, 2006

David said:
Pascal said:

Yes, but the question remains how a static type system can deal with
this kind of update.

It's not difficult in principle:

- for each class [*], define a function which converts an 'old' value of
that class to a 'new' value (the ability to do this is necessary anyway
to support some kinds of upgrade). A default conversion function may be
autogenerated if the class definition has changed only in minor ways.

Yep, this is more or less exactly how CLOS does it. (The conversion
function is called update-instance-for-redefined-class, and you can
provide your own methods on it.)

- typecheck the new program and the conversion functions, using the old
type definitions for the argument of each conversion function, and the
new type definitions for its result.

The problem here is: The program is already executing, so this typecheck
isn't performed at compile-time, in the strict sense of the word (i.e.,
before the program is deployed). It may still be a syntactic analysis,
but you don't get the kind of guarantees anymore that you typically
expect from a static type checker _before_ the program is started in the
first place.

(It's really important to understand that the idea is to use this for
deployed programs - albeit hopefully in a more structured fashion - and
not only for debugging. The example I have given is an extreme one that
you would probably not use as such in a "real-world" setting, but it
shows that there is a boundary beyond which static type systems cannot
be used in a meaningful way anymore, at least as far as I can tell.)

- have the debugger apply the conversions to all values, and then resume
the program.

In CLOS, this conversion is defined as part of the language proper, but
this is mostly because Common Lisp doesn't make a sharp distinction
between debugging capabilities and "regular" language features. (I think
it's a good thing that there is no strong barrier against having
debugging capabilities in a deployed program.)

[*] or nearest equivalent in a non-OO language.

Pascal

Andreas Rossberg · Jun 21, 2006

David said:
Oh, but it *does* make sense to talk about dynamic tagging in a statically
typed language.

It even makes perfect sense to talk about dynamic typing in a statically
typed language - but keeping the terminology straight, this rather
refers to something like what is described in the well-known paper of the same
title (and its numerous follow-ups):

Martin Abadi, Luca Cardelli, Benjamin Pierce, Gordon Plotkin
Dynamic typing in a statically-typed language.
Proc. 16th Symposium on Principles of Programming Languages, 1989
/ TOPLAS 13(2), 1991

Note how this is totally different from simple tagging, because it deals
with real types at runtime.

- Andreas

Torben Ægidius Mogensen · Jun 21, 2006

Rob Thorpe said:
Hmm. You're right, ML is nowhere in my definition since it has no
variables.

That's not true. ML has variables in the mathematical sense of
variables -- symbols that can be associated with different values at
different times. What it doesn't have is mutable variables (though it
can get the effect of those by having variables be immutable
references to mutable memory locations).

What Andreas was alluding to was presumably FP-style languages where
functions or relations are built by composing functions or relations
without ever naming values.

Torben

Chris Uppal · Jun 21, 2006

Darren New wrote:

[me:]

Interestingly, Ada defines a type as a collection of values. It works
quite well, when one consistently applies the definition.

I have never been very happy with relating type to sets of values (objects,
whatever). I'm not saying that it's formally wrong (but see below), but it
doesn't fit with my intuitions very well -- most noticeably in that the sets
are generally unbounded so you have to ask where the (intensional) definitions
come from.

Two other notions of what "type" means might be interesting; both come from
attempts to create type-inference mechanisms for Smalltalk or related
languages. Clearly one can't use the set-of-values approach for these purposes
;-) One approach takes "type" to mean "set of classes"; the other takes a
finer-grained approach and takes it to mean "set of selectors" (where
"selector" is Smalltalk for "name of a method" -- or, more accurately, name of
a message).

But I would rather leave the question of what a type "is" open, and consider
that to be merely part of the type system. For instance the hypothetical
nullability analysis type system I mentioned might have only three types
NULLABLE, ALWAYSNULL, and NEVERNULL.
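
To make the idea a little more concrete, here is a minimal sketch of what such a three-type system might look like. Only the three type names come from the post; the join rule, the helper names (join, of_literal, may_dereference) and the choice of Python are assumptions made for illustration.

```python
from enum import Enum


class Nullness(Enum):
    ALWAYSNULL = "always null"
    NEVERNULL = "never null"
    NULLABLE = "may or may not be null"


def join(a: Nullness, b: Nullness) -> Nullness:
    """Combine what is known about a variable on two control-flow paths.
    Assumed rule: agreement is kept, any disagreement yields NULLABLE."""
    return a if a is b else Nullness.NULLABLE


def of_literal(value) -> Nullness:
    """Type of a literal in this toy system."""
    return Nullness.ALWAYSNULL if value is None else Nullness.NEVERNULL


def may_dereference(n: Nullness) -> bool:
    """Only NEVERNULL values are safe to dereference without a check."""
    return n is Nullness.NEVERNULL


# x = None on one branch of an if, x = "text" on the other
x_after_if = join(of_literal(None), of_literal("text"))
print(x_after_if, may_dereference(x_after_if))
```

Note that nothing here says what a type "is" beyond membership in this small system, which is the point being made above.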

It's worth noting, too, that (in some sense) the type of an object can change
over time[*]. That can be handled readily (if not perfectly) in the informal
internal type system(s) which programmers run in their heads (pace the very
sensible post by Anton van Straaten today in this thread -- several branches
away), but cannot be handled by a type system based on sets-of-values (and is
also a counter-example to the idea that "the" dynamic type of an object/value
can be identified with its tag).

([*] if the set of operations in which it can legitimately partake changes.
That can happen explicitly in Smalltalk (using DNU proxies for instance if the
proxied object changes, or even using #becomeA:), but can happen anyway in less
"free" languages -- the State Pattern for instance, or even (arguably) in the
difference between an empty list and a non-empty list).

-- chris

Chris Uppal · Jun 21, 2006

Anton said:
But a program as seen by the programmer has types: the programmer
performs (static) type inference when reasoning about the program, and
debugs those inferences when debugging the program, finally ending up
with a program which has a perfectly good type scheme. It may be
messy compared to say an HM type scheme, and it's usually not proved to
be perfect, but that again is an orthogonal issue.

I like this way of looking at it.

-- chris

Chris Uppal · Jun 21, 2006

David said:
When people talk
about "types" being associated with values in a "latently typed" or
"dynamically typed" language, they really mean *tag*, not type.

I don't think that's true. Maybe /some/ people do confuse the two, but I am
certainly a counter-example ;-)

The tag (if any) is part of the runtime machinery (or, if not, then I don't
understand what you mean by the word), and while that is certainly a reasonable
approximation to the type of the object/value, it is only an approximation,
and -- what's more -- is only an approximation to the type as yielded by one
specific (albeit abstract, maybe even hypothetical) type system.

If I send #someMessage to a proxy object which has not had its referent set
(and assuming the default value, presumably some variant of nil, does not
understand #someMessage), then that's just as much a type error as sending
#someMessage to a variable holding a nil value. If I then assign the referent
of the proxy to some object which does understand #someMessage, then it is not
a type error to send #someMessage to the proxy. So the type has changed, but
nothing in the tag system of the language implementation has changed.
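
The proxy scenario can be approximated in Python as a rough sketch (not Smalltalk, and not any particular proxy library). The names Proxy, Greeter, set_referent and some_message are hypothetical; type(proxy) plays the role of the tag, and AttributeError plays the role of "does not understand".

```python
class Proxy:
    """Forwards messages it does not define itself to its referent."""

    def __init__(self):
        self._referent = None  # default referent: a nil-like value

    def set_referent(self, obj):
        self._referent = obj

    def __getattr__(self, name):
        # Invoked only when normal attribute lookup fails, i.e. for
        # forwarded messages. With no referent set this looks the name up
        # on None, which raises AttributeError.
        return getattr(self._referent, name)


class Greeter:
    def some_message(self):
        return "understood"


proxy = Proxy()
tag_before = type(proxy)
try:
    proxy.some_message()
    outcome_before = "understood"
except AttributeError:
    outcome_before = "does not understand"

proxy.set_referent(Greeter())
tag_after = type(proxy)
outcome_after = proxy.some_message()

print(outcome_before, outcome_after, tag_before is tag_after)
```

The set of messages the proxy can legitimately receive changes when the referent is assigned, while the class reported by the runtime stays the same.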

-- chris

Rob Thorpe · Jun 21, 2006

Matthias said:
You seem to be confusing "does not have a type" with "no type
information is passed at runtime".

Believe me, I have.

In a C compiler the compiler has no idea what the values are in the
program.
It knows only their type in that it knows the type of the variable they
are contained within.
Would you agree with that?

Which result are you getting? What does it mean to "make sense"?

Well the right one actually, bad example.

But, if I cast an unsigned int 2500000000 to signed I get -1794967296.

Andreas Rossberg · Jun 21, 2006

Rob said:
I think this should make it clear. If I have a "xyz" in lisp I know it
is a string.
If I have "xyz" in an untyped language like assembler it may be
anything, two pointers in binary, an integer, a bitfield. There is no
data at compile time or runtime to tell what it is, the programmer has
to remember.

You have to distinguish between values (at the level of language
semantics) and their low-level representation (at the implementation
level). In a high-level language, the latter should be completely
immaterial to the semantics, and hence not interesting for the discussion.

Hmm. You're right, ML is nowhere in my definition since it has no
variables.

Um, it has. Mind you, it has no /mutable/ variables, but that was not
even what I was talking about.

Well, is it even represented at compile time?
The compiler doesn't know in general what values will exist at runtime,
it knows only what types variables have. Sometimes it only has partial
knowledge and sometimes the programmer deliberately overrides it. From
what knowledge it has you could say it knows what types values will have.

Again, variables are insignificant. From the structure of an expression
the type system derives the type of the resulting value. An expression
may contain variables, and then the type system generally must know (or
be able to derive) their types too, but that's a separate issue. Most
values are anonymous. Nevertheless their types are known.
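
A toy sketch of what "deriving the type from the structure of an expression" can mean, written in Python for illustration only. The expression forms (Num, Str, Add, Length) and the function type_of are hypothetical and far simpler than any real type system; the point is just that the intermediate values are anonymous and nothing is evaluated.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Num:
    value: int


@dataclass
class Str:
    value: str


@dataclass
class Add:
    left: "Expr"
    right: "Expr"


@dataclass
class Length:
    arg: "Expr"


Expr = Union[Num, Str, Add, Length]


def type_of(e: Expr) -> str:
    """Derive a type from the structure of the expression alone."""
    if isinstance(e, Num):
        return "int"
    if isinstance(e, Str):
        return "string"
    if isinstance(e, Add):
        if type_of(e.left) == "int" and type_of(e.right) == "int":
            return "int"
        raise TypeError("Add expects int operands")
    if isinstance(e, Length):
        if type_of(e.arg) == "string":
            return "int"
        raise TypeError("Length expects a string operand")
    raise TypeError(f"unknown expression: {e!r}")


# No variables anywhere: every sub-value is anonymous, yet each has a type.
expr = Add(Num(1), Length(Str("xyz")))
derived = type_of(expr)
print(derived)
```

Adding variables would mean threading an environment of variable types through type_of, which is the "separate issue" mentioned above.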

Unfortunately it's often necessary to break static type systems.

You're definitely using the wrong static language then. ;-)

- Andreas

Chris Uppal · Jun 21, 2006

Chris said:
It would be interesting to see what a language designed specifically to
support user-defined, pluggable, and perhaps composable, type systems
would look like. [...]

You mean in terms of a practical programming language? If not, then
lambda calculus is used in precisely this way for the static sense of
types.

Good point. I was actually thinking about what a practical language might look
like, but -- hell -- why not start with theory for once ? ;-)

I think Marshall got this one right. The two are accomplishing
different things. In one case (the dynamic case) I am safeguarding
against negative consequences of the program behaving in certain non-
sensical ways. In the other (the static case) I am proving theorems
about the impossibility of this non-sensical behavior ever happening.

And so conflating the two notions of type (-checking) is a kind of category
error ? If so then I see what you mean, and it's a useful distinction, but am
unconvinced that it's /so/ helpful a perspective that I would want to exclude
other perspectives which /do/ see the two as more-or-less trivial variants on
the same underlying idea.

I acknowledge those questions. I believe they are valid. I don't know
the answers. As an intuitive judgement call, I tend to think that
knowing the correctness of these things is of considerable benefit to
software development, because it means that I don't have as much to
think about at any one point in time. I can validly make more
assumptions about my code and KNOW that they are correct. I don't have
to trace as many things back to their original source in a different
module of code, or hunt down as much documentation. I also, as a
practical matter, get development tools that are more powerful.

Agreed that these are all positive benefits of static declarative (more or
less) type systems.

But then (slightly tongue-in-cheek) shouldn't you be agitating for Java's type
system to be stripped out (we hardly /need/ it since the JVM does latent typing
anyway), leaving the field free for more powerful or more specialised static
analysis ?

(Whether it's possible to create the same for a dynamically typed
language is a potentially interesting discussion; but as a practical
matter, no matter what's possible, I still have better development tools
for Java than for JavaScript when I do my job.)

Acknowledged. Contrary-wise, I have better development tools in Smalltalk than
I ever expect to have in Java -- in part (only in part) because of the late
binding in Smalltalk and its lack of insistence on declared types from an
arbitrarily chosen type system.

On
the other hand, I do like proving theorems, which means I am interested
in type theory; if that type theory relates to programming, then that's
great! That's probably not the thing to say to ensure that my thoughts
are relevant to the software development "industry", but it's
nevertheless the truth.

Saying it will probably win you more friends in comp.lang.functional than it
loses in comp.lang.java.programmer ;-)

-- chris

Joachim Durchholz · Jun 21, 2006

Andreas said:
Um, it has. Mind you, it has no /mutable/ variables, but that was not
even what I was talking about.

Indeed. A (possibly nonexhaustive) list of program entities that (can)
have type would comprise mutable variables, immutable variables (i.e.
constants and parameter names), and functions resp. their results.

Regards,
Jo

Joachim Durchholz · Jun 21, 2006

Matthias said:
This was not meant to be a rigorous definition.

Rigorous or not, introducing additional undefined terms doesn't help
with explaining a term.

Also, I'm not going to repeat the textbook definitions for those
three standard terms here.

These terms certainly aren't standard for Perl, Python, Java, or Lisp,
and they aren't even standard for topics covered on comp.lang.functional
(which includes dynamically-typed languages after all).

Regards,
Jo

Joachim Durchholz · Jun 21, 2006

Pascal said:
(It's really important to understand that the idea is to use this for
deployed programs - albeit hopefully in a more structured fashion - and
not only for debugging. The example I have given is an extreme one that
you would probably not use as such in a "real-world" setting, but it
shows that there is a boundary beyond which static type systems cannot
be used in a meaningful way anymore, at least as far as I can tell.)

As soon as the running program can be updated, the distinction between
"static" (compile time) and "dynamic" (run time) blurs.
You can still erect a definition for such a case, but it needs to refer
to the update process, and hence becomes language-specific. In other
words, language-independent definitions of dynamic and static typing
won't give any meaningful results for such languages.

I'd say it makes more sense to talk about what advantages of static vs.
dynamic typing can be applied in such a situation.
E.g. one interesting topic would be the change in trade-offs: making
sure that a type error cannot occur becomes much more difficult
(particularly if the set of available types can change during an
update), so static typing starts to lose some of its appeal; OTOH a good
type system can give you a lot of guarantees even in such a situation,
even if it might have to revert to the occasional run-time type check,
so static checking still has its merits.

Regards,
Jo

Rob Thorpe · Jun 21, 2006

Anton said:
So, will y'all just switch from using "dynamically typed" to "latently
typed", and stop talking about any real programs in real programming
languages as being "untyped" or "type-free", unless you really are
talking about situations in which human reasoning doesn't come into
play? I think you'll find it'll help to reason more clearly about this
whole issue.

I agree with most of what you say except regarding "untyped".

In machine language or most assembly the type of a variable is
something held only in the mind of the programmer writing it, and
nowhere else. In latently typed languages though the programmer can
ask what the type of a particular value is. There is a vast
difference between writing code in the latter kind of language and writing
code in assembly.

I would suggest that at least assembly should be referred to as
"untyped".

Andreas Rossberg · Jun 21, 2006

Chris said:
I have never been very happy with relating type to sets of values (objects,
whatever).

Indeed, this view is much too narrow. In particular, it cannot explain
abstract types, which is *the* central aspect of decent type systems.
There were papers observing this as early as 1970. A type system should
rather be seen as a logic, stating invariants about a program. This can
include operations supported by values of certain types, as well as more
advanced properties, e.g. whether something can cause a side-effect, can
diverge, can have a deadlock, etc.

(There are also theoretic problems with the types-as-sets view, because
sufficiently rich type systems can no longer be given direct models in
standard set theory. For example, first-class polymorphism would run
afoul of the axiom of foundation.)

It's worth noting, too, that (in some sense) the type of an object can change
over time[*].

No. Since a type expresses invariants, this is precisely what may *not*
happen. If certain properties of an object may change then the type of
the object has to reflect that possibility. Otherwise you cannot
legitimately call it a type.

Taking your example of an uninitialised reference, its type is neither
"reference to nil" nor "reference to object that understands message X",
it is in fact the union of both (at least). And indeed, languages with
slightly more advanced type systems make things like this very explicit
(in ML for example you have the option type for that purpose).
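
For readers without ML at hand, the same idea can be sketched with Python's typing.Optional. This is an approximation: Python does not enforce the annotation at runtime, so the guarantee would only come from an external static checker. The class names Account and Reference are hypothetical.

```python
from typing import Optional


class Account:
    def __init__(self, owner: str):
        self.owner = owner

    def describe(self) -> str:
        return f"account of {self.owner}"


class Reference:
    """A reference that may or may not have been initialised. Its declared
    type says so: Optional[Account] rather than Account."""

    def __init__(self) -> None:
        self.target: Optional[Account] = None

    def describe(self) -> str:
        # The union type forces both cases to be considered here.
        if self.target is None:
            return "uninitialised"
        return self.target.describe()


ref = Reference()
before = ref.describe()
ref.target = Account("chris")
after = ref.describe()
print(before, "|", after)
```

The type of ref.target is the same before and after the assignment; what changes is which alternative of the union the current value inhabits.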

- Andreas

Pascal Costanza · Jun 21, 2006

Joachim said:
As soon as the running program can be updated, the distinction between
"static" (compile time) and "dynamic" (run time) blurs.
You can still erect a definition for such a case, but it needs to refer
to the update process, and hence becomes language-specific. In other
words, language-independent definitions of dynamic and static typing
won't give any meaningful results for such languages.

I'd say it makes more sense to talk about what advantages of static vs.
dynamic typing can be applied in such a situation.
E.g. one interesting topic would be the change in trade-offs: making
sure that a type error cannot occur becomes much more difficult
(particularly if the set of available types can change during an
update), so static typing starts to lose some of its appeal; OTOH a good
type system can give you a lot of guarantees even in such a situation,
even if it might have to revert to the occasional run-time type check,
so static checking still has its merits.

I am not opposed to this view. The two examples I have given for things
that are impossible in static vs. dynamic type systems were
intentionally extreme to make the point that you have to make a choice,
that you cannot just blindly throw (instances of) both approaches
together. Static type systems potentially change the semantics of a
language in ways that cannot be captured by dynamically typed languages
anymore, and vice versa.

There is, of course, room for research on performing static type checks
in a running system, for example immediately after or before a software
update is applied, or maybe even on separate type checking on software
increments such that guarantees for their composition can be derived.
However, I am not aware of a lot of work in that area, maybe because the
static typing community is too focused on compile-time issues.

Personally, I also don't think that's the most interesting issue in that
area, but that's of course only a subjective opinion.

Pascal

Matthias Blume · Jun 21, 2006

Rob Thorpe said:
In a C compiler the compiler has no idea what the values are in the
program.

It is no different from any other compiler, really. If the compiler
sees the literal 1 in a context that demands type int, then it knows
perfectly well what value that is.

It knows only their type in that it knows the type of the variable they
are contained within.
Would you agree with that?

Well the right one actually, bad example.

But, if I cast an unsigned int 2500000000 to signed I get -1794967296.

So, why do you think this "does not make sense"? And, as this example
illustrates, casting in C maps values to values. Depending on the
types of the source and the target, a cast might change the underlying
representation, or it might leave it the same. But it does produce a
value, and the result value is usually not the same as the argument
value, even if the representation is the same.
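
The arithmetic behind the quoted figure can be checked with a short sketch. It assumes a 32-bit int and two's-complement wraparound, which is what common C implementations do; the C standard of the time leaves the result of this particular out-of-range conversion implementation-defined. Python is used here only as a calculator, and as_signed32 is a hypothetical helper name.

```python
import struct


def as_signed32(u: int) -> int:
    """Value obtained by reading a 32-bit unsigned bit pattern as a
    two's-complement signed integer (assumed 32-bit int)."""
    return u - (1 << 32) if u >= (1 << 31) else u


u = 2_500_000_000
s = as_signed32(u)

# Same four bytes, two different values depending on the type used to read them.
same_bits = struct.pack("<I", u) == struct.pack("<i", s)

print(s, same_bits)
```

Here the representation is unchanged while the value is not, which is the distinction drawn above between a value and its underlying representation.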

Matthias

Matthias Blume · Jun 21, 2006

Joachim Durchholz said:
Rigorous or not, introducing additional undefined terms doesn't help
with explaining a term.

I think you missed my point. My point was that a language is
statically typed IF IT IS DEFINED THAT WAY, i.e., if it has a static
type system that is PART OF THE LANGUAGE DEFINITION. The details are
up to each individual definition.

These terms certainly aren't standard for Perl, Python, Java, or Lisp,

Indeed. That's because these languages are not statically typed.
