Type Hinting vs Type Checking and Preconditions

Tom Bradford

Let me first say that I'm sure that this subject has come up before,
and so forgive me for beating a dead horse. Secondly, let me say that
Python's strength is its dynamic nature, and I don't believe that it
should ever require a precondition scaffolding. With that said, I do
believe that something like type hinting would be beneficial to the
Python community, both for tool enablement and for unambiguous
programming.

Here is what I mean. The following function, though conventionally
indicating that it will perform a multiplication, will yield standard
Python behaviors if a string value is passed to it:

def multiplyByTwo(value):
    return value * 2

Passing 14 to it will return 28, whereas passing "14" to it will return
"1414". Granted, we know and accept that this is Python's behavior
when you multiply two values, but because we don't (and shouldn't have
to) know the inner workings of a function, we don't know that the types
of the values that we pass into it may adversely affect the results
that it yields.

Now, on the other hand, if we were to introduce a purely optional type
hint to the function prototype, such as follows:

def multiplyByTwo(value:int):
    return value * 2

The Python interpreter could do the work of casting the incoming
parameter to an int (if it is not already) before it is multiplied,
resulting in the desired result or a typecasting error otherwise.
Furthermore, it could do it more efficiently than a developer having to
put conditional code at the beginning of functions that traditionally
do their own typecasting.
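
A rough sketch of that casting behaviour, written as an ordinary decorator rather than as interpreter support. It assumes an interpreter that accepts the value: int syntax and simply stores the hint on the function; the decorator name coerce_annotated is hypothetical, and the sketch assumes every hint is a plain class that can be called to perform the cast.

```python
import functools
import inspect


def coerce_annotated(func):
    """Hypothetical decorator: cast each argument to its hinted class."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in list(bound.arguments.items()):
            hint = signature.parameters[name].annotation
            # Parameters without a hint are passed through untouched.
            if hint is inspect.Parameter.empty:
                continue
            # Cast only when the value is not already of the hinted class;
            # a failed cast surfaces as ValueError or TypeError.
            if not isinstance(value, hint):
                bound.arguments[name] = hint(value)
        return func(*bound.args, **bound.kwargs)

    return wrapper


@coerce_annotated
def multiplyByTwo(value: int):
    return value * 2


print(multiplyByTwo(14))    # 28
print(multiplyByTwo("14"))  # 28, because "14" is cast to 14 first
```

The calling code is unchanged, and a function without hints behaves exactly as before, which is the backward compatibility the proposal is after.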

All of this wouldn't require changes to the calling code, and would be
a purely optional feature of the function declaration, one that as far
as I can tell would be backward compatible both with Python's standard
operating behavior and its overall grammar. This hinting would also
apply to derived class names. Functions like the one first
demonstrated would continue to operate as normal.

The additional benefit of this type hinting mechanism is that tools
like Komodo would be able to perform a "close-to-Java" amount of
code analysis and completion, even though Python would continue to be
just as dynamic as it had always been, refining Python's productivity
benefits even more so.

This could also be extended into the realm of return types, but that
might imply imposing typing in calling code, which is why I do not
propose it as part of a type hinting system, as I believe calling code
should remain unaffected by the introduction of such a system.

The next question is how do you represent lists, maps and tuples in
this system? Would it be something like this: myList:[]? And if so,
do we further allow for type hinting within the contents of the list,
such as myList:[int]? There would be no harm in allowing a parameter
that is a list, map, or tuple to remain in a parameter list without
type hinting, even while other parameters retain hints.
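
To make the list question concrete, here is a minimal sketch of how hints written as [] or [int] could be interpreted. The helper name matches_hint and the rule "a one-element list hint describes every item" are assumptions made for illustration, not part of any existing Python feature.

```python
def matches_hint(value, hint):
    """Hypothetical check for hints written as classes or as list literals."""
    if isinstance(hint, list):
        if not isinstance(value, list):
            return False
        if not hint:
            # A bare [] hint means "any list", contents unchecked.
            return True
        # A hint like [int] means "a list whose items all match int".
        return all(matches_hint(item, hint[0]) for item in value)
    # Any other hint is treated as a class.
    return isinstance(value, hint)


print(matches_hint([1, 2, 3], [int]))    # True
print(matches_hint([1, "2", 3], [int]))  # False
print(matches_hint(["a", 1], []))        # True
```

Because the check recurses, a nested hint such as [[int]] falls out of the same rule; maps and tuples would need their own conventions.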

It wouldn't be hard to prototype something like this using Jython.

Thoughts?
 
bearophileHUGS

You can look at this; its API looks very well thought out to me:
http://oakwinter.com/code/typecheck/

Tom Bradford said:
Now, on the other hand, if we were to introduce a purely optional type
hint to the function prototype, such as follows:

def multiplyByTwo(value:int): return value * 2

I don't think Python will have something like this...

Bye,
bearophile
 
Roy Smith

Tom Bradford said:
Here is what I mean. The following function, though conventionally
indicating that it will perform a multiplication, will yield standard
Python behaviors if a string value is passed to it:

def multiplyByTwo(value):
    return value * 2

Passing 14 to it will return 28, whereas passing "14" to it will return
"1414". Granted, we know and accept that this is Python's behavior
when you multiply two values, but because we don't (and shouldn't have
to) know the inner workings of a function, we don't know that the types
of the values that we pass into it may adversely affect the results
that it yields.

The question is, what is the function *supposed to do*? Without knowing
what it is *supposed to do*, it is impossible to say for sure whether
returning "1414" is correct or not. Consider two different functions:

def multiplyByTwo_v1(value):
    """Returns the argument multiplied by 2. If the argument is a
    string representation of an integer, another string is returned
    which is the string representation of that integer multiplied
    by 2.
    """
    return value * 2

def multiplyByTwo_v2(value):
    """Returns the argument multiplied by 2.
    """
    return value * 2

The first one should return "28" when passed "14". If it returns "1414",
it's broken. I know this seems rather silly and pedantic, but it's an
important point.
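
As a minimal sketch of what honouring the first docstring could look like (this is one possible reading, not code from the project described below), the string case has to be handled explicitly. The docstring does not say what happens for a non-numeric string, so the sketch simply lets int() raise.

```python
def multiplyByTwo_v1(value):
    """Returns the argument multiplied by 2. If the argument is a
    string representation of an integer, another string is returned
    which is the string representation of that integer multiplied
    by 2.
    """
    if isinstance(value, str):
        # int() raises ValueError for strings that are not integers;
        # the docstring leaves that case unspecified.
        return str(int(value) * 2)
    return value * 2


# Checks derived directly from the docstring.
assert multiplyByTwo_v1(14) == 28
assert multiplyByTwo_v1("14") == "28"
```

The two assertions are only as good as the documentation they were derived from, which is the point: the test follows the specification, not the other way round.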

I was once working on a project which historically didn't have any unit
tests. We had a function called something like "isValidIP" in the library
which returned True or False depending on whether its (string) argument was
a valid IP address.

I wrote some unit tests and it failed on a corner case like
"255.255.255.255" (or maybe it was "0.0.0.0"). Turns out, the original
author was using it in some special situation where 255.255.255.255 wasn't
valid for his purposes. We got down to, "OK, *you* document what the
function is supposed to do, and *I'll* write a unit test which proves it
does what the documentation says". You would think that would be easy, but
it never got done because we couldn't get everybody to agree on what the
function was supposed to do.

It was being used in production code. I would have thought it would bother
people that we were using a function without knowing what it was supposed
to do, but what really bothered people more was that we had a unit test
that was failing. And the solution was to back out the unit test. Sometimes
politics trumps technology.

PS, as far as I know, that project is now dead, but for other reasons far
worse than one underspecified function :)
 
Paul Boddie

Roy said:
def multiplyByTwo(value):
    return value * 2
[...]

The question is, what is the function *supposed to do*? Without knowing
what it is *supposed to do*, it is impossible to say for sure whether
returning "1414" is correct or not.

Indeed. And I don't think arithmetic-based examples are really very
good at bringing out the supposed benefits of type declarations or are
very illustrative when reasoning about type systems, mostly because
everyone assumes the behaviour of the various operators without
considering that in general the behaviour is arbitrary. In other words,
people look at expressions like "a * b" and say "oh yes, numbers being
multiplied together producing more numbers" without thinking that "a"
might be an instance of "Snake" and "b" might be an instance of
"Reptile" and the "Snake.__mul__" method (or perhaps the
"Reptile.__rmul__" method) might produce a range of different things
that aren't trivially deduced.
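
A small sketch of that situation, with hypothetical Snake and Reptile classes whose behaviour is invented purely for illustration. Nothing about the expression a * b tells the reader which method runs or what comes back.

```python
class Reptile:
    def __rmul__(self, other):
        # Consulted only after the left operand's __mul__ declines.
        return "handled by Reptile.__rmul__"


class Snake:
    def __mul__(self, other):
        if isinstance(other, Reptile):
            # Returning NotImplemented makes Python try Reptile.__rmul__.
            return NotImplemented
        return ["hiss"] * other


a, b = Snake(), Reptile()
print(a * b)  # handled by Reptile.__rmul__
print(a * 2)  # ['hiss', 'hiss']
```

The same operator yields a string in one case and a list in the other, so a declaration on the operands alone says little about the result.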
Consider two different functions:

def multiplyByTwo_v1(value):
    """Returns the argument multiplied by 2. If the argument is a
    string representation of an integer, another string is returned
    which is the string representation of that integer multiplied
    by 2.
    """
    return value * 2

def multiplyByTwo_v2(value):
    """Returns the argument multiplied by 2.
    """
    return value * 2

The first one should return "28" when passed "14". If it returns "1414",
it's broken. I know this seems rather silly and pedantic, but it's an
important point.

I've done some work on this kind of thing which actually specialises
functions/methods and produces something resembling that quoted above,
and I believe that various other works produce similar specialisations
when reasoning about the behaviour of Python programs. Specifically,
you'd write the first version like this:

def multiplyByTwo_v1(value):
    # One branch per incoming type; str is the built-in string type.
    if isinstance(value, int): return int.__mul__(value, 2)
    elif isinstance(value, str): return str.__mul__(value, 2)
    else: raise RuntimeError

Really, you'd want to avoid having a single specialisation, having
separate ones for each "incoming type", although that might be hard to
arrange in every case.
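
One way to approximate separate specialisations by hand is the standard library's functools.singledispatch, which picks an implementation from the type of the first argument. This is a swapped-in illustration of the idea, not the output of the analysis tool mentioned here, and the helper names _multiply_int and _multiply_str are hypothetical.

```python
from functools import singledispatch


@singledispatch
def multiplyByTwo(value):
    # Fallback for incoming types that have no registered specialisation.
    raise TypeError("no specialisation for %s" % type(value).__name__)


@multiplyByTwo.register(int)
def _multiply_int(value):
    return int.__mul__(value, 2)


@multiplyByTwo.register(str)
def _multiply_str(value):
    return str.__mul__(value, 2)


print(multiplyByTwo(14))    # 28
print(multiplyByTwo("14"))  # 1414
```

Each incoming type now has its own function body, at the cost of having to enumerate the types in advance.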

Paul

P.S. Have a look here for some simple (and now quite dated) examples:

http://www.boddie.org.uk/python/analysis.html

Specifically, here:

http://www.boddie.org.uk/python/analysis-summaries.html
(A good test of CSS standards compliance if nothing else!)

I'll hopefully make a new release at some point in the near future
which tries to do a better job at deducing the various types and
invocation targets.
 
Florian Diesch

Tom Bradford said:
Let me first say that I'm sure that this subject has come up before,
and so forgive me for beating a dead horse. Secondly, let me say that
Python's strength is its dynamic nature, and I don't believe that it
should ever require a precondition scaffolding. With that said, I do
believe that something like type hinting would be beneficial to the
Python community, both for tool enablement and for unambiguous
programming.

Here is what I mean. The following function, though conventionally
indicating that it will perform a multiplication, will yield standard
Python behaviors if a string value is passed to it:

def multiplyByTwo(value):
    return value * 2

Passing 14 to it will return 28, whereas passing "14" to it will return
"1414". Granted, we know and accept that this is Python's behavior
when you multiply two values, but because we don't (and shouldn't have
to) know the inner workings of a function, we don't know that the types
of the values that we pass into it may adversely affect the results
that it yields.

Now, on the other hand, if we were to introduce a purely optional type
hint to the function prototype, such as follows:

def multiplyByTwo(value:int):
    return value * 2

The Python interpreter could do the work of casting the incoming
parameter to an int (if it is not already) before it is multiplied,
resulting in the desired result or a typecasting error otherwise.
Furthermore, it could do it more efficiently than a developer having to
put conditional code at the beginning of functions that traditionally
do their own typecasting.

What's the advantage? Instead of "multiplication may not do what I want
with some classes" you get "casting to int may not do what I want
with some classes".
Passing a float now returns a much more counterintuitive result than
passing a string to the old function.
And it no longer works with classes which you cannot cast to int
but which implement multiplication.
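
A short sketch of both objections. The function multiplyByTwo_cast is a hypothetical stand-in for the proposed value:int behaviour (cast first, then multiply), and complex is used as an example of a type that supports multiplication but cannot be cast to int.

```python
def multiplyByTwo_plain(value):
    return value * 2


def multiplyByTwo_cast(value):
    # Stand-in for the proposed hint: cast to int, then multiply.
    return int(value) * 2


print(multiplyByTwo_plain(2.5))  # 5.0
print(multiplyByTwo_cast(2.5))   # 4, because int(2.5) truncates to 2

print(multiplyByTwo_plain(1 + 2j))  # (2+4j)
try:
    multiplyByTwo_cast(1 + 2j)
except TypeError as exc:
    # complex implements multiplication but has no conversion to int.
    print("cast failed:", exc)
```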

In any case you have to document what exactly your function is doing and
the user has to read this documentation.



Florian
 
