Next step after pychecker

P

Philippe Fremy

Hi,

I would like to develop a tool that goes one step further than pychecker
to ensure python program validity. The idea would be to get close to
what people get on ocaml: a static verification of all types of the
program, without any kind of variable declaration. This would definitely
brings a lot of power to python.

The idea is to analyse the whole program, identify constraints on
function arguments and check that these constraints are verified by
other parts of the program.

What is in your opinion the best tool to achieve this ? I had an
extensive look at pychecker, and it could certainly be extended to do
the job. Things that still concern me are that it works on the bytecode,
which prevents it from working with jython and the new .NET python.

I am currently reading the documentation on AST and visitor, but I am
not sure that this will be the best tool either. The AST seems quite
deep and I am afraid that it will make the analysis quite slow and
complicated.


For a simple pattern of what I want to do:

def f1( a ):
return a+1

def f2():
f1( [] )

Obviously, this won't work, but neither pychecker nor the python
interpreter will catch it. But a program analysis can catch it in this
simple example.

regards,

Philippe
 
P

Paul Rubin

Philippe Fremy said:
I would like to develop a tool that goes one step further than
pychecker to ensure python program validity. The idea would be to get
close to what people get on ocaml: a static verification of all types
of the program, without any kind of variable declaration. This would
definitely brings a lot of power to python.

You know, I've always thought that ML-style type inference is an
impressive technical feat, but why is it so important to not use
declarations? This is an aspect I've never really understood.
 
S

Steven Bethard

Philippe said:
I would like to develop a tool that goes one step further than pychecker
to ensure python program validity. The idea would be to get close to
what people get on ocaml: a static verification of all types of the
program, without any kind of variable declaration. This would definitely
brings a lot of power to python.

It's a very cool idea, and there was talk of including such a tool in
Python 3.0, though as yet, no one has volunteered to solve this problem.
=) Note that you can't actually really ensure program validity, but you
can make some educated guesses like pychecker does. An example:

def f1(a):
return a + 1

def f2():
return f1([])

Your suggestion was that this code should be deemed invalid. But if
it's followed with:

f1 = ''.join
f2()

where f1 is rebound, then it's perfectly valid code. That said, I think
providing "suggestions" like pychecker does is definitely worth pursuing.
What is in your opinion the best tool to achieve this ? I had an
extensive look at pychecker, and it could certainly be extended to do
the job. Things that still concern me are that it works on the bytecode,
which prevents it from working with jython and the new .NET python.

I don't know much about what pychecker does, but if it works with the
bytecode, shouldn't it be fine for jython and IronPython? I thought the
bytecode was part of the language spec, and what was CPython specific
was how the bytecodes were actually implemented...
I am currently reading the documentation on AST and visitor, but I am
not sure that this will be the best tool either. The AST seems quite
deep and I am afraid that it will make the analysis quite slow and
complicated.

You should read up on the recent python-dev discussions about merging
the AST branch. I don't know too much about it, but apparently there's
some work being done currently on how the AST is used in Python. Sorry
I don't know more about this...


Steve
 
S

Sylvain Thenault

Hi,
Hi

I would like to develop a tool that goes one step further than pychecker
to ensure python program validity. The idea would be to get close to what
people get on ocaml: a static verification of all types of the program,
without any kind of variable declaration. This would definitely brings a
lot of power to python.

The idea is to analyse the whole program, identify constraints on function
arguments and check that these constraints are verified by other parts of
the program.

Did you take a look at the starkiller [1] and pypy projects [2] ?
What is in your opinion the best tool to achieve this ? I had an
extensive look at pychecker, and it could certainly be extended to do
the job. Things that still concern me are that it works on the bytecode,
which prevents it from working with jython and the new .NET python.

I am currently reading the documentation on AST and visitor, but I am
not sure that this will be the best tool either. The AST seems quite
deep and I am afraid that it will make the analysis quite slow and
complicated.

are you talking about the ast returned by the "parser" module, or the ast
from the "compiler" module ? The former is a higher abstraction, using
specific class instances in the tree, and most importantly with all the
parsing junk removed. See [3]. You may also be interested in pylint
[4] which is a pychecker like program built in top of the compiler ast,
and so doesn't require actual import of the analyzed code. However it's
not yet as advanced as pychecker regarding bug detection.

And finally as another poster said you should probably keep an eye open
on the python 2.5 ast branch work...

Hope that helps !

[1]http://www.python.org/pycon/dc2004/papers/1/paper.pdf)
[2]http://codespeak.net/pypy/index.cgi?home
[3]http://www.python.org/doc/current/lib/module-compiler.ast.html
[4]http://www.logilab.org/projects/pylint
 
H

huy

Paul said:
You know, I've always thought that ML-style type inference is an
impressive technical feat, but why is it so important to not use
declarations? This is an aspect I've never really understood.

You know, I think I agree ;-). Just because you don't declare the types,
doesn't mean you can change the implicit type willy nilly anyway; at
least for more complex programs anyway. In fact, it would be safer to
have type checking when you want to do something like this. I currently
needed to change a number parameter to a string parameter (found out
order_no wasn't just numbers as specs had specified). Of course this
parameter was being used in a great many places. Changing it was a bit
scary because we had to make sure it wasn't being treated as a number
anywhere throughout the code. Yes good coverage with unit tests would
have helped but unfortunately we do not yet have good coverage. TDD is a
quite hard to practice as a beginner.

Cheers,

Huy
 
J

Jacek Generowicz

Paul Rubin said:
You know, I've always thought that ML-style type inference is an
impressive technical feat, but why is it so important to not use
declarations? This is an aspect I've never really understood.

You gain something akin to duck typing, though in a formalized way
(type classes). The code works for any types which provide the
features which are used in the code, without regard for the specific
type. You gain generic programming.

You also gain not having to clutter your code with all the type
declarations. And you gain not having to decide what types you will
use too early on in development.
 
P

Peter Maas

Jacek said:
You also gain not having to clutter your code with all the type
declarations. And you gain not having to decide what types you will
use too early on in development.

But it can be useful to restrict type variety in certain situations
e.g. prime number calculation :) And it would probably also be useful
to check violations of restrictions before running the program in
normal mode.

This must not necessarily mean 'int this, float that'. E.g. the
following would be nice:

a := some_type_value # "type fixing" assignment
a = other_type_value # error: a is restricted to some_type
a := yet_another_type_value # OK, another fixed type is set
del a # release type restriction; a can be recreated in normal
# dynamic mode

The type fixing assignment could be used for optimization and for
checking the program with pychecker.
 
D

Diez B. Roggisch

But it can be useful to restrict type variety in certain situations
e.g. prime number calculation :) And it would probably also be useful
to check violations of restrictions before running the program in
normal mode.

But that's what (oca)ml and the like do - they exactly don't force you to
specify a type, but a variable has an also variable type, that gets
inferned upon the usage and is then fixed.
 
A

aurora

A frequent error I encounter

try:
...do something...
except IOError:
log('encounter an error %s line %d' % filename)

Here in the string interpolation I should supply (filename,lineno).
Usually I have a lot of unittesting to catch syntax error in the main
code. But it is very difficult to run into exception handler, some of
those are added defensely. Unfortunately those untested exception
sometimes fails precisely when we need it for diagnosis information.

pychecker sometime give false alarm. The argument of a string
interpolation may be a valid tuple. It would be great it we can somehow
unit test the exception handler (without building an extensive library of
mock objects).
 
F

Francis Girard

Hi,

I do not want to discourage Philippe Fremy but I think that this would be very
very difficult to do without modifying Python itself.

What FP languages rely upon to achieve type inference is a feature named
"strong typing". A clear definition of strong typing is :

"Every well-formed expression of the language can be assigned a type that can
be deduced from the constituents of the expression alone." Bird and Wadler,
Introduction to Functional Programming, 1988

This is certainly not the case for Python since one and the same variable can
have different types depending upon the execution context. Example :

1- if a is None:
2- b = 1
3- else:
4- b = "Phew"
5- b = b + 1

One cannot statically determine the type of b by examining the line 5- alone.

Therefore, Fremy's dream can very well turn to some very complex expert system
to make "educated" warning. Being "educated" is a lot harder than to be
"brutal".

It's funny that what mainly prevents us from easily doing a type inferencer is
at this very moment discussed with almost religious flame in the "variable
declaration" thread.

Anyway, strong typing as defined above would change the Python language in
some of its fundamental design. It would certainly be valuable to attempt the
experience, and rename the new language (mangoose would be a pretty name).

Francis Girard
FRANCE

Le mardi 1 Février 2005 16:49, Diez B. Roggisch a écrit :
 
F

Francis Girard

Hi,

Do you have some more pointers to the StarKiller project ? According to the
paper some implementation of this very interesting project exists.

Thank you

Francis Girard

Le mardi 1 Février 2005 11:21, Sylvain Thenault a écrit :
Hi,
Hi

I would like to develop a tool that goes one step further than pychecker
to ensure python program validity. The idea would be to get close to what
people get on ocaml: a static verification of all types of the program,
without any kind of variable declaration. This would definitely brings a
lot of power to python.

The idea is to analyse the whole program, identify constraints on
function arguments and check that these constraints are verified by other
parts of the program.

Did you take a look at the starkiller [1] and pypy projects [2] ?
What is in your opinion the best tool to achieve this ? I had an
extensive look at pychecker, and it could certainly be extended to do
the job. Things that still concern me are that it works on the bytecode,
which prevents it from working with jython and the new .NET python.

I am currently reading the documentation on AST and visitor, but I am
not sure that this will be the best tool either. The AST seems quite
deep and I am afraid that it will make the analysis quite slow and
complicated.

are you talking about the ast returned by the "parser" module, or the ast
from the "compiler" module ? The former is a higher abstraction, using
specific class instances in the tree, and most importantly with all the
parsing junk removed. See [3]. You may also be interested in pylint
[4] which is a pychecker like program built in top of the compiler ast,
and so doesn't require actual import of the analyzed code. However it's
not yet as advanced as pychecker regarding bug detection.

And finally as another poster said you should probably keep an eye open
on the python 2.5 ast branch work...

Hope that helps !

[1]http://www.python.org/pycon/dc2004/papers/1/paper.pdf)
[2]http://codespeak.net/pypy/index.cgi?home
[3]http://www.python.org/doc/current/lib/module-compiler.ast.html
[4]http://www.logilab.org/projects/pylint
 
J

John Roth

Sylvain Thenault said:
On Tue, 01 Feb 2005 05:18:12 +0100, Philippe Fremy wrote:

Did you take a look at the starkiller [1] and pypy projects [2] ?

Has anything happened to Starkiller since PyCon 2004? The
latest mention I can find on Google is a blog entry (by Ted
Leung) on Aug 30 saying he wished someone would give the
author some money to finish it and publish it.

John Roth
 
P

Philippe Fremy

I do not want to discourage Philippe Fremy but I think that this would be very
very difficult to do without modifying Python itself.

That's the conclusion I reached to too after lurking on the ocaml list.
What FP languages rely upon to achieve type inference is a feature named
"strong typing".
> [...]

This is certainly not the case for Python since one and the same variable can
have different types depending upon the execution context. Example :

1- if a is None:
2- b = 1
3- else:
4- b = "Phew"
5- b = b + 1

One cannot statically determine the type of b by examining the line 5- alone.

There are even worse cases:

1- a=1
2- a= someObject.someMethod()

someObject might not have someMethod() and this would be caught by an
AttributeError exception handler and 'a' will stay at 1.

Therefore, Fremy's dream can very well turn to some very complex expert system
to make "educated" warning. Being "educated" is a lot harder than to be
"brutal".

And even being brutal is not that easy to achieve...
Anyway, strong typing as defined above would change the Python language in
some of its fundamental design. It would certainly be valuable to attempt the
experience, and rename the new language (mangoose would be a pretty name).

Indeed. I understood that this is one of the things that could be
provided by pypy.

What I understood (Jacek, this is an answer for you too) is that there
are many usage patterns for python.

Some people use it for all of its advanced dynamic features. I use it in
a java-like fashion. One could use it without the "OO crap", like a
modern pascal replacement.

This is a strength of python that it lends itself to so many programming
paradigm. However, one drawback is that python is not optimum in many of
these paradigm (although it is sometimes the best proponent). In my
case, dynamic typing is not a required feature. Typing immutability and
type inferring is a lot more valuable.

I really hope that pypy will provide that kind of choice. Give me python
with eiffel like contracts, super speed optimisation thank to type
inference and I will be super happy.

Thank everyone for its feedback.

Any other idea of a fun python improvement project I could join without
too much hassle ? I can't help but thinking that pychecker ought to be
able to do a better job.

regards,

Philippe


Francis Girard
FRANCE

Le mardi 1 Février 2005 16:49, Diez B. Roggisch a écrit :
 
S

Skip Montanaro

Francis> "Every well-formed expression of the language can be assigned a
Francis> type that can be deduced from the constituents of the
Francis> expression alone." Bird and Wadler, Introduction to Functional
Francis> Programming, 1988

Francis> This is certainly not the case for Python since one and the
Francis> same variable can have different types depending upon the
Francis> execution context. Example :

Francis> 1- if a is None:
Francis> 2- b = 1
Francis> 3- else:
Francis> 4- b = "Phew"
Francis> 5- b = b + 1

Francis> One cannot statically determine the type of b by examining the
Francis> line 5- alone.

Do you have an example using a correct code fragment? It makes no sense to
infer types in code that would clearly raise runtime errors:
Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: cannot concatenate 'str' and 'int' objects

Also, note that the type assigned to an expression may be nothing more than
"object". Clearly that wouldn't be very helpful when trying to write an
optimizing compiler, but it is a valid type.

Skip
 
T

Terry Reedy

Steven Bethard said:
I don't know much about what pychecker does, but if it works with the
bytecode, shouldn't it be fine for jython and IronPython? I thought the
bytecode was part of the language spec, and what was CPython specific was
how the bytecodes were actually implemented...

Nothing about bytecode is part of the language spec. And CPython bytecode
is version specific. If the CPython implementation changed from a virtual
stack machine to a virtual register machine, as was once discussed, the
stack-oriented byte code would be replaced by a register-oriented byte code
or virtual machine language.

Jython compiles Python code to JVM (Java Virtual Machine) bytecode. Parrot
compile to Parrot bytecode. Ironman compiles, I presume, to .Net CL or
whatever it's called.

Terry J. Reedy
 
A

Alex Martelli

Philippe Fremy said:
Any other idea of a fun python improvement project I could join without
too much hassle ? I can't help but thinking that pychecker ought to be
able to do a better job.

Have a look at pypy -- around the key idea of reimplementing Python's
runtime in Python, type inferencing on a somewhat restricted ("not too
dynamic/meta at runtime") subset of Python plays a key role there.


Alex
 
F

Francis Girard

Le mercredi 2 Février 2005 00:28, Philippe Fremy a écrit :
I really hope that pypy will provide that kind of choice. Give me python
with eiffel like contracts, super speed optimisation thank to type
inference and I will be super happy.

That's also my dream. Type inference not so much for speed but for safety.
There are type fans (like most skilled C++ programmer), anti-type supporters
(like the talented Alex Martelli). Type inference sits in the middle.

Didn't understand how pypy intend to support type inference. I read something
about automatically translating Python into the "smooth blend" Pyrex. I am
not sure what it exactly means and how they plan to face the problems we
foresee.

Francis Girard
 
P

Philippe Fremy

Skip said:
Francis> "Every well-formed expression of the language can be assigned a
Francis> type that can be deduced from the constituents of the
Francis> expression alone." Bird and Wadler, Introduction to Functional
Francis> Programming, 1988

Francis> This is certainly not the case for Python since one and the
Francis> same variable can have different types depending upon the
Francis> execution context. Example :

Francis> 1- if a is None:
Francis> 2- b = 1
Francis> 3- else:
Francis> 4- b = "Phew"
Francis> 5- b = b + 1

Francis> One cannot statically determine the type of b by examining the
Francis> line 5- alone.

Do you have an example using a correct code fragment? It makes no sense to
infer types in code that would clearly raise runtime errors:

On the contrary, the point of type inference is to detect such errors.
If the program was always well-formed, there would be no point in
developing a type inference tool.

Philippe
 
S

Sylvain Thenault

Sylvain Thenault said:
On Tue, 01 Feb 2005 05:18:12 +0100, Philippe Fremy wrote:

Did you take a look at the starkiller [1] and pypy projects [2] ?

Has anything happened to Starkiller since PyCon 2004? The latest mention I
can find on Google is a blog entry (by Ted Leung) on Aug 30 saying he
wished someone would give the author some money to finish it and publish
it.

nothing I'm aware of. Some people talked about it, but no news from
starkiller's author. However some posts mention that it already has some
usable output now.
 
F

Francis Girard

To complete Philippe's answer :

As Bird and Wadler continue :

"The major consequence of the discipline imposed by strong-typing is that any
expression which cannot be assigned a "sensible" type is regarded as not
being well-formed and is rejected by the computer before evaluation. Such
expression have _no_ value: they are simply regarded as illegal."

But Skip, I am sure that you can easily find an example by yourself. For
example, replace "+" by a function that does different things depending on
its argument type.

Francis Girard

Le mercredi 2 Février 2005 10:27, Philippe Fremy a écrit :
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,780
Messages
2,569,611
Members
45,276
Latest member
Sawatmakal

Latest Threads

Top