declarations summary

Michael Tobis

Summary of my understanding of a recent interesting thread:

General usage has "declaration" meaning "statement which does not
generate executable bytecode but merely affects the compiler". My
assertion that decorator syntax is "declarative" is therefore formally
false.

The common assertion that "Python is 100% executable" is an
exaggeration, but not because of the introduction of decorator syntax.
The "global" statement, however, is not executable and is recognized as
a declaration.

Both "global" and "@", though have the characteristics that 1) they
don't fit comfortably into Python syntax and 2) they at least
conceptually affect the reference and not the referent. In suport of
the first point, unlike any other block, decorators can appear at only
particular places in the file. Meanwhile globals are counterintuitively
indifferent to whether they are nested in code branches that don't
execute.

x = 1
def foo():
    if False:
        global x
    x = 2
foo()
print x

prints "1"

These are implemented in different ways and both seem to be somehow
unsatisfactory. They also seem to have something in common. However,
for me to call them both "declarations" was incorrect.

Pythonistas appear to be averse to any declarations (in the sense of
compiler directives) besides "global".

The resistance to "@" is a lost cause, even though a dramatically
superior
def foo(a,b,c) decoby bar
syntax was proposed. I find this hard to justify, as a decorator is
clearly a modifier of an immediately following function declaration
rather than a freestanding executable block.

If it were feasible, I'd even prefer
def foo(a,b,c) decoby [baz,quux]:
so that the list of decorators could be dynamically constructed at
define time.
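
Something close to this effect is already expressible with the accepted
"@" syntax, since decorators are ordinary callables evaluated at
definition time. Here is a minimal sketch, in the Python 2 syntax of the
day; the helper name "decoby" is invented purely for illustration:

def decoby(decorators):
    # Apply a dynamically constructed list of decorators, as if they had
    # been stacked with "@" in the given order (first item outermost).
    def apply_all(func):
        for dec in reversed(decorators):
            func = dec(func)
        return func
    return apply_all

def baz(func):
    def wrapper(*args, **kwds):
        print "calling", func.__name__
        return func(*args, **kwds)
    return wrapper

def quux(func):
    func.registered = True   # tag the function rather than wrap it
    return func

@decoby([baz, quux])
def foo(a, b, c):
    return a + b + c

Whether that reads better than stacked "@" lines is, of course, a matter
of taste.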

Anyway, it's this sort of reference-modifier that's at issue here,
whether or not the implementation is truly declarative.

Now that there's two, can there be more? Should there be?

It's not difficult for me to imagine at least two other things I'd want
to do with references: 1) make them immutable (making life easier for
the compiler) 2) make them refer only to objects with certain
properties (strong duck-typing).

Also there's the question of typo-driven bugs, where an attempted
rebinding of "epsilon" instead cerated a reference called "epselon".
(The epselon bug) This is the bane of fortran, and after generations it
was generally agreed that optionally one could require all references
to be declared (implicit none). Perl went through a similar process and
implemented a "use strict" module. Experienced Pythonistas are oddly
resistant to even contemplating this change, though in the Fortran and
Perl worlds their usage quickly became practically universal.
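
Nothing comparable exists for bare names, but for instance attributes
(compound names) the effect can already be approximated. A minimal
sketch, with the class invented for illustration:

class Params(object):
    # Only the declared attribute names may be bound on instances; a
    # misspelling raises AttributeError instead of silently creating a
    # new attribute.
    __slots__ = ('epsilon', 'delta')

p = Params()
p.epsilon = 1e-10    # fine
p.epselon = 1e-10    # AttributeError: 'Params' object has no attribute 'epselon'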

I believe that enthusiasm for this construct in other languages is so
strong that it should be prominently addressed in Python introductions
and surveys, and that questions in this regard should be expected from
newbies and patiently addressed. I also don't fully understand the
profound aversion to this idea.

I myself am going back and forth on all this. I despise the fact that
good fortran90 code often has more declarations than executables. Ed
Ream (LEO's author) pointed out to me the Edward-Tufte-ness of Python.
No wasted ink (except maybe the colons, which I do sometimes find
myself forgetting on long edits...) I don't want Python to look like
f90. What would be the point?

On the other hand, I think compiler writers are too attached to
cleverly divining the intention of the programmer. Much of the effort
of the compiler could be obviated by judicious use of declarations and
other hints in cases where they matter. Correct code could always be
generated without these hints, but code could be made more
runtime-efficient with them. People who are running codes that are
compute-bound tend not to be beginners, and could put up with some
clutter when tuning their code.

In the end, I think the intuition of what's Pythonic is enormously
valuable, and I'm continuing to learn from discussions here every day.
As a result, I'm inclined to believe, on the grounds of authority, that
this sort of trick does not belong in Python.

That said, it seems unlikely that any version of Python will ever be
competitive as a high-performance numerical computing language, because
nothing can be known about bindings at compile time, as we are
constantly being reminded here. I'll be happy if Python knocks my
socks off yet again, but I can't see counting on it. So the successor
to Fortran (presuming it isn't C++, which I do presume) may be
influenced by Python, but probably it won't be Python.

mt
 

Alex Martelli

Michael Tobis said:
x = 1
def foo():
    if False:
        global x
    x = 2
foo()
print x

prints "1"
Wrong:
>>> x = 1
>>> def foo():
...     if False:
...         global x
...     x = 2
...
>>> foo()
>>> print x
2

And indeed, that IS the problem.
Pythonistas appear to be averse to any declarations (in the sense of
compiler directives) besides "global".

Many of us consider ``global'' a nasty wart, too.
Anyway, it's this sort of reference-modifier that's at issue here,

You can play with references as much as you like, as long as they're
*COMPOUND* names (attributes, or even, if you wish, items). Just leave
*BARE* names alone, and nobody will jump all over you.
It's not difficult for me to imagine at least two other things I'd want
to do with references: 1) make them immutable (making life easier for
the compiler) 2) make them refer only to objects with certain
properties (strong duck-typing).

Look at Enthought's ``Traits'' for some highly developed infrastructure
along these lines -- for *COMPOUND* names only, of course. No ``making
life easier for the compiler'', but for example ease of generation of
forms and other view/controller implementations from a model, etc.
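
For a sense of what that looks like without the full Traits machinery,
here is a toy data descriptor that constrains what an attribute may be
bound to; the class and attribute names are invented for illustration:

from StringIO import StringIO

class Checked(object):
    # Data descriptor: the attribute may only be bound to objects that
    # provide the required methods (duck typing enforced at bind time).
    def __init__(self, name, *required_methods):
        self.name = name
        self.required = required_methods
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__[self.name]
    def __set__(self, obj, value):
        for meth in self.required:
            if not hasattr(value, meth):
                raise TypeError("%s needs an object with a %s() method"
                                % (self.name, meth))
        obj.__dict__[self.name] = value

class Model(object):
    source = Checked('source', 'read', 'close')   # must quack like a file

m = Model()
m.source = StringIO("data")   # accepted: has read() and close()
m.source = 42                 # rejected: raises TypeError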

If you're keen to hack on the compiler to take advantage of such
information for code generation purposes, your experimentation will be
easier once the AST branch is folded back into the 2.5 CVS head; if it's
not about code generation but just runtime (so not really about the
compiler but rather about the virtual machine), pypy (see codespeak.net)
may help you similarly, already today. If you're not keen to hack,
experiment, and prove that dramatic speedups can be obtained this way,
then I doubt that musing about ``helping the compiler'' in a purely
theoretical way is gonna do any good whatsoever.

On the other hand, I think compiler writers are too attached to
cleverly divining the intention of the programmer. Much of the effort
of the compiler could be obviated by judicious use of declarations and
other hints in cases where they matter. Correct code could always be

Ah, somebody who's never hacked on C compilers during the transition
from when ``judicious use of 'register' '' (surely a classic example of
such an "other hint") was SO crucial, to where it doesn't matter in the
least -- except that it's an ugly wart, and the clutter stays.

Used to be that C compilers didn't do register allocation with any skill
nor finesse, but did let you give a hint by using "register" as the
storage class of a variable. Smart programmers studied the generated
machine code on a few architectures of interest, placed "register"
appropriately, studied what changes this made to the generated code, and
didn't forget to check on all different machines of interest.

Most C programmers just slapped "register" where they GUESSED it would
help, and in far too many cases they were horribly wrong, because
intuition is no good guide to performance improvement; I have witnessed
examples of code where on certain machine/compiler combinations
inserting a "#define register auto" to disable the GD ``register'' made
some functions measurably FASTER.

Then "graph-coloring" register allocators came into their own and in the
space of a few years ``register'' blissfully became just about
irrelevant; nowadays, I believe all extant compilers simply ignore it,
at least in suitable optimization mode.

Of course, for backwards compatibility, "register" remains in the
language: total, absolute, absurd deadweight -- one extra little
difficulty in learning the language that has absolutely no reason to
exist, gives zero added value, just clutters things up to no purpose
whatsoever.

So much for the ``judicious'' use of performance hints: it just doesn't
happen enough in the real world to justify putting up with the
aggravations until compiler technology gets around to making a given
hint irrelevant, and with the totally useless conceptual and coding
clutter forevermore afterwards.

Those who can't learn from history are condemned to repeat it. Maybe
some of us just know some history better (perhaps by having lived it)
and don't need yet another round of repetition?

Optional declarations to ease compiler optimizations were at the heart
of Dylan, and looked like a great idea, but Dylan's success surely
wasn't such as to encourage a repeat of this ``soft typing'' experiment,
was it now? Of course, it's always hard to pinpoint one reason out of a
language's many features -- e.g., Dylan had macros, too, maybe THOSE
were more of a problem than soft-typing.

In theory, I do like the idea of being able to give assertions about
issues that a compiler could not reasonably infer (without analyzing the
whole program's code, which is another idea that never really panned out
for real-life programs as well as one might hope), and having a mode
where those assertions get checked at runtime and another where the
compiler may, if it wishes, rely on those assertions for code
optimization purposes. In practice, having never seen this
theoretically nice idea work really WELL in real life, despite it being
tried out in a huge number of ways and variations many, MANY times, I'd
have to be convinced that a particular incarnation is working well by
seeing it make a truly impressive performance difference in some
prototype implementation. AST-branch for the compiler side, and pypy
for the runtime side, are developing in good part just in order to ALLOW
easy experimentation, by coding only Python (and maybe generating some
pyrex, if the point of some idea is making some sparkling machine code).

Intuitions about performance are far too often completely off the mark:
performance effects just NEED to be shown as working well in practice,
before any credence can be put into reasoning that this, that or the
other tweak "should" speed things up wondrously.
I'll be happy if Python knocks my socks off yet again, but I can't see
counting on it. So the successor to Fortran (presuming it isn't C++,
which I do presume) may be influenced by Python, but probably it won't
be Python.

You appear to assume that Fortran is dead, or dying, or is gonna die
soon. I assume Mr. Beliavski will do a great job of jumping on you for
this, so I can save the effort of doing so myself ;-).


Alex
 

beliavsky

Alex said:
You appear to assume that Fortran is dead, or dying, or is gonna die
soon. I assume Mr. Beliavski will do a great job of jumping on you for
this, so I can save the effort of doing so myself ;-).

Everyone needs a purpose in life :). I hope that Fortran 2003 and
future versions will be the successors of traditional Fortran, but I
may well be disappointed.

Many scientists and engineers do not have the motivation, the time, or
even the ability to master C++, generally acknowledged to be a language
for professional programmers. When performance is not paramount, they
can use Python (with Numarray or Numeric) and other array languages
like Matlab/Octave/Scilab as reasonable alternatives to Fortran.
Despite its high cost for non-students, Matlab is enormously popular
among engineers.
 

Nick Coghlan

Michael said:
Also there's the question of typo-driven bugs, where an attempted
rebinding of "epsilon" instead cerated a reference called "epselon".
(The epselon bug) This is the bane of fortran, and after generations it
was generally agreed that optionally one could require all references
to be declared (implicit none). Perl went through a similar process and
implemented a "use strict" module. Experienced Pythonistas are oddly
resistant to even contemplating this change, though in the Fortran and
Perl worlds their usage quickly became practically universal.

I believe that enthusiasm for this construct in other languages is so
strong that it should be prominently addressed in Python introductions
and surveys, and that questions in this regard should be expected from
newbies and patiently addressed. I also don't fully understand the
profound aversion to this idea.

I myself am going back and forth on all this. I despise the fact that
good fortran90 code often has more declarations than executables. Ed
Ream (LEO's author) pointed out to me the Edward-Tufte-ness of Python.
No wasted ink (except maybe the colons, which I do sometimes find
myself forgetting on long edits...) I don't want Python to look like
f90. What would be the point?

*If* the rebinding issue were to be addressed for bare names and the right-most
name in a compound name (the only place where it is currently an issue), I
imagine it would be by introducing a rebinding augmented assignment operator
rather than by introducing variable declarations.

Or, more precisely, standard name binding would continue to serve in that role.

For Alex:

I figured out something resembling a justification for the difference in
semantics between a rebinding operator and the rest of the augmented assignment
operators: there's no magic method for binding a name, so there is no need for a
magic method when simply rebinding one, whereas the other augmented assignment
operators combine the name-binding with an operation which normally *can* be
overridden, and the augmented assignment magic method is a convenience to allow
the rebinding case of that operation to be optimised.
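
Concretely: ordinary binding has no hook at all, whereas augmented
assignment does, via the in-place magic methods. A small sketch (the
class is invented for illustration):

class Tally(object):
    def __init__(self):
        self.total = 0
    def __iadd__(self, other):
        # Hook invoked by "t += n"; there is no analogous hook for "t = n".
        self.total += other
        return self

t = Tally()
t += 5    # calls Tally.__iadd__, then rebinds the name 't' to the result
t = 5     # plain binding: no method anywhere is consulted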

Cheers,
Nick.
Do I really want to add another PEP to the snoozing PEP 338, though?
 

Arthur

General usage has "declaration" meaning "statement which does not
generate executable bytecode but merely affects the compiler". My
assertion that decorator syntax is "declarative" is therefore formally
false.

I'm not sure what this adds to the discussion, but I think it worth
noting that PEP318 and Guido's own discussion make it very clear that
the choice of Python decorator syntax was influenced - presumably by
the drawing of some analogy - by Java's annotation syntax:

http://java.sun.com/j2se/1.5.0/docs/guide/language/annotations.html

The documentation on Java annotations seems clear that annotations
are considered declarative within the context of the Java
language.

But also in those docs:

"""
Typical application programmers will never have to define an
annotation type, but it is not hard to do so.
"""

IMO, there was that analogy as well that had some influence on the
decision. It needed to be something that communicated decorators as
something beyond the scope of "everyday Python". Which would mean
something not too deeply embedded, which may account for the reason
that the proposals with more obvious surface aesthetic appeal, like the
ones you refer to, were rejected.

Beauty of course being only skin deep. Given that the decorators had
to happen when they happened (that being a separate point of
contention), I happen to think Guido's choice was a good and
courageous one.

Art
 

Arthur

I happen to think Guido's choice was a good and
courageous one.

which, given my perceived track record (in some quarters), is probably
not a very good sign.

Or else by agreeing with Guido sometimes, I get to be right sometimes.

;)

Art
 

Fredrik Lundh

Alex said:
Used to be that C compilers didn't do register allocation with any skill
nor finesse, but did let you give a hint by using "register" as the
storage class of a variable. Smart programmers studied the generated
machine code on a few architectures of interest, placed "register"
appropriately, studied what changes this made to the generated code, and
didn't forget to check on all different machines of interest.

Most C programmers just slapped "register" where they GUESSED it would
help, and in far too many cases they were horribly wrong, because
intuition is no good guide to performance improvement; I have witnessed
examples of code where on certain machine/compiler combinations
inserting a "#define register auto" to disable the GD ``register'' made
some functions measurably FASTER.

Then "graph-coloring" register allocators came into their own and in the
space of a few years ``register'' blissfully became just about
irrelevant; nowadays, I believe all extant compilers simply ignore it,
at least in suitable optimization mode.

[fredrik@brain Python-2.3.3]$ grep register */*.c | wc -l
502
[fredrik@brain Python-2.4]$ grep register */*.c | wc -l
497

oh, well. at least we're moving in the right direction.

</F>
 

Michael Tobis

Well, many scientists and engineers don't have the time, motivation or
ability to master the intricacies of recent fortran vintages either.
That's the problem.

Very large codes written by teams of software engineers for
well-delimited application spaces will continue to be written in some
version of Fortran. I hope the following will explain why this causes
me difficulty.

Toy codes will probably move from Matlab to Python soon enough, if only
because Python, iPython and Matplotlib are free. It's the middle
ground, large codes as modified by single researchers, where much of
the real work of research gets done, and the infrastructure for this
use case is abysmal.

The near-term solution I'm coming to is code generation. As much as
possible I intend to present a clean Python interface to the scientist
and generate efficient compilable code behind their back.
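
As a deliberately tiny sketch of that division of labour (the routine
name and interface are invented for illustration), the Python side might
emit compilable kernels as text, to be built and wrapped separately:

def emit_axpy_f77(name="axpy_kernel"):
    # Generate a Fortran 77 routine computing y = a*x + y elementwise.
    # A real generator would of course build kernels from a higher-level
    # description supplied by the scientist, not hard-code them.
    return """\
      subroutine %s(n, a, x, y)
      integer n, i
      real a, x(n), y(n)
      do 10 i = 1, n
         y(i) = a * x(i) + y(i)
   10 continue
      return
      end
""" % name

print emit_axpy_f77()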

Among the questions I'm trying to address is whether Python can or will
ever be so efficient that much of my work in this direction will be
abandoned. In that case, I can focus on the expression side, and leave
the efficiency question alone in the expectation that PyPy (or
something) will take care of it someday.

Another question is which language I should use as the back end. C or
even F77 can be coupled into the interactive environment on the fly.
Actual production work, of course, gains nothing from interactivity,
but we are still envisioning a significant scientist/developer use
case; in fact, that's the point.

Whether I like it or not, the code the group I work with is interested
in is in F90 source (CCSM). This has proven awkward to integrate into
Python, though not impossible for some compilers thanks to the work of
the Babel and Chasm groups.

In practice, had the professional coders developing CCSM used C++
rather than Fortran90 (or even had they stuck to F77), we would be in a
better position to expose a well-defined low-complexity high-efficiency
layer to the non-CS specialist physical scientist. That's a mouthful, I
admit, but (trust me) the world would be a better place if it were
done.

So such advantages as do exist for a professional full-time
computational science group in developing CCSM in F90 have ended up as
a hindrance to what my constituency regards as the most important use
case.

Beliavsky's summary of what is available is a taxonomy of distinct
islands of production with tenuous connections between them. I don't
want quick-code slow-run "alternatives" to slow-code high-performance
production codes. I want, and my users need, a unified scientific
software environment which provides powerful expression without
sacrificing performance or interoperability.

Do I expect Fortran to go away? Not anytime soon. Do I expect an
alternative to emerge? Indeed I do, and I hope to be part of said
emergence. In fact, I don't know what the high-performance scientific
computer language of the year 2020 will look like, but I, for one,
would like it to be called "Python".

mt
 

Terry Reedy

Michael said:
Also there's the question of typo-driven bugs, where an attempted
rebinding of "epsilon" instead cerated a reference called "epselon".
/cerated/created/
(The epselon bug) This is the bane of fortran, and after generations it
was generally agreed that optionally one could require all references
to be declared (implicit none). [snip]
Experienced Pythonistas are oddly resistant to even contemplating
this change [snip]
I also don't fully understand the profound aversion to this idea.

I don't know if my aversion is profound, and don't think it odd, but I will
try to help your understanding, while consenting to contemplating a change
;-).

1. I seem to have a copy-editor mind. Catching and mentally correcting
'cerated' took me less than a second. Being a touch-typist, I catch most
of my keying errors as I go, within a few keystrokes. Then I re-read and
catch more before submitting -- code or clp postings. I also find for
myself that typing the wrong word (usually spelled correctly) is as much a
problem as typing the wrong letter.

2. In my own experience, declarations were always tied to type
declarations. Name-only declarations to only catch typos is a new and
different idea for me. (Type declarations are a different issue, but I
really like generic programming and duck 'typing'.)

3. While redundancy can aid error catching, it can also aid error
commission. First, neglecting to declare a name would become a crime
causing rejection of a program even though it is otherwise perfect. (More
laws = more crime.) Second, if someone misdeclares 'epselon' and then, 200
lines later, writes 'epsilon = 1e-10', the declaration is again a problem,
not a solution. And the 'name not declared' message will take you to the
wrong place to fix the problem -- maybe even to the wrong file. In other
words, nannies can be a nuisance.

4. Name declarations will only catch a subset of name typos -- those that
create a new name instead of duplicating another declared one. And these
should also be easier to catch by eye -- at least for the original writer
who has a mental list of valid names. Name declarations will not catch
'x2' mistyped as 'x3' in a program that declares both. So there is no
substitute for careful reading and testing.

5. Having users compile a list for the compiler to check strikes me as the
wrong (or at least obsolete) solution to the fairly narrow problem of
catching keying errors that create an 'invalid' name. Having to type out
all names before coding, or having to run up and down a file to add each
new name to the namelist, is a nuisance. Let the computer do it! (Which
it eventually does anyway.) To catch bad name errors sooner, which is
definitely better, have a checker program make a list and ask "are these
all intended?" Or have the editor do so and immediately (when a
word-ending char is entered) highlight or color the first occurrence of a
non-keyword, non-builtin name. This would also catch undefined names used
in an expression.
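
A rough sketch of such a checker, using only the standard library
(Python 2 syntax to match the era of this thread; the function name is
invented):

import keyword
import tokenize
import __builtin__

def report_unfamiliar_names(path):
    # List the first occurrence of every name that is neither a keyword
    # nor a builtin, so the author can ask: "are these all intended?"
    known = set(keyword.kwlist + dir(__builtin__))
    seen = set()
    f = open(path)
    try:
        for tok in tokenize.generate_tokens(f.readline):
            tok_type, tok_str, (row, col) = tok[0], tok[1], tok[2]
            if tok_type == tokenize.NAME and tok_str not in known and tok_str not in seen:
                seen.add(tok_str)
                print "line %d: %s" % (row, tok_str)
    finally:
        f.close()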

6. I am also a fan of Tufte. To colon or not to colon compound statement
header lines was apparently a close call for Guido. The rule requiring
colons is both a help and a hindrance.
*If* the rebinding issue were to be addressed for bare names
and the right-most name in a compound name (the only place where it is
currently an issue),
I imagine it would be by introducing a rebinding augmented
assignment operator rather than by introducing variable declarations.

This would be better, and could be considered for Python 3. But it would
also introduce a new possibility for error. And I think the better way to
go is to leave the language clean and have more edit-time checking both for
syntax and PyLint-PyChecker type checks.

Terry J. Reedy
 
