Coding style article with interesting section on white space

Discussion in 'Python' started by Nick Coghlan, Jan 29, 2005.

  1. Nick Coghlan

    Nick Coghlan Guest

    Nick Coghlan, Jan 29, 2005
    #1

  2. Nick Coghlan

    Rakesh Kumar Guest

    Thanx Nick
     
    Rakesh Kumar, Jan 29, 2005
    #2

  3. Nick Coghlan

    beliavsky Guest

    http://www.acmqueue.com/modules.php?name=Content&pa=showpage&pid=271&page=3

    The suggestions in the cited article, "How Not to Write FORTRAN in Any
    Language", are reasonable but elementary and can be followed in Fortran
    90/95/2003 as well as any other language. What infuriates me is that
    the author writes as if Fortran has not evolved since the 1960s. It
    has. To be specific, Fortran 90

    (1) allows variable names up to 31 characters long
    (2) has a free source form where
        (a) there are no rigid rules about starting code in a certain
            column
        (b) white space is significant
    (3) has a full set of control structures -- goto's are almost never
        needed

    More detailed rebuttals of the article are in the archives of the
    Fortran 90 discussion group at
    http://www.jiscmail.ac.uk/cgi-bin/webadmin?A1=ind0501&L=comp-fortran-90
    -- search for "Fortran bashing".

    Python looks more like Fortran 90 than one of the curly-brace/semicolon
    languages, and both languages have modules and array slices.

    One ought to do a little research before publishing an article.
    Apparently, many authors and editors are too lazy to do so.
     
    beliavsky, Jan 29, 2005
    #3
  4. .
    .
    .
    .... and/or ignorant or uncultured. Also, don't forget to excoriate
    the publishers and editors, too cheap and/or otherwise constrained
    to edit/fact-check/review/...
     
    Cameron Laird, Jan 29, 2005
    #4
  5. Nick Coghlan

    Nick Coghlan Guest

    For myself, I'd be more inclined to say you can write Perl in any language, but
    the fact that the author used Fortran as his own hated source of unreadable code
    is beside the point - the entire point of the article is that readability
    counts, no matter what language you're writing in :)

    And that's why the article got published in spite of the jabs at Fortran - those
    jabs served to explain the source of the author's detestation of unreadable
    code. Anyone taking such an admittedly biased opinion and using it to form an
    opinion on _current_ Fortran has problems far bigger than a single article.

    Cheers,
    Nick.
     
    Nick Coghlan, Jan 30, 2005
    #5
  6. (unwisely taking the bait...)

    If you like your language to look like this
    http://www.cs.rpi.edu/~szymansk/OOF90/bugs.html
    then more power to you.

    I prefer my languages to be portable, terse and expressive. That's why
    I like Python. If you want your language to be obscure, ill-defined and
    inconsistent across platforms, by all means go to comp.lang.fortran .

    There is no fundamental reason why a language with expressive power
    much like Python's cannot have run-time performance comparable to
    Fortran's. Unfortunately, Fortran's dominance of the relatively small
    scientific computation universe has prevented such a language from
    emerging. The solutions which interest me in the short run are 1)
    writing a code generation layer from Python to a compiled language
    (possibly F77 though I will try to get away with C) and 2) wrapping
    legacy Fortran in Python. The latter is quite regularly subverted by
    non-standard binary data structures across compilers and a pretty
    profound disinterest in interoperability by the people designing the
    Fortran standard that makes their interest look more like turf
    protection and less like an interest in the progress of science.

    In the long run, hopefully a high-performance language that has
    significant capacity for abstraction and introspection will emerge.
    People keep trying various ways to coax Python into that role. Maybe it
    will work, or maybe a fresh start is needed. Awkwardly bolting even
    more contemporary concepts onto Fortran is not going to achieve
    bringing computational science up to date.

    Python fundamentally respects the programmer. Fortran started from a
    point of respecting only the machine, (which is why Fortrans up to F77,
    having a well-defined objective, were reasonable) but now it is a
    motley collection of half-baked and confusing compromises between
    runtime performance, backward compatibility, and awkward efforts at
    keeping up with trends in computer languages. So-called "object
    oriented Fortran" makes the most baroque Java look elegant and
    expressive.

    For more see http://www.fortranstatement.com
    Language matters. You can't really write Python in any language.

    mt
     
    Michael Tobis, Jan 30, 2005
    #6
  7. Nick Coghlan

    beliavsky Guest

    Thanks for pointing out that interesting article on Fortran 90 bugs.
    How long would a comparable C++ list be? Even Python has gotchas, for
    example the difference between deep and shallow copies.
    Fortran programmers are generally happy with the portability of the
    language. A difficulty with Python portability and maintainability is
    that some needed functionality is not in the core language but in C
    extensions. For scientific computation, consider the case of Numeric
    and Numarray. I don't think Numeric binaries are available for Python
    2.4, and Numarray is not a perfect substitute, being considerably slower
    for small arrays, having slightly different functionality in some
    areas, and as recently as Nov 2004 (c.l.py thread "numarray memory
    leak") leaking memory when multiplying matrices.
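For readers who have not met the deep-versus-shallow copy gotcha mentioned in this post, here is a minimal sketch using the standard `copy` module. The nested list `grid` is only an illustrative value, not something from the thread.

```python
import copy

grid = [[1, 2], [3, 4]]

shallow = copy.copy(grid)      # new outer list, but the same inner lists
deep = copy.deepcopy(grid)     # new outer list and new inner lists

grid[0][0] = 99                # mutate an inner list through the original

print(shallow[0][0])           # the shallow copy shares the inner list
print(deep[0][0])              # the deep copy is unaffected
```

Whether this counts as a gotcha or simply as a technique worth knowing is exactly the kind of classification question taken up later in the thread.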

    The recent "Pystone Benchmark" message says that Python is only 75% as
    fast on Linux as on Windows. Fortran programs do not suffer this
    performance hit and are in this respect more portable. In theory, as
    has been mentioned, one could use a faster compiler to compile CPython
    on Linux, but AFAIK this has not yet been done.

    Nobody is stopping Python developers from working on projects like
    Psyco.
    So uninterested in interoperability is the Fortran standards committee
    that they added interoperability with C in the Fortran 2003 standard.

    What does that mean?
    > ... is why Fortrans up to F77, having a well-defined objective, were
    > reasonable
    I have found that Fortran 90/95 is better at the objective of FORmula
    TRANslation for array expressions (mostly what I need) than previous
    versions.
    This is true to some extent of any language "of a certain age",
    including C++.

    And the rebuttal at http://www.fortranstatement.com/Site/responses.html
     
    beliavsky, Jan 30, 2005
    #7
  8. no, it really only says that the Pystone benchmark is 75% as fast on Linux as on
    Windows, on the poster's hardware, using his configuration, and using different
    compilers.

    </F>
     
    Fredrik Lundh, Jan 30, 2005
    #8
  9. C++ "gotchas" spawn whole books. So, once, did C ones -- Koenig's "C
    Traps and Pitfalls" is a wonderful book; to be honest, that was only
    partly about the _language_... Koenig's advocacy of half-open loops and
    intervals is just as valid in any language, but it was still a point
    WELL worth making.

    The referenced page, in part, is simply pointing out places where
    Fortran might prove surprising to a programmer just because it works
    differently from some other language the programmer might be used to.
    For example, the very first entry just says that '.and.' does not
    short-circuit, so when you need guard behavior you should rather use
    nested IF statements. This is no bug, just a reasonable language design
    choice; anybody coming from (standard) Pascal would not be surprised;
    Ada even has two different forms ('and' doesn't short-circuit, if you
    want short-circuit you use 'and then').
    In some sense it can be a gotcha for some programmers, but it would be
    silly to count it as a "fortran bug"! Or even "wart" for that matter.
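The guard behavior described above can be sketched in Python, where `and` does short-circuit. The function names are hypothetical, and the bitwise `&` on booleans is used here only as a stand-in for a logical operator that evaluates both operands.

```python
def guarded(n, d):
    # 'and' short-circuits: n / d is never evaluated when d == 0.
    return d != 0 and n / d > 1


def unguarded(n, d):
    # '&' evaluates both operands before combining them, so the
    # division runs even when d == 0 and raises ZeroDivisionError.
    return (d != 0) & (n / d > 1)


def nested(n, d):
    # The nested-if form suggested for languages whose logical
    # operator does not short-circuit.
    if d != 0:
        if n / d > 1:
            return True
    return False
```

Calling `guarded(1, 0)` and `nested(1, 0)` both return `False`, while `unguarded(1, 0)` fails with a division error, which is the surprise a programmer used to short-circuiting would hit.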

    So, I would try to classify things in three classes:

    a. some things are important techniques which one may choose to
    highlight in the context of a given language, yet it would simply
    be silly to classify as gotchas, warts, or bugs _of that language_;

    b. some aspects of a language's behavior are surprising to people
    familiar w/other languages which behave differently, and thus are
    worth pointing out as "gotchas" though they're surely not bugs (and
    may or may not be warts);

    c. lastly, some things are irregularities within the language, or
    truly unexpected interactions among language features, or vary
    between implementations in ways most programmers won't expect;
    these can be described as warts (and maybe even bugs, meaning
    things that may well be fixed in the next version of a language).

    The advantages of half-open intervals (per Koenig's book), the fact that
    copies exist in both shallow and deep senses, or the fact that with
    pointers to pointers you need to allocate the parent pointers first (the
    last entry in the referenced page) are really about [a] -- of course if
    a language doesn't have pointers, or doesn't offer a standardized way to
    make copies, you won't notice those aspects in that language (the issue
    of half-open loops and intervals is wider...), but really these kinds of
    observations apply across broad varieties of languages.
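As a small illustration of why half-open intervals are convenient, here is a sketch using Python's `range` and slicing, both of which include the lower bound and exclude the upper one. The bounds `lo`, `mid` and `hi` are arbitrary example values.

```python
data = list(range(10))          # range(0, 10): includes 0, excludes 10

lo, mid, hi = 2, 5, 9

left = data[lo:mid]             # elements at indices 2, 3, 4
right = data[mid:hi]            # elements at indices 5, 6, 7, 8

# The length of a half-open interval is the difference of its bounds.
assert len(left) == mid - lo

# Adjacent half-open intervals join with no overlap and no gap.
assert left + right == data[lo:hi]

# An empty interval needs no special case: lower bound == upper bound.
assert data[mid:mid] == []
```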

    Point (b) will always be with us unless all languages work in exactly
    the same way;-). 'and' will either short-circuit or not (or the
    language will be more complicated to let you specify), array indices
    will start from 0 or from 1 (or the language will be more complicated to
    let you specify, etc etc), default values for arguments will be computed
    at some specified time -- compile-time, call-time, whatever -- or the
    language will be poorer (no default values, or only constant ones) or
    more complicated (to let you specify when the default gets computed).
    Etc, etc.
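Python's own choice on the default-argument question is to compute defaults once, when the `def` statement runs. The sketch below shows the consequence and the usual `None` sentinel workaround; the function names are made up for the example.

```python
def append_shared(item, bucket=[]):
    # The default list is created once, when 'def' runs, and reused
    # on every call that omits 'bucket'.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # A None sentinel defers creating the list until each call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_shared(1)
second = append_shared(2)   # same list object as 'first'

third = append_fresh(1)
fourth = append_fresh(2)    # a new list on each call
```

This is a category (b) surprise in the terms used above: well defined and documented, but unexpected for someone arriving from a language that evaluates defaults at call time.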

    Point (c) is really the crux of the matter. Generally, simple languages
    such as C or Python will have relatively few (c)-warts; very big and
    rich ones such as C++ or Perl will have many; and ones in the middle, as
    it appears to me that Fortran 90 is, will have middling amounts. I'm
    not saying that language size/complexity is the only determinant --
    there are other aspects which contribute, e.g., the need for backwards
    compatibility often mandates the presence of legacy features whose
    interaction with other features may cause (c) moments, so, a language
    which is older, has evolved a lot, and is careful to keep compatibility,
    will be more at risk of (c)-level issues. Still, size matters. It's
    just like saying that a big program is likely to have more bugs than a
    small one... even though many other factors contribute (backwards
    compatible evolution from previous versions matters here, too).

    <http://www4.ncsu.edu/~jdbrandm/Numeric-23.6.win32-py2.4.exe> ? Just
    googled and visited the first hit -- I don't currently use Windows so I
    don't know if it's still there, works well, etc.
    You're saying that using a different and better compiler cannot speed
    the execution of your Fortran program by 25% when you move it from one
    platform to another...?! This seems totally absurd to me, and yet I see
    no other way to interpret this assertion about "Fortran programs not
    suffering" -- you're looking at it as a performance _hit_ but of course
    it might just as well be construed as a performance _boost_ depending on
    the direction you're moving your programs.

    I think that upon mature consideration you will want to retract this
    assertion, and admit that it IS perfectly possible for the same Fortran
    program on the same hardware to have performance that differs by 25% or
    more depending on how good the optimizers of different compilers happen
    to be for that particular code, and therefore that, whatever point you
    thought you were making here, it's in fact totally worthless.
    We're cheapskates, so we tend to go for the free compilers -- with the
    exception of Windows, where I believe Microsoft donated many copies of
    their newest commercial compiler to Python core developers working on
    Windows (smart move on MS's part, makes their platform look better at no
    real cost to them).

    But the more compilers are in use, the LARGER the variation of
    performance I expect to see for the same code on a given box. There
    will surely be cheapskates, or underfunded programmers, in the Fortran
    world, too, using free or very cheap compilers -- or is your claim that
    anybody using Fortran MUST be so flush with money that they only ever
    use the costliest tools, and thus that Fortran should not be considered
    unless your project's budget for programming tools is Rubenesque? Do
    you think the costliest professional compilers cannot EVER find, on any
    given benchmark, some optimization worth a 25% speedup wrt the cheapest
    or free compilers...?! I really can't believe you'd claim any of this.
    Maybe you will want to clarify what, if anything, you mean here.


    Alex
     
    Alex Martelli, Jan 30, 2005
    #9
  10. Nick Coghlan

    beliavsky Guest

    Alex Martelli wrote:

    I should have Googled. I will investigate that link. At SourceForge,
    http://sourceforge.net/project/showfiles.php?group_id=1369 I see a
    Numarray but not a Numeric Windows binary for Python 2.4. The latest
    Numeric Windows binary there is for Python 2.3.
    I had in mind the Polyhedron Fortran 90 benchmarks for Windows and
    Linux on Intel x86 at
    http://www.polyhedron.co.uk/compare/linux/f90bench_p4.html and
    http://www.polyhedron.co.uk/compare/win32/f90bench_p4.html . The speed
    differences of Absoft, Intel, and Lahey between Linux and Windows for
    individual programs, not to mention the average differential across all
    programs, is much less than 25%. The differences on a single OS between
    compilers can be much larger, but that has less bearing on portability
    across OS's.

    Thanks for your earlier informative comments on languages. Sparring
    with Alex Martelli is like boxing Mike Tyson, except that one
    experiences brain enhancement rather than brain damage :).
     
    beliavsky, Jan 30, 2005
    #10
  11. Look at the Fortran compiler benchmarks here:

    http://www.polyhedron.co.uk/compare/win32/f77bench_p4.html

    for some concrete evidence to support Alex's point.

    You will see that the average performance across different benchmarks of
    different Fortran compilers on the same platform can differ by as much
    as a factor of two, and individual benchmarks by as much as a factor of
    three.

    Some of you might be surprised at how many different Fortran compilers
    are available!
     
    Andrew McLean, Jan 30, 2005
    #11
  12. So, you think that comparing a single commercial compiler for code
    generation on Windows vs Linux (which _should_ pretty obviously be
    pretty much identical) is the same thing as comparing the code
    generation of two DIFFERENT compilers, a commercial one on Windows vs a
    free one on Linux?! This stance sounds singularly weird to me.

    If on one platform you use a compiler that's only available for that
    platform (such as Microsoft's), then "portability across OS's" (in terms
    of performance, at least) can of course easily be affected. If you care
    so much, splurge for (say) the commercial compilers that Intel will be
    quite happy to sell you for both platforms -- the one for Windows is a
    plug-in replacement for Microsoft's, inside MS Visual Studio (at least,
    it used to be that way, with VS 6.0; I don't know if that's still the
    case), the one for Linux is usable in lieu of the free gcc. So, each
    should compile Python without any problem. Presumably, the optimizer
    and code generator will be essentially unchanged across platforms, as
    they are in the offerings of other vendors of commercial compilers -- it
    would seem silly for any vendor to do otherwise!

    If one has no funding to purchase commercial compilers for several
    platforms, or one doesn't care particularly about the differences in
    speed resulting from different compilers' optimizers, then, surprise
    surprise, one's programs are quite liable to have different performance
    on different platforms. Trying to imply that this has ANYTHING to do
    with the LANGUAGE the programs are coded in, as opposed to the compilers
    and expenses one is willing to incur for them, is either an extremely
    serious error of logic, if incurred in good faith, or else is an attempt
    to "score points" in a discussion, and then good faith is absent.


    Alex
     
    Alex Martelli, Jan 30, 2005
    #12
  13. The reply "C++ is even worse", while debatable either way, seems to be
    a common response from Fortran defenders. It misses the point.

    What the scientific community needs, whether they know it or not, is
    high-performance Python if that's possible, or something as close to it
    as possible. Specifically, something with serious introspective power
    and economy of expression.
    Until they try to port something...? Honestly, I can't imagine where
    anyone would get this impression.

    Now, about the terseness and expressiveness?
    I'm not happy with these, because they either make temporary arrays
    with wild abandon, or enforce an unnatural style of expression. I could
    see how they would be useful to others but they are awkward in
    long-time spatially coarse finite difference/finite volume/spectral
    calculations, which is the problem space I care about.

    As for non-coarse (finite element) integrations (where rectangular
    decompositions do not suffice) it seems to me that using Fortran is
    sheer madness, even though there are real pointers now.

    I do not suggest that Python is currently competitive with C++ or
    Fortran. I simply agree with
    http://www.fortranstatement.com that something new ought to be
    designed, that a backward compatible Fortran2003 cannot possibly be it,
    and that attention to Fortran diverts resources from the sort of
    genuine progress that ought to be possible.
    Without disagreeing with Alex Martelli's response to this, I find it
    nonsensical on other grounds.

    Performance portability has nothing to do with what I'm talking about.

    The answer betrays the attitude that programmers are there to put in
    time and expend effort, because the only resource worth considering is
    production cycles on a big machine. This attitude explains why working
    with Fortran is so unpleasant an experience for anyone who has been
    exposed to other languages.

    An alternative attitude is that the amount of human effort put into
    solving a problem is a relevant factor.

    In this view, "portability" is actually about build effort, not runtime
    performance. Perhaps the Fortran community finds this idea surprising?

    Getting a python package working usually amounts to an install command
    to the OS and an import command to the interpreter. Then you can get
    down to work. Getting a Fortran package working involves not only an
    endless dance with obscure compiler flags and library links, but then a
    set of workarounds for codes that produce expected results on one
    compiler and compile failures or even different results on another.

    The day when project cost was dominated by machine time as opposed to
    programmer time is coming to a close. Fortran is a meaningful solution
    only to the extent that programmer time is not just secondary but
    actually negligible as a cost.

    The assumption that portability is somehow about performance betrays a
    total lack of concern for the programmer as a resource, and therefore
    to the time-to-solution of any new class of scientific problem. The
    result of this habit of thought (entirely appropriate to 1955) is that
    in an environment where fortran is expected, new problems are
    interpreted as changes to old problems, and the packages become vast
    and bloated.

    Since most of these legacy codes in practice predate any concept of
    design-for-test, they are also almost certainly wrong, in the sense
    that they are unlikely to implement the mathematics they purport to
    implement. Usually they are "close enough" for some purposes, but it's
    impossible to delimit what purposes are inside or outside the range of
    application.
    They aren't helping either, for the most part.

    Psyco aside, institutions that ought to be supporting development of
    high-performance high-expressiveness scientific languages are not doing
    so.

    Institutions with a Fortran legacy base are confused between
    maintaining the existing investment and expanding it. The former is
    frequently a mistake already, but the latter is a mistake almost
    always. This mistake drives investment of effort in inefficient
    directions, where efficiency is about design and build cost-to-solution
    rather than runtime cost-of-execution.
    Er, replaced the modest C interoperability they lost in F90, right?

    The codes I care about aren't new ones. So now I have to hack into the
    code and redeclare all my arrays, right? So that they'll have a defined
    structure? And hope that doesn't break some other assumptions? Except
    that I have to await someone actually releasing an F03 compiler, right?


    Thanks. It turns out that I need a compiler, not a specification,
    unfortunately.

    Even so, I understood that there was some resistance to even this level
    of interoperability, because it breaks the Fortran no-convention
    convention. C-interop vapor-mode will require, (oh horror!) a
    specification for which bytes represent what. Now each vendor will have
    to try to patch that back in to their idiosyncratic representation.
    They'll probably get it mostly right, eventually.
    If you don't know by now, don't mess with it.

    mt
     
    Michael Tobis, Jan 30, 2005
    #13
  14. Nick Coghlan

    beliavsky Guest

    that many portable libraries like LAPACK exist. Fortran 90+ allows one
    to write even more portable code using the KIND mechanism and the
    selected_int_kind and selected_real_kind functions, which let you
    specify the ranges of basic data types.
    Fortran 90/95 is more expressive than Fortran 77 in many ways, as
    described in (for example) pages 7-8 of the essay "Numerical Recipes:
    Does This Paradigm Have a Future?", available at
    http://www.nr.com/CiP97.pdf . Quoting that paper:

    "This, for us, was the revelation of parallel programming in Fortran
    90. The use of parallel and higher-level constructions -- wholly
    independently of whether they are executed on tomorrow's parallel
    machines or today's ordinary workstations -- expresses more science per
    line of code and per programming workday. Based on our own experience,
    we think that productivity, or achievable complexity of project, is
    increased a factor of 2 or 3 in going from Fortran 77 to Fortran 90 --
    if one makes the investment of mastering Fortran 90's higher level
    constructs."
    Some early Fortran 90 compilers did this but have improved
    significantly in this respect.
    I wonder what this means.

    You have not given specifics about what "genuine progress" in a
    scientific programming language would be.

    Nonsense. Fortran was originally created to make PROGRAMMERS more
    efficient by letting them code algebraic expressions instead of machine
    code.
    "Anyone"? Since I have been exposed to Python and C++ and still find
    Fortran 90+ enjoyable, your statement is demonstrably false. Have you
    ever used a version of Fortran past F77? The free Fortran 95 compiler
    called g95 http://www.g95.org is available.

    If one writes standard-conforming code it does not. At least with
    Fortran one has multiple independent implementations of an ISO
    standardized language, about 10 each on Intel Windows or Linux, and 4
    for Mac OS X. Links are at
    http://dmoz.org/Computers/Programming/Languages/Fortran/Compilers/ . If
    one Fortran compiler leaks memory when multiplying matrices, one can
    use another. If Python Numarray does, the only alternative I know of is
    Numeric, which is no longer supported according to
    http://www.pfdubois.com/numpy/ . I have found it extremely useful to
    work with multiple compilers and compiler options. A good compiler such
    as Lahey/Fujitsu has debugging options that aid programmer productivity
    by finding bugs, both at compile and run time, at some cost in speed. I
    gave a few examples in a previous thread. The existence of such
    compilers refutes your assertion that Fortran programmers think only
    machine time is important.
     
    beliavsky, Jan 30, 2005
    #14
  15. The example shown on p 10 illustrates a shorter piece of code in f90
    than in f77, but it is not obviously more expressive or less complex.
    Arguably the f77 code is easier to write and maintain, even though it
    has more linefeeds in it, so I find the example far from compelling.

    In fact, I find the f90 example impenetrable. Specifically, what does
    array_copy(source,dest,n,nn) do? I note that n and nn are uninitialized
    at call time. Are these outputs from array_copy(), appearing, in that
    inimitable Fortran way, in the call signature?

    Here it is in Python with NumArray:

    b = sort(compress(less(v,200.) * greater(v,100.),m ))
    result = b[(len(b)+3)//4]

    I certainly think this example is competitive for whatever that's
    worth. It has the added benefit of handling the case of an empty list
    with a nice catchable IndexError exception.

    However, if this were intended to be maintainable code I would find it
    most effectively expressed something like this:

    def magthresh(vel,mag,vmin=100.,vmax=200.):
        selector = less(vel,vmax) * greater(vel,vmin)
        sortedmags = sort(compress(selector,mag))
        if len(sortedmags):
            index = (len(sortedmags) + 3) // 4
            return sortedmags[index]
        else:
            raise IndexError("No velocities in range.")
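The function above depends on numarray names (`less`, `greater`, `compress`, `sort`). For anyone who wants to run the idea without that package, here is a rough pure-Python sketch of the same selection. The index expression is copied from the post unchanged; whether it needs an offset for zero-based indexing is left open, and the sample data are invented.

```python
def magthresh(vel, mag, vmin=100.0, vmax=200.0):
    """Sort the magnitudes whose velocity lies strictly between
    vmin and vmax, then pick one by the post's index formula."""
    selected = sorted(m for v, m in zip(vel, mag) if vmin < v < vmax)
    if not selected:
        raise IndexError("No velocities in range.")
    index = (len(selected) + 3) // 4   # formula taken from the post as-is
    return selected[index]


# Invented sample data, only to exercise the function.
velocities = [50.0, 120.0, 150.0, 180.0, 250.0, 110.0]
magnitudes = [1.0, 5.0, 3.0, 4.0, 9.0, 2.0]
result = magthresh(velocities, magnitudes)
```

A generator expression replaces the mask-and-compress step, so this version trades array speed for having no dependencies.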

    mt
     
    Michael Tobis, Jan 31, 2005
    #15
  16. Nick Coghlan

    beliavsky Guest

    Ok, here are some simple examples of the greater expressiveness of
    Fortran 95 compared to F77 or C for calculations involving arrays.

    (1) To compute the sum of squares for each column of a matrix of the
    positive elements, one can write in F90 just

    isum = sum(imat**2,dim=1,mask=imat>0)

    compared to

    do j=1,ncol
       isum(j) = 0
       do i=1,nrows
          if (imat(i,j) > 0) isum(j) = isum(j) + imat(i,j)**2
       end do
    end do

    I think there is a similar Numeric Python one-liner using the sum and
    compress functions. Array operations are not revolutionary (APL had
    them in the 1960s), but they are faster to write and later read.
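For comparison, the masked column sum can also be written in plain Python without any array package. This is only a sketch on a small made-up matrix stored as a list of rows, not the Numeric one-liner the post refers to.

```python
imat = [
    [1, -2, 3],
    [-4, 5, 6],
    [7, 8, -9],
]

# zip(*imat) iterates over columns; the 'if' clause plays the role
# of the mask, keeping only the positive elements.
isum = [sum(v * v for v in column if v > 0) for column in zip(*imat)]

print(isum)
```

It is a one-liner like the F90 version, though it loops in the interpreter rather than in compiled code.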

    (2) Suppose x and y are matrices of the same size and one wants to set
    each element y(i,j) = f(x(i,j)) for some elemental (no side-effects)
    function f. In Fortran 95, one can just write

    y = f(x)

    compared to

    do j=1,ncol
       do i=1,nrow
          y(i,j) = f(x(i,j))
       end do
    end do

    The ufunc of Numeric Python and the map of basic Python offer similar
    functionality.
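A minimal sketch of the basic-Python route mentioned above, applying a side-effect-free function to every element of a matrix held as a list of rows. The function `f` here is an arbitrary stand-in chosen for the example.

```python
import math


def f(value):
    # Stand-in for the elemental function; any side-effect-free
    # function of one number would do.
    return math.sqrt(value) + 1.0


x = [[0.0, 1.0], [4.0, 9.0]]

# Nested comprehension form.
y = [[f(value) for value in row] for row in x]

# The same result using the built-in map on each row.
y_map = [list(map(f, row)) for row in x]
```

Unlike the Fortran `y = f(x)`, the shape handling is explicit here; an array package's ufuncs hide that loop in the same way the elemental function does.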

    With Fortran 95 one can code numerical algorithms involving arrays in a
    high-level manner similar to Python with Numeric/Numarray or Matlab,
    while retaining the advantages (better performance, stand-alone
    executables) and disadvantages (explicit variable declarations, no
    scripting ability) of a compiled language with static typing. That's
    all I am claiming.
     
    beliavsky, Jan 31, 2005
    #16
