Python advocacy in scientific computation


val bykoski

sturlamolden said:
Sorry I forgot the part about writing grant applications. As for
teaching students, I have thankfully not been bothered with that too
much.



Yes, and that is why I use C (that is ISO C99, not ANSI C89) instead
of Matlab for everything except trivial tasks. The design of Matlab's
language is fundamentally flawed. I once wrote a tutorial on how to
implement things like lists and trees in Matlab (using functional
programming, e.g. using functions to represent list nodes), but it's
just a toy. And as Matlab's run-time does reference counting instead
of proper garbage collection, any data structure more complex than
an array is sure to leak memory (I believe Python also suffered from
this at some point). Matlab is not useful for anything except
plotting data quickly. And as for the expensive license, I am not
sure it's worth it. I have been considering a move to Scilab for some
time, but it too carries the burden of working with a flawed
language.

A quick addition to Robert's very reasonable response to you. My point
is that to *trust* simulation *results* (no matter how fast or slowly
you obtained them) you have to explore and manage the "physics" or
"biology" of your code. That's where Python's readability, flexibility,
and dynamism (including on-the-fly model building/testing/correction), as
well as its model introspection and exploration capabilities, are of
critical importance and sometimes the pointer to a missing link. It does
not hurt to remember that the original idea (by S. Ulam) of a computer was
the idea of an *experimentation environment* (including sampling). It
does not look like Matlab's strongest point is feedback-driven
experimentation. Or am I missing something about ISO C99?
Val Bykoski
 

David M. Cooke

Robert Kern said:
I used to think like that up until two seconds before I entered this gem:

$ rm `find . -name "*.pyc"`

Okay, I didn't type it exactly like that; I was missing one character. I'll let
you guess which.

I did that once. I ended up having to update decompyle to run with
Python 2.4 :) Lost comments and stuff, but the code came out great.
 

bearophileHUGS

Michael Tobis:
Python plays so well with others. For many applications, NumPy is fine.
Otherwise write your own C or C++ or F77; building the Python bindings
is trivial.

On Windows I have found that creating such bindings is very, very
difficult... I have succeeded only partially, with C++, and I've had to
compile Python with MinGW (the compilation succeeded at about 95%). It's
not easy and surely not trivial (for me).

Bye,
bearophile
 

Andy Salnikov

Steve Holden said:
Troll. You think this is going away because *you* don't like it? Am I to
presume that you don't bother to indent your C code according to its
nested block structure? If you *do* indent your C code, perhaps you can
explain the additional benefits of the braces?

Calling people trolls for expressing their honest opinion is probably
the worst thing you can do. Try to be more constructive if you want to
help people, especially with such a heated topic as indentation.

My own opinion on this: I think indentation is probably the single
biggest drawback preventing wider Python acceptance. Indentation
makes all kinds of inlined code extremely clumsy or practically impossible
in Python. OK, I'll stop here, time to be called a troll myself :(

Andy.
 

Magnus Lycka

Terry said:
I believe it is Guido's current view, perhaps Google's collective view, and
a general *nix view that such increases can just as well come thru parallel
processes. I believe one can run separate Python processes on separate
cores just as well as one can run separate processes on separate chips or
separate machines. Your view has also been presented and discussed on the
pydev list. (But I am not one for thread versus process debate.)

That's ok for me. I usually have lots of different things
happening on the same computer, but for someone who writes
an application and wants to make his particular program faster,
there is not a lot of support for building simple multi-process
systems in Python. While multi-threading is nasty, it makes
it possible to perform tasks in the same program in parallel.

I could well imagine something similar to Twisted, where the
different tasks handled by the event loop were dispatched to
parallel execution on different cores/CPUs by some clever
mechanism under the hood.

If the typical CPU X years from now is a 5GHz processor with
16 cores, we probably want a single Python "program" to be able
to use more than one core for CPU intensive tasks.

Queue.Queue was added to help people write correct threaded programs.

What I'm trying to say is that multi-threaded programming
(in all languages) is difficult. If *other* languages manage
to make it less difficult than today, they will achieve a
convenient performance boost that Python can't compete with
when the GIL prevents parallel execution of Python code.

- For many years, CPU performance has increased year after
year through higher and higher processor clock speeds. The
number of instructions through a single processing pipeline
per second has increased. This simple kind of speed increase
is flattening out. The first 1GHz CPUs from Intel appeared
about five years ago. At that time, CPU speed still doubled
every 2 years. With that pace, we would have had 6GHz CPUs
now. We don't! Perhaps someone will make some new invention
and the race will be on again...but it's not the case right
now.

- The hardware trend right now is to make CPUs allow more
parallel execution. Today, dual-core CPUs are becoming
common, and in a few years there will be many more cores
on each CPU.

- While most computers have a number of processes running in
parallel, there is often one process / application that is
performance critical on typical computers.

- To utilize the performance in these multi-core processors,
we need to use multi-threading, multiple processes or some
other technology that executes code in parallel.

- I think languages and application design patterns will evolve
to better support this parallel execution. I guess it's only
a few languages such as Erlang that support it well today.

- If Python isn't to get far behind the competition, it has
to manage this shift in how to utilize processor power. I
don't know if this will happen through core language changes,
or via some convenient modules that make fork/spawn/whatever
and IPC much more convenient, or if there is something
entirely different waiting around the corner. Something is
needed I think.
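
As a rough sketch of the "multiple processes plus IPC" route from the last point (this is an illustration, not something proposed in the thread): the standard `subprocess` module can start several Python worker processes and collect their results over pipes. Each worker is its own interpreter with its own GIL. The worker script, the name `parallel_sum_of_squares`, and the chunking scheme are all made up for the example, and it assumes a current Python 3.

```python
import subprocess
import sys

# Hypothetical worker: sums i*i over [lo, hi) and prints the partial result.
WORKER = (
    "import sys\n"
    "lo, hi = int(sys.argv[1]), int(sys.argv[2])\n"
    "print(sum(i * i for i in range(lo, hi)))\n"
)

def parallel_sum_of_squares(n, workers=4):
    """Split range(n) into chunks and sum the squares in separate processes."""
    step = max(1, -(-n // workers))  # ceiling division, at least 1
    # Start every worker before reading any result so they run concurrently.
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", WORKER, str(lo), str(min(lo + step, n))],
            stdout=subprocess.PIPE,
            text=True,
        )
        for lo in range(0, n, step)
    ]
    # Each worker is a separate interpreter, so one GIL does not block another.
    return sum(int(proc.communicate()[0]) for proc in procs)

total = parallel_sum_of_squares(1000)
print(total)
```

Starting a fresh interpreter per chunk is expensive, so this only pays off when each chunk does a substantial amount of work.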
 

Michael Tobis

Indentation
makes all kinds of inlined code extremely clumsy or practically impossible
in Python.

This is the only sensible argument against the indentation thing I've
heard. Python squirms about being inlined in a presentation template.
Making a direct competitor to PHP in pure Python is problematic.

While there are workarounds which are not hard to envision, it seems
like the right answer is not to inline small fragments of Python code
in HTML, which is probably the wrong approach for any serious work
anyway. This problem does, however, seem to interfere with adoption by
beginning web programmers, who may conceivably end up in PHP or perhaps
Perl Mason out of an ill-considered expedience.

Why this should matter in this discussion, about scientific
programming, escapes me though.

When you say "all kinds" of inlined code, do you have any other
examples besides HTML?

mt
 

Andy Salnikov

Michael Tobis said:
This is the only sensible argument against the indentation thing I've
heard. Python squirms about being inlined in a presentation template.
Making a direct competitor to PHP in pure Python is problematic.

While there are workarounds which are not hard to envision, it seems
like the right answer is not to inline small fragments of Python code
in HTML, which is probably the wrong approach for any serious work
anyway. This problem does, however, seem to interfere with adoption by
beginning web programmers, who may conceivably end up in PHP or perhaps
Perl Mason out of an ill-considered expedience.

Why this should matter in this discussion, about scientific
programming, escapes me though.

When you say "all kinds" of inlined code, do you have any other
examples besides HTML?

Makefiles are one example. A shell script containing snippet(s) of
Python code is another one.

At one time I also tried to make a simple "configuration file"
engine based on Python for a big Framework used in one physics lab.
Idea was to have a Python extension for that C++ framework and
to configure the Framework from Python code, like:

# Module means C++ Framework module, not Python

Module1.param1 = "a string"
Module2.paramX = [ 1, 2, 3 ]
# etc., with all Python niceties.

People who were using this Framework were all hard-core physicists,
some of them knew Fortran, many were exposed to C++. There were
a few other "languages", some of them home-grown, used for different
tasks, but none of these languages ever placed so much
significance on whitespace. There were some big surprises for
people when they discovered they couldn't arbitrarily indent pieces of
the above configuration files because it is all Python code. Add
the spaces/tabs controversy if that is not yet enough to confuse
the poor physicist fellows :) I think that config file project was killed
later in favor of a less restrictive format (I left the lab before that,
can't say for sure.)

Andy.
 

Steve Holden

Andy said:
This is the only sensible argument against the indentation thing I've
heard. Python squirms about being inlined in a presentation template.
Making a direct competitor to PHP in pure Python is problematic.

While there are workarounds which are not hard to envision, it seems
like the right answer is not to inline small fragments of Python code
in HTML, which is probably the wrong approach for any serious work
anyway. This problem does, however, seem to interfere with adoption by
beginning web programmers, who may conceivably end up in PHP or perhaps
Perl Mason out of an ill-considered expedience.

Why this should matter in this discussion, about scientific
programming, escapes me though.

When you say "all kinds" of inlined code, do you have any other
examples besides HTML?

Makefiles are one example. A shell script containing snippet(s) of
Python code is another one.

At one time I also tried to make a simple "configuration file"
engine based on Python for a big Framework used in one physics lab.
Idea was to have a Python extension for that C++ framework and
to configure the Framework from Python code, like:

# Module means C++ Framework module, not Python

Module1.param1 = "a string"
Module2.paramX = [ 1, 2, 3 ]
# etc., with all Python niceties.

People who were using this Framework were all hard-core physicists,
some of them knew Fortran, many were exposed to C++. There were
a few other "languages", some of them home-grown, used for different
tasks, but none of these languages ever placed so much
significance on whitespace. There were some big surprises for
people when they discovered they couldn't arbitrarily indent pieces of
the above configuration files because it is all Python code. Add
the spaces/tabs controversy if that is not yet enough to confuse
the poor physicist fellows :) I think that config file project was killed
later in favor of a less restrictive format (I left the lab before that,
can't say for sure.)

I just hope this remains a "someone made a poor choice of configuration
language and trained the users inadequately" story, and does not
transmogrify into a "Python is bad" story.

You mention makefiles and shell scripts as contexts unsympathetic to
Python's indentation requirements, but frankly you don't see much code
in any language except shell inlined in these contexts.

Add the makefile's requirement that significant leading whitespace be
tabs and not spaces, and you have a recipe for disaster inlining any
language.

regards
Steve
 

Michael Tobis

I think I agree with Steve here.

I suspect you should either have sufficiently trained your users in
Python, or have limited them to one-line statements which you could
then strip of leading whitespace before passing them to Python, or even
offered the alternative of one or the other. This would not have been
much extra work.
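
A minimal sketch of the second option (one-line statements, stripped of leading whitespace before being handed to Python). The `Module` class, `load_config`, and the sample text are hypothetical stand-ins; the real framework's extension objects would take the place of the `Module` instances.

```python
class Module:
    """Hypothetical stand-in for a C++ framework module exposed to Python."""

def load_config(text, namespace):
    """Execute a config file one line at a time, ignoring indentation."""
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()  # leading whitespace no longer matters
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment-only lines
        # Compile each line separately so errors report the config line number.
        exec(compile(line, "<config line %d>" % lineno, "exec"), namespace)
    return namespace

config_text = """
    # Module means C++ Framework module, not Python
        Module1.param1 = "a string"
  Module2.paramX = [ 1, 2, 3 ]
"""

namespace = load_config(config_text, {"Module1": Module(), "Module2": Module()})
print(namespace["Module1"].param1, namespace["Module2"].paramX)
```

The price is that multi-line constructs (loops, conditionals, lists continued over several lines) are no longer available in the config file, which is the trade-off being suggested.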

As for shell scripts generating Python code, I am not sure what you
were trying to do, but if you're going that far why not just replace
the shell script with a python script altogether?

os.system() is your friend.

I also agree with Steve that I can't see what this has to do with
makefiles. (But then I think "make" is a thoroughly bad idea in the
first place, and think os.system() is my friend.)

mt
 

Andy Salnikov

Steve Holden said:
Andy said:
When you say "all kinds" of inlined code, do you have any other
examples besides HTML?

Makefiles are one example. A shell script containing snippet(s) of
Python code is another one.

At one time I also tried to make a simple "configuration file"
engine based on Python for a big Framework used in one physics lab.
Idea was to have a Python extension for that C++ framework and
to configure the Framework from Python code, like:

# Module means C++ Framework module, not Python

Module1.param1 = "a string"
Module2.paramX = [ 1, 2, 3 ]
# etc., with all Python niceties.

People who were using this Framework were all hard-core physicists,
some of them knew Fortran, many were exposed to C++. There were
a few other "languages", some of them home-grown, used for different
tasks, but none of these languages ever placed so much
significance on whitespace. There were some big surprises for
people when they discovered they couldn't arbitrarily indent pieces of
the above configuration files because it is all Python code. Add
the spaces/tabs controversy if that is not yet enough to confuse
the poor physicist fellows :) I think that config file project was killed
later in favor of a less restrictive format (I left the lab before that,
can't say for sure.)

I just hope this remains a "someone made a poor choice of configuration
language and trained the users inadequately" story, and does not
transmogrify into a "Python is bad" story.

It does not, and I did not say it's "bad". But people do perceive it
as at least a very weird kind of language in modern times of all the
"curly brace languages".

You mention makefiles and shell scripts as contexts unsympathetic to
Python's indentation requirements, but frankly you don't see much code in
any language except shell inlined in these contexts.

Shell's strength is in process spawning/management and input/output
redirection. Python is rather weak in that area, but OTOH Python is
strong in processing highly structured and numeric data, where shells
are really weak. I saw lots of awk or sed "code" embedded in scripts,
so your claim that nothing except shell is being inlined does not look
right to me.

Add the makefile's requirement that significant leading whitespace be
tabs and not spaces, and you have a recipe for disaster inlining any
language.

I saw makefiles with thousands of lines of Perl code in them. I agree this
(Perl) is a disaster, but it would probably be better if it were Python code
instead.

Andy.
 

Andy Salnikov

Michael Tobis said:
I think I agree with Steve here.

I suspect you should either have sufficiently trained your users in
Python, or have limited them to one-line statements which you could
then strip of leading whitespace before passing them to Python, or even
offered the alternative of one or the other. This would not have been
much extra work.

As for shell scripts generating Python code, I am not sure what you
were trying to do, but if you're going that far why not just replace
the shell script with a python script altogether?

os.system() is your friend.

I also agree with Steve that I can't see what this has to do with
makefiles. (But then I think "make" is a thoroughly bad idea in the
first place, and think os.system() is my friend.)

mt

Actually os.system() is a rather poor replacement for the shell's
capabilities, and it's _very_ low level; it's really C-level code
wrapped in Python syntax. Anyway, to do something useful you need
to use all the popen() stuff, and this is indeed infinitely more complex
than the easy shell syntax.
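
For comparison, a minimal sketch of what a two-stage shell pipeline (`producer | consumer`) looks like when wired up by hand with `subprocess.Popen`. Both stages here are tiny Python one-liners run through the current interpreter, purely so the example is self-contained; they are placeholders for whatever commands the pipeline would really run, and the code assumes a current Python 3.

```python
import subprocess
import sys

# Rough equivalent of the shell pipeline:  producer | consumer
producer = subprocess.Popen(
    [sys.executable, "-c", "print('b'); print('a'); print('c')"],
    stdout=subprocess.PIPE,
)
consumer = subprocess.Popen(
    [sys.executable, "-c", "import sys; sys.stdout.writelines(sorted(sys.stdin))"],
    stdin=producer.stdout,   # connect the two processes with a pipe
    stdout=subprocess.PIPE,
    text=True,
)
producer.stdout.close()  # the parent no longer needs its copy of the pipe
output, _ = consumer.communicate()
producer.wait()

sorted_lines = output.split()
print(sorted_lines)
```

That is a dozen lines for what the shell writes with a single `|`, which is roughly the complaint above.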

Andy.
 

Greg Ewing

Andy said:
I saw lots of awk or sed "code" embedded in scripts

In my experience, embedding any of make/sh/awk/sed in
any of the others is a nightmare of singlequote/
doublequote/backslash juggling that makes a few
tab/space problems in Python pale by comparison.
 

Greg Ewing

Andy said:
Actually os.system() is a rather poor replacement for the shell's
capabilities, and it's _very_ low level; it's really C-level code
wrapped in Python syntax.

Since os.system() spawns a shell to execute the command,
it's theoretically capable of anything that the shell
can do. It's somewhat inelegant having to concatenate
all the arguments into a string, though.

I gather there's a new subprocess management module
coming that's designed to clean up the mess surrounding
all the popen() variants. Hopefully it will make this
sort of thing a lot easier.
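
A small sketch of the difference, assuming the `subprocess` module as it exists in current Python 3 (`subprocess.run` is a later addition to that module): arguments are passed as a list, so nothing has to be concatenated into a string or quoted for a shell. The child command is just the Python interpreter echoing its first argument, chosen only to keep the example self-contained.

```python
import subprocess
import sys

# An argument that a shell would split or interpret if pasted into a command string.
tricky = "two words; $HOME"

# Each list element reaches the child as exactly one argument; no shell is involved.
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", tricky],
    capture_output=True,
    text=True,
    check=True,  # raise CalledProcessError on a non-zero exit status
)
echoed = result.stdout.rstrip("\r\n")
print(echoed)
```

With os.system() the same call would need the argument quoted by hand to survive the shell.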
 

Greg Ewing

Peter said:
This is hard to understand for an outsider. If you pass an int, a float,
a string or any other "atomic" object to a function you have "pass by
value" semantics. If you put a compound object like a list or a dictionary
or any other object that acts as an editable data container you can return
modified *contents* (list elements etc.) to the caller, exactly like in
Java and different from C/C++.

There's really no difference here -- when you pass an
int, you're passing a pointer to an int object, just
the same as when you pass a list, you're passing a
pointer to a list object. It's just that Python
doesn't provide any operations for changing the
contents of an int object, so it's hard to see
the difference.

The similarity is brought out by the following
example:
>>> def a(x):
...     x = 42
...
>>> def b(x):
...     x = [42]
...
>>> y = 3
>>> a(y)
>>> print y
3
>>> y = [3]
>>> b(y)
>>> print y
[3]

What this shows is that assignment to the parameter
*name* never affects anything outside the function,
regardless of whether the object passed in is mutable
or immutable.
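
The other half of the picture, as a short sketch in current Python 3 syntax (the function names `rebind` and `mutate` are just for illustration): rebinding the parameter name is invisible to the caller, while changing the *contents* of the object passed in is visible, because caller and function share the same object.

```python
def rebind(items):
    # Assigning to the parameter name only changes the local binding.
    items = [42]

def mutate(items):
    # Modifying the object itself is visible through every reference to it.
    items.append(42)

y = [3]
rebind(y)
after_rebind = list(y)   # snapshot: still [3]
mutate(y)
after_mutate = list(y)   # snapshot: now [3, 42]
print(after_rebind, after_mutate)
```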

It's best to avoid using terms like "by reference" when
talking about Python parameter passing, because it's
hard to tell whether the person you're talking to
understands the same thing by them. But if you
insist, the correct description in Algol terms is
that Python passes pointers to objects by value.
 

Dennis Lee Bieber

I gather there's a new subprocess management module
coming that's designed to clean up the mess surrounding

"... coming..." Pardon? I thought it was already standard in 2.4,
and downloadable for 2.3
 

Michael McNeil Forbes

Robert Kern said:
sturlamolden wrote: .... ....
This is one thing that a lot of people seem to get wrong: version
control is not a burden on software development. It is a great
enabler of software development. It helps you get your work done
faster and easier even if you are a development team of one. You can
delete code (intentionally!) because it's no longer used in your
code, but you won't lose it. You can always look at your history and
get it again. You can make sweeping changes to your code, and if
that experiment fails, you can go back to what was working
before. Now you can do this by making copies of your code, but
that's annoying, clumsy, and more effort than it's worth. Version
control makes the process easier and lets you do more interesting
things.

I would go so far as to say that version control enables the
application of the scientific method to software development. When
you are in the lab, do you say to yourself, "Nah, I won't write anything
in my lab notebook. If the experiment works at the end of the day,
only that result matters"?
....
A slightly off topic note:

I find that version control (VC) has many advantages for
scientific research (I am a physicist).

1) For software as Robert mentions I find it indispensable.
2) Keeping track of changes to papers (as long as they are plain text
like LaTeX). This is especially useful for collaborations: using
the diff tools one can immediately see any changes a coauthor may
have made.

(I even use branching: maintaining one branch for the journal
submission which typically has space restrictions, and another for
preprint archives which may contain more information.)
3) Using VC allows you to easily bring another computer up to date
with your current work. If I go to a long workshop and use local
computing resources, I simply checkout my current projects and I
can work locally. When I am done, I check everything back in and
when I get home, I can sync my local files.
-------

Another aspect of Python I really appreciate is the unit testing
facilities. The doctest, unittest, and test modules make it easy to
include thorough tests: crucial for making sure that you can trust the
results of your programs. Organizing these tests in MATLAB and with
other languages was such a pain that I would often be tempted to omit
the unit tests and just run a few simulations, finding errors on the
fly.

Now I almost always write the unit tests along with, or sometimes
before, the code.
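
A minimal sketch of what such a test can look like with the standard `unittest` module, written for current Python 3. The function `kinetic_energy` and the test values are invented for the example; any small numerical routine would do.

```python
import unittest

def kinetic_energy(mass, velocity):
    """Return the kinetic energy 0.5 * m * v**2."""
    return 0.5 * mass * velocity ** 2

class KineticEnergyTest(unittest.TestCase):
    def test_known_value(self):
        # 0.5 * 2.0 * 3.0**2 == 9.0
        self.assertAlmostEqual(kinetic_energy(2.0, 3.0), 9.0)

    def test_zero_mass(self):
        self.assertEqual(kinetic_energy(0.0, 10.0), 0.0)

# Build and run the suite explicitly so this also works outside a test runner.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(KineticEnergyTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The same checks could be written as `>>>` examples in the docstring and run with doctest instead.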

Michael.
 
