why python is slower than java?

Tim Roberts

Maurice LING said:
I am wondering about the impact of IBM's decision, in the late 1960s,
that base memory should not exceed 64kB...

What??? If you are referring to the original IBM PC, which did ship with
64k base memory, that was 1980, not 1960. You could certainly order one
with more than 64kB.

In the late 1960s, IBM was worrying about the System/360. It had a lot
more than 64kB of base memory.
Are the codebases of Python 1.5.2 and Java 1.1 totally replaced and
deprecated?

For new development, yes, Python 1.5.2 has been totally replaced. There
are legacy applications running in Python 1.5.2 that aren't worth the
trouble to upgrade.
The Lisp compiler was the first compiler to be created (according to the
Red-Dragon book, I think) and almost all others were created by
bootstrapping from the LISP compiler.

That's just silly. It is true that LISP was one of the pioneers of the
compiled languages, but other compilers were not written in LISP. Almost
without exception, compilers were all written in assembly language until
Pascal came around.
 
Maurice LING

so as a result, I ask questions trying to describe the abstract case of
what I'm trying to do so that I don't get distracted down a rathole of
defending the application. Needless to say, I suspect others are doing
the same thing and it's really really hard to describe the abstract case
in a way that's clear.
It is indeed tough to describe a situation in many cases. This may be
for three reasons. Firstly, it can be extremely long and tedious to
describe the rationale for a decision or a situation. Secondly, during
idea exploration, the code is not even written, so there is nothing to
present in newsgroups because it simply does not exist yet.
Thirdly, it may be illegal to post any code without prior
clearance.

Even in academic institutions, because there are many industry sponsored
projects, it is common these days that examiners of theses need to sign
an NDA beforehand...

maurice
 
Maurice LING

3) Links to threads that answer common questions
Yes, there is an FAQ and yet people still ask why Python gets division
wrong. But I've often seen people dig up links to threads for questions
perhaps not so common to be written up in an FAQ, but perhaps common
enough to merit linking on a wiki page. I'm not under the illusion that
it would solve the problem on its own, but it might well head some
repeat questions off. For those it doesn't, it would give an easy way to
answer.
This may be useful, but it will certainly bring us a new set of questions
then, in the form of "does this <whatever information> still hold for
<some latest version of python>?" Although computer science stems from
mathematics, it has lost an important aspect of mathematics, that is,
"once proven true is forever true."

maurice
 
Maurice LING

What??? If you are referring to the original IBM PC, which did ship with
64k base memory, that was 1980, not 1960. You could certainly order one
with more than 64kB.

In the late 1960s, IBM was worrying about the System/360. It had a lot
more than 64kB of base memory.
I was thinking about the part whereby a program is allocated 64k memory
and other allocations are heap allocations, which need pointers. I
vaguely recall that the Turbo Pascal manual mentioned that you cannot
statically allocate more than 64k of variable memory, which is why
pointers are needed to circumvent it. And there is something called an
extended memory manager (EMM386.com or something) in DOS 5 that is needed
to address up to 640kB of memory or something...

DISCLAIMER: before I am flamed again, I wish to say that these are from
vague memories of more than a decade ago. I may be COMPLETELY WRONG. I
was then still in my early teens trying to get my games running......

For new development, yes, Python 1.5.2 has been totally replaced. There
are legacy applications running in Python 1.5.2 that aren't worth the
trouble to upgrade.




That's just silly. It is true that LISP was one of the pioneers of the
compiled languages, but other compilers were not written in LISP. Almost
without exception, compilers were all written in assembly language until
Pascal came around.

I'm wrong again. I suppose what I am trying to suggest is that design
decisions made in the past may have a longer effect than what we
consciously think. At least this is the impression I get while reading
James Gosling's argument that Java should be object-oriented from day 1
(and not added onto the language, like in C++) in The Java Programming
Environment: A white paper.

maurice
 
Brian van den Broek

Maurice LING said unto the world upon 2004-11-07 01:04:
This may be useful, but it will certainly bring us a new set of questions
then, in the form of "does this <whatever information> still hold for
<some latest version of python>?" Although computer science stems from
mathematics, it has lost an important aspect of mathematics, that is,
"once proven true is forever true."

maurice

Dang.

I already suspected that only a minority would check such a
resource. On reflection, I think you are likely correct that many of that
minority would then end up asking the sort of question you indicate.
That *would* be better, but probably not enough so as to make the effort
of maintaining the resource seem worth the candle. :-(

Of the 3 parts I posted about, this one was the one I am least able to
do a draft of anyway. I think I will take a go at the other two, though.

Best,

Brian vdB
 
Alex Martelli

Tim Roberts said:
It is fabulous that you are able to enumerate your list of requirements so
completely. I'm quite serious; many people embark on even complicated
projects without a clear understanding of the tradeoffs they will
encounter.

However, given that set of needs, why would you mess with an "exotic"
language at all? Why wouldn't you just write straight to the metal in C++
or C?

Perhaps because point 3 has _never_ been true as stated? Nobody would
really be happy to take 40 years more to develop a program in order to
shave a second off each hour-long run (well, maybe somebody _has_ taken
that choice 30 years ago, and we'll see their program in 10 more years).
There is always, at some point, a tradeoff - the only issue is where.
Otherwise, "to the metal" would be assembly or microcode, btw.

If a program is anyway spending its time waiting for disk, network, or
other I/O, the benefits of compressing the already-small CPU part are
tiny, by Amdahl's Law. On the other hand, ease of experimentation with
different program architectures -- for example trying to overlap some of
the I/O waits rather than serialize them -- could still help.
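
The Amdahl's Law argument above can be put into numbers. The following is a minimal sketch, not part of the original post; the function name and the 5% / 10x figures are hypothetical, chosen only to illustrate an I/O-bound program.

```python
def amdahl_speedup(cpu_fraction: float, cpu_speedup: float) -> float:
    """Overall speedup when only the CPU-bound fraction of the runtime gets faster."""
    if not 0.0 <= cpu_fraction <= 1.0:
        raise ValueError("cpu_fraction must be between 0 and 1")
    if cpu_speedup <= 0:
        raise ValueError("cpu_speedup must be positive")
    io_fraction = 1.0 - cpu_fraction  # time spent waiting; unaffected by faster code
    return 1.0 / (io_fraction + cpu_fraction / cpu_speedup)


if __name__ == "__main__":
    # Hypothetical program: 5% of the time on the CPU, 95% waiting on I/O,
    # with the CPU part made ten times faster.
    print(f"{amdahl_speedup(0.05, 10.0):.3f}x overall")
```

With those assumed figures, a tenfold faster CPU part buys less than a 5% improvement overall, which is the sense in which the benefit is "tiny".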


Alex
 
Alex Martelli

Roger Binns said:
I/O is one area where Java is *very* different than other environments.
Java emphasises the ability to select at runtime (typically from

Python's dynamism at runtime is definitely one of its fortes.
Note how in Python, files do read/write whereas sockets do
send/recv.

Sure, since the semantics are different. Which is why we have the
makefile method of socket objects: for those occasions in which you want
signature-polymorphism at the expense of the overhead of adaptation.
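
As a rough illustration of the makefile point, here is a small sketch in present-day Python 3 (the thread predates it). The helper name is hypothetical, and a local socketpair stands in for a real network connection.

```python
import socket


def read_socket_like_a_file(payload: bytes) -> bytes:
    """Send bytes with the socket API, then read them back through a file-like wrapper."""
    sender, receiver = socket.socketpair()
    try:
        sender.sendall(payload)            # socket-style call: send/recv family
        sender.shutdown(socket.SHUT_WR)    # signal end-of-stream to the reader
        with receiver.makefile("rb") as stream:
            return stream.read()           # file-style call: read/write family
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(read_socket_like_a_file(b"hello over a socket\n"))
```

Code written against the file interface (anything that only calls read or iterates over lines) can then be handed the wrapped socket unchanged, at the cost of the extra buffering layer.
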
But you are comparing Apples to Oranges. Java programs are written
in certain ways with various emphases (flexibility, interfaces and
factories). Python programs emphasise other things (generators,
typing). Exceptions are expensive in Java and not used much. They
are "cheap" in Python and used frequently.

Generators are a reasonably recent addition to Python, and I have no
idea what you mean by stating that Python emphasizes typing more than
Java does -- I typically see far more mention of the types of everything
in Java code than in Python, where the usual duck-typing generally
obviates the need. Python's flexibility is definitely one of its
fortes, and while duck typing obviates much of the need for explicit
interfaces, nevertheless they're getting popular in large Python
frameworks. And factory-based design patterns are everywhere in Python,
of course; indeed, it's in Java that you see lots of 'new ThisClass'
constructs which build an instance of some hardwired concrete class --
in Python, instantiation is generally by calling, which makes it
trivially easy to arrange for a function, rather than a class, to be
called, getting factory-effect. Do not underestimate the flexibility
you get by classes and functions being first-class objects, passed as
arguments as easily as any other, ready to call for instantiation...
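
A minimal sketch of the "instantiation is by calling" point, with hypothetical class and function names that do not come from the thread: the caller receives something callable and cannot tell, and need not care, whether it is a class or a factory function.

```python
class PlainReport:
    def __init__(self, title):
        self.title = title

    def render(self):
        return self.title


class LoudReport(PlainReport):
    def render(self):
        return self.title.upper() + "!"


def choose_report(title):
    """A plain function acting as a factory: it picks the concrete class at runtime."""
    if title.endswith("urgent"):
        return LoudReport(title)
    return PlainReport(title)


def publish(make_report, title):
    # make_report may be a class or a function; both are simply called.
    return make_report(title).render()


if __name__ == "__main__":
    print(publish(PlainReport, "disk ok"))
    print(publish(choose_report, "disk urgent"))
```

Swapping a class for a function here needs no change in publish, whereas a hardwired 'new ThisClass' would have to be edited at every call site.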

In other words, I do not consider your observations to be at all well
founded; the one about exceptions is rather inapplicable to the example
codes posted so far. If, as the OP claimed, Python is slower than Java
for disk-intensive programs, this should be easy for him to show, and I
have not seen it shown yet.

Arguably Python's default is reading a line at a time, and it

Not for the somefile.read method -- read doesn't do lines. You may be
thinking of iter(somefile).next instead, and I'm not sure what the Java
equivalent of _that_ one is.
is a bad default in some circumstances (large files), just
as the Java code was a bad way of doing anything but small
files.

Python's default makes it trivially easy to read most files in a single
gulp, so it's appropriate in many cases; Java's makes it hard and slow
to read ANY file, so it's never appropriate.
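
The difference between the read method and line iteration can be sketched as follows. This uses present-day Python 3 syntax (next(iter(f)) rather than iter(somefile).next), and the temporary file and helper name are hypothetical.

```python
import os
import tempfile


def read_three_ways():
    """Write a small temporary file, then read it whole, by first line, and line by line."""
    with tempfile.NamedTemporaryFile("w", delete=False, suffix=".txt") as tmp:
        tmp.write("first\nsecond\nthird\n")
        path = tmp.name
    try:
        with open(path) as f:
            whole = f.read()                 # one gulp: the entire file as one string
        with open(path) as f:
            first_line = next(iter(f))       # iteration yields one line at a time
        with open(path) as f:
            lines = [line.rstrip("\n") for line in f]
    finally:
        os.remove(path)
    return whole, first_line, lines


if __name__ == "__main__":
    print(read_three_ways())
```
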
If the language code is the same, then that claim boils down to
the Java Native Interface vs the Python C API. In the case of
Java, I can see the claim having some relevance in multi-threaded
code since Java doesn't have the GIL.

Python does, but drops it during blocking I/O operations so that the
relevance should be just about the same in both cases.

Your timing included the overhead of starting up and shutting down
both environments, making the rest of the measure less than
interesting.

If the OP intended their claim to apply only to long-running programs
where the difference in environment startup/shutdown gets fully
amortized, they *MIGHT* have deigned to mention the fact. I have seen
no such claim yet, nor as yet ANY benchmark posted that purports to
prove anything related to the original claim.

I did observe (at some point along the substantial chain of small
benchmarks I and some other posters exchanged) that the 4:1 ratio in
runtime in favour of Python exactly matched the 4:1 ratio in pagefaults,
again in favour of Python, btw. I guess the startup/shutdown costs can
be amortized by simply looping over the filecopy operation N times.
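
One way to amortize startup and shutdown, sketched under assumptions: the copy loop is timed inside an already-running interpreter, so process startup is excluded entirely. The function name, file size, and repeat count are hypothetical, and this is not one of the benchmarks exchanged in the thread.

```python
import os
import shutil
import tempfile
import time


def time_copies(size_bytes=1_000_000, repeats=5):
    """Return (average seconds per copy, size of the copied file) over several copies."""
    workdir = tempfile.mkdtemp()
    src = os.path.join(workdir, "src.bin")
    dst = os.path.join(workdir, "dst.bin")
    try:
        with open(src, "wb") as f:
            f.write(b"\0" * size_bytes)
        start = time.perf_counter()
        for _ in range(repeats):           # repeat so per-run noise averages out
            shutil.copyfile(src, dst)
        elapsed = time.perf_counter() - start
        copied = os.path.getsize(dst)
    finally:
        shutil.rmtree(workdir)
    return elapsed / repeats, copied


if __name__ == "__main__":
    per_copy, copied = time_copies()
    print(f"{per_copy:.6f} s per copy of {copied} bytes")
```

A file this small will mostly be served from the operating system's cache, so a genuinely disk-bound comparison would need data larger than available memory.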

The idea isn't to emphasise the open source side, but rather so
that anyone can see for themselves how it was all put together.
If I have a business critical app, and claim it is written in
Python but no one can see the insides then they can't really
know too much. The biggest thing is that they can't tell
if they could write code like that (or even how much was written)
to produce an app of similar functionality and complexity.

However, firms that choose not to release their business critical
applications as open source are likely to require at the very least a
non-disclosure agreement before they show you those sources, making it
impractical to use those sources to meet your wishes.

Note that even GPL would be no use here: if you do not _distribute_ your
programs, but keep them for in-house use, you keep the option to not
show anybody those programs' sources even when GPL applies. Therefore,
the problem applies equally to all languages, even a hypothetically
GPL'd one (not that I know of any GPL-covered language in common use for
writing business critical apps).

As for your latter sentence, I've never met a programmer whose default
assumption was that they would NOT be able to write code just as good as
most anybody else's.


Alex
 
Alex Martelli

Maurice LING said:
1. I've asked a question about which I may be wrong (many people do that)

So how would you react if somebody asked a question that _takes for
granted_ something that you strongly believe is not the case, to the
point of absurdity? Say somebody posted to a local NG or mailing list
about Melbourne nightlife, asking for explanations of why the bands
playing in Melbourne clubs are so much worse than those playing in
Canberra ones. No justification of the assertion, as if it was so
obvious it didn't need any, just a request for an explanation. And say
later the same guy clarified the claim by stating, again without ANY
supporting data, that his criterion is based on bands' records and how
well they place in the hit parades -- and again, not asking _whether_
Melbourne's scene is inferior by that measure, not even stating _that_
it is and inviting rebuttal, but still expressing himself as if this was
so EVIDENTLY the case that an explanation of WHY was all he needed.

To make you more personally involved in the issue, imagine that you had
spent lots of time and energy convincing friends and acquaintances to
move to Melbourne, rather than to Canberra, based in some part on how much
better the club music scene is in Melbourne. And you even had some
little involvement in the past in trying to arrange for some Melbourne
clubs to get in contact with some hitparade-successful bands.

Now, even if you personally care little about hit parades, it seems
sensible for you to review the bidding -- if nothing else, to be able to
qualify future advice to other friends, or, if needed, try to help clubs
and bands get in touch again.

And you can't find *ANY* support for the unquestioned-assumptions, "so
obvious they only need explanation", behind that guy's questions. As
far as you can tell, Melbourne's clubs have about FOUR times more bands
that place high on the hit parade playing on typical nights than
Canberra clubs do.

So, would you be quiet and let Melbourne's reputation get sullied when
you believe you have data to show otherwise? Or would you rather relate your
own observations, and challenge the original poster to come out and
*GIVE* those data he was basing his "too obvious to question" underlying
assumptions?!

2. Some had answered and pointed out my misconceptions, thank you all
for that.

Does this imply you now believe that the unquestioned-assumptions behind
your "why" questions were unfounded?
3. Some had pointed to the fact that I am from University of Melbourne
(with unknown motives). It was then clarified that I was an honours
student (www.zoology.unimelb.edu.au) in the Dept of Zoology. Perhaps I
may say that I am a molecular biologist by degree.

I have no idea about [3]. One of my best friends in Australia is in
Melbourne now (Canberra previously). I do occasionally hear snide jabs
at Melbourne, but those are generally from _Sydney_ people, so I tend to
discount them appropriately;-).

As for your professional specialization, how would it make you feel and
react if somebody posted a question such as, say, "why are the
nucleotides in cats' cells so different from those in dogs'?". Not a
good parallel, I guess, because in this case the total and utter
cluelessness of the asker and absurdity of the question are entirely
obvious -- you would not need any research to confirm that, nor would
you need to fear anybody else getting misled from this question. So
please find something slightly subtler, but just as wrong, and focus on
a poster taking the wrong underlying assumption so much for granted that
they only need to ask about WHY things are (obviously, w/o any need for
discussion or confirmation) that way...

4. Some had pointed out instances (books and all) whereby my wrong
impressions might have been formed.

Forming a wrong impression is always a possibility. It's proceeding to
take it for granted as an obvious fact, that is quite questionable. If
you had phrased your observation in terms such as "I have gotten the
impression, without having done any measurement myself, that" etc, the
reactions would have been vastly different; your choice to express
yourself by implying an unquestionable fact existed and only required
explanations of its reasons, is a good part of the reaction's cause.
5. Some had painfully taxed my initial misconceptions and
generalized them to the ridicule of non-experts, and in the process of
doing so, suggested controversial code on the pretext of eliciting responses.

Well I _did_ get responses, which easily led to Python and Java programs
that are strongly equivalent (same buffer sizes, and all), with a 4:1
performance advantage for Python (perhaps explainable by Java's terrible
startup/shutdown performance, with 4:1 ratio of pagefaults to Python's;
pagefaults, of course, _can_ be seen as disk-intensive, too;-).
6. Furthering it, using notable words from famous people, to
discriminate against a group of people, when the first instance had been
breached... you can choose not to reply.

Just like you could choose to stay silent, and let some assertions and
implications against the Melbourne club scene stand unchallenged when
you believe you can show those "stated as too obvious to discuss"
implications are totally wrong, ill-founded, and reverse of the truth.

If you let the untruth stand unchallenged, quite apart from the sympathy
for truth that many of us feel, you'd badly serve anybody who,
considering a move to Melbourne in the near future, might "do their
homework", research recent Melbourne/Canberra comparisons, and find that
apparently nobody questions with any real data the "too obvious to
discuss" assumption about bands playing in clubs. Plus, you'd waste the
effort that weirdly implied assumption prompted you to, in order to
start researching the issue a bit.

If you do post what you have, you serve well anybody who's looking up
the issue in the future, AND get a chance to see what solid data the
original poster based their taken-for-granted assumptions on -- IF ANY.

All this happens when the discussion has been taken to other areas... It
brings one to wonder about maturity and character......

_ALL_? Re [5], I'm still trying to get a good benchmark going -- I
expect to observe roughly 1:1 ratio (net of startup and shutdown) when
that's the case. Why is that "other areas" wrt your original question
of WHY (things were obviously, undisputably in a state they aren't)?

Are you claiming that, if one asks "why is the sky yellow?", striving to
show him and everybody else that it *ISN'T* is "taking to other areas"
the discussion?! I entirely disagree. Phrasing the original question
as a "why" was one issue surely deserving of metadiscussion, but I find
it perfectly appropriate (and a sign of impeccable character and
maturity) to rephrase the question to a better "is it the case that" and
trying to explore that in this enhanced form; including explanations of
one's observations in the matter (e.g., the one about the 4:1 pagefault
ratio that's clearly part of the slowdown of Java wrt Python here).

All the knowledge in this world are in libraries and now, networked

False: not all knowledge in the world can be usefully embodied in print
(text, images, even videos and sound recordings actually). An important
part of human knowledge is experiential, and only a shadow of that
important part can be captured in libraries, including multimedia ones.
libraries via internet. Today, almost all high school students in
developed countries are versed in internet. And by your argument, it
seems that all universities are complete waste of money as all knowledge
is out there and the tools to access the knowledge is readily available.

Many universities, as a part of their mission, conduct research and
therefore presumably generate new knowledge. This entirely self-evident
and obvious fact, even by itself, makes your assertion doubly absurd:
"all knowledge is out there" readily implies there is no knowledge to be
added, and "all universities are complete waste of money" similarly
implies no university is generating new knowledge, or what they do
generate is utterly worthless.

Your assertion that "by my argument" (which had never touched anywhere
upon the research role of universities) research is nonexistent, futile,
or entirely worthless, is at the same time absurd and deeply insulting.
If and when I want to criticize university research, I will do so
myself, and I do not AT ALL appreciate this attempt to put words in my
mouth, even though any reader with a 3-digit IQ can see it as the
insulting absurdity it is.

If we stick to the _teaching_ role of universities, the reasons my
arguments imply nothing like you're stating are both more interesting
and subtler. On one hand, there is the experiential side of knowledge.
By conducting experiments in a laboratory under proper guidance, even
though those experiments are not novel, students acquire knowledge
experientially -- a very different learning mechanism from reading books
and articles, or listening to lectures. Of course, good high schools
have laboratories etc, too, but in University, at least in scientific
and technical disciplines, the experiential learning process _should_
blossom to a far higher degree (if it doesn't -- if a university skimps
on labs and overwhelms students with just books and lectures -- then
that may well be a valid ground for criticism, _of that particular
university's choice in didactics_, of course, not "of all
universities"). In other disciplines, experiential learning may be less
obvious, but if the university is any good, it will be there.

And then, there is the issue of selection and structuring. I have
posted about that recently, in a thread asking whether there was a book
about large-scale software development with Python, and you can easily
look it up on google groups. To summarize: I have lots of materials on
the subject. I find I'm easily able to organize these materials into
courses and workshops that are specifically aimed at an identified group
of students, with certain backgrounds and interests. Organizing the
same material into a _book_ is a far harder task, one which I can't take
the time off to undertake at present... which ties back to a _part_ of
the reason why not all knowledge is in books or other printed or
otherwise 'frozen' (recorded, filmed, ...) forms. A vast majority of
the materials I collected IS out there -- over the years, I've posted a
goodly fraction of it. But it's generally unselected and unstructured,
making the learning task far more daunting than proper structuring and
selection can potentially make it.

A good university course has selection and structuring, and is
interactive in a way a book can never be, thus potentially making the
s&s more appropriate and effective for the specific individual students
who are taking that course wrt books (or some other well-organized
subset of info from the net). Of course, if you throw 500 students at a
poor professor, no matter how good he is, his ability to teach a really
good course will be impaired -- smaller classes are MUCH better that way
(another valid criticism of the way many universities are structured).

Usenet does have the interactivity advantage, but normally not the
structuring one, with rare exceptions, and only to some extent the
selection one. Thus, it can complement rather than substitute for
books, courses, and information search on the raw net. It has little
experiential value, though not zero -- _some_ of the interaction on it
does work to stimulate and vaguely guide/aim some experiences.

As mentioned, the discussion is heading elsewhere and my misconceptions
were cleared before your replies. If I've indeed forgotten, my sincere
apologies and hereby thank you for your time and efforts.

Thanks, this is appreciated. I take it then that you do not any longer
opine that on disk-I/O intensive programs Python is self-evidently
slower than Java?
You still have the right to remain solemn.

Heh, nice. Well, it's a right I surely exercise far more often than the
more traditional one of remaining silent;-).


Alex
 
Maurice LING

Mainly to Alex and the rest of the agitated community,

[snip about Melbourne club scene]

I do understand what you meant. There are restrictions in giving out
data (codes) in many cases and I seek your understanding.

Nevertheless, I do feel that you had unfairly made use of this case to
voice out your accumulated dissatisfaction with answering newbies
questions.

In later parts of this thread (after your post), it was suggested that
my erroneous impressions might have been formed by books and publications
(such as "learning python", as suggested)...... I do not seek to place
blame on anyone for my misconceptions. But considering a person trying
to learn a new programming language, it is common that the person takes
in what is presented in the face, especially from books such as,
"Learning python" and "Python, the complete reference".
Does this imply you now believe that the unquestioned-assumptions behind
your "why" questions were unfounded?
Now I will say that Python is comparable to Java in terms of disk I/O.

Forming a wrong impression is always a possibility. It's proceeding to
take it for granted as an obvious fact, that is quite questionable. If
you had phrased your observation in terms such as "I have gotten the
impression, without having done any measurement myself, that" etc, the
reactions would have been vastly different; your choice to express
yourself by implying an unquestionable fact existed and only required
explanations of its reasons, is a good part of the reaction's cause.
My apologies for raising your blood pressure, take care.
False: not all knowledge in the world can be usefully embodied in print
(text, images, even videos and sound recordings actually). An important
part of human knowledge is experiential, and only a shadow of that
important part can be captured in libraries, including multimedia ones.

Many universities, as a part of their mission, conduct research and
therefore presumably generate new knowledge. This entirely self-evident
and obvious fact, even by itself, makes your assertion doubly absurd:
"all knowledge is out there" readily implies there is no knowledge to be
added, and "all universities are complete waste of money" similarly
implies no university is generating new knowledge, or what they do
generate is utterly worthless.

Perhaps I should rephrase it as known knowledge so far.

All in all, through all these discussions, I can safely assert that
Python and Java are comparable in disk I/O. And a part of the original
misconceptions might have been formed by possibly outdated printed
materials which are still references for new python programmers. It is
then my concern that such misconceptions may be perpetuated.

I admit and apologise for my poor phrasing of questions which sparked
this chain of events. We all lost some cool along the way and some
harsh words flew. Alex, I do hope you will accept my apologies. I
suppose I am pissed off when the flare is targeted towards myself and
not the situation. Anyway, my apologies...

maurice
 
Ian Bicking

Israel said:
I would reject the premise entirely.

When looking at desktop apps made with Python, they positively whiz
when compared to Java swing apps.

Let's give a shout out to the GIL! Woo GIL! More seriously, I'd give
up Java's theoretical thread scalability any day in return for the ease
of Python's C API. Not that I've tried to code a C extension for either
language; but I get the impression the GIL is an important part of
making Python easy to use, and I frequently take advantage of the
extensions other people have written, and I'm betting there'd be less of
them without the GIL. They seem uncommon in Javaland.
As for non desktop apps, the entire portage system of Gentoo is
written in Python. 'emerge sync' causes a python app to synchronise a
local application database with database at a non-local mirror. It is
i/o intensive and appears to work very well and very fast.

Here, I'm betting it's fast because of a good implementation; because
emerge can avoid doing work, rather than obsessing about doing the work
quickly. For something like emerge, there's huge potential in caching,
dependency checking, and generally being smart about the tradeoffs.
This will apply to any program that uses the network, a database, large
files, or any of those other time sinks that easily outweigh CPU
performance. If you can get the job done in Python in half the time (or
better), that gives you more time to make the application faster.
 
A

Alex Martelli

Mainly to Alex and the rest of the agitated community,

[snip about Melbourne club scene]

I do understand what you meant. There are restrictions in giving out
data (codes) in many cases and I seek your understanding.

Sure, if you need help understanding or fixing the behavior of
proprietary codes, that's harder. Nevertheless, although not a
work-free process, it IS often productive to try and extract a small
core of code that reproduces a performance problem. If you can't
reproduce the problem on a small scale (small in terms of lines of
code, not in quantity of data), that's significant too. Best is when
you _can_ reproduce it, because then you do have code you can post and
get free expert help about!
Nevertheless, I do feel that you had unfairly made use of this case to
voice out your accumulated dissatisfaction with answering newbies
questions.

I disagree: somebody else commented on newbie-friendliness of this
group, and my discussion was in response to their comments.
In later parts of this thread (after your post), it was suggested that
my erroneous impressions might have been formed by books and
publications (such as "learning python", as suggested)...... I do not
seek to place blame on anyone for my misconceptions. But considering a
person trying to learn a new programming language, it is common that
the person takes in what is presented in the face, especially from
books such as, "Learning python" and "Python, the complete reference".

I have not seen "learning python" mentioned in this thread -- funny,
because I did read all posts; having been a tech reviewer for (2nd
edition) Learning Python, I'd have been quite interested (if nothing
else, to check if the guilty passage WAS one I had remarked on, and my
remarks had not been taken into account by the authors, or something
that had escaped me). I'll googlegroup in a few days when the archive
has had a chance to update.
Now I will say that Python is comparable to Java in terms of disk I/O.

This makes sense to me. Java apparently has more traps and pitfalls,
but once you're expert enough to bypass them, if program
startup/shutdown is no problem (long-running program), both Java and
Python should be able to saturate the disk bandwidth capacity,
normally.

All in all, through all these discussions, I can safely assert that
Python and Java are comparable in disk I/O. And a part of the original
misconceptions might have been formed by possibly outdated printed
materials which are still references for new Python programmers. It is
then my concern that such misconceptions may be perpetuated.

I share your concerns in this regard. Which is why I'll be quite
interested to check "Learning Python" once I do get the specific
reference on google groups. I'm not sure what that "complete
reference" book _IS_ -- there is probably nothing we can do about that
one. But good publishers CARE, and O'Reilly IS a good publisher, so,
if Learning Python has some serious error, we can surely get it fixed
next printing!!!
I admit and apologise for my poor phrasing of questions which sparked
this chain of events. We had all lost some cool along the way and some
harsh words flew. Alex, I do hope you will accept my apologies. I
suppose I am pissed off when the flare is targeted towards myself and
not the situation. Anyway, my apologies...

Thanks!, and my apologies in return if the way I responded to other
people's comments about generic newbie-friendliness was poorly worded
and made you feel unfairly targeted for other people's mistakes. That
was not my intention, please be sure of that.

I _am_ interested in pursuing the issues of possible errors in Python
books, AND I/O performance issues in Python vs Java, btw, if anybody
wants to. The fact that _in theory_ there shouldn't be any such
issues, doesn't mean that some couldn't be found _in practice_ where we
(or JVM coders;-) goofed or took justifiable but unfortunate design
decisions...;-).


Alex
 
M

Maurice LING

Dear Terry,
Not really. It's an impression you could easily get from several
books about Python (e.g. O'Reilly's "Learning Python"), which
make a rather big deal about Python being slow to execute, but
fast to develop in.
Do you have any idea which part, chapter or passage of the book
suggests this? There appears to be a technical fault in the book, and
Alex (in cc), one of the technical reviewers of "Learning Python", is
trying to fix this...
 
P

Paul Foley

That's just silly. It is true that LISP was one of the pioneers of the
compiled languages, but other compilers were not written in LISP. Almost
without exception, compilers were all written in assembly language until
Pascal came around.

Lisp had the first compiler for language X to be written in language
X, and used to compile itself (and presumably the first to be written
in any high level language), but not the first compiler in existence,
and non-Lisp compilers were not generally written in Lisp, no.

[There were C, Pascal, Ada and Prolog compilers written in Lisp,
though]
 
P

Pythogoras

Bryan said:
also, just for fun, write the following fully
working python program in java:
import time
t = time.time()
s = open('in.txt').read()
open('out.txt', 'w').write(s)
print time.time() - t

OK. I will rewrite it in Java.

///
long t = currentTimeMillis();
tryCopyFile("in.txt","out.txt");
out.println( (currentTimeMillis() - t) / 1000);
///

You lost. The Java-version is shorter!
I didn't count import- and import static-statements
because Eclipse cares about imports automatically.


But maybe you want to have something more fun?
How about that?
read autoexec.bat,
sort the lines
create sorted_autoexec.bak

///
File source = new File("c:/autoexec.bat");
File destination = new File("c:/sorted_autoexec.bak");

String[] lines = readLines(source);
lines = sort(lines);
write(destination,lines);
///

You lost.
You Pythonistas are a bunch of losers.
And your language is dog-slow.
Eeeeeek.

And try to write Eclipse with your Mickey-Mouse-language!

<j>
Pythogoras
 
A

Alex Martelli

Roy Smith said:
My guess is the poor performance had nothing to do with the language you
wrote it in, and everything to do with the algorithms you used.

I think this is an overbid.
Local optimizations rarely get you more than a factor of 2 improvement.
Choice of language (at least within the same general category such as
comparing one native compiled language to another, or one virtual
machine language to another) probably has a somewhat broader range, but
still a factor of 10 would be quite surprising.

kallisti:/tmp alex$ python -mtimeit 'exec("x=2")'
10000 loops, best of 3: 64.7 usec per loop
kallisti:/tmp alex$ python -mtimeit 'x=2'
10000000 loops, best of 3: 0.187 usec per loop

There: a factor of 350 just for using the wrong construct in the same
language -- a silly exec rather than a plain assignment.
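
For anyone who wants to rerun that comparison from a script rather than the shell, here is a minimal sketch, assuming Python 3 and the standard timeit module. The loop count is an arbitrary choice of mine, not something from the measurements above, and the exact ratio will differ by machine and interpreter version.

```python
import timeit

# Number of repetitions per measurement; an arbitrary value for this sketch.
N = 20_000

# Time a plain assignment and the same assignment routed through exec().
plain = timeit.timeit("x = 2", number=N)
via_exec = timeit.timeit('exec("x = 2")', number=N)

print(f"plain assignment: {plain / N * 1e6:.3f} usec per loop")
print(f"exec assignment:  {via_exec / N * 1e6:.3f} usec per loop")
print(f"ratio: {via_exec / plain:.0f}x")
```

The exec version has to compile its string argument on every pass, which is where most of the extra time should go.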

To get really bad performance, you need to pick the wrong algorithm. A
project I worked on a while ago had a bit of quadratic behavior in it
when dealing with files in a directory. Most of our customers dealt
with file sets in the 100's. One customer had 50,000 files. Going
from, say, 500 to 50,000 is a 100-fold increase in N, and gave a
10,000-fold increase in execution time.

Right, big-O can often dominate (there are exceptions: in almost all
practical cases the outstanding performance of Python's list.sort wipes
away the theoretical issue that it's O(N log N), letting it beat, in
many special cases, approaches that are O(N) -- the multiplicative
constants differ by just TOO much for all practically usable N's!-).

Still, consider...:

kallisti:/tmp alex$ python -mtimeit 's=[]
for x in xrange(1000): s = s+[x]'
100 loops, best of 3: 13.7 msec per loop

kallisti:/tmp alex$ python -mtimeit 's=[]
for x in xrange(1000): s += [x]'
1000 loops, best of 3: 1.58 msec per loop

The difference between s += [x] and s = s + [x] is large. AND:

kallisti:/tmp alex$ python -mtimeit 's=[]
for x in xrange(4000): s = s+[x]'
10 loops, best of 3: 224 msec per loop

kallisti:/tmp alex$ python -mtimeit 's=[]
for x in xrange(4000): s += [x]'
100 loops, best of 3: 6.4 msec per loop

....the big-O is different!!! O(N) for +=, way bigger for plain +: as
you can see, a x4 on N makes a 16.3 times increase in time here,
suggesting AT LEAST quadratic behavior. Again, these are good vs bad
cases in the *same* language - Python 2.4 beta 1 in all cases. I'm sure
you can easily imagine how different languages might make apparently
identical constructs be O(N) vs O(N squared) or worse, if even the same
language can do so for constructs whose difference may not be apparent
at all to the newbie...
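
The same experiment can be sketched as a small script, assuming Python 3 (so range replaces xrange). The function names build_with_plus and build_with_iadd are hypothetical, introduced only for this illustration. Both build the same list; only the growth strategy differs.

```python
import timeit

def build_with_plus(n):
    """Grow a list with s = s + [x]; each step allocates a new list and copies s."""
    s = []
    for x in range(n):
        s = s + [x]
    return s

def build_with_iadd(n):
    """Grow a list with s += [x]; each step extends the existing list in place."""
    s = []
    for x in range(n):
        s += [x]
    return s

# Compare timings at two sizes to see how each version scales with N.
for n in (1000, 4000):
    t_plus = timeit.timeit(lambda: build_with_plus(n), number=10) / 10
    t_iadd = timeit.timeit(lambda: build_with_iadd(n), number=10) / 10
    print(f"N={n}: s = s + [x] {t_plus * 1e3:.2f} msec, "
          f"s += [x] {t_iadd * 1e3:.2f} msec")
```

If the copying explanation holds, quadrupling N should cost the first version far more than four times as long, while the second should scale roughly linearly.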


Alex
 
A

Alex Martelli

Maurice LING said:
Come to think of it, yes, "Python, The Complete Reference", "Programming
Python" and "Learning Python" all seem to have that "speed warning" tag
in the beginning chapters.

If this seems to be the wrong impression, should the authors do something?

I guess they should, if they care about what they're communicating.
This is how I put the issue in "Python in a Nutshell":

'''
Good compilers for classic compiled languages can often generate binary
machine code that runs much faster than Python code. However, in most
cases, the performance of Python-coded applications proves sufficient.
When it doesn't, you can apply the optimization techniques covered in
Chapter 17 to enhance your program's performance while keeping the
benefits of high programming productivity.
'''

I'm biased, having written this, but it seems a balanced set of
assertions which tries hard not to leave either wrong impression.


Alex
 
B

Brian Beck

Pythogoras said:
OK. I will rewrite it in Java.

///
long t = currentTimeMillis();
tryCopyFile("in.txt","out.txt");
out.println( (currentTimeMillis() - t) / 1000);
///

You lost. The Java-version is shorter!
I didn't count import- and import static-statements
because Eclipse cares about imports automatically.

I don't know what version of Java they're using in this troll's fantasy
land, but tryCopyFile does not exist, and has never existed, in any Java
implementation. The real code for copying a file would look something
like the contents of this function (which is not a part of the standard
library):

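// Imports needed by the enclosing class:
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public static void copy(File source, File dest) throws IOException {
    FileChannel in = null, out = null;
    try {
        in = new FileInputStream(source).getChannel();
        out = new FileOutputStream(dest).getChannel();

        // Map the whole source file into memory, then write it out.
        long size = in.size();
        MappedByteBuffer buf = in.map(FileChannel.MapMode.READ_ONLY, 0, size);

        out.write(buf);
    } finally {
        // Closing a channel also closes the stream it came from.
        if (in != null) in.close();
        if (out != null) out.close();
    }
}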
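
For comparison, the Python side of the same task can be sketched with the standard library's shutil.copyfile. This assumes Python 3; timed_copy is a hypothetical helper name, and the demo writes to a throwaway temporary directory instead of the in.txt and out.txt from the original challenge so that it runs anywhere.

```python
import shutil
import tempfile
import time
from pathlib import Path

def timed_copy(source, dest):
    """Copy source to dest and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    shutil.copyfile(source, dest)
    return time.perf_counter() - start

# Demo on a temporary file so the sketch does not depend on existing files.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "in.txt"
    dst = Path(tmp) / "out.txt"
    src.write_text("hello\n" * 1000)
    elapsed = timed_copy(src, dst)
    copied_ok = dst.read_text() == src.read_text()
    print(f"copied in {elapsed:.6f} s, identical: {copied_ok}")
```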
 
A

Alex Martelli

Maurice LING said:
I am wondering the impact when IBM decided that the base memory to not
exceed 64kb, in the late 1960s...

??? IBM in the '60s was developing the then-revolutionary concept of a
*family of computers* -- several models able to run the same codes with
different performance and price. "360 degrees computing", whence came
the idea of calling that family "IBM 360". Out of the 32 bits of a
register's width, only 24 bits were in fact used for addressing, so the
limit was *16 megabytes* of base memory -- 40 years ago, that not only
SEEMED huge, it WAS huge. To the point that Motorola in the late
'70s/early '80s took exactly the same architectural decision regarding
*their* (less-revolutionary...) 16-bit microprocessor, the 68000: again
32-bit registers, but, again, 24-bit addresses. The point being that,
about 15+ years later (10 doublings by Moore's Law!!!), IBM's crucial
architectural decision STILL seemed quite reasonable, albeit when
thinking of microcomputers rather than mainframes.

If you're thinking of IBM PC's, the key architectural decision there had
come from intel (8086 then 8088), again in the late '70s, and it _was_
way too restrictive -- just 1MB of addressability (in 64-k segments), vs
the unsegmented 16 MB of IBM earlier and Motorola at about the same
time. IBM when designing the IBM PC shortly afterwards decided to pick
1/3 of that MB for I/O and special extension cards' memories rather than
general base RAM -- leaving 640K, not 64K!, for the latter -- and the
difference between 640K and 1MB never really had any big impact.

The machines with 64-K byte limits I know of were early 16-bit minis
(such as PDP-11, despite later kludges to push that limit up) and just
about all 8-bit micros (intel, Motorola, and just about all others).
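
The address-space figures above follow directly from the number of address bits. A small sketch of the arithmetic, assuming byte-addressable memory throughout (addressable_bytes is a hypothetical helper name):

```python
KIB = 1024
MIB = 1024 * KIB

def addressable_bytes(address_bits):
    """Bytes reachable with the given number of address bits, one byte per address."""
    return 2 ** address_bits

# 16-bit addresses: the 64K limit of 8-bit micros and early 16-bit minis.
print(addressable_bytes(16) // KIB, "KiB")   # 64 KiB
# 20-bit addresses: the 1MB of the 8086/8088.
print(addressable_bytes(20) // MIB, "MiB")   # 1 MiB
# 24-bit addresses: the 16MB of the IBM 360 and the Motorola 68000.
print(addressable_bytes(24) // MIB, "MiB")   # 16 MiB
# What the IBM PC set aside above the 640K of base RAM.
print((addressable_bytes(20) - 640 * KIB) // KIB, "KiB reserved")  # 384 KiB
```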

Are the codebase of Python 1.5.2 and Java 1.1 totally replaced and
deprecated?

No, there's a lot of code from that era still working happily.
Lisp compiler is the 1st compiler to be created (according to the
Red-Dragon book, I think) and almost all others are created by

I'm sure Aho, Sethi and Ullman would never be so ignorant or
insensitive as to detract from Backus' (and IBM Research's) epoch-making
result in making the first production compiler -- exactly 50 years ago,
quite a bit earlier than LISP, for the language first known as Formula
Translation, a name which was soon shortened to ForTran. Lisp has quite
some "first"s, but "first compiler" ain't among them.
bootstrapping to LISP compiler. What are the implications of design
decisions made in LISP compiler then affecting our compilers today? I
don't know. I repeat myself, I DO NOT KNOW.

I can't think of even a single one, although I fancy myself as quite
knowledgeable in the history of computing -- neither from Formula
Translation, nor from LISt Processor, not one design decision whose
implications are still affecting compilers being written today. Same
goes for computers of the same era, the '50s. If you move to the '60s
then some things do suggest themselves -- at least in computers: general
registers that are not part of the addressable memory, memory
addressable by the byte, 8-bit bytes with significant 16-bit and 32-bit
halfwords and fullwords too -- all parts of the IBM-360 legacy; not so
obvious in languages and compilers, maybe. Maybe the concepts of
grammar, tokenizers and parsers -- all stuff that wasn't in evidence in
the '50s, at least not in ForTran or Lisp. The ambitious concept of a
language good for all kinds of problems, commercial AND scientific, much
like IBM's "360 degrees computing" was breaking down the barriers
between scientific and commercial computers in the HW field. The very
first emergence of "bytecodes", partly from trying to emulate yet-older
computers on those of the time, to keep running old applications on new
hardware where IBM's revolutionary concept of "a family of computers"
didn't apply. Pretty thin stuff, IMHO... you have to get to the '70s to
find the emergence of a more solid legacy, I think.


Alex
 
A

Alex Martelli

Pythogoras said:
But maybe you want to have something more fun?
How about that?
read autoexec.bat,
sort the lines
create sorted_autoexec.bak

file('sorted_autoexec.bak','w').writelines(sorted(file('autoexec.bat')))

73 chars in one line. Yeah, I cheated -- I _would_ normally put a space
after the comma, making the real total into seventyFOUR...
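
On a current interpreter the file builtin is gone, so the one-liner needs a small adjustment. A minimal sketch, assuming Python 3; write_sorted_lines is a hypothetical helper name, and the demo uses a temporary directory with made-up file contents rather than a real autoexec.bat.

```python
import tempfile
from pathlib import Path

def write_sorted_lines(source, dest):
    """Read all lines of source, sort them, and write them to dest."""
    with open(source) as src:
        lines = sorted(src)          # iterating a file yields its lines
    with open(dest, "w") as out:
        out.writelines(lines)

# Demo with invented contents so the sketch runs on any system.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "autoexec.bat"
    dst = Path(tmp) / "sorted_autoexec.bak"
    src.write_text("path c:\\dos\necho off\nprompt $p$g\n")
    write_sorted_lines(src, dst)
    result = dst.read_text()
    print(result, end="")
```

The with blocks close both files deterministically, which the original one-liner left to the garbage collector.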
///
File source = new File("c:/autoexec.bat");
File destination = new File("c:/sorted_autoexec.bak");

String[] lines = readLines(source);
lines = sort(lines);
write(destination,lines);

How delightfully quaint and redundant. I do recall a time where Python
would take such a roundabout approach too -- of course, the Python code,
even at that time, was elegantly object-oriented, with the reading,
sorting and writing all neatly expressed as methods of file and list
objects, rather than these weird 'readLines' and 'sort' and 'write'
apparently-global functions taking the objects as arguments. Still, all
of those mysterious 'globals' do enhance the retrocomputing taste of
your code -- one can almost see you writing it with a quill dipped in
ink, at a carved oak desk, on your steam-powered computer. Charming!

Should you ever decide to move into the 21st century, though, don't
worry: Python will be there to help you do so.


Alex
 
I

Isaac To

Tim> It is fabulous that you are able to enumerate your list of
Tim> requirements so completely. I'm quite serious; many people
Tim> embark on even complicated projects without a clear
Tim> understanding of the tradeoffs they will encounter.

Tim> However, given that set of needs, why would you mess with an
Tim> "exotic" language at all? Why wouldn't you just write
Tim> straight to the metal in C++ or C? -- - Tim Roberts,
Tim> (e-mail address removed) Providenza & Boekelheide, Inc.

It is quite possible that the list of features, requirements, etc.,
are all known only after the Python prototype shows that it worked.
Very few people (except those who are really very proficient in C++)
would like to write prototypes in such static languages, but once the
prototype shows what is needed, and the prototype becomes part of the
production system, one has to turn it back into C++ step by step.

Regards,
Isaac.
 
