Why is Python slower than Java?

Kent Johnson

I rarely find myself acting as an apologist for Java, and I understand
that the point Alex is making is that Python's performance for this
operation is quite good, and that the OP should post some code, but this
is really too unfair a comparison for me not to say something.

There are two major differences between these two programs:
- The Java version is doing a character by character copy; the Python
program reads the entire file into a buffer in one operation.
- The Java program is converting the entire file to and from Unicode;
the Python program is copying the literal bytes.

Here is a much more comparable Java program (that will fail if the file
size is over 2^31-1):

import java.io.*;

public class Copy {
    public static void main(String[] args) throws IOException {
        File inputFile = new File("/usr/share/dict/web2");
        int bufferSize = (int) inputFile.length();
        File outputFile = new File("/tmp/acopy");

        FileInputStream in = new FileInputStream(inputFile);
        FileOutputStream out = new FileOutputStream(outputFile);

        byte buffer[] = new byte[bufferSize];
        int len = bufferSize;

        while (true) {
            len = in.read(buffer, 0, bufferSize);
            if (len < 0)
                break;
            out.write(buffer, 0, len);
        }

        in.close();
        out.close();
    }
}

Here are the results I get with this program and Alex's Python program
on my G4-400 Mac:
kent% time java Copy
0.440u 0.320s 0:00.96 79.1% 0+0k 9+3io 0pf+0w
kent% time python Copy.py
0.100u 0.120s 0:00.31 70.9% 0+0k 2+4io 0pf+0w

The Python program is still substantially faster (3x), but with nowhere
near the margin Alex saw.
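
For readers who want to see the two differences in isolation, here is a minimal sketch in present-day Python 3 (which postdates this thread, so treat it as an illustration rather than what anyone ran at the time). The helper names and the generated sample file are made up; the first function copies raw bytes in one read, the second decodes to text and copies one character at a time, roughly mirroring what the FileReader/FileWriter version does.

```python
import os
import tempfile


def copy_bytes(src, dst):
    """Copy the whole file in one read, treating the contents as raw bytes."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        fout.write(fin.read())


def copy_text_char_by_char(src, dst, encoding="utf-8"):
    """Decode to str and write one character at a time (the slow pattern)."""
    # newline="" turns off newline translation so the copy stays byte-identical.
    with open(src, "r", encoding=encoding, newline="") as fin, \
         open(dst, "w", encoding=encoding, newline="") as fout:
        while True:
            ch = fin.read(1)
            if not ch:
                break
            fout.write(ch)


# Hypothetical sample data standing in for /usr/share/dict/web2.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "words.txt")
with open(src, "wb") as f:
    f.write(b"aardvark\nabacus\nzebra\n" * 1000)

bytes_dst = os.path.join(workdir, "copy_bytes")
text_dst = os.path.join(workdir, "copy_text")
copy_bytes(src, bytes_dst)
copy_text_char_by_char(src, text_dst)
```

Both functions produce the same output file; only the amount of per-character work done by the language runtime differs, which is the point of the comparison.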

Kent

Alex said:
OK, could you provide a simple toy example that meets these conditions
-- does a lot of identical disk-intensive I/O "in batch" -- along with the
execution speed measured (and on what platform) for the Python and Java
implementations, please?

For example, taking a trivial Copy.java from somewhere on the net:

import java.io.*;

public class Copy {
    public static void main(String[] args) throws IOException {
        File inputFile = new File("/usr/share/dict/web2");
        File outputFile = new File("/tmp/acopy");

        FileReader in = new FileReader(inputFile);
        FileWriter out = new FileWriter(outputFile);
        int c;

        while ((c = in.read()) != -1)
            out.write(c);

        in.close();
        out.close();
    }
}

and I observe (on an iBook 800, MacOSX 10.3.5):

kallisti:~ alex$ java -version
java version "1.4.2_05"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_05-141.3)
Java HotSpot(TM) Client VM (build 1.4.2-38, mixed mode)

-r--r--r-- 1 root wheel 2486825 12 Sep 2003 /usr/share/dict/web2

kallisti:~ alex$ time java Copy

real 0m7.058s
user 0m5.820s
sys 0m0.390s

versus:

kallisti:~ alex$ time python2.4 Copy.py

real 0m0.296s
user 0m0.080s
sys 0m0.170s

with Python 2.4 beta 1 for the roughly equivalent:

inputFile = file("/usr/share/dict/web2", 'r')
outputFile = file("/tmp/acopy", 'w')

outputFile.write(inputFile.read())

inputFile.close()
outputFile.close()

which isn't all that far from highly optimized system commands:

kallisti:~ alex$ time cp /usr/share/dict/web2 /tmp/acopy

real 0m0.167s
user 0m0.000s
sys 0m0.040s

kallisti:~ alex$ time cat /usr/share/dict/web2 >/tmp/acopy

real 0m0.149s
user 0m0.000s
sys 0m0.090s


I'm sure the Java version can be optimized easily, too -- I just grabbed
the first thing I saw off the net. But surely this example doesn't
point to any big performance issue with Python disk I/O wrt Java. So,
unless you post concrete examples yourself, the smallest the better,
it's going to be pretty difficult to understand where your doubts are
coming from!


Alex
 
Alex Martelli

Roger Binns said:
Incidentally, that Java code copies one character at a time.

Yep, I noticed -- pretty silly, but if that's how the Java designers
decided the read method should behave by default, who am I to argue?
Just an example I grabbed off the net, first google hit for (if I recall
correctly) java file reading that had the source of a complete example.

The Python code is reading the entire string into memory and
then writing it.

Right, _Python_'s default.

The interpreter overhead vs system calls could be measured
by having the language involved in every byte transferred as
in the Java example, or negligibly as in the Python example.

The claim posted to this newsgroup, without any support nor examples
being given, was that Python's I/O was far slower than Java's in
_disk-intensive_ operations. I'm still waiting to see any small,
verifiable examples of that posted on this thread.

Apparently, at least in default operations on both sides, based on calls
to read without parameters, that is definitely not the case - no doubt,
as you say, that's because of the different ways those defaults are
tuned, making Python much faster. Great, then let those who claim
Java's I/O is much faster in disk intensive operation post suitable
examples, and we'll see.

Whenever anyone has an agenda, you can make all sorts of silly
claims. IMHO the best thing to do is to lead by examples.

I posted examples -- probably not "leading" ones, just one grabbed off
the net using Java's defaults, and one using Python's defaults. I'm not
the one making any claims regarding I/O performance of _properly tuned_
disk-intensive programs -- what I would like to see would be some such
examples posted by those who DO make such claims. Shouldn't the burden
of proof lie on the side making positive assertions, by ordinary rules?!

For years many people claimed Perl was line noise, hard to
maintain etc. I never really saw much response, since the
Perl people were too busy writing real world code and helping
deliver part of the web revolution. (And selling zillions of
books for O'Reilly :)

Surely more than we have sold so far with Python, yes - Tim O'Reilly has
published data about that. Which is why I'm striving to _worsen_ the
quality of Python's free docs until they match Perl's, against heathens
who (can you imagine that!) are instead trying to *improve* them further
(thus no doubt hurting book sales).

I couldn't find any Python success stories on python.org itself,
but if you dig you can find the stories for the Python Business
Forum as well as Pythonology.

Lots of digging, yes, considering that if you google for python success
stories those pbf and pythonology hits are just at the top (other
similar links, such as those to O'Reilly's booklets covering the same
material and their PDF forms, make up the rest of the top-ten pages).

There are however remarkably few where you can go grab the
source code and see how it is all put together. In fact I
couldn't find a single one, but didn't do an exhaustive
search from python.org.

Without straying from the top google hit, you can find that some of the
listed success stories are open-source, such as mayavi -- they're
admittedly a minority of the "success stories" listed on these sites.

Perhaps it is also worth linking to the projects done in
Python on SourceForge and elsewhere?

http://sf.net/softwaremap/trove_list.php?form_cat=178

If the message you're keen to send is "Python is great for open-source",
yes. If you're focusing on "Python is great for your _business_" (as
the python *business* forum does, for example), then emphasizing
open-source projects can reasonably be considered secondary to
emphasizing projects that make or save money for the businesses which
developed them.

If you're looking for highly visible open-source projects using Python,
I'd think that bittorrent, chandler, zope and plone, mayavi, all the
scipy.org site, twisted, schooltool, ubuntu, and the like, should keep
you in reading materials for a while. Not sure how many of these
projects are specifically on sourceforge, but I believe some minority
probably is.


Alex
 
Brian van den Broek

Israel Raj T said unto the world upon 2004-11-06 16:01:
Or perhaps because you had not read the many available resources on
using winmodems under linux.

BTW, if you are looking for a decent mail client check out gnus. Of
the over 20 mail clients that I have used over the years, this is the
best.

Perhaps. But the context was that I was a Linux neophyte (not much
beyond that now), identified myself as such, made sure to manifest that
I'd tried and failed to sort it out for myself, and did my best to
appear to be asking for help learning to fish, rather than to be handed
a cooked trout. :) (As in, I'd read and tried to follow the advice of
Raymond's essay.)

I'm certain that the sort of care I put into asking the question made it
fall outside the scope of the comments about newbie questions made
up-thread. I've yet to see a Python list so react to someone who made it
clear they were trying to follow the norms, even if they didn't meet
with complete success. I think the issue is about those who don't try to
follow courteous practise, and perhaps aren't aware they aren't.

Any project, Linux, Python, whatever, that aims to get a user base
beyond the gurus has to accept that not everyone who needs help will be
able to work everything out based on what they can find on-line. (It is
clear to me that Python lists *do* accept this.) The problem isn't with
those who get stuck after trying for themselves; it is those who hit send
before search :)

But thanks for the comment, and the client pointer. Best,

Brian vdB
 
Alex Martelli

Kent Johnson said:
I rarely find myself acting as an apologist for Java, and I understand
that the point Alex is making is that Python's performance for this
operation is quite good, and that the OP should post some code, but this
is really too unfair a comparison for me not to say something.

I'm glad I posted a sufficiently silly comparison to elicit some
response, then;-)

There are two major differences between these two programs:
- The Java version is doing a character by character copy; the Python
program reads the entire file into a buffer in one operation.
- The Java program is converting the entire file to and from Unicode;
the Python program is copying the literal bytes.

Right: each is using the respective language's defaults, and Python's
are apparently tuned for speed, while Java's apparently aren't.

Here is a much more comparable Java program (that will fail if the file
size is over 2^31-1):

I believe that the Python program, if run on a suitable 64-bit OS, on a
ridiculously-large-memory machine, could succeed for much larger files
than that. Machines with many gigabytes of physical RAM are becoming
far from absurd these days -- I can't yet afford to throw 2500 euros out
of the window to buy 8 GB of fast RAM, but if I could it would fit
snugly into my own cheap dual-G5 powermac, for example (old model: I got
it used/reconditioned many months ago; I think the current cheaper model
does top out at 4 GB). On 32-bit machines or OSs, of course, Python's
memory limits _will_ byte at fewer GB than that, too.

import java.io.*;

public class Copy {
    public static void main(String[] args) throws IOException {
        File inputFile = new File("/usr/share/dict/web2");
        int bufferSize = (int) inputFile.length();
        File outputFile = new File("/tmp/acopy");

        FileInputStream in = new FileInputStream(inputFile);
        FileOutputStream out = new FileOutputStream(outputFile);

        byte buffer[] = new byte[bufferSize];
        int len = bufferSize;

        while (true) {
            len = in.read(buffer, 0, bufferSize);
            if (len < 0)
                break;
            out.write(buffer, 0, len);
        }

        in.close();
        out.close();
    }
}

Here are the results I get with this program and Alex's Python program
on my G4-400 Mac:
kent% time java Copy
0.440u 0.320s 0:00.96 79.1% 0+0k 9+3io 0pf+0w
kent% time python Copy.py
0.100u 0.120s 0:00.31 70.9% 0+0k 2+4io 0pf+0w

With your program, and mine simply converted to run inside a main()
function rather than at module-level for comparison, switching to tcsh
for direct comparison with the format you're using, I see:

[kallisti:~] alex% time java Copy
0.200u 0.140s 0:00.54 62.9% 0+0k 0+0io 0pf+0w
[kallisti:~] alex% time python Copy.py
0.080u 0.020s 0:00.13 76.9% 0+0k 0+1io 0pf+0w

Which python and java versions are you using? I'm trying to use the
latest and greatest of each, 2.4b1 (I know, I know, I need to install
b2!) for Python, 1.4.2_05 for Java -- just upgraded to MacOSX 10.3.6,
and my /usr/share/dict/web2 is 2486825 bytes.

The Python program is still substantially faster (3x), but with nowhere
near the margin Alex saw.

I still see a 4:1 ratio, but, sure, nowhere like the 20:1 my originally
silly example showed. Maybe a more realistic program would use a buffer
of some fixed length, rather than 'as long as the whole file'. Say:

int bufferSize = 64 * 1024;

in a program that otherwise is just like yours, for the Java side of
things. Switching back to bash because I can't stand long exposures to
anything in the csh family;-), I see:

kallisti:~ alex$ time java Copy

real 0m0.521s
user 0m0.200s
sys 0m0.120s

after several runs, so everything gets a chance to go to cache. Oh,
btw, I did compile with 'javac -O Copy.java'.

The closest Python equivalent I know how to write is:

def main():
    inputFile = file("/usr/share/dict/web2", 'r')
    bufferSize = 64 * 1024
    outputFile = file("/tmp/acopy", 'w')

    inf = inputFile.read
    ouf = outputFile.write

    while 1:
        buf = inf(bufferSize)
        if not buf: break
        ouf(buf)

    inputFile.close()
    outputFile.close()

main()

and its performance:

kallisti:~ alex$ time python -O Copy.py

real 0m0.135s
user 0m0.050s
sys 0m0.050s

so we still see the 4:1 ratio in favour of Python. It's barely more
than 3:1 in actual CPU time, user and sys, but for some reason Python
seems to be able to get more of the CPU's % attention -- I don't claim
to understand this!-)
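
The arithmetic behind those ratios can be checked directly from the two bash timings just above. This is only a worked calculation on the figures already quoted, written in present-day Python 3; the variable and function names are made up for illustration.

```python
# Timings copied from the two bash runs above, in seconds.
java = {"real": 0.521, "user": 0.200, "sys": 0.120}
python = {"real": 0.135, "user": 0.050, "sys": 0.050}


def cpu_seconds(timing):
    """Total CPU time: user plus sys."""
    return timing["user"] + timing["sys"]


real_ratio = java["real"] / python["real"]
cpu_ratio = cpu_seconds(java) / cpu_seconds(python)

# Fraction of wall-clock time each process actually spent on the CPU.
java_cpu_share = cpu_seconds(java) / java["real"]
python_cpu_share = cpu_seconds(python) / python["real"]

print(f"wall-clock ratio: {real_ratio:.2f}:1")
print(f"CPU-time ratio:   {cpu_ratio:.2f}:1")
print(f"CPU share: java {java_cpu_share:.0%}, python {python_cpu_share:.0%}")
```

The wall-clock ratio comes out a little under 4:1 and the CPU-time ratio at about 3.2:1; the gap between them is the difference in CPU share (roughly 61% for Java against 74% for Python in these runs).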

So I scp'd everything over to the powermac dual g5 1.8 GHz, and ssh'd
there to do more measurements -- no python 2.4 there (it's my production
machine -> no betas!) and again I measured after a few runs to let stuff
get into cache:

macarthur:~ alex$ time java Copy

real 0m0.163s
user 0m0.040s
sys 0m0.110s

macarthur:~ alex$ time python -O Copy.py

real 0m0.039s
user 0m0.020s
sys 0m0.020s

Far better real/cpu ratios of course (a dual-CPU machine does that;-).
And faster overall. But roughly the same 4:1 ratio in favour of Python.


So, same thing on the oldie but goodie Linux box. In that case I don't
have an updated Java -- Kaffe 1.0.7 will have to do! The CPU is oldish
(Athlon 1.2G), but the disk subsystem is really really good. So I got
(again, good repeatable results after a few runs to stabilize):

[alex@lancelot alex]$ time java Copy
0.07user 0.04system 0:00.11elapsed 98%CPU (0avgtext+0avgdata
0maxresident)k
0inputs+0outputs (1623major+659minor)pagefaults 0swaps

[alex@lancelot alex]$ time python2.4 -O Copy.py
0.02user 0.01system 0:00.02elapsed 115%CPU (0avgtext+0avgdata
0maxresident)k
0inputs+0outputs (460major+241minor)pagefaults 0swaps

don't ask ME how python got over 100% CPU on a single-CPU machine; I
guess the task is just too small, at such tiny fractions of a second, to
avoid anomalies. Still, the roughly 4:1 performance ratio appears to be
repeatable here; and an interesting clue is that _pagefaults_ also
appear to be in roughly 4:1 ratio.

But clearly we need bigger tasks to make measurements less fraught with
anomalies. Unfortunately, not knowing on exactly which tasks the OP had
observed and measured his reported results that "Python is slower than
Java" in "a disk intensive I/O operation", it's hard to guess.
Obviously, his results are anything but _trivial_ to reproduce based on
the dearth of information that he has released about them so far; I find
it hard to think of something more disk-intensive than a simple copy of
one file to another. Assuming he does have some interest in
ascertaining the truth and getting his questions answered, perhaps he'll
deign to give us Java and Python sources, and reference data file,
for a small benchmark that _does_ reproduce his results. Otherwise, it
appears we may be just shooting in the dark.


Alex
 
Maurice LING

dude that "comparision" from twistedmatrix you refrence is ANCIENT!!!

I wonder what the impact was when IBM decided, in the late 1960s, that
base memory was not to exceed 64kb...

I suppose more experienced people in this list can agree that certain
decisions made can be almost an edict. So, there is a re-building
process every now and then, hopefully to by-pass such edicts. Python
itself is already such an example.

it is comparing versions that are YEARS out of date and use!

Have the codebases of Python 1.5.2 and Java 1.1 been totally replaced and
deprecated?

The Lisp compiler was the first compiler to be created (according to the
Red-Dragon book, I think) and almost all others were created by
bootstrapping from the LISP compiler. What are the implications of design
decisions made in the LISP compiler back then for our compilers today? I
don't know. I repeat myself, I DO NOT KNOW.

you are just trolling or your don't know enough to understand the
answer to your question which is way to vague to be answered, as there
is no real correct answer.

Certainly I do not have 15 PhDs in computer science or computational
mathematics...... I suppose there are some syntax errors in your
statement that keep me from parsing it completely. "too vague", not "to vague".

Thanks
maurice
 
Maurice LING

Not only that, but Maurice Ling started a long thread a few weeks ago
looking for a good research topic for his thesis (involving Java &
Python, btw). At some points he was bashed, but he went ahead with the
discussion, which ended up touching on several interesting topics.

Thank you. You can do a google search and find my details anyway; there
is no need to go around guessing them. I have no idea of the motives of
whoever highlighted "unimelb.edu.au". Whether he has a problem with my
institution or not, I am not bothered, as I am only a student. Email
the Vice Chancellor if he has a point to make; don't take it out on me.
I don't have the title of Professor prefixing my name.

There is a Chinese saying "waves do not happen without wind." If my
impression of Python and Java is flawed, I am seriously wondering where
it came from? Is it all due to benchmark data from years ago?

Before I am accused of trolling etc etc again, let me say this all over
again. I am using both Python and Java actively and have contributed to
the BioPython project as well.

As for the climate in c.l.py, it's just interesting to note that the
climate *is* getting less friendly, and that it does coincide with
several old-timers moving away from the list. A few years ago I
remember that the likes of Tim Peters, effbot, Skip Montanaro, and
several others (sorry, I would really like to remember more names from
the top of my head) were frequent posters here. They've now moved to
other interests, or are focusing their efforts on the python-dev list.
It must not be a coincidence.

Maybe as the old-timers go off, the next batch of old-timers feel that
it is too much to handle. Nobody requires anyone to answer to all
threads.....

Thanks
Maurice
 
Maurice LING

Message from yahoo.com.
Unable to deliver message to the following address(es).

<[email protected]>:
64.157.4.78 failed after I sent the message.
Remote host said: 554 delivery error: dd This user doesn't have a
yahoo.com account ([email protected]) [0] -
mta109.mail.sc5.yahoo.com
 
Maurice LING

Maybe as the old-timers go off, the next batch of old-timers feel that
it is too much to handle. Nobody requires anyone to answer to all
threads.....

Nevertheless I still have to thank Alex for his story on Plato and the
student who questioned the validity of studying geometry before
philosophy (in "reverse Jython" thread).

maurice
 
Eric S. Johansson

Israel said:
I would argue that this would debase newsgroups even further.

Debase in what way? I certainly don't see enough money flowing that it
would lead major python contributors down a road of debauchery and
depravity. At least, those that aren't already there.

First, there would be a bunch of things required to make this work. A
pay on the newsreader client, a connection to a payment system, a
registered identity for payment. This is not a trivial amount of work.

Second, much of the conflict around various forms of intellectual
property boils down to getting paid. By building a low overhead
mechanism through which people could get paid for all sorts of
intellectual property (questions answered, music, video, writings,
code), we would jumpstart a new economic engine.

Initially there would be a big rush of everybody on the face of the planet
trying to get that dollar payment. It would probably be not unlike the
swarms of beggars that surround rich westerners when they travel through
less well-off regions. We would evolve filters and rating systems which
would be used to cut down on the noise. Eventually some contributors
would fall away and others would persist and get better.

Nothing would compel anyone to pay anything. But I could see eventually
folks being rated on whether or not they do pay. If you've got someone
who's always asking time-consuming questions and not paying anything, it
would be totally appropriate to filter out that person.

In the same community of users, you would have some people operating on
a gift economy, others operating on a financial economy and I believe
somehow, it would work. I can't yet describe how but I believe it would
work.

---eric
 
Roger Binns

Alex said:
Yep, I noticed -- pretty silly, but if that's how the Java designers
decided the read method should behave by default, who am I to argue?
Just an example I grabbed off the net, first google hit for (if I
recall correctly) java file reading that had the source of a complete
example.

I/O is one area where Java is *very* different than other environments.
Java emphasises the ability to select at runtime (typically from
properties files) how to do things. This is usually done by
having layers of interfaces, and many implementations of factories
and sometimes factories of factories. That ultimately means
the code will work the same for almost any source and destination,
including doing buffering (an interface), byte to character
conversion (an interface), the actual source (an interface),
output (another interface).

Consequently I/O is one of the hardest things for a newbie Java
programmer to do since it appears so complex due to the flexibility
offered. It is also why bad examples exist, because people
find what works for them and post it.

Note how in Python, files do read/write whereas sockets do
send/recv.

But you are comparing Apples to Oranges. Java programs are written
in certain ways with various emphases (flexibility, interfaces and
factories). Python programs emphasise other things (generators,
typing). Exceptions are expensive in Java and not used much. They
are "cheap" in Python and used frequently.

Right, _Python_'s default.

Arguably Python's default is reading a line at a time, and it
is a bad default in some circumstances (large files), just
as the Java code was a bad way of doing anything but small
files.

The claim posted to this newsgroup, without any support nor examples
being given, was that Python's I/O was far slower than Java's in
_disk-intensive_ operations. I'm still waiting to see any small,
verifiable examples of that posted on this thread.

If the language code is the same, then that claim boils down to
the Java Native Interface vs the Python C API. In the case of
Java, I can see the claim having some relevance in multi-threaded
code since Java doesn't have the GIL.

defaults are tuned, making Python much faster. Great, then let those
who claim Java's I/O is much faster in disk intensive operation post
suitable examples, and we'll see.

Your timing included the overhead of starting up and shutting down
both environments, making the rest of the measure less than
interesting.
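
One way to take interpreter startup and shutdown out of the measurement is to time only the copy itself from inside the running process. The sketch below does that in present-day Python 3; the function names, the temporary file and the choice of five repeats are assumptions made for illustration, and the chunked loop follows the 64 KB buffer idea discussed earlier in the thread.

```python
import os
import tempfile
import time


def copy_file(src, dst, buffer_size=64 * 1024):
    """Copy src to dst in fixed-size chunks of raw bytes."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            buf = fin.read(buffer_size)
            if not buf:
                break
            fout.write(buf)


def time_copy(src, dst, repeats=5):
    """Return the best wall-clock time over several runs, in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        copy_file(src, dst)
        best = min(best, time.perf_counter() - start)
    return best


# Hypothetical stand-in for the dictionary file, same size in bytes.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "input")
dst = os.path.join(workdir, "output")
with open(src, "wb") as f:
    f.write(os.urandom(2486825))

elapsed = time_copy(src, dst)
print(f"best of 5 copies: {elapsed:.4f} s (interpreter startup excluded)")
```

A Java counterpart would wrap the copy loop in System.nanoTime() calls in the same way, so that JVM startup is excluded on that side too.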

If the message you're keen to send is "Python is great for
open-source", yes. If you're focusing on "Python is great for your
_business_" (as the python *business* forum does, for example), then
emphasizing open-source projects can reasonably be considered
secondary to emphasizing projects that make or save money for the
businesses which developed them.

The idea isn't to emphasise the open source side, but rather so
that anyone can see for themselves how it was all put together.
If I have a business critical app, and claim it is written in
Python but no one can see the insides then they can't really
know too much. The biggest thing is that they can't tell
if they could write code like that (or even how much was written)
to produce an app of similar functionality and complexity.

Roger
 
Kent Johnson

Alex said:
I'm glad I posted a sufficiently silly comparison to elicit some
response, then;-)

Yes, good success on that one :)

Which python and java versions are you using?

Old. Unfortunately the Mac is no longer my primary machine and not
up-to-date. I just used it to get timings comparable to yours. It's
Java 1.4.1_01
Python 2.3.3
Mac OSX 10.2.4

....We now return to our regularly scheduled program of trying to
convince Java programmers to try Python...

Kent
 
Terry Hancock

There is a Chinese saying "waves do not happen without wind." If my
impression of Python and Java is flawed, I am seriously wondering where
it came from? Is it all due to benchmark data from years ago?

Not really. It's an impression you could easily get from several
books about Python (e.g. O'Reilly's "Learning Python"), which
make a rather big deal about Python being slow to execute, but
fast to develop in.

The reality is that it isn't really all *that* slow to execute,
and later versions have gotten quite a bit faster.

But it remains really fast to develop in. ;-)

The resistance to the idea really stems from a sense of rivalry
with Java. Which is really interesting actually, because it
wasn't all that long ago that Python was "just a scripting
language" and Java programmers wouldn't feel threatened by it.
Now they do, I guess. ;-)

It's quite possible that Java programmers console themselves
for all the low-level programming work by thinking the result
will be faster, without really testing to find out. Certainly
Java is going to be faster than Jython (Python running on a
Java platform).

And I have certainly written some extremely poorly optimized
Python programs that positively *crawled*.

Cheers,
Terry
 
Kent Johnson

Terry said:
The reality is that [Python] isn't really all *that* slow to execute,
and later versions have gotten quite a bit faster.

But it remains really fast to develop in. ;-)

Certainly
Java is going to be faster than Jython (Python running on a
Java platform).

This is sort of like saying that C is faster than Python. To the Java
programmer, it's an excuse to avoid looking at Jython. To the Jython
programmer, it's "Yes, and your point is...?"

The reality is that Jython isn't really all *that* slow to execute.

But it remains really fast to develop in. ;-)

And you can always recode critical sections in Java for speed.

OK, you can probably guess what language I use in my day job now...

Kent
 
Roy Smith

Terry Hancock said:
And I have certainly written some extremely poorly optimized
Python programs that positively *crawled*.

My guess is the poor performance had nothing to do with the language you
wrote it in, and everything to do with the algorithms you used.

Local optimizations rarely get you more than a factor of 2 improvement.
Choice of language (at least within the same general category such as
comparing one native compiled language to another, or one virtual
machine language to another) probably has a somewhat broader range, but
still a factor of 10 would be quite surprising.

To get really bad performance, you need to pick the wrong algorithm. A
project I worked on a while ago had a bit of quadratic behavior in it
when dealing with files in a directory. Most of our customers dealt
with file sets in the 100's. One customer had 50,000 files. Going
from, say, 500 to 50,000 is a 100-fold increase in N, and gave a
10,000-fold increase in execution time.
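
To make the scaling concrete, here is a small sketch in present-day Python 3. The post does not say what the quadratic operation was, so duplicate detection over file names is a hypothetical stand-in, and all names below are made up. The first function is the all-pairs approach, the second does the same job in a single pass.

```python
def pairwise_comparisons(n):
    """Number of comparisons in a naive all-pairs scan over n items."""
    return n * (n - 1) // 2


def find_duplicates_quadratic(names):
    """O(N^2): compare every name against every later name."""
    dups = set()
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if a == b:
                dups.add(a)
    return dups


def find_duplicates_linear(names):
    """O(N): one pass, remembering the names already seen in a set."""
    seen, dups = set(), set()
    for name in names:
        if name in seen:
            dups.add(name)
        seen.add(name)
    return dups


small, large = 500, 50_000
growth = pairwise_comparisons(large) / pairwise_comparisons(small)
print(f"N grows {large // small}x, pairwise work grows about {growth:,.0f}x")

# Hypothetical file names: 400 entries, of which the first 100 repeat.
names = [f"file{i % 300}.dat" for i in range(400)]
assert find_duplicates_quadratic(names) == find_duplicates_linear(names)
```

Going from 500 to 50,000 items multiplies the pairwise work by roughly 10,000, matching the figure in the post, while the single-pass version grows only 100-fold.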
 
Maurice LING

Not really. It's an impression you could easily get from several
books about Python (e.g. O'Reilly's "Learning Python"), which
make a rather big deal about Python being slow to execute, but
fast to develop in.

Come to think of it, yes, "Python, The Complete Reference", "Programming
Python" and "Learning Python" all seem to have that "speed warning" tag
in the beginning chapters.

If this seems to be the wrong impression, should the authors do something?

The reality is that it isn't really all *that* slow to execute,
and later versions have gotten quite a bit faster.

But it remains really fast to develop in. ;-)

I attest to that, at least for the development speed.

It's quite possible that Java programmers console themselves
for all the low-level programming work by thinking the result
will be faster, without really testing to find out. Certainly
Java is going to be faster than Jython (Python running on a
Java platform).

Jython, as far as I know, wraps each Python object in a Java class. For
example, each variable is a class of its own. I suppose this overhead
does carry speed penalties, simply because "every dash of cheese
is calories, even for low fat cheese."

At the same time, Jython FAQ question 1.6 (how fast is jython?) suggests
that Jython may run 10 times slower than CPython. Personally, I haven't
seen this kind of performance in my programs yet. On the other hand,
"Jython Essentials" suggests that Jython is about 1.5 times the speed of
CPython (CPython takes about 75% of the time compared to Jython). I
haven't seen Jython code run this fast myself either. I would say
that 2-5 times slower seems to be a reasonable range.

Cheers
Maurice
 
Fábio Mendes

On Sat, 2004-11-06 at 22:07 -0500, Roy Smith wrote:
My guess is the poor performance had nothing to do with the language you
wrote it in, and everything to do with the algorithms you used.

Well, try to write a linear algebra algorithm in pure python... Then
you'll see that pyrex, scipy or the C API are your friends. Of course
algorithms matter, but for some CPU-intensive applications python can
run at as little as 1/100 the speed of C using the same algorithm. This,
of course, matters, at least if your program spends most of its time on
this kind of operation.

Local optimizations rarely get you more than a factor of 2 improvement.
Choice of language (at least within the same general category such as
comparing one native compiled language to another, or one virtual
machine language to another) probably has a somewhat broader range, but
still a factor of 10 would be quite surprising.

A 10x or 20x difference is likely to hurt you. If you had a 20x slower
computer, would you be using the same apps as you use now? For scientific
simulations (which interest me most), it's the difference between getting
your results after one day of calculation or after one month...

To get really bad performance, you need to pick the wrong algorithm. A
project I worked on a while ago had a bit of quadratic behavior in it
when dealing with files in a directory. Most of our customers dealt
with file sets in the 100's. One customer had 50,000 files. Going
from, say, 500 to 50,000 is a 100-fold increase in N, and gave a
10,000-fold increase in execution time.

This is an extreme case, but it is typical of how a good algorithm
changes the execution time of your program: it scales up neatly. The
language does matter, but python is well served by a broad range of
libs. For the most CPU-intensive tasks there are usually python bindings
for C/C++/Fortran libraries, so it's usually possible to execute python
code at native speeds. This is why I use python in my (simple) physics
applications: scipy gives me a very good interface for evaluating heavy
numeric stuff. When I program in C++ I'm usually tempted to write a lot
of code from scratch, for I'm too lazy to search for the right libs and
learn how they work. I'm a physicist, not a real programmer, so I end up
with C++ code slower than python's (which uses scipy magic).

The point I want to make here is that library programmers usually write
much better (in the sense that it's faster) code than application
programmers. So, in any language, if you want to write a big project
from scratch, you'll probably end up with a buggy, slow and cumbersome
library for the 'low level' stuff, and the high level interface will
never move on. If you pick up the existing libs and extend them to
suit your needs, you have a better chance of success. Python seems to
be better than java on this point (at least in the OSS arena), for there
is an enormously bigger set of libraries you can start with, and it is
easier to wrap C/C++/Fortran code for it. So the likely 10% or so runtime
penalty I get for running a C lib through python is more than paid back
in application development time. To me, what matters is: can a __python
program__ run as fast as a __c program__? In lots of cases: yes, just
use the right lib.
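
As a small illustration of the "use the right lib" point, here is a sketch
that stays inside the standard library rather than using scipy. It assumes
CPython, where the built-in sum() is implemented in C. The two function names
are made up for this example, and no particular speedup is claimed; run it to
see the numbers on your own machine.

```python
import timeit


def python_loop_sum(values):
    """Sum the values with an explicit loop run by the interpreter."""
    total = 0
    for v in values:
        total += v
    return total


def builtin_sum(values):
    """Hand the same work to the built-in sum() (C code in CPython)."""
    return sum(values)


if __name__ == "__main__":
    data = list(range(1000000))
    # Both approaches must agree before their timings mean anything.
    assert python_loop_sum(data) == builtin_sum(data)
    for func in (python_loop_sum, builtin_sum):
        seconds = timeit.timeit(lambda: func(data), number=10)
        print("%s: %.3fs for 10 runs" % (func.__name__, seconds))
```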

All this 'java interpreter is faster than python interpreter' nonsense
doesn't appeal to me. For CPU-intensive operations, neither interpreter
is good, or even decent, so there is no substitute for a good ol' low
level implementation. Can java use those implementations as easily as
python? --No-- Does java have a comparable set of low level (fast!) libs
wrapped for it? --Not as rich as python's--. So forget the language
shootout contest! Those are not real world examples, they're complete
crap. Python has a richer set of fast libs than java. Use them, and
when it matters, python can usually be faster than java. This is enough
for me to say that python IS faster than java.

If you care that much, a python program can be almost as fast as a C
one: just use a python wrapper to a C lib or use pyrex! This is not to
say that language doesn't matter. Not all so-called scripting languages
have the same facilities as python. For instance, perl is highly
optimized for regex substitution, but lacks broader bindings for, for
example, numeric libraries. We're lucky to use python, we're lucky that
it has so many facilities. Other scripting languages have nice syntax,
powerful builtin objects and other gizmos that make them much more
expressive and productive than their low level counterparts. IMHO python
is unique in that it also provides a very nice set of wrapped low level
libraries for the runtime-critical pieces of code, so in a lot of cases
we can get the best of both worlds. Java is too much of an in-between
for me: execution not so fast, but not so slow; development not so fast,
but not so slow; builtins not so expressive, but not so low level
either; it is not that good, nor that bad, etc...

Cheers,
Fabio
 
M

Maurice LING

What if we do, _AND_ carefully follow Eric Raymond's excellent
recommendations each and every time we ask for help? Are then we
allowed to loathe and despise the mass of clueless dweebs?-)

I do have to thank you for the story on Plato and his students (in
"reverse jython" thread).

Now that the courtesy is done, I believe you have every right not to
reply to all requests for help. As in the law, you have the right to
remain silent...
I can be a newbie at a bazillion subjects, easily -- but I cannot truly
be a newbie at such tasks as human interaction, social dynamics, general
information retrieval and processing. I can easily guess what will
happen if I enter any mailing list or newsgroup with both guns blazing
out of frustration, for example, and therefore I cannot easily
sympathize with anybody who _does_ behave so foolishly. It's not a
matter of expertise about any specific subject, not even exactly one of
skills, but rather one of personal maturity and character.

Bazillion subjects, my respect. Looking back at this thread...

1. I asked a question in which I may have been wrong (many people do that)

2. Some answered and pointed out my misconceptions, thank you all
for that.

3. Some pointed to the fact that I am from the University of Melbourne
(with unknown motives). It was then clarified that I was an honours
student (www.zoology.unimelb.edu.au) in the Dept of Zoology. Perhaps I
may say that I am a molecular biologist by degree.

4. Some pointed out instances (books and all) where my wrong
impressions might have been formed.

5. Some painfully dwelt on my initial misconceptions, generalizing them
to ridicule non-experts, and in the process suggested controversial
code on the pretext of eliciting responses.

6. Furthering it, using notable words from famous people to
discriminate against a group of people, when the first instance had been
breached... you can choose not to reply.

All this happened when the discussion had been taken to other areas... It
makes one wonder about maturity and character......
I know how to search mailing list archives, or google groups ones, and am
considerate enough to use this easily acquired and very useful knowledge
to try and avoid wasting other people's time and energy, for example, by
airing some complaint that's been made a thousand times and answered
very comprehensively. When I can't find an answer that way, I ask with
courtesy and consideration and appreciation for the time of the people
who, I hope, will be answering my questions. Etc, etc -- reread Eric's
essay on how to ask for help, it's a great piece of work.

All the knowledge in this world is in libraries and now, networked
libraries via the internet. Today, almost all high school students in
developed countries are versed in the internet. And by your argument, it
seems that all universities are a complete waste of money, as all
knowledge is out there and the tools to access it are readily available.

As mentioned, the discussion is heading elsewhere and my misconceptions
were cleared up before your replies. If I've indeed forgotten, my sincere
apologies, and I hereby thank you for your time and efforts.
That doesn't mean a newbie isn't always welcome, _if_ they show any sign
whatever of being worth it. But asking for tolerance and patience
against _rude_ newbies which barge in with shrill, mostly unjustified,
repetitious complaints, is, I think, a rather far-fetched request.


Alex

You still have the right to remain solemn.

maurice
 
K

kosh

with Python 2.4 beta 1 for the roughly equivalent:

inputFile = file("/usr/share/dict/web2", 'r')
outputFile = file("/tmp/acopy", 'w')

outputFile.write(inputFile.read())

inputFile.close()
outputFile.close()

I think a generator version works even better. I ran tests at various file
sizes, and overall the generator version was better: vastly better at
large file sizes, and with less impact on the system. The regular version,
which reads the whole file at once, gets really bad with large files,
especially if the system is in use.

I suspect the generator version should work with any file size that the OS
can handle, and should be resource friendly at any size.

Just thought the generator version would be a good comparison to the java
version, and strangely enough in many cases it is actually faster than the
regular version. :)

Generator Version:
inputFile = file("/home/kosh/KNOPPIX_V3.6-2004-08-16-EN.iso", 'r')
outputFile = file("/tmp/acopy", 'w')

outputFile.writelines(inputFile)

inputFile.close()
outputFile.close()

Regular Version:
inputFile = file("/home/kosh/temp.txt", 'r')
outputFile = file("/tmp/acopy", 'w')

outputFile.write(inputFile.read())

inputFile.close()
outputFile.close()
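
For readers on a current Python 3, the same contrast might be sketched as
below. This is not kosh's code: the function names and the 64 KiB chunk size
are assumptions, and the chunked variant uses fixed-size reads through
shutil.copyfileobj instead of iterating over lines, which keeps memory use
bounded even for binary files such as an ISO image.

```python
import shutil


def copy_whole(src, dst):
    """Read the entire file into memory, then write it out in one call."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        fout.write(fin.read())


def copy_chunked(src, dst, chunk_size=64 * 1024):
    """Copy in fixed-size chunks so memory use stays near chunk_size."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout, chunk_size)
```

Both functions open the files in binary mode, so the bytes are copied
unchanged on every platform.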

The timing was done with python 2.3.4 on debian/sid


Large File Test:
File Copied: 733499392 KNOPPIX_V3.6-2004-08-16-EN.iso

      Real        User       Sys
Gen   0m33.478s   0m4.302s   0m3.542s
Reg   2m28.029s   0m0.010s   0m4.992s *
Reg   0m34.913s   0m0.009s   0m4.713s

* This is how long the first run took; the machine swapped heavily.
The other time is for subsequent runs. Overall, this method uses a
massive amount of RAM.


Memory Usage:

      Virt    Res     Shr
Gen   3816K   2364K   2524K
Reg   703M    700M    2524K

Small File Test

File Copied: 2754459 Zope-2.7.2-0.tgz

      Real       User       Sys
Gen   0m0.049s   0m0.023s   0m0.014s
Reg   0m0.037s   0m0.009s   0m0.019s


Tiny File Test:

File Copied: 205 May temp.txt

      Real       User       Sys
Gen   0m0.012s   0m0.007s   0m0.005s
Reg   0m0.028s   0m0.009s   0m0.003s
 
T

Tim Roberts

Maurice LING said:
I've already said the following and was not noticed:

1. it is a disk intensive I/O operation.
2. user delay is not in the equation (there is no user input)
3. I am not interested in the amount of time needed to develop it, only
in execution speed.

It is fabulous that you are able to enumerate your list of requirements so
completely. I'm quite serious; many people embark on even complicated
projects without a clear understanding of the tradeoffs they will
encounter.

However, given that set of needs, why would you mess with an "exotic"
language at all? Why wouldn't you just write straight to the metal in C++
or C?
 
M

Maurice LING

It is fabulous that you are able to enumerate your list of requirements so
completely. I'm quite serious; many people embark on even complicated
projects without a clear understanding of the tradeoffs they will
encounter.

However, given that set of needs, why would you mess with an "exotic"
language at all? Why wouldn't you just write straight to the metal in C++
or C?

Perhaps you can attribute it to C-phobia.
maurice
 
