Marc H.
Hello,
I recently converted one of my Perl scripts to Python. The script
simply searches a lot of big mail files (~40MB each) to retrieve
specific emails. I converted it line by line, keeping the algorithms
and functions as they were in Perl (no optimization). The purpose was
mainly to learn Python and see how it differs from Perl.
Once the converted script was finished, I was amazed to find that the
Python version runs about 8 times faster. Needless to say, I was very
intrigued and wanted to know what causes such a performance gap
between the two versions. To keep the story short: after some research
and a few tests, I found that file I/O is the main cause of the
difference.
I wrote two short test scripts, one in Perl and one in Python (see
below), and compared their run times. The bigger the file, the larger
the gap: Python is roughly 3.6 times faster on a 10MB file and almost
8 times faster on a 40MB file.
I'm fairly new to Python and don't know much about its inner workings,
so I wonder if someone could explain why it is so much faster in
Python to open a file and load it into a list/array?
Thanks
-----
#!/usr/bin/python
# Read the whole file into a list of lines, 20 times over.
for i in range(20):
    Data = open('data.test').readlines()
-----
#!/usr/bin/perl
# Read the whole file into an array of lines, 20 times over.
for ($i = 0; $i < 20; $i++) {
    open(DATA, "data.test") or die "Cannot open data.test: $!";
    @Data = <DATA>;
    close(DATA);
}
-----
Running tests (data.test = 10MB text file):
blop@moya blop $ time ./ftest.py
real 0m6.408s
user 0m4.552s
sys 0m1.826s
blop@moya blop $ time ./ftest.pl
real 0m22.855s
user 0m21.946s
sys 0m0.822s
-----
Running tests (data.test = 40MB text file):
blop@moya blop $ time ./ftest.py
real 0m26.235s
user 0m18.238s
sys 0m7.872s
blop@moya blop $ time ./ftest.pl
real 3m26.741s
user 3m22.168s
sys 0m3.764s
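-----
The `time` figures above include interpreter startup and cover all 20
passes at once. One way to narrow things down would be to time each
pass separately. The following is only a minimal sketch, assuming
Python 3; the helper names `time_readlines` and `make_sample_file` are
hypothetical, and the generated file is a small stand-in for
data.test, not the file used in the tests above.

```python
import os
import tempfile
import time


def time_readlines(path, repeats=20):
    """Return (line_count, per-pass durations in seconds) for readlines()."""
    durations = []
    line_count = 0
    for _ in range(repeats):
        start = time.perf_counter()
        with open(path) as handle:
            lines = handle.readlines()
        durations.append(time.perf_counter() - start)
        line_count = len(lines)
    return line_count, durations


def make_sample_file(num_lines=100_000):
    """Write a throwaway text file (hypothetical stand-in for data.test)."""
    fd, path = tempfile.mkstemp(suffix=".test", text=True)
    with os.fdopen(fd, "w") as handle:
        for i in range(num_lines):
            handle.write("line %d of sample mail data\n" % i)
    return path


if __name__ == "__main__":
    sample = make_sample_file()
    try:
        count, times = time_readlines(sample, repeats=5)
        print("lines read:", count)
        print("fastest pass: %.4fs, slowest pass: %.4fs"
              % (min(times), max(times)))
    finally:
        os.remove(sample)
```

Pointing `time_readlines` at the real data.test would show whether the
cost per pass stays flat or grows from one pass to the next.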