Can I beat perl at grep-like processing speed?

Discussion in 'Python' started by js, Dec 29, 2006.

  1. js

    js Guest

    Just curious.
    Can Python beat Perl at the speed of grep-like processing?


    $ wget http://www.gutenberg.org/files/7999/7999-h.zip
    $ unzip 7999-h.zip
    $ cd 7999-h
    $ cat *.htm > bigfile
    $ du -h bigfile
    8.2M bigfile

    ---------- grep.pl ----------
    #!/usr/local/bin/perl
    open(F, 'bigfile') or die;

    while(<F>) {
        s/[\n\r]+$//;                    # strip trailing CR/LF
        print "$_\n" if m/destroy/oi;    # case-insensitive match
    }
    ---------- END ----------
    ---------- grep.py ----------
    #!/usr/bin/env python
    import re
    r = re.compile(r'destroy', re.IGNORECASE)

    for s in file('bigfile'):
        # print matching lines without their trailing CR/LF
        if r.search(s): print s.rstrip("\r\n")
    ---------- END ----------

    $ time perl grep.pl > pl.out; time python grep.py > py.out
    real 0m0.168s
    user 0m0.149s
    sys 0m0.015s

    real 0m0.450s
    user 0m0.374s
    sys 0m0.068s
    # I used python2.5 and perl 5.8.6
    js, Dec 29, 2006
    #1

  2. js wrote:

    > Just curious.
    > Can Python beat Perl at the speed of grep-like processing?

    I'm thankful for the Python version; otherwise I'd never have guessed
    what that code was supposed to do!

    Try this:
    ---------- grep.py ----------
    #!/usr/bin/env python
    import re
    def main():
        # Bind the compiled pattern's search method to a local name.
        search = re.compile(r'destroy', re.IGNORECASE).search

        for s in file('bigfile'):
            if search(s): print s.rstrip("\r\n")

    main()
    ---------- END ----------
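
    For anyone on Python 3, where file() and the print statement no longer
    exist, the same idea might look roughly like the sketch below. The
    presumed reason for wrapping the loop in a function is that CPython
    resolves local names faster than globals and attributes. The names
    grep and main, the latin-1 encoding, and the command-line handling are
    assumptions for illustration, not part of the scripts in this thread.

```python
#!/usr/bin/env python3
# Python 3 sketch of the script above (the thread itself used Python 2.5).
import re
import sys


def grep(lines, pattern="destroy"):
    """Yield each matching line, ignoring case, without its trailing CR/LF."""
    # Local binding: one lookup here instead of a global plus attribute
    # lookup on every iteration of the loop.
    search = re.compile(pattern, re.IGNORECASE).search
    for line in lines:
        if search(line):
            yield line.rstrip("\r\n")


def main(path):
    # latin-1 is an assumption: it decodes any byte, so odd characters in
    # the HTML cannot abort the run with a UnicodeDecodeError.
    with open(path, encoding="latin-1") as f:
        for hit in grep(f):
            print(hit)


# Only runs when a file name is given, e.g.: python3 grep3.py bigfile
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])
```

    Timings from this version are not comparable with the Python 2.5
    numbers quoted in this thread, since Python 3 also decodes the text.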
    Christophe Cavalaria, Dec 29, 2006
    #2

  3. js wrote:
    > Just curious.
    > Can Python beat Perl at the speed of grep-like processing?


    Playing for the other side temporarily, this is nearly twice as fast...

    $ time perl -lne 'print if m/destroy/oi' bigfile >pl.out
    real 0m0.133s
    user 0m0.120s
    sys 0m0.012s

    vs

    $ time ./z.pl >pl.out.orig
    real 0m0.223s
    user 0m0.208s
    sys 0m0.016s

    Both give the same output, modulo a few \r characters.

    --
    Nick Craig-Wood -- http://www.craig-wood.com/nick
    Nick Craig-Wood, Dec 30, 2006
    #3
  4. js wrote:
    > Just curious.
    > Can Python beat Perl at the speed of grep-like processing?


    Probably not.


    Please note that you're also benchmarking I/O here. Perl seems to use
    a custom, highly optimized I/O library that is much faster than the
    system's. I once made a quick-and-dirty cat-like comparison of Perl,
    Python and C on my Gentoo Linux box, and the Perl version was insanely
    faster than the C one.

    Now the real question is, IMHO: is the Python version fast enough?

    My 2 cents.
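
    One way to check how much of the run is I/O rather than matching is to
    time the two phases separately. What follows is only a rough sketch in
    Python 3 (not the Python 2.5 used in this thread); the function name
    time_read_and_search is made up for illustration, and the cost of
    printing the matches is deliberately left out.

```python
import re
import time


def time_read_and_search(path, pattern="destroy"):
    """Return (read_seconds, search_seconds, match_count) for one file."""
    t0 = time.perf_counter()
    with open(path, "rb") as f:
        lines = f.readlines()  # I/O phase: read and split into lines
    t1 = time.perf_counter()

    # Matching phase: works on the in-memory list only, no file access.
    search = re.compile(pattern.encode("ascii"), re.IGNORECASE).search
    hits = sum(1 for line in lines if search(line))
    t2 = time.perf_counter()

    return t1 - t0, t2 - t1, hits
```

    Calling time_read_and_search("bigfile") a few times in a row is worth
    doing, because the first read may come from disk and later ones from
    the operating system's file cache.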
    Bruno Desthuilliers, Jan 2, 2007
    #4
  5. Nick Craig-Wood wrote:

    >> #!/usr/bin/env python
    >> import re
    >> r = re.compile(r'destroy', re.IGNORECASE)
    >>
    >> for s in file('bigfile'):
    >>     if r.search(s): print s.rstrip("\r\n")


    footnote: if you're searching for literal strings with Python 2.5, using "in" is a
    lot faster than using re.search.
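
    A minimal sketch of that idea in Python 3 syntax, assuming the needle
    is plain ASCII. The name grep_literal is hypothetical, and lower() is
    used as a stand-in for re.IGNORECASE; since it copies every line,
    whether this still beats the regex for a case-insensitive search is
    something to measure rather than assume.

```python
def grep_literal(lines, needle="destroy"):
    """Yield lines containing needle, ignoring case, via substring search."""
    needle = needle.lower()
    for line in lines:
        # "in" does a plain substring search, no regex engine involved.
        if needle in line.lower():
            yield line.rstrip("\r\n")
```

    For a case-sensitive search the lower() calls can simply be dropped.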

    </F>
    Fredrik Lundh, Jan 3, 2007
    #5