Perl / python regex / performance comparison

Discussion in 'Python' started by Ivan, Mar 3, 2009.

  1. Ivan

    Ivan Guest

    Hello everyone,

    I know this is not a direct python question, forgive me for that, but
    maybe some of you will still be able to help me. I've been told that
    for my application it would be best to learn a scripting language, so
    I looked around and found perl and python to be the nicest. Their syntax
    and "way" are not similar, though.
    So, I was wondering, could any of you please elaborate on the
    following, so as to ease my dilemma:

    1. Although it is all relatively similar, there are differences
    between regexes of these two. Which do you believe is the more
    powerful variant (maybe an example) ?

    2. They are both interpreted languages, and I can't really be sure how
    they measure in speed. In your opinion, for handling large files,
    which is better ?
    (I'm processing files of numerical data of several hundred mb - let's
    say 200mb - how would python handle file of such size ? As compared to
    perl ?)

    3. This last one is somewhat subjective, but what do you think, in the
    future, which will be more useful. Which, in your (humble) opinion
    "has a future" ?

    Thank you for all the info you can spare; I am especially grateful for
    the time you spend in doing so.
    -- Ivan
     
    Ivan, Mar 3, 2009
    #1

  2. On Tue, Mar 3, 2009 at 7:03 PM, Ivan wrote:
    > Hello everyone,
    >
    > I know this is not a direct python question, forgive me for that, but
    > maybe some of you will still be able to help me. I've been told that
    > for my application it would be best to learn a scripting language, so
    > I looked around and found perl and python to be the nicest. Their syntax
    > and "way" are not similar, though.
    > So, I was wondering, could any of you please elaborate on the
    > following, so as to ease my dilemma:
    >
    > 1. Although it is all relatively similar, there are differences
    > between regexes of these two. Which do you believe is the more
    > powerful variant (maybe an example) ?
    >
    > 2. They are both interpreted languages, and I can't really be sure how
    > they measure in speed. In your opinion, for handling large files,
    > which is better ?
    > (I'm processing files of numerical data of several hundred mb - let's
    > say 200mb - how would python handle file of such size ? As compared to
    > perl ?)
    >
    > 3. This last one is somewhat subjective, but what do you think, in the
    > future, which will be more useful. Which, in your (humble) opinion
    > "has a future" ?
    >
    > Thank you for all the info you can spare; I am especially grateful for
    > the time you spend in doing so.
    > -- Ivan


    I can answer your second question (whether Python will handle large
    files). In my case I use Python to create statistics from some trace
    files from a genetic algorithm, and my current size is up to 20MB for
    about 40 files. I do the following:
    * use regular expressions to identify each line type and extract the
    information (as numbers);
    * either create statistics on the fly, or load the dumped data
    into an Sqlite3 database (which got up to a couple of hundred MB);
    * everything has worked fine so far.
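
    The steps in the list above might be sketched roughly as follows. This
    is only an illustration: the trace-line formats, the PATTERNS table, the
    parse_line and load_trace helpers, and the table schema are all
    hypothetical, since the real trace files are not shown here.

```python
import re
import sqlite3

# Hypothetical line formats; the real trace format is not shown in the post.
PATTERNS = {
    "generation": re.compile(r"^GEN (\d+) best=([-+]?\d+(?:\.\d+)?)$"),
    "mutation": re.compile(r"^MUT (\d+) rate=([-+]?\d+(?:\.\d+)?)$"),
}

def parse_line(line):
    """Return (kind, index, value) for a recognised line, else None."""
    stripped = line.strip()
    for kind, pattern in PATTERNS.items():
        match = pattern.match(stripped)
        if match:
            return kind, int(match.group(1)), float(match.group(2))
    return None

def load_trace(lines, connection):
    """Stream lines into SQLite one at a time; no list of rows is built."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS trace (kind TEXT, idx INTEGER, value REAL)"
    )
    rows = (row for row in map(parse_line, lines) if row is not None)
    connection.executemany("INSERT INTO trace VALUES (?, ?, ?)", rows)
    connection.commit()

# Small in-memory demonstration; a real run would pass an open file instead.
sample = ["GEN 1 best=0.5", "MUT 1 rate=0.01", "noise", "GEN 2 best=0.75"]
conn = sqlite3.connect(":memory:")
load_trace(sample, conn)
best = conn.execute(
    "SELECT MAX(value) FROM trace WHERE kind = 'generation'"
).fetchone()[0]
print(best)
```

    Because load_trace accepts any iterable of lines, an open file object
    can be passed directly, so the whole file never has to sit in memory.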

    I've also used Python (or rather, an application built in Python
    with cElementTree?) that took the Wikipedia XML dumps (7GB? I'm not
    sure, but a couple of GB) and created a custom-format file, from
    which I tried to create SQL inserts... And everything worked well.
    (Of course it took some time to do all the processing.)
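
    For readers who have not seen incremental XML parsing, here is a
    minimal sketch of the general idea, not the application mentioned
    above. It uses xml.etree.ElementTree.iterparse from the standard
    library (the separate cElementTree module no longer exists in current
    Python 3 releases). The <dump>/<page>/<title> layout and the
    iter_titles helper are simplified assumptions; real Wikipedia dumps
    use namespaced tags and a richer structure.

```python
import io
import xml.etree.ElementTree as ET

def iter_titles(source):
    """Yield the <title> text of each <page>, then empty that page element."""
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "page":
            yield elem.findtext("title")
            elem.clear()  # drop the page's children and text once processed

# Tiny stand-in for a dump file; a real run would pass a filename or file object.
sample = io.BytesIO(
    b"<dump><page><title>Perl</title><text>...</text></page>"
    b"<page><title>Python</title><text>...</text></page></dump>"
)
titles = list(iter_titles(sample))
print(titles)
```

    Clearing each element after use is what keeps memory growth small even
    when the input is far larger than RAM.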

    So my conclusion is that if you keep your in-memory data small and
    choose the right solution for the problem, you can use Python without
    (big) overhead.
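
    As an illustration of keeping in-memory data small, the sketch below
    computes summary statistics in a single pass, holding only a few
    numbers at any time. It assumes one numeric value per line, which may
    not match the actual data layout; running_stats is a hypothetical
    helper name.

```python
import io

def running_stats(lines):
    """Return (count, mean, minimum, maximum) using constant memory."""
    count = 0
    total = 0.0
    lowest = float("inf")
    highest = float("-inf")
    for line in lines:  # a file object yields one line at a time
        line = line.strip()
        if not line:
            continue  # skip blank lines
        value = float(line)
        count += 1
        total += value
        lowest = min(lowest, value)
        highest = max(highest, value)
    mean = total / count if count else float("nan")
    return count, mean, lowest, highest

# In-memory stand-in for a large file of numbers.
sample = io.StringIO("1.0\n2.0\n\n6.0\n")
print(running_stats(sample))
```

    With a real file, the same function can be called on the object
    returned by open(), and memory use stays flat whether the file is
    20MB or 200MB.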

    As another side note, I've also used Python (with NumPy) to implement
    neural networks (in fact clustering with ART), where I had about 20
    thousand training elements (arrays of thousands of elements), and it
    worked remarkably well (I would say better than in Java, and comparable
    with C/C++).

    I hope I've helped you,
    Ciprian Craciun.

    P.S. If you just need a single regular-expression substitution, or
    plain regular-expression searching, then just use sed or grep, as you
    would not get anything better than them.
     
    Ciprian Dorin Craciun, Mar 3, 2009
    #2

  3. Terry Reedy

    Terry Reedy Guest

    Ivan wrote:
    > Hello everyone,
    >
    > I know this is not a direct python question, forgive me for that, but
    > maybe some of you will still be able to help me. I've been told that
    > for my application it would be best to learn a scripting language, so
    > I looked around and found perl and python to be the nicest. Their syntax
    > and "way" are not similar, though.
    > So, I was wondering, could any of you please elaborate on the
    > following, so as to ease my dilemma:


    Which way are *you* more comfortable with? There are people who
    regularly use both, and many who do not.

    >
    > 1. Although it is all relatively similar, there are differences
    > between regexes of these two. Which do you believe is the more
    > powerful variant (maybe an example) ?


    This is not relevant to your application below. In any case, the
    differences are in rather esoteric details.
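
    One example of the kind of small difference meant here, offered only
    as an illustration: Python spells a named group (?P<name>...), whereas
    the usual Perl spelling is (?<name>...), and Python passes flags as
    arguments instead of trailing modifiers such as /i. The sample strings
    below are made up.

```python
import re

# Named groups in Python use (?P<name>...).
pattern = re.compile(r"(?P<key>\w+)=(?P<value>[-+]?\d+(?:\.\d+)?)")

match = pattern.search("temperature=21.5")
print(match.group("key"), float(match.group("value")))

# Case-insensitive matching is requested with a flag argument.
found = re.findall(r"perl|python", "Perl and Python", re.IGNORECASE)
print(found)
```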
    >
    > 2. They are both interpreted languages, and I can't really be sure how
    > they measure in speed. In your opinion, for handling large files,
    > which is better ?
    > (I'm processing files of numerical data of several hundred mb - let's
    > say 200mb - how would python handle file of such size ? As compared to
    > perl ?)


    For one file and simple processing, the time difference should be less
    than the time you spent asking the question. For complex processing or
    multiple files, a Python user might use numpy, scipy, or other
    pre-written analysis extensions.

    > 3. This last one is somewhat subjective, but what do you think, in the
    > future, which will be more useful. Which, in your (humble) opinion
    > "has a future" ?


    Python ;-) at least for me.

    Terry Jan Reedy
     
    Terry Reedy, Mar 3, 2009
    #3