python vs perl lines of code

Discussion in 'Perl Misc' started by Edward Elliott, May 17, 2006.

  1. This is just anecdotal, but I still find it interesting. Take it for what
    it's worth. I'm interested in hearing others' perspectives, just please
    don't turn this into a pissing contest.

    I'm in the process of converting some old perl programs to python. These
    programs use some network code and do a lot of list/dict data processing.
    The old ones work fine but are a pain to extend. After two conversions,
    the python versions are noticeably shorter.

    The first program does some http retrieval, sort of a poor-man's wget with
    some extra features. In fact it could be written as a bash script with
    wget, but the extra processing would make it very messy. Here are the
    numbers on the two versions:

                        Raw         -Blanks      -Comments
                    lines  chars  lines  chars  lines  chars
    mirror.py         167   4632    132   4597    118   4009
    mirror.pl         309   5836    211   5647    184   4790

    I've listed line and character counts for three forms. Raw is the source
    file as-is. -Blanks is the source with blank lines removed, including
    lines with just a brace. -Comments removes both blanks and comment lines.
    I think -Blanks is the better measure because comments are a function of
    code complexity, but either works.
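
    The three counting forms described above can be sketched as follows. This
    is only an illustration, not the tooling actually used for the table: the
    names count_forms, measure, BRACE_ONLY and COMMENT_ONLY are hypothetical,
    a "comment line" is assumed to be any line whose first non-blank character
    is '#' (so a shebang line counts as a comment and trailing comments are
    left in place), and characters are assumed to include the newline. The
    last assumption at least matches the mirror.py row, where dropping 35
    blank lines also drops exactly 35 characters.

```python
import re

# Assumption: a "brace line" holds only a single { or }, optionally followed by ';'.
BRACE_ONLY = re.compile(r"^\s*[{}]\s*;?\s*$")
# Assumption: a "comment line" starts with '#' after optional indentation.
COMMENT_ONLY = re.compile(r"^\s*#")


def measure(lines):
    """Return (line_count, char_count); each line keeps its trailing newline."""
    return len(lines), sum(len(line) for line in lines)


def count_forms(source):
    """Return line/char counts for the Raw, -Blanks and -Comments forms."""
    raw = source.splitlines(keepends=True)
    # -Blanks: drop empty/whitespace-only lines and lines with just a brace.
    no_blanks = [ln for ln in raw if ln.strip() and not BRACE_ONLY.match(ln)]
    # -Comments: additionally drop whole-line comments.
    no_comments = [ln for ln in no_blanks if not COMMENT_ONLY.match(ln)]
    return {
        "raw": measure(raw),
        "-blanks": measure(no_blanks),
        "-comments": measure(no_comments),
    }
```

    Calling count_forms(open(path).read()) on each file would give one row of
    a table like the one above, under the stated assumptions.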

    By the numbers, the python code appears roughly 60% as long by line and 80%
    as long by characters. That the chars percentage is higher than the line
    percentage doesn't surprise me, since things like list comprehensions and
    explicit module calling produce lengthy but readable lines.
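
    For reference, the ratios behind the "roughly 60%" and "80%" figures can
    be recomputed directly from the mirror table. The figures are the ones
    posted; the COUNTS layout and the ratios helper are just an illustrative
    sketch.

```python
# (lines, chars) per form, copied from the mirror.py / mirror.pl table.
COUNTS = {
    "raw":       {"py": (167, 4632), "pl": (309, 5836)},
    "-blanks":   {"py": (132, 4597), "pl": (211, 5647)},
    "-comments": {"py": (118, 4009), "pl": (184, 4790)},
}


def ratios(counts):
    """Return {form: (line_ratio, char_ratio)} of python relative to perl."""
    out = {}
    for form, by_lang in counts.items():
        py_lines, py_chars = by_lang["py"]
        pl_lines, pl_chars = by_lang["pl"]
        out[form] = (py_lines / pl_lines, py_chars / pl_chars)
    return out


for form, (line_ratio, char_ratio) in ratios(COUNTS).items():
    print(f"{form:<10} lines {line_ratio:.0%}  chars {char_ratio:.0%}")
```

    Depending on the form, the line ratio comes out between about 54% and 64%
    and the character ratio between about 79% and 84%.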

    I should point out this wasn't a straight line-for-line conversion, but the
    basic code structure is extremely similar. I did make a number of
    improvements in the Python version with stricter arg checks and better
    error handling, plus added a couple minor new features.

    The second program is an smtp outbound filtering proxy. Same categories as
    before:

                        Raw         -Blanks      -Comments
                    lines  chars  lines  chars  lines  chars
    smtp-proxy.py     261   7788    222   7749    205   6964
    smtp-proxy.pl     966  24110    660  23469    452  14869

    The numbers here look much more impressive but it's not a fair comparison.
    I wasn't happy with any of the cpan libraries for smtp sending at the time
    so I rolled my own. That accounts for 150 raw lines of difference. Another
    70 raw lines are logging functions that the python version does with the
    standard library. The new version performs the same algorithms and data
    manipulations as the original. I did do some major refactoring along the
    way, but it wasn't the sort that greatly reduces line count by eliminating
    redundancy; there is very little redundancy in either version. In any
    case, these factors alone don't account for the entire difference, even if
    you take 220 raw lines directly off the latter columns.
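
    The adjustment described above can be worked through as a quick sketch:
    take the 150 lines of home-grown smtp code plus the 70 lines of logging
    (220 in total) off each perl line count and recompute the ratio. The
    figures come from the table; the adjusted_ratios helper and constant names
    are hypothetical, and subtracting the full 220 from the -Blanks and
    -Comments columns is deliberately generous to the perl version.

```python
# Line counts copied from the smtp-proxy table.
PY_LINES = {"raw": 261, "-blanks": 222, "-comments": 205}
PL_LINES = {"raw": 966, "-blanks": 660, "-comments": 452}

HOMEGROWN_SMTP = 150  # raw lines of hand-rolled smtp sending code
LOGGING = 70          # raw lines of logging functions


def adjusted_ratios(py, pl, discount):
    """Return python/perl line ratios after removing `discount` perl lines."""
    return {form: py[form] / (pl[form] - discount) for form in py}


discount = HOMEGROWN_SMTP + LOGGING
for form, ratio in adjusted_ratios(PY_LINES, PL_LINES, discount).items():
    print(f"{form:<10} {PY_LINES[form]} vs {PL_LINES[form] - discount} "
          f"perl lines ({ratio:.0%})")
```

    Even with the discount applied, the python version stays shorter in every
    column (261 vs 746, 222 vs 440, 205 vs 232 lines).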

    The two versions were written about 5 years apart, all by me. At the time
    of each, I had about 3 years experience in the given language and would
    classify my skill level in it as midway between intermediate and advanced.
    IOW I'm very comfortable with the language and library reference docs (minus
    a few odd corners), but generally draw the line at mucking with interpreter
    internals like symbol tables.

    I'd like to hear from others what their experience converting between perl
    and python is (either direction). I don't have the sense that either
    language is particularly better suited for my problem domain than the
    other, as they both handle network io and list/dict processing very well.
    What are the differences like in other domains? Do you attribute those
    differences to the language, the library, the programmer, or other
    factors? What are the consistent differences across space and time, if
    any? I'm interested in properties of the code itself, not performance.

    And just what is the question to the ultimate answer to life, the universe,
    and everything anyway? ;)

    --
    Edward Elliott
    UC Berkeley School of Law (Boalt Hall)
    complangpython at eddeye dot net
     
    Edward Elliott, May 17, 2006
    #1

  2. Edward Elliott

    John Bokma Guest

    John Bokma, May 17, 2006
    #2

  3. John Bokma wrote:

    > Edward Elliott <nobody@127.0.0.1> wrote:
    >
    >> This is just anecdotal, but I still find it interesting. Take it for
    >> what it's worth. I'm interested in hearing others' perspectives, just
    >> please don't turn this into a pissing contest.
    >
    > Without seeing the actual code this is quite meaningless.


    For evaluating my experiences, yes. For relating your own, no.

    --
    Edward Elliott
    UC Berkeley School of Law (Boalt Hall)
    complangpython at eddeye dot net
     
    Edward Elliott, May 17, 2006
    #3
  4. Edward Elliott

    brian d foy Guest

    In article <IRuag.28298$>, Edward
    Elliott <nobody@127.0.0.1> wrote:

    > This is just anecdotal, but I still find it interesting. Take it for what
    > it's worth. I'm interested in hearing others' perspectives, just please
    > don't turn this into a pissing contest.
    >
    > I'm in the process of converting some old perl programs to python. These
    > programs use some network code and do a lot of list/dict data processing.
    > The old ones work fine but are a pain to extend. After two conversions,
    > the python versions are noticeably shorter.


    You've got some hidden assumptions in there somewhere, even if you
    aren't admitting them to yourself. :)

    You have to note that rewriting a program, even in the same language,
    tends to make it shorter, too. These things are measures of programmer
    skill, not the usefulness or merit of a particular language.

    Shorter doesn't really mean anything though, and line count means even
    less. The number of statements or the statement density might be
    slightly more meaningful. Furthermore, you can't judge a script by just
    the lines you see. Count the lines of all the libraries and support
    files that come into play. Even then, that's next to meaningless unless
    the two things do exactly the same thing and have exactly the same
    features and capabilities.

    I can write a one line (or very short) program (in any language) that
    does the same thing your scripts do just by hiding the good stuff in a
    library. One of my friends likes to talk about his program that
    implements Tetris in one statement (because he hardwired everything
    into a chip). That doesn't lead us to any greater understanding of
    anything though.

     
    brian d foy, May 17, 2006
    #4
  5. Edward Elliott

    achates Guest

    It probably says something about your coding style, particularly in
    perl. I've found (anecdotally of course) that while perl is potentially
    the more economical language, writing *legible* perl takes a lot more
    space.
     
    achates, May 17, 2006
    #5
  6. Edward Elliott

    Adam Jones Guest

    Without any more information I would say the biggest contributor to
    this dissimilarity is your experience. Having spent an additional five
    years writing code you probably are better now at programming than you
    were then. I am fairly confident that if you were to take another crack
    at these same programs in perl you would see similar results.

    One of the bigger differences might have been language changes over
    time. If you had written this in python five years ago (assuming the
    python rewrites are relatively current, otherwise this list gets
    bigger) you would not have generators, iterators, the logging package,
    built in sets, decorators, and a host of other changes. Some of these
    features you may not have used, but for every one you did python would
    have had more weight.

    Other than that it all boils down to how the algorithm is implemented.
    Between those three factors you can probably account for most of the
    differences here. The real important question is: what has perl done in
    the last five years to make writing these scripts easier?
     
    Adam Jones, May 17, 2006
    #6
  7. brian d foy wrote:

    > You have to note that rewriting a program, even in the same language,
    > tends to make it shorter, too. These things are measures of programmer
    > skill, not the usefulness or merit of a particular language.


    I completely agree. But you have to start somewhere.

    > Shorter doesn't really mean anything though, and line count means even
    > less. The number of statements or the statement density might be
    > slightly more meaningful. Furthermore, you can't judge a script by just
    > the lines you see. Count the lines of all the libraries and support
    > files that come into play. Even then, that's next to meaningless unless
    > the two things do exactly the same thing and have exactly the same
    > features and capabilities.


    For an objective measure of which language/environment is more optimal for a
    given task, your statement is completely accurate. OTOH for a
    quick-and-dirty real-world comparison of line counts, and possibly a rough
    approximation of complexity, the libraries don't matter if they offer
    more-or-less comparable functionality. Especially if those libraries are
    the standard ones most people rely on.

    I'm not attaching any special significance to line counts. They're just a
    data point that's easy to quantify. What if anything do they mean? How
    does one measure statement density? What's the divisor in the density
    ratio - lines, characters, units of work, etc? These are all interesting
    questions with no easy answers.
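
    One possible answer to the divisor question, for the python side only, is
    statements per code line. The sketch below is an assumption about how
    "statement density" might be defined, not an established metric: it counts
    every statement node in the parsed syntax tree (compound statements such
    as "if" count once, plus their bodies) and divides by the number of lines
    that are neither blank nor whole-line comments. The statement_density name
    is hypothetical, and the perl side would need a perl parser to produce a
    comparable number.

```python
import ast


def statement_density(source):
    """Return statements per code line for a piece of Python source."""
    tree = ast.parse(source)
    # Every node that is a statement (assignments, if, for, def, ...).
    statements = sum(isinstance(node, ast.stmt) for node in ast.walk(tree))
    # Code lines: non-blank lines that are not whole-line comments.
    code_lines = sum(
        1
        for line in source.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    )
    return statements / code_lines if code_lines else 0.0
```

    A value above 1 would mean several statements packed onto some lines; a
    value below 1 would mean statements spread over continuation lines.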

    > I can write a one line (or very short) program (in any language) that
    > does the same thing your scripts do just by hiding the good stuff in a
    > library. One of my friends likes to talk about his program that
    > implements Tetris in one statement (because he hardwired everything
    > into a chip). That doesn't lead us to any greater understanding of
    > anything though.


    Of course. Extreme cases are just that.

    --
    Edward Elliott
    UC Berkeley School of Law (Boalt Hall)
    complangpython at eddeye dot net
     
    Edward Elliott, May 18, 2006
    #7
  8. achates wrote:

    > It probably says something about your coding style, particularly in
    > perl. I've found (anecdotally of course) that while perl is potentially
    > the more economical language, writing *legible* perl takes a lot more
    > space.


    I'm sure it does. My perl (from 5 years ago) may be considered verbose (or
    not, I don't know). I avoid shortcuts like $_, use strict mode, etc. Then
    again I frequently use short forms like "statement if/unless (blah);" when
    appropriate. So there's a big personal component in there.

    But again, the interesting thing to me isn't what could one do, it's what
    are people actually doing in the real world?

    --
    Edward Elliott
    UC Berkeley School of Law (Boalt Hall)
    complangpython at eddeye dot net
     
    Edward Elliott, May 18, 2006
    #8
  9. Adam Jones wrote:

    > Without any more information I would say the biggest contributor to
    > this dissimilarity is your experience. Having spent an additional five
    > years writing code you probably are better now at programming than you
    > were then. I am fairly confident that if you were to take another crack
    > at these same programs in perl you would see similar results.


    I am in complete agreement with that statement.

    > One of the bigger differences might have been language changes over
    > time. If you had written this in python five years ago (assuming the
    > python rewrites are relatively current, otherwise this list gets
    > bigger) you would not have generators, iterators, the logging package,
    > built in sets, decorators, and a host of other changes. Some of these
    > features you may not have used, but for every one you did python would
    > have had more weight.


    Absolutely.

    > Other than that it all boils down to how the algorithm is implemented.
    > Between those three factors you can probably account for most of the
    > differences here.


    s/probably/maybe. The factors you list are certainly big contributors, but
    are they most of it? I don't know, there are good arguments both ways. If
    you removed those factors, would the resulting python be shorter, as long
    as, or longer than the corresponding perl? Would that mean anything about
    code complexity, readability, maintainability, etc? I'm not comfortable
    drawing any conclusions, but asking the questions is good.

    > The real important question is: what has perl done in
    > the last five years to make writing these scripts easier?


    That's another very good question. One I can't answer.

    --
    Edward Elliott
    UC Berkeley School of Law (Boalt Hall)
    complangpython at eddeye dot net
     
    Edward Elliott, May 18, 2006
    #9
  10. Edward Elliott

    Larry Bates Guest

    Sorry, I don't buy this. I can write REALLY short programs that don't handle
    exceptions, don't provide for logging for debugging purposes, don't allow
    for future growth, etc. I find that 60% of my code has nothing to do with
    the actual algorithm or function I'm trying to accomplish. It has more to
    do with handling user's bad input, exceptions, recovering from hardware or
    communications failures, etc. Inexperienced programmers can write some
    pretty short programs that get the job done, but can't handle the real world.

    Also, many years ago I wrote a number of applications in APL. We often
    referred to programs written in APL as "write only code" because going back
    to read what you had written after-the-fact was very hard. You could write
    in one line of APL what takes 1000's of lines of C or even Python and it was
    pretty efficient also (for applications that needed to manipulate vectors or
    matrices).

    I understand what you are trying to say, but I can't support your conclusions
    as presented.

    -Larry Bates
     
    Larry Bates, May 20, 2006
    #10
  11. Larry Bates wrote:

    > Sorry, I don't buy this. I can write REALLY short programs that don't
    > handle exceptions, don't provide for logging for debugging purposes, don't
    > allow for future growth, etc. I find that 60% of my code has nothing to do
    > with the actual algorithm or function I'm trying to accomplish. It has more
    > to do with handling user's bad input, exceptions, recovering from hardware
    > or communications failures, etc.


    Wow, only 60%, I'm surprised it's that low :). When I say the algorithms
    are roughly equivalent, I'm including the amount of input verification and
    error checking that they do. To me, that's part of the algorithm.

    > Inexperienced programmers can write some
    > pretty short programs that get the job done, but can't handle the real
    > world.


    Tell me about it. I've taught intro comp sci, things can get real ugly real
    quick. :)

    > Also, many years ago I wrote a number of applications in APL. We often
    > referred to programs written in APL as "write only code" because going
    > back to read what you had written after-the-fact was very hard. You could
    > write in one line of APL what takes 1000's of lines of C or even Python
    > and it was pretty efficient also (for applications that needed to
    > manipulate vectors or matrices).


    Of course. Comparing line counts between assembly and Prolog is pretty
    useless given the vast discrepancy in their expressive power. Perl and
    Python are roughly comparable in expressiveness, so it doesn't seem
    unreasonable to compare their line counts. It might not tell you much,
    there are undoubtedly better comparisons to make, but I don't think it's
    grossly unfair the way you suggest. I'm all ears if you have another
    metric I can test as easily.

    > I understand what you are trying to say, but I can't support your
    > conclusions as presented.


    What would those be? I tried hard not to draw any conclusions. I just want to
    see how other people's data compares to mine.

    --
    Edward Elliott
    UC Berkeley School of Law (Boalt Hall)
    complangpython at eddeye dot net
     
    Edward Elliott, May 21, 2006
    #11