Re: Web Frameworks Excessive Complexity

Discussion in 'Python' started by Robert Kern, Nov 20, 2012.

  1. Robert Kern

    Robert Kern Guest

    On 20/11/2012 19:46, Andriy Kornatskyy wrote:
    >
    > Robert,
    >
    > Thank you for the comment. I am not trying to relate CC to LOC. Instead, I am pointing to excessive complexity: code that is beyond the recommended threshold and therefore a candidate for refactoring in the respective web frameworks. Those areas are likely to be a source of bugs (e.g. due to low unit-test coverage) and are thus of interest to both end users and framework developers.


    Did you read the paper? I'm not suggesting that you compare CC with LoC; I'm
    suggesting that you don't use CC as a metric at all. The research is fairly
    conclusive that CC doesn't measure what you think it measures. The source of
    bugs is not excessive complexity in a method, just excessive lines of code. LoC
    is much simpler, easier to understand, and easier to correct than CC.

    --
    Robert Kern

    "I have come to believe that the whole world is an enigma, a harmless enigma
    that is made terrible by our own mad attempt to interpret it as though it had
    an underlying truth."
    -- Umberto Eco
     
    Robert Kern, Nov 20, 2012
    #1

  2. On Tue, 20 Nov 2012 20:07:54 +0000, Robert Kern wrote:

    > The source of bugs is not excessive complexity in a method, just
    > excessive lines of code.


    Taken literally, that cannot possibly be the case.

    def method(self, a, b, c):
        do_this(a)
        do_that(b)
        do_something_else(c)


    def method(self, a, b, c):
        do_this(a); do_that(b); do_something_else(c)


    It *simply isn't credible* that version 1 is statistically likely to have
    twice as many bugs as version 2. Over-reliance on LOC is easily gamed,
    especially in semicolon languages.

    Besides, I think you have the cause and effect backwards. I would rather
    say:

    The source of bugs is not lines of code in a method, but excessive
    complexity. It merely happens that counting complexity is hard, counting
    lines of code is easy, and the two are strongly correlated, so why count
    complexity when you can just count lines of code?



    Keep in mind that something like 70-80% of published scientific papers
    are never replicated, or cannot be replicated. Just because one paper
    concludes that LOC alone is a better metric than CC doesn't necessarily
    make it so. But even if we assume that the paper is valid, it is
    important to understand just what it says, and not extrapolate too far.

    The paper makes various assumptions, takes statistical samples, and uses
    models. (Which of course *any* such study must.) I'm not able to comment
    on whether those models and assumptions are valid, but assuming that they
    are, the conclusion of the paper is no stronger than the models and
    assumptions. We should not really conclude that "CC has no more
    predictive power than LOC". The right conclusion is that one specific
    model of cyclomatic complexity, McCabe's CC, has no more predictive power
    than LOC for projects written in C, C++ and Java.

    How does that apply to Python code? Well, it's certainly suggestive, but
    it isn't definitive.

    It's also important to note that the authors point out that in their
    samples of code, they found very high variance and large numbers of
    outliers:

     
    Steven D'Aprano, Nov 21, 2012
    #2

  3. Am 21.11.2012 02:43, schrieb Steven D'Aprano:
    > On Tue, 20 Nov 2012 20:07:54 +0000, Robert Kern wrote:
    >> The source of bugs is not excessive complexity in a method, just
    >> excessive lines of code.

    >
    > Taken literally, that cannot possibly be the case.
    >
    > def method(self, a, b, c):
    >     do_this(a)
    >     do_that(b)
    >     do_something_else(c)
    >
    >
    > def method(self, a, b, c):
    >     do_this(a); do_that(b); do_something_else(c)
    >
    >
    > It *simply isn't credible* that version 1 is statistically likely to have
    > twice as many bugs as version 2. Over-reliance on LOC is easily gamed,
    > especially in semicolon languages.


    "Don't indent deeper than 4 levels!" "OK, not indenting at all, $LANG
    doesn't need it anyway." Sorry, but if code isn't even structured
    halfway reasonably it is unmaintainable, regardless of what CC or LOC say.


    > Besides, I think you have the cause and effect backwards. I would rather
    > say:
    >
    > The source of bugs is not lines of code in a method, but excessive
    > complexity. It merely happens that counting complexity is hard, counting
    > lines of code is easy, and the two are strongly correlated, so why count
    > complexity when you can just count lines of code?


    I agree here, and I'd go even further: Measuring complexity is not just
    hard, it requires a metric that you need to agree on first. With LOC you
    only need to agree on not semicolon-chaining lines and how to treat
    comments and empty lines. With CC, you effectively agree that an if
    statement has a complexity of one (or 2?) while a switch statement has a
    complexity according to its number of cases, even though it is far easier
    to read and comprehend than a similar number of if statements.
    Also, CC doesn't even consider new-fashioned stuff like exceptions that
    introduce yet another control flow path.


    >> LoC is much simpler, easier to understand, and
    >> easier to correct than CC.

    >
    > Well, sure, but do you really think Perl one-liners are the paragon of
    > bug-free code we ought to be aiming for? *wink*


    Hehehe... ;)

    Uli
     
    Ulrich Eckhardt, Nov 21, 2012
    #3
  4. On Wed, Nov 21, 2012 at 10:09 PM, Andriy Kornatskyy wrote:
    > We choose Python for its readability. This is an essential principle of the language, and thousands of people read open source code. Things like PEP8, CC, and LoC all serve one purpose: to draw your attention and teach you to make your code better.


    But too much focus on metrics results in those metrics improving
    without any material benefit to the code. If there's a number that you
    can watch going up or down, nobody's going to want to be the one that
    pushes that number the wrong direction. So what happens when the right
    thing to do happens to conflict with the given metric? And yes, it
    WILL happen, guaranteed. No metric is perfect.

    Counting lines of code teaches you to make dense code. That's neither a
    good thing nor a bad thing; you'll end up with list comprehensions
    rather than short loops, regardless of which is easier to actually
    read.

    Counting complexity by giving a score to every statement encourages
    code like this:

    def bletch(x,y):
        return x + {"foo":y*2,"bar":x*3+y,"quux":math.sin(y)}.get(mode,0)

    instead of:

    def bletch(x,y):
        if mode=="foo": return x+y*2
        if mode=="bar": return x*4+y
        if mode=="quux": return x+math.sin(y)
        return x

    Okay, this is a stupid contrived example, but tell me which of those
    you'd rather work with, and then tell me a plausible metric that would
    agree with you.
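
To put rough numbers on the two versions, here is a minimal sketch that counts decision points with Python's `ast` module. The names `approx_cc` and `DECISION_NODES` are hypothetical, and the counting rule is a simplified convention chosen for illustration, not McCabe's exact definition or the behaviour of any particular tool.

```python
import ast

# Node types treated as decision points. This list is an assumption made for
# illustration; real tools differ on details such as exception handlers.
DECISION_NODES = (ast.If, ast.IfExp, ast.For, ast.While, ast.ExceptHandler)


def approx_cc(source: str) -> int:
    """Return 1 plus the number of decision points found in the source."""
    score = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DECISION_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra branches
            score += len(node.values) - 1
    return score


DICT_VERSION = '''
def bletch(x, y):
    return x + {"foo": y*2, "bar": x*3 + y, "quux": math.sin(y)}.get(mode, 0)
'''

IF_VERSION = '''
def bletch(x, y):
    if mode == "foo": return x + y*2
    if mode == "bar": return x*4 + y
    if mode == "quux": return x + math.sin(y)
    return x
'''

print("dict version:", approx_cc(DICT_VERSION))  # 1
print("if version:  ", approx_cc(IF_VERSION))    # 4
```

Under this convention the dictionary lookup scores 1 and the chain of `if` statements scores 4, although both compute the same values. The source is only parsed, never executed, so `math` and `mode` do not need to exist.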

    ChrisA
     
    Chris Angelico, Nov 21, 2012
    #4
  5. On Wed, 21 Nov 2012 22:21:23 +1100, Chris Angelico wrote:

    > Counting complexity by giving a score to every statement encourages code
    > like this:
    >
    > def bletch(x,y):
    >     return x + {"foo":y*2,"bar":x*3+y,"quux":math.sin(y)}.get(mode,0)
    >
    > instead of:
    >
    > def bletch(x,y):
    >     if mode=="foo": return x+y*2
    >     if mode=="bar": return x*4+y
    >     if mode=="quux": return x+math.sin(y)
    >     return x
    >
    > Okay, this is a stupid contrived example, but tell me which of those
    > you'd rather work with



    Am I being paid by the hour or the line?




    --
    Steven
     
    Steven D'Aprano, Nov 21, 2012
    #5
  6. Robert Kern

    Robert Kern Guest

    On 21/11/2012 01:43, Steven D'Aprano wrote:
    > On Tue, 20 Nov 2012 20:07:54 +0000, Robert Kern wrote:
    >
    >> The source of bugs is not excessive complexity in a method, just
    >> excessive lines of code.

    >
    > Taken literally, that cannot possibly be the case.
    >
    > def method(self, a, b, c):
    >     do_this(a)
    >     do_that(b)
    >     do_something_else(c)
    >
    >
    > def method(self, a, b, c):
    >     do_this(a); do_that(b); do_something_else(c)
    >
    >
    > It *simply isn't credible* that version 1 is statistically likely to have
    > twice as many bugs as version 2. Over-reliance on LOC is easily gamed,
    > especially in semicolon languages.


    Logical LoC (executable LoC, number of statements, etc.) is a better measure
    than Physical LoC, I agree. That's not the same thing as cyclomatic complexity,
    though. Also, the relationship between LoC (of either type) and bugs is not
    linear (at least not in the small-LoC regime), so you are certainly correct that
    it isn't credible that version 1 is likely to have twice as many bugs as version
    2. No one is saying that it is.
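
The distinction between physical and logical LoC can be sketched in a few lines of Python. The helper names `physical_loc` and `logical_loc` are hypothetical, and "count every statement node in the AST" is just one possible definition of logical LoC, assumed here for illustration.

```python
import ast


def physical_loc(source: str) -> int:
    """Count lines that are neither blank nor comment-only."""
    return sum(
        1
        for line in source.splitlines()
        if line.strip() and not line.strip().startswith("#")
    )


def logical_loc(source: str) -> int:
    """Count statement nodes in the parsed source (one convention among several)."""
    return sum(isinstance(node, ast.stmt) for node in ast.walk(ast.parse(source)))


VERSION_1 = """\
def method(self, a, b, c):
    do_this(a)
    do_that(b)
    do_something_else(c)
"""

VERSION_2 = """\
def method(self, a, b, c):
    do_this(a); do_that(b); do_something_else(c)
"""

for name, src in (("version 1", VERSION_1), ("version 2", VERSION_2)):
    print(name, "physical:", physical_loc(src), "logical:", logical_loc(src))
# version 1 physical: 4 logical: 4
# version 2 physical: 2 logical: 4
```

Semicolon chaining halves the physical count but leaves the statement count unchanged, which is why the logical measure is harder to game.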

    > Besides, I think you have the cause and effect backwards. I would rather
    > say:
    >
    > The source of bugs is not lines of code in a method, but excessive
    > complexity. It merely happens that counting complexity is hard, counting
    > lines of code is easy, and the two are strongly correlated, so why count
    > complexity when you can just count lines of code?


    No, that is not the takeaway of the research. More code correlates with more
    bugs. More cyclomatic complexity also correlates with more bugs. You want to
    find out what causes bugs. What the research shows is that cyclomatic complexity
    is so correlated with LoC that it is going to be very difficult, or impossible,
    to establish a causal relationship between cyclomatic complexity and bugs. The
    previous research that just correlated cyclomatic complexity to bugs without
    controlling for LoC does not establish the causal relationship.
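
What "controlling for LoC" means can be sketched with a first-order partial correlation. The numbers below are toy values constructed for this illustration so that CC and bug counts are linked only through LoC; they are not data from the paper, and `pearson` and `partial_corr` are hypothetical helper names.

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(xs, ys, zs):
    """Correlation of xs and ys after removing the linear effect of zs."""
    rxy, rxz, ryz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Toy values (made up): both cc and bugs are loc/10 plus unrelated noise.
loc = [10, 20, 30, 40, 50, 60]
cc = [2, 1, 3, 4, 4, 7]
bugs = [2, 0, 4, 3, 7, 5]

raw = pearson(cc, bugs)                   # about 0.69
controlled = partial_corr(cc, bugs, loc)  # essentially zero for this toy data

print(f"CC vs bugs, raw: {raw:.2f}")
print(f"CC vs bugs, controlling for LoC: {abs(controlled):.2f}")
```

In this constructed case a sizeable raw correlation between CC and bugs vanishes once LoC is held fixed. With real data the two metrics are so strongly correlated that separating their effects is far less clean, which is the difficulty described above.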

    > Keep in mind that something like 70-80% of published scientific papers
    > are never replicated, or cannot be replicated. Just because one paper
    > concludes that LOC alone is a better metric than CC doesn't necessarily
    > make it so. But even if we assume that the paper is valid, it is
    > important to understand just what it says, and not extrapolate too far.


    This paper is actually a replication. It is notable for how comprehensive it is.

    > The paper makes various assumptions, takes statistical samples, and uses
    > models. (Which of course *any* such study must.) I'm not able to comment
    > on whether those models and assumptions are valid, but assuming that they
    > are, the conclusion of the paper is no stronger than the models and
    > assumptions. We should not really conclude that "CC has no more
    > predictive power than LOC". The right conclusion is that one specific
    > model of cyclomatic complexity, McCabe's CC, has no more predictive power
    > than LOC for projects written in C, C++ and Java.
    >
    > How does that apply to Python code? Well, it's certainly suggestive, but
    > it isn't definitive.


    More so than the evidence that CC is a worthwhile measure, for Python or any
    language.

    > It's also important to note that the authors point out that in their
    > samples of code, they found very high variance and large numbers of
    > outliers:
    >
    >
     
    Robert Kern, Nov 21, 2012
    #6
  7. On Wed, Nov 21, 2012 at 10:43 PM, Steven D'Aprano wrote:
    > On Wed, 21 Nov 2012 22:21:23 +1100, Chris Angelico wrote:
    >
    >> Counting complexity by giving a score to every statement encourages code
    >> like this:
    >>
    >> def bletch(x,y):
    >>     return x + {"foo":y*2,"bar":x*3+y,"quux":math.sin(y)}.get(mode,0)
    >>
    >> instead of:
    >>
    >> def bletch(x,y):
    >>     if mode=="foo": return x+y*2
    >>     if mode=="bar": return x*4+y
    >>     if mode=="quux": return x+math.sin(y)
    >>     return x
    >>
    >> Okay, this is a stupid contrived example, but tell me which of those
    >> you'd rather work with

    >
    >
    > Am I being paid by the hour or the line?


    You're on a salary, but management specified some kind of code metrics
    as a means of recognizing which of their programmers are more
    productive, and thus who gets promoted.

    Oh, I'm *so* glad I work in a small company. We've only had one
    programmer that we "let go" (and actually, it was literally letting
    him go - he said he was no good, hoping that we'd beg him to stay, and
    we simply didn't beg him to stay), and the metric of code quality was
    simply that both my boss and I looked at his code and said that it
    wasn't good enough. Much simpler. (Though my boss and I have differing
    views on how many lines of code some things should be. We end up
    having some rather amusing debates about trivial things like line
    breaks.)

    ChrisA
     
    Chris Angelico, Nov 21, 2012
    #7
  8. Modulok

    Modulok Guest

    > On Wed, Nov 21, 2012 at 10:43 PM, Steven D'Aprano wrote:
    >> On Wed, 21 Nov 2012 22:21:23 +1100, Chris Angelico wrote:
    >>
    >>> Counting complexity by giving a score to every statement encourages code
    >>> like this:
    >>>
    >>> def bletch(x,y):
    >>>     return x + {"foo":y*2,"bar":x*3+y,"quux":math.sin(y)}.get(mode,0)
    >>>
    >>> instead of:
    >>>
    >>> def bletch(x,y):
    >>>     if mode=="foo": return x+y*2
    >>>     if mode=="bar": return x*4+y
    >>>     if mode=="quux": return x+math.sin(y)
    >>>     return x
    >>>
    >>> Okay, this is a stupid contrived example, but tell me which of those
    >>> you'd rather work with

    >>
    >>


    > Oh, I'm *so* glad I work in a small company.


    Agreed. Do we rate a contractor's quality of workmanship and efficiency by the
    number of nails he drives?

    Of course not. That would be ridiculous.

    A better metric of code quality and complexity would be to borrow from science
    and mathematics, i.e. a peer review or audit by others working on the project
    or in the same field of study. Unfortunately this isn't cheap or easily
    computed and doesn't translate nicely to a bar graph.

    Such is reality.
    -Modulok-
     
    Modulok, Nov 22, 2012
    #8
