is there any principle when writing python function

Discussion in 'Python' started by smith jack, Aug 23, 2011.

  1. smith jack

    smith jack Guest

    I have heard that function invocation in Python is expensive, but making
    lots of functions is a good design habit in many other languages, so
    is there any principle for writing Python functions?
    For example, how many lines should a function have?
    smith jack, Aug 23, 2011
    #1

  2. smith jack

    Peter Otten Guest

    smith jack wrote:

    > i have heard that function invocation in python is expensive, but make
    > lots of functions are a good design habit in many other languages, so
    > is there any principle when writing python function?
    > for example, how many lines should form a function?


    Five ;)
    Peter Otten, Aug 23, 2011
    #2

  3. smith jack

    Mel Guest

    smith jack wrote:

    > i have heard that function invocation in python is expensive, but make
    > lots of functions are a good design habit in many other languages, so
    > is there any principle when writing python function?


    It's hard to discuss in the abstract. A function should perform a
    recognizable step in solving the program's problem. If you prepared to
    write your program by describing each of several operations the program
    would have to perform, then you might go on to plan a function for each of
    the described operations. The high-level functions can then be analyzed,
    and will probably lead to functions of their own.

    Test-driven development encourages smaller functions that give you a better
    granularity of testing. Even so, the testable functions should each perform
    one meaningful step of a more general problem.

    > for example, how many lines should form a function?

    Maybe as few as one.

    def increase (x, a):
        return x+a

    is kind of stupid, but a more complicated line

    import numpy as np

    def expand_template (bitwidth, defs):
        '''Turn Run-Length-Encoded list into bits.'''
        return np.array (sum (([bit]*(count*bitwidth) for count, bit in
                               defs), []), np.int8)

    is the epitome of intelligence. I wrote it myself. Even increase might be
    useful:

    def increase (x, a):
        return x + a * application_dependent_quantity

    `increase` has become a meaningful operation in the imaginary application
    we're discussing.
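
    For anyone who wants to try the run-length one-liner without NumPy, here
    is a minimal sketch of the same idea in plain Python. The name
    `expand_rle` and the sample data are made up for illustration, and it
    returns a list rather than an `np.int8` array.

```python
from itertools import chain, repeat


def expand_rle(bitwidth, defs):
    """Turn a run-length-encoded list of (count, bit) pairs into a flat list."""
    # Each (count, bit) pair contributes count * bitwidth copies of bit.
    return list(chain.from_iterable(
        repeat(bit, count * bitwidth) for count, bit in defs))


print(expand_rle(2, [(1, 0), (2, 1)]))  # [0, 0, 1, 1, 1, 1]
```

    It is still a single meaningful step, just spelled differently.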


    For an upper bound, it's harder to say. If you read to the end of a
    function and can't remember how it started, or what it did in between, it's
    too big. If you're reading on your favourite screen, and the end and the
    beginning are more than one page-scroll apart, it might be too big. If it's
    too big, factoring it into sub-steps and making functions of some of those
    sub-steps is the fix.

    Mel.
    Mel, Aug 23, 2011
    #3
  4. smith jack

    Roy Smith Guest

    In article <>,
    smith jack <> wrote:

    > i have heard that function invocation in python is expensive, but make
    > lots of functions are a good design habit in many other languages, so
    > is there any principle when writing python function?
    > for example, how many lines should form a function?


    Enough lines to do what the function needs to do, but no more.

    Seriously, break up your program into functions based on logical
    groupings, and whatever makes your code easiest to understand. When
    you're all done, if your program is too slow, run it under the profiler.
    Use the profiling results to indicate which parts need improvement.

    It's very unlikely that function call overhead will be a significant
    issue. Don't worry about stuff like that unless the profiler shows it's
    a bottleneck. Don't try to guess what's slow. My guesses are almost
    always wrong. Yours will be too.

    If your program runs fast enough as it is, don't even bother with the
    profiler. Be happy that you've got something useful and move on to the
    next thing you've got to do.
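
    As a rough illustration of "run it under the profiler", here is a small
    sketch using the standard library's `cProfile` and `pstats`. The
    functions `parse` and `total` and the input data are invented for the
    example; substitute your own entry point.

```python
import cProfile
import io
import pstats


def parse(line):
    # Hypothetical small helper: split one line into integer fields.
    return [int(field) for field in line.split(",")]


def total(lines):
    # Hypothetical "program": many calls to a tiny function.
    return sum(sum(parse(line)) for line in lines)


lines = ["1,2,3"] * 10_000

profiler = cProfile.Profile()
profiler.enable()
result = total(lines)
profiler.disable()

# Print the five most expensive entries, sorted by cumulative time.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

    For a whole script, `python -m cProfile -s cumulative script.py` gives a
    similar report without touching the code.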
    Roy Smith, Aug 23, 2011
    #4
  5. smith jack

    Roy Smith Guest

    In article <j305uo$pmd$>, Peter Otten <>
    wrote:

    > smith jack wrote:
    >
    > > i have heard that function invocation in python is expensive, but make
    > > lots of functions are a good design habit in many other languages, so
    > > is there any principle when writing python function?
    > > for example, how many lines should form a function?

    >
    > Five ;)


    Five is right out.
    Roy Smith, Aug 23, 2011
    #5
  6. smith jack wrote:
    > i have heard that function invocation in python is expensive, but make
    > lots of functions are a good design habit in many other languages, so
    > is there any principle when writing python function?
    > for example, how many lines should form a function?


    Don't compromise the design and clarity of your code just because you heard
    some rumors about performance. Also, for any performance question, please
    consult a profiler.

    Uli

    --
    Domino Laser GmbH
    Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932
    Ulrich Eckhardt, Aug 23, 2011
    #6
  7. smith jack wrote:

    > i have heard that function invocation in python is expensive,


    It's expensive, but not *that* expensive. Compare:

    [steve@sylar ~]$ python3.2 -m timeit 'x = "abc".upper()'
    1000000 loops, best of 3: 0.31 usec per loop
    [steve@sylar ~]$ python3.2 -m timeit -s 'def f(): return "abc".upper()' 'f()'
    1000000 loops, best of 3: 0.53 usec per loop

    So the function call is nearly as expensive as this (very simple!) sample
    code. But in absolute terms, that's not very expensive at all. If we make
    the code more expensive:

    [steve@sylar ~]$ python3.2 -m timeit '("abc"*1000)[2:995].upper().lower()'
    10000 loops, best of 3: 32.3 usec per loop
    [steve@sylar ~]$ python3.2 -m timeit -s 'def f(): return ("abc"*1000)[2:995].upper().lower()' 'f()'
    10000 loops, best of 3: 33.9 usec per loop

    the function call overhead becomes trivial.

    Cases where function call overhead is significant are rare. Not vanishingly
    rare, but rare enough that you shouldn't worry about them.
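
    The same comparison can be run from inside a script with the `timeit`
    module. This is only a sketch: the loop count of 100,000 is an arbitrary
    choice, and the numbers will differ from machine to machine and from one
    Python version to the next.

```python
import timeit


def f():
    return "abc".upper()


LOOPS = 100_000

# Time the bare expression, then the same expression wrapped in a call.
inline = min(timeit.repeat('"abc".upper()', number=LOOPS, repeat=3))
wrapped = min(timeit.repeat("f()", globals={"f": f}, number=LOOPS, repeat=3))

print(f"inline:  {inline / LOOPS * 1e6:.3f} usec per loop")
print(f"wrapped: {wrapped / LOOPS * 1e6:.3f} usec per loop")
```

    Taking the minimum of several repeats follows the same "best of 3"
    convention as the command-line output above.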


    > but make
    > lots of functions are a good design habit in many other languages, so
    > is there any principle when writing python function?
    > for example, how many lines should form a function?


    About as long as a piece of string.

    A more serious answer: it should be exactly as long as needed to do the
    smallest amount of work that makes up one action, and no longer or shorter.

    If you want to maximise the programmer's efficiency, a single function
    should be short enough to keep the whole thing in your short-term memory at
    once. This means it should consist of no more than seven, plus or minus
    two, chunks of code. A chunk may be a single line, or a few lines that
    together make up a unit, or if the lines are particularly complex, *less*
    than a line.

    http://en.wikipedia.org/wiki/The_Magical_Number_Seven,_Plus_or_Minus_Two
    http://www.codinghorror.com/blog/2006/08/the-magical-number-seven-plus-or-minus-two.html

    (Don't be put off by the use of the term "magical" -- there's nothing
    literally magical about this. It's just a side-effect of the way human
    cognition works.)

    Anything longer than 7±2 chunks, and you will find yourself having to scroll
    backwards and forwards through the function, swapping information into your
    short-term memory, in order to understand it.

    Even 7±2 is probably excessive: I find that I'm most comfortable with
    functions that perform 4±1 chunks of work. An example from one of my
    classes:

    def find(self, prefix):
        """Find the item that matches prefix."""
        prefix = prefix.lower() # Chunk #1
        menu = self._cleaned_menu # Chunk #2
        for i,s in enumerate(menu, 1): # Chunk #3
            if s.lower().startswith(prefix):
                return i
        return None # Chunk #4

    So that's three one-line chunks and one three-line chunk.
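
    To run `find` on its own it needs a class around it. The `Menu` class
    below is a hypothetical stand-in, not the real one, and it assumes that
    "cleaned" simply means surrounding whitespace has been stripped.

```python
class Menu:
    """Hypothetical container; only find() comes from the example above."""

    def __init__(self, items):
        # Assumption: "cleaned" means stripped of surrounding whitespace.
        self._cleaned_menu = [item.strip() for item in items]

    def find(self, prefix):
        """Return the 1-based position of the first item matching prefix."""
        prefix = prefix.lower()
        menu = self._cleaned_menu
        for i, s in enumerate(menu, 1):
            if s.lower().startswith(prefix):
                return i
        return None


menu = Menu(["  Open", "Save ", "Save As", "Quit"])
print(menu.find("sa"))  # 2
print(menu.find("x"))   # None
```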



    --
    Steven
    Steven D'Aprano, Aug 23, 2011
    #7
  8. smith jack

    Seebs Guest

    On 2011-08-23, smith jack <> wrote:
    > i have heard that function invocation in python is expensive, but make
    > lots of functions are a good design habit in many other languages, so
    > is there any principle when writing python function?


    Lots of them. None of them have to do with performance.

    > for example, how many lines should form a function?


    Between zero (which has to be written "pass") and a few hundred. Usually
    closer to the lower end of that range. Occasionally outside it.

    Which is to say: This is the wrong question.

    Let us give you the two laws of software optimization.

    Law #1: Don't do it.

    If you try to optimize stuff, you will waste a ton of time doing things that,
    it turns out, are unimportant.

    Law #2: (Experts only.) Don't do it yet.

    You don't know enough to "optimize" this yet.

    Write something that does what it is supposed to do and which you understand
    clearly. See how it looks. If it looks like it is running well enough,
    STOP. You are done.

    Now, if it is too slow, and you are running it on real data, NOW it is time
    to think about why it is slow. And the solution there is not to read abstract
    theories about your language, but to profile it -- actually time execution and
    find out where the time goes.

    I've been writing code, and making it faster, for some longish period of time.
    I have not yet ever in any language found cause to worry about function call
    overhead.

    -s
    --
    Copyright 2011, all wrongs reversed. Peter Seebach /
    http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
    http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
    I am not speaking for my employer, although they do rent some of my opinions.
    Seebs, Aug 23, 2011
    #8
  9. smith jack

    rantingrick Guest

    On Aug 23, 6:59 am, smith jack <> wrote:
    > i have heard that function invocation in python is expensive, but make
    > lots of functions are a good design habit in many other languages, so
    > is there any principle when writing python function?
    > for example, how many lines should form a function?


    Everyone here who is suggesting that function bodies should be
    confined to ANY length is an idiot. The length of a function's code
    block is inconsequential. Don't worry if it's too small or too big. It's
    not the size that matters, it's the motion of the source's ocean!

    A good function can be one line, or a hundred lines. Always use
    comments to clarify code and NEVER EVER create more functions only for
    the sake of short function bodies, WHY, because all you do is move
    confusion OUT OF the function body and INTO the module/class body.

    """Energy can neither be created nor be destroyed: it can only be
    transformed from one state to another"""

    http://en.wikipedia.org/wiki/Conservation_of_energy
    https://sites.google.com/site/thefutureofpython/
    rantingrick, Aug 23, 2011
    #9
  10. smith jack

    Terry Reedy Guest

    On 8/23/2011 11:22 AM, Steven D'Aprano wrote:

    > Even 7±2 is probably excessive: I find that I'm most comfortable with
    > functions that perform 4±1 chunks of work. An example from one of my
    > classes:
    >
    > def find(self, prefix):
    >     """Find the item that matches prefix."""
    >     prefix = prefix.lower() # Chunk #1
    >     menu = self._cleaned_menu # Chunk #2
    >     for i,s in enumerate(menu, 1): # Chunk #3
    >         if s.lower().startswith(prefix):
    >             return i
    >     return None # Chunk #4
    >
    > So that's three one-line chunks and one three-line chunk.


    In terms of different functions performed (see my previous post), I see
    attribute lookup
    assignment
    enumerate
    sequence unpacking
    for-looping
    if-conditioning
    lower
    startswith
    return
    That is 9, which is enough.

    --
    Terry Jan Reedy
    Terry Reedy, Aug 23, 2011
    #10
  11. smith jack

    rantingrick Guest

    On Aug 23, 1:29 pm, Terry Reedy <> wrote:

    > In terms of different functions performed (see my previous post), I see
    >    attribute lookup
    >    assignment
    >    enumerate
    >    sequence unpacking
    >    for-looping
    >    if-conditioning
    >    lower
    >    startswith
    >    return
    > That is 9,  which is enough.



    attribute lookup -> inspection
    assignment -> ditto
    enumerate -> enumeration
    sequence unpacking -> parallel assignment
    for-looping -> cycling
    if-conditioning -> logic
    lower -> mutation (don't try to argue!)
    startswith -> boolean-logic
    return -> exiting (although all exits require an entrance!)
    omitted: documenting, referencing, -presumptuousness-

    pedantic-ly yours, rr
    ;-)
    rantingrick, Aug 23, 2011
    #11
  12. Terry Reedy wrote:

    > On 8/23/2011 11:22 AM, Steven D'Aprano wrote:
    >
    >> Even 7±2 is probably excessive: I find that I'm most comfortable with
    >> functions that perform 4±1 chunks of work. An example from one of my
    >> classes:
    >>
    >> def find(self, prefix):
    >>     """Find the item that matches prefix."""
    >>     prefix = prefix.lower() # Chunk #1
    >>     menu = self._cleaned_menu # Chunk #2
    >>     for i,s in enumerate(menu, 1): # Chunk #3
    >>         if s.lower().startswith(prefix):
    >>             return i
    >>     return None # Chunk #4
    >>
    >> So that's three one-line chunks and one three-line chunk.

    >
    > In terms of different functions performed (see my previous post), I see
    > attribute lookup
    > assignment
    > enumerate
    > sequence unpacking
    > for-looping
    > if-conditioning
    > lower
    > startswith
    > return
    > That is 9, which is enough.



    I think we have broad agreement, but we're counting different things.
    Analogy: you're counting atoms, I'm grouping atoms into molecules and
    counting them.

    It's a little like phone numbers: it's not an accident that we normally
    group phone numbers into groups of 2-4 digits:

    011 23 4567 8901

    In general, people can more easily memorise four chunks of four digits (give
    or take) than one chunk of 13 digits: 0112345678901.



    --
    Steven
    Steven D'Aprano, Aug 24, 2011
    #12
  13. smith jack

    alex23 Guest

    rantingrick <> wrote:
    > Everyone here who is suggesting that function bodies should be
    > confined to ANY length is an idiot.


    Or, more likely, is the sort of coder who has worked with other coders
    in the past and understands the value of readable code.

    > Don't worry if it's too small or too big. It's
    > not the size that matters, it's the motion of the source's ocean!


    If only you spent as much time actually thinking about what you're
    saying as trying to find 'clever' ways to say it...

    > Always use
    > comments to clarify code and NEVER EVER create more functions only for
    > the sake of short function bodies


    This is quite likely the worst advice you've ever given. I can only
    assume you've never had to refactor the sort of code you're advocating
    here.
    alex23, Aug 24, 2011
    #13
  14. smith jack

    alex23 Guest

    rantingrick <> wrote:
    > https://sites.google.com/site/thefutureofpython/


    "Very soon I will be hashing out a specification for python 4000."

    AHAHAHAHAhahahahahahahAHAHAHAHahahahahaaaaaaa. So rich. Anyone willing
    to bet serious money we won't see this before 4000AD?

    "Heck even our leader seems as a captain too drunk with vanity to
    care; and our members like a ship lost at sea left to sport of every
    troll-ish wind!"

    Quite frankly, you're a condescending, arrogant blow-hard that this
    community would be better off without.

    "We must constantly strive to remove multiplicity from our systems;
    lest it consumes us!"

    s/multiplicity/rantingrick/ and I'm in full agreement.
    alex23, Aug 24, 2011
    #14
  15. smith jack

    Red John Guest

    > "We must constantly strive to remove multiplicity from our systems;
    > lest it consumes us!"
    >
    > s/multiplicity/rantingrick/ and I'm in full agreement.


    QFT
    Red John, Aug 25, 2011
    #15
  16. smith jack

    Guest

    On Aug 23, 7:59 am, smith jack <> wrote:
    > i have heard that function invocation in python is expensive, but make
    > lots of functions are a good design habit in many other languages, so
    > is there any principle when writing python function?
    > for example, how many lines should form a function?


    My suggestion is to think how you would test the function, in order to
    get 100% code coverage. The parts of the function that are difficult
    to test, those are the parts that you want to pull out into their own
    separate function.

    For example, a block of code within a conditional statement, where the
    test condition cannot be passed in, is a prime example of a block of
    code that should be pulled out into a separate function.

    Obviously, there are times where this is not practical - exception
    handling comes to mind - but that should be your rule of thumb. If a
    block of code is hard to test, pull it out into its own function, so
    that it's easier to test.
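
    A small sketch of what I mean, with made-up names (`weekend_surcharge`,
    `price_today`) and a made-up rule. The condition depends on the clock and
    cannot be passed in, so the block under it is pulled out where a test can
    reach it directly.

```python
import datetime


def weekend_surcharge(amount):
    # The extracted block: plain arithmetic, easy to test on any day.
    return amount + 5


def price_today(amount):
    # The condition reads the system clock, so a test cannot choose it.
    if datetime.date.today().weekday() >= 5:  # Saturday or Sunday
        return weekend_surcharge(amount)
    return amount


# The extracted function can be checked regardless of today's date.
print(weekend_surcharge(100))  # 105
```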
    --
    // T.Hsu
    , Aug 26, 2011
    #16
  17. smith jack

    Roy Smith Guest

    In article
    <>,
    wrote:

    > On Aug 23, 7:59 am, smith jack <> wrote:
    > > i have heard that function invocation in python is expensive, but make
    > > lots of functions are a good design habit in many other languages, so
    > > is there any principle when writing python function?
    > > for example, how many lines should form a function?

    >
    > My suggestion is to think how you would test the function, in order to
    > get 100% code coverage.


    I'm not convinced 100% code coverage is an achievable goal for any major
    project. I was once involved in a serious code coverage program. We
    had a large body of code (100's of KLOC of C++) which we were licensing
    to somebody else. The customer was insisting that we do code coverage
    testing and set a standard of something like 80% coverage.

    There was a dedicated team of about 4 people working on this for the
    better part of a year. They never came close to 80%. More like 60%,
    and that was after radical surgery to eliminate dead code and branches
    that couldn't be reached. The hard parts are testing the code that
    deals with unusual error conditions caused by interfaces to the external
    world.

    The problem is, it's just damn hard to simulate all the different kinds
    of errors that can occur. This was network intensive code. Every call
    that touches the network can fail in all sorts of ways that are near
    impossible to simulate. We also had lots of code that tried to deal
    with memory exhaustion. Again, that's hard to simulate.

    I'm not saying code coverage testing is a bad thing. Many of the issues
    I mention above could have been solved with additional abstraction
    layers, but that adds complexity of its own. Certainly, designing a
    body of code to be testable from the get-go is far superior to trying
    to retrofit tests to an existing code base (which is what we were doing).
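
    In Python, at least some network failures can be forced with
    `unittest.mock` when the code is written with testing in mind. The sketch
    below is only that: `fetch_banner` is a hypothetical function, and
    patching one call covers a small fraction of the ways a real network can
    misbehave.

```python
import socket
from unittest import mock


def fetch_banner(host, port, timeout=1.0):
    """Hypothetical network call: first bytes a server sends, or None."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            return conn.recv(64)
    except OSError:
        # Refused connections, timeouts, and unreachable hosts all land here.
        return None


# Force the error path without touching a real network.
with mock.patch("socket.create_connection",
                side_effect=ConnectionRefusedError):
    result = fetch_banner("example.invalid", 25)

print(result)  # None
```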

    > The parts of the function that are difficult
    > to test, those are the parts that you want to pull out into their own
    > separate function.
    >
    > For example, a block of code within a conditional statement, where the
    > test condition cannot be passed in, is a prime example of a block of
    > code that should be pulled out into a separate function.


    Maybe. In general, it's certainly true that a bunch of smallish
    functions, each of which performs exactly one job, is easier to work
    with than a huge ball of spaghetti code. On the other hand, interfaces
    are a common cause of bugs. When you pull a hunk of code out into its
    own function, you create a new interface. Sometimes that adds
    complexity (and bugs) of its own.

    > Obviously, there are times where this is not practical - exception
    > handling comes to mind - but that should be your rule of thumb. If a
    > block of code is hard to test, pull it out into its own function, so
    > that it's easier to test.


    In general, that's good advice. You'll also usually find that code
    which is easy to test is also easy to understand and easy to modify.
    Roy Smith, Aug 26, 2011
    #17
  18. smith jack

    rantingrick Guest

    On Aug 26, 6:15 am, Roy Smith <> wrote:

    > Maybe.  In general, it's certainly true that a bunch of smallish
    > functions, each of which performs exactly one job, is easier to work
    > with than a huge ball of spaghetti code.  


    Obviously you need to google the definition of "spaghetti code". When
    you move code out of one function and create another function you are
    contributing to the "spaghetti-ness" of the code. Think of a plate of
    spaghetti and how the noodles are all intertwined and without order.
    Likewise when you go to one function and have to follow the trail of
    one or more helper functions you are creating a twisting and unordered
    progression of code -- sniff-sniff, do you smell what i smell?

    Furthermore: If you are moving code out of one function to ONLY be
    called by that ONE function then you are a bad programmer and should
    have your editor taken away for six months. You should ONLY create
    more func/methods if those func/methods will be called from two or
    more places in the code. The very essence of func/meths is the fact
    that they are reusable.

    It might still be spaghetti under that definition (of which ALL OOP
    code actually is!) however it will be as elegant as spaghetti can be.

    > On the other hand, interfaces
    > are a common cause of bugs.  When you pull a hunk of code out into its
    > own function, you create a new interface.  Sometimes that adds
    > complexity (and bugs) of its own.


    Which is it? You cannot have it both ways. You're straddling the fence
    here like a dirty politician. Yes, this subject IS black and white!
    rantingrick, Aug 26, 2011
    #18
  19. smith jack

    John Gordon Guest

    In <> rantingrick <> writes:

    > Furthermore: If you are moving code out of one function to ONLY be
    > called by that ONE function then you are a bad programmer and should
    > have your editor taken away for six months. You should ONLY create
    > more func/methods if those func/methods will be called from two or
    > more places in the code. The very essence of func/meths is the fact
    > that they are reusable.


    That's one very important aspect of functions, yes. But there's another:
    abstraction.

    If I'm writing a module that needs to fetch user details from an LDAP
    server, it might be worthwhile to put all of the LDAP-specific code in
    its own method, even if it's only used once. That way the main module
    can just contain a line like this:

    user_info = get_ldap_results("cn=john gordon,ou=people,dc=company,dc=com")

    The main module keeps a high level of abstraction instead of descending
    into dozens or even hundreds of lines of LDAP-specific code.
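
    A toy sketch of the shape I have in mind. Everything here is
    hypothetical: `get_ldap_results` is backed by a dictionary instead of a
    real LDAP server, and the directory contents are invented.

```python
# Stand-in for a real directory; a real version would query an LDAP server.
_FAKE_DIRECTORY = {
    "cn=john gordon,ou=people,dc=company,dc=com": {
        "cn": "john gordon",
        "ou": "people",
    },
}


def get_ldap_results(dn):
    """Hypothetical helper: all LDAP-specific code would live in here."""
    return _FAKE_DIRECTORY.get(dn)


def main():
    # The caller stays at a high level and never sees LDAP details.
    user_info = get_ldap_results("cn=john gordon,ou=people,dc=company,dc=com")
    if user_info is None:
        return "unknown user"
    return f"hello, {user_info['cn']}"


print(main())  # hello, john gordon
```

    The helper is called from one place only, yet swapping the dictionary for
    a real lookup would not change `main` at all.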

    --
    John Gordon A is for Amy, who fell down the stairs
    B is for Basil, assaulted by bears
    -- Edward Gorey, "The Gashlycrumb Tinies"
    John Gordon, Aug 26, 2011
    #19
  20. smith jack

    Tobiah Guest


    > Furthermore: If you are moving code out of one function to ONLY be
    > called by that ONE function then you are a bad programmer and should
    > have your editor taken away for six months. You should ONLY create
    > more func/methods if those func/methods will be called from two or
    > more places in the code. The very essence of func/meths is the fact
    > that they are reusable.


    While I understand and agree with that basic tenet, I think
    that the capitalized 'ONLY' is too strong. I do split out
    code into functions for readability, even when the function
    will only be called from the place from which I split it out.

    I don't think that this adds to the 'spaghetti' factor. It
    can make my life much easier when I go to debug my own code
    years later.

    In python, I use a small function to block out an idea
    as a sort of pseudo code, although it's valid python. Then
    I just define the supporting functions, and the task is done:

    def validate_registrants():

        for dude in get_registrants():
            id = get_id(dude)
            amount_paid = get_amount_paid(dude)
            amount_owed = get_amount_owed(dude)

            if amount_paid != amount_owed:
                flag(dude)

    I get that this cries out for a 'dude' object, but
    I'm just making a point. When I go back to this code,
    I can very quickly see what the overall flow is, and
    jump to the problem area by function name. The above
    block might expand to a couple of hundred lines if I
    didn't split it out like this.
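
    Filled in with trivial supporting functions, the outline runs as-is. The
    sample data and the helper bodies below are invented purely so the sketch
    executes; real ones would hit a database or similar.

```python
# Made-up sample data; each registrant is a plain dict.
REGISTRANTS = [
    {"id": 1, "paid": 50, "owed": 50},
    {"id": 2, "paid": 20, "owed": 50},
]
flagged = []


def get_registrants():
    return REGISTRANTS


def get_id(dude):
    return dude["id"]


def get_amount_paid(dude):
    return dude["paid"]


def get_amount_owed(dude):
    return dude["owed"]


def flag(dude):
    # Record the id of anyone whose payment does not match what they owe.
    flagged.append(get_id(dude))


def validate_registrants():
    for dude in get_registrants():
        amount_paid = get_amount_paid(dude)
        amount_owed = get_amount_owed(dude)

        if amount_paid != amount_owed:
            flag(dude)


validate_registrants()
print(flagged)  # [2]
```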
    Tobiah, Aug 26, 2011
    #20