is there any principle when writing python function

smith jack

i have heard that function invocation in python is expensive, but make
lots of functions are a good design habit in many other languages, so
is there any principle when writing python function?
for example, how many lines should form a function?
 
Peter Otten

smith said:
i have heard that function invocation in python is expensive, but make
lots of functions are a good design habit in many other languages, so
is there any principle when writing python function?
for example, how many lines should form a function?

Five ;)
 
Mel

smith said:
i have heard that function invocation in python is expensive, but make
lots of functions are a good design habit in many other languages, so
is there any principle when writing python function?

It's hard to discuss in the abstract. A function should perform a
recognizable step in solving the program's problem. If you prepared to
write your program by describing each of several operations the program
would have to perform, then you might go on to plan a function for each of
the described operations. The high-level functions can then be analyzed,
and will probably lead to functions of their own.

Test-driven development encourages smaller functions that give you a better
granularity of testing. Even so, the testable functions should each perform
one meaningful step of a more general problem.

for example, how many lines should form a function?

Maybe as few as one.

def increase(x, a):
    return x + a

is kind of stupid, but a more complicated line

import numpy as np

def expand_template(bitwidth, defs):
    '''Turn Run-Length-Encoded list into bits.'''
    return np.array(sum(([bit] * (count * bitwidth) for count, bit in defs), []),
                    np.int8)

is the epitome of intelligence. I wrote it myself. Even increase might be
useful:

def increase(x, a):
    return x + a * application_dependent_quantity

`increase` has become a meaningful operation in the imaginary application
we're discussing.
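
For readers without NumPy, here is a minimal pure-Python sketch of the same run-length expansion as expand_template above. The name expand_template_plain and the sample input in the comment are assumptions made for illustration; they are not from the original post.

```python
from itertools import chain, repeat


def expand_template_plain(bitwidth, defs):
    """Expand (count, bit) pairs into a flat list of bits.

    Each run of `count` bits is stretched by a factor of `bitwidth`.
    """
    return list(
        chain.from_iterable(repeat(bit, count * bitwidth) for count, bit in defs)
    )


if __name__ == "__main__":
    # One 1-bit followed by two 0-bits, each stretched to width 2.
    print(expand_template_plain(2, [(1, 1), (2, 0)]))
```

It is still a single meaningful step, just spelled out with itertools instead of sum() over lists.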


For an upper bound, it's harder to say. If you read to the end of a
function and can't remember how it started, or what it did in between, it's
too big. If you're reading on your favourite screen, and the end and the
beginning are more than one page-scroll apart, it might be too big. If it's
too big, factoring it into sub-steps and making functions of some of those
sub-steps is the fix.

Mel.
 
Roy Smith

smith jack said:
i have heard that function invocation in python is expensive, but make
lots of functions are a good design habit in many other languages, so
is there any principle when writing python function?
for example, how many lines should form a function?

Enough lines to do what the function needs to do, but no more.

Seriously, break up your program into functions based on logical
groupings, and whatever makes your code easiest to understand. When
you're all done, if your program is too slow, run it under the profiler.
Use the profiling results to indicate which parts need improvement.

It's very unlikely that function call overhead will be a significant
issue. Don't worry about stuff like that unless the profiler shows it's
a bottleneck. Don't try to guess what's slow. My guesses are almost
always wrong. Yours will be too.

If your program runs fast enough as it is, don't even bother with the
profiler. Be happy that you've got something useful and move on to the
next thing you've got to do.
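
As a rough illustration of the "run it under the profiler" step, the sketch below uses the standard-library cProfile and pstats modules. The functions parse_line and summarize are hypothetical stand-ins for a real program, invented for this example.

```python
import cProfile
import io
import pstats


def parse_line(line):
    """Hypothetical helper: turn 'a,b,c' into a list of ints."""
    return [int(field) for field in line.split(",")]


def summarize(lines):
    """Hypothetical top-level step: sum every field of every line."""
    return sum(sum(parse_line(line)) for line in lines)


def profile_summarize(lines):
    """Run summarize() under the profiler; return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(summarize, lines)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time and show only the five most expensive entries.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


if __name__ == "__main__":
    total, report = profile_summarize(["1,2,3"] * 10000)
    print(total)
    print(report)
```

For a whole script, `python -m cProfile -s cumulative script.py` gives a similar report without touching the code.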
 
Roy Smith

i have heard that function invocation in python is expensive, but make
lots of functions are a good design habit in many other languages, so
is there any principle when writing python function?
for example, how many lines should form a function?

Five ;)

Five is right out.
 
Ulrich Eckhardt

smith said:
i have heard that function invocation in python is expensive, but make
lots of functions are a good design habit in many other languages, so
is there any principle when writing python function?
for example, how many lines should form a function?

Don't compromise the design and clarity of your code just because you heard
some rumors about performance. Also, for any performance question, please
consult a profiler.

Uli
 
Steven D'Aprano

smith said:
i have heard that function invocation in python is expensive,

It's expensive, but not *that* expensive. Compare:

[steve@sylar ~]$ python3.2 -m timeit 'x = "abc".upper()'
1000000 loops, best of 3: 0.31 usec per loop
[steve@sylar ~]$ python3.2 -m timeit -s 'def f():
    return "abc".upper()' 'f()'
1000000 loops, best of 3: 0.53 usec per loop

So the function call is nearly as expensive as this (very simple!) sample
code. But in absolute terms, that's not very expensive at all. If we make
the code more expensive:

[steve@sylar ~]$ python3.2 -m timeit '("abc"*1000)[2:995].upper().lower()'
10000 loops, best of 3: 32.3 usec per loop
[steve@sylar ~]$ python3.2 -m timeit -s 'def f(): return ("abc"*1000)[2:995].upper().lower()' 'f()'
10000 loops, best of 3: 33.9 usec per loop

the function call overhead becomes trivial.

Cases where function call overhead is significant are rare. Not vanishingly
rare, but rare enough that you shouldn't worry about them.
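
The same comparison can be made from inside Python with the timeit module. This is only a sketch: the helper names upper_via_function and measure are made up here, and the numbers will differ from machine to machine, so no expected output is shown.

```python
import timeit


def upper_via_function():
    """Same work as the inline statement, wrapped in a function call."""
    return "abc".upper()


def measure(number=200_000, repeat=5):
    """Return best per-loop times in seconds: (inline, wrapped)."""
    inline = min(timeit.repeat('"abc".upper()', number=number, repeat=repeat))
    # Passing a callable makes timeit call it once per loop iteration,
    # so this figure includes the function call overhead.
    wrapped = min(timeit.repeat(upper_via_function, number=number, repeat=repeat))
    return inline / number, wrapped / number


if __name__ == "__main__":
    inline, wrapped = measure()
    print(f"inline:  {inline * 1e6:.3f} usec per loop")
    print(f"wrapped: {wrapped * 1e6:.3f} usec per loop")
```

Taking the minimum of several repeats follows the usual timeit advice: the higher figures mostly reflect other activity on the machine, not the code under test.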

but make
lots of functions are a good design habit in many other languages, so
is there any principle when writing python function?
for example, how many lines should form a function?

About as long as a piece of string.

A more serious answer: it should be exactly as long as needed to do the
smallest amount of work that makes up one action, and no longer or shorter.

If you want to maximise the programmer's efficiency, a single function
should be short enough to keep the whole thing in your short-term memory at
once. This means it should consist of no more than seven, plus or minus
two, chunks of code. A chunk may be a single line, or a few lines that
together make up a unit, or if the lines are particularly complex, *less*
than a line.

http://en.wikipedia.org/wiki/The_Magical_Number_Seven,_Plus_or_Minus_Two
http://www.codinghorror.com/blog/2006/08/the-magical-number-seven-plus-or-minus-two.html

(Don't be put off by the use of the term "magical" -- there's nothing
literally magical about this. It's just a side-effect of the way human
cognition works.)

Anything longer than 7±2 chunks, and you will find yourself having to scroll
backwards and forwards through the function, swapping information into your
short-term memory, in order to understand it.

Even 7±2 is probably excessive: I find that I'm most comfortable with
functions that perform 4±1 chunks of work. An example from one of my
classes:

def find(self, prefix):
    """Find the item that matches prefix."""
    prefix = prefix.lower()  # Chunk #1
    menu = self._cleaned_menu  # Chunk #2
    for i, s in enumerate(menu, 1):  # Chunk #3
        if s.lower().startswith(prefix):
            return i
    return None  # Chunk #4

So that's three one-line chunks and one three-line chunk.
 
Seebs

i have heard that function invocation in python is expensive, but make
lots of functions are a good design habit in many other languages, so
is there any principle when writing python function?

Lots of them. None of them have to do with performance.

for example, how many lines should form a function?

Between zero (which has to be written "pass") and a few hundred. Usually
closer to the lower end of that range. Occasionally outside it.

Which is to say: This is the wrong question.

Let us give you the two laws of software optimization.

Law #1: Don't do it.

If you try to optimize stuff, you will waste a ton of time doing things that,
it turns out, are unimportant.

Law #2: (Experts only.) Don't do it yet.

You don't know enough to "optimize" this yet.

Write something that does what it is supposed to do and which you understand
clearly. See how it looks. If it looks like it is running well enough,
STOP. You are done.

Now, if it is too slow, and you are running it on real data, NOW it is time
to think about why it is slow. And the solution there is not to read abstract
theories about your language, but to profile it -- actually time execution and
find out where the time goes.

I've been writing code, and making it faster, for some longish period of time.
I have not yet ever in any language found cause to worry about function call
overhead.

-s
 
rantingrick

i have heard that function invocation in python is expensive, but make
lots of functions are a good design habit in many other languages, so
is there any principle when writing python function?
for example, how many lines should form a function?

Everyone here who is suggesting that function bodies should be
confined to ANY length is an idiot. The length of a function's code
block is inconsequential. Don't worry if it's too small or too big. It's
not the size that matters, it's the motion of the source's ocean!

A good function can be one line, or a hundred lines. Always use
comments to clarify code and NEVER EVER create more functions only for
the sake of short function bodies, WHY, because all you do is move
confusion OUT OF the function body and INTO the module/class body.

"""Energy can neither be created nor be destroyed: it can only be
transformed from one state to another"""

http://en.wikipedia.org/wiki/Conservation_of_energy
https://sites.google.com/site/thefutureofpython/
 
Terry Reedy

Even 7±2 is probably excessive: I find that I'm most comfortable with
functions that perform 4±1 chunks of work. An example from one of my
classes:

def find(self, prefix):
    """Find the item that matches prefix."""
    prefix = prefix.lower()  # Chunk #1
    menu = self._cleaned_menu  # Chunk #2
    for i, s in enumerate(menu, 1):  # Chunk #3
        if s.lower().startswith(prefix):
            return i
    return None  # Chunk #4

So that's three one-line chunks and one three-line chunk.

In terms of different functions performed (see my previous post), I see
attribute lookup
assignment
enumerate
sequence unpacking
for-looping
if-conditioning
lower
startswith
return
That is 9, which is enough.
 
rantingrick

In terms of different functions performed (see my previous post), I see
   attribute lookup
   assignment
   enumerate
   sequence unpacking
   for-looping
   if-conditioning
   lower
   startswith
   return
That is 9,  which is enough.


attribute lookup -> inspection
assignment -> ditto
enumerate -> enumeration
sequence unpacking -> parallel assignment
for-looping -> cycling
if-conditioning -> logic
lower -> mutation (don't try to argue!)
startswith -> boolean-logic
return -> exiting (although all exits require an entrance!)
omitted: documenting, referencing, -presumptuousness-

pedantic-ly yours, rr
;-)
 
Steven D'Aprano

Terry said:
In terms of different functions performed (see my previous post), I see
attribute lookup
assignment
enumerate
sequence unpacking
for-looping
if-conditioning
lower
startswith
return
That is 9, which is enough.


I think we have broad agreement, but we're counting different things.
Analogy: you're counting atoms, I'm grouping atoms into molecules and
counting them.

It's a little like phone numbers: it's not an accident that we normally
group phone numbers into groups of 2-4 digits:

011 23 4567 8901

In general, people can more easily memorise four chunks of four digits (give
or take) than one chunk of 13 digits: 0112345678901.
 
alex23

rantingrick said:
Everyone here who is suggesting that function bodies should be
confined to ANY length is an idiot.

Or, more likely, is the sort of coder who has worked with other coders
in the past and understands the value of readable code.

Don't worry if it's too small or too big. It's
not the size that matters, it's the motion of the source's ocean!

If only you spent as much time actually thinking about what you're
saying as trying to find 'clever' ways to say it...

Always use
comments to clarify code and NEVER EVER create more functions only for
the sake of short function bodies

This is quite likely the worst advice you've ever given. I can only
assume you've never had to refactor the sort of code you're advocating
here.
 
alex23

rantingrick said:

"Very soon I will be hashing out a specification for python 4000."

AHAHAHAHAhahahahahahahAHAHAHAHahahahahaaaaaaa. So rich. Anyone willing
to bet serious money we won't see this before 4000AD?

"Heck even our leader seems as a captain too drunk with vanity to
care; and our members like a ship lost at sea left to sport of every
troll-ish wind!"

Quite frankly, you're a condescending, arrogant blow-hard that this
community would be better off without.

"We must constantly strive to remove multiplicity from our systems;
lest it consumes us!"

s/multiplicity/rantingrick/ and I'm in full agreement.
 
ting

i have heard that function invocation in python is expensive, but make
lots of functions are a good design habit in many other languages, so
is there any principle when writing python function?
for example, how many lines should form a function?

My suggestion is to think how you would test the function, in order to
get 100% code coverage. The parts of the function that are difficult
to test, those are the parts that you want to pull out into their own
separate function.

For example, a block of code within a conditional statement, where the
test condition cannot be passed in, is a prime example of a block of
code that should be pulled out into a separate function.

Obviously, there are times where this is not practical - exception
handling comes to mind - but that should be your rule of thumb. If a
block of code is hard to test, pull it out into its own function, so
that it's easier to test.
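
A minimal sketch of that rule of thumb, assuming a condition that depends on the file system and so cannot be passed in by a test. The names report and describe_large_file are invented for this example.

```python
import os


def describe_large_file(path, size):
    """Extracted block: pure logic, so a test can call it directly."""
    return f"{path} is large ({size // 1024} KiB)"


def report(path, threshold=1024 * 1024):
    """The branch taken here depends on the real size of a file on disk."""
    size = os.path.getsize(path)
    if size > threshold:
        # Before the refactoring this logic lived inline, reachable
        # only by creating a file bigger than the threshold.
        return describe_large_file(path, size)
    return f"{path} is small"
```

A test can now cover the extracted branch with describe_large_file("data.bin", 2048), without having to create a large file first.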
 
Roy Smith

My suggestion is to think how you would test the function, in order to
get 100% code coverage.

I'm not convinced 100% code coverage is an achievable goal for any major
project. I was once involved in a serious code coverage program. We
had a large body of code (100's of KLOC of C++) which we were licensing
to somebody else. The customer was insisting that we do code coverage
testing and set a standard of something like 80% coverage.

There was a dedicated team of about 4 people working on this for the
better part of a year. They never came close to 80%. More like 60%,
and that was after radical surgery to eliminate dead code and branches
that couldn't be reached. The hard parts are testing the code that
deals with unusual error conditions caused by interfaces to the external
world.

The problem is, it's just damn hard to simulate all the different kinds
of errors that can occur. This was network intensive code. Every call
that touches the network can fail in all sorts of ways that are near
impossible to simulate. We also had lots of code that tried to deal
with memory exhaustion. Again, that's hard to simulate.

I'm not saying code coverage testing is a bad thing. Many of the issues
I mention above could have been solved with additional abstraction
layers, but that adds complexity of its own. Certainly, designing a
body of code to be testable from the get-go is far superior to trying
to retrofit tests to an existing code base (which is what we were doing).

The parts of the function that are difficult
to test, those are the parts that you want to pull out into their own
separate function.

For example, a block of code within a conditional statement, where the
test condition cannot be passed in, is a prime example of a block of
code that should be pulled out into a separate function.

Maybe. In general, it's certainly true that a bunch of smallish
functions, each of which performs exactly one job, is easier to work
with than a huge ball of spaghetti code. On the other hand, interfaces
are a common cause of bugs. When you pull a hunk of code out into its
own function, you create a new interface. Sometimes that adds
complexity (and bugs) of its own.

Obviously, there are times where this is not practical - exception
handling comes to mind - but that should be your rule of thumb. If a
block of code is hard to test, pull it out into its own function, so
that it's easier to test.

In general, that's good advice. You'll also usually find that code
which is easy to test is also easy to understand and easy to modify.
 
rantingrick

Maybe.  In general, it's certainly true that a bunch of smallish
functions, each of which performs exactly one job, is easier to work
with than a huge ball of spaghetti code.  

Obviously you need to google the definition of "spaghetti code". When
you move code out of one function and create another function you are
contributing to the "spaghetti-ness" of the code. Think of a plate of
spaghetti and how the noodles are all intertwined and without order.
Likewise when you go to one function and have to follow the trail of
one or more helper functions you are creating a twisting and unordered
progression of code -- sniff-sniff, do you smell what i smell?

Furthermore: If you are moving code out of one function to ONLY be
called by that ONE function then you are a bad programmer and should
have your editor taken away for six months. You should ONLY create
more func/methods if those func/methods will be called from two or
more places in the code. The very essence of func/meths is the fact
that they are reusable.

It might still be spaghetti under that definition (of which ALL OOP
code actually is!) however it will be as elegant as spaghetti can be.

On the other hand, interfaces
are a common cause of bugs.  When you pull a hunk of code out into its
own function, you create a new interface.  Sometimes that adds
complexity (and bugs) of its own.

Which is it? You cannot have it both ways. You're straddling the fence
here like a dirty politician. Yes, this subject IS black and white!
 
John Gordon

rantingrick said:
Furthermore: If you are moving code out of one function to ONLY be
called by that ONE function then you are a bad programmer and should
have your editor taken away for six months. You should ONLY create
more func/methods if those func/methods will be called from two or
more places in the code. The very essence of func/meths is the fact
that they are reusable.

That's one very important aspect of functions, yes. But there's another:
abstraction.

If I'm writing a module that needs to fetch user details from an LDAP
server, it might be worthwhile to put all of the LDAP-specific code in
its own method, even if it's only used once. That way the main module
can just contain a line like this:

user_info = get_ldap_results("cn=john gordon,ou=people,dc=company,dc=com")

The main module keeps a high level of abstraction instead of descending
into dozens or even hundreds of lines of LDAP-specific code.
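
To make the shape of this concrete without tying it to any particular LDAP library, here is a sketch in which an in-memory dictionary stands in for the directory server. FAKE_DIRECTORY, its contents, and main are assumptions for illustration only; a real get_ldap_results would hold the connection, search, and result-unpacking code.

```python
# Stand-in for a real directory server; the entry below is invented.
FAKE_DIRECTORY = {
    "cn=john gordon,ou=people,dc=company,dc=com": {"mail": "jgordon@example.com"},
}


def get_ldap_results(dn):
    """Hide the lookup details behind a single call.

    A real version would connect, bind, search and unpack the reply here;
    this sketch only consults the in-memory stand-in.
    """
    return FAKE_DIRECTORY.get(dn.lower(), {})


def main():
    # The caller stays at a high level of abstraction.
    user_info = get_ldap_results("cn=john gordon,ou=people,dc=company,dc=com")
    return user_info.get("mail", "unknown")


if __name__ == "__main__":
    print(main())
```

Even though get_ldap_results is called from only one place, the caller never needs to change if the lookup mechanism does.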
 
Tobiah

Furthermore: If you are moving code out of one function to ONLY be
called by that ONE function then you are a bad programmer and should
have your editor taken away for six months. You should ONLY create
more func/methods if those func/methods will be called from two or
more places in the code. The very essence of func/meths is the fact
that they are reusable.

While I understand and agree with that basic tenet, I think
that the capitalized 'ONLY' is too strong. I do split out
code into functions for readability, even when the function
will only be called from the place from which I split it out.

I don't think that this adds to the 'spaghetti' factor. It
can make my life much easier when I go to debug my own code
years later.

In python, I use a small function to block out an idea
as a sort of pseudo code, although it's valid python. Then
I just define the supporting functions, and the task is done:

def validate_registrants():
    for dude in get_registrants():
        id = get_id(dude)
        amount_paid = get_amount_paid(dude)
        amount_owed = get_amount_owed(dude)

        if amount_paid != amount_owed:
            flag(dude)

I get that this cries out for a 'dude' object, but
I'm just making a point. When I go back to this code,
I can very quickly see what the overall flow is, and
jump to the problem area by function name. The above
block might expand to a couple of hundred lines if I
didn't split it out like this.
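
One way the supporting functions might then be filled in, as a sketch only: the dict-based registrant records, the REGISTRANTS sample data, and the FLAGGED list are assumptions made up for this example.

```python
# Hypothetical data: each registrant is a plain dict.
REGISTRANTS = [
    {"id": 1, "paid": 50, "owed": 50},
    {"id": 2, "paid": 20, "owed": 50},
]
FLAGGED = []


def get_registrants():
    return REGISTRANTS


def get_id(dude):
    return dude["id"]


def get_amount_paid(dude):
    return dude["paid"]


def get_amount_owed(dude):
    return dude["owed"]


def flag(dude):
    """Record the id of a registrant whose payment does not match."""
    FLAGGED.append(get_id(dude))


def validate_registrants():
    """Top-level flow: reads like the pseudo code it started as."""
    for dude in get_registrants():
        if get_amount_paid(dude) != get_amount_owed(dude):
            flag(dude)


if __name__ == "__main__":
    validate_registrants()
    print(FLAGGED)
```

Each helper is trivial here, but in a real program any of them could grow to dozens of lines without disturbing the overall flow.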
 
