Usage of main()

Manuel Graune · Sep 4, 2009

Hello everyone,

the standard structure of a Python program that is taught in all of
the books on Python I have read so far is simply something like:

#!/usr/bin/python
print "Hello, world!"
^D

While reading about structuring a larger code-base, unit-testing, etc
I stumbled on the idiom

#!/usr/bin/python
def main():
    print "Hello, world"
if __name__ == "__main__":
    main()
^D

While experimenting with this I found that the second version is in most
cases *a lot* faster than the simple approach. (I tried this on both
Linux and Windows.) I found this even in cases where the code consists
simply of something like

j=0
for i in xrange(1000000):
    j+=i
print j

How come the main()-idiom is not "the standard way" of writing a
python-program (like e.g. in C)?
And in addition: Can someone please explain why the first version
is so much slower?

Regards,

Manuel

Sean DiZazzo · Sep 4, 2009

I'm trying to come up with an answer for you, but I can't...

The if __name__ == "__main__": idiom *is* the standard way to write
Python programs, but it's not there to speed up programs. It's there
so that your program can behave differently depending on whether it is
run as a script from the command line or imported.
When you import a module, __name__ is equal to the name of the
module, but when you execute it, its __name__ is "__main__". If you
are importing a library, you generally don't want it to fire off a
bunch of processing until you call the needed functions/methods. I
also use it as a testing ground, and a sort of loose documentation for
my modules. I put stuff in there that shows and tests a general use
of the module, but when I actually import and use it, I definitely
don't want that code to run!
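A minimal sketch of the behaviour described above. The function name greet is made up for illustration, and the code is written so that it runs on Python 3 as well as on the Python 2 used elsewhere in this thread:

```python
# Hypothetical module; the name greet() is an illustration only.

def greet(name):
    """Return a greeting. Importing this module never prints anything."""
    return "Hello, %s!" % name

def main():
    # Demo / loose documentation: runs only when executed as a script.
    print(greet("world"))

# __name__ is "__main__" when the file is run directly, and the
# module's own name when it is imported.
if __name__ == "__main__":
    main()
```

Running the file directly calls main(); importing it only defines greet() and main() and leaves it to the caller to decide what to run.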

What are you using to test the scripts? I could be completely wrong,
but I find it hard to believe that the second version is much (if any)
faster than the first. Then again, I don't know much about the
internals...

~Sean

r · Sep 4, 2009

Manuel Graune said:
How come the main()-idiom is not "the standard way" of writing a
python-program (like e.g. in C)?

Why use a nested function when you're already *in* main? That's like
declaring variables when your compiler could just use some simple
logic...

'2.7' -> string because wrapped in quotes
2 -> integer because is in set{0123456789} && no quotes!
1.23 -> float because all digits && has a period

....or using "{" and "}" instead of INDENT and DEDENT.

Python removes the unnecessary cruft and redundancy that is C, and
puts the burden on your machine!

http://www.python.org/dev/peps/pep-0020/

Manuel Graune said:
And in addition: Can someone please explain why the first version
is so much slower?

Leave that one for someone else...

Simon Brunning · Sep 4, 2009

2009/9/4 Manuel Graune said:
How come the main()-idiom is not "the standard way" of writing a
python-program (like e.g. in C)?

Speaking for myself, it *is* the standard way to structure a script. I
find it more readable, since I can put my main function at the very
top where it's visible, with the classes and functions it makes use of
following in some logical sequence.

I suspect that this is the case for many real-world scripts. Perhaps
it's mainly in books and demos where the extra stuff is left out so
the reader can focus on what the writer is demonstrating?

Manuel Graune said:
And in addition: Can someone please explain why the first version
is so much slower?

Access to globals is slower than access to a function's locals.
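
One rough way to check this claim is to run the same loop once as module-level code and once inside a function. The helper run_as_script below is a hypothetical name. Plain timeit.timeit(stmt) is avoided on purpose, because timeit wraps the statement in a function of its own, which would turn the "global" loop into a local one. The size of the gap depends on the interpreter version and the machine, so no figures are given here.

```python
import timeit

# The loop from the original post, as module-level code: i and j are globals.
TOP_LEVEL = """
j = 0
for i in range(1000000):
    j += i
"""

# The same loop inside a function: i and j are locals.
IN_FUNCTION = """
def main():
    j = 0
    for i in range(1000000):
        j += i
    return j

j = main()
"""

def run_as_script(source):
    """Execute source as module-level code; return (seconds, namespace)."""
    code = compile(source, "<script>", "exec")
    namespace = {"__name__": "__main__"}
    start = timeit.default_timer()
    exec(code, namespace)
    return timeit.default_timer() - start, namespace

top_seconds, top_ns = run_as_script(TOP_LEVEL)
func_seconds, func_ns = run_as_script(IN_FUNCTION)

print("top level:   %.3f s" % top_seconds)
print("in function: %.3f s" % func_seconds)
```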

Sean DiZazzo · Sep 4, 2009

alex23 said:
Sorry, Sean, unfortunately you are wrong, although it's understandable
that you've missed this.

The lookup of locally scoped references is a lot faster than that of
global ones, primarily due to the lookup order: it first checks the
local scope and then out through surrounding scopes _before_ the
global scope.

So yes, depending on the nature of your code, it's quite conceivable to
find distinct performance differences between code using the __main__
idiom and code without.

Interesting. I guess at some point I should try to understand what is
going on under the covers. Thanks.

Manuel Graune · Sep 4, 2009

Sean DiZazzo said:
I'm trying to come up with an answer for you, but I can't...

The if __name__ == "__main__": idiom *is* the standard way to write
python programs, but it's not there to speed up programs. It's there
so that your program can be executed differently whether it is called
as a runnable script from the command line, or if it is imported.

<SNIP>

Thanks for your answer. What you are explaining is exactly why I tried
it in the first place. I'm just wondering why (this is my impression,
not necessarily the reality) none of the recommended texts on Python
put this in the first chapters. Instead - if it is mentioned at all -
it is hidden somewhere in the "advanced" sections. Even if the reason
for this is (I'm guessing...) that it is thought to be too complicated
to explain the "why" right at the beginning, it probably would not hurt
to just say that this is the "correct" way of doing things right at the
start and add a footnote.

Regards,

Manuel

Carl Banks · Sep 4, 2009

alex23 said:
Sorry, Sean, unfortunately you are wrong, although it's understandable
that you've missed this.

The lookup of locally scoped references is a lot faster than that of
global ones, primarily due to the lookup order: it first checks the
local scope and then out through surrounding scopes _before_ the
global scope.

Sorry, alex, unfortunately you are wrong, although it's understandable
that you've missed this.

Actually, Python knows if a variable is local, nonlocal (meaning a
local from a surrounding scope), or global at compile time, so at run
time Python attempts only one kind of lookup.

The speedup comes because local lookups are much faster. Accessing a
local is a simple index operation, and a nonlocal is a pointer deref
or two, then an indexing. However for global variables the object is
looked up in a dictionary.
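
The difference can be made visible with the dis module. This is a sketch assuming CPython 3.4 or later (dis.get_instructions does not exist in the Python 2 used elsewhere in this thread), and exact opcode names vary between CPython versions. The helper name opnames is made up for illustration. Module-level code typically uses name-based instructions such as LOAD_NAME and STORE_NAME, which go through a dictionary, while the same loop inside a function uses the indexed *_FAST family.

```python
import dis

# The loop as it would appear at the top level of a script.
MODULE_SOURCE = """
j = 0
for i in range(10):
    j += i
"""

# The same loop inside a function.
def main():
    j = 0
    for i in range(10):
        j += i
    return j

def opnames(code_or_func):
    """Return the set of bytecode instruction names that are used."""
    return {ins.opname for ins in dis.get_instructions(code_or_func)}

module_ops = opnames(compile(MODULE_SOURCE, "<module-level>", "exec"))
function_ops = opnames(main)

# Dictionary-based name access at module level ...
print(sorted(name for name in module_ops if "NAME" in name))
# ... versus indexed local access inside the function.
print(sorted(name for name in function_ops if "FAST" in name))
```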

Carl Banks

Carl Banks · Sep 4, 2009

Simon Brunning said:
Speaking for myself, it *is* the standard way to structure a script. I
find it more readable, since I can put my main function at the very
top where it's visible, with the classes and functions it makes use of
following in some logical sequence.

I suspect that this is the case for many real-world scripts. Perhaps
it's mainly in books and demos where the extra stuff is left out so
the reader can focus on what the writer is demonstrating?

Speaking for myself, I almost never put any logic at the top level in
anything other than tiny throwaway scripts. Top level is for
importing, and defining functions, classes, and constants, and that's
it.

Even when doing things like preprocessing I'll define a function and
call it rather than putting the logic at top-level. Sometimes I'll
throw in an if-test at top level (for the kind of stuff I might choose
an #if preprocessor statement in C for) but mostly I just put that in
functions.

Carl Banks

Jan Kaliszewski · Sep 4, 2009

alex23 said:
So yes, depending on the nature of your code, it's quite conceivable to
find distinct performance differences between code using the __main__
idiom and code without.

But -- it should be emphasized -- it's faster thanks to running code
(and doing name lookups) within a function, and *not* thanks to using
the __main__ idiom (i.e. 'if __name__ == "__main__":' condition).

Cheers,
*j

Jan Kaliszewski · Sep 4, 2009

04-09-2009 at 08:37:43 r said:
Why use a nested function when you're already *in* main?

I understand you are calling the global scope 'main'. But (independently
of using the __main__ idiom and so on) it is still a good idea not to
place too much code in the global scope but to place your app-logic
code in functions -- because, as we noted:

* in practice it is considerably faster,

* it helps you when using function & class browsers.

Cheers,
*j

Mel · Sep 4, 2009

Manuel Graune wrote:
[ ... ]

Thanks for your answer. What you are explaining is exactly why I tried
it in the first place. I'm just wondering why (this is my impression,
not necessarily the reality) none of the recommended texts on Python
put this in the first chapters. Instead - if it is mentioned at all -
it is hidden somewhere in the "advanced" sections. Even if the reason
for this is (I'm guessing...) that it is thought to be too complicated
to explain the "why" right at the beginning, it probably would not hurt
to just say that this is the "correct" way of doing things right at the
start and add a footnote.

Maybe it's the "correct" way, but it isn't *the* correct way. In my
experience, when I import a program, it isn't because I want to run it
straight through. For that there's `exec`, and subprocess and what not.
More likely I want to get at the internals -- maybe produce my own output
from the intermediate results, for example. For that, a monolithic `main`
function is beside the point.

For a sizeable program you can get the same speed advantage, to within five
9s or so, with a structure like

## ...
if __name__ == '__main__':
    process_this_input()
    process_that_input()
    mess_with_the_collected_data()
    write_the_output()

Mel.

Albert Hopkins · Sep 4, 2009

* having a module that can be imported without side effects helps select
pieces of the module's functionality

* any module should be importable without side effects to make it easier
to run unit tests for that module

+1
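
As a sketch of what "importable without side effects" buys you in a test: the snippet below writes a small module to a temporary file, imports it, and checks that nothing was printed during the import while its functions remain usable. It assumes Python 3, and the names summod, total and import_from_source are made up for illustration.

```python
import contextlib
import importlib.util
import io
import os
import tempfile

# Source of a hypothetical module that keeps its logic in functions.
MODULE_SOURCE = '''
def total(n):
    return sum(range(n))

def main():
    print(total(1000000))

if __name__ == "__main__":
    main()
'''

def import_from_source(source, name="summod"):
    """Write source to a temporary file and import it as module `name`."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, name + ".py")
        with open(path, "w") as handle:
            handle.write(source)
        spec = importlib.util.spec_from_file_location(name, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # runs the module's top-level code
        return module

# Capture anything the import might print.
captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    mod = import_from_source(MODULE_SOURCE)

# The guard kept main() from running, so the import stayed silent,
# and a test can call total() directly with small inputs.
print(repr(captured.getvalue()), mod.total(5))
```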

alex23 · Sep 5, 2009

Carl Banks said:
Sorry, alex, unfortunately you are wrong, although it's understandable
that you've missed this.
[...]
The speedup comes because local lookups are much faster. Accessing a
local is a simple index operation, and a nonlocal is a pointer deref
or two, then an indexing. However for global variables the object is
looked up in a dictionary.

Interesting. I guess at some point I should try to understand what is
going on under the covers.

Thanks for the clarification.

r · Sep 5, 2009

Jan Kaliszewski said:
I understand you are calling the global scope 'main'. But (independently
of using the __main__ idiom and so on) it is still a good idea not to
place too much code in the global scope but to place your app-logic
code in functions -- because, as we noted:

* in practice it is considerably faster,

* it helps you when using function & class browsers.

Ah yes, thanks Jan!
And the others' mention of "side effects" from imports makes a lot
of sense too.
