package import dangers

Ethan Furman

Greetings!

I'm working on a package with multiple modules (and possibly packages),
and I would like to do it correctly. :)

I have read of references to possible issues regarding a module being
imported (and run) more than once, but I haven't been able to find
actual examples of such failing code.

My Google search was fruitless (although still educational !-), so if
anyone could point me in the right direction I would greatly appreciate it.

~Ethan~
 
Diez B. Roggisch

Ethan said:
Greetings!

I'm working on a package with multiple modules (and possibly packages),
and I would like to do it correctly. :)

I have read of references to possible issues regarding a module being
imported (and run) more than once, but I haven't been able to find
actual examples of such failing code.

My Google search was fruitless (although still educational !-), so if
anyone could point me in the right direction I would greatly appreciate
it.

The most common problem is that a file is used as module and as executable
at the same time.

Like this:

--- test.py ---

class Foo(object):
    pass


if __name__ == "__main__":
    import test
    assert Foo is test.Foo

---

This will fail when executed from the commandline because the module is
known twice - once as "__main__", once as "test".

So keep your startup-scripts trivial, or don't ever import from them.
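A minimal sketch for reproducing the failure above without leaving a stray file behind. The file name `dualuse.py` is hypothetical (chosen so it cannot be confused with the standard library's `test` package), and the sketch assumes a Python 3 interpreter is available as `sys.executable`:

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# Hypothetical script that imports itself under its file name.
SCRIPT = textwrap.dedent('''
    class Foo(object):
        pass

    if __name__ == "__main__":
        import dualuse
        print(Foo is dualuse.Foo)
        print(Foo.__module__, dualuse.Foo.__module__)
''')


def run_demo():
    """Write the script to a temporary directory and run it as a program."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "dualuse.py")
        with open(path, "w") as fh:
            fh.write(SCRIPT)
        result = subprocess.run(
            [sys.executable, path],
            capture_output=True, text=True, check=True,
        )
    return result.stdout.split()


output = run_demo()
print(output)  # ['False', '__main__', 'dualuse']
```

The two module names in the output show why the identity check fails: the same source has been executed once as `__main__` and once as `dualuse`.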

You might create similar situations when modifying sys.path to reach *into*
a package - but that would be sick to do anyway.

Other than that, I'm not aware of any issues.

Diez
 
Steven D'Aprano

The most common problem is that a file is used as module and as
executable at the same time.

Like this:

--- test.py ---

class Foo(object):
    pass


if __name__ == "__main__":
    import test
    assert Foo is test.Foo


Why would a module need to import itself? Surely that's a very rare
occurrence -- I think I've used it twice, in 12 years or so. I don't see
why you need to disparage the idea of combining modules and scripts in
the one file because of one subtle gotcha.
 
Carl Banks

Why would a module need to import itself? Surely that's a very rare
occurrence -- I think I've used it twice, in 12 years or so. I don't see
why you need to disparage the idea of combining modules and scripts in
the one file because of one subtle gotcha.

I'm sorry, this can't reasonably be characterized as a "subtle
gotcha". I totally disagree, it's not a gotcha but a major
time-killing head-scratcher, and it's too thoroughly convoluted to be
called subtle (subtle is like one tricky detail that messes up an
otherwise clean design, whereas this is like a dozen tricky details
that mess the whole thing up).

It's easily the most confusing thing commonly encountered in Python.
I've seen experts struggle to grasp the details.

Newbies and intermediate programmers should be advised never to do it,
use a file as either a script or a module, not both. Expert
programmers who understand the issues--and lots of experts don't--can
feel free to venture into those waters warily. I would say that's an
inferior solution to the method I advised in another thread that
uses a single script as an entry point and imports modules. But I'm
not going to tell an expert how to do it.

Average programmers, yes I will. Too easy to mess up, too hard to
understand, and too little benefit, so don't do it. A file should be
either a module or a script, not both.


Carl Banks
 
Steven D'Aprano

I'm sorry, this can't reasonably be characterized as a "subtle gotcha".
I totally disagree, it's not a gotcha but a major time-killing
head-scratcher, and it's too thoroughly convoluted to be called subtle
(subtle is like one tricky detail that messes up an otherwise clean
design, whereas this is like a dozen tricky details that mess the whole
thing up).

Even if that were true, it's still rare for a module to import itself. If
a major head-scratcher only bites you one time in a hundred combination
module+scripts, that's hardly a reason to say don't write combos. It's a
reason to not have scripts that import themselves, or a reason to learn
how Python behaves in this case.

But I dispute it's a head-scratcher. You just need to think a bit about
what's going on. (See below.)

It's easily the most confusing thing commonly encountered in Python.

But it's not commonly encountered at all, in my opinion. I see no
evidence for it being common.

I'll admit it might be surprising the first time you see it, but if you
give it any thought it shouldn't be: when you run a module, you haven't
imported it. Therefore it hasn't gone through the general import
machinery. The import machinery needs to execute the code in a module,
and it can't know that the module is already running. Therefore you get
two independent executions of the code, which means the class accessible
via the running code and the class accessible via the imported code will
be different objects.

Fundamentally, it's no more mysterious than this:

>>> def f():
...     class K:
...         pass
...     return K
...
>>> f() is f()
False
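The same point can be sketched in process: executing one source file twice under two module names gives two unrelated class objects. This is only an illustration. The names `sample.py`, `first_copy` and `second_copy` are made up, and `importlib` is used here to stand in for "run once as a script, import once as a module":

```python
import importlib.util
import os
import tempfile

SOURCE = "class Foo(object):\n    pass\n"


def load_as(name, path):
    """Execute the file at `path` as a fresh module object called `name`."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.py")
    with open(path, "w") as fh:
        fh.write(SOURCE)
    # Two independent executions of the same source code.
    first = load_as("first_copy", path)
    second = load_as("second_copy", path)

same_class = first.Foo is second.Foo
print(same_class)  # False: each execution built its own class object
```

Each `class` statement runs once per execution of the file, so each module ends up with its own `Foo`, exactly as each call to a function that defines a class returns a new class.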



I've seen experts struggle to grasp the details.

Perhaps they're trying too hard and ignoring the simple things:

$ cat test.py
class Foo(object):
    pass

if __name__ == "__main__":
    import test
    print(Foo)
    print(test.Foo)

$ python test.py
<class '__main__.Foo'>
<class 'test.Foo'>

All you have to do is look at the repr() of the class, and the answer is
right there in your face.

Still too hard to grasp? Then make it really simple:

$ cat test2.py
print("hello")
if __name__ == "__main__":
    import test2
$ python test2.py
hello
hello


I don't see how it could be more obvious what's going on. You run the
script, and the print line is executed. Then the script tries to import a
module (which just happens to be the same script running). Since the
module hasn't gone through the import machinery yet, it gets loaded, and
executed.

Simple and straightforward and not difficult at all.


Newbies and intermediate programmers should be advised never to do it,
use a file as either a script or a module, not both.

There's nothing wrong with having modules be runnable as scripts. There
are at least 93 modules in the std library that do it (as of version
2.5). It's a basic Pythonic technique that is ideal for simple scripts.

Of course, once you have a script complicated enough that it needs to be
broken up into multiple modules, you run into all sorts of complications,
including circular imports. A major command line app might need hundreds
of lines just dealing with the UI. It's fundamentally good advice to
split the UI (the front end, the script) away from the backend (the
modules) once you reach that level of complexity. Your earlier suggestion
of having a single executable script to act as a front end for your
multiple modules and packages is a good idea. But that's because of the
difficulty of managing complicated applications, not because there's
something fundamentally wrong with having an importable module also be
runnable from the command line.
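For a simple case, the module-that-is-also-a-script technique might look like the sketch below. The function names `greet` and `main` are illustrative only. The point is that the `__main__` block stays trivial and never imports the file it lives in, so the double-execution problem cannot arise:

```python
import sys


def greet(name):
    """Library code: importable and testable on its own."""
    return "hello, " + name


def main(argv):
    """Thin command-line front end over the library code."""
    name = argv[1] if len(argv) > 1 else "world"
    print(greet(name))
    return 0


if __name__ == "__main__":
    # Trivial startup code: hand over to main() and nothing else.
    main(sys.argv)
```

Other modules can `import` this file and call `greet()` directly, while running the file from the command line only goes through `main()`.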
 
Dave Angel

Steven said:
Why would a module need to import itself? Surely that's a very rare
occurrence -- I think I've used it twice, in 12 years or so. I don't see
why you need to disparage the idea of combining modules and scripts in
the one file because of one subtle gotcha.

I'm surprised to see you missed this. A module doesn't generally import
itself, but it's an easy mistake for a circular dependency to develop
among modules. modulea imports moduleb, which imports modulea again.
This can cause problems in many cases, but two things make it worse.
One is if an import isn't at the very beginning of the module, and even
worse is when one of the modules involved is the original script. You
end up with two instances of the module, including separate copies of
the global variables. Lots of subtle bugs this way.
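A small sketch of that scenario, assuming two hypothetical files `demo_app.py` (the script) and `demo_plugin.py` (a module that imports the script back). It runs them in a temporary directory and shows the two separate copies of a module-level list:

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# The script: holds module-level state and imports the plugin.
APP = textwrap.dedent('''
    registry = []
    import demo_plugin

    if __name__ == "__main__":
        import sys
        print(registry)                           # the __main__ copy
        print(sys.modules["demo_app"].registry)   # the imported copy
''')

# The plugin: imports the script by its file name and mutates its state.
PLUGIN = textwrap.dedent('''
    import demo_app
    demo_app.registry.append("plugin")
''')


def run_demo():
    with tempfile.TemporaryDirectory() as tmp:
        for name, source in (("demo_app.py", APP), ("demo_plugin.py", PLUGIN)):
            with open(os.path.join(tmp, name), "w") as fh:
                fh.write(source)
        result = subprocess.run(
            [sys.executable, os.path.join(tmp, "demo_app.py")],
            capture_output=True, text=True, check=True,
        )
    return result.stdout.splitlines()


lines = run_demo()
print(lines)  # ['[]', "['plugin']"]
```

The plugin registered itself with the copy named `demo_app`, while the running script only ever looks at its own `__main__` copy, which stays empty. No exception is raised, which is what makes this kind of bug hard to spot.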

And there have been many threads right here, probably an average of once
every two months, where the strange symptoms are ultimately caused by
exactly this.

DaveA
 
Steven D'Aprano

I'm surprised to see you missed this. A module doesn't generally import
itself, but it's an easy mistake for a circular dependency to develop
among modules.

Circular imports are always a difficulty. That has nothing to do with
making modules executable as scripts.
 
Dave Angel

Steven said:
Circular imports are always a difficulty. That has nothing to do with
making modules executable as scripts.

I was mainly making the point that while self-importing would be rare,
circular imports probably crop up fairly often. Circular imports are
(nearly always) a design flaw. But until you made me think about it, I
would have said that they are safe in CPython as long as all imports are
at the top of the file. And as long as the script isn't part of the
dependency loop. Thanks for the word "always" above; in trying to
refute it, I thought hard enough to realize you're right. And what's
better, realized it before hitting "SEND."

I would still say that the bugs caused in circular imports are
relatively easy to spot, while the ones caused by importing the script
can be quite painful to discover, if you aren't practiced at looking for
them.

And my practice is to keep the two separate, only using a module as a
script when testing that module. The only time I've run into the
problem of the dual loading of the script was in a simple program I
copy-pasted from the wxPython demo code. That demo had a common module
(shell) which each individual demo imported. But if you ran that common
module as a script, it interactively let you choose which demo to import.

DaveA
 
