package import dangers

Discussion in 'Python' started by Ethan Furman, Oct 6, 2009.

  1. Ethan Furman

    Ethan Furman Guest

    Greetings!

    I'm working on a package with multiple modules (and possibly packages),
    and I would like to do it correctly. :)

    I have read of references to possible issues regarding a module being
    imported (and run) more than once, but I haven't been able to find
    actual examples of such failing code.

    My google search was fruitless (although still educational !-), so if
    anyone could point me in the right direction I would greatly appreciate it.

    ~Ethan~
    Ethan Furman, Oct 6, 2009
    #1

  2. Ethan Furman wrote:

    > Greetings!
    >
    > I'm working on a package with multiple modules (and possibly packages),
    > and I would like to do it correctly. :)
    >
    > I have read of references to possible issues regarding a module being
    > imported (and run) more than once, but I haven't been able to find
    > actual examples of such failing code.
    >
    > My google search was fruitless (although still educational !-), so if
    > anyone could point me in the right direction I would greatly appreciate
    > it.


    The most common problem is that a file is used as module and as executable
    at the same time.

    Like this:

    --- test.py ---

    class Foo(object):
        pass


    if __name__ == "__main__":
        import test
        assert Foo is test.Foo

    ---

    This will fail when executed from the command line because the module is
    known twice - once as "__main__", once as "test".

    So keep your startup scripts trivial, or don't ever import from them.

    You might create similar situations when modifying sys.path to reach *into*
    a package - but that would be sick to do anyway.
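
The sys.path situation mentioned above can be reproduced in a few lines. This is only a sketch, assuming Python 3; the names pkg_demo and mod_demo are hypothetical and exist only in a temporary directory created by the example. It puts both the package's parent directory and the package directory itself on sys.path, then imports the same file under two names.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk: pkg_demo/__init__.py and pkg_demo/mod_demo.py.
# Both names are hypothetical, chosen only for this demonstration.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "pkg_demo")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("")
with open(os.path.join(pkg_dir, "mod_demo.py"), "w") as f:
    f.write("class Foo(object):\n    pass\n")

# Put the package's parent directory AND the package directory itself on sys.path.
sys.path.insert(0, root)
sys.path.insert(0, pkg_dir)
importlib.invalidate_caches()

import pkg_demo.mod_demo as via_package  # found through the parent directory
import mod_demo as via_path_hack         # found by reaching *into* the package

same_module = via_package is via_path_hack
same_class = via_package.Foo is via_path_hack.Foo
print(same_module, same_class)  # False False
```

The file is executed twice, once per name, so sys.modules ends up holding two unrelated module objects and two unrelated Foo classes.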

    Other than that, I'm not aware of any issues.

    Diez
    Diez B. Roggisch, Oct 6, 2009
    #2

  3. On Tue, 06 Oct 2009 18:42:16 +0200, Diez B. Roggisch wrote:

    > The most common problem is that a file is used as module and as
    > executable at the same time.
    >
    > Like this:
    >
    > --- test.py ---
    >
    > class Foo(object):
    >     pass
    >
    >
    > if __name__ == "__main__":
    >     import test
    >     assert Foo is test.Foo
    >
    > ---
    >
    > This will fail when executed from the commandline because the module is
    > known twice - once as "__main__", once as "test".



    Why would a module need to import itself? Surely that's a very rare
    occurrence -- I think I've used it twice, in 12 years or so. I don't see
    why you need to disparage the idea of combining modules and scripts in
    the one file because of one subtle gotcha.




    --
    Steven
    Steven D'Aprano, Oct 6, 2009
    #3
  4. Carl Banks

    Carl Banks Guest

    On Oct 6, 3:56 pm, Steven D'Aprano wrote:
    > On Tue, 06 Oct 2009 18:42:16 +0200, Diez B. Roggisch wrote:
    > > The most common problem is that a file is used as module and as
    > > executable at the same time.
    > >
    > > Like this:
    > >
    > > --- test.py ---
    > >
    > > class Foo(object):
    > >     pass
    > >
    > > if __name__ == "__main__":
    > >     import test
    > >     assert Foo is test.Foo
    > >
    > > ---
    > >
    > > This will fail when executed from the commandline because the module is
    > > known twice - once as "__main__", once as "test".
    >
    > Why would a module need to import itself? Surely that's a very rare
    > occurrence -- I think I've used it twice, in 12 years or so. I don't see
    > why you need to disparage the idea of combining modules and scripts in
    > the one file because of one subtle gotcha.


    I'm sorry, this can't reasonably be characterized as a "subtle
    gotcha". I totally disagree: it's not a gotcha but a major
    time-killing head-scratcher, and it's too thoroughly convoluted to be
    called subtle (subtle is like one tricky detail that messes up an
    otherwise clean design, whereas this is like a dozen tricky details
    that mess the whole thing up).

    It's easily the most confusing thing commonly encountered in Python.
    I've seen experts struggle to grasp the details.

    Newbies and intermediate programmers should be advised never to do it:
    use a file as either a script or a module, not both. Expert
    programmers who understand the issues--and lots of experts don't--can
    feel free to venture into those waters warily. I would say that's an
    inferior solution to the method I advised in another thread, which
    uses a single script as an entry point and imports modules. But I'm
    not going to tell an expert how to do it.

    Average programmers, yes I will. Too easy to mess up, too hard to
    understand, and too little benefit, so don't do it. A file should be
    either a module or a script, not both.


    Carl Banks
    Carl Banks, Oct 7, 2009
    #4
  5. On Tue, 06 Oct 2009 17:01:41 -0700, Carl Banks wrote:

    >> Why would a module need to import itself? Surely that's a very rare
    >> occurrence -- I think I've used it twice, in 12 years or so. I don't
    >> see why you need to disparage the idea of combining modules and scripts
    >> in the one file because of one subtle gotcha.
    >
    > I'm sorry, this can't reasonably be characterized as a "subtle gotcha".
    > I totally disagree, it's not a gotcha but a major time-killing
    > head-scratcher, and it's too thoroughly convoluted to be called subtle
    > (subtle is like one tricky detail that messes up an otherwise clean
    > design, whereas this is like a dozen tricky details that mess the whole
    > thing up).


    Even if that were true, it's still rare for a module to import itself. If
    a major head-scratcher only bites you one time in a hundred combination
    module+scripts, that's hardly a reason to say don't write combos. It's a
    reason to not have scripts that import themselves, or a reason to learn
    how Python behaves in this case.

    But I dispute it's a head-scratcher. You just need to think a bit about
    what's going on. (See below.)


    > It's easily the most confusing thing commonly encountered in Python.


    But it's not commonly encountered at all, in my opinion. I see no
    evidence for it being common.

    I'll admit it might be surprising the first time you see it, but if you
    give it any thought it shouldn't be: when you run a module, you haven't
    imported it. Therefore it hasn't gone through the general import
    machinery. The import machinery needs to execute the code in a module,
    and it can't know that the module is already running. Therefore you get
    two independent executions of the code, which means the class accessible
    via the running code and the class accessible via the imported code will
    be different objects.

    Fundamentally, it's no more mysterious than this:


    >>> def factory():
    ...     class K:
    ...         pass
    ...     return K
    ...
    >>> factory() is factory()
    False
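
The same point can be made with modules rather than functions. The following is a minimal sketch, assuming Python 3 and its importlib.util helpers; the file name foo_demo.py and the module names as_script and as_module are hypothetical stand-ins. It is not literally how the interpreter runs a script, but it shows that executing one source file twice yields two independent classes.

```python
import importlib.util
import os
import tempfile

# A throwaway source file; "foo_demo.py" is a hypothetical name.
path = os.path.join(tempfile.mkdtemp(), "foo_demo.py")
with open(path, "w") as f:
    f.write("class Foo(object):\n    pass\n")


def load_as(name):
    """Execute the file at `path` as a fresh module object called `name`."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


first = load_as("as_script")   # stands in for the running script
second = load_as("as_module")  # stands in for the imported copy

print(first.Foo, second.Foo)
print(first.Foo is second.Foo)  # False
```

Each call runs the class statement again, so each module gets its own Foo, just as each call to factory() returns its own K.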



    > I've seen experts struggle to grasp the details.


    Perhaps they're trying too hard and ignoring the simple things:

    $ cat test.py
    class Foo(object):
        pass

    if __name__ == "__main__":
        import test
        print(Foo)
        print(test.Foo)

    $ python test.py
    <class '__main__.Foo'>
    <class 'test.Foo'>

    All you have to do is look at the repr() of the class, and the answer is
    right there in your face.

    Still too hard to grasp? Then make it really simple:

    $ cat test2.py
    print("hello")
    if __name__ == "__main__":
        import test2
    $ python test2.py
    hello
    hello


    I don't see how it could be more obvious what's going on. You run the
    script, and the print line is executed. Then the script tries to import a
    module (which just happens to be the same script running). Since the
    module hasn't gone through the import machinery yet, it gets loaded, and
    executed.

    Simple, straightforward, and not difficult at all.



    > Newbies and intermediate programmers should be advised never to do it,
    > use a file as either a script or a module, not both.


    There's nothing wrong with having modules be runnable as scripts. There
    are at least 93 modules in the std library that do it (as of version
    2.5). It's a basic Pythonic technique that is ideal for simple scripts.

    Of course, once you have a script complicated enough that it needs to be
    broken up into multiple modules, you run into all sorts of complications,
    including circular imports. A major command line app might need hundreds
    of lines just dealing with the UI. It's fundamentally good advice to
    split the UI (the front end, the script) away from the backend (the
    modules) once you reach that level of complexity. Your earlier suggestion
    of having a single executable script to act as a front end for your
    multiple modules and packages is a good idea. But that's because of the
    difficulty of managing complicated applications, not because there's
    something fundamentally wrong with having an importable module also be
    runnable from the command line.
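
The front-end/back-end split described above might look roughly like this. It is a sketch only, assuming Python 3.7 or later; the file names backend_demo.py and run_demo.py and the main() entry point are hypothetical. The example writes both files to a temporary directory and runs the front end in a child interpreter.

```python
import os
import subprocess
import sys
import tempfile

workdir = tempfile.mkdtemp()

# Back end: holds all the real code, including a main() entry point.
with open(os.path.join(workdir, "backend_demo.py"), "w") as f:
    f.write(
        "class Foo(object):\n"
        "    pass\n"
        "\n"
        "def main():\n"
        "    import backend_demo\n"
        "    # This file was imported, never run, so only one copy exists.\n"
        "    print(Foo is backend_demo.Foo)\n"
    )

# Front end: a trivial script that nothing ever imports.
with open(os.path.join(workdir, "run_demo.py"), "w") as f:
    f.write("import backend_demo\nbackend_demo.main()\n")

result = subprocess.run(
    [sys.executable, os.path.join(workdir, "run_demo.py")],
    capture_output=True,
    text=True,
)
print(result.stdout.strip())  # True
```

Because the only file known as "__main__" is the two-line launcher, the back-end module is loaded exactly once and the identity check succeeds.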



    --
    Steven
    Steven D'Aprano, Oct 7, 2009
    #5
  6. Dave Angel

    Dave Angel Guest

    Steven D'Aprano wrote:
    > On Tue, 06 Oct 2009 18:42:16 +0200, Diez B. Roggisch wrote:
    >
    >
    >> The most common problem is that a file is used as module and as
    >> executable at the same time.
    >>
    >> Like this:
    >>
    >> --- test.py ---
    >>
    >> class Foo(object):
    >>     pass
    >>
    >>
    >> if __name__ == "__main__":
    >>     import test
    >>     assert Foo is test.Foo
    >>
    >> ---
    >>
    >> This will fail when executed from the commandline because the module is
    >> known twice - once as "__main__", once as "test".
    >>

    >
    >
    > Why would a module need to import itself? Surely that's a very rare
    > occurrence -- I think I've used it twice, in 12 years or so. I don't see
    > why you need to disparage the idea of combining modules and scripts in
    > the one file because of one subtle gotcha.
    >
    >

    I'm surprised to see you missed this. A module doesn't generally import
    itself, but it's an easy mistake for a circular dependency to develop
    among modules. modulea imports moduleb, which imports modulea again.
    This can cause problems in many cases, but two things make it worse.
    One is if an import isn't at the very beginning of the module, and even
    worse is when one of the modules involved is the original script. You
    end up with two instances of the module, including separate copies of
    the global variables. Lots of subtle bugs this way.
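
To make the "separate copies of the global variables" symptom concrete, here is a small sketch, assuming Python 3.7 or later. The file names app_demo.py and helper_demo.py, the counter global, and the bump functions are all hypothetical. The script imports a helper, and the helper imports the script back.

```python
import os
import subprocess
import sys
import tempfile

workdir = tempfile.mkdtemp()

# The script: owns a global and, when run directly, pulls in a helper.
with open(os.path.join(workdir, "app_demo.py"), "w") as f:
    f.write(
        "counter = 0\n"
        "\n"
        "def bump():\n"
        "    global counter\n"
        "    counter += 1\n"
        "\n"
        "if __name__ == '__main__':\n"
        "    import helper_demo\n"
        "    helper_demo.bump_via_import()\n"
        "    print('script sees counter =', counter)\n"
        "    import app_demo\n"
        "    print('imported copy has counter =', app_demo.counter)\n"
    )

# The helper: imports the script back, which closes the dependency loop.
with open(os.path.join(workdir, "helper_demo.py"), "w") as f:
    f.write(
        "import app_demo\n"
        "\n"
        "def bump_via_import():\n"
        "    app_demo.bump()\n"
    )

result = subprocess.run(
    [sys.executable, os.path.join(workdir, "app_demo.py")],
    capture_output=True,
    text=True,
)
print(result.stdout)
# script sees counter = 0
# imported copy has counter = 1
```

The helper bumped the counter in the copy known as app_demo, while the running script still sees its own untouched counter in "__main__". Nothing raises an error, which is why this kind of bug is hard to track down.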

    And there have been many threads right here, probably an average of once
    every two months, where the strange symptoms are ultimately caused by
    exactly this.

    DaveA
    Dave Angel, Oct 7, 2009
    #6
  7. On Tue, 06 Oct 2009 21:44:35 -0400, Dave Angel wrote:

    > I'm surprised to see you missed this. A module doesn't generally import
    > itself, but it's an easy mistake for a circular dependency to develop
    > among modules.


    Circular imports are always a difficulty. That has nothing to do with
    making modules executable as scripts.


    --
    Steven
    Steven D'Aprano, Oct 7, 2009
    #7
  8. Dave Angel

    Dave Angel Guest

    Steven D'Aprano wrote:
    > On Tue, 06 Oct 2009 21:44:35 -0400, Dave Angel wrote:
    >
    >
    >> I'm surprised to see you missed this. A module doesn't generally import
    >> itself, but it's an easy mistake for a circular dependency to develop
    >> among modules.
    >>

    >
    > Circular imports are always a difficulty. That has nothing to do with
    > making modules executable as scripts.
    >
    >

    I was mainly making the point that while self-importing would be rare,
    circular imports probably crop up fairly often. Circular imports are
    (nearly always) a design flaw. But until you made me think about it, I
    would have said that they are safe in CPython as long as all imports are
    at the top of the file. And as long as the script isn't part of the
    dependency loop. Thanks for the word "always" above; in trying to
    refute it, I thought hard enough to realize you're right. And what's
    better, realized it before hitting "SEND."

    I would still say that the bugs caused in circular imports are
    relatively easy to spot, while the ones caused by importing the script
    can be quite painful to discover, if you aren't practiced at looking for
    them.
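
For contrast, a plain circular import that does not involve the script tends to fail loudly. The sketch below assumes Python 3.7 or later; the module names moda_demo and modb_demo and their variables are hypothetical. Both modules keep their imports at the top of the file, and the loop still breaks because one module is only partially initialized when the other asks for a name from it.

```python
import os
import subprocess
import sys
import tempfile

workdir = tempfile.mkdtemp()

# Two modules that import names from each other at the top of the file.
with open(os.path.join(workdir, "moda_demo.py"), "w") as f:
    f.write("from modb_demo import b_value\na_value = 1\n")
with open(os.path.join(workdir, "modb_demo.py"), "w") as f:
    f.write("from moda_demo import a_value\nb_value = a_value + 1\n")

# Import one of them in a child interpreter started in that directory.
result = subprocess.run(
    [sys.executable, "-c", "import moda_demo"],
    cwd=workdir,
    capture_output=True,
    text=True,
)
failed = result.returncode != 0
# The last line of the traceback names the exception that was raised.
print(result.stderr.strip().splitlines()[-1])
```

The traceback points straight at the offending import line, which is much easier to diagnose than two silently diverging copies of a script.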

    And my practice is to keep the two separate, only using a module as a
    script when testing that module. The only time I've run into the
    problem of the dual loading of the script was in a simple program I
    copy-pasted from the wxPython demo code. That demo had a common module
    (shell) which each individual demo imported. But if you ran that common
    module as a script, it interactively let you choose which demo to import.

    DaveA
    Dave Angel, Oct 7, 2009
    #8