Trying to understand 'import' a bit better

Frank Millman

Hi all

I have been using 'import' for ages without particularly thinking about it -
it just works.

Now I am having to think about it a bit harder, and I find it is a bit
more complicated than I had realised - not *that* complicated, but there are
some subtleties.

I don't know the correct terminology, but I want to distinguish between the
following two scenarios -

1. A python 'program', that is self-contained, has some kind of startup,
invokes certain functionality, and then closes.

2. A python 'library', that exposes functionality to other python programs,
but relies on the other program to invoke its functionality.

The first scenario has the following characteristics -
- it can consist of a single script or a number of modules
- if the latter, the modules can all be in the same directory, or in one
or more sub-directories
- if they are in sub-directories, the sub-directory must contain
__init__.py, and is referred to as a sub-package
- the startup script will normally be in the top directory, and will be
executed directly by the user

When python executes a script, it automatically places the directory
containing the script into 'sys.path'. Therefore the script can import a
top-level module using 'import <module>', and a sub-package module using
'import <sub-package>.<module>'.
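
To check the sys.path behaviour described above, here is a minimal sketch. It assumes a default interpreter setup (no -P flag or PYTHONSAFEPATH in effect), and the file names 'startup.py' and 'helper.py' are made up for illustration. It writes a throwaway script into a temporary directory, runs it, and compares sys.path[0] as seen by the script with the script's own directory.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_startup_script():
    """Run a throwaway script; return (script directory, lines it printed)."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Hypothetical top-level module sitting next to the startup script.
        (root / "helper.py").write_text("NAME = 'helper'\n")
        # Hypothetical startup script: reports sys.path[0], then imports
        # its neighbour with a plain 'import <module>'.
        (root / "startup.py").write_text(
            "import os, sys\n"
            "print(os.path.realpath(sys.path[0]))\n"
            "import helper\n"
            "print(helper.NAME)\n"
        )
        result = subprocess.run(
            [sys.executable, str(root / "startup.py")],
            capture_output=True, text=True, check=True,
        )
        return os.path.realpath(tmp), result.stdout.splitlines()


script_dir, lines = run_startup_script()
print(lines[0] == script_dir)  # sys.path[0] is the script's directory
print(lines[1])                # value read from the imported module
```

The script never touches sys.path itself, so the import of 'helper' only succeeds because the interpreter added the script's directory.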

The second scenario has similar characteristics, except it will not have a
startup script. In order for a python program to make use of the library, it
has to import it. In order for python to find it, the directory containing
it has to be in sys.path. In order for python to recognise the directory as
a valid container, it has to contain __init__.py, and is referred to as a
package.

To access a module of the package, the python program must use 'import
<package>.<module>' (or 'from <package> import <module>'), and to access a
sub-package module it must use 'import <package>.<sub-package>.<module>'.
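
As a rough illustration of the library case, the sketch below builds a tiny package in a temporary directory and imports from it. The names 'demo_lib', 'core', 'sub' and 'extra' are invented for the example. Only the directory that *contains* the package is added to sys.path.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_library(parent: Path) -> None:
    """Create demo_lib/ with one module and one sub-package under parent."""
    pkg = parent / "demo_lib"
    (pkg / "sub").mkdir(parents=True)
    (pkg / "__init__.py").write_text("")
    (pkg / "core.py").write_text("VALUE = 'core'\n")
    (pkg / "sub" / "__init__.py").write_text("")
    (pkg / "sub" / "extra.py").write_text("VALUE = 'extra'\n")


with tempfile.TemporaryDirectory() as tmp:
    build_library(Path(tmp))
    sys.path.insert(0, tmp)          # the directory containing the package
    importlib.invalidate_caches()    # the files were created at runtime
    try:
        import demo_lib.core                 # import <package>.<module>
        from demo_lib.sub import extra       # sub-package module
        core_value = demo_lib.core.VALUE
        extra_value = extra.VALUE
    finally:
        sys.path.remove(tmp)

print(core_value, extra_value)
```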

So far so uncontroversial (I hope).

The subtlety arises when the package wants to access its own modules.
Instead of using 'import <module>' it must use 'import <package>.<module>'.
This is because the directory containing the package is in sys.path, but the
package itself is not. It is possible to insert the package directory name
into sys.path as well, but as was pointed out recently, this is dangerous,
because you can end up with the same module imported twice under different
names, with potentially disastrous consequences.
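
The double-import danger can be reproduced with a small sketch. The package name 'dup_pkg' and module name 'dup_state' are hypothetical. Both the containing directory and the package directory itself are put on sys.path, and the same file is then imported under two names.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "dup_pkg"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "dup_state.py").write_text("items = []\n")

    # Put BOTH the containing directory and the package directory on sys.path.
    added = [tmp, str(pkg)]
    sys.path[:0] = added
    importlib.invalidate_caches()
    try:
        via_package = importlib.import_module("dup_pkg.dup_state")
        via_bare = importlib.import_module("dup_state")
    finally:
        for entry in added:
            sys.path.remove(entry)

# One source file, two independent module objects with separate state.
via_package.items.append("registered")
same_object = via_package is via_bare
print(same_object)      # False
print(via_bare.items)   # []
```

Anything registered through one name is invisible through the other, which is the kind of consequence the warning refers to.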

Therefore, as I see it, if you are developing a project using scenario 1
above, and then want to change it to scenario 2, you have to go through the
entire project and change all import references by prepending the package
name.
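
Here is a small sketch of what that change looks like inside the package, assuming Python 3 (where a plain 'import <module>' is always treated as absolute). The names 'app_pkg', 'pkg_helper', 'bare' and 'qualified' are invented. Only the containing directory is on sys.path.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "app_pkg"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "pkg_helper.py").write_text("NAME = 'helper'\n")
    # Scenario-1 style import, left unchanged after the move into a package.
    (pkg / "bare.py").write_text("import pkg_helper\n")
    # Same import with the package name prepended.
    (pkg / "qualified.py").write_text(
        "import app_pkg.pkg_helper as pkg_helper\n"
        "NAME = pkg_helper.NAME\n"
    )

    sys.path.insert(0, tmp)   # only the directory containing the package
    importlib.invalidate_caches()
    try:
        qualified = importlib.import_module("app_pkg.qualified")
        try:
            importlib.import_module("app_pkg.bare")
            bare_import_ok = True
        except ModuleNotFoundError:
            bare_import_ok = False
    finally:
        sys.path.remove(tmp)

print(qualified.NAME)    # helper
print(bare_import_ok)    # False
```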

Have I got this right?

Frank Millman
 
