PEP 382: Namespace Packages

Discussion in 'Python' started by Martin v. Löwis, Apr 2, 2009.

  1. I propose the following PEP for inclusion in Python 3.1.
    Please comment.

    Regards,
    Martin

    Abstract
    ========

    Namespace packages are a mechanism for splitting a single Python
    package across multiple directories on disk. In current Python
    versions, an algorithm to compute the package's __path__ must be
    formulated. With the enhancement proposed here, the import machinery
    itself will construct the list of directories that make up the
    package.

    Terminology
    ===========

    Within this PEP, the term package refers to Python packages as defined
    by Python's import statement. The term distribution refers to
    separately installable sets of Python modules as stored in the Python
    package index, and installed by distutils or setuptools. The term
    vendor package refers to groups of files installed by an operating
    system's packaging mechanism (e.g. Debian or Red Hat packages installed
    on Linux systems).

    The term portion refers to a set of files in a single directory (possibly
    stored in a zip file) that contribute to a namespace package.

    Namespace packages today
    ========================

    Python currently provides pkgutil.extend_path to denote a package as
    a namespace package. The recommended way of using it is to put::

        from pkgutil import extend_path
        __path__ = extend_path(__path__, __name__)

    in the package's ``__init__.py``. Every distribution needs to provide
    the same contents in its ``__init__.py``, so that extend_path is
    invoked independent of which portion of the package gets imported
    first. As a consequence, the package's ``__init__.py`` cannot
    practically define any names, because which portion is imported first
    depends on the order of the package fragments on sys.path. As a special
    feature, extend_path reads files named ``*.pkg``, which allow
    additional portions to be declared.
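As an illustration of the extend_path idiom described above, here is a minimal sketch that builds two portions of one package in temporary directories and imports a module from each. The package name ``demo_nspkg``, the directory names ``dist_a`` and ``dist_b``, and the helper functions are hypothetical and chosen only for this example; ``pkgutil.extend_path`` itself is the standard library function the PEP refers to.

```python
import importlib
import os
import sys
import tempfile

# Every portion ships the same __init__.py, as the PEP text requires.
INIT_SOURCE = (
    "from pkgutil import extend_path\n"
    "__path__ = extend_path(__path__, __name__)\n"
)


def make_portion(root, package, module, value):
    """Create root/package/ with the shared __init__.py and one module."""
    pkg_dir = os.path.join(root, package)
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as handle:
        handle.write(INIT_SOURCE)
    with open(os.path.join(pkg_dir, module + ".py"), "w") as handle:
        handle.write("VALUE = %r\n" % value)


def demo():
    """Import modules from two portions of the (hypothetical) demo_nspkg."""
    base = tempfile.mkdtemp()
    roots = [os.path.join(base, "dist_a"), os.path.join(base, "dist_b")]
    make_portion(roots[0], "demo_nspkg", "alpha", "from dist_a")
    make_portion(roots[1], "demo_nspkg", "beta", "from dist_b")
    sys.path[:0] = roots  # both portions are now on sys.path
    importlib.invalidate_caches()
    try:
        alpha = importlib.import_module("demo_nspkg.alpha")
        beta = importlib.import_module("demo_nspkg.beta")
        portions = len(sys.modules["demo_nspkg"].__path__)
    finally:
        for root in roots:
            sys.path.remove(root)
    return alpha.VALUE, beta.VALUE, portions


if __name__ == "__main__":
    print(demo())
```

Only the ``__init__.py`` from the first portion on sys.path is executed; the second portion is reachable solely because extend_path appended its directory to ``__path__``.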

    setuptools provides a similar function pkg_resources.declare_namespace
    that is used in the form::

        import pkg_resources
        pkg_resources.declare_namespace(__name__)

    In the portion's __init__.py, no assignment to __path__ is necessary,
    as declare_namespace modifies the package __path__ through sys.modules.
    As a special feature, declare_namespace also supports zip files, and
    registers the package name internally so that future additions to sys.path
    by setuptools can properly add additional portions to each package.

    setuptools allows declaring namespace packages in a distribution's
    setup.py, so that distribution developers don't need to put the
    magic __path__ modification into __init__.py themselves.

    Rationale
    =========

    The current imperative approach to namespace packages has led to
    multiple slightly-incompatible mechanisms for providing namespace
    packages. For example, pkgutil supports ``*.pkg`` files; setuptools
    doesn't. Likewise, setuptools supports inspecting zip files, and
    supports adding portions to its _namespace_packages variable, whereas
    pkgutil doesn't.

    In addition, the current approach causes problems for system vendors.
    Vendor packages typically must not provide overlapping files, and an
    attempt to install a vendor package that has a file already on disk
    will fail or cause unpredictable behavior. As vendors might choose to
    package distributions such that they will end up all in a single
    directory for the namespace package, all portions would contribute
    conflicting __init__.py files.

    Specification
    =============

    Rather than using an imperative mechanism for importing packages, a
    declarative approach is proposed here, as an extension to the existing
    ``*.pkg`` mechanism.

    The import statement is extended so that it directly considers ``*.pkg``
    files during import; a directory is considered a package if it either
    contains a file named __init__.py, or a file whose name ends with
    ".pkg".

    In addition, the format of the ``*.pkg`` file is extended: a line with
    the single character ``*`` indicates that the entire sys.path will
    be searched for portions of the namespace package at the time the
    namespace package is imported.

    Importing a package will immediately compute the package's __path__;
    the ``*.pkg`` files are not considered anymore after the initial import.
    If a ``*.pkg`` file contains an asterisk, this asterisk is prepended
    to the package's __path__ to indicate that the package is a namespace
    package (and that thus further extensions to sys.path might also
    want to extend __path__). At most one such asterisk gets prepended
    to the path.

    extend_path will be extended to recognize namespace packages according
    to this PEP, and avoid adding directories twice to __path__.

    No other change to the importing mechanism is made; searching
    modules (including __init__.py) will continue to stop at the first
    module encountered.
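The rules in this section can be sketched in plain Python. This is only one possible reading of the proposal, not the import machinery's implementation: the function names, the ordering of entries, and the decision to read ``*.pkg`` files only from the first directory found are assumptions made for illustration. Non-asterisk lines are treated as extra directories, as in the existing ``*.pkg`` mechanism.

```python
import os
import tempfile


def pkg_files(directory):
    """Return the sorted *.pkg file paths found directly in directory."""
    names = sorted(n for n in os.listdir(directory) if n.endswith(".pkg"))
    return [os.path.join(directory, n) for n in names]


def is_package_dir(directory):
    """Proposed rule: a package dir holds __init__.py or some *.pkg file."""
    if not os.path.isdir(directory):
        return False
    if os.path.isfile(os.path.join(directory, "__init__.py")):
        return True
    return bool(pkg_files(directory))


def compute_package_path(name, first_dir, search_path):
    """Hypothetical sketch of computing __path__ once, at import time."""
    path = [first_dir]
    has_asterisk = False
    for pkg_file in pkg_files(first_dir):
        with open(pkg_file) as handle:
            for line in handle:
                line = line.strip()
                if not line or line.startswith("#"):
                    continue
                if line == "*":
                    has_asterisk = True
                elif line not in path:
                    path.append(line)  # explicit extra portion
    if has_asterisk:
        # Search the whole search path for further portions.
        for entry in search_path:
            candidate = os.path.join(entry, name)
            if candidate not in path and is_package_dir(candidate):
                path.append(candidate)
        path.insert(0, "*")  # at most one marker is prepended
    return path


def build_demo_layout():
    """Create three search-path entries; only two hold a portion of 'ns'."""
    base = tempfile.mkdtemp()
    roots = [os.path.join(base, n) for n in ("site_a", "site_b", "site_c")]
    for root in roots:
        os.makedirs(os.path.join(root, "ns"))
    for root, dist in zip(roots[:2], ("dist_a", "dist_b")):
        with open(os.path.join(root, "ns", dist + ".pkg"), "w") as handle:
            handle.write("*\n")
    return roots


if __name__ == "__main__":
    demo_roots = build_demo_layout()
    print(compute_package_path("ns", os.path.join(demo_roots[0], "ns"), demo_roots))
```

In the demo layout, ``site_c/ns`` contains neither ``__init__.py`` nor a ``*.pkg`` file, so under the proposed rule it is not a package and contributes no portion.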

    Discussion
    ==========

    With the addition of ``*.pkg`` files to the import mechanism, namespace
    packages can stop filling out the namespace package's __init__.py.
    As a consequence, extend_path and declare_namespace become obsolete.

    It is recommended that distributions put a file <distribution>.pkg
    into their namespace packages, with a single asterisk. This allows
    vendor packages to install multiple portions of a namespace package
    into a single directory, with no risk of overlapping files.
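To make the file-overlap argument concrete, the following sketch compares what two vendor packages would install into one shared directory under today's scheme and under the proposed one. The vendor package names, the namespace ``ns``, and the file lists are invented for illustration only.

```python
from collections import Counter


def overlapping_files(manifests):
    """Return the sorted paths that more than one vendor package installs."""
    counts = Counter()
    for files in manifests.values():
        counts.update(set(files))
    return sorted(path for path, count in counts.items() if count > 1)


# Today: every portion must ship the same ns/__init__.py.
TODAY = {
    "vendor-dist-a": ["ns/__init__.py", "ns/alpha.py"],
    "vendor-dist-b": ["ns/__init__.py", "ns/beta.py"],
}

# Proposed: each distribution ships its own <distribution>.pkg file instead.
PROPOSED = {
    "vendor-dist-a": ["ns/dist_a.pkg", "ns/alpha.py"],
    "vendor-dist-b": ["ns/dist_b.pkg", "ns/beta.py"],
}

if __name__ == "__main__":
    print(overlapping_files(TODAY))
    print(overlapping_files(PROPOSED))
```

With distinct ``*.pkg`` file names, no path is claimed by two vendor packages, which is the property the recommendation is after.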

    Namespace packages can start providing non-trivial __init__.py
    implementations; to do so, it is recommended that a single distribution
    provides a portion with just the namespace package's __init__.py
    (and potentially other modules that belong to the namespace package
    proper).

    The mechanism is mostly compatible with the existing namespace
    mechanisms. extend_path will be adjusted to this specification;
    any other mechanism might cause portions to get added twice to
    __path__.

    Copyright
    =========

    This document has been placed in the public domain.
    Martin v. Löwis, Apr 2, 2009
    #1

  2. Martin v. Löwis

    Carl Banks Guest

    On Apr 2, 8:32 am, "Martin v. Löwis" <> wrote:
    > I propose the following PEP for inclusion to Python 3.1.
    > Please comment.
    >
    > Regards,
    > Martin
    >
    > Abstract
    > ========
    >
    > Namespace packages are a mechanism for splitting a single Python
    > package across multiple directories on disk. In current Python
    > versions, an algorithm to compute the packages __path__ must be
    > formulated. With the enhancement proposed here, the import machinery
    > itself will construct the list of directories that make up the
    > package.


    -0

    My main concern is that we'll start seeing all kinds of packages with
    names like:

    com.dusinc.sarray.ptookkit.v_1_34_beta.btree.BTree

    The current lack of global package namespace effectively prevents
    bureaucratic package naming, which in my mind makes it worth the
    cost. However, I'd be willing to believe this can be kept under
    control some other way.


    Carl Banks
    Carl Banks, Apr 2, 2009
    #2

  3. Martin v. Löwis

    Kay Schluehr Guest

    On 2 Apr., 17:32, "Martin v. Löwis" <> wrote:
    > [full text of the PEP quoted untrimmed]


    Wow. You python-dev guys are really jumping the shark. Isn't your Rube
    Goldberg "import machinery" already complex enough for you?
    Kay Schluehr, Apr 2, 2009
    #3
  4. Martin v. Löwis

    Chris Rebert Guest

    On Thu, Apr 2, 2009 at 11:38 AM, Carl Banks <> wrote:
    > On Apr 2, 8:32 am, "Martin v. Löwis" <> wrote:
    >> I propose the following PEP for inclusion to Python 3.1.
    >> Please comment.
    >>
    >> Regards,
    >> Martin
    >>
    >> Abstract
    >> ========
    >>
    >> Namespace packages are a mechanism for splitting a single Python
    >> package across multiple directories on disk. In current Python
    >> versions, an algorithm to compute the packages __path__ must be
    >> formulated. With the enhancement proposed here, the import machinery
    >> itself will construct the list of directories that make up the
    >> package.

    >
    > -0
    >
    > My main concern is that we'll start seeing all kinds of packages with
    > names like:
    >
    > com.dusinc.sarray.ptookkit.v_1_34_beta.btree.BTree
    >
    > The current lack of global package namespace effectively prevents
    > bureaucratic package naming, which in my mind makes it worth the
    > cost.  However, I'd be willing to believe this can be kept under
    > control some other way.


    Agreed, although I'd be slightly less optimistic on its usage being
    kept under control. It seems this goes a bit against the "Flat is
    better than nested" principle.
    Then again, we also have the "Namespaces are honkingly great"
    principle to contend with as well, so it's definitely a balancing act.

    Cheers,
    Chris

    --
    I have a blog:
    http://blog.rebertia.com
    Chris Rebert, Apr 2, 2009
    #4
  5. Martin v. Löwis

    Guest

    On Apr 2, 5:59 pm, Ben Finney <> wrote:
    > Kay Schluehr <> writes:
    > > Wow. You python-dev guys are really jumping the shark. Isn't your
    > > Rube Goldberg "import machinery" already complex enough for you?

    >
    > Thanks for your constructive criticism, and your considerate quote
    > trimming.

    Ben, you should use google groups. No trimming necessary.
    , Apr 3, 2009
    #5
  6. > -0
    >
    > My main concern is that we'll start seeing all kinds of packages with
    > names like:
    >
    > com.dusinc.sarray.ptookkit.v_1_34_beta.btree.BTree
    >
    > The current lack of global package namespace effectively prevents
    > bureaucratic package naming, which in my mind makes it worth the
    > cost. However, I'd be willing to believe this can be kept under
    > control some other way.


    In principle, people can do this today already. That they are not
    doing it is a good sign.

    I think this bureaucratic naming in Java originates more from an
    explicitly stated policy that people should use such naming than
    from the ability to actually do so easily.

    Regards,
    Martin
    Martin v. Löwis, Apr 4, 2009
    #6
