setuptools without unexpected downloads

Ben Finney

Howdy all,

Setuptools builds on the Python distutils, providing features asked
for by many developers.

One of these is the ability for the 'setup.py' program to declare
package dependencies in a fairly standardised way, and determine if
they're met before attempting to install the package. This is good for
distributing one's software to folks whose operating system is stuck
somewhere in the mid-20th century by not providing such a dependency
system.

However, one of the areas where setuptools clashes with existing
package dependency systems is its default behaviour of *downloading*
the packages declared as dependencies and installing them in its own
way, bypassing the existing package dependency system.

How can I, as the distributor of a package using setuptools, gain the
benefits of dependency declaration and checking, without the drawback
of unexpected and potentially unwanted download and installation?

I know that a user can choose (if they've read the mass of setuptools
documentation) to disallow this download-and-install-dependencies
behaviour with a specific invocation of 'setup.py'. But how can I
disallow this from within the 'setup.py' program, so my users don't
have to be aware of this unexpected default behaviour?
 
Ben Finney

Ben Finney said:
How can I, as the distributor of a package using setuptools, gain
the benefits of dependency declaration and checking, without the
drawback of unexpected and potentially unwanted download and
installation?

To clarify: I want to retain the "assert the specified dependencies
are satisfied" behaviour, without the "... and, if not, download and
install them the Setuptools Way" behaviour.

Instead, I just want the default "dependencies not satisfied"
behaviour for my 'setup.py' program to be: complain the dependencies
aren't met, and refuse to install.

If I can set that default, and allow individual users to specifically
override that and get the "download and install un-met dependencies"
behaviour if they want, all well and good; but that's not necessary.

How can I get this happening with setuptools?
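For concreteness, here is a minimal sketch of the behaviour I'm after
(the module names are just placeholders; a real 'setup.py' would check
its actual declared requirements):

```python
import sys

# Placeholder dependency list; a real setup.py would use its actual
# declared requirements here.
REQUIRED_MODULES = ["json", "sqlite3"]

def missing_dependencies(module_names):
    """Return the names in module_names that cannot be imported."""
    missing = []
    for name in module_names:
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing

unmet = missing_dependencies(REQUIRED_MODULES)
if unmet:
    # Complain and refuse to install; never download anything.
    sys.stderr.write(
        "error: unmet dependencies: %s\n" % ", ".join(unmet))
    sys.exit(1)
```

That is: assert the dependencies, and exit with an error if they are
not met.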
 
Diez B. Roggisch

Ben said:
To clarify: I want to retain the "assert the specified dependencies
are satisfied" behaviour, without the "... and, if not, download and
install them the Setuptools Way" behaviour.

Instead, I just want the default "dependencies not satisfied"
behaviour for my 'setup.py' program to be: complain the dependencies
aren't met, and refuse to install.

The problem here is that your favorite OS vendor/distributor does not
necessarily offer the required meta-information - so setuptools can't
check the dependencies.

In my opinion, Python is heading in a direction like Java with its
classpath: scripts like workingenv and its successor (forgot the
name) provide hand-tailored environments for a specific application.

So maybe you should rather try and bundle your app in a way that it is
self-contained.

Diez
 
Ben Finney

Diez B. Roggisch said:
The problem here is that your favorite OS vendor/distributor does not
necessarily offer the required meta-information - so setuptools
can't check the dependencies.

Let's assume it does. How would I modify my 'setup.py' script so that
its default behaviour, when dependencies are not met, is not "download
and install dependencies via setuptools" but instead "exit with error
message"?
 
Ben Finney

Diez B. Roggisch said:
In my opinion, Python is heading in a direction like Java with its
classpath: scripts like workingenv and its successor (forgot the
name) provide hand-tailored environments for a specific
application.

What a silly waste of resources. So, if fifteen different programs
depend on library X, we'd have fifteen *separate* installations of
library X on the same machine?

And when it comes time to upgrade library X because a security flaw is
discovered, each of the fifteen instances must be upgraded separately?
So maybe you should rather try and bundle your app in a way that it
is self-contained.

That entirely defeats the purpose of having packages declare
dependencies on each other. The whole point of re-usable library code
is to *avoid* having to re-bundle every dependency with every separate
application.
 
Diez B. Roggisch

Ben said:
Let's assume it does. How would I modify my 'setup.py' script so that
its default behaviour, when dependencies are not met, is not "download
and install dependencies via setuptools" but instead "exit with error
message"?

easy_install has the --no-deps command-line argument. I'm not sure if that is
triggerable from inside setup.py - but in the end, you wanted user choice,
didn't you?

Generally speaking, I think the real problem here is the clash
between "cultures" of dependency-handling. But it's certainly beyond
setuptools' scope to cope with every imaginable package management
system out there and provide means to trigger an installation of,
say, Debian packages that are needed.

So - if you really want to go the debian/ubuntu/suse/whatever-way, provide
packages in their respective format, with the dependencies met.

If you are not willing to do that, the self-contained solution at least
offers an option for hassle-free first-time installation, especially
when there is a mixture of dependencies that the OS can and cannot
meet.

For example - what if there is no debian package that provides module XY
in the required version? Do you rather install it into the global
site-packages, or do you rather keep it private to the program requiring
it? I'd say the latter is better in almost all circumstances, as it will
not disrupt the workings of other programs/modules.

Diez
 
Ben Finney

Diez B. Roggisch said:
easy_install has the --no-deps command-line argument.

Thanks for this answer.

I'm not sure if that is triggerable from inside setup.py

As am I.

- but in the end, you wanted user choice, didn't you?

No. In the end, I want the default to be as described above. User
choice is desirable, but not necessary for this requirement.

Generally speaking, I think the real problem here is the clash
between "cultures" of dependency-handling. But it's certainly beyond
setuptools' scope to cope with every imaginable package management
system out there and provide means to trigger an installation of,
say, Debian packages that are needed.

Indeed, and that's beyond the scope of what I'm asking about.
 
Paul Boddie

Generally speaking, I think the real problem here is the clash
between "cultures" of dependency-handling. But it's certainly beyond
setuptools' scope to cope with every imaginable package management
system out there and provide means to trigger an installation of,
say, Debian packages that are needed.

If you look at PEP 345...

http://www.python.org/dev/peps/pep-0345/

...you'll see that the dependency information described is quite close
to how such information is represented in Debian packages and with
other dependency management systems. This isn't an accident because
the authors were surely already familiar with such representations,
which have been around for quite some time. Admittedly, it isn't easy
to make a system which observes the rules of all the different
existing systems; for example, can .deb metadata and .rpm metadata be
interpreted in the same way and be taken to mean the same thing?
However, the argument that a dependency manager cannot deal with
different system packages is a weak one: apt and Smart have shown that
dependency management can be decoupled from package management.

Of course, I've already pointed out that despite being written in
Python, there's apparently no interest in the setuptools community to
look at what Smart manages to do, mostly due to spurious licensing
"concerns", and there's always the "argument zero" from people who
choose to ignore existing dependency management solutions: that
Windows doesn't provide such solutions - which is apparently not
entirely true, either.
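For illustration, compare a PEP 345-style requirement with the
equivalent Debian control field (the package names and versions here
are invented):

```text
# PEP 345-style metadata (PKG-INFO):
Metadata-Version: 1.2
Name: example
Version: 1.0
Requires-Dist: somelib (>=1.2)

# Debian control file:
Package: python-example
Version: 1.0-1
Depends: python-somelib (>= 1.2)
```

The expressive machinery - package name plus a parenthesised version
relation - is nearly identical.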

[...]
For example - what if there is no debian package that provides module XY
in the required version? Do you rather install it into the global
site-packages, or do you rather keep it private to the program requiring
it? I'd say the latter is better in almost all circumstances, as it will
not disrupt the workings of other programs/modules.

For what it's worth, it is possible to use Debian dependency/package
management as a non-root user with a local site-packages directory,
but it isn't particularly elegant. See this proof of concept for
details:

http://www.boddie.org.uk/paul/userinstall.html

It's a fairly heavy solution which installs a lot of the
administrative toolchain just for local package installations, but you
do get dependency integration with the packages providing the
libraries that may be required by various Python extension modules.

Paul
 
Diez B. Roggisch

If you look at PEP 345...
http://www.python.org/dev/peps/pep-0345/

...you'll see that the dependency information described is quite close
to how such information is represented in Debian packages and with
other dependency management systems. This isn't an accident because
the authors were surely already familiar with such representations,
which have been around for quite some time. Admittedly, it isn't easy
to make a system which observes the rules of all the different
existing systems; for example, can .deb metadata and .rpm metadata be
interpreted in the same way and be taken to mean the same thing?
However, the argument that a dependency manager cannot deal with
different system packages is a weak one: apt and Smart have shown that
dependency management can be decoupled from package management.

Do you care to elaborate on how apt has shown that? I use it every day (or
at least often), but I have to admit I never delved too deeply into it -
to me it appears that apt is a retrieval solution, not so much a dependency
management system. The dependencies are declared in the debs themselves
and used by dpkg-* - aren't they? Sure, apt does resolve them, but how is
that decoupled from the underlying .debs?

Regarding smart: what I read here

"""
Smart is not meant as an universal wrapper around different package formats.
It does support RPM, DEB and Slackware packages on a single system, but
won't permit relationships among different package managers. While
cross-packaging system dependencies could be enabled easily, the packaging
policies simply do not exist today.
This is not at all different from what you can already do. In fact, Debian
has been shipping the RPM package manager for a few years now. "Possible"
does not equal "good idea", and everybody should stick to their native
package format.
"""

in the FAQ doesn't make me think that it's just a matter of unwillingness
from the setuptools-people but instead an intrinsic property of
dependency-handling that makes cross-package-management-management (or
meta-management) hard.

Apart from the technical intricacies of .deb/.rpm and their respective
tools, one thing surely makes this an issue: THEY evolve as they like,
and it puts a lot of additional burden on the setuptools guys to keep
track of that.
Of course, I've already pointed out that despite being written in
Python, there's apparently no interest in the setuptools community to
look at what Smart manages to do, mostly due to spurious licensing
"concerns", and there's always the "argument zero" from people who
choose to ignore existing dependency management solutions: that
Windows doesn't provide such solutions - which is apparently not
entirely true, either.

[...]
For example - what if there is no debian package that provides module XY
in the required version? Do you rather install it into the global
site-packages, or do you rather keep it private to the program requiring
it? I'd say the latter is better in almost all circumstances, as it will
not disrupt the workings of other programs/modules.

For what it's worth, it is possible to use Debian dependency/package
management as a non-root user with a local site-packages directory,
but it isn't particularly elegant. See this proof of concept for
details:

http://www.boddie.org.uk/paul/userinstall.html

It's a fairly heavy solution which installs a lot of the
administrative toolchain just for local package installations, but you
do get dependency integration with the packages providing the
libraries that may be required by various Python extension modules.

Certainly a nice solution to a real problem that might be handy for me at
some time. Yet I fail to see how that relates to the above question: if the
OS package repository fails to meet a certain version requirement, how do
you deal with that - installation local to the product you're actually
interested in installing, or in a more public way that possibly interferes
with other software?

Diez
 
Paul Boddie

[Quoting me...]
Do you care to elaborate on how apt has shown that? I use it every day (or
at least often), but I have to admit I never delved too deeply into it -
to me it appears that apt is a retrieval solution, not so much a dependency
management system. The dependencies are declared in the debs themselves
and used by dpkg-* - aren't they? Sure, apt does resolve them, but how is
that decoupled from the underlying .debs?

Perhaps I was thinking more about the whole apt vs. apt-rpm situation,
since apt-rpm supposedly works with RPMs, albeit in a different
distribution of the software.
Regarding smart: what I read here

"""
Smart is not meant as an universal wrapper around different package formats.
It does support RPM, DEB and Slackware packages on a single system, but
won't permit relationships among different package managers. While
cross-packaging system dependencies could be enabled easily, the packaging
policies simply do not exist today.
This is not at all different from what you can already do. In fact, Debian
has been shipping the RPM package manager for a few years now. "Possible"
does not equal "good idea", and everybody should stick to their native
package format.
"""

in the FAQ doesn't make me think that it's just a matter of unwillingness
from the setuptools-people but instead an intrinsic property of
dependency-handling that makes cross-package-management-management (or
meta-management) hard.

I think the argument is that you use your own system's package format,
but smart is supposed to resolve the dependencies expressed in the
packages themselves. There are also universal package formats, but I think
these usually leave some people disappointed, partly because you then
have to consider all the different dependency representations and the
inevitable integration with genuine system packages. I guess this is
why dependency issues were left underspecified in the PEP.
Apart from the technical intricacies of .deb/.rpm and their respective
tools, one thing surely makes this an issue: THEY evolve as they like,
and it puts a lot of additional burden on the setuptools guys to keep
track of that.

I think most of the evolution has been in the surrounding tools,
although stuff like the new Debian Python policy could be complicating
factors. But I don't think the dependency stuff has changed that much
over the years.

[...]
[http://www.boddie.org.uk/paul/userinstall.html]

Certainly a nice solution to a real problem that might be handy for me at
some time. Yet I fail to see how that relates to the above question: if the
OS package repository fails to meet a certain version requirement, how do
you deal with that - installation local to the product you're actually
interested in installing, or in a more public way that possibly interferes
with other software?

My response here was mostly addressing the "global site-packages"
issue since that's usually a big reason for people abandoning the
system package/dependency management. If you can't find a new-enough
system package, you have to either choose a local "from source"
installation (which I would regard as a temporary measure for reasons
given elsewhere with respect to maintenance), or to choose to
repackage the upstream code and then install it through the system
package manager, which I claim can be achieved in a non-global
fashion.

Paul
 
Diez B. Roggisch

I think most of the evolution has been in the surrounding tools,
although stuff like the new Debian Python policy could be complicating
factors. But I don't think the dependency stuff has changed that much
over the years.

It might be, yet one thing is for sure: there have been various times in
Debian over the last few years when, for the sake of their own migration
paths to e.g. newer GCC versions and the like, a lot of seemingly "crude"
packages appeared that catered to those needs. So it's not only about the
package format; one also has to take the actual distribution, and even its
version, into consideration... seems daunting to me!
My response here was mostly addressing the "global site-packages"
issue since that's usually a big reason for people abandoning the
system package/dependency management. If you can't find a new-enough
system package, you have to either choose a local "from source"
installation (which I would regard as a temporary measure for reasons
given elsewhere with respect to maintenance), or to choose to
repackage the upstream code and then install it through the system
package manager, which I claim can be achieved in a non-global
fashion.

Do I understand correctly that you're essentially saying: if you want
your software released for a certain distro, package it up for it the way
it's supposed to be? I can understand that, and said so myself - but then
the whole setuptools debate has come to an end.

Diez
 
Paul Boddie

Do I understand correctly that you're essentially saying: if you want
your software released for a certain distro, package it up for it the way
it's supposed to be? I can understand that, and said so myself - but then
the whole setuptools debate has come to an end.

Yes, my preference is to install software as native packages, although
I'm generally spoilt by the selection of packages available and the
availability of packaging for current and previous versions of a lot
of Python packages for my system. One problem with promoting
system/native packages is that it can be difficult for people not using the
target platform to actually build the packages themselves, thus
requiring some kind of package maintainer role to be filled by another
person, but generally any demand on a given system for a package leads
to that role being filled by a community member fairly quickly. And
there are some existing semi-automated solutions, too, such as stdeb
and the distutils bdist_rpm option.

Paul

P.S. Of course, the package maintainer problem manifests itself most
prominently on Windows where you often see people asking for pre-built
packages or installers.
 
Fredrik Lundh

Paul said:
P.S. Of course, the package maintainer problem manifests itself most
prominently on Windows where you often see people asking for pre-built
packages or installers.

for the record, I'd love to see a group of volunteers doing stuff like
this for Windows. there are plenty of volunteers that cover all major
Linux/*BSD distributions (tons of thanks to everyone involved in this!),
but as far as I can remember, nobody has ever volunteered to do the same
for Windows.

</F>
 
Paul Boddie

for the record, I'd love to see a group of volunteers doing stuff like
this for Windows. there are plenty of volunteers that cover all major
Linux/*BSD distributions (tons of thanks to everyone involved in this!),
but as far as I can remember, nobody has ever volunteered to do the same
for Windows.

Steve Holden had a certain amount of enthusiasm for this, I believe,
and I note that there's some infrastructure for automatic snapshot
builds of Python for Windows now. Perhaps there's some cross-over with
the buildbot operators, too.

Paul
 
Steve Holden

Fredrik said:
for the record, I'd love to see a group of volunteers doing stuff like
this for Windows. there are plenty of volunteers that cover all major
Linux/*BSD distributions (tons of thanks to everyone involved in this!),
but as far as I can remember, nobody has ever volunteered to do the same
for Windows.
I'd like to see something like this happen, too, and if a group of
volunteers emerges I'll do what I can through the PSF to provide
resources. Activities that benefit the whole community (or a large part
of it) are, IMHO, well worth supporting.

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden

Sorry, the dog ate my .sigline
 
Steve Holden

Ben said:
What a silly waste of resources. So, if fifteen different programs
depend on library X, we'd have fifteen *separate* installations of
library X on the same machine?
You need to get your opinions up to date. Fifteen copies of a single
library is nothing in terms of the code bloat that has happened over the
last forty years, and most of that bloat is for programmer convenience,
either in package development or distribution.

While it's all very well to say that there should only ever be one true
version of a library, this requires that developers constrain themselves
(sometimes in ways they consider unreasonable) to backwards
compatibility for the entire lifetime of their code.
And when it comes time to upgrade library X because a security flaw is
discovered, each of the fifteen instances must be upgraded separately?
Yes. Better than upgrading a single library shared between fifteen
applications and having two of them break.
That entirely defeats the purpose of having packages declare
dependencies on each other. The whole point of re-usable library code
is to *avoid* having to re-bundle every dependency with every separate
application.
Agreed, but until we reach the ideal situation where everybody is using
the same package dependency system, what's your practical solution?
"Self-contained" has the merit that nobody else's changes are going to
bugger about with my application on a customer's system. The extra disk
space is a small price to pay for that guarantee.

regards
Steve
 
Diez B. Roggisch

Agreed, but until we reach the ideal situation where everybody is using
the same package dependency system what's your practical solution?
"Self-contained" has the merit that nobody else's changes are going to
bugger about with my application on a customer's system. The extra disk
space is a small price to pay for that guarantee.

Couldn't have summed that up better. As much as I loathe Java for a variety
of reasons, and as much as its CLASSPATH issues suck, especially in larger
projects that have _interproject-dependency-troubles_ (ever felt the joy of
needing several XML parsers to be available in your AppServer?) - the fact
that one can, and has to, compose the CLASSPATH explicitly for each app is
often an advantage. And I don't mind the extra disk space - that's usually
eaten by media files of dubious origins, not by _code_ that someone has
written & packaged up...

Diez
 
kyosohma

I'd like to see something like this happen, too, and if a group of
volunteers emerges I'll do what I can through the PSF to provide
resources. Activities that benefit the whole community (or a large part
of it) are, IMHO, well worth supporting.


What would it entail to do this? Using py2exe + some installer (like
Inno Setup) to create an installer that basically copies/installs the
files into the site-packages folder or wherever the user chooses? If
that's all it is, I would think it would be fairly easy to create
these. Maybe I'm over-simplifying it, though.

What are some examples of packages that need this?

Mike
 
Steve Holden

What would it entail to do this? Using py2exe + some installer (like
Inno Setup) to create an installer that basically copies/installs the
files into the site-packages folder or wherever the user chooses? If
that's all it is, I would think it would be fairly easy to create
these. Maybe I'm over-simplifying it though.

What are some examples of packages that need this?
MySQLdb and psycopg are two obvious examples I have had to grub around
for, or produce my own installers for. There's generally some configuration
work to do for packages that have been produced without considering
Windows requirements, and ideally this will be fed back to the developers.

I think you may be oversimplifying a little. Pure Python packages aren't
too problematic, it's mostly the extension modules. Unless a copy of
Visual Studio is available (and we *might* get some cooperation from
Microsoft there) that means resorting to MinGW, which isn't an easy
environment to play with (in my occasional experience, anyway).

There's going to be increasing demand for 64-bit implementations too.

regards
Steve
 
Ben Finney

Steve Holden said:
You need to get your opinions up to date. Fifteen copies of a single
library is nothing in terms of the code bloat that has happened over
the last forty years, and most of that bloat is for programmer
convenience, either in package development or distribution.

Thanks for the straw man, but I'll decline.

The issue with multiple copies of the same library is *not* disk
storage bloat, but violation of DRY: Don't Repeat Yourself. When the
above putative library X is found to have a bug and needs to be
upgraded to fix, it's wasteful of resources to have to track down
*every* instance of that library, in all the places where separate
packages expect to find it — and it's a recipe for omitting one or
several of those multiple instances, thus getting hard-to-track errors
in code that *should* exist in one place on the system.
While it's all very well to say that there should only ever be one
true version of a library this requires that developers constrain
themselves (sometimes in ways they consider unreasonable) to
backwards compatibility for the entire lifetime of their code.

Not at all. If the package management system is designed well — like
in most modern operating systems — new versions can be installed, and
even relied upon by multiple packages, while the old one stays on the
system for those packages that still haven't migrated to the new. The
foolish part is to have multiple copies of the *same* code scattered
around the system, with different packages looking for the same code
in different places.
Yes. Better than upgrading a single library shared between fifteen
applications and having two of them break.

Then allow the old library to stay on the system, in *one* single,
known location, for the applications that need it; and the new library
in *its* single known location for the applications that use it.
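Setuptools' own pkg_resources machinery is in fact built around this
idea: each application asks for the version range it needs, and the
matching installed distribution is resolved from its known location. A
minimal sketch (using setuptools itself as the example distribution,
since that is the one thing we know is installed; a real application
would name its actual dependency):

```python
import pkg_resources

# Ask for whichever installed distribution satisfies the requirement;
# a real application would write e.g.
# pkg_resources.require("libX>=1.0,<2.0") to pin the API it expects.
dists = pkg_resources.require("setuptools")
active = dists[0]
print(active.project_name, active.version)
```

The point is that the *requirement* lives with the application, while
each library version lives in one known place on the system.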
Agreed, but until we reach the ideal situation where everybody is
using the same package dependency system what's your practical
solution?

The above benefits don't require "everybody is using the same package
dependency system". They accrue one machine at a time, by making *that
machine* less prone to the redundancy problems mentioned above.

They also are incremental, and are beneficial to move towards; that
is, even if the machine is using several package management systems,
any movement towards unifying the package management is a reduction in
duplication and hence a reduction in the associated problems.
"Self-contained" has the merit that nobody else's changes are going
to bugger about with my application on a customer's system.

That's nice for you, the programmer. Package management and proper
dependency declaration is a benefit to the customer (the one who owns
the machine, not the programmer) since they can upgrade each package
*once* per machine, and not have to hunt down all the places where a
previous version may still be installed simply because it was
convenient for the programmer.

Yes, this requires good management of APIs for libraries on the part
of the programmers of those libraries, and ensuring that a new version
with the same API maintains its expected behaviour. It also requires
that the programmer depending on those libraries declares dependencies
properly. It's working fine for many operating systems — even in the
absence of a universal package management system.
The extra disk space is a small price to pay for that guarantee.

Hopefully you can discard this straw man argument and engage the
actual problems of redundant installations of identical code packages.
 
