eggs considered harmful

Harry George

...at least around here.

I run a corporate Open Source Software Toolkit, which makes hundreds
of libraries and apps available to thousands of technical employees.
The rules are that a) a very few authorized downloaders obtain
tarballs and put them in a depot and b) other users get tarballs from
the depot and build from source.

Historically, python packages played well in this context. Install
was a simple download, untar, setup.py build/install.

Eggs, along with other setuptools-inspired install processes, break this
paradigm. The tarballs are incomplete in the first place. The builds
sometimes wander off to the internet looking for more downloads. The
installs sometimes wander off to the internet looking for
compatibility conditions. (Or rather, they try to do so and fail
because I don't let them through the firewall.)

These are unacceptable behaviors. I am therefore dropping ZODB3, and
am considering dropping TurboGears and ZSI. If the egg paradigm
spreads, yet more packages will be dropped (or will never get a chance
to compete for addition).

I've asked before, and I'll ask again: If you are doing a Python
project, please make a self-sufficient tarball available as well. You
can have dependencies, as long as they are documented and can be
obtained by separate manual download.

Thanks for listening.
 
John J. Lee

Harry George said:
These are unacceptable behaviors. I am therefore dropping ZODB3, and
am considering dropping TurboGears and ZSI. If the egg paradigm
spreads, yet more packages will be dropped (or will never get a chance
to compete for addition).

I've asked before, and I'll ask again: If you are doing a Python
project, please make a self-sufficient tarball available as well. You
can have dependencies, as long as they are documented and can be
obtained by separate manual download.

1. Given the presumptuous tone of your own message, I guess I'm not in
danger of coming across as more rude than you when I point out that
your requirements are just that: your own. The rest of the world
won't *always* bend over backwards to support just exactly what you'd
most prefer.

2. You can run your own private egg repository. IIRC, it's as simple
as a directory of eggs and a plain old web server with directory
listings turned on. You then run easy_install -f URL package_name
instead of easy_install package_name . The distutils-sig archives
will have more on this.

3. Alternatively, you could create bundled packages that include
dependencies (perhaps zc.buildout can do that for you, even? not sure)
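As a rough illustration of point 2, a directory of eggs or tarballs can be exposed with directory listings using only the Python standard library. This is a sketch, not something from the thread: `serve_depot` and `find_links_url` are hypothetical helper names, and `http.server` is the modern module name (the equivalent at the time was `SimpleHTTPServer`).

```python
import functools
import http.server
import threading


def serve_depot(directory, port=0):
    """Serve `directory` over HTTP with automatic directory listings.

    port=0 lets the operating system pick a free port. The server runs
    in a daemon thread and is returned so the caller can shut it down.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(directory)
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def find_links_url(server):
    """Build the URL that would be handed to `easy_install -f`."""
    host, port = server.server_address[:2]
    return f"http://{host}:{port}/"
```

With the server running, the command from point 2 would be `easy_install -f <that URL> package_name`. A production depot would more likely sit behind an existing web server; the point is only that a plain listing is all that is needed.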


John
 
Robert Kern

Harry said:
...at least around here.

I run a corporate Open Source Software Toolkit, which makes hundreds
of libraries and apps available to thousands of technical employees.
The rules are that a) a very few authorized downloaders obtain
tarballs and put them in a depot and b) other users get tarballs from
the depot and build from source.

Historically, python packages played well in this context. Install
was a simple download, untar, setup.py build/install.

Eggs, along with other setuptools-inspired install processes, break this
paradigm. The tarballs are incomplete in the first place. The builds
sometimes wander off to the internet looking for more downloads. The
installs sometimes wander off to the internet looking for
compatibility conditions. (Or rather, they try to do so and fail
because I don't let them through the firewall.)

Have you considered establishing a policy that these setuptools-using packages
should be installed using the --single-version-externally-managed option to the
install command? This does not check for dependencies.
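A depot build script could apply that policy uniformly. The sketch below only assembles the command line; `externally_managed_install_cmd` and the record file name are hypothetical. One detail that is not in the post: setuptools refuses this option unless `--record` or `--root` is also given, so the sketch always passes `--record`.

```python
import sys


def externally_managed_install_cmd(record_file="installed-files.txt", prefix=None):
    """Return the argv for a setuptools install that skips dependency handling.

    --single-version-externally-managed must be accompanied by --record
    (or --root), so a record file is always included here.
    """
    cmd = [
        sys.executable, "setup.py", "install",
        "--single-version-externally-managed",
        "--record", record_file,
    ]
    if prefix is not None:
        # Optional install prefix, e.g. a shared toolkit location.
        cmd += ["--prefix", prefix]
    return cmd
```

The resulting list would be run from the unpacked source directory, for example with `subprocess.check_call(cmd, cwd=source_dir)`.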

Alternatively, you can provide a company repository of the tarballs and their
dependencies' tarballs. Your users can use the easy_install option --find-links to
point to that URL so that they do not have to go outside of the firewall to
install everything.

These are unacceptable behaviors. I am therefore dropping ZODB3, and
am considering dropping TurboGears and ZSI. If the egg paradigm
spreads, yet more packages will be dropped (or will never get a chance
to compete for addition).

I'm sorry to hear that.

I've asked before, and I'll ask again: If you are doing a Python
project, please make a self-sufficient tarball available as well. You
can have dependencies, as long as they are documented and can be
obtained by separate manual download.

Given the options I outlined above, you can easily satisfy these requirements
for the vast majority of setuptools-using packages that are out there. There are
a handful of packages that only distribute the eggs and not the source tarballs,
but those are rare.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
 
Ben Finney

Harry George said:
Historically, python packages played well in this context. Install
was a simple download, untar, setup.py build/install.

Eggs, along with other setuptools-inspired install processes, break this
paradigm. The tarballs are incomplete in the first place. The builds
sometimes wander off to the internet looking for more downloads. The
installs sometimes wander off to the internet looking for
compatibility conditions. (Or rather, they try to do so and fail
because I don't let them through the firewall.)

If you provide the build and install script with all the dependencies
already present (in the current directory), my experience is that
setuptools does not do any network actions.
 
Harry George

1. Given the presumptuous tone of your own message, I guess I'm not in
danger of coming across as more rude than you when I point out that
your requirements are just that: your own. The rest of the world
won't *always* bend over backwards to support just exactly what you'd
most prefer.

You deleted the "...at least here", which was intended to make clear I
was NOT speaking for the world at large, though possibly for a large
chunk of corporate life. Also, this wasn't out of the blue. I have
previously discussed this with several development teams privately,
but the trend appears to be accelerating.

2. You can run your own private egg repository. IIRC, it's as simple
as a directory of eggs and a plain old web server with directory
listings turned on. You then run easy_install -f URL package_name
instead of easy_install package_name . The distutils-sig archives
will have more on this.

Again, not speaking for anyone else: With 500 OSS packages, all of
which play by the same tarball rules, we don't have resources to
handle eggs differently.

3. Alternatively, you could create bundled packages that include
dependencies (perhaps zc.buildout can do that for you, even? not sure)

No resources for special handling.
 
Harry George

Robert Kern said:
Have you considered establishing a policy that these setuptools-using packages
should be installed using the --single-version-externally-managed option to the
install command? This does not check for dependencies.

I didn't know that one. I'll try it. Thanks.

Alternatively, you can provide a company repository of the tarballs and their
dependencies' tarballs. Your users can use the easy_install option --find-links to
point to that URL so that they do not have to go outside of the firewall to
install everything.

This is a possibility. The tarballs can be seen in a directory
listing. They are in different subdirs (for different "bundles" of
functionality), so I'll need -f to look several places.

I'm sorry to hear that.

Me too. We worked long and hard to get Python established as a
standard language for corporate systems development, we have a host of
projects that need ZSI, and I look forward to making further inroads
into C++, Java, and VB development camps. Didn't really need a
roadblock at this point.

Given the options I outlined above, you can easily satisfy these requirements
for the vast majority of setuptools-using packages that are out there. There are
a handful of packages that only distribute the eggs and not the source tarballs,
but those are rare.

I agree pure eggs are rare. The fact that they increased this past
quarter was what concerned me. ZODB even looks like a normal tarball,
builds ok, but uses an easy-install-style lookup during install.
 
Harry George

Ben Finney said:
If you provide the build and install script with all the dependencies
already present (in the current directory), my experience is that
setuptools does not do any network actions.


Thanks for the idea. It doesn't work so well in our context, since
many dependencies are installed long before a particular egg is
attempted.

We need to know the dependencies, install them in dependency order,
and expect the next package to find them. "configure" does this for
hundreds of packages. cmake, scons, and others also tackle this
problem. Python's old setup.py seems to be able to do it.
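The "install them in dependency order" step can be sketched as a topological sort. Only the ZSI-needs-PyXML edge below comes from this thread; `report_tool` and `chart_lib` are made-up package names, and `install_order` is a hypothetical helper. It assumes the depot already records each package's prerequisites.

```python
from graphlib import TopologicalSorter

# Package -> set of packages that must already be installed.
DEPENDS_ON = {
    "PyXML": set(),
    "ZSI": {"PyXML"},                     # the dependency discussed in this thread
    "chart_lib": set(),                   # hypothetical
    "report_tool": {"ZSI", "chart_lib"},  # hypothetical
}


def install_order(depends_on):
    """Return the packages in an order where prerequisites come first.

    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(depends_on).static_order())
```

Each package in the returned list can then be built from its tarball in turn, on the expectation that everything it needs is already installed.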

However, as I understand it, setuptools can't detect previously
installed python packages if they were not installed via eggs. Thus,
my ZSI install was failing on "PyXML>=8.3", even though PyXML 8.4 is
installed. I can't afford to drag copies of all the dependent source
tarballs into an egg's currdir just so it can find them. (We have 6 GB
of tarballs -- who knows how much untarred source that would be.)
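The distinction behind that failure can be shown in a few lines: a module may be importable while no installed-distribution metadata exists for it, and a metadata-based check then reports it as missing. This is only a sketch of the idea. It uses the modern `importlib.metadata` module as a stand-in for the lookup setuptools performed at the time, and both function names are hypothetical.

```python
import importlib.util
from importlib import metadata


def is_importable(module_name):
    """True if `import module_name` would find something on sys.path."""
    return importlib.util.find_spec(module_name) is not None


def recorded_version(project_name):
    """Version taken from installed distribution metadata, or None.

    A package that was copied into place without any metadata is
    invisible here, even though it can be imported.
    """
    try:
        return metadata.version(project_name)
    except metadata.PackageNotFoundError:
        return None
```

A requirement check built on `recorded_version` fails for a package where `is_importable` succeeds, which matches the symptom described above.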

I just found hints that you should not attempt to install ZSI from a
tarball, but should rather install from an egg. So I was able to
install ZSI for py2.4.

Unfortunately, that means I would have to carry
python-version-dependent renditions of every egg. We have people
running on py23, py24, and py25, thus tripling the number of
tarballs/eggs to manage. This is the very reason we went to a
*source* based repository.
 
Robert Kern

Harry said:
We need to know the dependencies, install them in dependency order,
and expect the next package to find them. "configure" does this for
hundreds of packages. cmake, scons, and others also tackle this
problem. Python's old setup.py seems to be able to do it.

No, generic setup.py scripts don't do anything of that kind.

--
Robert Kern

 
Harry George

Robert Kern said:
No, generic setup.py scripts don't do anything of that kind.

Ok, setup.py itself may not do the work, but from the end users'
perspective it works that way. Setup.py runs a configure and a make,
which in turn find the right already-installed libraries. The point
is, setup.py plays well in such an environment.
 
Christopher Arndt

I've asked before, and I'll ask again: If you are doing a Python
project, please make a self-sufficient tarball available as well.

Almost all projects I know of that provide eggs also have a CVS or
SVN repository. Just download a tagged release and then use "python

You can have dependencies, as long as they are documented and can be
obtained by separate manual download.

Eggs document dependencies better (i.e. with version numbers) than most
other projects do, through the "install_requires" argument to the
"setup()" call in "setup.py". In an egg, this list is found in
*-egg-info/requires.txt.

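For a depot maintainer, that file is easy to read without installing anything. The sketch below assumes the usual layout, one requirement per line with optional `[extra]` section headers; `parse_requires_txt` is a hypothetical helper and the sample requirement names in the usage note are invented.

```python
def parse_requires_txt(text):
    """Group requirement lines by section.

    The key "" holds unconditional requirements; any other key is the
    name of a bracketed section such as "[docs]".
    """
    sections = {"": []}
    current = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections.setdefault(current, [])
        else:
            sections[current].append(line)
    return sections
```

Given the text `ExampleLib>=1.2` followed by a `[docs]` section containing `DocTool`, the function returns the first under `""` and the second under `"docs"`, which is enough to list what must be fetched into the depot by hand.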
Ok, setup.py itself may not do the work, but from the end users'
perspective it works that way. Setup.py runs a configure and a make,
which in turn find the right already-installed libraries. The point
is, setup.py plays well in such an environment.

Configure etc. may be able to detect an installed version number of a
package/module because they include scripts to check for those. IMHO
it's silly to place the burden for checking for version numbers on the
developer who wants to distribute an app. The package/module should
provide a standard way to query the version number itself. This is
exactly one of the things that setuptools is about.

Chris
 
John J. Lee

Harry George said:
John J. Lee writes: [...]
2. You can run your own private egg repository. IIRC, it's as simple
as a directory of eggs and a plain old web server with directory
listings turned on. You then run easy_install -f URL package_name
instead of easy_install package_name . The distutils-sig archives
will have more on this.

Again, not speaking for anyone else: With 500 OSS packages, all of
which play by the same tarball rules, we don't have resources to
handle eggs differently.

You said earlier:
The rules are that a) a very few authorized downloaders obtain
tarballs and put them in a depot and b) other users get tarballs from
the depot and build from source.

Not sure how this differs significantly from "running a repository",
in the sense I use it above.


John
 
John J. Lee

Harry George said:
Ok, setup.py itself may not do the work, but from the end users'
perspective it works that way. Setup.py runs a configure and a make,
which in turn find the right already-installed libraries. The point
is, setup.py plays well in such an environment.

Some setuptools-based packages do that. Some pure-distutils packages
do that.

Some setuptools-based packages don't do that. Some pure-distutils
packages don't do that.

Regardless of that logic, it's true that, pragmatically, setuptools'
support for dependency resolution encourages an increasing reliance on
explicitly-declared dependencies on *Python* projects ("project" here
meaning something with a setup.py -- a project may contain several
Python packages / modules, and several projects may provide (parts of)
a single Python package, as with zope.*). However, pre-setuptools,
one rarely saw Python packages being discovered using autotools
(configure &c.). *System* features (presence/absence of libraries,
&c.) were indeed discovered that way. I don't know of any Python
projects that previously used autotools and stopped doing so as part
of a switch to setuptools.


John
 
Fuzzyman

...at least around here.

I run a corporate Open Source Software Toolkit, which makes hundreds
of libraries and apps available to thousands of technical employees.
The rules are that a) a very few authorized downloaders obtain
tarballs and put them in a depot and b) other users get tarballs from
the depot and build from source.

Historically, python packages played well in this context. Install
was a simple download, untar, setup.py build/install.

Eggs, along with other setuptools-inspired install processes, break this
paradigm. The tarballs are incomplete in the first place. The builds
sometimes wander off to the internet looking for more downloads. The
installs sometimes wander off to the internet looking for
compatibility conditions. (Or rather, they try to do so and fail
because I don't let them through the firewall.)


I understand your situation and I have some misgivings myself. It
reminds me of the time when I worked in a 'corporate environment' and
I was trying to install a Perl application to get round the internet
blocking.

The application (localproxy - very good) was *intended* to be
installed via CPAN for tracking requirements - which didn't work
behind our proxy firewall. Although the project author (a very
technical guy) knew the direct dependencies, some of these had
dependencies. He *didn't know* the full dependency set for his
project.

Eventually, through trial and error (and a lot of help from the
author) I was able to get it working. But it was painful.

My guess is that a lot of the world's computers are behind firewalls
or proxies that preclude automatic dependency resolution.

*However*, there is a very good reason why setuptools and eggs are
gaining in popularity (and will continue to do so). For the majority
of users eggs are just *so damned convenient*. Being able to do
``easy_install some_project`` and have it just work is fantastic.

There are probably ways round this. For most non-esoteric eggs it
should be possible to create an ordinary installation tarball from an
egg. If you do easy_install of a project into a bare Python
installation (a VM instance for example) then you should be able to
see which dependencies are fetched.

If this is too much then I fear that you may be SOL...

Fuzzyman
http://www.voidspace.org.uk/ironpython/index.shtml
 
John J. Lee

Harry George said:
This is a possibility. The tarballs can be seen in a directory
listing. They are in different subdirs (for different "bundles" of
functionality), so I'll need -f to look several places.

One possibility here is to have a script maintain symlinks (or have it
otherwise appropriately configure a web server).
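Such a script might look like the following. It is a sketch under assumptions: the depot is a tree of bundle subdirectories containing tarballs, the platform supports symlinks, and `flatten_depot` plus the suffix list are hypothetical choices rather than anything described in the thread.

```python
import os
from pathlib import Path


def flatten_depot(depot_root, flat_dir, suffixes=(".tar.gz", ".tgz", ".zip")):
    """Symlink every archive under depot_root into the single flat_dir.

    Returns the names that were newly linked. If two bundles contain an
    archive with the same name, the first one found is kept.
    """
    depot_root = Path(depot_root).resolve()
    flat_dir = Path(flat_dir).resolve()
    flat_dir.mkdir(parents=True, exist_ok=True)
    linked = []
    for path in sorted(depot_root.rglob("*")):
        if not path.is_file() or not path.name.endswith(suffixes):
            continue
        if flat_dir in path.parents:
            continue  # do not re-link entries of the flat directory itself
        link = flat_dir / path.name
        if link.is_symlink() or link.exists():
            continue  # already linked on an earlier run, or a name clash
        os.symlink(path, link)
        linked.append(link.name)
    return linked
```

Run after each depot refresh, this keeps one directory that a single -f option can point at while the bundle layout stays untouched.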

[...]
I agree pure eggs are rare. The fact that they increased this past
quarter was what concerned me. ZODB even looks like a normal tarball,
builds ok, but uses an easy-install-style lookup during install.

All setuptools-based packages work this way: they have a setup.py that
(roughly) imports the setup function from setuptools rather than
distutils.


John
 
Paul Boddie

Fuzzyman said:
I understand your situation and I have some misgivings myself. It
reminds me of the time when I worked in a 'corporate environment' and
I was trying to install a Perl application to get round the internet
blocking.

The application (localproxy - very good) was *intended* to be
installed via CPAN for tracking requirements - which didn't work
behind our proxy firewall.

Sounds like an "interesting" bootstrapping issue to me.

[...]
My guess is that a lot of the world's computers are behind firewalls
or proxies that preclude automatic dependency resolution.

I'd argue that mechanisms already exist for automatic upgrades even in
restricted environments, and we're not always talking about "big
bucks" corporate solutions, either. Indeed, the more established
GNU/Linux distributions seem to have had the required flexibility of
dependency resolution *and* not requiring an "always on" connection to
the Internet for quite some time - for obvious reasons if you consider
how long they've been going.

*However*, there is a very good reason why setuptools and eggs are
gaining in popularity (and will continue to do so). For the majority
of users eggs are just *so damned convenient*. Being able to do
``easy_install some_project`` and have it just work is fantastic.

Sure. But being able to install any software (not just eggs via the
Package Index, or Perl software via CPAN, or...) with dependency
resolution isn't alien to a lot of people. Again, it's time to look at
established practice rather than pretend it doesn't exist:

http://mail.python.org/pipermail/python-dev/2006-November/070101.html

Paul
 
Harry George

Harry George said:
John J. Lee writes: [...]
2. You can run your own private egg repository. IIRC, it's as simple
as a directory of eggs and a plain old web server with directory
listings turned on. You then run easy_install -f URL package_name
instead of easy_install package_name . The distutils-sig archives
will have more on this.

Again, not speaking for anyone else: With 500 OSS packages, all of
which play by the same tarball rules, we don't have resources to
handle eggs differently.

You said earlier:
The rules are that a) a very few authorized downloaders obtain
tarballs and put them in a depot and b) other users get tarballs from
the depot and build from source.

Not sure how this differs significantly from "running a repository",
in the sense I use it above.


John

Significant differences:

"depot": Place(s) where tarballs can be stored, and can then be
reached via http.

"private egg repository": Tuned to the needs of Python eggs. E.g.,
not scattered over several directories or several versions.

Thus a depot of self-contained packages can handle:

1. Multiple "releases" of the depot live at the same time.

2. Packages factored into CD-sized directories (not all in one "-f" location)

3. Multiple versions of Python, without having a new egg for each.


4. Multiple target platforms. Various *NIX and MS Win and Mac systems
-- each at their own OS versions and own compiler versions. All
without having platform-specific and compiler-specific eggs.

5. Different package version selections based on compatibility with
other (non-Python) packages. E.g., to tune for GIS systems vs 3D
animation systems vs numerical analysis systems vs web server systems.

6. Refresh process which does not need to fiddle with egg-ness, or
even know about Python. Everything is a tarball.
 
Robert Kern

Harry said:
John J. Lee writes:

Significant differences:

"depot": Place(s) where tarballs can be stored, and can then be
reached via http.

"private egg repository": Tuned to the needs of Python eggs. E.g.,
not scattered over several directories or several versions.

Please note that easy_install can use source tarballs, too.

Thus a depot of self-contained packages can handle:

1. Multiple "releases" of the depot live at the same time.

I'm not sure how this is relevant.

2. Packages factored into CD-sized directories (not all in one "-f" location)

Of course, you can specify multiple locations for easy_install to find packages.
You can store these in your ~/.pydistutils.cfg file so you never have to type
them on the command line.

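The configuration file could be generated from the list of bundle directories. The sketch below writes an `[easy_install]` section with a multi-line `find_links` value, which is the form I understand easy_install to read; the URLs in the usage note are placeholders and `easy_install_config` is a hypothetical helper.

```python
import configparser
import io


def easy_install_config(find_links):
    """Render an [easy_install] section listing several find-links URLs."""
    # Interpolation is disabled so that a "%" in a URL is kept verbatim.
    parser = configparser.ConfigParser(interpolation=None)
    parser["easy_install"] = {"find_links": "\n".join(find_links)}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()
```

Calling it with, say, `["http://depot.example.com/bundle_a/", "http://depot.example.com/bundle_b/"]` yields text that can be saved as ~/.pydistutils.cfg, so each user points at every bundle directory without typing any -f options.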
3. Multiple versions of Python, without having a new egg for each.

4. Multiple target platforms. Various *NIX and MS Win and Mac systems
-- each at their own OS versions and own compiler versions. All
without having platform-specific and compiler-specific eggs.

5. Different package version selections based on compatibility with
other (non-Python) packages. E.g., to tune for GIS systems vs 3D
animation systems vs numerical analysis systems vs web server systems.

6. Refresh process which does not need to fiddle with egg-ness, or
even know about Python. Everything is a tarball.

And all of these are obviated by the fact that easy_install can find and build
source tarballs, too.

--
Robert Kern

 
zooko

Harry:

For some Free Software Python packages that I publish [1, 2, 3], I've
been trying to gain the benefits of eggs while also making the
resulting packages transparently useful to folks like you.

(You can follow along here: [4].)

One thing I've accomplished is figuring out how to install a Python
package whether it is distutils or setuptools while maintaining my
preferences for "use GNU stow" and "don't run setup code as root":
[5].

The next thing that I'm working on is bundling dependencies in source
tarball form with libraries, so that when you execute setup.py it
installs the dependency from the bundled tarball and does not attempt
to reach the Net.

So, please keep us informed about this issue. There are many benefits
of setuptools, and there are great benefits of compatibility and
standardization, and I'm hoping that we can make setuptools be more
compatible with other paradigms instead of getting into a "everybody
please use it / no everybody please don't use it" tug-of-war.

Regards,

Zooko

[1] http://cheeseshop.python.org/pypi/zfec
[2] http://cheeseshop.python.org/pypi/pyutil
[3] http://allmydata.org/trac/tahoe
[4] http://allmydata.org/trac/tahoe/ticket/15
[5] http://zooko.com/log-2007.html#d2007-06-02-distutils_or_setuptools_with_GNU_stow
 
