
Re: [Distutils] formencode as .egg in Debian ??



At 01:21 AM 11/23/2005 +0100, Martin v. Löwis wrote:
Phillip J. Eby wrote:
Debian should provide the packages, but not as eggs.


For packages that only operate as eggs, and/or require their dependencies as eggs, you are stating a contradiction in terms. Eggs are not merely a distribution format, any more than Java .jar files are.

So I should say

"Debian should not provide eggs, period", since what Debian provides
are packages, and eggs are not?

I don't understand you.

This is getting difficult: I don't actually know what "a contradiction
in terms" is. You seemed to be saying that eggs are not a distribution
format.

They are not a distribution format. There are in fact three physical formats that an egg can take (if we ignore .egg-link files, which are really needed only to work around the absence of symlinks on Windows). In principle, there could be many others.

I suspect that part of the confusion stems from the fact that I prefer to use "package" to refer only to a Python package (the thing you import), and not to refer to a distribution as a "package". However, Debian calls distributions "packages", so some confusion is perhaps inevitable. What's more, it appears that the Debian policy calls for the Debian package to be named for the contained Python package, regardless of whether that's the name of the distribution.

An "egg" is a "distribution" of a "project" that is importable and can carry both standardized and individualized metadata that can be read by the pkg_resources module. There are various distribution *formats* in which an "egg" may be physically manifested, but the "egg" itself is a logical concept, not a physical one. It is therefore, as I said, "not merely a distribution format". Is that any clearer?

The "contradiction in terms" was that I took your meaning of "package" to be the same as my term "project" - i.e., a functional collection of Python resources. Projects that *are* eggs, can't be provided "but not as eggs". They *are* eggs, so not providing them as eggs means not providing them at all.

In contrast, projects that are not built with setuptools aren't inherently eggs, but you can certainly make eggs out of them. For these projects, you *do* have the choice to provide them "not as eggs", but then they are also of no use to the projects that need eggs.

As we've already briefly discussed, in the simplest form a project can be made into an egg just by adding an appropriately-named .egg-info/PKG-INFO file.
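
As a rough illustration of that simplest form, the sketch below writes a minimal PKG-INFO file into an .egg-info directory and reads it back. The directory naming (`<project>.egg-info`), the three header fields, the version number, and the helper names are all assumptions made for the example; this is not a statement of what setuptools itself generates.

```python
import os
import tempfile
from email.parser import Parser


def write_egg_info(site_dir, project, version):
    """Create <project>.egg-info/PKG-INFO under site_dir (hypothetical helper)."""
    egg_info = os.path.join(site_dir, project + ".egg-info")
    os.makedirs(egg_info, exist_ok=True)
    pkg_info = os.path.join(egg_info, "PKG-INFO")
    with open(pkg_info, "w", encoding="utf-8") as f:
        # PKG-INFO is a set of RFC 822-style "Key: value" headers.
        f.write("Metadata-Version: 1.0\n")
        f.write("Name: %s\n" % project)
        f.write("Version: %s\n" % version)
    return pkg_info


def read_pkg_info(pkg_info):
    """Parse the headers of a PKG-INFO file into a plain dict."""
    with open(pkg_info, encoding="utf-8") as f:
        message = Parser().parse(f)
    return dict(message.items())


if __name__ == "__main__":
    site_dir = tempfile.mkdtemp()  # stand-in for a real site-packages directory
    path = write_egg_info(site_dir, "FormEncode", "0.4")  # version is made up
    print(read_pkg_info(path))
```

The point is only that the metadata sits beside the importable code in a place a tool can find; nothing about the code's own layout has to change.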


 If so, Debian should not distribute them.

This is what I don't understand, as it has nothing to do with whether or not an egg is a distribution format, at least not that I can see. My statement was that eggs are not merely a distribution format; they are a logical concept that can be physically packaged in various ways, and if it's necessary to invent yet another physical layout, well, we can do that too.


If eggs are,
in fact, a distribution format: what is the contradiction then?
I would still claim that Debian should not distribute them, but
distribute policy-conforming Debian packages instead.

Which would be the same as saying you wouldn't distribute, say, setuptools itself. Setuptools is an egg, and can't function except as an egg, because it is more than a Python package. Again, an "egg" is some specific release of a project and its introspectable metadata.


I still don't understand you. If a package subclasses a distutils command, is it no longer a distutils setup?

It is not a distutils setup because it does not invoke
distutils.core.setup.

Now I really don't understand you.  Line 43 of setuptools/__init__.py reads:

    setup = distutils.core.setup

So, how is it not invoking distutils.core.setup?


What if it bundles a library module that includes a subclass of a distutils command? Where, precisely, do you draw the line between a "distutils setup" and something else?

Extending distutils is fine. An extension is a feature that, if not
invoked, has no effect. easy_setup changes install in a way that
has an effect.

So do all the packages that rework install_data to be more to their liking - and there are quite a lot of them, as I discovered when I began testing easy_install.


Nothing except performance considerations prevents you having a separate .pth file for each and every egg

That is not true. Usability also suffers if sys.path becomes long.

How? I don't understand this. Someone using eggs rarely has reason to manually manipulate sys.path unless they are adding some kind of plugin directory to it. If they want to know what package version they are using, pkg_resources provides a superior API for querying it; I can say e.g. 'require("TurboGears")' and receive back a list of all the eggs that compose or are required by TurboGears, along with their locations. (Or conversely, receive a DistributionNotFound or VersionConflict error explaining what's missing or what was already imported that's a different version than the one needed.)
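
A minimal sketch of that kind of query follows. It uses only `pkg_resources.require` and the two exception classes named above; the wrapper function, its return shape, and the fallback for environments where pkg_resources is not installed are assumptions added for illustration.

```python
def resolve(requirement):
    """Return (project, version, location) tuples for a requirement,
    a short message if it cannot be satisfied, or None when
    pkg_resources is unavailable. Hypothetical wrapper."""
    try:
        import pkg_resources
    except ImportError:
        return None  # pkg_resources is not installed in this environment
    try:
        # require() returns every distribution needed to satisfy the
        # requirement, including its dependencies.
        dists = pkg_resources.require(requirement)
    except pkg_resources.DistributionNotFound as exc:
        return "missing: %s" % exc
    except pkg_resources.VersionConflict as exc:
        return "conflict: %s" % exc
    return [(d.project_name, d.version, d.location) for d in dists]


if __name__ == "__main__":
    print(resolve("TurboGears"))
```

Either way the answer comes from the metadata, not from reading sys.path by eye.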


>> but in a way unfriendly to dpkg
I don't understand you here. Are you saying that it's not possible for dpkg to do a post-install or uninstall operation like adding or removing a line from a file?

That is certainly possible - but currently, each maintainer would have
to come up with his own solution. This is more tedious to do than it
could be.

easy_deb implements this, so it seems to me it would be a simple matter of running easy_deb to produce the .deb from the .egg. (Caveat: I have not used easy_deb, but its author assures me that it is able to handle the .pth manipulation in a sane way.)


Of course, this creates additional work for package maintainers that wouldn't be present with setuptools' normal .egg file/directory distributions, and my assumption was that the maintainers would prefer to be able to ignore such issues and get the benefit of dependencies defined by the upstream developers. Eggs keep each project in its own little bubble, where it can't overwrite anything else and can be uninstalled without removing any overlapping parts.

I don't see how the maintainer could use the dependency information
in the egg files. Debian policy is that the .deb files need to
define proper dependencies, so the maintainer has to look up
and edit the dependency information *anyway*. Using the egg
package name is of limited help, either, because Debian policy
mandates a certain naming scheme for packages, giving the
FormEncode package a name of python2.4-formencode.

What I would suggest here is having a namespace (e.g. pyegg2.4-whatever) for naming packages based on their PyPI names, so that there can be an automated relationship between setuptools dependencies and Debian ones. This doesn't work for existing Debian packages, of course, but it seems to me that they could in fact have the same contents as their pyegg cousin; both could simply use the .egg-info approach. (easy_deb uses python-pypi-whatever, which seems a bit long to me, but then, it's also implemented and my pyegg2.4 idea isn't.)
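
The automated relationship could be as simple as a naming function. The sketch below is only my pyegg2.4 idea written out, which as noted is not implemented anywhere; the function name and the normalization rule (lowercase, with runs of characters other than letters, digits, `+` and `.` collapsed to a hyphen) are assumptions.

```python
import re


def debian_package_name(project, prefix="pyegg2.4"):
    """Map a PyPI project name to a Debian-style package name.

    Hypothetical scheme: lowercase the name and collapse anything other
    than a-z, 0-9, '+' and '.' into single hyphens.
    """
    slug = re.sub(r"[^a-z0-9+.]+", "-", project.lower()).strip("-")
    return "%s-%s" % (prefix, slug)


if __name__ == "__main__":
    print(debian_package_name("FormEncode"))
    print(debian_package_name("FormEncode", prefix="python-pypi"))
```

With a deterministic mapping like this, a setuptools dependency on a project name could be translated mechanically into a Debian dependency on the corresponding package.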

Anyway, I don't see any obvious reasons why this can't be an automated process, even for the system library dependencies. easy_deb even has a simple configuration file that can augment the setuptools-style dependencies with explicit Debian dependencies. There's also nothing stopping us from defining a way to add Debian dependency information to setup(); in fact setuptools encourages this by offering an extensible system to allow distutils extensions to offer and validate new setup() keywords and use them to generate additional metadata in the egg. This would make it possible to push back Debian dependency information to upstream maintainers, if this were desired.


