
Re: Amend Debian Python Proposal to Include More Python Metadata?



Hey Donald, thanks for starting this conversation.  I for one am super
appreciative of all the consideration you give for Debian's little slice of
the world.

There's a lot to unpack in this thread, and I'm a little under the weather[1],
so hopefully this makes sense.

Big +1 for recording the installed files in the .egg-info/RECORD file.  As
you know, I've been working on dirtbike, which is a "rewheeling" tool to turn
an installed package into a .whl.  While Debian policy tightly controls the
set of wheels we'll allow into the archive, dirtbike has code for parsing the
RECORD file.  Unfortunately that code is never exercised in practice because
we don't have RECORD files - at least not for the packages we care about[2].
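
To make the RECORD idea concrete, here is a minimal sketch of parsing such a
file.  It assumes the CSV layout of path, hash, size described in PEP 376, and
it is not dirtbike's actual code; `parse_record` is a hypothetical name.

```python
import csv
import io


def parse_record(text):
    """Parse RECORD-style CSV text into (path, hash, size) tuples.

    Assumption: each row is "path,hash,size", where hash and size may be
    empty (as they are for the RECORD file's own entry).
    """
    entries = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        # Pad short rows so unpacking always succeeds.
        path, digest, size = (row + ["", ""])[:3]
        entries.append((path, digest or None, int(size) if size else None))
    return entries
```

Using the csv module rather than splitting on commas keeps quoted paths that
contain commas intact.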

Big +1 for using setuptools everywhere.  By my count, of the packages that I
happen to randomly have installed on my Ubuntu 16.04 system, I have 67
.egg-info files and 113 .egg-info directories.  I'd rather have .egg-info
directories everywhere.
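
A count like that can be reproduced with something along these lines.  This
is only a sketch: `count_egg_info` is a hypothetical helper, and the directory
to scan is whatever you pass in.

```python
import os


def count_egg_info(root):
    """Return (files, dirs): how many *.egg-info entries directly under
    `root` are plain files and how many are directories."""
    files = dirs = 0
    for name in os.listdir(root):
        if not name.endswith(".egg-info"):
            continue
        if os.path.isdir(os.path.join(root, name)):
            dirs += 1
        else:
            files += 1
    return files, dirs
```

For example, `count_egg_info("/usr/lib/python3/dist-packages")` scans the
system dist-packages directory on a Debian-like system.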

+1 for a lintian warning if distutils is used.  I guess I'm +0 on forcing that
through pybuild because it'll be unobvious and mysterious, and kind of lets
upstreams off the hook.  I'd mildly prefer to patch packages that use
distutils because that's much more discoverable, but I can appreciate that
that's a lot of work we'd be imposing on maintainers, so I won't argue this
too much (other than to say that if pybuild forces it, let's definitely
document this in its manpage!).

On Jan 22, 2016, at 12:40 PM, Scott Kitterman wrote:

>Currently --record includes the .pyc files which is both unneeded and bad.
>Before this gets added either in setuptools or by us, this needs to be
>fixed.

+1, for (I think) a different reason than has already been discussed.  We
won't be generating .egg-info directories on the end-user's machine, but
instead on the machine where the package is built.  That could be a
maintainer's own system or a central build machine depending on various
conditions.  But because the pycs are generated on the end-user's machine, we
actually don't know what pycs will be generated when the debs are installed,
so the egg-info/RECORD file *can't* contain them, at least not accurately.
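
As an illustration of what keeping bytecode out of the record might look
like, here is a small sketch that filters a list of installed paths.  It is
not what setuptools or pybuild actually does; `strip_bytecode` is a
hypothetical name, and it assumes '/'-separated paths.

```python
def strip_bytecode(paths):
    """Drop compiled-bytecode entries from a list of installed paths.

    Assumption: bytecode is anything ending in .pyc/.pyo or living under
    a __pycache__ directory.
    """
    kept = []
    for path in paths:
        if path.endswith((".pyc", ".pyo")):
            continue
        if "__pycache__" in path.split("/"):
            continue
        kept.append(path)
    return kept
```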

On Jan 22, 2016, at 11:54 AM, Donald Stufft wrote:

>Regardless of what happens in this thread, pip is going to stop mucking
>around in files that are owned by some other tool without some sort of
>opt-in/--force style flag *and* we're going to be restructuring things to try
>and guide people away from using pip outside of a virtual environment or
>through the use of --user as well.

Of course, I'd still like --user to be the default[3].  I think that's still
the eventual goal for pip, but isn't yet implemented because $priorities.

>A more controversial way that comes with possibly some extra benefits (which
>Debian may not care about) is to use ``pip`` itself as the build tool rather
>than directly invoking setup.py. In this pip would be responsible for mucking
>around with the distutils/setuptools mess instead of that needing to be
>handled by Debian specific tooling.

I'd like to better understand how this would work.  IIUC, the Fedora ecosystem
is making or already has made this switch, but I don't know how it works.
Obviously, we don't want to install wheels into /usr/lib/python3/dist-packages.

FTR, I am still working on updating Debian's pip.  I'm slowly shaving the yak,
but there are still a few things to do.  If you want to help, get in touch.

>[3] Import pkg_resources is not the fastest thing in the world, it has to scan
>every directory on sys.path looking for things to activate and it comes with
>a bunch of side effects. This happens implicitly for any project using
>console scripts.

Which frankly sucks.  It's also fragile.  Every once in a while a broken
package gets uploaded that breaks pkg_resources, and it's not easy to debug or
repair.  I really hope Brett can fix this when/if he builds this functionality
into the stdlib.

Cheers,
-Barry

[1] double meaning: I have a cold and we're in the early stages of an historic
snow storm!

[2] dirtbike has two fallbacks, both of which use `dpkg -L` to get the list of
installed files.  The first fallback uses `dpkg -S` to find the binary package
that contains the Python package's .egg-info file/directory (which of the two
doesn't matter for this purpose).  pkg_resources doesn't have one of those.  The second
fallback imports the Python package and looks for the binary package
containing the top-level directory.  It's all rather ugly, and I'd like to
just use the .egg-info/RECORD file, but I suspect I'll still need the import
fallback for pkg_resources.

[3] https://github.com/pypa/pip/issues/1668
