
Considering letting python-pip use vendored dependencies



As mentioned in debian-devel recently [1], at the DebConf 21 Debian
Python BoF [2] we discussed the idea of reverting our unbundling patches
to python-pip. The group consensus was that it wasn't worth the effort
to maintain these patches.

Debian Security Team and my co-maintainers: Do you have an opinion on
doing this?

What does pip vendor, and why?
------------------------------
See: https://github.com/pypa/pip/tree/main/src/pip/_vendor which
includes a README explaining their rationale.

Why is unbundling pip messy in Debian?
--------------------------------------
pip (and setuptools) have a special position in Debian, as they are
bootstrapped into virtualenvs, used by Python developers and deployment
systems. Virtualenvs exist to provide isolated Python library stacks,
and they need to be able to install these libraries, usually via pip.
They are bootstrapped from wheels in /usr/share/python-wheels.
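
As a rough illustration of what lives in that directory, the sketch
below lists the wheels found there and splits each filename into a
distribution name and a version. It relies only on the standard wheel
filename layout (name-version-...-tags.whl). The function name
list_wheels is hypothetical, and this is not how virtualenv itself
locates its seed wheels.

```python
from pathlib import Path


def list_wheels(wheel_dir="/usr/share/python-wheels"):
    """Return {distribution: version} for every *.whl file in wheel_dir."""
    found = {}
    directory = Path(wheel_dir)
    if not directory.is_dir():
        return found  # nothing to report if the directory is absent
    for path in sorted(directory.glob("*.whl")):
        # Wheel filenames look like:
        #   name-version(-build)-pythontag-abitag-platformtag.whl
        parts = path.stem.split("-")
        if len(parts) < 5:
            continue  # skip anything that is not a well-formed wheel name
        found[parts[0]] = parts[1]
    return found
```

Calling list_wheels() with no argument inspects the system directory;
passing another path makes it easy to try on a scratch directory.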

How do we unbundle pip?
-----------------------
Our unbundling approach was to use dirtbike [3] to build a wheel for
each of the vendored modules, placing them in
/usr/share/python-wheels. Historically, each of these modules built its
own python-X-whl binary package, which python-pip-whl depended on. But
as the dependency set grew, we consolidated building the dirtbiked
wheels into the python-pip source package and generated an appropriate
Built-Using field.
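
For readers unfamiliar with Built-Using, here is a minimal sketch of
what such a field value looks like: a comma-separated list of source
packages, each with an exact (=) version. The helper name
format_built_using is hypothetical and is not the code used in the
python-pip packaging; the versions in the usage note are made up.

```python
def format_built_using(sources):
    """Format {source_package: version} as a Built-Using field value."""
    # Each entry names a source package with a strict (=) version relation.
    entries = (
        f"{name} (= {version})" for name, version in sorted(sources.items())
    )
    return ", ".join(entries)
```

For example, format_built_using({"python-urllib3": "1.0-1"}) yields
"python-urllib3 (= 1.0-1)".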

This means we've avoided using pip's vendored copies of modules, but at
the cost of building our own vendored copies at build time. To update
the vendored copies (e.g. to apply security updates), we have to do a
sourceful upload of python-pip.

Over time, our unbundling has caused issues:
1. Version mismatches. We've shipped older or newer versions of
   libraries that upstream knew wouldn't work with pip, and we didn't
   notice until bug reports came in.
2. The unbundled libraries were visible in the virtualenv. If they were
   updated, the above mismatches could occur again.
   Their presence also confused users.
   This has now been resolved, but it's still possible for a new
   version of one of these libraries, in the virtualenv, to break pip.
3. Historically, pip has occasionally patched its vendored libraries,
   causing our pip to behave differently or break outright:
   https://github.com/pypa/pip/issues/7784
4. We've had bugs in our unbundling, e.g.
   https://bugs.debian.org/958396
   https://bugs.launchpad.net/ubuntu/+source/python-pip/+bug/1935882
   https://bugs.launchpad.net/ubuntu/+source/python-pip/+bug/1880749
   https://bugs.launchpad.net/ubuntu/+source/python-pip/+bug/1869247
   https://bugs.launchpad.net/ubuntu/+source/python-pip/+bug/1833229
   https://bugs.launchpad.net/ubuntu/+source/python-pip/+bug/1822842
5. Upstream doesn't really support the unbundling mechanism (even though
   they provide it), and doesn't have any CI coverage of it at the
   moment. This means we've pushed them further away over time (esp.
   given the bugs above).
   Related to that, but not solely because of it, upstream support has
   tended to suggest that users avoid Debian's Python packages.
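
The version mismatches in point 1 are the kind of thing a simple check
could flag early. The sketch below assumes a list of pins in
name==version form (I believe pip keeps one as vendor.txt next to its
vendored modules, but treat that as an assumption) and compares it with
a mapping of the versions we actually ship. All function names here are
hypothetical.

```python
import re


def normalize(name):
    """Lowercase a project name and collapse runs of '-', '_', '.' to '-'."""
    return re.sub(r"[-_.]+", "-", name.strip()).lower()


def parse_pins(text):
    """Parse 'name==version' lines, ignoring blanks and '#' comments."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[normalize(name)] = version.strip()
    return pins


def find_mismatches(pins, available):
    """Return {name: (pinned, available_or_None)} where versions differ."""
    available = {normalize(n): v for n, v in available.items()}
    return {
        name: (wanted, available.get(name))
        for name, wanted in pins.items()
        if available.get(name) != wanted
    }
```

An empty result means every pinned module is available at exactly the
pinned version; a None in the second slot means the module is missing
altogether.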

We don't have great CI coverage of pip. We don't currently run their
test suite (but we probably should), and we have limited integration
testing to avoid executing untrusted code from the Internet in our
autopkgtests. This has improved over time, but it's still far from
perfect.

I spent a couple of weeks earlier this year getting virtualenv and pip
working correctly in Debian and all the Ubuntu stable releases. In many
cases they'd been somewhat broken for years. Most of the issues were
around the de-vendoring patch. Some of the bugs were quite subtle.

This is a good sign that the Debian maintenance of pip hasn't been
keeping up with the bugs, and suggests to me that the cost of
maintaining this patch isn't worth the benefit.

In the past, we've discussed whether we should continue to de-bundle,
and have done so because we could and it seemed important. I'm not so
sure about that, any more.

If we did start letting pip vendor its dependencies, we'd need to
duplicate any significant patches that Debian currently carries against
them. From what I can see, that's just these two:

1. https://salsa.debian.org/debian/python-certifi/-/blob/debian/master/debian/patches/0001-Use-Debian-provided-etc-ssl-certs-ca-certificates.cr.patch
2. https://salsa.debian.org/python-team/packages/python-urllib3/-/blob/debian/master/debian/patches/02_require-cert-verification.patch (which is possibly noop)

SR

[1]: https://lists.debian.org/msgid-search/20210902223835.GB2145@mithrandir.lan.emorrp1.name
[2]: https://lists.debian.org/msgid-search/20210827233103.72rnnuzdxhppvipz@satie.tumbleweed.org.za
[3]: https://tracker.debian.org/pkg/dirtbike

-- 
Stefano Rivera
  http://tumbleweed.org.za/
  +1 415 683 3272

