ensurepip in venv fails if incompatible wheels present in $venv/share/python-wheels
Hi,
Today, a colleague of mine experienced some weird behavior from `python3
-m venv`; he is actually using Ubuntu, but I've tracked it down to
changes made by Debian to `ensurepip` and I've been wondering whether it
should be considered a bug and fixed, so I thought I'd try to ask on
this list :)
The behavior itself is admittedly pretty esoteric, as it arises when you
attempt to manipulate a virtual environment with different versions of
Debian-Python, which you probably shouldn't do in the first place? Or at
least I wouldn't do it, I'd just re-create the virtual environment from
scratch.
This is how I've been able to reproduce it:
1. Create a virtual environment directory with Debian-Python 3.6 --
`python3.6 -m venv test-venv`.
2. Attempt to manipulate it with Debian-Python 3.5 -- `python3.5 -m venv
test-venv`
This results in the following error:
```
The virtual environment was not created successfully because ensurepip is not
available. On Debian/Ubuntu systems, you need to install the python3-venv
package using the following command.
apt-get install python3-venv
You may need to use sudo with that command. After installing the python3-venv
package, recreate your virtual environment.
Failing command: ['/home/lukes/test-venv/bin/python3.5', '-Im', 'ensurepip', '--upgrade', '--default-pip']
```
This in itself is not entirely helpful, because `ensurepip` *was*
actually available: the `python3-venv` package was already installed.
The problem is that this message is displayed whenever the "failing
command" shown above raises a `CalledProcessError`, irrespective of
the error's cause, which can also be that `ensurepip` did in fact run
but failed before completing. So I would suggest amending the
`venv.EnvBuilder._setup_pip` method along the following lines, for more
informative error reporting (this is based on Debian-Python 3.5, but
AFAICS, in Debian-Python 3.8, the code is still the same):
```diff
--- a/usr/lib/python3.5/venv/__init__.py 2019-08-20 22:05:10.000000000 +0200
+++ b/usr/lib/python3.5/venv/__init__.py 2019-11-01 17:55:51.425990509 +0100
@@ -257,25 +257,33 @@
         # following command will produce an unhelpful error. Let's make it
         # more user friendly.
         try:
             subprocess.check_output(
                 cmd, stderr=subprocess.STDOUT,
                 universal_newlines=True)
-        except subprocess.CalledProcessError:
-            print("""\
+        except subprocess.CalledProcessError as err:
+            if ': No module named ensurepip' in err.output:
+                print("""\
 The virtual environment was not created successfully because ensurepip is not
 available. On Debian/Ubuntu systems, you need to install the python3-venv
 package using the following command.
 apt-get install python3-venv
 You may need to use sudo with that command. After installing the python3-venv
 package, recreate your virtual environment.
 Failing command: {}
 """.format(cmd))
+            else:
+                print("""\
+Tried to run this command: {}
+But it failed with the following unexpected error:
+
+{}
+""".format(cmd, err.output))
             sys.exit(1)
     def setup_scripts(self, context):
         """
         Set up scripts into the created environment from a directory.
```
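The same decision can be sketched outside of `venv` as a small standalone function, which makes it easy to try against different outputs. This is only an illustration of the idea in the patch: `describe_ensurepip_failure` is a hypothetical name, and the message wording is shortened.

```python
def describe_ensurepip_failure(cmd, output):
    """Return a message for a failed ensurepip run.

    `cmd` is the command that failed, `output` its combined stdout/stderr.
    Hypothetical helper, not part of venv or ensurepip.
    """
    # Only claim that ensurepip is missing when the interpreter said so.
    if ": No module named ensurepip" in output:
        return (
            "ensurepip is not available. On Debian/Ubuntu systems, "
            "install the python3-venv package.\n"
            "Failing command: {}".format(cmd)
        )
    # Any other failure: show the real output instead of guessing.
    return (
        "Tried to run this command: {}\n"
        "But it failed with the following unexpected error:\n\n"
        "{}".format(cmd, output)
    )


cmd = ["test-venv/bin/python3.5", "-Im", "ensurepip", "--upgrade", "--default-pip"]
print(describe_ensurepip_failure(cmd, "ImportError: cannot import name 'requests'"))
```

With the `ImportError` output from my case, the second branch is taken and the actual traceback would have been visible right away.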
Still, even in the current version, it's great that the failing command
is at least shown, pinpointing what to investigate further. Which I did,
and realized that the problem wasn't that `ensurepip` wasn't available,
but that running it resulted in a different error, which `venv` just
didn't display. This is the `ensurepip` error (the first line was
repeated multiple times, but I'll just paste it once):
```
/home/lukes/test-venv/share/python-wheels/requests-2.18.4-py2.py3-none-any.whl/requests/__init__.py:80: RequestsDependencyWarning: urllib3 (1.13.1) or chardet (2.3.0) doesn't match a supported version!
Traceback (most recent call last):
  File "/usr/lib/python3.5/runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/lib/python3.5/ensurepip/__main__.py", line 4, in <module>
    ensurepip._main()
  File "/usr/lib/python3.5/ensurepip/__init__.py", line 268, in _main
    default_pip=args.default_pip,
  File "/usr/lib/python3.5/ensurepip/__init__.py", line 174, in bootstrap
    _run_pip(args + _PROJECTS, additional_paths)
  File "/usr/lib/python3.5/ensurepip/__init__.py", line 66, in _run_pip
    import pip
  File "/tmp/tmpl2su6y_t/pip-8.1.1-py2.py3-none-any.whl/pip/__init__.py", line 16, in <module>
  File "/tmp/tmpl2su6y_t/pip-8.1.1-py2.py3-none-any.whl/pip/vcs/mercurial.py", line 9, in <module>
  File "/tmp/tmpl2su6y_t/pip-8.1.1-py2.py3-none-any.whl/pip/download.py", line 39, in <module>
ImportError: cannot import name 'requests'
```
So this is what's happening:
1. `ensurepip` under Debian creates a directory under
`test-venv/share/python-wheels` where it copies the wheels necessary
to bootstrap `pip` from `/usr/share/python-wheels`; cf. the
`ensurepip._bootstrap` function.
2. On subsequent runs, the `test-venv/share/python-wheels` directory is
not cleaned when present. This means that in our case, it already
contains wheels copied over by `ensurepip` from the first run of
`venv` under Debian-Python 3.6, plus different versions of (mostly?)
the same wheels which have now been copied over by `ensurepip` from
Debian-Python 3.5. So we have two different sets of dependencies in
wheel format, including among others the following libraries:
- requests-2.18.4, urllib3-1.22, chardet-3.0.4 (from Debian-Python
3.6)
- requests-2.9.1, urllib3-1.13.1, chardet-2.3.0 (from Debian-Python
3.5)
3. `ensurepip` only adds to `sys.path` the wheels that it has been
manipulating (copying over), cf. the `copy_wheels` function, which is
nested within `ensurepip._bootstrap`. So far so good, there's no
interference.
4. Unfortunately, when `ensurepip._run_pip` tries to `import pip` (or
`import pip._internal`, depending on the version), *all* of the wheels
under `test-venv/share/python-wheels` are now added to `sys.path` via
the mechanism in `pip/_vendor/__init__.py` -- there's a glob which just
adds all of the `*.whl` files in that directory.
5. As a consequence, depending on the order in which the different wheel
versions of the dependencies get added to `sys.path`, we can end up
trying to import an inconsistent set of dependencies, as indicated by
the error above when trying to import `requests`: the first versions
found on `sys.path` are requests-2.18.4 (from Debian-Python 3.6), but
urllib3-1.13.1 and chardet-2.3.0 (from Debian-Python 3.5). Which
results in `requests` failing to import because the versions don't
match, and everything collapses.
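To make the last step concrete, here is a minimal sketch of the "first entry on `sys.path` wins" effect. The wheel file names are built from the versions listed above; only the `requests-2.18.4` file name appears in the traceback, the tag suffix on the others is assumed. The ordering is an assumption too: glob results have no guaranteed order, and I use plain lexicographic sorting purely for illustration. `first_wins` is a hypothetical helper.

```python
def first_wins(wheel_names):
    """Map each project to the version of its first wheel in `wheel_names`.

    Mimics import resolution: the first sys.path entry that provides a
    package is used, later entries for the same package are ignored.
    """
    chosen = {}
    for name in wheel_names:
        # Wheel file names start with "<project>-<version>-...".
        project, version = name.split("-")[:2]
        chosen.setdefault(project, version)
    return chosen


wheels = [
    # copied by Debian-Python 3.6
    "requests-2.18.4-py2.py3-none-any.whl",
    "urllib3-1.22-py2.py3-none-any.whl",
    "chardet-3.0.4-py2.py3-none-any.whl",
    # copied by Debian-Python 3.5
    "requests-2.9.1-py2.py3-none-any.whl",
    "urllib3-1.13.1-py2.py3-none-any.whl",
    "chardet-2.3.0-py2.py3-none-any.whl",
]

# Assumed ordering: lexicographic, as a stand-in for whatever the glob returns.
picked = first_wins(sorted(wheels))
print(picked)
# -> {'chardet': '2.3.0', 'requests': '2.18.4', 'urllib3': '1.13.1'}
```

Under this assumed ordering, the result is the same mix as in the warning above: requests from 3.6 together with urllib3 and chardet from 3.5. A different ordering could just as well produce a consistent set, which is why the failure depends on the order.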
Now I'm not entirely sure what the purpose of copying the wheels over
into `test-venv/share/python-wheels` even is (there must be a good
reason, it's just not obvious to me why not add
`/usr/share/python-wheels/*.whl` to `sys.path` directly), but AFAICS
they're only used during the bootstrap process anyway -- they don't show
up on `sys.path` once I run Python from the virtual environment. So I
guess a possible solution would be to reset this directory each time
`ensurepip` runs, to make sure that there's only one set of dependencies
in there at a time (the correct one). Something like:
```diff
--- a/usr/lib/python3.5/ensurepip/__init__.py 2019-08-20 22:05:09.000000000 +0200
+++ b/usr/lib/python3.5/ensurepip/__init__.py 2019-11-02 00:10:00.170871478 +0100
@@ -1,9 +1,10 @@
 import glob
 import os
 import os.path
 import pkgutil
+import shutil
 import sys
 import tempfile
 __all__ = ["version", "bootstrap"]
@@ -146,11 +147,13 @@
         # pip to look in when attempting to locate wheels to use to satisfy
         # the dependencies that pip normally bundles but Debian has debundled.
         # This is critically important and if this directory changes then both
         # python-pip and python-virtualenv needs updated to match.
         venv_wheel_dir = os.path.join(sys.prefix, 'share', 'python-wheels')
-        os.makedirs(venv_wheel_dir, exist_ok=True)
+        if os.path.isdir(venv_wheel_dir):
+            shutil.rmtree(venv_wheel_dir)
+        os.makedirs(venv_wheel_dir)
         dependencies = [
             os.path.basename(whl).split('-')[0]
             for whl in glob.glob('/usr/share/python-wheels/*.whl')
         ]
         copy_wheels(dependencies, venv_wheel_dir, sys.path)
```
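The reset step on its own can be tried in isolation, without touching a real virtual environment. A minimal sketch, where `reset_wheel_dir` is a hypothetical helper and a temporary directory stands in for `sys.prefix`:

```python
import os
import shutil
import tempfile


def reset_wheel_dir(wheel_dir):
    """Remove `wheel_dir` with everything in it, then recreate it empty."""
    if os.path.isdir(wheel_dir):
        shutil.rmtree(wheel_dir)
    os.makedirs(wheel_dir)


with tempfile.TemporaryDirectory() as prefix:
    wheel_dir = os.path.join(prefix, "share", "python-wheels")
    os.makedirs(wheel_dir)
    # Simulate a wheel left behind by an earlier run.
    stale = os.path.join(wheel_dir, "requests-2.18.4-py2.py3-none-any.whl")
    open(stale, "w").close()

    reset_wheel_dir(wheel_dir)
    leftover = os.listdir(wheel_dir)

print(leftover)  # -> []
```

The downside of this approach is that anything else stored in that directory is lost too, which seems acceptable if the wheels really are only needed during bootstrap.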
Alternatively, if creating `test-venv/share/python-wheels` is not a hard
requirement, it could be avoided entirely: `ensurepip._bootstrap` could
just add the wheels in `/usr/share/python-wheels` to `sys.path`
directly, and there wouldn't be any additional, possibly incompatible
wheels for the `pip/_vendor/__init__.py` glob mechanism to add. Plus
people like me would stop wondering where that `test-venv/share`
directory came from, which they don't see when they create virtual
environments on other OSs or with `pyenv` Python. It could look
something like this:
```diff
--- a/usr/lib/python3.5/ensurepip/__init__.py 2019-08-20 22:05:09.000000000 +0200
+++ b/usr/lib/python3.5/ensurepip/__init__.py 2019-11-02 01:00:22.743359249 +0100
@@ -128,38 +128,41 @@
     def copy_wheels(wheels, destdir, paths):
         for project in wheels:
             wheel_names = glob.glob(
                 '/usr/share/python-wheels/{}-*.whl'.format(project))
             if len(wheel_names) == 0:
                 raise RuntimeError('missing dependency wheel %s' % project)
             assert len(wheel_names) == 1, wheel_names
             wheel_name = os.path.basename(wheel_names[0])
             path = os.path.join('/usr/share/python-wheels', wheel_name)
-            with open(path, 'rb') as fp:
-                whl = fp.read()
-            dest = os.path.join(destdir, wheel_name)
-            with open(dest, 'wb') as fp:
-                fp.write(whl)
-            paths.append(dest)
+            # Only perform copy if an actual destdir was provided...
+            if destdir is not None:
+                with open(path, 'rb') as fp:
+                    whl = fp.read()
+                dest = os.path.join(destdir, wheel_name)
+                with open(dest, 'wb') as fp:
+                    fp.write(whl)
+                paths.append(dest)
+            # ... otherwise just append the original path to paths:
+            else:
+                paths.append(path)
     with tempfile.TemporaryDirectory() as tmpdir:
         # This directory is a "well known directory" which Debian has patched
         # pip to look in when attempting to locate wheels to use to satisfy
         # the dependencies that pip normally bundles but Debian has debundled.
         # This is critically important and if this directory changes then both
         # python-pip and python-virtualenv needs updated to match.
-        venv_wheel_dir = os.path.join(sys.prefix, 'share', 'python-wheels')
-        os.makedirs(venv_wheel_dir, exist_ok=True)
         dependencies = [
             os.path.basename(whl).split('-')[0]
             for whl in glob.glob('/usr/share/python-wheels/*.whl')
         ]
-        copy_wheels(dependencies, venv_wheel_dir, sys.path)
+        copy_wheels(dependencies, None, sys.path)
         # Put our bundled wheels into a temporary directory and construct the
         # additional paths that need added to sys.path
         additional_paths = []
         copy_wheels(_PROJECTS, tmpdir, additional_paths)
         # Construct the arguments to be passed to the pip command
         args = ["install", "--no-index", "--find-links", tmpdir]
         if root:
```
Both of these approaches get rid of the problem, as in, the series of two
commands listed at the beginning...
```sh
$ python3.6 -m venv test-venv
$ python3.5 -m venv test-venv
```
... runs fine with either of these modifications.
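For a virtual environment that already exists, a quick way to check whether it is in the problematic state is to look for projects with more than one wheel version under `share/python-wheels`. A minimal sketch; `find_duplicate_wheels` is a hypothetical helper, not something `venv` or `ensurepip` provides:

```python
import glob
import os
from collections import defaultdict


def find_duplicate_wheels(venv_dir):
    """Return {project: [versions]} for projects with several wheel versions.

    Looks only at <venv_dir>/share/python-wheels, the Debian-specific
    directory discussed above. An empty dict means no duplicates were found
    (or the directory does not exist).
    """
    wheel_dir = os.path.join(venv_dir, "share", "python-wheels")
    versions = defaultdict(set)
    for path in glob.glob(os.path.join(wheel_dir, "*.whl")):
        # Wheel file names start with "<project>-<version>-...".
        project, version = os.path.basename(path).split("-")[:2]
        versions[project].add(version)
    return {
        project: sorted(found)
        for project, found in versions.items()
        if len(found) > 1
    }


print(find_duplicate_wheels("test-venv"))
```

A non-empty result suggests the environment was bootstrapped by more than one Debian-Python version, and re-creating it from scratch is probably the simplest way out.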
Just to be extra clear: with "regular" (non-Debian) Python, this problem
doesn't happen because `venv`/`ensurepip` doesn't do any of the magic
around the `test-venv/share/python-wheels` directory; this directory
isn't even created, it's a Debian-specific modification. So it's not an
upstream problem, I even tried those two commands with Python 3.5 and
3.6 installed via `pyenv` to make sure, and it worked fine out of the
box.
So what do you think? Is this worth fixing? Should I report it somewhere
else?
And thank you for taking the time to read this, I've probably been more
verbose than necessary, as I wasn't sure how much shared context I could
assume :)
Best,
David