
ensurepip in venv fails if incompatible wheels present in $venv/share/python-wheels



Hi,

Today, a colleague of mine experienced some weird behavior from `python3
-m venv`; he is actually using Ubuntu, but I've tracked it down to
changes made by Debian to `ensurepip` and I've been wondering whether it
should be considered a bug and fixed, so I thought I'd try to ask on
this list :)

The behavior itself is admittedly pretty esoteric, as it arises when you
attempt to manipulate a virtual environment with different versions of
Debian-Python, which you probably shouldn't do in the first place? Or at
least I wouldn't do it, I'd just re-create the virtual environment from
scratch.

This is how I've been able to reproduce it:

1. Create a virtual environment directory with Debian-Python 3.6 --
   `python3.6 -m venv test-venv`.
2. Attempt to manipulate it with Debian-Python 3.5 -- `python3.5 -m venv
   test-venv`

This results in the following error:

```
The virtual environment was not created successfully because ensurepip is not
available.  On Debian/Ubuntu systems, you need to install the python3-venv
package using the following command.

    apt-get install python3-venv

You may need to use sudo with that command.  After installing the python3-venv
package, recreate your virtual environment.

Failing command: ['/home/lukes/test-venv/bin/python3.5', '-Im', 'ensurepip', '--upgrade', '--default-pip']
```

This in itself is not entirely helpful, because `ensurepip` *was*
actually available as the `python3-venv` package was already installed.
The problem is that this message is displayed whenever the "failing
command" shown above results in a `CalledProcessError`, irrespective of
the error's cause, which can also be that `ensurepip` did in fact run
but failed before completing. So I would suggest amending the
`venv.EnvBuilder._setup_pip` method along the following lines, for more
resilient error reporting (this is based on Debian-Python 3.5, but
AFAICS, in Debian-Python 3.8, the code is still the same):

```diff
--- a/usr/lib/python3.5/venv/__init__.py 2019-08-20 22:05:10.000000000 +0200
+++ b/usr/lib/python3.5/venv/__init__.py 2019-11-01 17:55:51.425990509 +0100
@@ -257,25 +257,33 @@
         # following command will produce an unhelpful error.  Let's make it
         # more user friendly.
         try:
             subprocess.check_output(
                 cmd, stderr=subprocess.STDOUT,
                 universal_newlines=True)
-        except subprocess.CalledProcessError:
-            print("""\
+        except subprocess.CalledProcessError as err:
+            if ': No module named ensurepip' in err.output:
+                print("""\
 The virtual environment was not created successfully because ensurepip is not
 available.  On Debian/Ubuntu systems, you need to install the python3-venv
 package using the following command.

     apt-get install python3-venv

 You may need to use sudo with that command.  After installing the python3-venv
 package, recreate your virtual environment.

 Failing command: {}
 """.format(cmd))
+            else:
+                print("""\
+Tried to run this command: {}
+But it failed with the following unexpected error:
+
+{}
+""".format(cmd, err.output))
             sys.exit(1)

     def setup_scripts(self, context):
         """
         Set up scripts into the created environment from a directory.

```
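
For anyone who wants to try the idea outside of `venv`, here is a minimal standalone sketch of the same branching. The helper names `describe_failure` and `run_reporting_errors` are made up for illustration and are not part of `venv`. Matching on the `: No module named ensurepip` substring is the same assumption the patch above makes about the interpreter's error text.

```python
import subprocess


def describe_failure(cmd, output):
    """Build a message for a failed run (hypothetical helper, not in venv)."""
    # Assumption: a missing ensurepip shows up as this substring in the output.
    if ": No module named ensurepip" in output:
        return ("ensurepip is not available; install the python3-venv "
                "package.\nFailing command: {}".format(cmd))
    # Any other failure: show the real output instead of guessing the cause.
    return ("Tried to run this command: {}\n"
            "But it failed with the following unexpected error:\n\n{}"
            .format(cmd, output))


def run_reporting_errors(cmd):
    """Run cmd and return (ok, message); message is empty on success."""
    try:
        subprocess.check_output(
            cmd, stderr=subprocess.STDOUT, universal_newlines=True)
    except subprocess.CalledProcessError as err:
        return False, describe_failure(cmd, err.output)
    return True, ""
```

The point is only that `err.output` already contains everything needed to tell the two cases apart, so the original traceback does not have to be thrown away.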

Still, even in the current version, it's great that the failing command
is at least shown, pinpointing what to investigate further. Which I did,
and realized that the problem wasn't that `ensurepip` wasn't available,
but that running it resulted in a different error, which `venv` just
didn't display. This is the `ensurepip` error (the first line was
repeated multiple times, but I'll just paste it once):

```
/home/lukes/test-venv/share/python-wheels/requests-2.18.4-py2.py3-none-any.whl/requests/__init__.py:80: RequestsDependencyWarning: urllib3 (1.13.1) or chardet (2.3.0) doesn't match a supported version!
Traceback (most recent call last):
  File "/usr/lib/python3.5/runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/lib/python3.5/ensurepip/__main__.py", line 4, in <module>
    ensurepip._main()
  File "/usr/lib/python3.5/ensurepip/__init__.py", line 268, in _main
    default_pip=args.default_pip,
  File "/usr/lib/python3.5/ensurepip/__init__.py", line 174, in bootstrap
    _run_pip(args + _PROJECTS, additional_paths)
  File "/usr/lib/python3.5/ensurepip/__init__.py", line 66, in _run_pip
    import pip
  File "/tmp/tmpl2su6y_t/pip-8.1.1-py2.py3-none-any.whl/pip/__init__.py", line 16, in <module>
  File "/tmp/tmpl2su6y_t/pip-8.1.1-py2.py3-none-any.whl/pip/vcs/mercurial.py", line 9, in <module>
  File "/tmp/tmpl2su6y_t/pip-8.1.1-py2.py3-none-any.whl/pip/download.py", line 39, in <module>
ImportError: cannot import name 'requests'
```

So this is what's happening:

1. `ensurepip` under Debian creates a directory under
   `test-venv/share/python-wheels` where it copies the wheels necessary
   to bootstrap `pip` from `/usr/share/python-wheels`; cf. the
   `ensurepip._bootstrap` function.
2. On subsequent runs, the `test-venv/share/python-wheels` directory is
   not cleaned when present. This means that in our case, it already
   contains wheels copied over by `ensurepip` from the first run of
   `venv` under Debian-Python 3.6, plus different versions of (mostly?)
   the same wheels which have now been copied over by `ensurepip` from
   Debian-Python 3.5. So we have two different sets of dependencies in
   wheel format, including among others the following libraries:
   - requests-2.18.4, urllib3-1.22, chardet-3.0.4 (from Debian-Python
     3.6)
   - requests-2.9.1, urllib3-1.13.1, chardet-2.3.0 (from Debian-Python
     3.5)
3. `ensurepip` only adds to `sys.path` the wheels that it has been
   manipulating (copying over), cf. the `copy_wheels` function, which is
   nested within `ensurepip._bootstrap`. So far so good, there's no
   interference.
4. Unfortunately, when `ensurepip._run_pip` tries to `import pip` (see
   the traceback above), *all* of the wheels under
   `test-venv/share/python-wheels` are now added to `sys.path` via the
   mechanism in `pip/_vendor/__init__.py` -- there's a glob which just
   adds all of the `*.whl` files in that directory.
5. As a consequence, depending on the order in which the different wheel
   versions of the dependencies get added to `sys.path`, we can end up
   trying to import an inconsistent set of dependencies, as indicated by
   the error above when trying to import `requests`: the first versions
   found on `sys.path` are requests-2.18.4 (from Debian-Python 3.6), but
   urllib3-1.13.1 and chardet-2.3.0 (from Debian-Python 3.5). Which
   results in `requests` failing to import because the versions don't
   match, and everything collapses.
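
The steps above can be checked on an affected virtual environment with a small diagnostic like the following. This is a rough sketch, not anything that ships with Debian's `ensurepip` or `pip`. The function names are made up, and it assumes the wheel files follow the usual `project-version-...whl` naming seen in the traceback. `find_conflicts` lists projects present in more than one version, and `first_found` mimics the "first match on `sys.path` wins" behavior for a given ordering.

```python
import glob
import os
from collections import defaultdict


def wheel_project_and_version(path):
    """Split 'requests-2.18.4-py2.py3-none-any.whl' into its first two fields."""
    project, version = os.path.basename(path).split("-")[:2]
    return project, version


def find_conflicts(wheel_dir):
    """Map each project with several wheels in wheel_dir to its versions."""
    versions = defaultdict(list)
    for whl in sorted(glob.glob(os.path.join(wheel_dir, "*.whl"))):
        project, version = wheel_project_and_version(whl)
        versions[project].append(version)
    return {p: v for p, v in versions.items() if len(v) > 1}


def first_found(wheel_paths):
    """For an ordered list of wheels, keep the first version seen per project.

    This mirrors import resolution: the earliest sys.path entry providing a
    package is the one that gets imported.
    """
    winners = {}
    for whl in wheel_paths:
        project, version = wheel_project_and_version(whl)
        winners.setdefault(project, version)
    return winners
```

Calling `find_conflicts("test-venv/share/python-wheels")` on the broken environment should report `requests`, `urllib3` and `chardet` with two versions each, which is exactly the mix described in step 2.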

Now I'm not entirely sure what the purpose of copying the wheels over
into `test-venv/share/python-wheels` even is (there must be a good
reason, it's just not obvious to me why not add
`/usr/share/python-wheels/*.whl` to `sys.path` directly), but AFAICS
they're only used during the bootstrap process anyway -- they don't show
up on `sys.path` once I run Python from the virtual environment. So I
guess a possible solution would be to reset this directory each time
`ensurepip` runs, to make sure that there's only one set of dependencies
in there at a time (the correct one). Something like:

```diff
--- a/usr/lib/python3.5/ensurepip/__init__.py    2019-08-20 22:05:09.000000000 +0200
+++ b/usr/lib/python3.5/ensurepip/__init__.py    2019-11-02 00:10:00.170871478 +0100
@@ -1,9 +1,10 @@
 import glob
 import os
 import os.path
 import pkgutil
+import shutil
 import sys
 import tempfile


 __all__ = ["version", "bootstrap"]
@@ -146,11 +147,13 @@
         # pip to look in when attempting to locate wheels to use to satisfy
         # the dependencies that pip normally bundles but Debian has debundled.
         # This is critically important and if this directory changes then both
         # python-pip and python-virtualenv needs updated to match.
         venv_wheel_dir = os.path.join(sys.prefix, 'share', 'python-wheels')
-        os.makedirs(venv_wheel_dir, exist_ok=True)
+        if os.path.isdir(venv_wheel_dir):
+            shutil.rmtree(venv_wheel_dir)
+        os.makedirs(venv_wheel_dir)
         dependencies = [
             os.path.basename(whl).split('-')[0]
             for whl in glob.glob('/usr/share/python-wheels/*.whl')
             ]
         copy_wheels(dependencies, venv_wheel_dir, sys.path)
```
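
The directory reset in that patch can also be tried in isolation. Below is a tiny sketch with a made-up helper name, `reset_dir`, which is not part of `ensurepip`. It only illustrates that the remove-and-recreate approach leaves an empty directory whether or not stale wheels were there before.

```python
import os
import shutil


def reset_dir(path):
    """Remove path if it is an existing directory, then recreate it empty."""
    if os.path.isdir(path):
        # Drop whatever an earlier run (possibly another Python version) left.
        shutil.rmtree(path)
    os.makedirs(path)
```

Note that, unlike `os.makedirs(path, exist_ok=True)`, this deliberately discards existing contents, so it is only appropriate if nothing else relies on files in that directory surviving between runs.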

Alternatively, if creating `test-venv/share/python-wheels` is not a hard
requirement, it could be avoided entirely: `ensurepip._bootstrap` could
just directly add the wheels in `/usr/share/python-wheels` to
`sys.path`, and there wouldn't be any additional, possibly incompatible
wheels for the `pip/_vendor/__init__.py` glob mechanism to add. Plus
people like me would stop wondering where that `test-venv/share`
directory came from which they don't see when they create virtual
environments on other OSs, or using `pyenv` Python. It could look
something like this:

```diff
--- a/usr/lib/python3.5/ensurepip/__init__.py    2019-08-20 22:05:09.000000000 +0200
+++ b/usr/lib/python3.5/ensurepip/__init__.py    2019-11-02 01:00:22.743359249 +0100
@@ -128,38 +128,41 @@
     def copy_wheels(wheels, destdir, paths):
         for project in wheels:
             wheel_names = glob.glob(
                 '/usr/share/python-wheels/{}-*.whl'.format(project))
             if len(wheel_names) == 0:
                 raise RuntimeError('missing dependency wheel %s' % project)
             assert len(wheel_names) == 1, wheel_names
             wheel_name = os.path.basename(wheel_names[0])
             path = os.path.join('/usr/share/python-wheels', wheel_name)
-            with open(path, 'rb') as fp:
-                whl = fp.read()
-            dest = os.path.join(destdir, wheel_name)
-            with open(dest, 'wb') as fp:
-                fp.write(whl)
-            paths.append(dest)
+            # Only perform copy if an actual destdir was provided...
+            if destdir is not None:
+                with open(path, 'rb') as fp:
+                    whl = fp.read()
+                dest = os.path.join(destdir, wheel_name)
+                with open(dest, 'wb') as fp:
+                    fp.write(whl)
+                paths.append(dest)
+            # ... otherwise just append the original path to paths:
+            else:
+                paths.append(path)

     with tempfile.TemporaryDirectory() as tmpdir:
         # This directory is a "well known directory" which Debian has patched
         # pip to look in when attempting to locate wheels to use to satisfy
         # the dependencies that pip normally bundles but Debian has debundled.
         # This is critically important and if this directory changes then both
         # python-pip and python-virtualenv needs updated to match.
-        venv_wheel_dir = os.path.join(sys.prefix, 'share', 'python-wheels')
-        os.makedirs(venv_wheel_dir, exist_ok=True)
         dependencies = [
             os.path.basename(whl).split('-')[0]
             for whl in glob.glob('/usr/share/python-wheels/*.whl')
             ]
-        copy_wheels(dependencies, venv_wheel_dir, sys.path)
+        copy_wheels(dependencies, None, sys.path)

         # Put our bundled wheels into a temporary directory and construct the
         # additional paths that need added to sys.path
         additional_paths = []
         copy_wheels(_PROJECTS, tmpdir, additional_paths)

         # Construct the arguments to be passed to the pip command
         args = ["install", "--no-index", "--find-links", tmpdir]
         if root:
```
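
The reason the second approach can work at all is that pure-Python wheels are zip archives, and Python can import directly from a zip archive listed on `sys.path`. Here is a minimal sketch of "add the wheels where they are, without copying". The function name `add_wheels_to_path` is hypothetical and is not how Debian's `ensurepip` is actually structured.

```python
import glob
import os


def add_wheels_to_path(wheel_dir, paths):
    """Append every *.whl file found in wheel_dir to the list paths.

    Nothing is copied; the wheels are used from their original location.
    Entries already present in paths are not added twice.
    """
    for whl in sorted(glob.glob(os.path.join(wheel_dir, "*.whl"))):
        if whl not in paths:
            paths.append(whl)
    return paths
```

It would be used along the lines of `add_wheels_to_path('/usr/share/python-wheels', sys.path)`. Since that directory holds a single version of each dependency, there is nothing for the versions to conflict with.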

Both of these approaches get rid of the problem, as in, the series of two
commands listed at the beginning...

```sh
$ python3.6 -m venv test-venv
$ python3.5 -m venv test-venv
```

... runs fine with either of these modifications.

Just to be extra clear: with "regular" (non-Debian) Python, this problem
doesn't happen because `venv`/`ensurepip` doesn't do any of the magic
around the `test-venv/share/python-wheels` directory; this directory
isn't even created, it's a Debian-specific modification. So it's not an
upstream problem; I even tried those two commands with Python 3.5 and
3.6 installed via `pyenv` to make sure, and it worked fine out of the
box.

So what do you think? Is this worth fixing? Should I report it somewhere
else?

And thank you for taking the time to read this, I've probably been more
verbose than necessary, as I wasn't sure how much shared context I could
assume :)

Best,

David

