
Re: dh_python and python policy analysis



Manoj Srivastava writes:
>  policy document. The current version, and future updates, are to be
>  found at http://www.golden-gryphon.com/software/manoj-policy/

That URL is unreachable; comments on the posted text follow.

>   1.1. Categorization of Python software
> 
>    Program/script
> 
>              This consists of software directly called by an end user of
>            external program, and is independently interpreted by the Python
>            interpreter. Usually starts with the magic bytes #!, with the
>            interpreter being /usr/bin/python* or /usr/bin/env python*.
> 
>    Modules
> 
>              This is code included in python "programs/scripts", and not
>            invoked directly (serving as library modules in compiled
>            languages).
> 
>      Modules can be categorized under two orthogonal criteria: firstly, based
>    on the whether or not they are implemented purely in python, like so:
> 
>    Pure Python Module
> 
>              These are python source code, to be interpreted by the Python
>            interpreter just like program/script code is, and may work across
>            many versions of Python.
> 
>    Extension Module
> 
>              Extensions are C code compiled and linked against a specific
>            version of the libpython library, and so can only be used by one
>            version of Python.

There should be no reason to link an extension against the python
library.  Many extensions developed upstream on Windows link to
libpython by default.  Other extensions linking against libpython are
those whose build infrastructure probably predates distutils.
python-semanage is an example (and should not link using -z defs).

Another thing to mention here is a "Python package", a directory
containing an __init__.py file plus modules and extensions.

>      Another way of categorizing modules is based on whether or not they are
>    available for use by third party scripts/modules.
> 
>    Public
> 
>              Public modules are available for use in other Python scripts or
>            modules using the import directive. They are installed in one of
>            the directories:
> 
>                 /usr/lib/pythonX.Y
> 
>                           This directory is reserved for official python
>                         modules. No other package apart from upstream
>                         official Python modules should install modules in
>                         this directory.
> 
>                 /usr/lib/pythonX.Y/site-packages
> 
>                           This is where most add-on modules live. Often,
>                         packages do not directly install modules here, but
>                         instead use utility packages like python-central and
>                         python-support to byte compile and install modules as
>                         needed.
> 
>                 /var/lib/python-support/pythonX.Y
> 
>                           Packages using python-support actually have their
>                         packages linked in from this directory, but no
>                         package should directly install modules there
>                         directly. See the documentation for python-support
>                         for details.

maybe shorten that to "all directories in sys.path"; not sure if an
explicit list of directories is needed.
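
To illustrate what I mean, here is a minimal sketch that derives the
list from the interpreter instead of hardcoding it in the policy text.
The helper name public_module_dirs is made up for this example and is
not part of any Debian tool.

```python
import os
import sys


def public_module_dirs(paths=None):
    """Return the existing directories from the import path, in order.

    Hypothetical helper for illustration only.
    """
    if paths is None:
        paths = sys.path
    seen = []
    for entry in paths:
        # An empty entry means the current directory; skip it, and skip
        # zip files or other entries that are not directories.
        if entry and os.path.isdir(entry) and entry not in seen:
            seen.append(entry)
    return seen


if __name__ == "__main__":
    for directory in public_module_dirs():
        print(directory)
```

Run with a given pythonX.Y, this prints the directories that
interpreter actually searches, whatever packaging tool put the modules
there.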

>              Packages may install public Python modules in directories
>            specific to Python packaging utilities -- which specify the
>            directories under which such modules should be droppped, and the
>            the structure of these directories is defined by the utilities
>            themselves. Please note that these directories are not in the path
>            for Python, and are not available for modules to be imported from.
>            At the time of writing, such uility specific directories include:
                                           ^^
> 
>            /usr/share/pycentral
>            /usr/share/python-support

These locations are tool-specific and should not be referenced
explicitly in the packaging scripts (debian/rules).

> 2. Goals of the new Python policy
> 
>      The new policy is designed to reduce the load on people packaging python
>    modules when one of the following events occur, and, by the dint of doing
>    so, ease the transition that occur as new Python versions are introduced,
>    old ones removed, and as the default version of Python changes, with
>    minimal impact on the target system. In case of the following events:

No, not the whole design goal.  Although the document is titled
"developer's view", the other goals should be mentioned as well.
These are meant to work around processes in debian which are currently
suboptimal, but unlikely to change.

 - The need to support more than one version of a python runtime or
   to support different implementations was seen.  It takes a while
   until applications support a new version.

 - The old schema of using pythonX.Y-foo packages lets packages land
   in the NEW queue when support for another python runtime is added
   to the package.  That is a process which could be addressed by the
   FTP masters (do not process a pythonX.Y+1-foo package manually if
   pythonX.Y-foo is already in the archive).

   Having pythonX.Y-foo mentioned in the control file would disallow
   binary NMUs in situations where a python runtime is dropped or
   added (the control file needs to be regenerated).  A solution would
   be to define a separate target to regenerate the control file,
   which is not called during the normal package build.  Such a source
   package would not be binNMUable, but could be the target of
   automated uploads.

 - Putting extension modules for more than one python version into
   a package eases transition of these packages to the testing
   distribution, provided that the package supports both the default
   python version in testing and the default python version in unstable.
   The schema used before with python-foo depending on
   python<default ver>-foo required an extra upload of every package
   containing an extension, adding new dependencies on new shared
   libraries in unstable, but not yet in testing. All packages having
   a python (<< X.Y) dependency had to be moved to testing at once.

   Having some python modules packaged as pythonX.Y-foo and others as
   python-bar leads to difficulties expressing dependencies for
   packages using a python version other than the default one.  Once
   an extension is packaged as pythonX.Y-foo, python-bar depending on
   pythonX.Y-foo needs to depend on all supported python versions, and
   an application depending on python-bar needs to depend on all
   python dependencies of pythonX.Y-foo.

   The proposal made by a packager to just support one python version
   dropping all versioning is as useful as renaming libc6-dev to
   libc-dev and then doing the transition to a libc6.x-dev (substitute
   the shared library with an unversioned -dev package of your choice).
 
   The alternative of dropping the python-foo package and just keeping
   the pythonX.Y-foo packages was not pursued any further (it needs
   ftp master intervention and rewriting of control files).

These are design decisions made for the distribution.  There are
concerns among some upstream developers that the design of dropping
the pythonX.Y namespace for packages may not be robust enough.

Another consequence of the current design: the default python version
*has* to be installed, other supported versions can be installed
additionally, not as a replacement.

>    New python version introduced
> 
>               *   Most pure Python modules with no restrictions on the
>                 version of Python supported, and those pure Python modules
>                 that only have a lower bound on the versions of python
>                 supported (for example, "2.3-", or "all"), would require no

"2.3-" -> ">= 2.3", not "all".  the range notation cannot express
things like "2.2, >= 2.4". The use case for the latter is the jython
package (now removed from testing), which is still at the
implementation level of the corresponding cpython version, with e.g.
2.3 no longer a supported python version.  So the following text
should use "set of versions" instead of "range of versions".
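
To make the "set" point concrete, a minimal sketch of how such a
specification could be expanded against the known versions.  This is
my own illustration, not the code of any packaging tool.  The list of
known versions is assumed, and so are the semantics: exact versions
are added to the result, while relational fields narrow one common
range.  "current" is deliberately not handled here.

```python
import re

# Assumed for illustration: the Python versions known to the archive.
KNOWN_VERSIONS = ("2.2", "2.3", "2.4", "2.5")

_FIELD = re.compile(r"^(>=|<=|<<|=)?\s*(\d+\.\d+)$")


def _key(version):
    """Turn "2.4" into (2, 4) so that versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def supported_set(spec, known=KNOWN_VERSIONS):
    """Expand a version specification into an explicit set of versions."""
    exact = set()
    in_range = set(known)
    relation_seen = False
    for field in spec.split(","):
        field = field.strip()
        if field == "all":
            return set(known)
        match = _FIELD.match(field)
        if match is None:
            raise ValueError("cannot parse %r" % field)
        op, version = match.groups()
        if op in (None, "="):
            exact.add(version)
            continue
        relation_seen = True
        if op == ">=":
            in_range = {v for v in in_range if _key(v) >= _key(version)}
        elif op == "<=":
            in_range = {v for v in in_range if _key(v) <= _key(version)}
        else:  # "<<" means strictly lower
            in_range = {v for v in in_range if _key(v) < _key(version)}
    # Exact versions only count if they are known at all.
    result = exact & set(known)
    if relation_seen:
        result |= in_range
    return result


if __name__ == "__main__":
    print(sorted(supported_set("2.2, >= 2.4")))   # ['2.2', '2.4', '2.5']
    print(sorted(supported_set(">= 2.3, << 2.5")))  # ['2.3', '2.4']
```

The first call yields a set with a hole at 2.3, which no single range
can describe.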

>      The new policy also reduces the numbers of packages in the archive, by
>    supporting multiple versions of Python in the same binary package (at the
>    cost of increased size of that one package, but it should still result in
>    space saving.)

Maybe mention the two cases, where the package size increases:

 - extension modules
 - pure modules where different versions of the upstream package are
   shipped and are directly installed into /usr/lib/pythonX.Y/

> 3. Recipe for developers

>     3.1.1. Python versions supported by the source
> 
>      The XS-Python-Version field in debian/control specifies the versions of
>    Python supported by the package [30][1]. While this is a requirement only
>    if using the utility package python-central (python-support, for example,
>    prefers debian/pyversions), setting this is "appreciated" in any case,
>    according to the [31]new policy wiki[32][2]. This is used to track
>    packages during Python transitions.
                                       ... and test rebuilds.

>      This can be a single version, or one or more of a list of
>    non-overlapping ranges. The lowest range may optionally omit a low end,
>    and the highest range may optionally omit an upper end. In other words,
>    the overall range may be open ended. The ranges are often matched to the
>    set of all known Python version that have existed, and the supported set
>    is the intersection of the known versions of python and the range
>    specification.

XS-Python-Version can have the values "current" and "current_ext" as
well (plus the list of ranges), which will expand to "current", if the
package does not have any extensions and can be used with another
python default version without a new upload. It's replaced by the
version number of the current default version in the Python:Versions
substitution variable. "current_ext" normally only needs to be used
for packages having a private extension module. dh_pysupport doesn't
use this information, but requires the developer to explicitly pass
the directory containing the extension module.

>     1.   If the current version of Python is supported by the package, then:
> 
>           *   For packages with private modules or private extensions
>             compiled for the current python version and for applications
>             using /usr/bin/python, this should be set to the string "all" (or
>             "-", in the case of debian/pyversions). [33][3] If the module
>             doesn't work with all Python version, the range of versions
>             supported should be used [34][4]
> 
>           *   For packages with public modules, this should be set to the
>             string "all" (or "-", in the case of debian/pyversions), unless
>             not all versions of Python are supported (in which case the
>             setting should specify the versions or range of versions actually
>             supported, like ">= 2.4" or ">= 2.2, << 2.y".

again, "current" is a legitimate value.

>     3.1.2. Depends:
> 
>      The rules for calculating the dependencies a package has are simple.
> 
>     1.   If a script invokes /usr/bin/pythonX.Y, than the package must depend
>        on pythonX.Y. This is because no amount of automatic byte compiling
>        would ever get rid of the requirement that /usr/bin/pythonX.Y has to
>        be present for the script to function.

I think that is too strict.  The current behaviour depends on the
dh_python implementation scanning all files for that interpreter
line.  Consider a package with scripts foo, foo2.3 and foo2.4 in
/usr/bin, calling python, python2.3 and python2.4: that would lead to
a dependency on all supported python versions.  The scripts work for
the default python version, and for the non-default python versions
only if the corresponding pythonX.Y package is installed.
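
A minimal sketch of that kind of scanning, to show how the foo, foo2.3,
foo2.4 case ends up with three interpreter packages.  This is not the
dh_python implementation; the function names and the regular
expression are assumptions made for this example.

```python
import re

# Matches "#!/usr/bin/python", "#!/usr/bin/python2.4" and
# "#!/usr/bin/env python2.4"; group 1 is the version suffix, if any.
_SHEBANG = re.compile(r"^#!\s*/usr/bin/(?:env\s+)?python(\d+\.\d+)?(?:\s|$)")


def interpreter_package(first_line):
    """Return the package providing the interpreter named in a #! line.

    Returns None if the line does not invoke a Python interpreter.
    """
    match = _SHEBANG.match(first_line)
    if match is None:
        return None
    version = match.group(1)
    return "python" + version if version else "python"


def interpreter_packages(scripts):
    """Map {script name: first line} to the set of interpreter packages."""
    packages = set()
    for first_line in scripts.values():
        package = interpreter_package(first_line)
        if package is not None:
            packages.add(package)
    return packages


if __name__ == "__main__":
    scripts = {
        "foo": "#!/usr/bin/python",
        "foo2.3": "#!/usr/bin/python2.3",
        "foo2.4": "#!/usr/bin/env python2.4",
    }
    print(sorted(interpreter_packages(scripts)))
    # ['python', 'python2.3', 'python2.4']
```

Turning every element of that set into a hard dependency is exactly
what I consider too strict for such a package.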

>     2.   For package that contains extensions, the range of Python versions
>        required has to be restricted to Python versions for which the
>        extensions have been built and shipped in the package. For packages
>        with private extension modules, this means that the range of python
>        versions it depends on has to be set to whatever version of Python was
>        used during the build process (since private extension module packages
>        can only be compiled for one version of Python at any time).

not limited to extensions only, but applies to any other package
hardcoding some kind of version information in the packaged files.

>      Packaged modules that require other modules to work, must depend on the
>    corresponding python-foo packages. They must not depend on any
>    pythonX.Y-foo package directly.
> 
>      Packaged modules available for only one particular version of Python
>    that need other modules (say, "bar"), must depend on the corresponding
>    pythonX.Y-bar packages, and must not depend on any python-bar. For
>    consistency, if the package ("foo") provides several pythonX.Y-foo
>    packages, and it needs the module "bar", it must also depend on
>    pythonX.Y-bar corresponding to each version "X.Y" for the virtual packages
>    pythonX.Y-foo that it provides.
> 
>    --------------------------------------------------------------------------
> 
>     3.1.3. Provides
> 
>      Packages with public modules and extensions should be named, or should
>    provide, python-foo. Pure Python public modules that support all Python
>    versions need not have a Provides field.

... unless there is an application using a non-default python version
using this module.  Otherwise you require the application to depend on
any indirect dependency of python-foo.

>      For package that contains public extensions, the range of versions
>    supported has to be restricted to Python versions for which the extensions
>    have been built and shipped in the package.
> 
>      Public pure Python modules that have a subset of all python versions
>    supported, or for public extensions, the Provides field indicates which
>    versions of Python are supported (for which one may import the module).
>    For every version of python supported the package should provide
>    pythonX.Y-foo packages. This assumes that the package correctly depends on
>    all the appropriate versions of any version specific module that it itself
>    requires.
> 
>    --------------------------------------------------------------------------
> 
>     3.1.4. Build-Depends:
> 
>      If the package provides public extension modules, then build depending
>    on "python-all-dev (>= 2.3.5-11)" shall ensure that all the >pythonX.Y-dev
>    packages are available during building.

Limiting this to public extension modules is not robust, if a pure
python module hardcodes version information.

>   3.2. Deprecating "current" in versions supported

There is currently no agreement about that.

>      Currently, the string "current" in the field XS-Python-Version is used
>    by python-central to indicate that the package contains private modules,

that is wrong.

>    and explicitly state that the package is only built for the current Python
>    version, and not for any other version supported in Debian. This is
>    flawed, for the following reasons:
> 
>      *   The value corresponds to the version of Python the package is
>        currently built for; but in all other cases the semantics of the
>        XS-Python-Version field is to indicate which versions of Python are
>        supported by the package, and indicates compatibility, not the version
>        it is currently built against. So this special case breaks the
>        semantics of the field.

that is the reason that "current, >= 2.3" is supported.

>      *   By hijacking the XS-Python-Version field to indicate the version of
>        Python it is currently compiled against, it hinder s the propagation
>        of compatibility information, so the ability of the maintainer to
>        indicate the range of Python versions this package is compatible with.

Not sure I understand that paragraph, what do you want to say?

>      *   The information conveyed by this field is redundant; it should be
>        clear that the package contains private modules, based on the
>        directories the modules are shipped in, and also the fact that it
>        ought to build depend on python-dev and not on python-all-dev.

python-all-dev does not differentiate between "current" and
"current_ext";  dh_pysupport has an implementation decision to hide
that information as a parameter to dh_pysupport. That information
should be visible in the control file.

>      *   The semantics of "current" are not fixed, since they depend on the
>        contents of the package python-defaults, and are ill suited for the
>        debian/control file.

again, there's nothing wrong with that, as you can use current
together with a list of supported versions.

>   3.3. Script
> 
>      These are executable scripts which start with the magic string #!.
> 
>    --------------------------------------------------------------------------
> 
>     3.3.1. Supported versions
> 
>      If a script invokes /usr/bin/pythonX.Y, then the version supported by
>    the source package (XS-Python-Version or debian/pyversions) should be
>    restricted to X.Y, assuming that the field is being provided. Or else, it
>    should be set to the list of python versions that the script can support,
>    or "all". [36][6] If there are separate scripts that invoke different
>    versions of Python, then all these versions must be in the Depends field
>    -- if you still want to continue packaging instead of just shooting the
>    upstream.
> 
>      No script must use /usr/bin/python if it needs a Python version strictly
>    greater or strictly lower than the current version.

see above (packages having foo, foo2.3, and foo2.4 scripts), that
requirement should not be hardcoded in that case.

>   3.5. Private Extension
> 
>      These are compiled files linked to python libraries, and kept in a

no; in most cases an extension does not need to be linked against
libpython.

>    private directory. Since these files are compiled with one specific
>    version of python, and do not live in versioned directories, only one
>    version of python is supported at any given time.
> 
>    --------------------------------------------------------------------------
> 
>     3.5.1. Supported versions
> 
>      The version supported by the source package (XS-Python-Version or
>    debian/pyversions) is either the specific version of Python supported, or
>    "all" [40][10], if there are no specific restrictions based on Python
>    version.
> 
>      If a single version of Python is supported, then the versions supported
                                                            ^ singular!

>    by the binary package (XB-Python-Version field or the file .versions) is
>    set to that version (copied from XS-Python-Version).

>    If the current version is not supported, this field it set to the
>    minimum version actually supported by the module.

what do you mean?

>    If the current version is supported (or
>    there are no restrictions on the version of python supported), then this
>    field is set to the current version.

>   3.6. Public pure Python Module
> 
>      Public modules should be packaged with a name of python-foo, where foo
>    is the name of the module. Such a package should support the current
>    Debian Python version, and more if possible.
> 
>      There are two kinds of public pure Python modules, the most common being
>    the variety that live in unversioned public module directories, and, in
>    rare cases, pure python modules that live in versioned public module
>    directories. The latter is usually the case when the pure Python module
>    imports an public extension module from the same directory, and thus the
>    public extension and pure python modules must be in the same directory.
>    Otherwise, pure python modules should live in an unversioned public module
>    directory.

What is the intention of that paragraph? I would understand
/usr/lib/site-python as "unversioned public module directory", and
/usr/lib/pythonX.Y/site-packages as "versioned public module
directory", but these terms are not defined, and you do seem to have a
different understanding.

>      Depending on the packaging utility used, the modules live in either
>    /usr/share/python-central or in /usr/share/python-support/$package.

that is wrong. 1) there's nothing wrong with having these still in
/usr/lib/pythonX.Y/site-packages.  2) please avoid naming these
directories in the document. They should be considered private
directories of the tools. In the case of pycentral, you can get the
directory name using "pycentral pycentraldir <package name>".

>      Official pure Python modules generally live in a different set of

official?

>    directories than unofficial ones, but are otherwise treated exactly like
>    other public pure Python Module which live in unversioned directories as
>    detailed below.

[skipping 3.6.x] for review

>    --------------------------------------------------------------------------
> 
>     3.6.1. Byte compiling
> 
>      In the common case of pure Python modules in unversioned public module
>    directories, tools exist to help byte compile the pure Python modules for
>    all versions of Python installed on the target system. In case of pure
>    Python modules in versioned public module directories, byte compilation is
>    up to the package scripts.

maybe that's not the best place to mention /etc/python/debian_config,
but scripts byte-compiling files should honor the byte-compile property.
Packages should only byte-compile the files belonging to the package,
or else error messages for byte-compilation are reported for random
packages.
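
A minimal sketch of the second point, using the standard py_compile
module.  The function name and the "enabled" parameter are assumptions
for this example; "enabled" merely stands in for the site-wide
byte-compile setting, whose parsing is not shown.

```python
import py_compile


def byte_compile_package_files(files, enabled=True):
    """Byte-compile only the given files; return {path: error message}.

    `files` is the list of files shipped by one package, so any error
    can be attributed to that package and to no other.
    """
    failures = {}
    if not enabled:
        return failures
    for path in files:
        if not path.endswith(".py"):
            continue
        try:
            # doraise=True turns compile errors into exceptions instead
            # of messages printed to stderr.
            py_compile.compile(path, doraise=True)
        except py_compile.PyCompileError as exc:
            failures[path] = exc.msg
    return failures
```

A maintainer script would pass in the package's own file list rather
than walking a whole directory shared with other packages.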

3.8 missing: Packages using embedded python interpreters
(libapache2-pythonX.Y, which should not be collapsed as
libapache2-python), vim, and maybe other packages.

>     4.1.1. rtupdate script invocation
> 
>     1.   in the pre-installation phase of the python package, the package
>        supplied scripts are called with the parameter: pre-rtupdate <old
>        runtime> <new runtime>
> 
>          A failure in any script results in the failure of the
>        pre-installation script of the python package.
> 
>        [42]Note   Whether or not all scripts are run, or the process aborts
>                 at the first failure, is still under flux
> 
>        Since such a failure of a script would leave all packages whose
>        pre-rtupdate has been run in a dangling state, a bug in a pre-rtupdate
>        will always be a critical bug. Be very very careful when working on a
>        pre-rtupdate script.

I'm adding a "failed-pre-rtupdate" hook for the next upload, which is
run, if the pre-rtupdate hook fails (to allow the package to go to a
sane state again).
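
Roughly what I have in mind, as a sketch only.  The hooks are modelled
as Python callables instead of scripts, the run aborts at the first
failure (which, as noted, is still in flux), and which packages get
the failed-pre-rtupdate call is my assumption here: every hook
attempted so far, most recent first.

```python
def run_pre_rtupdate(hooks, old_runtime, new_runtime):
    """Run each hook's pre-rtupdate step; unwind on the first failure.

    `hooks` is a list of callables taking (action, old, new) and
    returning an exit status, 0 meaning success.  Returns True if all
    hooks succeeded.
    """
    attempted = []
    for hook in hooks:
        attempted.append(hook)
        if hook("pre-rtupdate", old_runtime, new_runtime) != 0:
            # Assumption: every hook attempted so far gets the chance
            # to restore a sane state, most recent first.
            for done in reversed(attempted):
                done("failed-pre-rtupdate", old_runtime, new_runtime)
            return False
    return True
```

A False return would still make the preinst of the python package
fail, but the packages already touched are no longer left dangling.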


I do have a different view on raising build and support information to
the Packages/Sources files; in other cases the document clarifies
things well.  IMO the current section 3.6 makes things more confusing
than they are (at least for me).  Would it be helpful to add
paragraphs starting with "Example:" in sections where they are useful? 
E.g. most package maintainers won't need the rtupdate scripts, and
could therefore skip the sections describing them.


  Matthias


