
Bug#876055: Environment variable handling for reproducible builds



(Re-sending this to the bug rather than to debian-policy, sorry for the
duplicate on -policy)

On Mon, 18 Sep 2017 at 18:00:51 -0700, Vagrant Cascadian wrote:
> There is a huge difference between variables that *might* affect the
> build as an unintended input that gets stored in the resulting packages in
> some manner, and variables that are designed to change the behavior of
> parts of the build toolchain.
> 
> I consider unintended variables that affect the build output a bug, and
> variables designed and intended to change the behavior of the toolchain
> expected, reasonable behavior.

There is a *huge* number of variables that are intended to change
behaviour, and may or may not affect the behaviour of this specific
package. Which of your categories are these in?

For example, basically any well-behaved programming language or
programming-language-like environment has an equivalent of PYTHONPATH,
PERL5LIB, PKG_CONFIG_PATH and similar variables, which will pull in
arbitrary code (perhaps from /opt or ~/.local or something) with
arbitrary behaviour; and increasingly many tools respect the XDG Base
Directory spec, so XDG_DATA_HOME and XDG_DATA_DIRS provide a search path
for things normally found in /usr. I don't think it's desirable for
maintainers to feel that every debian/rules needs to start with something
like this (note that this list is very incomplete; I could list dozens
more without trying very hard):

    undefine GI_TYPELIB_PATH
    undefine LD_LIBRARY_PATH
    undefine PERL5LIB
    undefine PKG_CONFIG_PATH
    undefine PYTHONPATH
    export PATH = /usr/bin:/usr/sbin:/bin:/sbin
    export XDG_DATA_DIRS = /usr/share

... and indeed if a maintainer did that, that would make it needlessly
difficult for another maintainer to test-build their package against
their new version of Perl or Python or GObject-Introspection or whatever.

Similarly, there is an intractably huge number of environment variables
that can affect the result of Automake and make. Do you know about all
of them? Including RM, PC, AR, LOADLIBES (and those are just for make's
implicit rules)?

I think the assumption has to be that every environment variable is
potentially intended to affect the build unless otherwise stated,
because the set of environment variables that *could* affect the build
is extremely large. It would be most useful if we were to identify a
restricted subset of environment variables for which there is consensus
that the variable is meant to be merely user preference and shouldn't
affect the build - even better if there's some document like the devref
that lists whether it is more appropriate for a package maintainer to
unset each of those variables or reset them to some initial value if
they become a problem.

Perhaps those variables should be a whitelist, or perhaps there is
some wording for Policy that would identify them while excluding the
legitimately build-affecting ones - but either way I think the
assumption should be "there is a limited subset of environment
variables that may be varied freely, and the build is required to stay
reproducible when they are; the rest are uninteresting".
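
As a rough sketch of what that default could look like in practice,
the following Python splits an environment into "listed as mere user
preference" and "everything else, assumed to be a deliberate build
input". The names in PREFERENCE_ONLY are purely hypothetical
placeholders chosen for illustration; neither Policy nor dpkg defines
this set, and classify() is not an existing API.

```python
import os

# Hypothetical allowlist, for illustration only: names assumed to be pure
# user preference. No such list is defined by Policy, devref or dpkg.
PREFERENCE_ONLY = frozenset({"EDITOR", "PAGER", "TERM", "COLORTERM"})


def classify(env, preference_only=PREFERENCE_ONLY):
    """Split env into (must_not_affect_build, assumed_build_affecting).

    Anything not explicitly listed as preference-only is treated as a
    potentially intentional build input, which is the default argued for
    in the surrounding text.
    """
    must_not_affect = {}
    assumed_affecting = {}
    for name, value in env.items():
        if name in preference_only:
            must_not_affect[name] = value
        else:
            assumed_affecting[name] = value
    return must_not_affect, assumed_affecting


if __name__ == "__main__":
    harmless, rest = classify(dict(os.environ))
    print("expected not to change the output:", sorted(harmless))
    print("assumed to be intentional build inputs:", len(rest), "variables")
```

The point of the design is that the list to maintain is the short one:
a variable only becomes a reproducibility requirement once someone adds
it to the set.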

The environment variables that are not sanitised by debuild
might be an interesting starting point for classification -
we know that for debuild users, the rest do not matter in
practice. Dpkg::Build::Info::get_build_env_whitelist() is probably
another interesting set (particularly since it's used by recent sbuild).

> In practice, for the vast majority of packages in Debian, it is a
> relatively small number of environment variables to get fairly solid
> reproducibility coverage... at least from what we've seen so far.

Set PERL5LIB to a location containing libraries that change gtk-doc
behaviour. All packages that use gtk-doc in buster (where gtk-doc was
written in Perl) are now unreproducible.

Set PYTHONPATH to a location containing libraries that change gtk-doc
behaviour. All packages that use gtk-doc in sid (where gtk-doc was
translated into Python) are now unreproducible.

Set LD_LIBRARY_PATH or LD_PRELOAD to interpose libc functions and change
their behaviour. Basically everything is now unreproducible.

Set SHELL to a non-POSIX shell. Basically everything is potentially now
unreproducible.

I don't think trying to address those is a route that Debian should go
down.
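
To make the PYTHONPATH case above concrete, here is a minimal sketch in
Python. It runs the same stand-in "build step" twice, once with
PYTHONPATH unset and once with it pointing at a temporary directory
containing a module. The module name hypothetical_build_helper and the
helper functions are invented for this illustration and have nothing to
do with gtk-doc's real code.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in for a build tool: it uses an optional helper module if one can
# be imported, and otherwise falls back to its default behaviour.
CHILD_CODE = (
    "try:\n"
    "    import hypothetical_build_helper as helper\n"
    "    print(helper.BANNER)\n"
    "except ImportError:\n"
    "    print('default output')\n"
)


def run_build_step(extra_env):
    """Run the stand-in build step in a child interpreter; return its output."""
    env = dict(os.environ)
    env.pop("PYTHONPATH", None)  # start from a known state
    env.update(extra_env)
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def demo():
    """Return (output without PYTHONPATH, output with PYTHONPATH set)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "hypothetical_build_helper.py")
        with open(path, "w") as f:
            f.write("BANNER = 'output changed by PYTHONPATH'\n")
        return run_build_step({}), run_build_step({"PYTHONPATH": tmp})


if __name__ == "__main__":
    clean, tainted = demo()
    print("PYTHONPATH unset:", clean)
    print("PYTHONPATH set:  ", tainted)
```

Nothing here is a bug in the tool: the variable is doing exactly what
it was designed to do, which is why I would treat it as an intended
build input rather than something packages must be robust against.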

Regards,
    smcv

