Perl symbol problem - release critical (Re: Bug#489132)
Raphael Hertzog writes ("Bug#489132: lenny release notes, upgrade dpkg first"):
> To work-around a problem that can happen in the perl 5.10 upgrade (see
> #479711), the perl scripts contained in dpkg (update-alternatives,
> dpkg-divert) have been modified... but for the work-around to be used, the
> new dpkg must obviously be installed first, before the dist-upgrade.
I don't think this is the right solution. To be honest I'm just
astonished at this situation, which is terrible. It is the
consequence of a mistake in the Debian Perl policy - a mistake which
has caused trouble on every previous upgrade, too.
Here is a summary of the problem:
Perl extensions (XS modules) are not compatible across Perl versions
due to ABI changes. For this reason Perl upstream put the Perl
version number in the paths at which modules are installed. The
Debian Perl maintainer has decided not to do this.
As a result, if you try to load a Perl extension from a script when
the versions of the Perl interpreter (in perl-base) and the module
(in a different package) are incompatible, you are trying to load a
library .so with an incompatible ABI.
As it happens, because of lazy symbol resolution, this is detected
very late: after Perl thinks it has loaded the module, the runtime
linker finds a missing symbol and has no option but to kill the
process.
(If it weren't for that, then loading the library would fail; the
Essential scripts which are trying to load the module would then fall
back, so the system would remain functional in a basic way and could
recover.)
As a result, it is possible for a situation to arise where Essential
scripts in the dpkg package (and presumably in other packages) don't
work, without any of the dependencies having been violated. Unless
you're an expert, once your system is in this state you're hosed.
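As a rough illustration of the mechanism summarised above, here is a
toy model in Python. It is not real Perl or dpkg code: the directory
layout, the function names and the three outcome strings are all
invented for the sketch. It only contrasts what happens when the Perl
version is part of the module path with what happens when it is not.

```python
# Toy model only: names and paths are hypothetical, not Perl or dpkg internals.

def module_dir(perl_version, versioned):
    """Directory an XS module is installed into (invented layout)."""
    if versioned:
        return f"/usr/lib/perl/{perl_version}"
    return "/usr/lib/perl"


def load(interpreter, installed, versioned):
    """Model one attempt to load an XS module.

    `installed` maps a directory to the Perl version the module found
    there was built against. Returns 'loaded', 'fallback' or 'killed'.
    """
    built_for = installed.get(module_dir(interpreter, versioned))
    if built_for is None:
        # Module not found: the load fails cleanly and the script falls back.
        return "fallback"
    if built_for != interpreter:
        # Module found and apparently loaded, but the ABI is wrong:
        # a missing symbol is hit later and the process dies.
        return "killed"
    return "loaded"


def scenario(versioned):
    """Module package already upgraded to 5.10, interpreter still 5.8."""
    installed = {module_dir("5.10", versioned): "5.10"}
    return load("5.8", installed, versioned)


print("versioned paths:  ", scenario(True))
print("unversioned paths:", scenario(False))
```

With versioned paths the old interpreter simply does not see the new
module; without them it picks up a module built for the wrong ABI.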
Some observations and opinions:
* This problem is clearly release critical. I don't think documenting
a release critical bug of this severity in the release notes is
acceptable. Furthermore, the proposed workaround is very cumbersome
due to the necessary installation ordering.
* The Debian Perl maintainer's decision to remove the Perl version
number from the module path is clearly wrong. Here is what upstream
have to say, and the Debian Perl maintainer's explanation:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=479711#85
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=479711#120
This explanation is basically that `Debian's dependency system
means that it will work anyway'. This is
(a) not a reason to deviate from upstream - at best it is only a
lack of a reason not to deviate;
(b) false.
That it is false can be seen from the fact that a problem like this
has happened for the last three releases: #158835, #278417,
and now #479711. (There may be other reports of course.)
* Suppressing lazy symbol resolution may work in this case, but it is
not correct. ABI changes may result in random crashes due to
different structure sizes and do not necessarily involve missing
symbols - so the problem may go undetected. If we think that we
want to fix it in etch->lenny by suppressing lazy symbol resolution,
we need to:
(a) check what the actual ABI differences are and that either
there aren't any others besides missing symbols, or that
every module will definitely fail to load
(b) regard this as a workaround and do something sensible next
time.
* One of the Perl upstream commenters in #479711 suggests that the
answer is to use a `pre-inst dependency'. Apparently none of the
submitters have realised that dpkg already has this and calls it
Pre-Depends. However, a Pre-Depends doesn't solve this problem
because there is no correct order to upgrade the packages:
regardless of whether you upgrade Perl first, or the modules first,
something may break.
* The fundamental problem is that there are currently some Perl
module packages in lenny whose dependencies are not violated
by unpacking them into an etch system, but which will break the
execution of essential packages. This definitely cannot be fixed
without changing at least those Perl module packages (because
an etch system will be willing to install the broken Perl module
packages right now, and the only thing currently stopping it doing
so is that lenny isn't released).
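The point about upgrade ordering can be checked with a small
enumeration. The sketch below is a simplification under stated
assumptions: packages are unpacked one at a time, modules live in an
unversioned path, and an Essential script might run between any two
unpacks. The package names "perl" and "module" are placeholders.

```python
from itertools import permutations


def broken_states(order):
    """Walk one unpack order from etch (5.8) to lenny (5.10).

    Returns every state reached along the way in which the interpreter
    and the XS module were built for different Perl versions.
    """
    state = {"perl": "5.8", "module": "5.8"}
    broken = []
    for package in order:
        state[package] = "5.10"
        if state["perl"] != state["module"]:
            broken.append(dict(state))
    return broken


for order in permutations(["perl", "module"]):
    print(order, "->", broken_states(order))
```

Both orders pass through a mismatched state, which is why an ordering
constraint such as Pre-Depends cannot remove the window on its own.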
Possible solutions that I see for lenny:
1. Reinsert the Perl version number in the Perl module packages.
This is the correct long-term solution but involves at least
rebuilding about 300 packages.
2. Find out which modules are used in this way by Essential packages.
Arrange somehow for those modules to fail at `require' when loaded
with Perl 5.8 from etch. This might involve rebuilding only
those modules.
3. Make the lenny Perl 5.10 package _also_ look in the directory with
5.10 in the name. Change the module(s) used by Essential packages
to put their modules in that directory. Make the lenny Perl 5.10
package suppress RTLD_LAZY always. (Specialisation of the above.)
4. Tell everyone in the release notes that it's hideously broken,
and give them an error-prone 6-rune recipe for upgrading, which if
not followed will break their system. This is as requested in
#489132. This is, I think, hopeless. What is the point of Debian if
it can't manage to get an upgrade right?
Perhaps someone has some other ideas.
Ian.