
ABI incompatibility problems between Perl versions



On Wed, May 28, 2014 at 08:10:50AM +0200, Guillem Jover wrote:
> On Tue, 2014-05-27 at 23:56:03 +0300, Niko Tyni wrote:

> > In the fallback option, dpkg-dev should probably set PERL_DL_NONLAZY=1
> > before trying to load File::FcntlLock::XS. See #479711. 
> 
> > (Hm, my preliminary testing indicates that 5.20.0 may introduce new
> >  challenges around PERL_DL_NONLAZY.  Urgh. Will investigate.)
> 
> Thanks, will take that into account! And I'm obviously interested in
> any results from that investigation, which might imply having to switch
> to always use the pure version for example. :)

OK, it looks like we may have a problem.

Summary: Perl 5.18 segfaults when loading 5.20 XS modules, and Perl 5.20
segfaults when loading 5.18 XS modules. This may break upgrades. The
most robust fix I can think of is to start using a version specific
'vendorarch' directory instead of /usr/lib/perl5.


We've been using a version-independent directory (/usr/lib/perl5) for
XS modules since at least 2001. The dependencies on perlapi-<version>
are supposed to ensure binary interface compatibility between the Perl
interpreter and the dynamically loaded plugins (*.so).

However, there are necessarily periods during upgrades where these
dependencies are not satisfied: either a newer version of perl-base or a
newer version of an XS module may be unpacked but not yet configured. In
particular, the <old-prerm upgrade> phase of maintainer scripts can get
called in this situation.

In the Perl 5.10 transition, when we first hit this (#479711),
/usr/bin/update-alternatives was implemented in Perl and tried to use
Locale::gettext if it was available. This caused an untrappable fatal
error from the dynamic linker when the ABI versions were incompatible,
which was fixed/worked around by setting PERL_DL_NONLAZY=1 in the
environment so that the error became trappable.
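The effect of making the failure trappable can be pictured with a rough
Python analogy (this is not Perl, and not code from dpkg or debconf). Once
a load failure surfaces as an ordinary error, the caller can catch it and
move on to a fallback. The helper name and the first module name below are
hypothetical. In Perl the corresponding shape would be a `require` inside
an `eval` block, with PERL_DL_NONLAZY=1 already set in the environment.

```python
import importlib
from types import ModuleType
from typing import Optional, Sequence


def load_first_available(candidates: Sequence[str]) -> ModuleType:
    """Return the first module in `candidates` that can be imported.

    Hypothetical helper: each failed load is trapped as ImportError, so
    the caller falls through to the next (slower but safer) candidate.
    """
    last_error: Optional[ImportError] = None
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as exc:
            # Trappable failure: remember it and try the next candidate.
            last_error = exc
    raise ImportError(f"none of {list(candidates)} could be loaded") from last_error


# "hypothetical_fast_backend_xs" stands in for a compiled module that fails
# to load; "json" stands in for a pure fallback that is always available.
backend = load_first_available(["hypothetical_fast_backend_xs", "json"])
```

The analogy only covers the trappable case. A segfault in the interpreter
itself cannot be caught this way, in Python or in Perl.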

Later, upstream added a version check (the macro XS_APIVERSION_BOOTCHECK)
at the start of the boot_<module> subroutine that every XS module
has. This seemed to be an improvement, even if it didn't completely
remove the need for PERL_DL_NONLAZY.

Now, both 5.18 and 5.20 apparently introduced changes to the interpreter
variables in the binary interface that broke both of these guards.  I
won't go into the details here, but it looks like ld.so is happy because no
symbol names have been changed or removed, while XS_APIVERSION_BOOTCHECK
comes too late: a wrong pointer has already been dereferenced and caused
a segfault.
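As a generic illustration of how a name-based check can pass while the
data layout no longer matches, here is a small Python/ctypes sketch. It is
not the actual 5.18 or 5.20 change (which I'm not detailing here); the
structure and field names are made up. Code built against the old layout
reads a field at its old offset and gets whatever the new layout put there.

```python
import ctypes


class InterpOld(ctypes.Structure):
    """Hypothetical interpreter struct as seen by an old-ABI plugin."""
    _fields_ = [("stack_ptr", ctypes.c_uint32),
                ("api_version", ctypes.c_uint32)]


class InterpNew(ctypes.Structure):
    """Same field names, but a new member shifts api_version by 4 bytes."""
    _fields_ = [("stack_ptr", ctypes.c_uint32),
                ("new_field", ctypes.c_uint32),
                ("api_version", ctypes.c_uint32)]


def view_as_old(new: InterpNew) -> InterpOld:
    """Reinterpret the start of a new-layout struct using the old layout."""
    raw = bytes(new)[:ctypes.sizeof(InterpOld)]
    return InterpOld.from_buffer_copy(raw)


new = InterpNew(stack_ptr=1, new_field=0xDEAD, api_version=20)
stale = view_as_old(new)

# The old-layout reader finds new_field where it expects api_version.
print(hex(stale.api_version))  # prints 0xdead
```

If the misread value were a pointer rather than an integer, dereferencing
it would be enough to crash before any later version check gets to run.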

I'm not quite sure how bad this is for us. The key scripts in dpkg
that caused problems earlier have since been rewritten in C. However,
I see that various parts of debconf still have PERL_DL_NONLAZY settings
indicating they at least used to get loaded in ABI incompatible contexts.

I can see some possible avenues to work around the segfaults, like
intentionally renaming a symbol in 5.20 so that ld.so would catch the
incompatibility. However, I note that mixing ABI incompatible code is
something that upstream doesn't really support. As such, nobody tests
that loading ABI incompatible XS modules fails gracefully, and I don't
expect that "fixing" this is a priority for them. At the very least, it
would require us to diverge from upstream for 5.20, as upstream obviously
can't change that ABI anymore.

I think a much cleaner and more robust fix would be to move to version
specific directories for XS modules ($Config{vendorarch}) so that ABI
incompatible modules will not be visible to the Perl interpreter at
all. This would be just a slight generalization of the @INC multiarch
change discussed in #748380 and needs the same amount of fixing in the
archive (hardcoded references to /usr/lib/perl5, currently in at least
60 or so packages).
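A toy model of the idea, in Python rather than Perl, may help. It assumes
a layout of one subdirectory per API version under /usr/lib/perl5, which
is only a guess for illustration; the real value would be whatever
$Config{vendorarch} ends up being. The function names are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Dict, Set

LEGACY_DIR = PurePosixPath("/usr/lib/perl5")


def vendorarch_for(api_version: str) -> PurePosixPath:
    """Hypothetical version-specific directory for XS modules."""
    return LEGACY_DIR / api_version


def visible_modules(installed: Dict[PurePosixPath, Set[str]],
                    api_version: str) -> Set[str]:
    """Modules an interpreter of `api_version` can see.

    Only its own vendorarch is searched, so modules built for another
    ABI are never candidates for loading.
    """
    return set(installed.get(vendorarch_for(api_version), set()))


# Mid-upgrade state: one module still built for 5.18, one already for 5.20.
installed = {
    vendorarch_for("5.18"): {"Locale::gettext"},
    vendorarch_for("5.20"): {"File::FcntlLock"},
}

seen_by_520 = visible_modules(installed, "5.20")
```

In this model a 5.20 interpreter simply does not find the 5.18 build of
Locale::gettext, so the failure is an ordinary "module not found" rather
than a crash.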

It would also make it possible to create coinstallable packages of the
same XS module for different Perl versions, albeit probably with rather
ugly names (maybe liblocale-gettext-perl-5.20). I think Russ has proposed
something like that in the past as a solution for the upgrade problems.

There's a handful of non-XS modules that contain architecture dependent
code and therefore install into vendorarch (at least libpar-packer-perl,
libanyevent-perl, and libcommon-sense-perl.) The above scheme would mean
that these would need a dependency on perlapi-* (or something equivalent)
to make sure perl looks in the right directory.  I don't think this is
a problem.

(Guillem: unfortunately I think a package offering File::FcntlLock::Pure
would be in the same set and we may be back at square one in that area...)

Thoughts?
-- 
Niko Tyni   ntyni@debian.org

