
Re: Are soname bumps required when library upgrades break compatibility?



On Tue, 11 Sep 2007 09:20:27 -0700
Brandon <winterknight@nerdshack.com> wrote:

> >> I'm asking mostly for bug reporting. I don't maintain any libraries.
> >> When I file a bug report against a library for breaking ABI
> >> compatibility without bumping the soname, do I report it as serious?
> >> Or just important? What would the justification be for reporting as
> >> serious if there is no policy regarding it?
> 
> >Yes, serious at least, but I'd even say "grave": "renders package
> >unusable" (by its dependencies) in reportbug.
> 
> That is a bit of a stretch. 

Not at all - the library is at fault and the bug should be serious or
grave. Some libraries would justify 'critical' for this kind of
breakage. Imagine if glibc updated to libc7 code (or downgraded to
libc5 code) whilst retaining SONAME=6. Policy 8.1 is still the only
justification you need to set and retain that severity. The same
breakage changes severity according to the impact on a typical system,
typically measured by the number of applications involved and/or
whether those applications are Priority: required etc.

> It would not be unusable. 

It renders the reverse dependencies of that library unusable through
absolutely no fault of their respective maintainers. The bug is in the
library so the fact that the application fails after libfoo2 is
upgraded without bumping to libfoo3 IS the fault of the library and
therefore all bugs that result from that breakage - whether they occur
in the library or the applications that *depend* on the library - are
RC.

If something like dpkg becomes involved in the breakage, the entire
system could become not just unusable but quite possibly unfixable.

A dependency is just that - the application absolutely requires that
the library abides by the approved interface - it depends on the
library retaining the previous behaviour, even if that behaviour is
not actually what the library upstream intended. Applications commonly
work around minor bugs in library implementations, and fixing those bugs
can break the applications. So the library upstream (like me) has to
leave the buggy function in place and add a new function, with a new
name, that implements the fixed behaviour, so that applications can
migrate to the new function within the same SONAME. Then, when
everyone has migrated, the library can bump the SONAME, remove the
buggy code and everyone carries on as normal. For example, a gnucash library
had problems with leap years - the library was fixed by implementing
a new function with a slightly different name but had to retain the
deprecated code until the SONAME could be changed. Applications that
use the fixed code then need to depend on libfoo (>= 1.2.5) if the fix
was in the 1.2.5 release. (I use a src/deprecated.c file for such buggy
routines and clearly document the bugs in the API documentation, as
well as how to migrate from the old to the new. Bumping the SONAME is
then just a case of dropping the contents of src/deprecated.c)
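
A minimal sketch of that pattern, assuming a hypothetical libfoo. The
function names and the bug itself are invented for illustration (the
actual gnucash leap-year problem is not described here); only the
shape matters: the buggy symbol stays, the fix arrives under a new name.

```c
#include <stdbool.h>

/* Hypothetical libfoo API; every name here is illustrative only. */

/* Deprecated: treats every year divisible by 4 as a leap year, so it
 * is wrong for 1900, 2100 and so on. It is kept, bug and all, so that
 * binaries linked against the current SONAME keep behaving as before.
 * In the layout described above it would live in src/deprecated.c. */
bool foo_is_leap_year(int year)
{
    return year % 4 == 0;
}

/* Replacement added under a new name (assumed to ship in 1.2.5).
 * Implements the full Gregorian rule. */
bool foo_is_leap_year_gregorian(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}
```

An application that wants the fix calls foo_is_leap_year_gregorian()
and depends on libfoo (>= 1.2.5). Deleting foo_is_leap_year() removes
an interface, so that step has to wait for the SONAME bump.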

It is not the fault of the application if the library changes that
behaviour. The library should transition gracefully and allow users to
continue using ALL reverse dependencies of the library until all
reverse dependencies have been successfully rebuilt against the new
API. That is what a library transition involves and that is why they
are so painful in a big distribution.

> You can't really
> say "breaks unrelated packages" either, because they would have a
> direct relation.

It doesn't have to - a bug that breaks related packages to make them
unusable is sufficient to make it RC whether or not a system-critical
library is involved.

An incomplete or mis-handled API transition will break packages and
make them unusable. Take a look at gnucash in sid right now.
Mishandling a library transition can happen in many ways - one of which
is to not bump the SONAME properly. Whatever the cause, the library has
violated Policy 8.1 and an RC bug is justified. The very fact that a
bug exists nearly always means that someone has had experience of the
transition breaking something on their system. It is the breakage that
violates Policy, not the lack of a bump in the SONAME - that is just
the most common way of avoiding the breakage in the first place,
principally by changing the package name to match the new SONAME.

When the ABI of a library changes, anything linked against the old ABI
will eventually end up trying to read or write to memory that was
addressed in the old ABI but is no longer accessible in the new one.
That causes a segmentation fault - a crash. All crashes are bugs, all
crashes that directly result from a new version of a library package
replacing the old version are RC. The reason is simple - if the library
migrates into testing in that state, it will break the installations of
users in testing. One reason Debian has 'testing' is to try and avoid
such breakage. SONAMEs are just the mechanism for protecting testing
from poorly handled library transitions - hence the relative lack of
detail on SONAMEs in Policy. Policy mandates how libraries must behave
but not how the upstream actually achieve that behaviour.
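
One way such an ABI change arises, sketched with two hypothetical
versions of a public struct (the struct and field names are
assumptions, not taken from any real library). An application compiled
against the old header has the old field offsets and the old size
baked into its binary; the upgraded library uses the new ones.

```c
#include <stddef.h>

/* Layout the application was compiled against (old ABI). */
struct foo_item_v1 {
    int  id;
    char name[16];
};

/* Layout used by the upgraded library: a field was inserted in the
 * middle, which moves 'name' and grows the struct. */
struct foo_item_v2 {
    int    id;
    double weight;
    char   name[16];
};

/* Offset of 'name' as seen by the old application binary. */
size_t foo_name_offset_old(void)
{
    return offsetof(struct foo_item_v1, name);
}

/* Offset of 'name' as seen by the new library code. */
size_t foo_name_offset_new(void)
{
    return offsetof(struct foo_item_v2, name);
}
```

If the application allocates sizeof(struct foo_item_v1) bytes and the
library then writes 'name' at the v2 offset, the write lands outside
the memory the application set aside. Nothing at link time catches
this when the SONAME is unchanged.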

The application linked to the old ABI is unusable - depending on where
the first bad operation occurs, it usually fails to start which, in the
eyes of most users, makes the application useless. The application is
the victim here - gnucash got duplicate bugs being filed (in Debian and
upstream) just because users were blaming the application for a library
bug.

I had experience of a different seg fault in my paid employment - the
application started, then ran normally for all of 38 seconds (yes, we
timed it) and crashed - even when no user operations were made in that
time. The PITA was that it takes another minute to clean up the
dangling processes and then 55 seconds for the application to start up.
So for 38 seconds of operational life, we had to wait the best part of
2 minutes to return to the start of that 38 seconds. Hmm. That's
unusable in my book, especially when the average operation in that
program takes around 90 seconds and there are 300 operations a day on
each installation so it's meant to be working mostly flat out.

Bugs arising from a mishandled library transition:
1. Are always RC for the library.
2. More often than not also cause FTBFS in all applications using that
library.
3. Render the application unusable on at least some systems, usually
all - unless the user is canny and pins the library at the old version.

That is the current state of gnucash in Debian. g-wrap failed to
transition correctly (even though g-wrap isn't a 'typical' library,
the failure to migrate cleanly led to the same errors). I filed an RC
bug using Policy 8.1 as my justification. I also filed an RC bug against
gnucash because the transition meant that it would no longer build from
source. This is understandable - the ABI has usually changed because
the API has also changed, i.e. the binary calls fail because the new
library has added functions, removed functions and/or changed
internal structs etc., which means that the various .h files have changed
and the compilation of anything that used the old library will
break. That is what a library transition involves and that is why
libtool versioning requires that libraries do NOT remove interfaces
within the same SONAME. Removing an interface equates to removing a
function declaration OR modifying the types or number of the arguments
to (or return value(s) from) any function in any way. You can add new
functions to a library without bumping the SONAME but all the old
functions must retain the same binary interface, minor bugs and all.
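
The libtool rule can be sketched as a small function over the
current:revision:age triple passed to -version-info. The type and
function names below are hypothetical; the update rules follow the
libtool manual, and the SONAME major number is computed as
current - age, which is how libtool derives it on Linux.

```c
/* Kind of change made since the last release (hypothetical enum). */
enum foo_change {
    FOO_CHANGE_CODE_ONLY,        /* implementation changed, interfaces untouched */
    FOO_CHANGE_INTERFACE_ADDED,  /* new functions only, old ones unchanged */
    FOO_CHANGE_INTERFACE_BROKEN  /* an interface was removed or its signature changed */
};

/* The current:revision:age triple given to libtool's -version-info. */
struct foo_version_info {
    unsigned current;
    unsigned revision;
    unsigned age;
};

/* Apply the libtool update rules for one release. */
struct foo_version_info foo_next_version(struct foo_version_info v,
                                         enum foo_change change)
{
    switch (change) {
    case FOO_CHANGE_CODE_ONLY:
        v.revision++;
        break;
    case FOO_CHANGE_INTERFACE_ADDED:
        v.current++;
        v.revision = 0;
        v.age++;        /* still compatible with older interfaces */
        break;
    case FOO_CHANGE_INTERFACE_BROKEN:
        v.current++;
        v.revision = 0;
        v.age = 0;      /* no backwards compatibility left */
        break;
    }
    return v;
}

/* Major number that ends up in the SONAME on Linux (libfoo.so.N). */
unsigned foo_soname_major(struct foo_version_info v)
{
    return v.current - v.age;
}
```

Adding functions leaves current - age unchanged, so the SONAME stays
the same; removing or changing any interface resets age to 0 and the
SONAME major moves, which is the bump the rest of this mail is about.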

> I was just wondering about how I would justify keeping the bug at RC,
> if the maintainer wanted to downgrade it. 

Policy 8.1 is clear - if the bug documents a crash in an application
that was not present before the library was updated and the library has
not changed the SONAME or package name, the library justifies an RC
bug. The maintainer should not downgrade it and you would be justified
in reinstating that severity.

-- 


Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
