
Re: Package dependency versions and consistency



Adrian Bunk wrote:
> On Fri, Dec 18, 2020 at 04:25:19PM -0800, Josh Triplett wrote:
> >...
> > I'm not suggesting there should be 50 versions of a given
> > library in the archive, but allowing 2-4 versions would greatly simplify
> > packaging, and would allow such unification efforts to take place
> > incrementally, via transitions *in the archive* and *in collaboration
> > with upstream*, rather than *all at once before a new package can be
> > uploaded*.
> > 
> > (I also *completely* understand pushing back on having 2-4 versions of
> > something like OpenSSL; that'd be a huge maintenance and security
> > burden. That doesn't mean we couldn't have 2-4 semver-major versions of
> > a library to emit ANSI color codes, and handle reducing that number via
> > incremental porting in the archive rather than via prohibition in
> > advance.)
> 
> It is important to always remember that the main product we are 
> delivering to our users are our stable releases.

(This is somewhat off-topic, but: I think that Debian stable is *one* of
the main products of Debian, but not by any means the only one. Debian
testing and unstable/experimental are also incredibly valuable. We need
solutions that work for all of those. Those solutions *do* need to work
for stable as well, though, and I'll address the rest of your mail in
that regard.)

> We do have 4 different versions of autoconf in the archive.
> This works because autoconf does not have CVEs.

There's a great deal of software out there with similar properties, most
notably that it doesn't sit at a security boundary. That doesn't just
include build-time code. Also, some types of security vulnerabilities
are rare-to-nonexistent in other ecosystems. A library, written in a
safe language, whose job is to generate ANSI terminal color codes, is
not likely to have security vulnerabilities. It's not critical to force
all packages to move to the latest version of that library immediately,
before they can upload at all.

Bundling *can* make it much more difficult to handle security support,
for a variety of reasons (updating distinct embedded copies, dealing
with more version skew, etc). But in the absence of bundling, if the
*only* issue is that there may be 2-4 semver-major versions in the
archive, I'd expect the process to be roughly "upload new versions of
those packages, trigger rebuilds of the packages that depend on them".
On balance, I wouldn't expect substantial scaling issues with the
uploads. The *rebuilds* would be where we may need some tooling
improvements, for ecosystems that do the equivalent of static linking or
library bundling at build time and ship a compiled artifact in their
binary package.

> If a library is so complex that your "unification efforts in 
> collaboration with upstream" would apply, chances are there
> will be CVEs if anyone does a security audit of the code.

I'm not talking about complexity of an individual library; that's not
the primary issue here. I'm talking about quantity. If your package has
300 dependencies, most of which are relatively small, focused,
self-contained libraries, the "collaboration with upstream" part is
about collaboration with the upstream of your package, not the upstreams
of the dependencies.

If you want to package abc version 1.2.3, and among many other things,
abc depends on xyz version 2.1.4, and xyz has a new version 3.0.1 now,
it makes sense to work with the upstream of abc, sending them a patch to
migrate to the new version, and waiting for abc 1.2.4 to come out with
that update. It *doesn't* make sense to maintain a downstream Debian
patch to make abc work with the newer xyz. abc can just build-depend on
xyz-2, and a later version of abc can build-depend on xyz-3. That isn't
a reflection of complexity in xyz, or in abc.

Also, sometimes those dependencies are indirect through other
dependencies, and to transition forward, you may want to move multiple
dependencies forward in concert, for compatibility reasons or just to
minimize duplication within one application.

> > I think much of our resistance to allowing 2-4 distinct semver-major
> > versions of a given library comes down to ELF shared libraries making it
> > painful to have two versions of a library with distinct SONAMEs loaded
> > at once, and while that can be worked around with symbol versioning,
> > we've collectively experienced enough pain in such cases that we're
> > hesitant to encourage it. Our policies have done a fair bit to mitigate
> > that pain. But much of that pain is specific to ELF shared libraries and
> > similar.
> 
> No, the only real pain is providing security support.

Debian has gone through many library transitions that have incurred
substantial pain, including those where a lack of symbol versioning
resulted in serious issues if two versions of the same library ended up
in the same address space. That's in addition to the normal pain of
library transitions, and in addition to all the *infrastructure* that
Debian has built up around library versioning (such as shlibs files and
symbols files). That has led to guidance such as not versioning most
-dev packages, and instead forcing all new package uploads to transition
to the new version of the library.

By contrast with that, security support may not be nearly as much of an
issue. The *majority* of libraries in Debian don't require any security
updates at all.

> >...
> > The
> > dependency and library mechanisms of some other ecosystems, are designed
> > to support having multiple distinct versions of libraries in the same
> > address space, with fully automatic equivalents of symbol versioning.
> >...
> 
> How can Debian security support packages from such ecosystems?

By following the security advisories from those ecosystems, uploading
new versions, and rebuilding the packages that depend on them. A
security upload would be an upload of a semver-compatible version.
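
As a rough illustration of that "upload, then rebuild" flow, here is a
minimal Python sketch. It is not existing Debian tooling: the function
names, the abc/def/ghi package names, and the build-dependency map are
all hypothetical, and the compatibility check assumes plain
MAJOR.MINOR.PATCH version strings.

```python
def parse(version):
    """Split a MAJOR.MINOR.PATCH string into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_semver_compatible(old, new):
    """True if `new` is a later release within the same major version.

    Simplification: ignores pre-release tags and the special handling
    some ecosystems give to 0.x versions.
    """
    o, n = parse(old), parse(new)
    return n[0] == o[0] and n > o


def packages_to_rebuild(fixed_package, build_deps):
    """List the packages whose build dependencies include `fixed_package`."""
    return sorted(pkg for pkg, deps in build_deps.items() if fixed_package in deps)


# Hypothetical archive state: which versioned library each package builds against.
build_deps = {
    "abc": ["xyz-2"],
    "def": ["xyz-1"],
    "ghi": ["xyz-1", "xyz-2"],
}

# A security fix lands in the 1.x line: 1.3.1 -> 1.3.2.
if is_semver_compatible("1.3.1", "1.3.2"):
    print(packages_to_rebuild("xyz-1", build_deps))
```

Only the packages that build against the fixed major version need a
rebuild; packages using only xyz-2 are untouched by a fix to xyz-1.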

Not every package is OpenSSL or libpng. I don't expect Debian to have
3-4 versions of OpenSSL (though I *do* expect it to have OpenSSL 3 and
OpenSSL 1.1 in parallel for a while). I think it's reasonable to *allow*
2-4 versions of a small library for emitting ANSI terminal color codes.
And I think we should have actual written policy supporting that.

> If there is a CVE in a library that is used by 20 different packages
> in 20 different versions, how does the ecosystem help Debian with
> applying this CVE fix to all 20 versions with reasonable effort?

"20 different versions" doesn't tend to happen. Again, note that I'm not
talking about bundling here; that's a separate problem that will need
solving another way. In
ecosystems that use semantic versioning, the primary issue is providing
the distinct *major* versions of packages required to satisfy
dependencies.

I'm not talking about packaging xyz 1.2.3, 1.2.4, 1.3.1, and 2.0.1. When
xyz 1.3.1 is uploaded, it can safely replace 1.2.4, and packages using
xyz 1.2.4 can get rebuilt via binNMU if needed.

I'm talking about packaging xyz 1.3.1 and 2.0.1, as separate xyz-1 and
xyz-2 packages, and allowing the use of both in build dependencies.
Then, a package using xyz-1 can work with upstream to migrate to xyz-2,
and when we have no more packages in the archive using xyz-1 we can drop
it.
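
The xyz example above can be sketched in a few lines of Python. This is
only an illustration under stated assumptions: the helper names are
hypothetical, versions are assumed to be plain MAJOR.MINOR.PATCH
strings, and the "name-MAJOR" package naming simply follows the xyz-1 /
xyz-2 convention used in this mail.

```python
from collections import defaultdict


def parse(version):
    """Split a MAJOR.MINOR.PATCH string into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def latest_per_major(name, versions):
    """Keep only the newest release of each semver-major line.

    Returns a mapping such as {"xyz-1": "1.3.1", "xyz-2": "2.0.1"}.
    """
    by_major = defaultdict(list)
    for version in versions:
        by_major[parse(version)[0]].append(version)
    return {
        f"{name}-{major}": max(candidates, key=parse)
        for major, candidates in sorted(by_major.items())
    }


def droppable(packages, build_deps):
    """List versioned library packages that nothing build-depends on."""
    used = {dep for deps in build_deps.values() for dep in deps}
    return sorted(pkg for pkg in packages if pkg not in used)


archive = latest_per_major("xyz", ["1.2.3", "1.2.4", "1.3.1", "2.0.1"])
print(archive)  # {'xyz-1': '1.3.1', 'xyz-2': '2.0.1'}

# Once every user has migrated to xyz-2, xyz-1 can be removed.
print(droppable(archive, {"abc": ["xyz-2"]}))  # ['xyz-1']
```

Four upstream releases collapse to two packages in the archive, and the
older one becomes removable as soon as its last user has migrated.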

That's different from requiring *exactly one* version of xyz, forcing
all packages to transition immediately, and preventing people from
uploading packages because they don't fork upstream and port to
different versions of dependencies.

- Josh Triplett

