Bug#971515: kubernetes: excessive vendoring (private libraries)



On Wednesday, 21 October 2020 6:16:03 AM AEDT Sean Whitton wrote:
> I think that my message [1] is what makes you think that the package
> would not have got through NEW?

It was not your message but my own experience of introducing 100+ packages
through NEW, especially those with a large burden of vendored libraries,
including Kubernetes. The main hassle is usually to convince FTP-masters that
some vendoring is _necessary_ (case by case); the usual request is to package
all vendored libraries separately. As rare exceptions, a few vendored
libraries are allowed, for example when a library is a fork customised for
the particular project and therefore not re-usable by other software. Another
example is when a vendored library is obsolete software being phased out in
future releases. A few micro-libraries might be tolerated when vendored,
especially when they are not widely used. Vendoring might also be acceptable
when software components with mutual/circular dependencies are shipped in one
or several name spaces, in other words when a code base comes not from one
name space but from several. None of those cases applies to Kubernetes.

A specific example (libpod/podman) is mentioned in 

  https://lists.debian.org/debian-devel/2020/03/msg00441.html

    "Podman was rejected due to "many embedded packages in vendor/" with only
     6 or 7 private libs versus 120 libraries removed in favour of packaged
     ones."


> There are a few issues tangled together here.

IMHO it is really one issue: how we maintain Debian packages. If you want to
distinguish a few issues then they are all closely related.


> NEW is mainly about license and DFSG compliance, and secondarily about
> the idea that we want to avoid accepting packages where doing so would
> make Debian worse, even if it would also make Debian better along other
> dimensions.  As a simple example, we try to avoid accepting a package
> that is already packaged under a slightly different name, because in
> most cases it is worse for both users and contributors to have the same
> thing in the archive twice (not talking about vendoring here).

It is also about preserving the integrity of Debian's identity. We try to
prevent monolithic bundles like Kubernetes in favour of maintaining an
ecosystem of re-usable libraries, packaged individually.


> In this case, the reason I wrote in [1] that I would probably have
> rejected the package, had I come across it in NEW, is that it seemed to
> me that having this package in Debian would make the archive less
> maintainable by contributors other than Janos who might need to work
> with the package.  (After the discussion on -devel, I'm no longer so
> sure about that opinion of mine.)

  https://wiki.debian.org/UpstreamGuide#No_inclusion_of_third_party_code

If your concern is about security support then IMHO Kubernetes cannot be
meaningfully supported from a security perspective, with or without vendored
dependencies.

Also I must point out that Kubernetes upstream has the worst management of
vendored libraries that I have ever seen. Examples include vendoring multiple
versions of the same library, etc.
One particular case, in which upstream failed to update a problematic
vendored library for years(!), practically destroys faith in upstream's care
for good hygiene of vendored dependencies:

  https://github.com/kubernetes/kubernetes/issues/27543

Note the expressed _resistance_ to upgrading a vendored library...

With the above example, how can anyone have confidence in upstream security
patching? After all, Kubernetes has more vendored code than code of its own.



> It's not correct to say, however, that the package "is in violation of
> ftpmaster's policy for inclusion of new packages".  That could only be
> true if the package met one of the "serious violations" listed in the
> REJECT-FAQ, which is basically DFSG and licensing issues, and a few
> obvious clangers.  Instead, what we have is a situation in which there
> is reason to be worried about the way the package is put together, but
> the opinion of one FTP team member at one particular point in time
> carries about the same weight as the opinion of any experienced
> packager.

There is an established practice, a tradition if you wish, that is followed
all the time even if it is not explicitly described in the REJECT-FAQ. Debian
clearly tries to be modular whenever possible.


> In other words, I suggest that we ignore the NEW issue entirely, and
> just consider whether the way the package is currently put together
> imposes an unreasonable burden on Debian contributors other than Janos
> who want to work on the package, or users who want to patch it, etc.
> The sorts of questions we should try to answer:
> 
> - does the vendoring make Debian security support harder (discussion on
>   -devel suggests it makes it easier)
> 
> - everyone seems to think the level of vendoring is at best a necessary
>   evil;

Let's not attempt to fabricate a perception of consensus, please. "Everyone"
is more like "everyone who noticed the discussion and cared to express an
opinion" at best, and even those were not all in agreement...


>   if someone wants to try to reduce the level of vendoring (as
>   Dmitry did when he was maintainer), is the current way the package is
>   built going to make it harder for people to work on making that sort
>   of contribution?

If Debian only cared about maintainers' convenience or reducing maintainers'
effort then Debian would not be what it is now. We favour technical elegance,
often at the expense of maintainers' comfort.


> Are there issues the TC should think about which do not fall under this
> way of looking at things?  I.e., weighing the impact on people other
> than Janos who want to work on the package, vs. the impact on people who
> want recent kubernetes to be part in the archive at all?

Is Debian's ecosystem of packaged reusable libraries worth caring about?

If so then why grant an exception to one particular package? We have several
(or more) sophisticated Golang packages using hundreds of packaged libraries.

In the early days of packaging Kubernetes we did not even have most
components packaged, and I spent most of my effort on packaging, introducing
and stabilising dependency libraries.
These days the major work has already been done and the argument for
monolithic vendoring is much weaker.


> If I'm right about what the question for the TC is, I hope that Janos
> and Dmitry can both help us discuss it in a way which sets aside the
> heat which characterised the -devel thread.  It is completely
> understandable to (a) feel very frustrated at Debian not including
> recent versions of a useful piece of free software; and also (b) feel
> very frustrated when someone chooses to accept a less-than-ideal
> approach as necessary when one has put a lot of time into trying to find
> workable alternatives.

There is a compromise and a delicate balance between those concerns.
Sometimes it may be necessary to vendor a few libraries strategically.
But most packaged libraries can (and should) be reused. It requires some
effort, knowledge and experience.

Packaging software "the Debian way" may be harder but it has benefits
precisely because we do things differently. There is little value in building
Kubernetes exactly the way upstream builds it, because they already provide
binary releases.

The Kubernetes maintainer wants to disengage from the process and do quick
but sloppy work that does not involve cooperation and mutual maintenance of
individually packaged dependencies.

The Debian way of packaging Golang software has distinct advantages over the
upstream bundle. In Debian, maintainers own the dependency tree, helping each
other to maintain an ecosystem of reusable libraries. Golang software
benefits from compartmentalisation because vendored libraries are not tested
on build. That is, Kubernetes is built from a large number of untested
vendored libraries.

Individually packaged re-usable libraries usually run their test suites on
different architectures, exposing bugs with greater visibility.
Another argument for using packaged libraries is distro-wide consistency of
the components used.

The case of Kubernetes over-vendoring is obvious. Making an exception for
Kubernetes threatens to open Pandora's box and destroy the very identity of
Debian.

-- 
All the best,
 Dmitry Smirnov
 GPG key : 4096R/52B6BBD953968D1B

---

Within any important issue, there are always aspects no one wishes to
discuss.
    -- George Orwell
