Re: [RFC] Proposal for new source format

On Tue, 2019-10-22 at 20:21 -0700, Russ Allbery wrote:
> I define reproducibility as generating the same Debian source package
> from a signed Git tag of my packaging repository plus, for non-native 
> packages, whatever release artifacts upstream considers canonical
> (which may be a signed tarball or may be a Git tag or may be
> something else entirely).

That is a great definition of reproducibility if all you are interested
in is the Debian version of the package.  It is not so great if what
you want is the upstream version of the package - i.e., it is important
to you that it behaves identically or at least diverges in accountable
ways.  In that case you want a clear audit trail from the upstream
source to the Debian binary.

On Tue, 2019-10-22 at 20:21 -0700, Russ Allbery wrote:
> All of this business with patches and whatnot is an implementation
> detail.

If you are thinking of patches in terms of .dpatch files in
debian/patches then we agree, as I don't consider the
representation to be particularly important.  It could be branches
stored in git for all I care, perhaps managed by a tool like gquilt.

What is important to me is that the source contain an audit trail of how
Debian got from the upstream source to the Debian package.  If I
understand your position correctly, your proposal boils down to adding a
(single) branch to the upstream .git for the Debian changes.  My
problem isn't with using git - it's with the word "single".  It isn't
even with you using a single branch, as perhaps that's appropriate for
the packages you maintain (which it would be if the only change is to
add a debian directory).  My problem is the implication that since it
is good enough for you, it's good enough for every package.  It's not.
When you are carrying a lot of changes it's bloody horrible.

Perhaps an illustration may help.  I used to be a consumer of RedHat
kernels.  Back in the 2.6 days they carried hundreds if not thousands of
individual patches for stuff they backported from Linux 3.0.  (I gather
they still do carry a lot of patches for their LTS releases.)  When you
wanted to add your own modification there were invariably conflicts, and
without knowing which patches it conflicted with and why, resolving them
was just impossible.  Then Oracle released their "own" Linux
distribution.  It was a copy of RedHat, something Oracle didn't go out
of its way to acknowledge.  Effectively Oracle was garnishing for
themselves part of RedHat's revenue stream (support fees) using a
rebadged RedHat product.  RedHat responded by doing effectively what you
are suggesting.  They replaced the source rpm's audit trail of every
change they made and why with one humongous, uncommented patch.
Technically they were operating in accordance with the lawyer's reading
of the GPL I guess - they were distributing the source.  But it sure as
hell wasn't in accordance with a programmer's definition of "source"
(which is along the lines of something you can edit), as porting a patch
from the .orig kernel to RedHat's became damned near impossible.

A second illustration is the kernel development process itself.  One
huge patch is not considered acceptable.  Changes must arrive as
smaller, easily understood, digestible patches.  The quilt source format
encouraged that style - to the point of having lintian checks for it.
Nowhere do you propose a similar mechanism - or even acknowledge it's
important.

On Tue, 2019-10-22 at 23:20 -0700, Russ Allbery wrote:
> Checking reproducibility only back to a set of patches does *not*
> provide a real guarantee of reproducibility, since a supply-chain
> attack could still have introduced malicious code in the patch 
> generation process.

You are damning the good because it's not perfect.  It's true there are
still ways of attacking the code; an audit trail merely renders those
attacks visible and attributable.  In fact rendering all changes visible
and attributable by insisting they are signed is *precisely* the
mechanism the kernel uses to defend itself both from malware attacks of
the type you envisage and from attempts to add copyrighted code that
opens the kernel to legal attack later.  Turns out a bit of sunlight is
a great disinfectant.

On Tue, 2019-10-22 at 23:20 -0700, Russ Allbery wrote:
> like an argument for dropping all of the features that I want and 
> retaining only the feature that you want, when you can derive the 
> feature that you want (at some additional complexity cost, to be 
> sure) from the format that I'm arguing for.

I can see how you might think that.  The reality is different.  At no
stage have I suggested you should be prevented from using git, or
indeed any other mechanism you desire.  I have said that if you adopt a
new system like dgit, please figure out a way of implementing one
feature of the one you are replacing (quilt) - a way to audit changes.
But it has been proposed that everybody be forced to drop whatever
workflow they might like in favour of dgit, and you look to be arguing
in favour of that idea.  If we moved the source format to git as you are
proposing, you would be forcing everybody to drop the features they like
and adopt your workflow.

I don't see that entangling the preferred format of some Debian
developers with the format we distribute things in gains us anything.
With the current split between development format and source
distribution format you get to develop in any way you please.  If it
weren't split there would have been no dgit.  The source format can be
optimised for distribution and in fact is so optimised.  It is pretty
much as efficient as it can be - it contains all the things you need to
work on the source, and little else.  Even if we went down the path of
insisting everything be on salsa in a git repository, I still don't
understand why you would distribute a 2.8GB bare kernel git repository
instead of 100MB of sources.  It's not like the DDs gain anything from
such a move.  To me it would make more sense to cease distributing the
source as we do now.  The policy would become: if you want the source,
clone it from salsa.  I would not favour such a change, but at least
you could argue it reduces the amount of work Debian does to distribute
its product.
