On Tue, 2019-10-22 at 20:21 -0700, Russ Allbery wrote:
> I define reproducibility as generating the same Debian source package
> from a signed Git tag of my packaging repository plus, for non-native
> packages, whatever release artifacts upstream considers canonical
> (which may be a signed tarball or may be a Git tag or may be
> something else entirely).

That is a great definition of reproducibility if all you are interested in is the Debian version of the package. It is not so great if what you want is the upstream version of the package - i.e., it is important to you that it behaves identically or at least diverges in accountable ways. In that case you want a clear audit trail from the upstream source to the Debian binary.

On Tue, 2019-10-22 at 20:21 -0700, Russ Allbery wrote:
> All of this business with patches and whatnot is an implementation
> detail.

If you are thinking of patches in terms of .dpatch files in debian/patches then we both agree, as I don't consider the representation to be particularly important. It could be branches stored in git for all I care, perhaps managed by a tool like gquilt. What is important to me is that the source contains an audit trail of how Debian got from the upstream source to the Debian package.

If I understand your position correctly, your proposal boils down to adding a (single) branch to the upstream .git for the Debian changes. My problem isn't with using git - it's with the word "single". It isn't even with you using a single branch, as perhaps that's appropriate for the packages you maintain (which it would be if the only change is to add a debian directory). My problem is the implication that since it is good enough for you, it's good enough for every package. It's not. When you are carrying a lot of changes it's bloody horrible.

Perhaps an illustration may help. I used to be a consumer of RedHat kernels. Back in the 2.6 days they carried hundreds if not thousands of individual patches for stuff they backported from Linux 3.0.
(I gather they still do carry a lot of patches for their LTS releases.) When you wanted to add your own modification there were invariably conflicts, and without knowing which patches it conflicted with and why, resolving them was just impossible.

Then Oracle released their "own" Linux distribution. It was a copy of RedHat, something Oracle didn't go out of its way to acknowledge. Effectively Oracle was garnishing for itself part of RedHat's revenue stream (support fees) using a rebadged RedHat product. RedHat responded by doing effectively what you are suggesting. They replaced the source rpm's audit trail of every change they made, and why, with one humongous, uncommented patch. Technically they were operating in accordance with the lawyer's reading of the GPL I guess - they were distributing the source. But it sure as hell wasn't in accordance with a programmer's definition of "source" (which is along the lines of something you can edit), as porting a patch from the .orig kernel to RedHat's became damned near impossible.

A second illustration is the kernel development process itself. One huge patch is not considered acceptable. Changes must be submitted as smaller, easily understood, digestible patches.

The quilt source format encouraged that style - to the point of having lintian checks for it. Nowhere do you propose a similar mechanism - or even acknowledge it's important.

On Tue, 2019-10-22 at 23:20 -0700, Russ Allbery wrote:
> Checking reproducibility only back to a set of patches does *not*
> provide a real guarantee of reproducibility, since a supply-chain
> attack could still have introduced malicious code in the patch
> generation process.

You are damning the good because it's not perfect. It's true there are still ways of attacking the code; an audit trail merely renders those attacks visible and attributable.
In fact rendering all changes visible and attributable by insisting they are signed is *precisely* the mechanism the kernel uses to defend itself, both from malware attacks of the type you envisage and from attempts to add copyrighted code that would open the kernel to legal attack later. Turns out a bit of sunlight is a great disinfectant.

On Tue, 2019-10-22 at 23:20 -0700, Russ Allbery wrote:
> like an argument for dropping all of the features that I want and
> retaining only the feature that you want, when you can derive the
> feature that you want (at some additional complexity cost, to be
> sure) from the format that I'm arguing for.

I can see how you might think that. The reality is different. At no stage have I suggested you should be prevented from using git, or indeed any other mechanism you desire. I have said that if you adopt a new system like dgit, please figure out a way of implementing one feature of the system you are replacing (quilt) - a way to audit changes.

But it has been proposed that everybody be forced to drop whatever workflow they might like in favour of dgit, and you look to be arguing in favour of that idea. If we moved the source format to git as you are proposing, you would be forcing everybody to drop the features they like and adopt your workflow.

I don't see that entangling the preferred format of some Debian developers with the format we distribute things in gains us anything. With the current split between development format and source distribution format you get to develop in any way you please. If it weren't split, there would have been no dgit. The source format can be optimised for distribution and in fact is so optimised. It is pretty much as efficient as it can be - it contains all the things you need to work on the source, and little else.

Even if we went down the path of insisting everything be on salsa in a git repository, I still don't understand why you would distribute a 2.8GB bare kernel git repository instead of 100MB of sources.
It's not like the DDs gain anything from such a move. To me it would make more sense to cease distributing the source as we do now. The policy would become: if you want the source, clone it from salsa. I would not favour such a change, but at least you could argue it reduces the amount of work Debian does to distribute its product.