On Wed, Jan 08, 2025 at 10:19:34AM +0100, Julien Plissonneau Duquène wrote:
> On 2025-01-07 21:52, Peter Pentchev wrote:
> >
> > Hm. That sounds interesting, but I think the Debian project cannot
> > protect such a mirror from automatically bringing in non-DFSG content
> > that appears in the remote repository. One might even take this one step
> > further and go to content forbidden by law in various jurisdictions.
>
> Then we are going to have the same issue implementing automated upstream
> release imports in packaging repositories, e.g. with the Janitor, and this
> is a service I would very much like to have.

Unfortunately you are correct that the same problem would arise.

> I would worry more about malicious content getting automatically pulled in.
> But anyway this can probably be mitigated the way large platforms do: make
> it possible to easily report abuse and being diligent in investigating them,
> eventually putting the repository offline until the issue is cleared.

Hm, I would be really, really surprised if there were even one "large
platform" that did not shift the responsibility to the user by having
them sign a terms of service document upon account registration.
Also, I'm not sure that some issues can really be cleared; see below.

> Additional automated checks could be implemented to suspend updates and
> require human review e.g. with LICENSE changes unless the file contents
> matches a whitelist.

That would put the responsibility on the uploader to review not only
the actual changes (as in a diff) between the releases, but each and
every individual file in each and every commit between the two releases.
I don't think this is completely realistic.

Why each and every individual file?
Well, consider this:

- version 3.14.1 is tagged
- version 3.14.1 is uploaded to Debian
- somebody pushes a commit to the upstream repo that adds a file that
  really does not belong there
- two more "real" commits are pushed
- somebody pushes a commit that reverts the "add a bad file" one
- three more "real" commits are pushed
- version 3.14.2 is tagged
- version 3.14.2 is uploaded to Debian

...so, if at this point the mirror pulls in the Git commits between
versions 3.14.1 and 3.14.2, there will exist several publicly accessible
commits, and the blob they point to, that still contain the file that
really does not belong there, even though a diff between the two
releases shows no trace of it. Clearing the issue would require
rewriting Git history, squashing commits or dropping them altogether,
which would make the Debian version of the "upstream" Git repository
no longer be a mirror.

> Alternatively the mirroring could be implemented to pull only the release
> tags after a package is uploaded to the archive (which means that someone
> reviewed the changes), and dealt with on a case-by-case basis for non-free
> packages or packages that have +dfsg repacking.

In Git repositories, pulling the release tag involves pulling (and
making available) all the commits leading up to it, even the reverted
ones, so... see above.

In general, automatically mirroring Git repository content is...
fraught with various issues.

G'luck,
Peter

-- 
Peter Pentchev  roam@ringlet.net roam@debian.org peter@morpheusly.com
PGP key:        https://www.ringlet.net/roam/roam.key.asc
Key fingerprint 2EE7 A7A5 17FC 124C F115  C354 651E EFB0 2527 DF13
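
P.S. The 3.14.1 to 3.14.2 scenario above can be sketched as a toy model. This is only an illustration of the reasoning, not how Git stores objects: the Commit class, the helper functions and every file name (including "bad.bin") are made up for the example, and the history is assumed to be linear.

```python
# Toy model of the scenario above. NOT Git's real object model: the Commit
# class, the helpers and all file names here are hypothetical.
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    name: str
    files: frozenset                      # file names in this commit's tree
    parent: Optional["Commit"] = None


def commit(name, parent, add=(), remove=()):
    """Create a child commit: the parent's tree plus/minus some files."""
    base = parent.files if parent else frozenset()
    return Commit(name, (base | frozenset(add)) - frozenset(remove), parent)


def history_between(old_tag, new_tag):
    """Commits reachable from new_tag but not from old_tag (linear history)."""
    out = []
    c = new_tag
    while c is not None and c is not old_tag:
        out.append(c)
        c = c.parent
    return out


# Build the history described in the list above.
v3_14_1 = commit("3.14.1", None, add=["main.c", "LICENSE"])
bad = commit("add bad file", v3_14_1, add=["bad.bin"])
real1 = commit("real 1", bad, add=["a.c"])
real2 = commit("real 2", real1, add=["b.c"])
revert = commit("revert bad file", real2, remove=["bad.bin"])
real3 = commit("real 3", revert, add=["c.c"])
real4 = commit("real 4", real3, add=["d.c"])
v3_14_2 = commit("3.14.2", real4, add=["e.c"])

# What a reviewer comparing the two releases sees: no bad file.
release_diff = v3_14_2.files - v3_14_1.files

# What a mirror that pulls the 3.14.2 tag actually publishes: every
# intermediate commit, including those whose tree still has the bad file.
exposing = [c.name for c in history_between(v3_14_1, v3_14_2)
            if "bad.bin" in c.files]

print("files added between releases:", sorted(release_diff))
print("commits still exposing bad.bin:", exposing)
```

On a real clone, something like `git rev-list --objects v3.14.1..v3.14.2` (tag names assumed) should list the objects that become reachable with the new tag, which is the set a reviewer would have to look at instead of the release diff.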