Hi everybody,

Short Reason: Too many packages of no use to our users.

Longer reason: Many packages get added to Debian that are of no
(direct) use to our users. Each package adds metadata to the indices
that needs to be downloaded and processed by tools, and it clutters up
the whole package list for no practical benefit. A split-out Packages
file will allow us to minimise the effect on users.

More and more packages are being uploaded into the Debian archive
which are only ever used for building packages. Not only are these
never intended to be installed onto an end-user's system, users are
even actively discouraged from using them directly.

The two currently most notable examples are packages used by the Go
and Rust programming languages and their ecosystems, but there may
well be others[1]. While we need their library packages to build the
applications that use them, those applications are entirely statically
compiled and none of the libraries will ever be installed on a normal
user's system.

Moreover, the language ecosystem in Debian actively discourages users
from installing them for anything other than rebuilding a Debian
package. If you do general (non-Debian-specific) development using Go
or Rust on your machine, the expectation is that you will use the
language-specific tools to install your dependencies[2].

Currently, however, all of those packages end up in the indices we
generate, which users have to download and package managers have to
read and deal with. Each of those packages therefore slightly
increases the size of these indices for little reason. While many
users have access to large bandwidth connections and fast CPUs, many
others do not, and the wasted resources do not help with global
warming either.

For the Rust ecosystem, those sizes increase even more, as each of
their libraries can provide multiple features. For example, a TLS
library can link against GnuTLS or OpenSSL or some other random TLS
implementation.
Such features may even be combined in various different ways,
resulting in an excessive number of possible feature combinations for
one Rust package, called a "crate". Those are "mapped" to the Debian
package world by creating something we call *feature packages*, with
one such feature package per feature or combination thereof (usually
grouped by common dependencies).

Those feature packages are empty packages, only containing a symlink
for their /usr/share/doc/… directory. Their size is smaller than the
metadata they will produce. Adding new features means one more trip
through the NEW queue each time such new binary packages are
introduced.

The FTPTeam disagrees with the feature-package solution[3], so
currently there is a workaround: by collapsing the features into the
main library package and declaring the features using the Provides
header, similar functionality is achieved. However, this doesn't work
in all situations, for example:

 - Tools can generate really long Provides: lines, with the current
   record being around 250 kB. That's long enough that a tool (not dak
   itself) broke on it already. And those lines may grow larger in
   future.

 - Some features may need different (build-)dependencies (say, GnuTLS
   vs OpenSSL). Should those conflict with each other, you cannot
   combine them into one package and must fall back to the
   feature-package solution.

 - Generally, the workaround involves changing upstream's dependency
   structure in order to fit it into the aforementioned Debian
   constraints, and so of course this may not always play nicely with
   other packages that expect the unchanged upstream dependency
   structure. The feature-package solution, by contrast, is a 1-to-1
   mapping.

There have been multiple discussions between the FTPTeam and the Rust
package maintainers. The FTPTeam does not want those feature packages
in the part of main downloaded by users and currently rejects them
from NEW, while the Rust maintainers see them as needed and the
workaround as just that.
Both sides agree that this is not a productive or sustainable
solution and that we need to agree on something better.

The current proposal is to reduce the size of the main Packages.xz
file by splitting[4] out all of the packages that are not intended for
users and writing those into their own file. Those packages would have
a section of "buildlibs", independent of their other properties. That
section should only be activated on buildds and in situations that
need build-dependencies available (say, an archive rebuild, or a user
rebuilding packages that need Build-Dependencies from there), but not
by default anywhere else. This section will allow feature packages and
*may* even let them bypass binary-NEW if they only add new (empty)
feature packages.

Exactly how this gets implemented, both in dak and in apt, is still
being discussed between the FTPTeam and the apt maintainers. Ideas
range from writing out section-based Packages files to presenting it
as a subcomponent of main, and we think we will have something
finalized pretty soon. It possibly needs small changes on the side of
release managers, wanna-build admins or other tools that need to read
the full Packages information. We will provide more information on
that when we are sure about the changes.

An advantage of this approach is that the mechanism by which we assign
packages to the buildlibs file instead of the main file is flexible on
the archive side. Whilst we intend to use the package section
initially, this is a policy decision which can be altered without
clients needing to update. We also have the ability, should it ever be
necessary, to add other index files where it makes sense.

As for the timeline for this change: we hope that this will be ready
before bullseye (especially if we end up needing a patch in apt), so
that after the release we can gradually switch to split Packages
files.
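
To make the proposed split a bit more concrete, here is a minimal
Python sketch of dividing one Packages index into a user-facing part
and a buildlibs part, keyed on the Section field. It is only an
illustration: this is not how dak or apt do or will implement it (that
is still under discussion), the package names are hypothetical, and
treating a "component/section" value the same as a bare section is an
assumption.

```python
from typing import Dict, Iterator, List, Tuple

# Section name taken from the proposal; everything else is illustrative.
BUILD_ONLY_SECTION = "buildlibs"

Stanza = Dict[str, str]


def parse_stanzas(text: str) -> Iterator[Stanza]:
    """Yield each blank-line-separated stanza as a field -> value dict."""
    for block in text.strip().split("\n\n"):
        fields: Stanza = {}
        key = None
        for line in block.splitlines():
            if line[:1] in (" ", "\t") and key is not None:
                # Continuation line of a multi-line field.
                fields[key] += "\n" + line
            elif ":" in line:
                key, _, value = line.partition(":")
                fields[key] = value.strip()
        if fields:
            yield fields


def split_index(text: str) -> Tuple[List[Stanza], List[Stanza]]:
    """Return (user_index, build_index), split on the Section field."""
    user: List[Stanza] = []
    build: List[Stanza] = []
    for stanza in parse_stanzas(text):
        # Assumption: "contrib/buildlibs" counts the same as "buildlibs".
        section = stanza.get("Section", "").split("/")[-1]
        (build if section == BUILD_ONLY_SECTION else user).append(stanza)
    return user, build


# Hypothetical sample index: one application and two build-only packages.
SAMPLE = """\
Package: exampleapp
Section: utils
Description: hypothetical statically linked application

Package: librust-example-dev
Section: buildlibs
Description: hypothetical Rust library, build-time only

Package: librust-example+tls-dev
Section: buildlibs
Description: hypothetical empty feature package
"""

user_index, build_index = split_index(SAMPLE)
print("main:     ", [s["Package"] for s in user_index])
print("buildlibs:", [s["Package"] for s in build_index])
```

Because the decision sits in one small predicate on the archive side,
the criterion could later be something other than the section without
clients having to change, which is the flexibility described above.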

Footnotes
---------

[1] The focus currently is on Rust, as it has the most pressing need
    to resolve the issue. We know that the new section may also be
    useful for Golang, and we know something of how that is currently
    handled. This is, however, definitely not limited to just those.
    If you think that your package set is a good candidate to move
    here, please get in contact with us.

[2] That is, "go get" for Golang and "cargo build" for Rust.

[3] While the trip through NEW for basically nothing is annoying, the
    real problem is the metadata size.

[4] We first thought about an entirely new archive, but that is much
    more separate and would create a higher maintenance workload.
    Additionally, it would create problems complying with the licenses
    of packages. Then we thought about a new component besides
    main/contrib/non-free, and while that works better, it still has
    many negative side effects, including requiring extra package
    uploads, extra tracking for the release team, and multiple
    components if we later decide that we need to support this for all
    of main, contrib and non-free.

--
bye, Joerg