Hi all,

I've been familiarizing myself with packaging Rust crates for Debian for a while. Recent observations lead me to suspect that the current practice of Rust packaging isn't optimal.

A little background: Debian, like most other packaging systems, has only, well, packages. Rust packages, or crates, have one particular addition: features. Features can be enabled or disabled by the crates that use them, making code and dependencies behind feature flags fully optional. Crates also use https://semver.org[semver], or semantic versioning, which carries its own set of rules on version compatibility. Actually, Cargo uses its own https://doc.rust-lang.org/cargo/reference/resolver.html[variant]. In Debian we have a different set of rules.

== The problems

The current practice of packaging Rust crates in Debian basically maps features to "subpackages". For example, a crate `curl` with a feature `ssl` is packaged as `librust-curl-dev` and `librust-curl+ssl-dev`. The former holds the actual source code. The latter may or may not be a separate package. The Debian Rust team has a https://salsa.debian.org/rust-team/debcargo-conf/#setting-collapse-features-in-debcargo-conf[reason for this]:

> This means that the following dependency chain is not a problem in rust:
>
> - crate A with feature AX depends on crate B with feature BY
> - crate B with feature BX depends on crate A with feature AY
>
> This is a perfectly valid situation in the rust+cargo ecosystem. Notice that there is no dependency cycle on the per-feature level, and this is enforced by cargo; but if collapse_features is used then package A+AX+AY would cyclicly depend on package B+BX+BY.

> This is reflected in the Debian packages by producing Provides lines for all combinations of features, and this can become a quite large section.

For versions, packages are split into "level" subpackages: version 1.2.3 of crate `foo` would give `librust-foo-dev`, `librust-foo-1-dev`, `librust-foo-1.2-dev` and `librust-foo-1.2.3-dev`.
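
The naming scheme just described can be sketched in a few lines of Rust. This is only an illustration inferred from the package names quoted in this message: `level_packages` is a hypothetical helper, not part of debcargo, and the underscore-to-hyphen mapping for feature names is an assumption based on `cpp_demangle` showing up as `+cpp-demangle` in the `apt search` output.

```rust
/// Hypothetical helper (not part of debcargo): lists the package names
/// that one crate version is known by under the "level" scheme.
/// `feature` is an optional crate feature, e.g. Some("serde").
fn level_packages(krate: &str, version: (u64, u64, u64), feature: Option<&str>) -> Vec<String> {
    let (major, minor, patch) = version;
    // Assumption: a feature is appended after '+', with '_' mapped to '-'.
    let suffix = match feature {
        Some(f) => format!("+{}", f.replace('_', "-")),
        None => String::new(),
    };
    // One name per version level: unversioned, major, major.minor, full.
    vec![
        format!("librust-{krate}{suffix}-dev"),
        format!("librust-{krate}-{major}{suffix}-dev"),
        format!("librust-{krate}-{major}.{minor}{suffix}-dev"),
        format!("librust-{krate}-{major}.{minor}.{patch}{suffix}-dev"),
    ]
}
```

Calling `level_packages("foo", (1, 2, 3), None)` gives the four names listed above, and passing `Some("serde")` for `backtrace` 0.3.66 gives names of the form `librust-backtrace-0.3+serde-dev`. Every feature multiplies the list by the same four levels, which is where the package counts below come from.
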
Depending on a package at a particular level, combined with a version requirement on it, is how a version constraint is currently expressed.

This practice creates a myriad of subpackages. Take a "feature-rich" crate as an example: the `async-std` crate has 23 features, not counting `default`. With `default` that's 24 features times 4 version levels, 24 * 4 = 96 combinations.

----
$ apt search 'rust-backtrace\+'
Sorting... Done
Full Text Search... Done
librust-backtrace+cpp-demangle-dev/testing,unstable,testing 0.3.66-2 amd64
  Acquire a backtrace at runtime - feature "cpp_demangle"

librust-backtrace+rustc-serialize-dev/testing,unstable,testing 0.3.66-2 amd64
  Acquire a backtrace at runtime - feature "rustc-serialize" and 1 more

librust-backtrace+serde-dev/testing,unstable,testing 0.3.66-2 amd64
  Acquire a backtrace at runtime - feature "serde" and 1 more

librust-backtrace+verify-winapi-dev/testing,unstable,testing 0.3.66-2 amd64
  Acquire a backtrace at runtime - feature "verify-winapi"

librust-backtrace+winapi-dev/testing,unstable,testing 0.3.66-2 amd64
  Acquire a backtrace at runtime - feature "winapi"

$ apt info librust-backtrace+serde-dev
Package: librust-backtrace+serde-dev
Version: 0.3.66-2
Priority: optional
Section: rust
Source: rust-backtrace
Maintainer: Debian Rust Maintainers <pkg-rust-maintainers@alioth-lists.debian.net>
Installed-Size: 9,216 B
Provides: librust-backtrace+serialize-serde-dev (= 0.3.66-2), librust-backtrace-0+serde-dev (= 0.3.66-2), librust-backtrace-0+serialize-serde-dev (= 0.3.66-2), librust-backtrace-0.3+serde-dev (= 0.3.66-2), librust-backtrace-0.3+serialize-serde-dev (= 0.3.66-2), librust-backtrace-0.3.66+serde-dev (= 0.3.66-2), librust-backtrace-0.3.66+serialize-serde-dev (= 0.3.66-2)
Depends: librust-backtrace-dev (= 0.3.66-2), librust-serde-1+default-dev, librust-serde-1+derive-dev
Homepage: https://github.com/rust-lang/backtrace-rs
Download-Size: 1,268 B
APT-Sources: http://apt-cacher.local/debian bookworm/main amd64 Packages
Description: Acquire a backtrace at runtime - feature "serde" and 1 more
 This metapackage enables feature "serde" for the Rust backtrace crate, by pulling
 in any additional dependencies needed by that feature.
 .
 Additionally, this package also provides the "serialize-serde" feature.
----

To solve, or at least ease, this problem, the `collapse_features` flag was introduced. Once it is set to true, all feature packages are converted to virtual packages listed in *Provides*. As a result, the *Provides* field of `debian/control` explodes with the combinations of feature packages and split version packages.

----
$ rg field-too-long build/*.build
build/rust-redis_0.21.6-1_amd64-2022-08-31T01:00:47Z.build
1702:E: librust-redis-dev: field-too-long Provides (5386 chars > 5000)

build/rust-postgres-types_0.2.4-2_amd64-2022-10-17T06:16:48Z.build
1473:E: librust-postgres-types-dev: field-too-long Provides (7206 chars > 5000)

build/rust-async-std_1.12.0-1_amd64-2022-08-31T16:57:07Z.build
1978:E: librust-async-std-dev: field-too-long Provides (5114 chars > 5000)

build/rust-async-std_1.12.0-1_amd64-2022-08-31T16:28:09Z.build
1978:E: librust-async-std-dev: field-too-long Provides (5494 chars > 5000)
----

This approach created another problem: if two features depend on mutually exclusive dependencies, both dependencies end up in the package's *Depends*, rendering the package uninstallable.

The level scheme also can't express compound version requirements like `>= 1.5, < 3.0`. This currently translates to `librust-foo-2-dev | librust-foo-1-dev (>= 1.5-~~)`, which doesn't work for foo 1.x because the `|` alternative is not implemented yet.

There are also remnants of actual separate feature packages, which is what the `collapse_features` section linked above is concerned with.

== Why they need to be changed

=== Features

Crates, when not compiled, are just source files. The code is there, in the same package, whether features are enabled or not.
crates.io, the official Rust package repository, doesn't provide separate feature downloads; only packages.

For clarity, let's divide crates into two categories: library crates, which are for use in other crates; and binary crates, including applications and linkable libraries, which are to be compiled for "end use".

Library crates, whether on crates.io or packaged elsewhere, are still a bunch of source files. In the Debian context, only the test stage involves compilation. Currently we "build" them, as in running `cargo test`, but no "build artifacts" are collected. They are installed to the system as a copy of the source files. autopkgtest then runs the tests again.

For a library crate, when everything goes smoothly, it can be used by others as normal, and features don't come into it. When a feature needs to be disabled, however, for example because it needs nightly Rust, which we don't provide, current practice represents that as the absence of a feature package. A binary crate that depends on that feature fails to find the feature package, and fails to build.

What if we don't map features? The binary crate would depend on the library crate's package, install it successfully, and fail at the build stage. Same result. Rust has an outstanding reputation for easy-to-understand error messages, so there's no need to worry that the packager couldn't find out the reason.

But the build time increased! Well, we could record that in a custom field in `debian/control`, such as `X-Rust-Disabled-Features: nightly`, so our build tool would know a feature is missing and exit early.

When a binary crate is packaged, the packager decides which features and dependencies are enabled or disabled, so they need to know this information anyway.

With "featureless" packages, we have far shorter *Depends* and *Provides* fields, fewer packages (ideally 1:1 to crates), and less clutter.

==== What about circular dependency?
I did a little analysis on crates.io-index, and found there are three kinds of circular dependency in the Rust package ecosystem.

1. A crate depending on older versions of itself. Strange, but acceptable. We already have versioned packages, e.g. `librust-curl-dev` for 0.4 and `librust-curl-0.3-dev`.
2. A "main" package depending on a "component" package, and the component dev-depending on main. dev-deps are for tests, which we could put in `debian/tests/control`.
3. The exact situation described above: A+X → B+Y && B+Z → A+W. This could be further divided into two situations:
   a. building A or B, which could be treated the same as 2; or
   b. building a binary C that depends on A+X or B+Z, where the packager decides which features and/or dependencies are enabled, so there is no need to put them in *Depends*.

=== Versions

We can have `librust-foo-dev (>= 0.1.0), librust-foo-dev (<< 0.3)` in *Depends*.

== Proposal

For features, we still put strong dependencies in *Depends*. Optional and test dependencies should be put in `debian/tests/control`. This is effectively *Test-Depends*. Optional dependencies could be *Suggests*, but that is largely irrelevant for end users, so we may as well not.

Current practice needs a transition. We could progressively update packages to not depend on feature packages, starting with those that have no reverse dependencies, then remove features from *Provides*.

For versions, we should start expressing version constraints as versioned relations in *Depends*. This also needs a transition, but it can be integrated into the feature one.

--
Sdrager,
Blair Noctis