
Mapping Rust behavior 1:1 to Debian packages may not be a good idea



Hi all,

I've been familiarizing myself with packaging Rust crates for Debian for a
while. Recent observations lead me to suspect that the current practice of Rust
packaging isn't optimal.

A little background:

Debian, like most other packaging systems, has only, well, packages.
Rust packages, or crates, have one particular addition: features. Crates that
depend on them can enable or disable features, making the code and dependencies
behind feature flags fully optional.

Crates also use https://semver.org[semver], or semantic versioning, which
carries its own set of rules regarding version compatibility; more precisely,
Cargo uses its own
https://doc.rust-lang.org/cargo/reference/resolver.html[variant] of it. In
Debian we have a different set of rules.

== The problems

The current practice of packaging Rust crates in Debian basically maps features
to "subpackages": a crate `curl` with a feature `ssl` is packaged as
`librust-curl-dev` and `librust-curl+ssl-dev`. The former holds the actual
source code. The latter only pulls in the extra dependencies of the feature, and
may or may not be a separate package.

The Debian Rust team has a
https://salsa.debian.org/rust-team/debcargo-conf/#setting-collapse-features-in-debcargo-conf[reason
for this]:

> This means that the following dependency chain is not a problem in rust:
>
> - crate A with feature AX depends on crate B with feature BY
> - crate B with feature BX depends on crate A with feature AY
>
> This is a perfectly valid situation in the rust+cargo ecosystem. Notice that
> there is no dependency cycle on the per-feature level, and this is enforced by
> cargo; but if collapse_features is used then package A+AX+AY would cyclicly
> depend on package B+BX+BY.
>
> This is reflected in the Debian packages by producing Provides lines for all
> combinations of features, and this can become a quite large section.
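
The quoted argument can be checked with a small sketch. This is only an illustration: the graph below contains just the two edges from the quote, and `has_cycle` and `collapse` are hypothetical helpers, not part of cargo or debcargo.

```python
from collections import defaultdict

# Feature-level edges from the quoted example (a simplification: real
# feature packages also depend on their own base crate package).
# (crate, feature) -> list of (crate, feature) it depends on.
FEATURE_DEPS = {
    ("A", "AX"): [("B", "BY")],
    ("B", "BX"): [("A", "AY")],
}


def has_cycle(graph):
    """Return True if the directed graph (node -> iterable of nodes) has a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = defaultdict(int)  # every node starts out WHITE (unvisited)

    def visit(node):
        colour[node] = GREY  # node is on the current DFS path
        for nxt in graph.get(node, ()):
            if colour[nxt] == GREY:
                return True  # back edge: we returned to the current path
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK  # fully explored, no cycle through this node
        return False

    return any(colour[n] == WHITE and visit(n) for n in list(graph))


def collapse(feature_graph):
    """Merge all features of a crate into one node, as collapsing would."""
    collapsed = defaultdict(set)
    for (crate, _feature), deps in feature_graph.items():
        for dep_crate, _dep_feature in deps:
            collapsed[crate].add(dep_crate)
    return dict(collapsed)


print(has_cycle(FEATURE_DEPS))            # per-feature graph
print(has_cycle(collapse(FEATURE_DEPS)))  # per-crate graph
```

The per-feature graph has no cycle, while the collapsed per-crate graph does, which is the situation the quote describes.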

Versions are split into "level" subpackages: version 1.2.3 of crate `foo` is
provided as `librust-foo-dev`, `librust-foo-1-dev`, `librust-foo-1.2-dev` and
`librust-foo-1.2.3-dev`. A version constraint is currently expressed by
depending on the name at the appropriate level, combined with a version
requirement on it.
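
As a rough sketch of this naming scheme (the function name is hypothetical, and crate-name normalization is ignored), the level names for a given version could be generated like this:

```python
def level_package_names(crate, version):
    """List the Debian package names that cover `version` of `crate`.

    One unversioned name, plus one name per version prefix
    (major, major.minor, major.minor.patch).
    """
    parts = version.split(".")
    names = [f"librust-{crate}-dev"]
    for depth in range(1, len(parts) + 1):
        prefix = ".".join(parts[:depth])
        names.append(f"librust-{crate}-{prefix}-dev")
    return names


print(level_package_names("foo", "1.2.3"))
# ['librust-foo-dev', 'librust-foo-1-dev', 'librust-foo-1.2-dev', 'librust-foo-1.2.3-dev']
```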

This practice creates a myriad of subpackages. Take a "feature-rich" crate as an
example: `async-std` has 23 features, not including `default`. With `default`
and the 4 version levels, that's 24 * 4 = 96 package names. The `backtrace`
crate shows what this looks like in practice:

----
$ apt search 'rust-backtrace\+'
Sorting... Done
Full Text Search... Done
librust-backtrace+cpp-demangle-dev/testing,unstable,testing 0.3.66-2 amd64
  Acquire a backtrace at runtime - feature "cpp_demangle"

librust-backtrace+rustc-serialize-dev/testing,unstable,testing 0.3.66-2 amd64
  Acquire a backtrace at runtime - feature "rustc-serialize" and 1 more

librust-backtrace+serde-dev/testing,unstable,testing 0.3.66-2 amd64
  Acquire a backtrace at runtime - feature "serde" and 1 more

librust-backtrace+verify-winapi-dev/testing,unstable,testing 0.3.66-2 amd64
  Acquire a backtrace at runtime - feature "verify-winapi"

librust-backtrace+winapi-dev/testing,unstable,testing 0.3.66-2 amd64
  Acquire a backtrace at runtime - feature "winapi"

$ apt info librust-backtrace+serde-dev
Package: librust-backtrace+serde-dev
Version: 0.3.66-2
Priority: optional
Section: rust
Source: rust-backtrace
Maintainer: Debian Rust Maintainers <pkg-rust-maintainers@alioth-lists.debian.net>
Installed-Size: 9,216 B
Provides: librust-backtrace+serialize-serde-dev (= 0.3.66-2),
librust-backtrace-0+serde-dev (= 0.3.66-2),
librust-backtrace-0+serialize-serde-dev (= 0.3.66-2),
librust-backtrace-0.3+serde-dev (= 0.3.66-2),
librust-backtrace-0.3+serialize-serde-dev (= 0.3.66-2),
librust-backtrace-0.3.66+serde-dev (= 0.3.66-2),
librust-backtrace-0.3.66+serialize-serde-dev (= 0.3.66-2)
Depends: librust-backtrace-dev (= 0.3.66-2), librust-serde-1+default-dev,
librust-serde-1+derive-dev
Homepage: https://github.com/rust-lang/backtrace-rs
Download-Size: 1,268 B
APT-Sources: http://apt-cacher.local/debian bookworm/main amd64 Packages
Description: Acquire a backtrace at runtime - feature "serde" and 1 more
 This metapackage enables feature "serde" for the Rust backtrace crate, by
 pulling in any additional dependencies needed by that feature.
 .
 Additionally, this package also provides the "serialize-serde" feature.
----
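
The multiplication of names can be reproduced with a short sketch. The helper name and the placeholder feature list are assumptions made for illustration; only the naming pattern is taken from the `apt info` output above.

```python
from itertools import product


def feature_package_names(crate, version, features):
    """Enumerate every (version level, feature) package name for a crate."""
    parts = version.split(".")
    # "" is the unversioned level; the others are version prefixes.
    levels = [""] + ["-" + ".".join(parts[:d]) for d in range(1, len(parts) + 1)]
    names = []
    for level, feature in product(levels, features):
        # Underscores in feature names appear as hyphens in package names,
        # as with "cpp_demangle" in the listing above.
        names.append(f"librust-{crate}{level}+{feature.replace('_', '-')}-dev")
    return names


# The "serde" metapackage above: its own name plus its seven Provides entries.
names = feature_package_names("backtrace", "0.3.66", ["serde", "serialize-serde"])
print(len(names))

# 24 placeholder feature names stand in for a crate with 23 features plus `default`.
placeholder_features = [f"feature{i}" for i in range(24)]
print(len(feature_package_names("async-std", "1.12.0", placeholder_features)))
```

The first count is 8, matching the package name and the seven entries in its *Provides*; the second is the 96 mentioned earlier.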

To solve, or at least ease, this problem, the `collapse_features` flag was
introduced. Once it is set to true, all feature packages are converted to
virtual packages listed in *Provides*. The *Provides* field of `debian/control`
then explodes with every combination of feature name and version level, as
these build logs show:

----
$ rg field-too-long build/*.build
build/rust-redis_0.21.6-1_amd64-2022-08-31T01:00:47Z.build
1702:E: librust-redis-dev: field-too-long Provides (5386 chars > 5000)

build/rust-postgres-types_0.2.4-2_amd64-2022-10-17T06:16:48Z.build
1473:E: librust-postgres-types-dev: field-too-long Provides (7206 chars > 5000)

build/rust-async-std_1.12.0-1_amd64-2022-08-31T16:57:07Z.build
1978:E: librust-async-std-dev: field-too-long Provides (5114 chars > 5000)

build/rust-async-std_1.12.0-1_amd64-2022-08-31T16:28:09Z.build
1978:E: librust-async-std-dev: field-too-long Provides (5494 chars > 5000)
----

This approach creates another problem: if two features depend on packages that
are mutually exclusive, both dependencies end up in the package's *Depends*,
rendering the package uninstallable.

Nor can the version scheme express complex requirements like `>= 1.5, < 3.0`.
This currently translates to `librust-foo-2-dev | librust-foo-1-dev (>=
1.5-~~)`, which doesn't work for foo 1.x because the `|` alternative is not
implemented yet.

There are also remnants of actual separate feature packages, which are what the
`collapse_features` section linked above is concerned with.


== Why they need to be changed

=== Features

Crates, when not compiled, are just source files. The code is there, in the
same package, whether a feature is enabled or not. crates.io, the official Rust
package repository, doesn't provide separate per-feature downloads; only whole
packages.

For clarity, let's divide crates into two categories: library crates, which are
for use in other crates; and binary crates, including applications and linkable
libraries, which are compiled for "end use".

Library crates, whether on crates.io or packaged elsewhere, are still a bunch of
source files. In the Debian context, only the test stage involves compilation.
Currently we "build" them, as in running `cargo test`, but no build artifacts
are collected. They are installed to the system as a copy of the source files,
and autopkgtest runs the tests again.

When everything goes smoothly, a library crate can be used by others as normal,
and features need no special handling.

When a feature needs to be disabled, however, for example because it needs
nightly Rust, which we don't provide, current practice represents this as the
absence of the feature package. A binary crate that depends on the feature can't
find the feature package and fails to build.

What if we don't map features? The binary crate would depend on the library
crate's package, install it successfully, and fail at the build stage. Same
result. Rust has an outstanding reputation for easy-to-understand error
messages, so the packager should have no trouble finding the reason.

But the build takes longer to fail! Well, we could record that in a custom field
in `debian/control`, such as `X-Rust-Disabled-Features: nightly`, so our build
tool would know a feature is missing and exit early.
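
A minimal sketch of such an early check follows. The field name comes from the suggestion above; the stanza, the function names and the parsing are assumptions for illustration only, and multi-line (continuation) fields are not handled.

```python
def disabled_features(control_stanza):
    """Return the set of features named in X-Rust-Disabled-Features."""
    for line in control_stanza.splitlines():
        name, sep, value = line.partition(":")
        # Field names in control files are case-insensitive.
        if sep and name.strip().lower() == "x-rust-disabled-features":
            # Accept comma- or space-separated feature names.
            return set(value.replace(",", " ").split())
    return set()


def check_features(control_stanza, requested):
    """Return the requested features that the library package has disabled."""
    return sorted(set(requested) & disabled_features(control_stanza))


# Hypothetical stanza for a library crate package.
STANZA = """\
Package: librust-foo-dev
Architecture: any
X-Rust-Disabled-Features: nightly
"""

missing = check_features(STANZA, ["default", "nightly"])
if missing:
    print("disabled features requested:", ", ".join(missing))
```

A build tool could run a check like this before invoking cargo and stop as soon as `missing` is non-empty.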

When a binary crate is packaged, the packager decides which features and
dependencies are enabled or disabled, so they need to know this information
anyway.

With "featureless" packages, we have far shorter *Depends* and *Provides*
fields, fewer packages (ideally 1:1 to crates), and less clutter.

==== What about circular dependency?

I did a little analysis on crates.io-index, and found there are three kinds of
circular dependency in the Rust package ecosystem.

1. A crate depending on older versions of itself. Strange, but acceptable. We
already have versioned packages, e.g. `librust-curl-dev` for 0.4 and
`librust-curl-0.3-dev`.

2. A "main" package depending on a "component" package, and the component
dev-depending on main. dev-deps are for tests, which we could put in
`debian/tests/control`.

3. The exact situation described above: A+X → B+Y && B+Z → A+W. This could be
further divided into two situations:
	a. building A or B, which can be treated the same as 2; or
	b. building a binary C that depends on A+X or B+Z, where the packager decides
	which features and dependencies to enable, so nothing needs to go in
	*Depends*.

=== Versions

We can have `librust-foo-dev (>= 0.1.0), librust-foo-dev (<< 0.3)` in *Depends*.
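
To illustrate, here is a hedged sketch of translating a comma-separated comparison requirement into such relations. The function name is hypothetical, caret and tilde requirements are not handled, and pre-release and Debian revision edge cases are ignored.

```python
# Cargo comparison operators mapped to Debian relation operators.
# Debian uses "<<" and ">>" for strict comparisons.
DEBIAN_OPS = {">=": ">=", "<=": "<=", ">": ">>", "<": "<<", "=": "="}


def to_debian_depends(package, requirement):
    """Translate a comma-separated comparison requirement into Depends clauses."""
    clauses = []
    for comparator in requirement.split(","):
        comparator = comparator.strip()
        # Try two-character operators before one-character ones.
        for op in (">=", "<=", ">", "<", "="):
            if comparator.startswith(op):
                version = comparator[len(op):].strip()
                clauses.append(f"{package} ({DEBIAN_OPS[op]} {version})")
                break
        else:
            raise ValueError(f"unsupported comparator: {comparator!r}")
    return ", ".join(clauses)


print(to_debian_depends("librust-foo-dev", ">= 1.5, < 3.0"))
# librust-foo-dev (>= 1.5), librust-foo-dev (<< 3.0)
```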

== Proposal

For features, we still put strong dependencies in *Depends*. Optional and test
dependencies should be put in `debian/tests/control`, which is effectively
*Test-Depends*. Optional dependencies could go in *Suggests*, but that is
largely irrelevant to end users, so we may as well not bother.

Current practice needs a transition. We could progressively update packages to
stop depending on feature packages, starting with those that have no reverse
dependencies, and then remove the features from *Provides*.

For versions, we should start expressing version constraints as version
relations in *Depends*. This also needs a transition, but it can be folded into
the feature one.

-- 
Sdrager,
Blair Noctis
