Re: The noudeb build profile and dh-only rules files
On Mon, Jul 08, 2019 at 10:04:06PM +0200, Moritz Mühlenhoff wrote:
> Theodore Ts'o <tytso@mit.edu> wrote:
> > Back in the days of boot/root installation floppies, saving every last
> > byte was clearly important.
>
> It's probably worth discussing/investigating whether udebs in general still
> make sense for d-i in 2019?
>
> It was a design choice made 15 years ago, but disk/network constraints
> changed a lot since then. Maybe ditching them entirely would actually
> reduce a lot of toil in d-i and make d-i development more flexible?
> (Honest question, not trying to insinuate anything)

I think there are a lot of things one could plausibly say about good
directions for the Debian installer. "Move to debs" isn't the one I'd
pick, although if somebody thinks it's useful they could certainly try
working out the details. I would say that udebs themselves don't
particularly cause much toil in d-i, although they do cause some for
maintainers of non-installer packages that are reused by d-i (as shown
in this thread).

Note that udebs don't exist solely because of disk constraints, much
less network constraints (which were never really a big consideration):
another reason for the separate package type was so that there was no
plausible way they could accidentally be installed on "real" systems,
and so a good argument could be made that they didn't need to comply
with all of Debian policy. For the ones that are merely minor
variations of existing runtime library packages, the difference is less
obvious, but it's more striking when you look at the udebs that form
part of the installer itself. So IMO any attempt to drop udebs and move
d-i to debs needs a lot of thought about the knock-on effects.

(Some more stream-of-consciousness stuff follows. Credentials: I have a
solid decade or so of contributions to d-i and either wrote or heavily
hacked on quite a few of its core components, although I haven't been
particularly active recently: so any express or implied criticism should
be read as being directed at least as much at myself as anyone else.
But in any case I'm not trying to apportion blame. I'm taking some
rhetorical liberties when I say "we designed d-i" etc., since I only got
involved in 2004 and wasn't around in the earliest stages.)

In 2003ish, I think setting the goal that d-i should reuse as much as
possible from Debian made a lot of sense. boot-floppies suffered really
badly from being more or less a monolith that was off on its own
relative to the rest of Debian, and you can see the reaction to that
very clearly in d-i's design: it's broken down into many small
components which can be maintained and uploaded (and migrated to
testing, for that matter) independently, and reuses both code and
concepts from elsewhere in Debian. Once it got past a tipping point of
basic usability in late 2003 / early 2004, it became very accessible for
Debian developers to come in and hack on. I have fond memories of time
spent in various in-person d-i meetings that routinely had something
like a dozen active hackers: maybe this just reflects my interests, but
it's by far the largest group I've worked with on any single project in
Debian before or since.

If we'd been doing the same thing 10-15 years later, I think we'd have
taken different decisions:

* There might have been other interesting axes of reuse, as well as or
  instead of prioritising reusing other parts of Debian: just as we
  might have decided to reuse Anaconda back in the day, nowadays I hear
  good things about Calamares. There've been benefits to being
  Debian-specific, but there've also been costs.

* I think/hope that we collectively understand software engineering
  rather better than we did then, and we would certainly have designed
  something that was more amenable to unit testing. (Some parts of d-i
  do have localised unit tests, but some of the most difficult
  components are really very challenging to test outside an installer
  environment, and mostly that isn't a property intrinsic to what
  they're doing but rather to their implementation choices.)

* Size constraints and such are indeed not what they were 15 years ago,
  although there's still some value in trying to be lean. The thing I
  would take from this is not to abandon udebs, though, since despite
  the name I don't think size is their only important property; rather,
  if we were designing d-i today I think we might very well make
  different language choices. We've pushed shell a long way, but it
  really isn't a particularly suitable language.

* Things like cloud instances are a big deal now when they really
  weren't 15 years ago. In many cases these are deployed using
  pre-built images and an installer doesn't really get involved, but
  there are still more situations where installer performance matters
  now than when we were designing d-i.

You could take the lesson from this that we should ditch d-i and move to
another system. (Indeed, Joey wrote
https://joeyh.name/blog/entry/propellor_is_d-i_2.0/ a while back which
is arguably saying pretty much that, or people are working on Calamares,
or there's the Ubuntu effort to use subiquity, etc.) On the other hand,
unlike the transition away from boot-floppies, we now have a widely-used
system with a very flexible customisation mechanism (preseeding), and
I'd be concerned that Debian might systematically undervalue its users'
time if it chose to move away from d-i. We are where we are. It
doesn't hurt us that other installers exist, but I'd like d-i to stay
healthy and improve. So here are some very biased ways I think we could
systematically improve d-i in the next decade:

* Incrementally rethink some implementation choices.

  Our policy for a long time has been that we only use busybox sh and
  C, and there were good reasons for that at the time, but it's also
  2019 now and maybe other choices are possible without doing
  unreasonable things to the sizes of those images that are still
  constrained.

  A memory-safe language with good testing support and a good testing
  culture would be great, though it does also need to work on every
  Debian architecture, which IIRC Rust doesn't quite; we've kicked
  around the idea of maybe a stripped-down Python in the past, which
  would preserve some useful live-hackability properties while being
  much more capable. Only the parts of the installer up to the
  retriever are really tightly size-constrained, and lots of
  interesting things like the partitioner come after that point.

  Anything like this could be done piecewise, but I think we would want
  to pick basically one extra language and stick with it, so it'd need
  to be worthwhile.

* Targeted rewrites of components with overwhelming technical debt.
  I proposed an os-prober rewrite a while ago
  (https://lists.debian.org/debian-boot/2017/01/msg00245.html), but
  haven't got very far with it; maybe this is the sort of size that
  could be tackled as a GSoC project?

  partman is also extraordinarily delicate. I think there are probably
  fewer than half a dozen people on the planet who properly understand
  it, and it could really do with (at least) a robust test suite and
  the ability for recipes to be nested in ways that correspond roughly
  to the various ways in which block devices can be nested, rather than
  the current extremely ad-hoc arrangements for dealing with things
  like RAID and encryption. Since it runs after the retriever, it can
  probably afford to use a better implementation language, although it
  is itself divided into many components so any rewrite would involve a
  lot of thought.

  Anything like this should have solid unit tests as a mandatory part
  of its development.

* Look into reducing the burden on non-installer maintainers.
  This sort of thread comes up because Debian developers who don't
  really work on the installer but whose software happens to be part of
  the installer need to carry code to support it. Maybe there are ways
  to relax that and at the same time avoid some of the problems we
  sometimes run into where d-i ends up blocked on non-installer
  packages, without having to abandon the useful properties of udebs
  entirely. For example, we could allow d-i to be more flexible about
  pulling in parts of debs at build time, or we could work out how to
  mix some (maybe tagged?) debs and udebs in the installer environment,
  or something like that. Not sure, and it wouldn't be
  straightforward, but it's worth a look.

* Performance.
  Migrating hot spots away from big piles of shell would help, but I'm
  sure there are other bits of low-hanging fruit. I profiled partman
  some years ago and was able to find serious improvements in its
  partition update and display code without needing to do anything more
  complicated than a bit of caching and hoisting code out of inner
  loops. There's surely more of the same. Other approaches might
  involve doing more to ensure we're disabling fsync et al where
  appropriate, or installing the relatively fixed base system from an
  image rather than from debs.

* Just ordinary maintenance. Really.
  Cyril has been doing sterling work keeping things going, but I hope
  they won't be offended when I say that I have the impression that
  d-i's bus factor seems to be decreasing gradually; maybe it works
  well enough for most DDs that they don't need to be very involved,
  but it's not as sustainable as the days when we could easily get a
  dozen or more people on feature development. If anything on this
  list sounds interesting to you, go and help out for a while first so
  that you get a feel for how things work!
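
To make the "bit of caching and hoisting code out of inner loops" from
the Performance item concrete, here is a minimal sketch in Python. It is
not the actual partman change (partman is shell, and I'm not reproducing
its code here): probe_sector_size, the partition tuples and the fixed
512-byte sector size are all hypothetical stand-ins for "some expensive
per-device query that the display code repeats for every row".

```python
from typing import Dict, List, Tuple

# (device, first sector, last sector); a made-up shape for illustration.
Partition = Tuple[str, int, int]

# Counts calls to the expensive probe so the saving is visible.
PROBE_CALLS = {"count": 0}


def probe_sector_size(device: str) -> int:
    """Hypothetical stand-in for a slow per-device query."""
    PROBE_CALLS["count"] += 1
    return 512  # assumed constant for this sketch


def render_naive(partitions: List[Partition]) -> List[str]:
    """Probes the device again for every partition row."""
    lines = []
    for device, first, last in partitions:
        sector_size = probe_sector_size(device)  # repeated work in the loop
        lines.append(f"{device} {(last - first + 1) * sector_size}")
    return lines


def render_cached(partitions: List[Partition]) -> List[str]:
    """Probes each device once and reuses the answer for later rows."""
    sector_sizes: Dict[str, int] = {}
    lines = []
    for device, first, last in partitions:
        if device not in sector_sizes:
            sector_sizes[device] = probe_sector_size(device)
        lines.append(f"{device} {(last - first + 1) * sector_sizes[device]}")
    return lines
```

Both functions produce the same rows; the cached one only calls the
probe once per distinct device rather than once per partition, which is
the whole trick. The same shape applies in shell, where the "probe" is
typically a subprocess and so proportionally much more expensive.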
--
Colin Watson [cjwatson@debian.org]