Hello world,

So, on -devel-announce, I mentioned:

> * New "testing" distribution
>   This is a (mostly finished) project that will allow us
>   to test our distribution by making it "sludgey" rather
>   than frozen: that is, a new distribution is added between
>   stable and unstable, that is regularly and automatically
>   updated with new packages from unstable when they've
>   had a little testing and no new RC bugs.
>
> (Anthony Towns; debian-devel)

It's basically ready to be stuck in the archive now, as far as I can
tell, but since it's not exactly a trivial change, it's probably time
to discuss it a bit more.

The basic idea, simplified immensely, is to address this problem:

> * Testing updates to frozen is suboptimal: updates go into
>   incoming, wait there for a while, get added to frozen,
>   we discover they introduce as many release critical bugs
>   as they solve, rinse, repeat. The "wait for a while" part
>   is particularly suboptimal, but without it, it's not really
>   a freeze.

The current way we do things is basically to build a new package, hope
it works as advertised, and let people test it. If it doesn't work, we
repeat as many times as necessary, or eventually just throw the package
out.

A better way to handle this, which I suspect everyone's just
spontaneously reinvented as they read the above, is to try to keep
around a previous version of the package that was usable. That way, if
the new packages don't work, we can just keep the old one rather than
having to throw it out entirely.

That, essentially, is the point of the "testing" distribution: to
contain a consistent set of the most recent "believed-to-be-reliable"
packages. Some subheadings follow.

Why call it testing?
~~~~~~~~~~~~~~~~~~~~

One thing that the freeze is really bad at is fixing "normal" bugs. The
point of packages in testing is not that they should be perfect or
bug-free, just that they should be usable. There's a lot of difference
between what we'd like to release (0 bugs, many many features) and what
we'll accept for release (~0.005 RC bugs :), and this is really where
beta testing should fit in.

It also sorts nicely compared to "stable" and "unstable" :)

What does "acceptable for release" mean?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For one thing, it means the packages are all consistent: if libgtk1.2.7
is in the distribution, none of the packages should be depending on
libgtk1.2.8. For another, it means packages shouldn't have any release
critical bugs. It also means a package should be at the same version
across all the architectures it's present in [0]. And it means the
maintainer of the package should be relatively happy with it.

Not having any release critical bugs means: no security holes [1]
(critical or grave), the package shouldn't crash your system
(critical), it should be usable for someone on the planet at least
(grave), and it shouldn't violate policy too severely, by having
incorrect dependencies or no copyright, for example [2] (important).

Note that what I'm writing here is what I think's best, and what's
implemented. If there's an objectively better way of doing things,
well, that's why I'm posting. [3]

Okay. So the next question you're probably asking yourselves is "how
does it work?". Well, you don't have to ask yourself, you can ask me.
Here's a summary.

Archive Layout
~~~~~~~~~~~~~~

As package pools aren't close to being rolled out, I'm opting for as
minor a change as possible (which isn't really very minor).

So instead of two distributions, stable and unstable, we have three:
stable, testing and unstable. As usual, packages get uploaded via
dinstall to unstable, broken and buggy however they might be. By some
automated process, yet to be described, they eventually get added to
the testing distribution. After some amount of time testing gets
frozen, fixed, and released (the theory being that this will be easier
than freezing unstable, fixing it, and releasing). So basically we'd
have:

    unstable -- bleeding edge, broken, etc.
    testing  -- leading edge, maybe buggy, but working
    stable   -- static, usable, going out of date

Automated Process?
~~~~~~~~~~~~~~~~~~

So pretty much all the policy is encoded in some "automated process"
which updates testing. It works at the moment basically as follows (a
rough sketch in code follows the list):

 1. First, it loads up all the Sources and Packages files in testing
    and unstable.

 2. It compares and contrasts them, working out which source packages
    are new in unstable.

 3. For each of these new source packages it checks:

     a. That the package has had two weeks of testing, or it's a medium
        or high urgency package (and has had one week or three days of
        testing, respectively).

     b. That each binary has been recompiled for each architecture it's
        on.

     c. That each binary has 0 RC bugs, or fewer than the testing
        version does [4].

 4. It then collects the source packages that pass 3, and tries
    installing them in various combinations to see if the number of
    uninstallable packages in "testing" either drops or remains the
    same. If so, they're in. If not, they're not.
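To make that a little more concrete, here's a rough Python sketch of
that selection pass. This is not the actual auric scripts: the dict
layout, the field names and the installability check are all invented
for illustration, and the combination search is brute force.

    from itertools import combinations

    # Minimum days in unstable, by urgency, as described in 3a above.
    MIN_AGE = {"low": 14, "medium": 7, "high": 3}

    def passes_checks(new, old):
        """Step 3 checks. `old` is testing's version of the package,
        or None if testing doesn't have it yet."""
        # 3a. Long enough in unstable for its urgency.
        if new["days_in_unstable"] < MIN_AGE[new["urgency"]]:
            return False
        # 3b. Each binary recompiled on each architecture it's on.
        if not all(new["built"].get(arch) for arch in new["architectures"]):
            return False
        # 3c. 0 RC bugs, or fewer than the testing version has [4].
        old_rc = old["rc_bugs"] if old else 0
        return new["rc_bugs"] == 0 or new["rc_bugs"] < old_rc

    def count_uninstallable(dist):
        """Stand-in for a real dependency/installability check."""
        return sum(1 for p in dist.values() if not p.get("installable", True))

    def update_testing(testing, unstable):
        """Steps 1-4: gather candidates, then accept the largest
        combination that doesn't make testing any less installable."""
        candidates = [name for name, pkg in unstable.items()
                      if pkg["version"] != testing.get(name, {}).get("version")
                      and passes_checks(pkg, testing.get(name))]
        baseline = count_uninstallable(testing)
        for size in range(len(candidates), 0, -1):
            for combo in combinations(candidates, size):
                trial = dict(testing)
                trial.update({name: unstable[name] for name in combo})
                if count_uninstallable(trial) <= baseline:
                    return trial
        return dict(testing)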
There are a bunch of helper scripts that ensure that dists/testing is
fully populated, either by symlinks to unstable or by the files
themselves, and that ensure that if a file in unstable is deleted by
dinstall, the symlink is changed to a hardlink to the old file rather
than being left dangling.
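By way of illustration, the fragment below shows the kind of
bookkeeping involved. It's a sketch under my own assumptions (in
particular, that the symlink-to-hardlink conversion runs while the
unstable copy still exists), not the actual helper scripts, and the
function names are made up.

    import os

    def populate(testing_dir, wanted, unstable_dir):
        """Make sure every file testing needs is present under
        testing_dir, normally as a symlink sharing unstable's copy."""
        for name in wanted:
            dst = os.path.join(testing_dir, name)
            src = os.path.join(unstable_dir, name)
            if not os.path.lexists(dst) and os.path.exists(src):
                os.symlink(src, dst)

    def preserve(testing_dir, about_to_be_removed):
        """Before dinstall deletes a file from unstable, turn any
        testing symlink to it into a hardlink, so nothing dangles."""
        for name in about_to_be_removed:
            dst = os.path.join(testing_dir, name)
            if os.path.islink(dst):
                target = os.path.realpath(dst)
                if os.path.exists(target):
                    os.unlink(dst)
                    os.link(target, dst)   # same inode, no extra space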
This has been prototyped on auric, so you can see some stuff about it
at http://auric.debian.org/~ajt/, and you can point apt at it too.
Pointing apt at it probably isn't really too clever: it doesn't really
have the bandwidth for users doing upgrades, or random people doing
mirrors, and I keep changing things around fairly frequently to see how
the scripts hold up. But you can do it.

The actual scripts to do this are all in my home directory on auric,
and so are probably only accessible to developers. auric:~ajt/doit.sh
is the place to start if you want to have a look.

Okay, so what next?

Effects on the Release Cycle
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

So the main point of this is to create a distribution that,
essentially, doesn't have any release critical bugs [5] and can be kept
that way with much less effort on the part of the release manager. That
should have a pretty profound effect on speeding up the freeze, since
it removes one of the two main bottlenecks [6].

So, here's a rough guide as to how releases might work with a testing
distribution, with a focus on minimising time in the freeze:

  * Development time: packages are worked on, new upstream versions are
    installed. testing is kept fairly bug-free. Users can point apt at
    testing, and give feedback to the developers before the freeze,
    without having to worry about bash not working.

  * Freeze preparation: boot-floppies, CD scripts, and release notes
    are updated to work with the new and updated packages.

  * Freeze: any remaining problems in testing are dealt with, either by
    adding them to the release errata, downgrading them, fixing them,
    or removing the package entirely.

Since the remaining problems should be small, the freeze should be able
to be kept very short. In addition, development and freeze preparation
are entirely parallelizable: it's plausible, and even desirable, to
simply continue to maintain boot-floppies, CD scripts, and release
notes throughout the development phase. In an ideal world, testers
would be able to obtain bootable CDs for testing as well as stable.
Even if that doesn't happen, though, eliminating the few bugs remaining
in testing, plus any new bugs uncovered by boot-floppies or CD
generation, should be a lot easier than fixing all the bugs in unstable
as well as whatever new bugs boot-floppies and CD generation uncover.

There's a bit more to it than that, actually, but this mail's probably
already getting long enough. So that leaves...

Transitioning
~~~~~~~~~~~~~

So, here's how I see us ending up when we've *finished* the transition:

    potato/
    woody/
    sid/
    stable -> potato
    testing -> woody
    unstable -> sid

That is, all the unreleased architectures, and all the new and broken
or untested packages, are in sid; potato's still stable; and the
packages in woody are getting less and less buggy.

To effect this, we would:

  * desymlink potato/binary-powerpc and potato/binary-arm (which point
    to sid presently)

  * remove sid/binary-powerpc and sid/binary-arm

  * create symlink trees in sid for each of the released architectures
    pointing at woody

  * remove the symlinks for unreleased architectures from woody

  * point unstable at sid, and update dinstall so uploads to unstable
    go to sid

  * point the testing scripts at woody

The testing scripts need to cope with a few things here:

  * Some .debs in sid will symlink to woody. The .debs in woody
    shouldn't be deleted while they're needed by sid.

  * When a .deb in woody is updated, the .deb will already be in sid
    (and will have been for two weeks). As such, there should simply be
    a symlink from woody to the actual .deb in sid, to conserve mirror
    space.

  * When a .deb in sid is updated, there may be a symlink from woody to
    it. This symlink needs to be replaced with a copy of the real .deb,
    since there's nothing to link to anymore.

They do cope with this at the moment, and they're being prototyped on
auric in /org/scratch/ajt/froody (like woody, but a little weird :).
The way they cope with it is to keep a separate copy of the testing
tree in /org/scratch/ajt/hidden, which, rather than having any
symlinks, is all hard links to the actual .debs. When any of the .debs
is removed, the hard links still remain, and can be copied (well, hard
linked again, actually) into the visible tree.
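The logic of that hidden tree can be sketched like so. Again this is a
minimal illustration under my own assumptions; the function names and
argument shapes are invented, not what the froody prototype actually
uses.

    import os

    def shadow(visible_path, hidden_dir):
        """Keep a hard link in the hidden tree for the real file behind
        a visible entry, so the data outlives sid's copy of it."""
        real = os.path.realpath(visible_path)
        backup = os.path.join(hidden_dir, os.path.basename(visible_path))
        if os.path.exists(real) and not os.path.exists(backup):
            os.link(real, backup)

    def repair(visible_path, hidden_dir):
        """If a visible symlink now dangles, replace it with a hard link
        to the preserved copy (the 'hard linked again' step above)."""
        if os.path.islink(visible_path) and not os.path.exists(visible_path):
            backup = os.path.join(hidden_dir, os.path.basename(visible_path))
            if os.path.exists(backup):
                os.unlink(visible_path)
                os.link(backup, visible_path)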
So there you have it. It's coded. It works. It serves a useful purpose.
I think we should use it.

Cheers,
aj

[0] As opposed to a package being present in all architectures. That
    is, I think it's only appropriate to consider "foo doesn't build on
    the bar architecture" a release critical bug if it's already been
    built there before. And I also think it's appropriate for that bug
    to be downgradable if foo is simply removed from binary-bar.

[1] That is, ones that compromise root, or users' data. What about
    denial-of-service bugs? They don't actually fit into the existing
    severity levels, as far as I can see.

[2] An explicit enumeration of what "too severely" means should appear
    in the next policy update, hopefully.

[3] Here's hoping it won't degenerate like the "Intent to split" mail
    did. Yeesh.

[4] The number of RC bugs against the testing version is assumed to be
    the number of RC bugs against the package when that version was the
    latest in unstable. If it's wrong, it's probably an underestimate,
    so requiring fewer RC bugs in the new package isn't likely to
    introduce too many new errors.

[5] What release critical bugs will it have? Obviously, it'll still
    have any security problems that get discovered, but presumably
    they'll be fixed within a day or two. There'll be bugs that have
    existed for a long time, but that no one's noticed until recently:
    things like the strange copyright and Depends: of dvidvi. One
    source of significant numbers of RC bugs in testing might be
    -policy changes: requiring Build-Depends:, or moving /usr/doc to
    /usr/share/doc, or requiring packages to be built with libc6 can
    declare huge numbers of packages buggy, and take a while to fix.
    The other source of bugs that could be problematic are problems
    like the bugs against net-tools and nscd: ones that are obviously
    critical, but aren't reproducible or diagnosed well enough to be
    fixed.

[6] The other being getting boot-floppies to a point where they can be
    released.

-- 
Anthony Towns <aj@humbug.org.au> <http://azure.humbug.org.au/~aj/>
I don't speak for anyone save myself. GPG signed mail preferred.

  ``We reject: kings, presidents, and voting.
    We believe in: rough consensus and working code.''
                                        -- Dave Clark