
Woody retrospective and Sarge introspective



Hello world,

So, I'm going to try breaking my -devel-announce habit. I wonder if this
means nobody'll notice this mail. I guess I can hope, hey?

As promised: I hereby boldly predict that Debian GNU/Linux 3.0 (woody)
will be released before the end of July, 2002!

And now, a brief musical interlude.

      The battle's done, and we kinda won,
      So we sound our victory cheer:
            Where do we go from here? 

Anyway, as the dust settles, it's probably about time to work out what
just happened and figure out, well, where *do* we go from here?

So what did just happen? Well, we just tried a release with a completely
new methodology: rather than hack on stuff for a while, then spend
however long it takes making it reliable, we tried to keep woody bug
free for the entire time we worked on it.

It didn't quite work out that way, of course. We had a whole bunch
of problems:

	2000/01/16 - 2000/12/19:
		woody and unstable were literally the same thing, so
		woody certainly didn't have minimal bugs by any means. on
		the 19th, woody got rolled back to equal potato, losing
		X 4.0, and bunches of other stuff. it took a few more
		months to get those packages back in in a way that didn't
		break things too badly. Up until the first few months
		of 2001 (ie, around six months after potato's release),
		it required pretty constant attention, all of which did
		a good job of distracting me and presumably others from
		getting on with the other parts of releasing woody.

	2000/01/16 - 2002/04/09:
		woody missing functional boot-floppies. in more detail:
			- 2001/04/08: no boot-floppies at all
			- 2001/06/21: boot-floppies that didn't work even 
			              on i386
			- 2001/08/24: boot-floppies only built on some of the
			              architectures that released with potato
			              (alpha was the last one)
			- 2001/10/18: boot-floppies not built for all the
				      architectures that ended up
				      releasing with woody
			- 2002/03/14: debootstrap has RC bugs of one sort or
			              another
			- 2002/04/09: b-f's built everywhere with releasable
			              components
		(there was an additional b-f's build done between May 16, and
		 May 21st)	

	2000/01/16 - 2002/04/16:
		missing CDs for some/all architectures (we had unofficial
		i386 CDs since 2001/07/20 and probably earlier, we
		didn't get alpha CDs until some time after 2002/04/16,
		sparc CDs weren't bootable at least until the most recent
		b-f's upload in May). On the other hand, we didn't really
		make much of an effort towards getting official woody
		CDs until early this year, either.

	2001/10/19 - 2002/06/21:
		more architectures in woody than the security team were
		able to support. (mips, mipsel and s390 were added on
		the 19th.  hppa and ia64 had been added earlier than
		that, so it's possible that it would've been too much
		effort to support those architectures too) the security
		team had been trying to improve the situation for quite a
		while on general principle, but nobody realised just how
		much of a problem it was until March or April this year.
		everyone probably remembers how the story went from May
		onwards.

Those issues alone were enough to make woody not be released before June:
releasing without CDs or any way of installing particular architectures is
unacceptable, as is releasing software we can't do security updates for.
And given that July was largely spent making sure the new security
infrastructure worked, and fixing bits that didn't, it pretty well
means that we couldn't have released _any_ earlier unless we'd done
the above substantially better, somehow.

What's particularly interesting is that we *didn't* end up spending the
last few months before the release with everything else ready, trimming
down the release critical bug list. By contrast, even though potato had
working boot-floppies and CD images at the end of the second test cycle
[0], we still had to make changes to dpkg, gcc, postgresql, sysvinit,
xfree86 and so on. There were about 150 packages updated all up in the
month between test cycle 2 and test cycle 3 (test cycle 3 took another
month, with no changes). In the three months leading up to woody's release
(May - July), there were about 100, which compares fairly favourably,
especially when you also consider the increase in both packages (3900
to 8300 in main/i386) and architectures (6 to 11).
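
A rough back-of-the-envelope check of that comparison, as a small
Python sketch. The figures are the ones quoted above; normalising by
month and by the size of main/i386 is just one assumed way of comparing
the two run-ups, not something that was measured at the time.

```python
# Figures quoted in the text; the per-month, per-package normalisation
# is an assumption made for illustration.
potato = {"updated": 150, "months": 1, "packages": 3900}
woody = {"updated": 100, "months": 3, "packages": 8300}


def monthly_update_rate(release):
    """Fraction of main/i386 packages updated per month before release."""
    return release["updated"] / release["months"] / release["packages"]


potato_rate = monthly_update_rate(potato)
woody_rate = monthly_update_rate(woody)
ratio = potato_rate / woody_rate

print(f"potato: {potato_rate:.2%} of packages updated per month")
print(f"woody:  {woody_rate:.2%} of packages updated per month")
print(f"potato churned about {ratio:.1f} times as fast")
```

On those numbers the late churn in woody was roughly a tenth of
potato's, before even accounting for the extra architectures.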

Also fairly indicative of the decreased effect of the release critical bugs
on the release process overall is probably the "spike" in mid-February on
the RC bug graph:

	http://bugs.debian.org/~wakkerma/bugs/graph.png

It drops from a peak of about 442 on 2002/02/15 to a low of 95 on
2002/03/23. This coincides most notably with the end of linux.conf.au
2002 on the 9th (which freed up a fair amount of time for me and thus
resulted in a fair number of packages getting dropped from woody and
bugs getting downgraded), and also with a couple of bug squash parties
and probably some renewed vigour amongst NMUers in general [0].
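
For a sense of scale, the size of that drop can be worked out from the
two data points above. This is only a sketch: it gives the net change
in the count, which includes dropped packages and downgraded bugs, not
the number of bugs actually fixed.

```python
from datetime import date

# The two data points quoted above from the RC bug graph.
peak_day, peak = date(2002, 2, 15), 442
low_day, low = date(2002, 3, 23), 95

days = (low_day - peak_day).days
net_drop = peak - low

print(f"{net_drop} fewer RC bugs over {days} days, "
      f"about {net_drop / days:.1f} per day net")
```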

Those were the big ticket items, anyway -- at least as far as I'm
concerned. It wasn't all that happened by any means. We also had:

	crypto-in-main: This took absolutely ages to get done. Finding
		a lawyer to get legal advice from, working out what
		exactly we wanted advice about, making sure we understood
		the advice, working out what we'd do about it, and then
		doing it all took months. It was probably important to
		get this done: a fair chunk of software -- most notably
		everything postgresql related -- that was in main in
		potato had started depending on the SSL libraries by mid
		last year, and having to remove that from main would have
		been fairly unpleasant in a bunch of ways. This was all
		finished reasonably well by the end of April, though.

	the gradual freeze: You'll probably remember that for most of last
		year I was advocating a gradual freeze: first we'd freeze
		policy, then base, then boot-floppies, standard and
		base, and finally optional and extra, then release. This
		was a complete flop. We got away with freezing policy
		reasonably well (although even it had to be "corrected"
		with a helpful little webpage [1]). We tried freezing
		base at the end of November, but by January we still had
		a fair number of severe bugs in base (glibc alone was
		still having packaging updates in late March and April),
		and base uploads started getting automatically promoted
		to testing again, albeit with twice the delay.

		The only part of this that probably *was* effective
		was the "NO MAJOR CHANGES!!" policy, which has been
		more-or-less in effect since late last August (11
		months prior to woody's release, although I guess most
		of you probably could've worked that out yourself :).
		It was *probably* more effective than we needed, in fact.

	the RC bug list: Unfortunately, we pretty majorly screwed this one
		up. The bug tracking system has never been particularly
		good at handling multiple distributions. This hasn't
		been too much of a problem when we just had stable
		and unstable, since we've never worried too much about
		tracking the bugs in stable, but it's a fairly serious
		problem with testing, since it makes it hard to get a
		grip on what RC bugs remain in testing: some RC bugs
		are new in the unstable version, others have been fixed
		in the unstable version, and so forth. This needs some
		fairly significant changes to the BTS before we can fix
		it though (which was why it hasn't been done already).
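
To make the problem concrete, here is a minimal Python sketch of the
sort of per-version information that would be needed to answer "which
RC bugs are in testing?". Everything in it is hypothetical: the Bug
class, the found_in/fixed_in fields and the plain integer versions are
made up for illustration, and it is not how the BTS works (real Debian
version comparison is also more involved than comparing integers).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Bug:
    package: str
    found_in: int                   # first version known to have the bug
    fixed_in: Optional[int] = None  # first version known to be fixed

    def affects(self, version: int) -> bool:
        """True if the given package version carries this bug."""
        if version < self.found_in:
            return False
        return self.fixed_in is None or version < self.fixed_in


def rc_bugs_in(dist, bugs):
    """RC bugs affecting the package versions present in one distribution."""
    return [b for b in bugs
            if b.package in dist and b.affects(dist[b.package])]


# Hypothetical package versions in each distribution.
testing = {"foo": 2, "bar": 5}
unstable = {"foo": 3, "bar": 6}

bugs = [
    Bug("foo", found_in=3),              # new in the unstable version
    Bug("bar", found_in=4, fixed_in=6),  # fixed in the unstable version
]

print([b.package for b in rc_bugs_in(testing, bugs)])
print([b.package for b in rc_bugs_in(unstable, bugs)])
```

With only one bug list and no version information, both bugs would be
counted against both distributions, which is exactly the confusion
described above.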

There's probably more, but given that it'd be nice to finish writing the
email before sarge is released, let's move on. Hopefully that covers
the most significant parts of "what just happened?" on the "release
management" side of things. That, naturally enough, leaves us with
"where next?".

Well, what's next is another distraction:

	http://lists.debian.org/debian-devel-announce/2001/debian-devel-announce-200104/msg00004.html

Grep for "realistic schedule". Doh.

I think there're probably three things we want out of sarge's transition
to stable:

	(1) Speed.
	(2) Less wasted time.
	(3) Better communication and more transparency.

The main things blocking (1) were:

	(a) the six month delay before we started trying to release
	(b) boot-floppies taking twelve months to develop
	(c) CDs not being ready
	(d) not noticing that security wasn't ready, then getting it ready

By (2), I'm referring to things that could have been ready by the time
we released (or that *were* ready), but that nevertheless didn't make
the release. dpkg 1.10, xfree86 4.2, better i18n in a number of places,
and a few other things. The things blocking this were (probably):

	(e) the "NO MAJOR CHANGES!" freeze, 11 months prior to release
	(f) the staged freeze, from a similar time
	(g) the real freeze, from 3 months prior to release

The final point, (3), was probably the most irritating for a bunch of
people. Particularly from January 2002, it really wasn't made particularly
clear where we were going, or how we were going to get there. In most ways,
there was little to be done about that for woody: it's difficult to tell
people what's going to happen when, if you don't know yourself; and that's
naturally the case when the things that need to happen involve significant
development, which each of boot-floppies (from January 'til April or so),
CD images, and the security buildd stuff did. OTOH, we can probably do
a better job of avoiding getting ourselves in that situation in future.


So anyway, that's my take on what happened and what the problems are. I
mightn't be the most unbiased observer, of course. There might be other
problems that are more significant than those above: delays getting
packages into testing because other packages in unstable are broken
can be annoying -- and maybe it's more important to fix that than to
release quicker; the RC bug list might still be a bottleneck, in spite of
"testing" or the above arguments. I don't think so, but you might.


My opinion on what we should do about this hasn't changed much: I still
think the best way of getting consistent, controllable releases is to maintain
a candidate distribution in a releasable state permanently.

Where we failed in doing that for woody is probably pretty obvious:
we didn't have releasable boot-floppies, official CDs, or a security
updates repository for woody for most of its life as "testing". As such
testing *wasn't* releasable for quite a while, and we had to wait ages
before we did actually release.


Time for another digression. I've been asked a few times if I intend
to continue as release manager for sarge, and I've been fairly coy and
noncommittal about answering. Nominally, there're only two things you
need to be release manager: the support of the DPL (to get your name
up in lights on the "organisation" page), and the support of ftpmaster
(so that when you say "it's released", it actually is). In reality,
at least in my experience, it's at least as important to have most
people either listen to what you say and head in that direction, or
give you an equally or more effective alternative. I can sing
and dance all I like about having everything releasable at all times,
but if no one else gives a damn about it, we're not going to get anywhere.

Now, I don't really know how to judge the second part. I'm concerned that
there've been comments like:

    <moshez> aj certainly has given the impression he wishes to continue
      to be RM, and from the "vote [1] bdale" I surmise he has had some
      understanding with bdale...''

(this isn't the case, remotely) or that I have to go through huge
flamewars fairly regularly about release oriented matters [2].  Now,
maybe that's just a minority thing, or maybe it's just indicative of
the project as a whole at the moment, but it's not helpful for me to be
feeling nervous about raising these matters, least of all when *more*
transparency and communication is what we're aiming for.

Now, like I said I don't really know how to judge this well. Maybe the
easiest way to start is by seeing if the histrionics in this thread
can be kept to a minimum. Anyway, I'd like to continue bossin' y'all
around as RM, but it's possible everyone's sick of me doing that and
would rather it come from someone else. I've no idea. But presuming that
there's at least some curiosity as to what I see happening for sarge,
I'll keep on rambling anyway.


The way I would like to see sarge run is basically to get it to
"roughly" releasable status ASAP then keep it there. By that I mean
getting some official CDs made for testing quickly, and automating them
so they get updated every week or so. I mean getting some installers for
sarge into the archive and maintained across all architectures as sarge
develops. Likewise for security updates. And, in general, likewise for
anything else that's important to have working when we release.

The easiest one of these to get and keep working is probably the
CD images.  I've already talked to at least Phil Hands about this,
and it looks like we should be able to set up raff.debian.org to serve
jigdo images for sarge (and some isos) updated on anything up to a
daily basis.  Getting them to be bootable obviously depends on us getting
some installation tools that work with sarge, but hopefully that can be
worked around in the meantime -- CDs that're only useful for upgrades
are better than none at all.

Security updates are probably not too difficult to manage either. The
hardest part is probably keeping track of which packages need updating and
what the fixes are. On the upside, the current security buildd stuff that
we just did for woody should work with sarge already. It's not clear if we
need a separate security team to manage updates to packages in testing,
or exactly how much effort it will be. (And before suggesting that we've
already got this problem licked with the new security infrastructure,
please realise that we've got the possibility of releasing a Hurd or
BSD based distro in the not-too-distant future, either of which has
reasonable odds of introducing new complexities to maintaining security
updates; and, in any case, it's the unforeseeable problems that we're
aiming to avoid)

Getting the installer right, however, could be quite difficult. We've
never done it before, so who knows? There are many problems with
boot-floppies, but the only ones that're relevant here are the ones
affecting their maintainability. In particular it's proven exceedingly
difficult to get boot-floppies to build across all architectures, and
to update it from one release to the next.

One set of problems is related to just how difficult it is to get b-f's
to build. Unlike just about everything else in Debian, b-f's doesn't
autobuild.  Try it, you won't like it. Building involves getting an up
to date checkout from CVS, making sure your kernel has the appropriate
options available, getting various .debs from the archive to flesh out
the installation disks themselves, trimming unused symbols from any
libraries to make sure they fit in 1.44MB, and so on. Kernel updates,
program updates, just about everything tends to knock b-f's off balance
and require a fair degree of thought and effort to make them work again.

The official solution to this is the debian-installer project. It's a
rewrite of the installation system from the ground up, with two important
features. The most straightforward is that it uses debconf for its user
interaction. The more important, as far as ongoing maintainability is
concerned, is that whereas the woody installer is built as a single
"chunk" (ie "boot-floppies"), debian-installer is made up from many
"udebs" (micro-debs) which, generally, are smaller counterparts to
real debs. For example there is a debootstrap.udeb, which provides
the installer with the tool to actually unpack the base system. It's
built as part of the regular debootstrap upload, and autobuilds quite
successfully. This applies to the entire d-i installer: all the tools,
busybox, parted, kernel-image, whatever are all built as .udebs and
that's all there is to it.

Well, actually it's not quite all, and this is the first problem with
d-i that probably needs addressing. You can't put a udeb onto a floppy
disk and expect it to boot. You have to run a special script that gathers
all the udebs you want to boot with (a kernel, a UI of some sort, a tool
to let you get any other udebs you might need and aren't including from
the net or a CD or similar), makes a filesystem out of them, and makes
it bootable. This is fine and necessary; the part that's a problem is
that this script is likely to need to be run by hand (not autobuilt)
on a machine of the target architecture, probably as root. While this
is better than the b-f's situation, it's not particularly efficient
or scalable. What we want is to be able to upload a new version of
debootstrap, and then have udebs automatically built, and floppy and
CD images automatically constructed for all architectures. Probably the
most difficult part of this is making, say, a bootable powerpc image on
an i386 host.
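
The gathering step can be sketched roughly as follows. This is an
illustrative Python sketch only: the udeb names, sizes and dependencies
in the catalogue are made up, and the real script also has to build the
filesystem and make it bootable, which is where the architecture
specific pain comes from.

```python
FLOPPY_BYTES = 1440 * 1024  # capacity of a "1.44MB" floppy

# Hypothetical catalogue: udeb name -> (size in bytes, dependencies).
UDEBS = {
    "kernel-image": (700_000, []),
    "busybox": (250_000, []),
    "ui-text": (60_000, ["busybox"]),
    "udeb-retriever": (40_000, ["busybox"]),
    "debootstrap": (90_000, ["busybox"]),
}


def gather(wanted, catalogue):
    """Return the wanted udebs plus everything they depend on, sorted."""
    selected = set()
    stack = list(wanted)
    while stack:
        name = stack.pop()
        if name in selected:
            continue
        selected.add(name)
        stack.extend(catalogue[name][1])
    return sorted(selected)


def total_size(names, catalogue):
    """Sum of the (uncompressed) sizes of the named udebs."""
    return sum(catalogue[name][0] for name in names)


# A kernel, a UI of some sort, and a tool to fetch further udebs.
boot_set = gather(["kernel-image", "ui-text", "udeb-retriever"], UDEBS)
fits = total_size(boot_set, UDEBS) <= FLOPPY_BYTES

print(boot_set, "fits on a floppy" if fits else "too big for a floppy")
```

Note that debootstrap is deliberately left off the boot set here: it is
the sort of thing the retriever would fetch later from the net or a CD.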

The second issue, which isn't really a problem, but could use addressing
anyway, is that as a complete rewrite of just about everything, d-i is a
pretty risky project. At present, even getting cfdisk to work with d-i
is a major undertaking, since it has a UI that's not debconf. However,
the nice thing about .udebs is that, being modular, you can mix and match
them however you like. As such, it should be possible to construct a
dbootstrap.udeb or a pgi.udeb, that uses the boot-floppies UI or PGI's UI
to install a Debian system, while retaining most of d-i's maintainability
benefits. This isn't likely to be easy on two grounds.  First, d-i fairly
fundamentally assumes you're using debconf, and neither dbootstrap nor
PGI are likely to sit well with that. Avoiding that might be as simple
(if inelegant) as just not running the postinst's of any udebs when you
"install" them, or it might be more complex. The other problem is that,
for obvious reasons, dbootstrap and PGI don't have any code to download
and install new udebs. That's probably a necessary feature if you want
to allow floppy installs. PGI probably has the additional problem that
it'd need xfree86 udebs before it could really work -- if anyone wants
to try making PGI.udebs, it might be a good idea to get the text-only
mode working first.

The third issue, just for completeness, is that d-i still only works in
a very limited sense. It's not ported to non-i386 architectures, and it
isn't remotely as flexible as woody's installer from a user's POV yet.
There's no big surprise there though, really.


So anyway, that's my theory. What I'd like to see in the next month
or so are some official "testing" CD images (jigdos for everything,
.isos for at least CD#1 and CD#2 of i386, I guess) on a .debian.org
site that're getting updated regularly (at least every two weeks). I'd
like to see some limited, but working d-i images for i386 in a similar
time frame; for bonus marks adding some bootable 50MB (or so) testing
CD images along with the others. I'd also like to see some sort of
movement towards getting proper security support for testing, but real
milestones for that are hard (for me) to pick.

(Bdale's probably going to send out a mail about all the /real/ features
we're aiming towards for sarge sometime soon, so happily I don't have
to :)

Cheers,
aj

[0] Probably worth reading again, in case all the excitement since has
    let you forget. NMUs are good, mmmkay?

      http://lists.debian.org/debian-devel-announce/2002/debian-devel-announce-200201/msg00014.html

[1] http://people.debian.org/~ajt/woody_policy_addenda.txt

[2] eg, Bug#97671 which went all the way to the tech ctte, or
    things like getting b-f's to stop adding new features or changing
    kernels and so on while they don't work everywhere, or on what hurd
    should do to be suitable for a stable release.

-- 
Anthony Towns <aj@humbug.org.au> <http://azure.humbug.org.au/~aj/>
I don't speak for anyone save myself. GPG signed mail preferred.

 ``If you don't do it now, you'll be one year older when you do.''
