
Re: unexpected NMUs || buildd queue



Wouter Verhelst <wouter@grep.be> writes:

> On Sat, Jul 17, 2004 at 08:35:18PM +0200, Goswin von Brederlow wrote:
>> Wouter Verhelst <wouter@grep.be> writes:
>> > On Sat, Jul 17, 2004 at 10:58:57AM +0200, Goswin von Brederlow wrote:
>> >> Wouter Verhelst <wouter@grep.be> writes:
>> >> > In any case, buildd doesn't write to disk what it's doing (the
>> >> > build-progress file is written by sbuild), so if it's aborted
>> >> > incorrectly (i.e., it doesn't have time to write a REDO file), that
>> >> > information goes lost.
>> >> >
>> >> > That's probably a bug, but once you know about it, it's easy to work
>> >> > around (it just means you have to clean up after a crash, but you have
>> >> > to do that anyway, so...)
>> >> 
>> >> Which is one of the things realy screwed up on the buildd/sbuild
>> >> combination.
>> >
>> > What's your alternative?
>> >
>> > You have to clean out the chroot anyway when the system goes down
>> > unexpectedly, or anything horrible might happen. The alternative would
>> > be to clean out and rebuild the chroot automatically -- don't tell me
>> > multibuild tries to do that?
>> 
>> That depends on what method of chroot cleaning / regeneration is being
>> configured.
>> 
>> One option is to have a template chroot (as tar.gz for example) and to
>> untar that for every build anew. Cleaning is a simple rm.
>> 
>> Another option is to have an LVM volume and make a new snapshot of it
>> for every build. Cleaning removes the snapshot.
>
> When and how are those template chroots or volumes updated? What steps
> are taken to ensure those updates don't take away too many resources
> whilst still ensuring the chroots are (reasonably) up-to-date?

The template is normally managed by the buildd admin. At some point he
puts the buildd into maintenance mode (which mostly stops new builds
from being started but doesn't always need to wait for a running build
to finish) and does his thing. It is still his job to keep the template
working.

The current behaviour of having a single chroot that only ever
installs/purges packages and never gets refreshed is available as one
possible configuration. If the buildd admin thinks any of the other
methods waste resources, he can keep the old way.

The buildd will update build-essential and a buildd-essential package
when cloning the template, to ensure builds are always done with
current core packages. The multibuild buildd could compare the version
of buildd-essential from the template with the updated one and notify
the buildd admin if they differ too much (e.g. a different Debian
revision is fine, but a new major version triggers a notification).
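
Roughly, such a check could look like this (the paths, the version
splitting and the mail address are just placeholders, not actual
multibuild code):

    #!/bin/sh
    # compare the buildd-essential version in the template with the one
    # installed in the freshly updated clone; warn the admin on a big jump
    tmpl=$(chroot /srv/buildd/template dpkg-query -W -f '${Version}' buildd-essential)
    cur=$(chroot /srv/buildd/chroot dpkg-query -W -f '${Version}' buildd-essential)
    # strip the Debian revision so only upstream changes count (simplified)
    if [ "${tmpl%-*}" != "${cur%-*}" ]; then
        echo "buildd-essential went from $tmpl to $cur; template needs a refresh" \
            | mail -s "buildd template outdated" buildd-admin@localhost
    fi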

That also means that through build-essential and buildd-essential the
maintainers (or any DD NMUing them) can force the buildds to
update. E.g. when a version of binutils is found to be buggy,
buildd-essential can conflict with it or depend on a newer one.

> In case of the tar.gz template, what happens when - say - a bug exists
> in a postrm script of one of the GNOME packages, resulting in a number
> of multi-gigabyte chroots laying around on the disk?

A failure in purging the Build-Depends doesn't mean the build has
failed. Normally, if a build succeeds you don't keep a copy of the
chroot, and a cleanup failure just means the chroot gets recreated
from the template. On the other hand you might want to keep the chroot
on build failures, but that happens before the cleanup, so the postrm
won't be called. You get those multi-gigabyte chroots lying around
even with a working postrm, if you so choose.

A normal configuration (what I consider normal) would not keep the
chroots around. Only on build failure is the source build tree kept,
together with a log of what was used in the chroot. A tool to recreate
a chroot by looking at a buildd log and using snapshots.debian.net is
on my todo list, and in my opinion that is a better solution than
keeping broken chroots around.

Another thing is that the multibuild plans include preferably giving
packages with similar build-depends to one buildd. That should
optimize (minimize) downloading Build-Depends. Multibuild is also
meant to give packages in a build-depends chain to one buildd, which
can then use the locally built debs (probably only after signing)
instead of having to wait for a dinstall run.

The multibuild client could implement purging only those packages that
are not needed for the next build. After each build the purge function
for the chroot type is called and gets a list of future builds as a
parameter. Currently that list is completely ignored (I always wipe
the chroot and untar a new one in my test config), but I had exactly
that feature in mind when designing the buildd <-> chroot interface.
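
A sketch of what such a purge hook could look like (the hook
arguments, paths and the way future builds are passed in are made up
for illustration, not actual multibuild code):

    #!/bin/sh
    # cleanup hook sketch: $1 = chroot, $2 = file listing the packages
    # installed for the previous build, remaining arguments = source
    # packages queued next on this buildd.
    CHROOT=$1; INSTALLED=$2; shift 2
    # first-level build-dependencies of the upcoming builds (assumes
    # deb-src lines in the chroot's sources.list)
    for src in "$@"; do
        chroot "$CHROOT" apt-cache showsrc "$src" |
            sed -n 's/^Build-Depends: //p' | tr ',' '\n' | awk '{print $1}'
    done | sort -u > /tmp/keep.$$
    # purge only what the previous build pulled in and nothing upcoming needs
    purge=$(sort -u "$INSTALLED" | comm -23 - /tmp/keep.$$)
    [ -n "$purge" ] && chroot "$CHROOT" apt-get -y --purge remove $purge
    rm -f /tmp/keep.$$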

Think about the time saved for m68k by not purging and reinstalling
gnome between two gnome package builds.

But back to your question again: say you do keep a copy (tar.gz) of
the chroots around after a failure. Hopefully gnome packages will be
built in a bunch and the postrm will only be called once.

>> The current way corresponds best to having a fixed chroot and cleaning
>> via debfoster.
>
> Sorry, parse error. Do you mean to say that this is what multibuild does
> by default currently? If not, I'd appreciate it if you could elaborate a
> bit.

I have an abstract interface between the buildd and the chroot
handling. You can implement your own type of chroot handling, like
creating a new NBD on a remote server for every build, or use one of
the pre-made methods.

One of those implementations is that you have one chroot (per
unstable/testing/experimental) that gets cleaned (via debfoster
currently) after the build and reused for the next. That's currently
the closest to the existing setup.

Other methods are to make a new LVM snapshot, to run cdebootstrap, or
to just untar a fresh template from a tar.gz.
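
To give an idea what such a black box looks like, here is a rough
sketch of the tar.gz handling (the function names and paths are just
how I picture the interface, not fixed code):

    #!/bin/sh
    # tar.gz chroot handling: implements the (hypothetical) interface the
    # buildd calls; other handlings (LVM, cdebootstrap, static chroot)
    # would provide the same functions.
    TEMPLATE=/srv/buildd/template-sid.tar.gz

    # clone a fresh chroot for one build
    clone_chroot() {        # $1 = directory for the new chroot
        mkdir -p "$1"
        tar -xzpf "$TEMPLATE" -C "$1"
    }

    # throw the chroot away after the build; cleanup failures don't matter
    # because the next build starts from the template again
    remove_chroot() {       # $1 = chroot directory
        umount "$1/build" 2>/dev/null || true
        rm -rf "$1"
    }

    # rebuild the template itself; only run in maintenance mode or after
    # repeated chroot failures
    bootstrap_template() {
        cdebootstrap sid /srv/buildd/template-sid http://ftp.debian.org/debian
        tar -czpf "$TEMPLATE" -C /srv/buildd/template-sid .
    }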

>> There is also the possibility of rebuilding the chroot from scratch
>> (which calls cdebootstrap and a few extra commands to configure the
>> chroot).
>
> I'd hope this is not the default, unless you've given up on
> outperforming buildd/sbuild ;-P

It's all configuration options. The performance of cdebootstrapping a
fresh chroot for every build, coupled with LVM volumes, can be way
quicker than the existing setup. Creating a chroot takes some time,
but you gain that back because cleanup doesn't purge packages but
simply drops the snapshot (provided that killing any processes left
running in the chroot after a build proves to work well). Installing a
new chroot is faster than purging a gnome or kde build.
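
For the LVM case the per-build clone and cleanup boil down to
something like this (volume and mount point names are made up):

    #!/bin/sh
    # clone: snapshot the template volume and mount it as the build chroot
    lvcreate --snapshot --size 2G --name build-12345 /dev/vg0/template-sid
    mount /dev/vg0/build-12345 /srv/buildd/chroot-12345

    # cleanup: kill whatever is still running inside, then drop the
    # snapshot; no purging of packages needed
    fuser -km /srv/buildd/chroot-12345 || true
    umount /srv/buildd/chroot-12345
    lvremove -f /dev/vg0/build-12345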

But I don't think the cdebootstrap method will be used for cloning the
per-build chroot. That's more for bootstrapping a fresh template every
once in a while, or on failures (see the quotes below).

>> The build also has two levels of creating a chroot:
>> 
>> 1. bootstraping a new template (which is usualy done with cdebootstrap
>> but could be untaring a meta template and updating it)
>> 
>> 2. cloning a template for a specific build (which means untaring,
>> making a snapshot or linking the static template into the right place)
>> 
>> Under normal operation the buildd just clones a new chroot for every
>> build and removes it afterwards (debfoster and unlink for the static
>> case). If a chroot failure is detected (like repeated failures to
>> install or purge packages) the build will try to bootstrap a fresh
>> template and might stop if that fails or also doesn't work.
>
> ... possibly resulting in a buildd which doesn't do shit for 9 hours. Or
> so. The buildd scenario (failure to uninstall a package resulting in not
> bothering to uninstall it anymore) is far more effective at avoiding
> that issue, I think. If not, at least it doesn't waste CPU and starts
> idling sooner (so it appears in the logs sooner, too).

Most configurations wipe the chroot after the build. A cleanup failure
means nothing to them; e.g. you just untar the template.tar.gz for the
next build anyway.

You might be right about the waste, or you might be wrong. For most
archs the extra time to create a fresh chroot from the template is
negligible and the benefits make up for it. Whether that is truly the
case, and maybe even for m68k, will have to be seen with a running
implementation.

Hey, if it turns out that a failure to uninstall something should be
ignored (provided it does not break installing other debs) we can put
that in the specific chroot handling. The interface is abstract enough
to make that easily possible.

>> Most systems will no longer suffer from install/purge problems from
>> one build getting dragged into the next build.
>
> Well, that doesn't really happen all that often with buildd either. In
> case things break down at uninstall or purge time, buildd simply doesn't
> care; it leaves the packages installed in the chroot, and goes on to
> build the next package. Fast and simple; no time is wasted trying to
> clean up. Indeed, every once in a blue moon the chroot breaks more or
> less because of a postinst bug, but it doesn't happen that much that I'd
> want to waste CPU time and disk buffers to useless stuff such as
> "recreating a buildd chroot from scratch, because we *think* it might be
> broken". That sounds almost like the "Format C:" strategy many would-be
> computer experts practice far too often.
>
> Frankly, all this trouble to get a clean chroot seems a bit excessive to
> me. There's nothing requiring us to build in a perfectly clean chroot,
> you know; all buildd does is make sure the build-depends and
> build-conflicts are fulfilled. What more do you need?

It's the buildd admin's choice. If you want, you could stop cleaning
the chroot altogether unless it's a full moon or your birthday. You
just implement a chroot handling configuration that does 'if [ -z
"$full_moon" -a -z "$birthday" ]; then exit 0; fi' as the first thing
in the cleanup.

> Of course, avoiding broken chroots is cool; but you'll get those anyway.
> If not because an install didn't work, then probably because debootstrap
> or some upgrade failed. Why waste so many of your precious CPU cycles to
> avoid something if it'll happen anyway?

Different priorities. My CPU cycles are not that precious to me. Way
more time is spent waiting for downloads than untarring a chroot
takes. But that is just my setup. Different needs, different
configurations.

>> I expect using a tar.gz template will be the most used config. Also a
>> buildd should stop before it runs amok and fails 200 packages. That is
>> the plan anyway.
>
> Hey, that'd be a cool feature, indeed.
>
>> As a sidenote the build dir will be mounted into the chroot normaly
>> and umounted before cleaning up. So failed builds still remain when
>> the chroot is just wiped.
>
> Sometimes builds fail because the phase of the moon wasn't right. I
> don't think I want to see the build chroot wiped out too fast, but
> that's probably just me.

Normally, recreating the chroot with the exact same versions used
before should be enough. For the full moon problems, e.g. the
filesystem corrupting a file during dpkg --unpack, it is not. If you
do want to keep chroots around for the off chance it was a full moon,
then do so. But it will be at the cost of disk space.

When using LVM snapshots it would be easy to configure it to keep
around the last 10 chroots from failed builds, or up to 10G of them,
or up to a week's worth, or only those for a list of packages that had
problems before. It certainly is possible.
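
A sketch of such a retention policy, here for kept tar.gz copies
because that is the shortest to write down (directory and naming are
made up; an LVM variant would walk the lvs output the same way):

    #!/bin/sh
    # keep at most the 10 newest chroots saved after build failures
    cd /srv/buildd/failed-chroots || exit 0
    ls -t *.tar.gz 2>/dev/null | tail -n +11 | xargs -r rm -f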

> Forgive me for being sceptical; I might sound negative, but I'm really
> just interested in how you're dealing with some issues I found out about
> when I had the "wonderful" idea of fixing the numerous bugs in the
> hackish bunch of scripts I thought buildd and sbuild were. It was only
> then that I realized how great some of their concepts are. Which is not
> to say that their coding style is great, but that's a different issue
> altogether ;-)

The way it's being dealt with is by splitting the problems into
abstract chunks and providing a flexible interface between them. The
chroot handling is such a chunk: a black box that can perform certain
functions.

How it performs them (cdebootstrap, LVM snapshot, untar, good old
cleanup) does not interest the buildd code. Through that I hope we can
easily adapt to different needs and hardware restrictions.

Documentation and code are also simplified because only the black-box
interface needs to be understood to get a good picture of each
component.

Regards
        Goswin


