
Re: unexpected NMUs || buildd queue



Wouter Verhelst wrote:
[snip]
> > > What's your alternative?
> > 
> > Obviously to clean the chroot automatically unless a clean-buildd-shutdown
> > flag was written. Shouldn't be hard to implement.
> 
> I'd like to see that implemented properly. There are a number of issues
> I can think of offhand that would make it hard:
> * the system crash might be the result of the build itself. Runaway
>   memory-eating loop, causing the swapper to thrash so horribly that a
>   power cycle is the fastest way to get it up and running again, for
>   example. Yes, those happen. In such a case, you don't want to wipe out
>   the chroot; you want to check out what went wrong, and you might need
>   whatever's in the chroot to find out.

Then, on startup, the buildd should check for an existing chroot, and
stop or move it aside if the previous shutdown wasn't clean.
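
A minimal sketch of that startup check, in Python. The flag file name,
the directory layout and both function names are made up for
illustration; none of this is existing buildd code.

```python
import os
import time

CLEAN_FLAG = "clean-buildd-shutdown"  # hypothetical flag file name


def mark_clean_shutdown(state_dir):
    """Write the flag; meant to be called as the last step of an orderly shutdown."""
    with open(os.path.join(state_dir, CLEAN_FLAG), "w"):
        pass


def check_chroot_on_startup(chroot, state_dir):
    """Decide what to do with a chroot found at startup.

    Returns (status, aside_path), where status is one of:
      "fresh"       - no chroot exists yet
      "clean"       - chroot exists and the flag was present
      "quarantined" - chroot existed without the flag and was moved aside
    """
    flag = os.path.join(state_dir, CLEAN_FLAG)
    if not os.path.isdir(chroot):
        return ("fresh", None)
    if os.path.exists(flag):
        # Consume the flag so that a later crash is detected next time.
        os.remove(flag)
        return ("clean", None)
    # Unclean: keep the contents for inspection instead of wiping them.
    aside = "%s.unclean-%d" % (chroot.rstrip("/"), int(time.time()))
    os.rename(chroot, aside)
    return ("quarantined", aside)
```

Since os.rename() stays on the same filesystem, moving the chroot aside
does not copy anything. A buildd short on disk space could treat
"quarantined" as the signal to stop and wait for an admin rather than
build a replacement chroot.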

> * unstable is a moving (and breaking) target. Doing a debootstrap --
>   especially when tried noninteractively -- doesn't always work. And
>   yes, we need the chroot to be unstable. Think about it.

Try an unstable debootstrap; if this fails, try testing and upgrade;
if this fails as well, try stable and upgrade. Alternative: keep a
clean unstable chroot tarball around and update it regularly.
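
The fallback chain could be sketched roughly like this, in Python. The
mirror URL, the function names and the injectable `run` hook are
assumptions for illustration; only the debootstrap, chroot and apt-get
invocations are real commands, and a real buildd would need more care
(logging, mounts, non-interactive apt settings).

```python
import os
import shutil
import subprocess

MIRROR = "http://deb.debian.org/debian"  # assumption: any Debian mirror would do
SUITES = ("unstable", "testing", "stable")  # tried in this order


def run_ok(cmd):
    """Run a command and report whether it exited with status 0."""
    return subprocess.run(cmd).returncode == 0


def upgrade_to_unstable(target, run):
    """Point the chroot's apt at unstable and dist-upgrade inside it."""
    apt_dir = os.path.join(target, "etc", "apt")
    os.makedirs(apt_dir, exist_ok=True)
    with open(os.path.join(apt_dir, "sources.list"), "w") as f:
        f.write("deb %s unstable main\n" % MIRROR)
    return (run(["chroot", target, "apt-get", "update"])
            and run(["chroot", target, "apt-get", "-y", "dist-upgrade"]))


def bootstrap_unstable(target, run=run_ok):
    """Return the suite that was bootstrapped from, or None if all failed."""
    for suite in SUITES:
        # Start each attempt from an empty directory.
        # Assumes nothing is mounted inside target.
        shutil.rmtree(target, ignore_errors=True)
        if not run(["debootstrap", suite, target, MIRROR]):
            continue
        if suite == "unstable" or upgrade_to_unstable(target, run):
            return suite
    return None
```

A None result corresponds to the worst case: nothing could be
bootstrapped and the buildd has to stop until someone looks at it.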

> There are probably more things I could come up with, but I didn't try
> hard. Wiping out and recreating the buildd chroot isn't an option.
> Neither is creating a new one alongside the original, unless the disk
> space requirements are a non-issue (which isn't true for some of our
> archs).

Worst case would be to stop the buildd in such a condition. Many
buildd machines should be able to do better.

> The only other option I could think of is to implement an AI that would
> investigate the chroot and remove any anomalies before restarting the
> next build, obviously all the while creating a perfectly detailed log
> (as in, exactly the amount of details you'll need to learn about what
> went wrong; nothing more and nothing less). That'd be nice to have, I'd
> say... ;-)
> 
> Really, such cleanups can't be properly automated IMO. I agree that
> there are cases where buildd could be improved, but that doesn't mean
> manual cleanups can be avoided; and after a system crash, if a cleanup
> is required, it must be done manually.

There will surely remain some cases where automatic cleanup isn't
possible, but handling common failure modes automatically should work.


Thiemo


