
Re: unexpected NMUs || buildd queue



Wouter Verhelst <wouter@grep.be> writes:

> On Sat, Jul 17, 2004 at 09:46:33PM +0200, Goswin von Brederlow wrote:
> The time required for running *one* of those is, indeed, laughable. The
> time required for running those "regularly", whatever that means, isn't;
> especially not on a system where CPU time is the primary resource and
> thus has to be considered scarce by design; You're interested in
> efficiency, not in time. To put it otherwise, the important question is
> "what percentage of our non-idle time is spent on the actual
> dpkg-buildpackage run?"

What is relevant is how much time is idle. As long as there is a
large amount of idle time left for future growth, I don't see a problem.

But not all archs have that. So we do care about reducing overall
build time. If we can save 10 minutes per build but waste 1 minute
preparing the chroot for the build, we still save 9 minutes.
But can we? A successful build will always take the same amount of
time in dpkg-buildpackage no matter what. For those cases you are
right, reducing the time not spent in dpkg-buildpackage is the only
way to speed things up.
I guess we have to implement it and compare different configurations
for the chroot handling to judge the overhead produced by different
methods. There is not much point arguing about it now since it's mainly
just statistics (which we don't have).
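
To actually get such statistics, one could simply time the candidate
setup methods on an idle buildd. A minimal sketch (paths, suite and
mirror are invented for the example, not actual multibuild settings):

    #!/bin/sh
    # Sketch: compare chroot setup costs on an idle machine.
    CHROOT=/srv/chroot/test
    TEMPLATE=/srv/chroot/sid-template.tar.gz
    MIRROR=http://ftp.debian.org/debian

    echo "=== untar of a template:"
    time sh -c "rm -rf $CHROOT && mkdir -p $CHROOT \
        && tar -C $CHROOT -xzf $TEMPLATE"

    echo "=== cdebootstrap from scratch:"
    time sh -c "rm -rf $CHROOT && cdebootstrap sid $CHROOT $MIRROR"

    echo "=== update of a kept chroot:"
    time chroot $CHROOT sh -c "apt-get update && apt-get -y dist-upgrade"

Running that a few times over a week would give a first idea of the
per-build overhead of each method.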

But what we can reduce, and what also eats up a lot of time, is the
time spent on build failures. If we can avoid a build failure (like
one resulting in dep-wait) by setting the package to dep-wait directly,
we win a lot. (Far more than the overhead of chroot maintenance, I think.)

The design of the multibuild server aims to reduce the work wasted
on the buildd clients and on the buildd admins. The main change there
is the tracking of Build-Depends and Depends, so as not to try to
build sources that would fail and have to be put into dep-wait by the
admin. But there are some other things planned.
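
As a sketch of that Build-Depends tracking (this is not the actual
multibuild code; the package name and version are invented): before
handing a source package to a buildd, the server could compare each
versioned build-dependency against the candidate version in the
archive and set dep-wait up front instead of letting the build fail:

    #!/bin/sh
    # Sketch: pre-check one versioned build-dependency.
    # DEP and DEPVER stand for what the server would parse out of
    # the source package's Build-Depends field.
    DEP=libfoo-dev
    DEPVER=1.2-3

    CAND=$(apt-cache policy $DEP | awk '/Candidate:/ {print $2}')
    if [ -z "$CAND" ] || [ "$CAND" = "(none)" ]; then
        echo "$DEP not available -> dep-wait $DEP"
    elif ! dpkg --compare-versions "$CAND" ge "$DEPVER"; then
        echo "$DEP candidate is $CAND -> dep-wait $DEP (>= $DEPVER)"
    else
        echo "$DEP $CAND satisfies >= $DEPVER, ok to hand out"
    fi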

Avoiding a build for gnome or kde packages, or building them in the
right order with proper delays for dinstall runs in between, will save
hours of wasted build time for m68k (less for others), avoid a huge
delay until the admin comes around, and save the admins' time. For
several archs the admin time seems to be the limiting factor (not for
m68k, true).

>> Even creating the chroot from scratch with cdebootstrap is a matter of
>> one to a few minutes nowadays.
>
> Not bothering to clean the chroot takes no time at all; and experience
> tells me that buildd is fairly capable at maintaining a chroot which,
> although not perfectly clean and up-to-date, can still be used to build
> packages in.
>
> In fact, in the three years that I've been a buildd maintainer now, I
> can remember only a handful of occasions where the buildd chroot had
> become so badly broken that I had to intervene before buildd would start
> to build again, and where the cause was *not* a power or hardware
> failure.

But what about power failures? It's pretty hard to recover from
those. You can't be sure the chroot is in good health, and even
apt/dpkg can't be trusted to tell you, especially without a journaling
FS or with large disk caches that might have been lost.
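
A crude health check after an unclean shutdown could at least catch
the obvious damage. A sketch, assuming debsums is available in the
chroot and a known-good template tarball exists (both paths invented):

    #!/bin/sh
    # Sketch: verify the chroot after a power failure and fall back
    # to re-unpacking the template if anything looks suspect.
    CHROOT=/srv/chroot/sid
    TEMPLATE=/srv/chroot/sid-template.tar.gz

    PROBLEMS=$(chroot $CHROOT dpkg --audit;
               chroot $CHROOT debsums -s 2>&1)
    if [ -z "$PROBLEMS" ]; then
        echo "chroot looks sane, resuming builds"
    else
        echo "chroot suspect, recreating from template"
        rm -rf $CHROOT && mkdir -p $CHROOT \
            && tar -C $CHROOT -xzf $TEMPLATE
    fi

That still can't prove the chroot is healthy, but it is cheap enough
to run after every unclean reboot.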

> I do remember many horror stories of (c)debootstrap failures, though
> (although admittedly less cdebootstrap problems).

If cdebootstrap is to be used regularly (e.g. for every build) I think
the only option is to use testing and upgrade it. Otherwise you are
right, it would fail all the time. But it's an option.
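
Per build that would look something like this (suite and paths are
invented for the example):

    #!/bin/sh
    # Sketch: bootstrap testing (which should always be installable)
    # and then upgrade the result to unstable.
    CHROOT=/srv/chroot/build
    MIRROR=http://ftp.debian.org/debian

    cdebootstrap testing $CHROOT $MIRROR
    echo "deb $MIRROR unstable main" > $CHROOT/etc/apt/sources.list
    chroot $CHROOT sh -c "apt-get update && apt-get -y dist-upgrade"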

>> And for m68k considering disk buffers as a problem is a joke. The 128 MB
>> RAM will have been flushed and reflushed just by installing/purging
>> those 200MB Build-Depends of the last gnome or kde build.
>
> That's not the issue.
>
> If you're building something which uses a lot of build-time dependencies,
> disk buffers are more than just a nice to have (else the system will
> need to read stuff off of the hard disk every time you #include some
> header file).
>
> If you start untarring and apt-get upgrade'ing and the like while such a
> package is building, you slow down your build. Doing that to one build
> doesn't really matter; doing that all the time will severely reduce the
> efficiency of your system.

Why would you ever do that in parallel, except when the admin does
maintenance on the template for the chroot (for the tar.gz case for
example)? That would be something you do once a week or month, or only
when needed. Certainly nothing to be done all the time.

And you save some time doing it in parallel when needed. The current
build will be slowed down, but it will keep on going while you wait for
the downloads or user input. That time would otherwise be lost, unless
you update the chroot while it's building (which I can't recommend for
a gcc update).
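
For the tar.gz case that template maintenance could be as simple as
the following sketch (paths invented; builds keep running in their own
cloned chroots while this works on a copy):

    #!/bin/sh
    # Sketch: refresh the template tarball in parallel to the builds.
    TEMPLATE=/srv/chroot/sid-template.tar.gz
    WORK=/srv/chroot/template-update

    rm -rf $WORK && mkdir -p $WORK
    tar -C $WORK -xzf $TEMPLATE
    chroot $WORK sh -c "apt-get update && apt-get -y dist-upgrade \
        && apt-get clean"
    tar -C $WORK -czf $TEMPLATE.new .
    mv $TEMPLATE.new $TEMPLATE   # rename is atomic on one filesystem
    rm -rf $WORK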

> I was just trying to show how you lose a lot of time, always; either you
> lose it because the whole system is waiting for apt-get upgrade, or
> you lose time because you reduce the efficiency of the disk buffers.
>
> And no, I don't think that time is negligible just because *one*
> "apt-get update; apt-get dist-upgrade" run takes up less than half a
> minute.

If it does turn out to be a burden, a configuration resembling the
current way can be used. Nobody is forced to change, but the
possibility is there.

>> You also can't count the time the apt-get itself takes since with the
>> current setup you do do exactly the same calls to update the system.
>
> Yes, but only once; once it's updated, it stays updated.
>
> In a cloned chroot scenario, you either need to update your template
> chroot between builds (which increases the risk of ending up with a
> broken chroot, and increases the time it takes to start a build,
> reducing the efficiency of your system), or you risk having to update a
> certain package which is pulled in by a common build-dependency on each
> and every build you do.
>
> Either way, I think your scenarios all result in a less efficient build
> system.
>
>> So the difference is untar/gzip and tar/gzip. Yes, they can take some
>> time on m68k.
>
> They take the same percentage of time away from builds on /every/
> architecture. Wasting time isn't an issue because your processor is
> slow; it's an issue because the processor is the resource you're trying
> to use as efficiently as possible.
>
> It's just more visible on m68k, that's all.

Yes, but it's not the limiting factor on most archs.

>> But that is easily gained by not failing a kde or gnome package build
>> that installs 200Mb Build-Depends just to notice the installed version
>> isn't good enough.
>
> I agree that there are some bugs in the system currently in use; this is
> one of them (well. A design issue, really). It's not related to the
> issues we're discussing, though.

Agreed. The statement that multibuild will build packages faster is
based on the multibuild server design. That part greatly reduces the
avoidable build failures and will be the main selling argument in our
eyes.

Compared to that, we are arguing about peanuts and nonexistent
statistics. It's all 'I think it will be', so let's stop guessing and
wait for it. OK?

>> >> > There are probably more things I could come up with, but I didn't try
>> >> > hard. Wiping out and recreating the buildd chroot isn't an option.
>> >> > Neither is creating a new one alongside the original, unless the disk
>> >> > space requirements are a non-issue (which isn't true for some of our
>> >> > archs).
>> >> 
>> >> Worst case would be to stop the buildd in such a condition. 
>> >
>> > You're advocating manual cleanup again here :-P
>> 
>> Yes. Better than keeping on building with a broken system, as is done now.
>
> I guess this is where we differ in opinion.

I can agree on that.

> I do not consider a chroot where "apt-get build-dep foo; apt-get -b
> source foo" succeeds to be broken (no, I specifically do not care about
> uninstallation). Only if that fails for reasons specific to the chroot,
> I agree it is broken.
>
> What buildd does is, if there's a build-time dependency that cannot be
> uninstalled, to just not bother and continue building with the
> superfluous dependency installed. That's far more efficient IMO.

Yes. It's something to think about in the chroot implementations (the
ones that keep the chroot). It is a good point and I'm willing to
implement the same behaviour there, given time.
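
The behaviour would roughly be: try to take the build-deps out again,
and if that fails, log it and carry on with them installed. A sketch
(the package list is invented; it stands for whatever was installed
for this particular build):

    #!/bin/sh
    # Sketch: cleanup step for a kept chroot; removal failures must
    # never fail the build itself.
    CHROOT=/srv/chroot/sid
    BUILD_DEPS="libfoo-dev libbar-dev"

    for pkg in $BUILD_DEPS; do
        chroot $CHROOT apt-get -y remove $pkg \
            || echo "leaving $pkg installed, cleanup deferred"
    done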

> (not that the above command sequence is what sbuild actually does, but
> you know what I mean)
>
> [...]
>> Here another feature of multibuild comes to mind.
>> 
>> Multibuild keeps track of the build times of packages and the relative
>> speeds of buildds. Multibuild can then guess the expected build time
>> for a package. If that time is exceeded by a sizeable margin the
>> buildd admin can be notified and on inaction the package will be
>> returned so another buildd can have a shot at it.
>> 
>> The same goes for packages getting stuck in other temporary states,
>> like being state uploaded for a week.
>
> Hmm. These sound cool.
>
>> Packages that have finished being built will remain in the buildd
>> admin's control only for a limited time before getting assigned to a
>> pool for that arch or maybe even a general pool of all buildd admins.
>> Packages that aren't handled by the buildd admin for some reason (like
>> sickness) then get processed by any admin having some spare time to
>> process the pool.
>
> I don't like this one as much, though. Oh well; maybe it's just me.

The current idea is to have IMAP folders on the multibuild server
(which also holds the buildd logs). An upload first lands in the
admin's private IMAP folder and stays there for a while. If the admin
takes no action (which could be just tagging it as 'investigating')
for some time, the mail would appear in the common IMAP folder.

The timespan could be 1d or 1w, or each buildd admin could set it
himself (going on vacation would set it to 0 or add a forward to
someone else).

But this build log handling is still just an idea. Nobody has started
to implement that part yet. For starters it will be as it is now: the
admin gets notified. Ideas on it are welcome.
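
Assuming the IMAP folders end up as Maildirs on the multibuild server,
the escalation itself could start out as a simple cron job (paths and
timespan invented; a real implementation would go through IMAP so that
flags like 'investigating' are honoured):

    #!/bin/sh
    # Sketch: logs untouched in an admin's private folder for more
    # than $DAYS days become visible in the common folder.
    PRIVATE=/srv/mail/admin/Maildir/.buildd/cur
    COMMON=/srv/mail/shared/Maildir/.buildd-common/cur
    DAYS=7

    find $PRIVATE -type f -mtime +$DAYS -exec mv {} $COMMON \;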

MfG
        Goswin


